MySQL’s replication mechanism has evolved significantly over the years to improve scalability and reduce lag. Early versions of MySQL used a simple asynchronous model with a single thread applying all changes on a replica. Modern versions support highly parallel replication, enabling replicas to keep up with busy masters more effectively. This article reviews the progression from the original single-threaded replication model to the advanced parallel replication features introduced in MySQL 8.0, highlighting key concepts, configuration examples, and improvements over time. ## Single-Threaded Replication In early MySQL versions (up through 5.5), each replica applied all changes in a single thread. When a transaction commits on the primary (master), its changes are written to the binary log. A replica’s **IO thread** streams these binary-log events from the primary and writes them to a _relay log_. A separate **SQL thread** on the replica reads from the relay log and applies each change to the replica’s data. In effect, the SQL thread “replays” the primary’s transactions one by one. For e.g., a simple replication setup might involve: ```sql -- On the primary: CREATE TABLE example (id INT PRIMARY KEY, val VARCHAR(100)); INSERT INTO example VALUES (1, 'a'), (2, 'b'); -- On the replica: CHANGE MASTER TO MASTER_HOST='master_host', MASTER_USER='repl', MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=107; START SLAVE; ``` In this default mode, the replica’s SQL thread processes events sequentially. The primary’s concurrency does not directly translate to the replica: the IO thread rarely becomes a bottleneck (it only streams data), but the SQL thread often lags behind. As one observer notes, “the primary server often has a few dozen threads processing transactions concurrently, but the replica only has one SQL thread to apply all those transactions”. This mismatch means that on a busy master, replicas can quickly fall behind, resulting in replication lag. Single-threaded replication is straightforward and has low overhead, but it limits performance. Because all changes are applied one at a time, a replica cannot use multiple CPU cores to catch up with a multi-core primary. Any high-volume workload on the master (for e.g., many clients writing in parallel) can overload the single SQL thread on the replica and cause lag. ## Introduction of Parallel Replication MySQL 5.6 introduced the first support for **parallel replication**, allowing a replica to apply multiple transactions concurrently. Specifically, version 5.6.3 added the ability to use several SQL-applier threads in parallel. Initially, this worked by grouping transactions by schema or database: events that affected different databases could be applied in parallel. In practice, you could enable parallel replication on a replica by increasing the number of parallel worker threads. For e.g., on MySQL 5.6 or 5.7 one might configure: ```sql STOP SLAVE; SET GLOBAL slave_parallel_threads = 4; START SLAVE; ``` This command allows up to 4 SQL threads to run in parallel, each taking transactions from the relay log. In MySQL 5.7 and later (including 8.0), the new variable `slave_parallel_workers` replaces `slave_parallel_threads`: ```sql STOP SLAVE; SET GLOBAL slave_parallel_workers = 4; START SLAVE; ``` By default, early parallelism only worked on transactions that were guaranteed not to conflict (e.g. they operated on different databases). In MySQL 5.7, a more robust parallel mode was introduced using **commit-order scheduling** (the `LOGICAL_CLOCK` algorithm). This allowed transactions affecting the _same_ database to run in parallel as long as their commit order could be preserved. Additional parameters control this behavior: - **`slave_parallel_type`**: In MySQL 5.7+, setting this to `LOGICAL_CLOCK` uses a global order to schedule transactions. - **`slave_preserve_commit_order`**: When `ON`, the replica enforces the original commit order of transactions. This ensures consistency but may reduce parallelism. When `OFF`, transactions may commit out of order for higher throughput. For e.g., one could configure a replica as follows: ```sql STOP SLAVE; SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK'; SET GLOBAL slave_parallel_workers = 8; SET GLOBAL slave_preserve_commit_order = ON; START SLAVE; ``` This setup runs 8 parallel applier threads with commit-order enforcement. As one MySQL expert notes, increasing the number of worker threads (even to dozens) can dramatically improve throughput **if transactions are independent**. Using `LOGICAL_CLOCK` without preserving commit order can maximize parallelism, but may risk a temporarily inconsistent apply sequence. Parallel replication significantly improves scalability. In practice, enabling multiple worker threads can reduce lag from minutes or hours down to seconds. With multi-threaded replication, a slave can roughly keep pace with a multi-core master by distributing the load. Key developments in this era include: - **MySQL 5.6.3**: First support for multiple slave SQL threads (per-database parallelism). - **MySQL 5.7**: Introduction of global transaction IDs (GTIDs) and `LOGICAL_CLOCK` parallel mode, allowing intra-schema parallelism. Using GTIDs also simplifies failover. - **Tuning variables**: you can adjust `slave_parallel_workers` (threads), `slave_parallel_type`, and `slave_preserve_commit_order` to trade off parallelism versus strict ordering. In summary, by MySQL 5.7 the emphasis shifted to **multithreaded replication**. Experts agree this is “**the single most important thing you can do to reduce lag**”. However, it also introduced complexity in crash recovery and conflict handling. Additional tools like Percona’s `pt-heartbeat` or setting `slave_checkpoint_period` help keep track of progress and improve reliability in multi-threaded setups. ## Advancements in MySQL 8 MySQL 8.0 has brought further improvements to replication, making parallelism more reliable and performant. Major enhancements in MySQL 8 include: - **WRITESET-Based Dependency Tracking (8.0.22+)**: MySQL 8.0.22 introduced `binlog_transaction_dependency_tracking=WRITESET`, which uses fine-grained dependency tracking to allow even more parallelism. With this mode, transactions are analyzed for conflicts, and independent transactions are applied concurrently regardless of commit order. Starting in 8.0.22, this WRITESET tracking is the default. By automatically detecting whether transactions conflict, the replica can safely apply them out of order and greatly increase throughput. - **Improved Crash Recovery**: MySQL 8 stores relay-log metadata in a transactional system tablespace (instead of raw files), making crash recovery safer. After a crash, the IO thread resumes at the last committed position. In single-threaded mode, all old relay logs are discarded; in multi-threaded mode, recovery is more complex, but MySQL 8’s framework handles it more robustly. (Under the hood, a replica rebuilds missing parts of the relay log to fill “gaps” left by parallel apply.) - **Deadlock Handling in Parallel Replication**: MySQL 8.0.23 fixed a known issue where enforcing commit order could cause deadlocks among replica threads. With the fix, if a parallel worker detects that it cannot commit due to a lock held by an earlier transaction, it will roll back that transaction and then retry correctly. This ensures the deadlocked thread properly signals others and avoids stalling replication. - **Group Commit & Performance**: Modern MySQL uses group commit more efficiently, which also helps replication. When multiple transactions commit at the same time on the master, they are batched in the binary log. Replicas can apply such batches with less contention. Although group commit is a master-side feature, it synergies with parallel apply on the slave. - **Replication Configuration Simplification**: MySQL 8.0 introduced alias commands (`START REPLICA` instead of `START SLAVE`, and you know why) and better defaults. For e.g., if multi-source replication is used (replicating from multiple masters), each channel can also use parallel workers. To illustrate a MySQL 8 replica setup, consider enabling parallel apply and modern features: ```sql -- Enable parallel replication with 8 worker threads STOP SLAVE; SET GLOBAL slave_parallel_workers = 8; SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK'; SET GLOBAL slave_preserve_commit_order = OFF; -- Use the default WRITESET-based dependency tracking (8.0.22+) SET GLOBAL binlog_transaction_dependency_tracking = WRITESET; START SLAVE; ``` In this configuration, the replica uses 8 parallel SQL threads. Because `slave_preserve_commit_order=OFF`, it permits out-of-order commits when safe, relying on the WRITESET dependency tracking to avoid conflicts. This maximizes throughput while maintaining correctness. Overall, MySQL 8.0’s replication advances provide much better performance and flexibility. Users are encouraged to enable GTIDs and use a recent 8.0 version (8.0.22 or later) to gain these benefits. In practice, modern replicas can often run with minimal lag even under heavy write loads, a dramatic improvement over the original single-threaded model. **Key Takeaways:** - **Legacy (Single-Threaded)**: One SQL thread; simple but limited. All changes replay sequentially on the replica. - **Parallel Replication (MySQL 5.6/5.7)**: Multiple SQL threads apply transactions concurrently. Use `slave_parallel_threads` (5.6) or `slave_parallel_workers` (5.7+) and consider `LOGICAL_CLOCK` and GTID for best results. - **MySQL 8 Enhancements**: WRITESET-based dependency tracking by default (since 8.0.22) allows finer-grained parallelism. Reliability improvements (e.g. MySQL 8.0.23 deadlock fix) make multithreaded replication stable. At the end, I hope by reading and understanding these advancements and tuning the relevant parameters, you get familiar with how you can achieve much faster, more reliable replication in MySQL. # References 1. https://www.percona.com/blog/can-mysql-parallel-replication-help-my-slave/ 2. https://medium.com/airtable-eng/optimizing-mysql-replication-lag-with-parallel-replication-and-WRITESET-based-dependency-tracking-1fc405cf023c #mysql #replication #lessons #database