Error
Error Code: 1885

MySQL Error 1885: Replica GTID Divergence

📦 MySQL
📋

Description

This error signifies that a MySQL replica has recorded more Global Transaction Identifiers (GTIDs) than its source server. It indicates a critical divergence in transaction history, where the replica's state is ahead or inconsistent with the source. This issue often arises from data loss or rollbacks on the source server.
💬

Error Message

Replica has more GTIDs than the source has, using the source's SERVER_UUID. This may indicate that the the last binary log file was truncated or lost, e.g., after a power failure when sync_binlog != 1. The source may have rolled back transactions that were already replicated to the replica. Replicate any transactions that source has rolled back from replica to source, and/or commit empty transactions on source to account for transactions that have been committed on source but are not included in GTID_EXECUTED.
🔍

Known Causes

3 known causes
⚠️
Binary Log File Loss
The source server's binary log files, which track transactions via GTIDs, were truncated or lost, often due to an unexpected shutdown or underlying storage issue.
⚠️
Source Transaction Rollback
Transactions that were already committed and replicated to the replica were subsequently rolled back on the source, creating a GTID history divergence.
⚠️
Inadequate sync_binlog Configuration
A `sync_binlog` setting other than 1 can leave binary log data buffered in memory, making it susceptible to loss during system crashes or power failures.
🛠️

Solutions

3 solutions available

1. Reset Replica and Re-sync from Source medium

The most common and safest approach is to reset the replica's replication state and re-sync from a consistent point on the source.

1
Stop the replica's SQL and IO threads.
STOP SLAVE IO_THREAD;
STOP SLAVE SQL_THREAD;
2
Reset the replica's replication configuration. This will clear its GTID state and connection information.
RESET SLAVE ALL;
-- Alternatively, if you only want to reset replication state without losing other configurations:
-- RESET SLAVE;
3
On the source, determine the current GTID set or the last executed binary log file and position. This is crucial for knowing where to restart replication.
SHOW MASTER STATUS;
-- Or, if using GTID:
SELECT @@GLOBAL.GTID_EXECUTED;
4
On the replica, reconfigure replication to point to the source. Use the information obtained in the previous step to ensure consistency. If you are using GTID-based replication, you might not need to specify `MASTER_LOG_FILE` and `MASTER_LOG_POS` if the source is configured correctly.
CHANGE MASTER TO
  MASTER_HOST='<source_host>',
  MASTER_USER='<replication_user>',
  MASTER_PASSWORD='<replication_password>',
  MASTER_PORT=<source_port>,
  MASTER_AUTO_POSITION=1; -- Use 1 for GTID-based replication
5
Start the replica's IO and SQL threads.
START SLAVE IO_THREAD;
START SLAVE SQL_THREAD;
6
Monitor the replica to ensure it's catching up and that the divergence error does not reoccur.
SHOW SLAVE STATUS\G

2. Manually Reconcile GTIDs and Apply Missing Transactions advanced

For advanced users, this involves identifying and applying transactions that are on the replica but not on the source.

1
Stop the replica's SQL thread to prevent further writes.
STOP SLAVE SQL_THREAD;
2
On the source, get the GTID set.
SELECT @@GLOBAL.GTID_EXECUTED;
3
On the replica, get its GTID set.
SELECT @@GLOBAL.GTID_EXECUTED;
4
Compare the GTID sets. Identify the GTIDs present on the replica but not on the source. This can be done by parsing the output of `GTID_EXECUTED` or by using tools to diff the sets.
This step requires manual analysis or scripting to compare the output of the previous two commands.
5
For each missing GTID on the source, you need to replicate that specific transaction. This is complex and might involve replaying binary log events. A common strategy is to re-apply transactions that were rolled back on the source. This often involves:
1. Identifying the binary log file and position on the source corresponding to the GTID.
2. Dumping the specific transaction(s) from the source's binary log using `mysqlbinlog`.
3. Applying these dumped transactions to the source as empty transactions or by replaying them if they were rolled back. The error message suggests committing empty transactions on the source to 'account for transactions that have been committed on source but are not included in GTID_EXECUTED'. This might mean replaying transactions that were committed on the replica but later rolled back on the source.
# Example of dumping a transaction (replace with actual GTID/log details):
mysqlbinlog --start-gtid='<gtid_set_on_replica_not_on_source>' --stop-gtid='<gtid_set_on_replica_not_on_source>' <binary_log_file> | mysql -h <source_host> -u <replication_user> -p<replication_password> <database_name>

# To commit an empty transaction on the source (if the source has rolled back transactions):
-- On the source:
SET SESSION sql_log_bin=1;
COMMIT; -- This generates a new GTID on the source.
6
Once the GTID sets are reconciled (or the source has accounted for the missing transactions), restart the replica's SQL thread.
START SLAVE SQL_THREAD;
7
Monitor the replica closely.
SHOW SLAVE STATUS\G

3. Force Replication to Continue (Use with Caution) advanced

This is a last resort and can lead to data inconsistency if not understood thoroughly. It's for situations where you need the replica to catch up immediately and are willing to accept potential minor data drift.

1
Stop the replica's SQL thread.
STOP SLAVE SQL_THREAD;
2
On the replica, set the `slave_skip_errors` variable to allow skipping specific errors. The error code 1885 is usually not directly skippable, but this might be a precursor if other related errors occur. A more direct approach for divergence is to reset the GTID state.
SET GLOBAL sql_slave_skip_counter = 1; -- This is for skipping a single event, not directly for GTID divergence.
-- For GTID divergence, the primary action is to reset or reconfigure, not skip.
3
The error message explicitly suggests committing empty transactions on the source to account for missing GTIDs. This is the most direct way to resolve divergence without a full resync, but it requires careful application.
-- On the source:
SET SESSION sql_log_bin=1;
COMMIT; -- This creates a new GTID on the source, which the replica will then try to replicate.
4
After committing the empty transaction on the source, restart the replica's SQL thread.
START SLAVE SQL_THREAD;
5
Monitor the replica's status. You may need to repeat the 'commit empty transaction' step on the source multiple times until the replica catches up. This is highly dependent on the exact nature of the divergence.
SHOW SLAVE STATUS\G
6
Once the replica is caught up, consider resetting the replica's replication state to ensure a clean slate, especially if the divergence was significant.
STOP SLAVE IO_THREAD;
STOP SLAVE SQL_THREAD;
RESET SLAVE ALL;
-- Then reconfigure replication as per Solution 1.
🔗

Related Errors

5 related errors