Wednesday, September 18, 2019

How to skip certain error or next n number of errors in MySQL replication environment?


MySQL replication is a process which replicates all the data in the MySQL master database to MySQL slave database/s.



MySQL  replication methods


The traditional replication method writes every change in Master DB to Slave DB/s using binary log. Whenever the master DB is modified, the change is written to the binary log. The slave DB runs a SQL thread that continuously reads that binary log. It identifies new changes using last synchronized position and applies the changes to the slave DB/s.

The newer replication method is based on global transaction identifiers (GTIDs). It is transactional and does not require working with log files or positions within these files. GTID Replication method guarantees consistency between master and slave as long as all transactions committed on the master have also been applied to the slave.


MySQL replication process improves scalability by allowing query traffic to be distributed among multiple slave databases for reads and by limiting the master database to writes.

Additional benefits of replication are

  • data security via slave backups
  • analytic performed on the slaves without affecting the master's performance
  • widespread data distribution
  • fail-over


Depending on the configuration, you can replicate
  • All databases or
  • Selected databases or even
  • Selected tables within a database

Root causes for MySQL  replication failures


MySQL replication process may be not working properly due to the following reasons.

  • Binary logging is not enabled on Master DB
  • Both master DB and slave DB/s are started with the --server-id option, but the server-id value is not unique on each DB server
  • Slave DB/s is not running
  • Slave DB/s has not established a connection to the master DB
  • A statement that succeeded on the master DB refuses to run on the slave DB/s

The last reason never happen if you have taken a proper snapshot of the master DB, and never modified the data on the slave outside of the slave SQL thread.



You can identify the affected table on the slave DB which is different from the master DB by running show slave status \G; command on the slave DB server and checking Last_SQL_Error property value. Then make the slave's table identical to the master's and run START SLAVE command.


If that step does not work and it would be safe to make the update manually then ignore the next statement from the master DB. and you can restart the replication process by one of the following way.


1. If you can't restart the slave database and need to skip next n number of errors,


Set GLOBAL sql_slave_skip_counter by running following command in the MySQL Slave DB Console to skips the next N number of errors during replication process. You may have to run the following command several times in a single occurrence to find the appropriate value of N** by increasing the value of N** until replication service is back to normal.

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = N**;
START SLAVE;

You can check the replication service is back to normal or not by running the show slave status \G; command and checking Slave_IO_Running and Slave_SQL_Running properties values.

If the replication service is back to the normal both properties value should be "Yes".


2. If you can restart the slave database and need to skip certain error,



You can avoid the situation by skipping certain error in MySQL Slave replication.
You can find the Last_SQL_Errno by running show slave status \G; command on MySQL Slave DB Console. Last_SQL_Errno property will show the error number.


Sample Output

Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table iris_db.ussd_session; Can't find record in 'ussd_session', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.002868, end_log_pos 51592155

Then add the following line to /etc/mysql/my.cnf file and restart the MySQL slave server.

slave-skip-errors = <Last_SQL_Errno>:

Sample line according to the above example

slave-skip-errors = 1032:


Please note that skipping errors several times in MySQL replication environment is not a good practice and it may cause data inconsistency between master DB and slave DB/s.

No comments:

Post a Comment

How to send Slack notification using a Python script?

 In this article,  I am focussing on sending Slack notifications periodically based on the records in the database. Suppose we have to monit...