Last week I wrote about a strange issue with MySQL replication that I thought I wouldn't see again for a long time. Well, around 1:00am this morning another MySQL server (in a different cluster from the previous one) had the exact same issue. This time, my first step was to reboot the machine to see whether that would fix the problem and spare me from spending a lot of time exporting and importing data. The reboot fixed the constant restarting of the MySQL server, but the replication thread failed to start. In fact, the slave thread was failing to start on both machines. Issuing ‘start slave’ produced this in the error log of the machine with the problem:
060509 1:32:09 [Note] Slave I/O thread: connected to master 'replxxx@xxx-slave1:3306', replication started in log 'xxx-slave1-bin.004981' at position 79
060509 1:32:09 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file (server_errno=1236)
060509 1:32:09 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
060509 1:32:09 [ERROR] Slave I/O thread exiting, read up to log 'xxx-slave1-bin.004981', position 79
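Error 1236 with this message means the master could not find the log file the slave asked for in its own binary log index file. As a rough way to confirm that, here is a minimal Python sketch that pulls the requested log name and position out of the slave's "[Note]" line and checks the name against the contents of the master's index file. The function names (`requested_binlog`, `listed_in_index`) are hypothetical helpers made up for illustration, and the sketch assumes the index file lists one binary log name per line, possibly with a leading path.

```python
import os
import re

# Matches the "[Note] Slave I/O thread: connected to master ..." line shown above.
START_RE = re.compile(r"replication started in log '([^']+)' at position (\d+)")


def requested_binlog(log_line):
    """Return (log_name, position) the slave asked the master for, or None."""
    match = START_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def listed_in_index(log_name, index_lines):
    """Return True if log_name appears among the index file's entries.

    Assumption: one entry per line, optionally prefixed with a path
    such as "./", so only the base name is compared.
    """
    names = {
        os.path.basename(line.strip())
        for line in index_lines
        if line.strip()
    }
    return log_name in names
```

To use it, pass the "[Note]" line from the slave's error log to `requested_binlog`, then pass the returned name and the lines read from the master's index file to `listed_in_index`. A result of `False` would be consistent with the error above, though it does not say why the entry is missing.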
This time I was determined to fix the problem without dumping data, so I Googled a bit for the error and found this great post:
