Blue/green upgrade from MySQL 5.7 to 8: cluster replication fails during nightly database updates

We are in the process of upgrading Aurora MySQL from 5.7 to 8.

Our current environment is a 5.7.mysql_aurora.2.11.2 cluster consisting of a writer and a reader replica; this becomes our blue environment. Our green environment is then 8.0.mysql_aurora.3.04.0.

I cleaned out all errors and all the warnings that concerned me. After that I managed to create a blue/green setup; the green cluster replicates correctly and has passed our acceptance tests. However, during a nightly run with some database-heavy operations, replication is lost with the following in our logs:

2023-07-29T03:14:01.294175Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:01.609660Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:01.911041Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:02.311538Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:02.745130Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:03.212729Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:03.781751Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:04.421851Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:05.025411Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:05.672708Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:06.351207Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:07.102832Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:07.898024Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) writing word node to FTS auxiliary index table. (fts0fts.cc:4225)
2023-07-29T03:14:08.154799Z 16 [Note] [MY-010559] [Repl] Multi-threaded slave statistics for channel '': seconds elapsed = 124; events assigned = 3432449; worker queues filled over overrun level = 10897; waited due a Worker queue full = 4494; waited due the total size = 0; waited at clock conflicts = 339498860300 waited (count) when Workers occupied = 54951 waited when Workers occupied = 0 (rpl_replica.cc:4578)
2023-07-29T03:16:14.514919Z 16 [Note] [MY-010559] [Repl] Multi-threaded slave statistics for channel '': seconds elapsed = 126; events assigned = 3436545; worker queues filled over overrun level = 10897; waited due a Worker queue full = 4494; waited due the total size = 0; waited at clock conflicts = 339577551100 waited (count) when Workers occupied = 54951 waited when Workers occupied = 0 (rpl_replica.cc:4578)
2023-07-29T03:18:54.796487Z 16 [Note] [MY-010559] [Repl] Multi-threaded slave statistics for channel '': seconds elapsed = 160; events assigned = 3623937; worker queues filled over overrun level = 12564; waited due a Worker queue full = 4495; waited due the total size = 0; waited at clock conflicts = 351810755900 waited (count) when Workers occupied = 107497 waited when Workers occupied = 0 (rpl_replica.cc:4578)
2023-07-29T03:19:48.602169Z 16 [ERROR] [MY-010411] [Repl] Transaction's sequence number is inconsistent with that of a preceding one: sequence_number (1) <= previous sequence_number (2548) (rpl_mta_submode.cc:669)
2023-07-29T03:19:48.602215Z 16 [Warning] [MY-010584] [Repl] Slave SQL for channel '': Coordinator thread of multi-threaded slave is being stopped in the middle of assigning a group of events; deferring to exit until the group completion ... , Error_code: MY-000000
2023-07-29T03:19:48.602232Z 16 [ERROR] [MY-010584] [Repl] Slave SQL for channel '': Cannot execute the current event group in the parallel mode. Encountered event Anonymous_Gtid, relay-log name /rdsdbdata/log/relaylog/relaylog.000114, position 279 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly.. Error_code: MY-001755
2023-07-29T03:19:48.602240Z 16 [Warning] [MY-010584] [Repl] Slave: Cannot execute the current event group in the parallel mode. Encountered event Anonymous_Gtid, relay-log name /rdsdbdata/log/relaylog/relaylog.000114, position 279 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly.. Error_code: MY-001755 (rpl_replica.cc:6956)
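
For anyone triaging a similar log, the following is a minimal Python sketch that tallies entries by level, error code, and subsystem, and picks out the first replication error. It assumes one log entry per line in the layout shown above (timestamp, thread id, [level], [MY-code], [subsystem], message). The function names and the `sample` list are made up for illustration and are not part of any MySQL or AWS tooling. It only organizes the log; it does not diagnose the root cause.

```python
import re
from collections import Counter

# Assumed layout of one log entry (taken from the excerpt above):
# <timestamp> <thread id> [<level>] [MY-<code>] [<subsystem>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>\d+)\s+\[(?P<level>\w+)\]\s+"
    r"\[(?P<code>MY-\d+)\]\s+\[(?P<subsystem>\w+)\]\s+(?P<msg>.*)$"
)


def parse_entries(lines):
    """Return a list of dicts for every line that matches the assumed layout."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def summarize(entries):
    """Count entries per (level, code, subsystem) triple."""
    return Counter((e["level"], e["code"], e["subsystem"]) for e in entries)


def first_replication_error(entries):
    """Return the first ERROR entry from the Repl subsystem, or None."""
    for entry in entries:
        if entry["subsystem"] == "Repl" and entry["level"] == "ERROR":
            return entry
    return None


# Two entries copied from the log above, used as hypothetical sample input.
sample = [
    "2023-07-29T03:14:01.294175Z 0 [ERROR] [MY-012298] [InnoDB] (Duplicate key) "
    "writing word node to FTS auxiliary index table. (fts0fts.cc:4225)",
    "2023-07-29T03:19:48.602169Z 16 [ERROR] [MY-010411] [Repl] Transaction's "
    "sequence number is inconsistent with that of a preceding one: "
    "sequence_number (1) <= previous sequence_number (2548) (rpl_mta_submode.cc:669)",
]

if __name__ == "__main__":
    parsed = parse_entries(sample)
    for key, count in summarize(parsed).items():
        print(key, count)
    print(first_replication_error(parsed))
```

Run against the full log, this separates the InnoDB full-text index errors from the replication errors. The timestamp of the first `Repl` error can then be lined up against the steps of the nightly job.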

After this nightly run, replication from the blue to the green environment stops. The blue cluster stays up to date and replicates normally to its reader instance, but the upgraded green cluster stops receiving updates from the blue cluster.

I have deleted and recreated the blue/green environment multiple times, and replication (only between blue and green) always fails during the write-heavy nightly operations. I know when it happens but am unable to find the exact cause.

I have almost hit the switchover button twice, but I worry that reverting is going to be complex if things do not work out.

Is this more likely to be an error in the cluster replication between 5.7 and 8.0, one that is unlikely to exist within the new MySQL 8 cluster if I switch over before the nightly run happens?

Should I be concerned about switching over at this stage?

No Answers
