- Newest
- Most votes
- Most comments
This error is occurring because of a character set mismatch when resuming replication after the VPN issue. The error message "Unexpected charset utf8mb3 for column 'COLUMN_NAME'" indicates that AWS DMS is encountering an issue with character set conversion between your MariaDB source (utf8_general_ci) and Aurora MySQL target (utf8mb4_0900_ai_ci).
When data goes through AWS DMS, it undergoes multiple decoding and encoding phases - reading from the source, processing in the replication instance, and loading into the target. During normal operation, DMS might handle these conversions seamlessly, but when resuming after an interruption, it may be encountering issues with how the character sets are being interpreted.
The specific mention of "utf8mb3" in the error is significant because in MySQL/MariaDB, what was traditionally called "utf8" is actually a 3-byte implementation (now sometimes referred to as utf8mb3), while utf8mb4 is the full 4-byte UTF-8 implementation that can handle all Unicode characters.
To resolve this issue, you can add the CharsetMapping extra connection attribute to your source endpoint configuration. This explicitly tells DMS how to map character sets during the migration:
CharsetMapping=utf8,65001
Where 65001 is the code page identifier for UTF-8. This mapping helps DMS properly convert between the character sets when resuming replication.
You might need to restart your AWS DMS migration task from the beginning after adding this extra connection attribute. It's recommended to test this solution in a non-production environment first before applying it to your production migration.
Sources
Troubleshooting - Amazon Aurora MySQL Migration Handbook
Optimize data validation using AWS DMS validation-only tasks | AWS Database Blog
Collation/character set issues with AWS DMS when migrating from RDS Postgres to RDS MariaDB | AWS re:Post
Relevant content
- asked 2 years ago
