How do I troubleshoot high source latency on an AWS DMS task?


I observe high source latency on my AWS Database Migration Service (AWS DMS) task. What causes source latency during migration?

Short description

You can monitor your AWS DMS task using Amazon CloudWatch metrics. During migration, you might see source latency during the ongoing replication phase, also called change data capture (CDC), of an AWS DMS task. Use the CDCLatencySource CloudWatch metric to monitor the source latency for the task. You might see source latency on an AWS DMS task if:

  • The source database has limited resources.
  • The AWS DMS replication instance has limited resources.
  • The network speed between the source database and the AWS DMS replication instance is slow.
  • AWS DMS reads new changes from the transaction logs of the source database during ongoing replication.
  • AWS DMS task settings are inadequate or large objects (LOBs) are being migrated.
  • The Oracle source database used for the AWS DMS task is using LogMiner for ongoing replication.
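
As a starting point for tracking CDCLatencySource, the following minimal Python sketch builds a CloudWatch query for the metric and picks the peak value from the returned datapoints. The function names, the one-hour window, and the 60-second period are illustrative assumptions. The dimension names ReplicationInstanceIdentifier and ReplicationTaskIdentifier are assumed to match what your account reports, so confirm them (and the exact task identifier value) against the metric as shown in the CloudWatch console.

```python
from datetime import datetime, timedelta, timezone


def cdc_latency_source_query(instance_id, task_id, minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DMS",
        "MetricName": "CDCLatencySource",
        # Assumed dimension names; verify them in the CloudWatch console.
        "Dimensions": [
            {"Name": "ReplicationInstanceIdentifier", "Value": instance_id},
            {"Name": "ReplicationTaskIdentifier", "Value": task_id},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average", "Maximum"],
    }


def peak_latency_seconds(datapoints):
    """Return the highest Maximum value, or None when there is no data."""
    values = [p["Maximum"] for p in datapoints if "Maximum" in p]
    return max(values) if values else None


def fetch_peak_latency(instance_id, task_id):
    """Query CloudWatch; needs boto3, AWS credentials, and a Region."""
    import boto3  # third-party dependency, imported only when called

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **cdc_latency_source_query(instance_id, task_id)
    )
    return peak_latency_seconds(response["Datapoints"])
```

Calling `fetch_peak_latency` with your own replication instance and task identifiers returns the highest source latency seen in the last hour, or `None` if the task reported no datapoints.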

Resolution

The source database has limited resources

It's a best practice to use the native monitoring tools for your source DB engine. Use them to confirm that the database isn't experiencing a performance bottleneck, such as memory contention or I/O saturation.

The AWS DMS replication instance has limited resources

Monitor the replication instance metrics, such as CPUUtilization, FreeStorageSpace, and FreeableMemory. Confirm that the replication instance has enough resources to manage your task.
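
One way to act on these metrics is to compare recent samples against limits that you choose. The sketch below assumes you have already collected values for each metric (for example, from CloudWatch). The thresholds are hypothetical placeholders, not AWS recommendations, and the memory and storage metrics are assumed to be reported in bytes.

```python
GIB = 1024 ** 3

# Hypothetical limits; tune them for your instance class and workload.
THRESHOLDS = {
    "CPUUtilization": ("max", 80.0),          # percent
    "FreeableMemory": ("min", 1 * GIB),       # bytes (assumed unit)
    "FreeStorageSpace": ("min", 10 * GIB),    # bytes (assumed unit)
}


def find_resource_pressure(samples, thresholds=THRESHOLDS):
    """Return a list of findings for metrics that cross their limit.

    samples maps a metric name to a list of recent numeric values.
    """
    findings = []
    for metric, (kind, limit) in thresholds.items():
        values = samples.get(metric)
        if not values:
            findings.append(f"{metric}: no data")
        elif kind == "max" and max(values) > limit:
            findings.append(f"{metric}: peak {max(values)} is above {limit}")
        elif kind == "min" and min(values) < limit:
            findings.append(f"{metric}: low of {min(values)} is below {limit}")
    return findings
```

An empty list means none of the sampled metrics crossed its limit. Any finding is a prompt to look closer at the instance, not proof that it is the cause of the latency.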

The network speed between the source database and the AWS DMS replication instance is slow

By design, a single AWS DMS task can't use the full network bandwidth. If you have a busy production database with a high volume of changes, then you might need to increase the network bandwidth, for example by using AWS Direct Connect.

AWS DMS reads new changes from the transaction logs of the source database during ongoing replication

Depending on your source DB engine, the source transaction log can also contain uncommitted data. During ongoing replication, AWS DMS reads all incoming changes from the transaction logs, but forwards only committed changes to the target. Reading changes that aren't yet committed can result in source latency. To confirm that the task is progressing, monitor the replication task's CDC metrics and turn on detailed debug logging for the SOURCE_CAPTURE component.

When the source database writes a large dataset with few commits, AWS DMS continues reading from the transaction log but doesn't apply changes on the target until the entire transaction is committed. This can also cause source latency. As source latency increases, target latency increases as well.
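
To illustrate the logging step, the following sketch builds a task-settings fragment that raises the SOURCE_CAPTURE component to detailed debug severity. The function name is hypothetical, and the fragment shows only the Logging section. Check the key names and severity value against your task's current settings JSON before applying it.

```python
import json


def source_capture_debug_logging():
    """Return a task-settings fragment for detailed SOURCE_CAPTURE logs."""
    return {
        "Logging": {
            "EnableLogging": True,
            "LogComponents": [
                {
                    "Id": "SOURCE_CAPTURE",
                    "Severity": "LOGGER_SEVERITY_DETAILED_DEBUG",
                },
            ],
        }
    }


# Serialize the fragment so it can be supplied as task settings JSON.
settings_json = json.dumps(source_capture_debug_logging(), indent=2)
```

You could supply `settings_json` when you modify the task, for example as the `ReplicationTaskSettings` argument of the boto3 `modify_replication_task` call, typically while the task is stopped. Detailed debug logging is verbose, so consider returning to the default severity after you finish troubleshooting.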

AWS DMS task settings are inadequate or large objects (LOBs) are being migrated

AWS DMS migrates LOB data for ongoing replication in two phases. First, AWS DMS creates a new row in the target table with all columns except those that have LOBs. Then, AWS DMS updates the rows that have LOBs. If you have a source database that frequently updates tables that have LOB columns, then you might see source latency. For more information, see Migrating large binary objects (LOBs).

If the task has too many tables, or multiple tables contain LOB columns, then split your task into multiple tasks. Splitting works when you have sets of tables that don't participate in common transactions, and it can help increase performance. Transactional consistency is maintained only within a task, so it's important that tables in separate tasks don't participate in common transactions. Also, each task reads the transaction stream independently, so each additional task adds load on the source database. Avoid creating so many tasks that you put too much stress on the source. For more information, see the Best practices for AWS Database Migration Service.
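
The splitting rule can be sketched as a grouping problem: tables that share a transaction must land in the same task. The following example assumes you already know which table pairs are written together (the `shared_transactions` input is hypothetical and must come from your own knowledge of the workload). It groups tables with a small union-find and emits selection rules in the table-mapping JSON format for one group.

```python
import json


def group_tables(tables, shared_transactions):
    """Group tables so that tables sharing a transaction stay together.

    shared_transactions is an iterable of (table_a, table_b) pairs; every
    table named in a pair must also appear in tables.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Walk to the root, shortening the path along the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in shared_transactions:
        parent[find(a)] = find(b)

    groups = {}
    for t in tables:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())


def selection_rules(schema, tables):
    """Return table-mapping JSON that includes only the given tables."""
    rules = [
        {
            "rule-type": "selection",
            "rule-id": str(i),
            "rule-name": str(i),
            "object-locator": {"schema-name": schema, "table-name": t},
            "rule-action": "include",
        }
        for i, t in enumerate(tables, start=1)
    ]
    return json.dumps({"rules": rules}, indent=2)
```

Each group returned by `group_tables` is a candidate for its own task. Whether to give every group a separate task is a judgment call, because each task reads the source transaction stream on its own.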

The Oracle source database used for the AWS DMS task is using LogMiner for ongoing replication

If your source database generates a large number of redo logs, then use the Binary Reader method for ongoing replication. You can also use this method if the source database is using Oracle Automatic Storage Management (ASM). For more information, see Using Oracle LogMiner or AWS DMS Binary Reader for CDC.
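
Switching to Binary Reader is typically done through extra connection attributes on the Oracle source endpoint. The sketch below assumes the attribute pair `useLogMinerReader=N;useBfile=Y;` applies to your AWS DMS version, so confirm it in the linked documentation. The helper name is hypothetical, and ASM sources need additional attributes that aren't shown here.

```python
def binary_reader_attributes(existing=""):
    """Return extra connection attributes with Binary Reader turned on.

    existing is the endpoint's current attribute string, for example
    "addSupplementalLogging=Y;". Other attributes are preserved.
    """
    attrs = {}
    for part in existing.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    # Assumed attribute names for selecting Binary Reader over LogMiner.
    attrs["useLogMinerReader"] = "N"
    attrs["useBfile"] = "Y"
    return ";".join(f"{k}={v}" for k, v in attrs.items()) + ";"
```

The returned string is meant for the endpoint's extra connection attributes field. Review it before you apply it, because the helper overwrites any existing `useLogMinerReader` or `useBfile` value.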


Related information

Improving the performance of an AWS DMS migration

AWS OFFICIAL
Updated 2 years ago
2 Comments

It may be worthwhile expanding on the "FreeStorageSpace" point.

When the DMS replication instance gets to only 10% free space remaining, the replication task will pause reading from the source, causing source latency.

AWS
replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago