How do I optimize my full load and CDC AWS DMS tasks for data migration and continuous replication?

I want to optimize my full load and change data capture (CDC) AWS Database Migration Service (AWS DMS) tasks to improve performance and resolve replication issues.

Resolution

Optimize full load settings

Configure the target table preparation mode

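Set the TargetTablePrepMode option based on your migration requirements. Use DO_NOTHING to leave existing target tables and their data unchanged, DROP_AND_CREATE to drop and recreate the target tables, or TRUNCATE_BEFORE_LOAD to truncate the target tables and keep their metadata.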

Delay primary key or unique index creation

To improve the initial load performance, set the CreatePkAfterFullLoad parameter to true to delay primary key or unique index creation until after the full load completes. 

Use parallel load

To increase the speed of your migration, use parallel load.

For table-level parallelism, increase the MaxFullLoadSubTasks value so that more tables can load simultaneously. The default value is 8. You can set the value to a maximum of 49.

For large table optimization, modify the ParallelLoadThreads and ParallelLoadBufferSize settings.

To increase the number of threads for each table, increase the ParallelLoadThreads setting.

The ParallelLoadThreads setting applies only to the following endpoint engine types:

  • Amazon Redshift
  • Amazon Kinesis Data Streams
  • Amazon OpenSearch Service
  • Amazon Managed Streaming for Apache Kafka (Amazon MSK)
  • Amazon DynamoDB

Note: AWS DMS supports ParallelLoadThreads for MySQL as an extra connection attribute (ECA) but not as a task setting.

To buffer more rows before the task writes to the target, increase the ParallelLoadBufferSize setting. The default value is 50. The maximum value is 1,000.

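Example task settings:

{
  "FullLoadSettings": {
    "TargetTablePrepMode": "DROP_AND_CREATE",
    "CreatePkAfterFullLoad": true,
    "MaxFullLoadSubTasks": 16
  },
  "TargetMetadata": {
    "ParallelLoadThreads": 8,
    "ParallelLoadBufferSize": 1000,
    "ParallelLoadQueuesPerThread": 1
  }
}

Note: Set the parallel load type, such as ranges, in a table-settings rule in your table mappings and not in the task settings.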

For more information, see Using parallel load for selected tables, views, and collections.
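
The following Python sketch shows one way to generate the full load settings and a range-based parallel load rule as JSON. The function names, the sales.orders table, and the boundary values are hypothetical. The sketch only builds the documents and doesn't call AWS DMS.

```python
import json

VALID_PREP_MODES = {"DO_NOTHING", "DROP_AND_CREATE", "TRUNCATE_BEFORE_LOAD"}


def full_load_settings(prep_mode="DROP_AND_CREATE", sub_tasks=8, pk_after_load=True):
    """Build the FullLoadSettings section and reject out-of-range values."""
    if prep_mode not in VALID_PREP_MODES:
        raise ValueError(f"Unknown TargetTablePrepMode: {prep_mode}")
    if not 1 <= sub_tasks <= 49:  # 49 is the maximum stated in this article
        raise ValueError("MaxFullLoadSubTasks must be between 1 and 49")
    return {
        "FullLoadSettings": {
            "TargetTablePrepMode": prep_mode,
            "CreatePkAfterFullLoad": pk_after_load,
            "MaxFullLoadSubTasks": sub_tasks,
        }
    }


def range_parallel_load_rule(schema, table, column, boundaries, rule_id="2"):
    """Build a table-settings rule that loads one table in column ranges."""
    return {
        "rule-type": "table-settings",
        "rule-id": rule_id,
        "rule-name": rule_id,
        "object-locator": {"schema-name": schema, "table-name": table},
        "parallel-load": {
            "type": "ranges",
            "columns": [column],
            # One inner list per boundary, with one value per column.
            "boundaries": [[str(b)] for b in boundaries],
        },
    }


if __name__ == "__main__":
    print(json.dumps(full_load_settings(sub_tasks=16), indent=2))
    # Hypothetical table and boundaries, for illustration only.
    rule = range_parallel_load_rule("sales", "orders", "order_id", [1000000, 2000000])
    print(json.dumps(rule, indent=2))
```

Add the table-settings rule to the rules array of the task's table mappings, next to the selection rule for the same table.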

Increase the commit rate

The commit rate is the maximum number of records that AWS DMS can transfer together.

To increase the commit rate, complete the following steps:

  1. Open the AWS DMS console.
  2. In the navigation pane, choose Migrate data, and then choose Database migration tasks.
  3. Select your migration task.
  4. On the Details page, choose Actions, and then choose Modify.
  5. Under Task settings, increase the CommitRate setting based on the number of records that you want to load in parallel.
    Note: The default is 10000. The maximum value that you can set is 50000.
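
If you edit the task settings as JSON instead of in the console, then a sketch such as the following can check the value before you apply it. The helper name is hypothetical. The limits are the default and maximum values from the preceding note.

```python
import json

DEFAULT_COMMIT_RATE = 10000
MAX_COMMIT_RATE = 50000


def commit_rate_settings(rate=DEFAULT_COMMIT_RATE):
    """Return a task-settings JSON string that sets only CommitRate."""
    if not 1 <= rate <= MAX_COMMIT_RATE:
        raise ValueError(f"CommitRate must be between 1 and {MAX_COMMIT_RATE}")
    return json.dumps({"FullLoadSettings": {"CommitRate": rate}}, indent=2)


if __name__ == "__main__":
    # The printed JSON is what you would paste into the task settings editor.
    print(commit_rate_settings(30000))
```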

Configure LOBs

Calculate the maximum large object (LOB) column lengths in your source database.

To prevent data truncation, configure your AWS DMS task LOB settings.

For more information, see How can I improve the speed of an AWS DMS task that has LOB data?
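
As a minimal sketch, the following Python function converts the largest LOB length that you measured (in bytes) into a limited LOB mode configuration. It assumes that you use limited LOB mode and that LobMaxSize is expressed in kilobytes. The function name and the sample length are hypothetical.

```python
import math


def limited_lob_settings(max_lob_bytes):
    """Size LobMaxSize (KB) so that the largest measured LOB isn't truncated."""
    if max_lob_bytes < 0:
        raise ValueError("max_lob_bytes must not be negative")
    # Round up to whole kilobytes and keep at least 1 KB.
    lob_max_kb = max(1, math.ceil(max_lob_bytes / 1024))
    return {
        "TargetMetadata": {
            "SupportLobs": True,
            "FullLobMode": False,
            "LimitedSizeLobMode": True,
            "LobMaxSize": lob_max_kb,
        }
    }


if __name__ == "__main__":
    # Hypothetical result of a MAX(length) query on the source LOB columns.
    print(limited_lob_settings(100000))
```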

Split large migrations

For migrations with many tables, split your migration into multiple tasks to reduce memory pressure on individual tasks.
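
One way to plan the split is to distribute the tables across tasks and generate one table mapping for each task. The following Python sketch uses a simple round-robin distribution, which is an assumption and doesn't account for table size. The function, schema, and table names are hypothetical.

```python
def split_into_task_mappings(schema, tables, task_count):
    """Distribute tables round-robin and return one table mapping per task."""
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    groups = [tables[i::task_count] for i in range(task_count)]
    mappings = []
    for group in groups:
        if not group:  # fewer tables than tasks
            continue
        rules = [
            {
                "rule-type": "selection",
                "rule-id": str(n),
                "rule-name": str(n),
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
            for n, table in enumerate(group, start=1)
        ]
        mappings.append({"rules": rules})
    return mappings


if __name__ == "__main__":
    plan = split_into_task_mappings("sales", ["a", "b", "c", "d", "e"], 2)
    for number, mapping in enumerate(plan, start=1):
        names = [r["object-locator"]["table-name"] for r in mapping["rules"]]
        print(f"task {number}: {names}")
```

In practice, you might group tables by size or by foreign key relationships so that related tables stay in the same task.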

Optimize CDC settings

Use AWS DMS batch apply

To improve CDC replication performance and reduce high target latency, use AWS DMS batch apply. For more information about the batch apply setting, see Change processing tuning settings.
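
The following Python sketch builds the task settings that turn on batch apply. The function name is hypothetical, and the default arguments mirror what are understood to be the service defaults (1 second, 30 seconds, and 500 MB). Treat the values as a starting point and not as a recommendation.

```python
import json


def batch_apply_settings(timeout_min=1, timeout_max=30, memory_limit_mb=500):
    """Build task settings that turn on batch apply for CDC."""
    if timeout_min > timeout_max:
        raise ValueError("BatchApplyTimeoutMin can't exceed BatchApplyTimeoutMax")
    return {
        "TargetMetadata": {"BatchApplyEnabled": True},
        "ChangeProcessingTuning": {
            # Seconds to wait between batch applies (minimum and maximum).
            "BatchApplyTimeoutMin": timeout_min,
            "BatchApplyTimeoutMax": timeout_max,
            # Memory (MB) that preprocessing can use in batch apply mode.
            "BatchApplyMemoryLimit": memory_limit_mb,
        },
    }


if __name__ == "__main__":
    print(json.dumps(batch_apply_settings(memory_limit_mb=1000), indent=2))
```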

Increase memory to reduce swap file usage

Your task might experience a "Reading from source paused" error because the SORTER component swap files reached the maximum storage quota. If long transactions exceed the MemoryLimitTotal and MemoryKeepTime values, then the transactions in memory move to swap files on the replication instance disk.

To resolve this issue, increase the MemoryLimitTotal and MemoryKeepTime values. For more information, see What are swap files and why do the files use space on my AWS DMS instance?
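
As an illustration, the following Python sketch raises both values in an existing task-settings document and leaves the other settings unchanged. The function name and the sample values are hypothetical. MemoryLimitTotal is in megabytes and MemoryKeepTime is in seconds.

```python
import copy


def raise_sorter_memory(settings, limit_total_mb, keep_time_seconds):
    """Return a copy of the task settings with higher memory limits for CDC."""
    updated = copy.deepcopy(settings)  # don't modify the caller's document
    tuning = updated.setdefault("ChangeProcessingTuning", {})
    tuning["MemoryLimitTotal"] = limit_total_mb
    tuning["MemoryKeepTime"] = keep_time_seconds
    return updated


if __name__ == "__main__":
    current = {"ChangeProcessingTuning": {"MemoryLimitTotal": 1024, "MemoryKeepTime": 60}}
    # Hypothetical new values; size them to the memory of your instance class.
    print(raise_sorter_memory(current, 4096, 180))
```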

Configure task-level settings

For full-load and CDC tasks, use one of the following parameters for the Stop task after full load completes setting:

  • Set StopTaskCachedChangesNotApplied to true to stop the task after full load but before AWS DMS applies cached changes. You can then create secondary indexes on the target to speed up cached change application.
  • Set StopTaskCachedChangesApplied to true to stop the task after AWS DMS applies cached changes. This provides a consistent data state to add foreign keys without constraint violations.

For more information, see When can I add secondary objects to a target database during AWS DMS migration?
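
The following Python sketch builds the FullLoadSettings section for either option. The function name and its argument are hypothetical. The sketch sets exactly one of the two parameters to true, which matches the guidance to use one of them.

```python
def stop_after_full_load(before_cached_changes):
    """Stop the task either before or after cached changes are applied."""
    return {
        "FullLoadSettings": {
            # True: stop right after full load, before cached changes.
            "StopTaskCachedChangesNotApplied": before_cached_changes,
            # True: stop after cached changes are applied.
            "StopTaskCachedChangesApplied": not before_cached_changes,
        }
    }


if __name__ == "__main__":
    # Stop before cached changes so that you can add secondary indexes.
    print(stop_after_full_load(before_cached_changes=True))
```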

Monitor CloudWatch metrics

Monitor replication instances

If you use provisioned AWS DMS tasks, then the replication instances are Amazon Elastic Compute Cloud (Amazon EC2) instances. EC2 replication instances use resources such as CPU, RAM, and storage based on the instance class and allocated storage on the instance.

To avoid resource constraints on the replication instance, use Amazon CloudWatch to monitor the following replication instance metrics:

  • If FreeMemory is low, then increase the instance size or choose a memory-optimized replication instance.
  • If CPUUtilization is high, then tune the task settings.
  • If SwapUsage is high, then monitor FreeMemory on the instance.
  • If the sum of the values for ReadIOPS and WriteIOPS is higher than the baseline performance of the volume, then the volume consumes its burst balance. To raise the baseline, increase the allocated storage.
  • If ReadThroughput and WriteThroughput show a low throughput rate, then check your network bandwidth.
  • If FreeStorageSpace is low, for example when more than 90% of the allocated storage is in use, then increase storage or clean up your logs.
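
The checks in the preceding list can be sketched as a small Python function that reviews metric values that you already retrieved from CloudWatch. The function name and every threshold (10% free memory, 80% CPU, swap above 25% of memory, 10% free storage) are assumptions for illustration and not AWS recommendations.

```python
def review_instance_metrics(metrics, total_memory_bytes, allocated_storage_bytes, baseline_iops):
    """Return a list of findings for a replication instance."""
    findings = []
    if metrics["FreeMemory"] < 0.10 * total_memory_bytes:
        findings.append("FreeMemory is low: use a larger or memory-optimized instance")
    if metrics["CPUUtilization"] > 80:
        findings.append("CPUUtilization is high: tune the task settings")
    if metrics["SwapUsage"] > 0.25 * total_memory_bytes:
        findings.append("SwapUsage is high: monitor FreeMemory")
    if metrics["ReadIOPS"] + metrics["WriteIOPS"] > baseline_iops:
        findings.append("IOPS exceed the volume baseline: check the burst balance")
    if metrics["FreeStorageSpace"] < 0.10 * allocated_storage_bytes:
        findings.append("FreeStorageSpace is low: add storage or clean up logs")
    return findings


if __name__ == "__main__":
    GIB = 1024 ** 3
    sample = {  # hypothetical values
        "FreeMemory": 1 * GIB,
        "CPUUtilization": 35,
        "SwapUsage": 0,
        "ReadIOPS": 120,
        "WriteIOPS": 90,
        "FreeStorageSpace": 5 * GIB,
    }
    for finding in review_instance_metrics(sample, 16 * GIB, 100 * GIB, 300):
        print(finding)
```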

Monitor replication tasks

For replication tasks, use CloudWatch to monitor the following metrics:

  • FullLoadThroughputBandwidthTarget
  • FullLoadThroughputRowsTarget
  • CDCIncomingChanges
  • CDCLatencySource
  • CDCLatencyTarget
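
To decide where to start when latency is high, you can compare the two latency metrics. The following Python sketch assumes that CDCLatencyTarget includes the delay that CDCLatencySource already reports, so it checks the source side first. The function name and the 60-second threshold are hypothetical.

```python
def locate_latency(source_latency_seconds, target_latency_seconds, threshold_seconds=60):
    """Return where to begin troubleshooting: 'source', 'target', or 'ok'."""
    if source_latency_seconds > threshold_seconds:
        # Changes are captured late, so target latency is also affected.
        return "source"
    if target_latency_seconds > threshold_seconds:
        # Capture keeps up, but changes are applied late on the target.
        return "target"
    return "ok"


if __name__ == "__main__":
    print(locate_latency(5, 900))
```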

If your task experiences high latency, then see the following AWS Knowledge Center articles:

For more information, see AWS DMS key troubleshooting metrics and performance enhancers.

Turn on debug logging

To troubleshoot performance issues, turn on detailed debug logging.
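
As a minimal sketch, the following Python function builds the Logging section of the task settings with detailed debug severity for selected components. The function name is hypothetical, and the component list is a subset of the available log components. Detailed debug logging can produce large logs, so you might limit it to the components that you investigate.

```python
import json

# Subset of AWS DMS log component IDs used for this illustration.
KNOWN_COMPONENTS = {
    "SOURCE_UNLOAD",
    "SOURCE_CAPTURE",
    "TARGET_LOAD",
    "TARGET_APPLY",
    "TASK_MANAGER",
}


def debug_logging_settings(components=("SOURCE_CAPTURE", "TARGET_APPLY")):
    """Build Logging settings with detailed debug severity per component."""
    unknown = [c for c in components if c not in KNOWN_COMPONENTS]
    if unknown:
        raise ValueError(f"Components not in this sketch's list: {unknown}")
    return {
        "Logging": {
            "EnableLogging": True,
            "LogComponents": [
                {"Id": c, "Severity": "LOGGER_SEVERITY_DETAILED_DEBUG"}
                for c in components
            ],
        }
    }


if __name__ == "__main__":
    print(json.dumps(debug_logging_settings(), indent=2))
```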

Related information

Best practices for AWS Database Migration Service

Troubleshooting migration tasks in AWS Database Migration Service