How to use parallel load with a DynamoDB target in AWS DMS


Background

We are trying to migrate a relatively large table (200+ million rows) from an Aurora PostgreSQL source to a DynamoDB target. While reading the guide "Using an Amazon DynamoDB database as a target for AWS Database Migration Service", I came across this note:

Note: DMS assigns each segment of a table to its own thread for loading. Therefore, set ParallelLoadThreads to the maximum number of segments that you specify for a table in the source.

And,

Table-mapping settings for individual tables – Use table-settings rules to identify individual tables from the source that you want to load in parallel. Also, use these rules to specify how to segment the rows of each table for multithreaded loading. For more information, see Table and collection settings rules and operations.

Error

Following the guide, I defined a parallel-load table setting for the task (the Terraform configuration is at the end of the post). Applying the change fails with:

│ Error: updating DMS Replication Task (messaging-migrate-db-subscriptions-dms-full-load-and-cdc): InvalidParameterValueException: Error in mapping rules. Rule with ruleId = 3 failed validation. Table setting 'parallel-load' is not supported for the target endpoint type 'dynamodb'

So I see two possibilities:

  1. The guide is misleading, because segmentation cannot be defined with table-settings rules for a DynamoDB target.
  2. There is another way to define segmentation rules that I am not aware of.

Thanks!


Attachments

      {
        rule-type = "table-settings"
        rule-id   = 3
        rule-name = "ParallelLoadSettings"
        object-locator = {
          schema-name = "***"
          table-name  = "***"
        }
        type = "ranges"
        columns = [
          "id"
        ]
        boundaries = [
          ["10000000-00000000-00000000-00000000"],
          ["20000000-00000000-00000000-00000000"],
          ["30000000-00000000-00000000-00000000"],
          ["40000000-00000000-00000000-00000000"],
          ["50000000-00000000-00000000-00000000"],
          ["60000000-00000000-00000000-00000000"],
          ["70000000-00000000-00000000-00000000"],
          ["80000000-00000000-00000000-00000000"],
          ["90000000-00000000-00000000-00000000"],
          ["a0000000-00000000-00000000-00000000"],
          ["b0000000-00000000-00000000-00000000"],
          ["c0000000-00000000-00000000-00000000"],
          ["d0000000-00000000-00000000-00000000"],
          ["e0000000-00000000-00000000-00000000"],
          ["f0000000-00000000-00000000-00000000"],
        ]
      }
1 Answer

A DynamoDB target does not support the parallel-load table setting. Instead, it uses ParallelLoadThreads, which is defined in the task settings:

ParallelLoadThreads – Specifies the number of threads that AWS DMS uses to load each table into the target database. This parameter has maximum values for non-RDBMS targets. The maximum value for a DynamoDB target is 200. The maximum value for an Amazon Kinesis Data Streams, Apache Kafka, or Amazon OpenSearch Service target is 32. You can ask to have this maximum limit increased. ParallelLoadThreads applies to Full Load tasks. For information on the settings for parallel load of individual tables, see Table and collection settings rules and operations.

This setting applies to the following endpoint engine types:

  • DynamoDB
  • Amazon Kinesis Data Streams
  • Amazon MSK
  • Amazon OpenSearch Service
  • Amazon Redshift
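
To make this concrete for the Terraform setup in the question, here is a minimal sketch of how the thread count might be set through the task settings instead of a table-settings rule. It assumes ParallelLoadThreads sits under the TargetMetadata section of the task settings JSON; the local name dms_task_settings and the value 16 are illustrative and do not come from the thread.

```hcl
# Hypothetical sketch: task settings that raise the number of full-load threads.
# The local name and the value 16 are illustrative assumptions.
locals {
  dms_task_settings = jsonencode({
    TargetMetadata = {
      # Number of threads DMS uses to load each table into the target.
      # The documentation quoted above gives 200 as the maximum for DynamoDB.
      ParallelLoadThreads = 16
    }
  })
}

# On the existing aws_dms_replication_task resource, this would be passed as:
#   replication_task_settings = local.dms_task_settings
```

With this approach, the table-settings rule (rule-id 3 in the question) would be dropped from the table mappings, since that rule is what the validation error rejects. Check how a partial settings document interacts with your existing task settings before applying it.
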
AWS (EXPERT), answered 10 months ago
  • As far as I understand, this statement from the official guide is incorrect for a DynamoDB target:

    Use table-settings rules to identify individual tables from the source that you want to load in parallel. Also, use these rules to specify how to segment the rows of each table for multithreaded loading.

    Can the page be updated so that it is clear how DynamoDB parallelization works? It seems that ParallelLoadThreads parallelizes the full load automatically, without any table-settings rule.

  • The best way to get the documentation updated is to provide feedback directly on the page in question (bottom left). That opens a ticket with the owning team. Thanks.
