DynamoDB table purge: deleting 400M records from a 1-billion-item table, and BatchWriteItem throttling

Example scenario: a DynamoDB table has 1 billion items. The company splits, we make a copy of the table for the other company, and we now need to delete 400M of the 1B items from our table. Neither TTL nor deleting and recreating the table is an option, because we need to keep our part of the records. We looked at using a parallel scan to read the table and pushing the items to delete into BatchWriteItem calls. Is there a better option to consider?

Side note: initial tests with parallel scan and BatchWriteItem run into throttling at the single-partition limit of 1,000 WCU, even though the table as a whole has plenty of WCU capacity. Are there other suggested approaches or thoughts for addressing the single-partition WCU issue?

The table has customerid (number) as the partition key and LOBid (string) as the sort key, plus other attributes; each item is under 1 KB.

AWS
asked 4 years ago · 1.3K views
1 Answer
2
Accepted Answer

With a parallel scan, each thread scans a contiguous segment of the table. If the items to be deleted were evenly distributed across the table's key space, you would encounter table-level throttling rather than partition-level throttling. The fact that you hit partition-level throttling without exceeding the provisioned WCU indicates that the items to be deleted may reside in only a few partitions. As a result, almost all items in a given BatchWriteItem API call belong to the same partition, and you may be deleting from only a few partitions at a time.
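To make the segment behaviour concrete, here is a minimal sketch of one parallel-scan worker in Python. It assumes a boto3-style low-level DynamoDB client passed in as `client`; the function name `scan_segment_keys` and the `should_delete` predicate are hypothetical, since the thread does not say how the other company's items are identified. Only the key attributes named in the question (customerid, LOBid) are projected.

```python
from typing import Callable, Dict, Iterator

Key = Dict[str, dict]  # key in DynamoDB attribute-value form, e.g. {"customerid": {"N": "42"}, ...}


def scan_segment_keys(
    client,
    table_name: str,
    segment: int,
    total_segments: int,
    should_delete: Callable[[Key], bool],
) -> Iterator[Key]:
    """Yield the keys of items to delete from one segment of a parallel scan.

    `client` is assumed to expose a boto3-style `scan(**kwargs)` method.
    `should_delete` is a hypothetical caller-supplied predicate.
    """
    kwargs = {
        "TableName": table_name,
        "Segment": segment,                # which slice this worker reads
        "TotalSegments": total_segments,   # how many slices the scan is split into
        "ProjectionExpression": "#pk, #sk",  # fetch only the key attributes
        "ExpressionAttributeNames": {"#pk": "customerid", "#sk": "LOBid"},
    }
    while True:
        page = client.scan(**kwargs)
        for key in page.get("Items", []):
            if should_delete(key):
                yield key
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # this segment is exhausted
        kwargs["ExclusiveStartKey"] = last_key  # continue from where the page ended
```

Each worker would call this with its own `segment` value (0 to `total_segments - 1`), for example with `client = boto3.client("dynamodb")`. Because every worker walks its own contiguous slice, keys arrive roughly in storage order, which is why consecutive deletes tend to land on the same partition.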

One way to improve performance is to shuffle the items before deleting them. That is, the parallel scan threads push the records to be deleted into a list. After the parallel scan finishes, randomize the order of the items in the list. The delete threads then retrieve items from the list for deletion. With this approach, you increase the likelihood that the items in each BatchWriteItem API call are spread across multiple partitions, which takes advantage of the write capacity of multiple partitions.
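The shuffle-then-delete step can be sketched as follows. This is an illustration under assumptions, not a tuned implementation: `client` is assumed to be a boto3-style low-level DynamoDB client, the names `shuffled_batches` and `delete_batch` are hypothetical, and the backoff values are arbitrary. The batch size of 25 is the BatchWriteItem per-call limit, and requests returned in `UnprocessedItems` are resubmitted.

```python
import random
import time
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Key = Dict[str, dict]  # key in DynamoDB attribute-value form
BATCH_LIMIT = 25       # BatchWriteItem accepts at most 25 put/delete requests per call


def shuffled_batches(
    keys: Iterable[Key], batch_size: int = BATCH_LIMIT, seed: Optional[int] = None
) -> Iterator[List[Key]]:
    """Randomize the order of the collected keys, then yield them in batches."""
    pool = list(keys)                  # assumes the collected keys fit in memory
    random.Random(seed).shuffle(pool)  # breaks up runs of keys from one partition
    for start in range(0, len(pool), batch_size):
        yield pool[start:start + batch_size]


def delete_batch(
    client,
    table_name: str,
    batch: List[Key],
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Delete one batch, resubmitting unprocessed requests with jittered backoff.

    Returns the number of delete requests still unprocessed after the last attempt.
    """
    pending = [{"DeleteRequest": {"Key": key}} for key in batch]
    for attempt in range(max_attempts):
        response = client.batch_write_item(RequestItems={table_name: pending})
        pending = response.get("UnprocessedItems", {}).get(table_name, [])
        if not pending:
            return 0
        if attempt < max_attempts - 1:
            # Exponential backoff capped at 5 s, with full jitter (illustrative values).
            sleep(min(0.05 * 2 ** attempt, 5.0) * random.random())
    return len(pending)
```

A delete worker would then loop over `shuffled_batches(all_keys)` and call `delete_batch(client, table_name, batch)` for each batch. With 400M keys the full list may not fit in one process's memory; shuffling within large chunks, or spilling keys to storage first, is one possible variation, though it only approximates a full shuffle.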

AWS
answered 4 years ago
EXPERT
reviewed 7 months ago
