What is the best practice to archive items to S3 using DynamoDB TTL?


I am considering ways to archive items to S3 using DynamoDB TTL. According to this official AWS blog, the architecture is

**DynamoDB Stream => Lambda => Firehose => S3**

Why not write directly from Lambda to S3?

**DynamoDB Stream => Lambda => S3**

Thank you!

hai
Asked 2 years ago, 1667 views
3 Answers
Accepted Answer

**DynamoDB Stream => Lambda => Firehose => S3**

This would be the recommended method. As Marco has already mentioned, Firehose acts as a buffer. If the Lambda takes a batch of 100 records, for example, writing directly would result in 100 PutObject requests to S3. Using Firehose instead, it merges the records into larger files and can also handle partitioning, which results in cheaper PUTs to S3 and more efficient retrieval if needed. See CompressionFormat here.
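To illustrate, the Lambda in this pipeline only needs to forward the expired items to Firehose; the merging and compression happen downstream. A minimal sketch of the record-shaping step (the stream name and the `Records` call shape in the comment are assumptions, not from the blog):

```python
import json

def stream_records_to_firehose_batch(event):
    """Shape DynamoDB Stream records into Firehose Record dicts.

    Each item becomes one newline-terminated JSON line, so Firehose
    can concatenate many records into a single large S3 object.
    """
    batch = []
    for record in event.get("Records", []):
        # For TTL deletions, the expired item is in OldImage.
        item = record["dynamodb"].get("OldImage", {})
        batch.append({"Data": (json.dumps(item) + "\n").encode("utf-8")})
    return batch

# Inside the Lambda handler you would then call (sketch, not executed here;
# "ttl-archive-stream" is a placeholder name):
#   boto3.client("firehose").put_record_batch(
#       DeliveryStreamName="ttl-archive-stream", Records=batch)

# Example stream event with one expired item:
event = {"Records": [{"dynamodb": {"OldImage": {"pk": {"S": "user#1"}}}}]}
batch = stream_records_to_firehose_batch(event)
print(len(batch))
```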

Furthermore, you can make use of Lambda Event Filters to invoke your function only when an item is evicted by an expiring TTL. I wrote a short blog about it here. It highlights how you can make the entire process more efficient.
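For context, TTL deletions can be distinguished from ordinary deletes because DynamoDB tags them with a service principal in the stream record. A sketch of such a filter (the ARNs and function name in the comment are placeholders; the pattern below is the documented shape for TTL-expired items):

```python
import json

# Filter pattern that matches only items removed by the TTL process:
# DynamoDB sets userIdentity.principalId to "dynamodb.amazonaws.com"
# for TTL deletions, so ordinary writes and manual deletes are skipped.
ttl_filter_pattern = json.dumps({
    "userIdentity": {
        "type": ["Service"],
        "principalId": ["dynamodb.amazonaws.com"],
    }
})

# Sketch of attaching the filter when wiring the stream to Lambda
# (placeholder ARN and function name, not executed here):
#   boto3.client("lambda").create_event_source_mapping(
#       EventSourceArn="arn:aws:dynamodb:...:table/MyTable/stream/...",
#       FunctionName="archive-to-firehose",
#       StartingPosition="TRIM_HORIZON",
#       FilterCriteria={"Filters": [{"Pattern": ttl_filter_pattern}]},
#   )

print(ttl_filter_pattern)
```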

AWS
Expert
Answered 2 years ago
AWS
Expert
Reviewed 2 years ago
  • Thank you @Leeroy. Would it be possible to merge 100 records inside my Lambda function and do one PutObject to S3?

  • Yes, you can do that, but you will have to write the code to compress the items yourself, and you will also lose the partitioning ability.
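To illustrate the manual approach discussed above, merging a batch into one gzipped NDJSON payload before a single PutObject might look like the sketch below (the bucket and key in the comment are placeholders; this is the do-it-yourself path, not the recommended Firehose one):

```python
import gzip
import json

def merge_and_compress(items):
    """Merge a batch of items into one gzipped NDJSON payload."""
    ndjson = "".join(json.dumps(item) + "\n" for item in items)
    return gzip.compress(ndjson.encode("utf-8"))

# A single PutObject would then upload the payload (sketch, not executed):
#   boto3.client("s3").put_object(
#       Bucket="my-archive-bucket",
#       Key="archive/batch-0001.json.gz",
#       Body=body,
#       ContentEncoding="gzip")

items = [{"pk": f"user#{i}"} for i in range(100)]
body = merge_and_compress(items)
print(len(body))
```

Note that unlike Firehose, nothing here partitions the keys by date or shard; you would have to build the key scheme yourself.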


I also reached out to the author of the blog post, and he gave a similar answer with more detail, which I would like to share:

> Lambda has a max invocation payload of 6 MB, so even if your batch size is 10,000, with larger records you'll be limited by the 6 MB limit. With Firehose it's easy to get objects in the 100s of MB range on a busy stream.

Thank you all! I feel great support here and have learned a lot!

hai
Answered 2 years ago

Firehose acts as a buffer. Writing directly to S3 can result in a lot of requests, and you also pay for the number of requests to S3. Without that buffering, a huge portion of your bill can end up being just the S3 requests.
This happened to me with an IoT example a long time ago.

AWS
Marco
Answered 2 years ago
AWS
Expert
Reviewed 2 years ago
  • @Marco Thank you! However, according to this official AWS blog, a batch size of 100 is configured between the DynamoDB stream and Lambda. Does that mean Lambda receives a batch of 100 records and can then write those 100 records directly to S3?
