2 Answers
4
- Without LSI, is there any size limitation for each partition?
- A physical DynamoDB partition still has a maximum size of 10GB, but an item collection (the items that share the same partition key) can span multiple partitions, so it has no size limit. An LSI prevents an item collection from spanning multiple partitions, because DynamoDB must keep the item collection available for strongly consistent reads, which would be impossible to do with good performance if the collection spanned multiple partitions.
- When I run a partition key + sort key query (e.g. a primary key hit), how much impact does a large volume of partition data have? For example, if a partition holds 1TB, would it be much slower than a partition holding 10GB?
- While there is no limit on how much data you can store in an item collection, you wouldn't want to read GBs or TBs of data from DynamoDB. DynamoDB enforces a 1MB page size limit: a single request returns at most 1MB, and to obtain more you have to paginate.
- Ideally, when you have such large amounts of data in a single item collection, you rely on the sort key to query it efficiently. Imagine an IoT device that updates its status every second: over the course of a year you would have roughly 30M items. It's very unlikely that your use case needs the data from a year ago; you more likely want to know what the device did in the last minute/hour/day, so you can use the sort key to narrow 30M items down to 60/3,600/86,400 items, which is far more efficient and performant.
- My GSI keys are `content_type` as partition key and `end_time` as sort key, so customers can search documents by content type and `end_time` range. For each content type I might have 0.5 billion entries per day. In the GSI that is about 0.2KB * 0.5 billion = 100GB per day and about 36TB per year, and the customer wants to keep the data for at least a year. If I don't split the table into multiple time-series tables, is DynamoDB good enough to hold 36TB as one item collection in one table and still provide fast access for partition key + small sort key range queries?
- Yes, DynamoDB has no problem storing TBs of data for a single partition key, and performance remains unchanged as long as your sort key condition narrows the search sufficiently. But keep in mind that even a query for a single day tries to fetch 100GB of data, which could be a slow request.
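
The sort key narrowing and 1MB pagination described in this answer can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the table name `documents`, the index name `content_type-end_time-index`, and storing `end_time` as a numeric epoch timestamp are hypothetical and do not come from the thread. `client` is expected to behave like the low-level boto3 DynamoDB client (`boto3.client("dynamodb")`), whose `query` call accepts the parameters used here.

```python
from typing import Any, Dict, Iterator, Optional

# Hypothetical names: the thread does not give the table or index name.
TABLE_NAME = "documents"
INDEX_NAME = "content_type-end_time-index"


def iter_range(client: Any, content_type: str, start: int, end: int,
               max_items: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield items for one content_type whose end_time lies in [start, end].

    A single query call returns at most 1MB of data, so the loop keeps
    following LastEvaluatedKey until DynamoDB stops returning one.
    """
    if max_items is not None and max_items <= 0:
        return
    kwargs: Dict[str, Any] = {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "#ct = :ct AND #et BETWEEN :lo AND :hi",
        "ExpressionAttributeNames": {"#ct": "content_type", "#et": "end_time"},
        "ExpressionAttributeValues": {
            ":ct": {"S": content_type},
            ":lo": {"N": str(start)},  # assumes end_time is a numeric epoch
            ":hi": {"N": str(end)},
        },
    }
    yielded = 0
    while True:
        page = client.query(**kwargs)
        for item in page.get("Items", []):
            yield item
            yielded += 1
            if max_items is not None and yielded >= max_items:
                return  # caller has enough; do not fetch further pages
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages for this key condition
        kwargs["ExclusiveStartKey"] = last_key
```

The `max_items` cap is there because, as noted above, draining a whole day (about 100GB) through 1MB pages would take a very large number of requests; a narrow `end_time` range or a cap keeps the read small.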
1
- Yes, the partition size is still 10GB without LSIs, but an item collection isn't restricted to 10GB in that case: it can span multiple partitions.
- Returning a large volume of data will of course be much slower than returning a small amount, but if you are asking whether querying the same amount of data from a large item collection is much slower than from a small one, then no, not substantially.
Hi Skinsman, I updated my question with more detail. Could you please review it?
Thanks, Leeroy. I updated my question with more detail. Could you please review it?
Updated my answer
When I query with `content_type` + `end_time` (last 30 days), the matching dataset is quite large (about 3TB), but I suppose that is no problem if I page the query, since users generally browse only the first couple of pages in the web front end. Is this correct?
Yes, in that case you would `Limit` each page. If your front end shows 50 items, use `Limit=50`, and when the user clicks `next`, pass the `LastEvaluatedKey` from the previous response as `ExclusiveStartKey` to obtain the next 50 items in the results.
Thanks for your valuable feedback
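
The `Limit` and `LastEvaluatedKey` paging described in the comments above could look roughly like the sketch below. It rests on assumptions that are not in the thread: the table name `documents`, the index name `content_type-end_time-index`, and a numeric epoch `end_time` are hypothetical, and `client` is assumed to behave like the low-level boto3 DynamoDB client (`boto3.client("dynamodb")`).

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical names: the thread does not give the table or index name.
TABLE_NAME = "documents"
INDEX_NAME = "content_type-end_time-index"


def fetch_page(client: Any, content_type: str, start: int, end: int,
               page_size: int = 50,
               start_key: Optional[Dict[str, Any]] = None,
               ) -> Tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]:
    """Fetch one UI page and return (items, cursor_for_next_page).

    The cursor is the LastEvaluatedKey of this response; it is None
    when there are no further results.
    """
    kwargs: Dict[str, Any] = {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "#ct = :ct AND #et BETWEEN :lo AND :hi",
        "ExpressionAttributeNames": {"#ct": "content_type", "#et": "end_time"},
        "ExpressionAttributeValues": {
            ":ct": {"S": content_type},
            ":lo": {"N": str(start)},  # assumes end_time is a numeric epoch
            ":hi": {"N": str(end)},
        },
        "Limit": page_size,  # stop after this many items are evaluated
    }
    if start_key is not None:
        # Resume right after the last item of the previous page.
        kwargs["ExclusiveStartKey"] = start_key
    response = client.query(**kwargs)
    return response.get("Items", []), response.get("LastEvaluatedKey")
```

A front end would call `fetch_page(...)` once for the first page, keep the returned cursor, and send it back as `start_key` when the user clicks `next`. Only the requested pages are read, so the total size of the 30-day range does not matter for the first few pages.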