Does a paginated scan return pages in a deterministic or nondeterministic order?

0

Say my table looks like this:

                               |                attributes
 partition key: username       |        gender       |        age

I would like a user to be able to go through all rows in this table little by little, that is, in a paginated way. A paginated scan should do the job here. My worry is that this will result in the first item in the table being accessed far more often than the remaining items, which is bad.

Is this really the case? Will a paginated scan always return items in a deterministic order (from the start of the table to the end)? Or will it go through the whole table in a random order?

asked 2 years ago · 554 views
2 Answers
0

Items have an internal order that's used during scans. You can't predict from the outside what the order will be because it's based on things like the hashes of PK values.

You want a semi-random item from the table? You can use the parallel scan functionality and specify up to a million segments and pick a random segment number to start your scan from. That would give you a million starting points (evenly distributed among the internal order of items).
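A minimal sketch of the random-segment idea above, assuming a boto3 low-level DynamoDB client is passed in. The function name `scan_random_segment`, the table name `Users`, and the default of 1000 segments are illustrative assumptions, not part of the original answer. `Segment`, `TotalSegments`, and `Limit` are the Scan parameters involved. Note that with many segments on a small table, the chosen segment may contain no items at all.

```python
import random

MAX_SEGMENTS = 1_000_000  # upper bound DynamoDB accepts for TotalSegments


def scan_random_segment(client, table_name, total_segments=1000, limit=25, rng=random):
    """Read up to `limit` items from one randomly chosen scan segment.

    `client` is assumed to behave like boto3.client("dynamodb").
    Returns (segment, items); items may be empty if the segment is empty.
    """
    if not 1 <= total_segments <= MAX_SEGMENTS:
        raise ValueError("total_segments must be between 1 and 1,000,000")
    # Segments are numbered 0 .. total_segments - 1.
    segment = rng.randrange(total_segments)
    response = client.scan(
        TableName=table_name,
        Segment=segment,
        TotalSegments=total_segments,
        Limit=limit,
    )
    return segment, response.get("Items", [])


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   segment, items = scan_random_segment(boto3.client("dynamodb"), "Users")
```

This issues a single Scan request against one segment, so it is not a parallel scan. It only borrows the segment parameters to pick a different starting point each time.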

AWS
answered 2 years ago
  • I don't know why you mention parallel scans all of a sudden. Is it not possible to pick a random segment number to start from with a regular scan? Also, just to be clear, are you saying that a regular scan starting from a random segment number is what you would advise in my situation, and that it would be a good solution?

  • To be clear, I didn't say to do a parallel scan. I said to use the parallel scan functionality which lets you split the scan into segments and you can pick a random one to read from. That was assuming you wanted a random item. I can't really give advice for your situation since you only asked about scan behaviors and didn't give your requirements. Are your requirements to pick a single random item from a table?

-1

DynamoDB paginates the results from Scan operations. With pagination, the Scan results are divided into "pages" of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on. A single Scan only returns a result set that fits within the 1 MB size limit.
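As a rough illustration of the page-by-page processing described above, here is a sketch that assumes a boto3 low-level DynamoDB client. The helper name `scan_pages` and the table name `Users` are hypothetical. Each response may carry a `LastEvaluatedKey`; passing it back as `ExclusiveStartKey` resumes the scan where the previous page stopped, and its absence means the scan is complete.

```python
def scan_pages(client, table_name, page_size=None):
    """Yield one list of items per Scan call until the table is exhausted.

    `client` is assumed to behave like boto3.client("dynamodb").
    `page_size`, if given, is sent as Limit (max items evaluated per call).
    """
    kwargs = {"TableName": table_name}
    if page_size is not None:
        kwargs["Limit"] = page_size
    while True:
        response = client.scan(**kwargs)
        # A page can be empty even when more pages remain.
        yield response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # no key returned: this was the final page
        kwargs["ExclusiveStartKey"] = last_key


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   for page in scan_pages(boto3.client("dynamodb"), "Users", page_size=50):
#       print(len(page))
```

To let a user resume later, an application would typically store the last `LastEvaluatedKey` it received and send it with the next request.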

See the links below for more information.

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html

[2] https://aws.amazon.com/blogs/developer/understanding-auto-paginated-scan-with-dynamodbmapper/

AWS
SUPPORT ENGINEER
answered 2 years ago
