- Newest
- Most votes
- Most comments
Generally speaking, DDB best practice is to use a single table for a given use case / app / microservice.
The two-table approach requires extra work at the application layer to query across the two tables, aggregate results, etc. You're trying to apply relational patterns to a non-relational database. If you're using multiple tables in DDB, then typically you can either combine them into one by rethinking your key selection or, perhaps, a relational database may be the better choice.
One would need a thorough understanding of your customer's specific read and write patterns and volume before being comfortable giving advice on key selection.
Customers new to DDB often fall into relational habits of giving their primary and sort key the name of a specific attribute, such tag_id or sched_id. While this is totally acceptable and fits many use cases, there are ways to think outside the box here.
For example, you could name your primary key primary_key and your sort key sort_key. By doing this, it becomes apparent than a single DDB table can hold more than 1 conceptual table from a relational world. e.g. your one table could hold customer data where the primary_key attribute has a format of cust_123
and, in the same table, hold order data where the primary_key = order_xyz
.
Below is an example of a DDB table that holds both customer master data and order history. In a relational world, this would need to be (at least) two tables, but in DDB it could be one:
primary_key (other attributes)
------------------ ------------------------------------------
cust_123 name=matt, state=AZ
cust_124 name=sam , state=NY
order_1 customer=cust123 order_date=12/31/2015, order_items=[......]
Recommend watching Advanced Design Patterns for DynamoDB: https://www.youtube.com/watch?v=HaEPXoXVf2k
Taking into account that content_info will often be the same in many entries from TAGS_TABLE does it make sense to split table in two? However, content_info itself will not be requested separately from tag.
If content_info is stored with each item:
- read logic is easier, just get a single item
- consumes more storage for the duplicate content
- requires fewer GetItem() calls to retrieve data = faster response, potentially less RCU
- if content_info changes, more WCU to update across all affected items
if content_info stored as separate item:
- read logic more complicated in app tier
- consumes less storage
- requires two GetItem() API calls to for each logical read
- if content_info changes, only requires one write
Other questions:
- how often is content_info changing?
- how many tag_id items might have the same content_info?
Relevant content
- asked 10 months ago
- asked 2 years ago
- Accepted Answerasked 10 months ago
- AWS OFFICIALUpdated 2 months ago
- AWS OFFICIALUpdated 3 years ago
- AWS OFFICIALUpdated 7 months ago
- AWS OFFICIALUpdated 6 months ago