DocumentDB design change: multiple collections to a single collection

Current design: 3,000 time-series collections, each with a single index and TTL enabled.

Problems: a) A write happens every 10 minutes in all 3,000 collections, which is time-consuming. b) TTL deletion consumes IOPS, which is costly. c) The number of collections will keep growing over time, which will increase storage (metadata and indexes).

Question: If we move to a single collection, which is better: a TTL index or a daily cron job? And will moving to a single collection reduce I/O?

Ruhi
asked 7 months ago, 163 views
1 Answer

Hi,

Overall, it depends on the number of documents and on whether the indexes of the single large collection fit in memory. Consolidating into a single collection generally improves I/O performance, especially if your queries currently pull from multiple collections.

The downside is that any sequential scan becomes expensive when the single collection is very large (the combination of 3,000 collections). On the other hand, removing the other collections and their indexes frees memory for additional indexes, such as compound indexes, if certain queries filter on specific fields. The explain plan will help here: one of the main goals after consolidating is to reduce the number of documents scanned. The fewer documents scanned, the better the I/O performance.
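To make the compound-index idea concrete, here is a minimal sketch of how documents from the old per-series collections might be tagged and queried in one collection. The field names `series_id` and `ts` are hypothetical, since the question does not describe the schema, and the helpers only build plain dictionaries and key lists.

```python
from datetime import datetime, timezone

# Hypothetical field names; the original post does not describe the schema.
SERIES_FIELD = "series_id"
TIME_FIELD = "ts"

# Key spec in the (field, direction) form that PyMongo's create_index accepts.
COMPOUND_INDEX_KEYS = [(SERIES_FIELD, 1), (TIME_FIELD, 1)]


def to_consolidated_doc(series_id, ts, payload):
    """Tag a reading with the series that previously had its own collection."""
    doc = dict(payload)  # copy so the caller's dict is not modified
    doc[SERIES_FIELD] = series_id
    doc[TIME_FIELD] = ts
    return doc


def series_range_filter(series_id, start, end):
    """Filter for one series over [start, end); it matches the index prefix."""
    return {SERIES_FIELD: series_id, TIME_FIELD: {"$gte": start, "$lt": end}}


example_doc = to_consolidated_doc(
    "sensor-0042",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    {"value": 21.5},
)
```

With PyMongo, this would typically be wired up as `collection.create_index(COMPOUND_INDEX_KEYS)` and checked with `collection.find(series_range_filter(...)).explain()` to confirm that the query uses the index instead of scanning the whole collection.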

Before making the decision, I'd recommend looking at the current BufferCacheHitRatio, which should be as close to 100% as possible. If you notice it dipping with the current multi-collection design, consolidating may help, provided the affected queries run frequently.
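As a rough illustration of that check, the sketch below flags the intervals where the ratio dipped. It assumes the datapoints have the shape CloudWatch returns for a metric statistics request (dictionaries with `Timestamp` and `Average`); the 99.0 threshold and the sample values are made up for the example and are not a recommendation.

```python
from datetime import datetime, timezone


def cache_hit_dips(datapoints, threshold=99.0):
    """Return (timestamp, average) pairs below the threshold, oldest first."""
    dips = [
        (point["Timestamp"], point["Average"])
        for point in datapoints
        if point["Average"] < threshold
    ]
    return sorted(dips)


# Made-up sample values, one datapoint per 10-minute write cycle.
sample = [
    {"Timestamp": datetime(2024, 3, 1, 10, 10, tzinfo=timezone.utc), "Average": 96.2},
    {"Timestamp": datetime(2024, 3, 1, 10, 0, tzinfo=timezone.utc), "Average": 99.8},
]
dips = cache_hit_dips(sample)
```

If the dips line up with the 10-minute write cycle, that suggests the working set across all the collections and their indexes does not fit in memory.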

In terms of TTL, the I/O cost of deleting documents with a scheduled process would be similar to that of a TTL index. Consolidating into a single collection would also make it easier to adopt rolling collections, which is the least I/O-expensive way to delete documents on a schedule, because expired data is removed by dropping a whole collection rather than deleting documents one by one.
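A minimal sketch of the rolling-collection bookkeeping, assuming one collection per day: the `timeseries_` prefix, the date format, and the retention window are all hypothetical choices for illustration, and the helpers only work out names without touching a database.

```python
from datetime import date, datetime, timedelta

PREFIX = "timeseries_"  # hypothetical naming scheme
DATE_FORMAT = "%Y_%m_%d"


def collection_for(day):
    """Name of the daily collection that writes for `day` should go to."""
    return PREFIX + day.strftime(DATE_FORMAT)


def expired_collections(existing_names, today, retention_days):
    """Return daily collections that fall outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing_names:
        if not name.startswith(PREFIX):
            continue  # not one of the rolling collections
        try:
            day = datetime.strptime(name[len(PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # prefix matches but the suffix is not a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)
```

A daily job could pass `db.list_collection_names()` to `expired_collections` and call `db.drop_collection(name)` for each result, while writers use `db[collection_for(date.today())]`. Queries that span several days then have to read from more than one collection, which is the trade-off of this approach.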

References:

[1] Multiple collections vs. a single collection: https://stackoverflow.com/questions/15314769/mongodb-multiple-collections-or-one-big-collection-w-index

[2] Indexes in DocumentDB: https://aws.amazon.com/blogs/database/how-to-index-on-amazon-documentdb-with-mongodb-compatibility/

[3] Rolling collections: https://aws.amazon.com/blogs/database/optimize-data-archival-costs-in-amazon-documentdb-using-rolling-collections/

[4] Explain plans: https://docs.aws.amazon.com/documentdb/latest/developerguide/user_diagnostics.html#user_diagnostics-query_plan

[5] BufferCacheHitRatio: https://docs.aws.amazon.com/documentdb/latest/developerguide/best_practices.html#best_practices-instance_sizing

answered 7 months ago
