DocumentDB design change: multiple collections to a single collection


Current design: 3,000 time-series collections, each with a single index, with TTL used to expire documents.

Problems:
a) A write operation happens every 10 minutes in all 3,000 collections, which is time-consuming.
b) TTL deletion consumes IOPS, which is costly.
c) The number of collections will keep growing over time, which will increase storage (metadata and indexes).

Question: if we move to a single collection, which is better for expiring data, TTL or a daily cron job? And will moving to a single collection bring any I/O benefit?

Ruhi
asked 7 months ago, 171 views
1 Answer

Hi,

Overall, it depends on the number of documents and on whether the index of the single large collection fits in memory. Consolidating into a single collection generally improves I/O performance, especially if you have queries that pull from multiple collections.

The downside is the risk of sequential scans if the single collection is extremely large (the combination of 3,000 collections). On the other hand, removing the other collections and their indexes frees memory for additional indexes, such as compound indexes, if certain queries hit certain fields. The explain plan will help here: one of the main goals after consolidating should be to decrease the number of documents scanned. The fewer documents scanned, the better the I/O performance.
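As a rough illustration of the compound-index point, here is a minimal sketch. It assumes each document in the consolidated collection carries a hypothetical `series_id` field (naming the collection it used to live in) and a `ts` timestamp; neither name comes from the question, so adjust them to the real schema. The `collection` argument is assumed to be a pymongo-style collection object.

```python
from datetime import datetime, timezone

# Hypothetical field names: "series_id" identifies the former collection,
# "ts" is the measurement timestamp.
COMPOUND_INDEX_KEYS = [("series_id", 1), ("ts", 1)]


def to_consolidated_doc(series_id, ts, payload):
    """Wrap a measurement so it can live in the single shared collection."""
    return {"series_id": series_id, "ts": ts, **payload}


def series_range_filter(series_id, start, end):
    """Filter for one series over [start, end); it matches the index prefix."""
    return {"series_id": series_id, "ts": {"$gte": start, "$lt": end}}


def ensure_compound_index(collection):
    """Create the compound index on a pymongo-style collection."""
    return collection.create_index(COMPOUND_INDEX_KEYS)
```

With a real collection, calling `collection.find(series_range_filter(...)).explain()` before and after creating the index should show whether the query still falls back to a collection scan.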

Before making the decision, I'd recommend looking at the current BufferCacheHitRatio metric; we want this as close to 100% as possible. If you notice it dipping with the current multi-collection design, consolidating may help, particularly when the affected queries run frequently.
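One way to spot such dips is to scan the metric's datapoints for values below a chosen floor. The sketch below assumes the datapoints have already been fetched (for example from CloudWatch) as dictionaries with `Timestamp` and `Average` keys, and the 95% threshold is an arbitrary assumption, not a recommendation from AWS.

```python
from datetime import datetime, timezone


def find_cache_dips(datapoints, threshold=95.0):
    """Return datapoints whose average hit ratio (percent) is below threshold.

    `datapoints` is assumed to be a list of dicts with "Timestamp" and
    "Average" keys; the result is ordered oldest first.
    """
    dips = [p for p in datapoints if p["Average"] < threshold]
    return sorted(dips, key=lambda p: p["Timestamp"])
```

If dips line up with the 10-minute write cycle or with specific queries, that is a hint that the working set (data plus indexes) no longer fits in memory.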

As for TTL, the I/O cost of deleting documents with a scheduled process would be similar to that of a TTL index. Consolidating into a single collection would, however, make it easier to set up rolling collections, which is the least I/O-expensive way to delete documents on a schedule.
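To make the rolling-collections idea concrete, here is a minimal sketch under some assumptions: one collection per day, named with a hypothetical `metrics_YYYYMMDD` scheme, and a `db` object that behaves like a pymongo database (`list_collection_names` and `drop_collection`). Expiry then becomes dropping whole collections instead of deleting documents one by one.

```python
from datetime import date, timedelta

PREFIX = "metrics_"  # hypothetical naming scheme: metrics_YYYYMMDD


def collection_for(day):
    """Name of the daily collection that holds documents for `day`."""
    return f"{PREFIX}{day:%Y%m%d}"


def _is_daily(name):
    """True only for names that follow the metrics_YYYYMMDD pattern."""
    suffix = name[len(PREFIX):]
    return name.startswith(PREFIX) and len(suffix) == 8 and suffix.isdigit()


def expired_collections(existing_names, today, retention_days):
    """Names of daily collections older than the retention window."""
    cutoff = collection_for(today - timedelta(days=retention_days))
    # Fixed-width YYYYMMDD suffixes sort chronologically as plain strings.
    return sorted(n for n in existing_names if _is_daily(n) and n < cutoff)


def drop_expired(db, today, retention_days):
    """Drop expired daily collections on a pymongo-style database object."""
    expired = expired_collections(db.list_collection_names(), today, retention_days)
    for name in expired:
        db.drop_collection(name)
    return expired
```

A daily cron job could call `drop_expired(db, date.today(), retention_days)`, while writers use `collection_for(date.today())` to pick the target collection. The retention period and the daily granularity are illustrative choices.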

References:

[1] Multiple collections vs. a single collection: https://stackoverflow.com/questions/15314769/mongodb-multiple-collections-or-one-big-collection-w-index

[2] Indexes in DocumentDB: https://aws.amazon.com/blogs/database/how-to-index-on-amazon-documentdb-with-mongodb-compatibility/

[3] Rolling collections: https://aws.amazon.com/blogs/database/optimize-data-archival-costs-in-amazon-documentdb-using-rolling-collections/

[4] Explain plans: https://docs.aws.amazon.com/documentdb/latest/developerguide/user_diagnostics.html#user_diagnostics-query_plan

[5] BufferCacheHitRatio: https://docs.aws.amazon.com/documentdb/latest/developerguide/best_practices.html#best_practices-instance_sizing

answered 7 months ago