Postgres + Timestream DB an advisable combination?

0

Hi community,

I need to store hourly and daily data for about 6,000 assets in my PostgreSQL Aurora DB. Given the scale, I plan to keep three years of daily data and three months of hourly data. Additionally, I have metadata tables that are relational.

I’m considering using Amazon Timestream for the hourly and daily time-series data, while keeping the relational metadata in PostgreSQL. I’ve read that Timestream is optimized for time-based data.

Would this approach improve performance? Also, I need to relate the time-series data with the metadata for some APIs.

Looking forward for your insights!

asked a month ago60 views
2 Answers
1

Using a combination of Amazon Timestream and PostgreSQL Aurora could indeed be an advisable approach for your use case, given the nature of your data and requirements. Here's why:

  1. Optimized for time-series data: Amazon Timestream is specifically designed for time-series data, which makes it well-suited for storing your hourly and daily asset data. It's optimized for high-speed ingestion and fast query performance on time-series data, which could significantly improve performance compared to storing all data in PostgreSQL.

  2. Scalability: Timestream can handle large volumes of time-series data efficiently. It can ingest tens of gigabytes of data per minute and run SQL queries on terabytes of time-series data in seconds. This scalability is beneficial for your requirement of storing data for 6,000 assets over extended periods.

  3. Storage tiering: Timestream offers automatic storage tiering with a memory store for recent data and a magnetic store for historical data. This aligns well with your need to keep three months of hourly data (which could be in the memory store for fast access) and three years of daily data (which could be in the magnetic store for cost-effective long-term storage).

  4. Relational data in PostgreSQL: Keeping your metadata in PostgreSQL Aurora is a good choice as it's well-suited for relational data and complex queries involving multiple tables.

  5. SQL compatibility: Timestream supports SQL queries, which makes it easier to integrate with your existing PostgreSQL-based system and maintain a consistent query language across your data stores.

  6. Data correlation: You can still relate the time-series data in Timestream with the metadata in PostgreSQL. While you can't directly join tables across these two databases, you can query both separately and combine the results in your application layer. Alternatively, you could use scheduled queries in Timestream to aggregate data and store the results in a format that's easier to correlate with your PostgreSQL data.

  7. Cost-effectiveness: This approach could be more cost-effective than storing all data in PostgreSQL, as Timestream's storage tiering and optimization for time-series data can reduce overall storage and query costs.

However, there are a few considerations:

  1. Complexity: Using two different databases adds some complexity to your architecture and may require changes to your application logic.

  2. Data consistency: You'll need to ensure data consistency between the two databases, which might require additional effort in your data management processes.

  3. Learning curve: If you're not familiar with Timestream, there might be a learning curve in understanding its specific features and best practices.

Overall, given the time-series nature of your asset data and the need for relational metadata, using Timestream for time-series data and PostgreSQL for relational data could indeed improve performance and provide a more optimized solution for your specific requirements.
Sources
Storage - Amazon Timestream
Queries - Amazon Timestream
Amazon Timestream Features – Time-Series Database – AWS

profile picture
answered a month ago
0

I have seen use-cases where using both Postgres and Timestream together to satisfy the query requirements. It goes back to access patterns, if you have lot of relational queries then using together is recommended. Remember Timestream also supports Joins so you can still store them in different Timestream table and perform joins with time-series and your metadata but data needs to de-normalized and make efficient querying to fit Timestream best practices (Example without a time filter Timestream scans entire table, if its small few MB that's fine).

https://docs.aws.amazon.com/timestream/latest/developerguide/best-practices.html https://aws.amazon.com/blogs/database/understanding-and-optimizing-amazon-timestream-compute-units-for-efficient-time-series-data-management/

AWS
answered 25 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions