1 Answer
Hello.
You wrote that, per the client's requirements, the data should be stored for a year, but that this is not feasible. Why is retaining data for one year not possible?
One simple option would be to have records expire automatically after one year using DynamoDB TTL. Is there a reason that would not work in your case?
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html
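For illustration, a minimal sketch of writing an item with a TTL attribute using boto3. The table name "DeviceData", the key schema, and the attribute name "expireAt" are assumptions; TTL would need to be enabled on the table for that attribute:

```python
import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("DeviceData")  # hypothetical table name

def put_reading(device_id: str, payload: dict) -> None:
    now = int(time.time())
    table.put_item(
        Item={
            "deviceId": device_id,  # partition key (assumed schema)
            "ts": now,              # sort key: epoch seconds (assumed schema)
            "payload": payload,     # note: numeric values must be int/Decimal for DynamoDB
            # TTL attribute: once this epoch timestamp passes, DynamoDB
            # deletes the item automatically (typically within a few days
            # of expiry, per the TTL documentation).
            "expireAt": now + 365 * 24 * 60 * 60,
        }
    )
```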
You can invoke a Lambda function when data arrives at AWS IoT Core by using an IoT rule action. It may be a good idea to process the incoming data with Lambda and store it in S3. You could then use Amazon Athena and Amazon QuickSight to query, graph, and visualize the data. https://docs.aws.amazon.com/iot/latest/developerguide/lambda-rule-action.html
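As a rough sketch of the Lambda side of that pattern, assuming the IoT rule passes the MQTT payload as the event, and with a hypothetical bucket name and key layout:

```python
import json
import time
import boto3

s3 = boto3.client("s3")
BUCKET = "my-iot-data-bucket"  # hypothetical bucket name

def lambda_handler(event, context):
    # The IoT rule action invokes this function with the MQTT payload;
    # the exact event shape depends on your rule's SQL statement.
    device_id = event.get("deviceId", "unknown")
    ts = int(time.time())
    # Partitioned key layout so Athena can prune by device and date.
    date_path = time.strftime("%Y/%m/%d", time.gmtime(ts))
    key = f"telemetry/device={device_id}/{date_path}/{ts}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(event).encode("utf-8"))
    return {"statusCode": 200}
```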
Thank you for responding. By "not feasible," I meant the data volume: each device sends a data point every second, so a single device alone produces 86,400 points per day, roughly 31.5 million per year, which might still be manageable. With many more devices, however, storing that data and querying it to render graphs and support downloads would put a heavy load on the system and would likely increase costs. I am new here and don't have much experience, but I believe this could be quite heavy on the system. Could you suggest some techniques that would be suitable for my requirements?
Thank you @Riku_Kobayashi for answering. That approach could work as well. To clarify, I plan either to have S3 hold data processed out of DynamoDB, or to take the data directly from MQTT and have a Lambda function capture one data point every 2 minutes. These data points would then be stored in a database with a 1-year expiry. That works out to approximately 263,000 data points per device annually (365 days × 24 hours × 30 points/hour = 262,800).
If a user wants to access data within ranges like 1 hour, 1 day, 1 week, or 1 year, these data points should be sufficient for graphing and generating Excel sheets. I also intend to use DynamoDB for real-time data.
Is this the right approach? For future scalability, for instance with 100 devices, wouldn't this significantly increase costs? I don't expect Excel downloads to be requested frequently, so that shouldn't be an issue; for graphing, though, would reading data from S3 cause substantial loading times? I want to structure the data to avoid excessive load and unnecessary cost. I hope this is clear, and I appreciate all the responses, as this is my first time here 🥲, thank you though!
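To make the 2-minute downsampling idea concrete, one possible sketch: bucket each reading into a 2-minute window and use a conditional write so only the first reading per window is kept. The table name, key schema, and TTL attribute name are assumptions:

```python
import time
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("DeviceDataDownsampled")  # hypothetical table

WINDOW_SECONDS = 120              # one point per 2 minutes
TTL_SECONDS = 365 * 24 * 60 * 60  # 1-year expiry via DynamoDB TTL

def store_downsampled(device_id: str, payload: dict) -> bool:
    now = int(time.time())
    window_start = now - (now % WINDOW_SECONDS)  # 2-minute bucket boundary
    try:
        table.put_item(
            Item={
                "deviceId": device_id,                   # partition key (assumed)
                "ts": window_start,                      # sort key (assumed)
                "payload": payload,
                "expireAt": window_start + TTL_SECONDS,  # TTL attribute
            },
            # Reject the write if this device already has a point
            # stored for the current 2-minute window.
            ConditionExpression="attribute_not_exists(ts)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # window already has a point; drop this reading
        raise
```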
I think testing is necessary, but I believe the architecture you have in mind will work. DynamoDB pricing is calculated based on data reads, writes, and storage usage, so the more IoT devices you have, the more you pay. That is a cost you have to accept if you use DynamoDB.
If query speed matters, a data warehouse such as Amazon Redshift may be a better fit than querying S3 directly.
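On the cost question, a quick back-of-envelope sketch. The per-million-write price varies by region and billing mode, so it is left as a placeholder parameter rather than a hard number:

```python
def monthly_writes(devices: int, interval_seconds: int) -> int:
    """Writes per month if every device stores one point per interval."""
    seconds_per_month = 30 * 24 * 60 * 60  # ~2,592,000
    return devices * (seconds_per_month // interval_seconds)

# 100 devices, one point every 2 minutes:
writes = monthly_writes(100, 120)  # 2,160,000 writes/month
price_per_million = 1.25           # placeholder; check your region's on-demand write pricing
print(writes, "writes/month, approx. $", writes / 1e6 * price_per_million)
```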