Dynamically handling changes in IoT telemetry data payload in real time using Kinesis Data Streams & Lambda

I'm working with a fleet of 300+ IoT devices sending telemetry data to a Kinesis Data Stream at a rate of 25 million messages per day. The data is stored in a MongoDB database using a schema with three collections:

  1. Devices
  2. Variables
  3. Values (timeseries)

The data ingestion pipeline is as follows:

IoT Core -> IoT Core rules engine -> Kinesis Data Streams -> Lambda function -> MongoDB

A Kinesis Data Firehose delivery stream is attached to the data stream to store the data in S3.

Problem:

The devices can have their payload dynamically changed during operation, adding or removing variables. For example, a device named "machine-2" initially sends the following payload:

    {
      "currentPowerConsumption": 6.7,
      "upTime": 456789,
      "motorHealth": 100
    }

However, after a reconfiguration, it now sends:

    {
      "currentPowerConsumption": 6.7,
      "upTime": 456789,
      "motorHealth": 100,
      "motorTemperature": 45
    }

Current approach:

I'm currently using Amazon ElastiCache for Redis to store the number of variables each device sends with a key-value pair like device-id: number-of-variables. For example:

"machine-2": 3

When a payload arrives, in the Lambda function I:

  • Check the Redis store for the corresponding device ID.
  • If the number of variables hasn't changed, I directly insert the data into the values collection.
  • If the number of variables has changed, I update the variables collection and then insert the data into the values collection.
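
The steps above can be sketched roughly as follows. This is only an illustration of the flow, not the actual Lambda code: a plain dict stands in for the Redis store, and `update_variables` and `insert_values` are hypothetical callbacks standing in for the MongoDB writes.

```python
from typing import Any, Callable, Dict, List


def handle_payload(
    device_id: str,
    payload: Dict[str, Any],
    count_cache: Dict[str, int],  # stand-in for Redis: device-id -> number of variables
    update_variables: Callable[[str, List[str]], None],  # hypothetical write to the variables collection
    insert_values: Callable[[str, Dict[str, Any]], None],  # hypothetical write to the values collection
) -> bool:
    """Process one payload; return True if the variables collection was updated."""
    known_count = count_cache.get(device_id)
    current_count = len(payload)

    # A device that is not in the cache yet is treated as changed.
    changed = known_count != current_count
    if changed:
        update_variables(device_id, sorted(payload))
        count_cache[device_id] = current_count

    # The values are inserted in both cases.
    insert_values(device_id, payload)
    return changed
```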

Questions:

  1. Is this approach efficient and scalable for handling such a large volume of data?
  2. Are there any potential improvements or alternative solutions I should consider for dynamically handling changes in the data payload?
  3. Is storing the number of variables in Redis the best way to track schema changes, or are there other options?

Asked 6 months ago · Viewed 331 times

1 answer

I would say that storing the number of variables is the wrong approach, because a change may remove one variable and add a different one. The count stays the same, but the schema is different.
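
One possible way to catch that case, sketched here under assumptions, is to cache a fingerprint of the variable names instead of their count. The function names below are hypothetical, and hashing the sorted names with SHA-256 is just one option; caching the set of names itself would work as well.

```python
import hashlib
from typing import Any, Dict, Set, Tuple


def schema_fingerprint(payload: Dict[str, Any]) -> str:
    """Hash the sorted variable names so that any added, removed, or renamed
    variable changes the result, regardless of the values sent."""
    names = "\x1f".join(sorted(payload))  # unit separator avoids ambiguous joins
    return hashlib.sha256(names.encode("utf-8")).hexdigest()


def schema_diff(old_names: Set[str], payload: Dict[str, Any]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) variable names relative to the known names."""
    new_names = set(payload)
    return new_names - old_names, old_names - new_names


before = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorHealth": 100}
# Hypothetical reconfiguration: motorHealth is dropped, motorTemperature is added.
after = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorTemperature": 45}

same_count = len(before) == len(after)
same_schema = schema_fingerprint(before) == schema_fingerprint(after)
added, removed = schema_diff(set(before), after)
```

Here `same_count` is true while `same_schema` is false, so a count-based check would miss the change. The fingerprint could be stored in Redis under the device ID in place of the count.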

I'm not an expert on MongoDB, but if possible, I would store both the variable name and the value in each document of the collection and not worry about the schema. Maybe create a collection per variable, or something similar.
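
As a rough sketch of that idea, each payload could be flattened into one document per variable, so a new or removed variable needs no schema bookkeeping. The field names (`device`, `variable`, `value`, `timestamp`) and the function name are assumptions for illustration, not an existing schema.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def to_value_documents(
    device_id: str, payload: Dict[str, Any], timestamp: datetime
) -> List[Dict[str, Any]]:
    """Turn one telemetry payload into one document per variable."""
    return [
        {
            "device": device_id,
            "variable": name,
            "value": value,
            "timestamp": timestamp,
        }
        for name, value in payload.items()
    ]


docs = to_value_documents(
    "machine-2",
    {"currentPowerConsumption": 6.7, "upTime": 456789, "motorTemperature": 45},
    datetime(2024, 1, 1, tzinfo=timezone.utc),  # placeholder timestamp
)
```

The resulting list could then be written in a single call, for example with PyMongo's `insert_many` on the values collection. The trade-off is more documents per message in exchange for not tracking the schema at all.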

Uri (AWS, Expert)
Answered 6 months ago
