Dynamically handling changes in IoT telemetry data payloads in real time using Kinesis Data Streams & Lambda


I'm working with a fleet of 300+ IoT devices sending telemetry data to a Kinesis Data Stream at a rate of 25 million messages per day. The data is stored in a MongoDB database using a schema with three collections:

  1. Devices
  2. Variables
  3. Values (timeseries)

The data ingestion pipeline is as follows:

IoT Core -> IoT Core rules engine -> Kinesis Data Streams -> Lambda function -> MongoDB

A Kinesis Data Firehose delivery stream is attached to the data stream to store the data in S3.
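For readers unfamiliar with the Lambda step of this pipeline, here is a minimal sketch of how a Kinesis-triggered function might decode incoming records. It relies on the standard Kinesis event shape, in which each record's data arrives base64-encoded under `Records[].kinesis.data`. The function names and the assumption that every record holds one JSON payload are illustrative, not taken from the original setup, and the MongoDB write is left out.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis hands the record data to Lambda base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        # Assumption: each record holds exactly one JSON telemetry payload.
        payloads.append(json.loads(raw))
    return payloads


def handler(event, context):
    payloads = decode_kinesis_records(event)
    # The write to MongoDB is omitted; this sketch only covers decoding.
    return {"decoded": len(payloads)}
```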

Problem:

The devices can have their payload dynamically changed during operation, adding or removing variables. For example, a device named "machine-2" initially sends the following payload:

    {
      "currentPowerConsumption": 6.7,
      "upTime": 456789,
      "motorHealth": 100
    }

However, after a reconfiguration, it now sends:

    {
      "currentPowerConsumption": 6.7,
      "upTime": 456789,
      "motorHealth": 100,
      "motorTemperature": 45
    }

Current approach:

I'm currently using Amazon ElastiCache for Redis to store the number of variables each device sends with a key-value pair like device-id: number-of-variables. For example:

"machine-2": 3

When a payload arrives, I do the following in the Lambda function:

  • Check the Redis store for the corresponding device ID.
  • If the number of variables hasn't changed, I directly insert the data into the values collection.
  • If the number of variables has changed, I update the variables collection and then insert the data into the values collection.
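The steps above can be sketched roughly as follows. This is a simplified illustration, not the actual Lambda code: a plain dict stands in for the Redis store, and `variables_changed` is a hypothetical helper name. With a real Redis client the cached value would come back as a string or bytes and would need converting to an integer first.

```python
def variables_changed(cache, device_id, payload):
    """Compare the cached variable count with the incoming payload's count.

    `cache` is a dict standing in for Redis (device ID -> number of variables).
    Returns True when the variables collection would need updating.
    """
    current_count = len(payload)
    if cache.get(device_id) == current_count:
        # Same count as last time: insert values directly.
        return False
    # Count differs (or device is new): remember the new count.
    cache[device_id] = current_count
    return True


cache = {"machine-2": 3}
old_payload = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorHealth": 100}
new_payload = dict(old_payload, motorTemperature=45)

print(variables_changed(cache, "machine-2", old_payload))  # False
print(variables_changed(cache, "machine-2", new_payload))  # True
```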

Questions:

  1. Is this approach efficient and scalable for handling such a large volume of data?
  2. Are there any potential improvements or alternative solutions I should consider for dynamically handling changes in the data payload?
  3. Is storing the number of variables in Redis the best way to track schema changes, or are there other options?
Asked 6 months ago, 331 views

1 Answer

I would say that storing the number of variables is the wrong approach: a change may remove one variable and add a different one, which leaves the count unchanged even though the schema is different.
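To illustrate the point, here is a small sketch that compares variable names instead of counting them. Hashing the sorted names into a fingerprint is one possible technique, not something prescribed in this thread, and `schema_fingerprint` is a hypothetical helper. The fingerprint could be cached per device in place of the count.

```python
import hashlib
import json


def schema_fingerprint(payload):
    """Hash the sorted variable names so any rename, add, or removal is visible."""
    names = json.dumps(sorted(payload))
    return hashlib.sha256(names.encode("utf-8")).hexdigest()


before = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorHealth": 100}
# One variable removed and a different one added: the count stays at 3.
after = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorTemperature": 45}

print(len(before) == len(after))                                # True
print(schema_fingerprint(before) == schema_fingerprint(after))  # False
```

Because only the names are hashed, a change in values alone leaves the fingerprint the same.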

I'm not an expert on MongoDB, but if possible I would store the variable name together with each value in the collection and not worry about the schema. You could also create a collection per variable, or something similar.
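One way to read this suggestion is to split each payload into one document per variable, so that every stored value carries its own variable name. The sketch below assumes that interpretation. The field names (`deviceId`, `variable`, `value`, `timestamp`) and the helper `to_value_documents` are hypothetical and not taken from the asker's schema.

```python
from datetime import datetime, timezone


def to_value_documents(device_id, payload, timestamp):
    """Turn one telemetry payload into one document per variable."""
    return [
        {
            "deviceId": device_id,
            "variable": name,
            "value": value,
            "timestamp": timestamp,
        }
        for name, value in payload.items()
    ]


payload = {"currentPowerConsumption": 6.7, "upTime": 456789, "motorTemperature": 45}
docs = to_value_documents("machine-2", payload, datetime.now(timezone.utc))
print(len(docs))  # 3
```

With this shape, a new variable such as `motorTemperature` simply shows up as new documents and needs no schema update beforehand. The resulting list could be written in one call, for example with PyMongo's `insert_many`.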

Uri (AWS, Expert)
Answered 6 months ago
