How to make Lambda run only once per DynamoDB stream record


I set up a DynamoDB stream to trigger a Lambda function whenever a new item is added to the table. The Lambda provisions an EC2 instance to run a DB migration for that specific record. When I added a record to the table, the Lambda failed with a runtime error caused by a syntax error. I deleted the record, fixed the function code, and added the same record again, but the Lambda failed again because of another error in the code. The same problem occurred 4 times. When the function finally ran successfully, I realized it had been triggered 4 times, resulting in 4 EC2 instances being provisioned. My Lambda code also creates a table backup after a successful migration, so I ended up with 4 backups of the table. This means the record was not removed from the stream even though the Lambda failed to run. How can I make the Lambda run only once for the same record? Are records not removed from streams? Can I use an SQS FIFO queue to control this?

asked a month ago
2 Answers

What you need is for your Lambda function to be idempotent, so that processing the same event more than once has no additional effect. Beyond your issue of multiple events being created, Lambda itself does not guarantee exactly-once processing for stream event sources. To overcome that, you can leverage Powertools for AWS Lambda, which can handle the idempotency for you.

import { makeIdempotent } from '@aws-lambda-powertools/idempotency';
import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';
import type { Context } from 'aws-lambda';
import type { Request, Response, SubscriptionResult } from './types';

// DynamoDB table Powertools uses to track which events were already processed
const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: 'idempotencyTableName',
});

export const handler = makeIdempotent(
  async (event: Request, _context: Context): Promise<Response> => {
    try {
      const payment = … // create payment

      return {
        paymentId: payment.id,
        message: 'success',
        statusCode: 200,
      };
    } catch (error) {
      throw new Error('Error creating payment');
    }
  },
  {
    persistenceStore,
  }
);

Powertools is available for multiple runtimes; I'm just sharing a TypeScript example. You can read more on its capabilities here:

https://aws.amazon.com/blogs/compute/implementing-idempotent-aws-lambda-functions-with-powertools-for-aws-lambda-typescript/

answered a month ago
Accepted Answer

Hi,

This page gives you the needed explanation to align your Lambda code with the DDB trigger mechanism: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.html

The AWS Lambda service polls the stream for new records four times per second. 
When new stream records are available, your Lambda function is synchronously 
invoked. You can subscribe up to two Lambda functions to the same DynamoDB stream. 
If you subscribe more than two Lambda functions to the same DynamoDB stream, 
read throttling might occur.

<...>

If your function returns an error, Lambda retries the batch until it processes successfully
or the data expires. You can also configure Lambda to retry with a smaller batch, limit
the number of retries, discard records once they become too old, and other options.
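Those retry options live on the event source mapping between the stream and the function. As a hedged sketch (the UUID and values below are placeholders, not taken from the question), they can be set with the AWS SDK v3 `UpdateEventSourceMappingCommand`:

```typescript
import {
  LambdaClient,
  UpdateEventSourceMappingCommand,
} from '@aws-sdk/client-lambda';

const client = new LambdaClient({});

// Limit how often a failing batch is retried so broken code does not
// reprocess the same record indefinitely.
await client.send(
  new UpdateEventSourceMappingCommand({
    UUID: 'your-event-source-mapping-uuid', // placeholder for your mapping ID
    MaximumRetryAttempts: 2,                // stop retrying after 2 attempts
    BisectBatchOnFunctionError: true,       // split failing batches to isolate bad records
    MaximumRecordAgeInSeconds: 3600,        // discard records older than an hour
  })
);
```

With `MaximumRetryAttempts` bounded, the scenario in the question (4 EC2 instances from 4 retries of broken code) is capped, though the function should still be idempotent for the retries that do happen.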

So, to cope with retries, you have to store state somewhere: the smallest possible service that I know of to do so is Systems Manager Parameter Store. See https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html

Using it when your stored state is small (i.e., a few values: the last processed DDB stream item + a datetime, in your case) usually leads to lower costs than SQS or other more complex services.

Best,

Didier

answered a month ago
  • Thanks for your response. I will try to use Parameter Store to solve the problem.

  • Glad to have helped. Thank you for accepting my response.
