Dynamic S3 prefix in S3 trigger for Lambda function


Hi Team, I have just started exploring S3 and AWS Lambda. I have an S3 folder with a dynamic date/time prefix, i.e. s3://mybucket/hostData/yyyy/mm/dd/hh/mm, where for example yyyy = 2023, mm = 08, dd = 01, hh = 01, mm = 00. What would be the appropriate prefix to handle the dynamic datetime when triggering the Lambda function? I want to call an API based on the data that arrives in the above S3 folder. Also, how can I mark a file as processed to avoid duplicate API calls? Thanks a lot in advance.

Asked 10 months ago, 1278 views
1 Answer
Accepted Answer

Hello.
You can run a Lambda function whenever an object is created in S3 by configuring an S3 trigger as described in the following document.
https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
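For the path in the question, one possible setup is sketched below. S3 prefix filters match the literal start of the object key and do not accept wildcards or date patterns, so the sketch filters on the static part, `hostData/`, and leaves the date and time folders to be read from the key at run time. The helper names and the default prefix are assumptions made for illustration; `s3` is expected to be a boto3 S3 client such as `boto3.client("s3")`.

```python
def build_notification_config(lambda_arn, prefix="hostData/"):
    """Return an S3 notification configuration that invokes a Lambda
    function for every object created under the given key prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        # Literal prefix match; the yyyy/mm/dd/hh/mm part
                        # of the key is not listed here.
                        "FilterRules": [{"Name": "prefix", "Value": prefix}]
                    }
                },
            }
        ]
    }


def attach_trigger(s3, bucket, lambda_arn, prefix="hostData/"):
    """Apply the configuration to the bucket using a boto3 S3 client."""
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn, prefix),
    )
```

Note that `put_bucket_notification_configuration` replaces the bucket's whole notification configuration, and the function also needs a resource-based permission that lets S3 invoke it. Adding the trigger through the Lambda console takes care of that permission.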

To avoid duplicate processing, it is also a good idea to tag processed objects.
Once processing is complete, tag the object using "put_object_tagging" as described in the following documentation.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object_tagging.html
Because objects that have not been processed yet do not carry the tag, the function can check whether the tag is set (with an if statement, for example) before calling the API.
Alternatively, you could move the file to a different folder once processing is complete.
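The tagging approach could look roughly like the sketch below. The tag key `processed`, the helper names, and the `call_api` callback are hypothetical and chosen only for illustration; `s3` is expected to be a boto3 S3 client such as `boto3.client("s3")`. Since `put_object_tagging` replaces the object's full tag set, the sketch reads the existing tags first and writes them back together with the new one.

```python
PROCESSED_TAG = "processed"  # hypothetical tag key chosen for this sketch


def is_processed(s3, bucket, key):
    """Return True if the object already carries the processed tag."""
    tags = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    return any(t["Key"] == PROCESSED_TAG and t["Value"] == "true" for t in tags)


def mark_processed(s3, bucket, key):
    """Add the processed tag while keeping any tags already on the object."""
    existing = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    kept = [t for t in existing if t["Key"] != PROCESSED_TAG]
    s3.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": kept + [{"Key": PROCESSED_TAG, "Value": "true"}]},
    )


def process_once(s3, bucket, key, call_api):
    """Call call_api(bucket, key) only if the object is not tagged yet.

    Returns True if the API was called, False if the object was skipped.
    """
    if is_processed(s3, bucket, key):
        return False
    call_api(bucket, key)
    mark_processed(s3, bucket, key)
    return True
```

Inside the handler this might be used as `process_once(s3, bucket, key, call_api)`. The check and the tag write are two separate requests, so two invocations for the same object at the same moment could both pass the check. The function's role would also need the `s3:GetObjectTagging` and `s3:PutObjectTagging` permissions.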

Answered 10 months ago
  • Thanks Riku. Appreciate your help. What about the dynamic date arguments in the S3 prefix?

  • Lambda is executed whenever an object is created under "s3://mybucket/hostData/yyyy/mm/dd/hh/mm". When the function is invoked, the S3 notification is passed to the Lambda handler as the "event" argument. The event contains the bucket name and the full object key, including the date and time folders. In the sample code below, the object key is obtained with "key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')".

    import json
    import urllib.parse
    import boto3
    
    print('Loading function')
    
    s3 = boto3.client('s3')
    
    
    def lambda_handler(event, context):
        #print("Received event: " + json.dumps(event, indent=2))
    
        # Get the object from the event and show its content type
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
        try:
            response = s3.get_object(Bucket=bucket, Key=key)
            print("CONTENT TYPE: " + response['ContentType'])
            return response['ContentType']
        except Exception as e:
            print(e)
            print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
            raise  # re-raise so the invocation is reported as failed
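
Building on the handler above, the date and time folders can be recovered from the object key itself. The following is a minimal sketch, assuming keys of the form hostData/yyyy/mm/dd/hh/mm/<file name>; the pattern, the `parse_key` helper, and the shape of the returned values are illustrative assumptions, not part of the AWS sample.

```python
import re
import urllib.parse
from datetime import datetime

# Assumed key layout: hostData/yyyy/mm/dd/hh/mm/<file name>
KEY_PATTERN = re.compile(
    r"^hostData/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<hour>\d{2})/(?P<minute>\d{2})/(?P<name>.+)$"
)


def parse_key(raw_key):
    """Return (timestamp, file name) for a matching key, else None."""
    # Keys in S3 event records are URL-encoded, so decode them first.
    key = urllib.parse.unquote_plus(raw_key, encoding="utf-8")
    m = KEY_PATTERN.match(key)
    if m is None:
        return None
    try:
        ts = datetime(int(m["year"]), int(m["month"]), int(m["day"]),
                      int(m["hour"]), int(m["minute"]))
    except ValueError:
        # Digits matched but do not form a valid date, e.g. month 13.
        return None
    return ts, m["name"]


def lambda_handler(event, context):
    results = []
    # One event can carry several records, so loop instead of using [0].
    for record in event["Records"]:
        parsed = parse_key(record["s3"]["object"]["key"])
        if parsed is None:
            continue  # key is outside the expected layout; skip it
        ts, name = parsed
        results.append({"timestamp": ts.isoformat(), "file": name})
    return results
```

With this approach the trigger only needs the static prefix, and the handler decides per object which date and hour it belongs to.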
    
  • Thanks a lot Riku. I am able to invoke the API via Lambda.
