Dynamic S3 prefix in S3 trigger for Lambda function


Hi Team, I have just started exploring S3 and AWS Lambda. I have an S3 folder with a dynamic date/time prefix, i.e. s3://mybucket/hostData/yyyy/mm/dd/hh/mm, where yyyy = 2023, mm = 08, dd = 01, hh = 01, and the final mm (minutes) = 00. What would be the appropriate prefix to handle the dynamic datetime when triggering the Lambda function? I want to trigger an API based on the data that arrives in the above S3 folder. Also, how can I mark a file as processed to avoid duplicate API calls? Thanks a lot in advance.

Asked 10 months ago, 1278 views
1 Answer
Accepted Answer

Hello.
A Lambda function can be invoked whenever an object is created in S3 by configuring an S3 trigger, as described in the following document.
https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
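
For the dynamic date/time part of the path, note that the prefix filter on an S3 event notification is a literal match on the start of the key, and wildcards are not supported. One option is therefore to filter on the fixed part of the path, "hostData/", and let the date and time folders vary beneath it. The following is a minimal sketch of such a notification configuration; the helper name `build_notification_config` and the ARN are hypothetical placeholders, not part of the original answer.

```python
def build_notification_config(function_arn, prefix="hostData/"):
    """Build an S3 notification configuration that invokes a Lambda
    function for every object created under a fixed key prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Fire on any kind of object creation (PUT, POST, copy, ...).
                "Events": ["s3:ObjectCreated:*"],
                # Literal prefix match; the yyyy/mm/dd/hh/mm folders below
                # "hostData/" are all covered without listing them.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


# Placeholder ARN for illustration only.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:example"
)
```

The resulting dictionary could be passed as `NotificationConfiguration` to the boto3 S3 client's `put_bucket_notification_configuration`, assuming the function already allows S3 to invoke it. Setting the same prefix in the Lambda console trigger has the same effect.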

It is also a good idea to tag processed objects.
Once processing is complete, tag the object using "put_object_tagging", as described in the following documentation.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object_tagging.html
Because objects that have not yet been processed carry no such tag, your code can check whether the tag is present (for example, with an if statement) and skip any object that already has it.
Alternatively, you could move the files to a different folder once processing is complete, ideally one outside the trigger prefix so that the move does not invoke the function again.
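
The tag check described above might look like the sketch below. It assumes a tag named "processed" (a hypothetical choice) and takes the S3 client as a parameter, so in a Lambda function you would pass in `boto3.client('s3')`. The `call_api` argument stands in for whatever API call you need to make.

```python
# Hypothetical tag used to mark objects that have already been handled.
PROCESSED_TAG = {"Key": "processed", "Value": "true"}


def is_processed(s3_client, bucket, key):
    """Return True if the object already carries the processed tag."""
    tags = s3_client.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    return PROCESSED_TAG in tags


def mark_processed(s3_client, bucket, key):
    """Add the processed tag while keeping any other tags on the object.

    put_object_tagging replaces the whole tag set, so existing tags are
    read first and merged with the new one.
    """
    existing = s3_client.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    merged = [t for t in existing if t["Key"] != PROCESSED_TAG["Key"]]
    merged.append(PROCESSED_TAG)
    s3_client.put_object_tagging(
        Bucket=bucket, Key=key, Tagging={"TagSet": merged}
    )


def process_once(s3_client, bucket, key, call_api):
    """Call the API for an object only if it has not been tagged yet."""
    if is_processed(s3_client, bucket, key):
        return False
    call_api(bucket, key)
    mark_processed(s3_client, bucket, key)
    return True
```

Note that the check and the tag update are two separate requests, so two invocations running at the same moment for the same object could both pass the check. If that matters for your API, a stricter deduplication mechanism would be needed.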

Expert
Answered 10 months ago
Reviewed 10 months ago
  • Thanks Riku. Appreciate your help. What about the dynamic date arguments in the S3 prefix?

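  • Lambda is executed when an object is created in the folder "s3://mybucket/hostData/yyyy/mm/dd/hh/mm". The trigger prefix does not need to contain the date parts: the prefix filter matches the beginning of the object key and does not support wildcards, so a fixed prefix such as "hostData/" covers every date/time subfolder. When the Lambda is executed, the message is passed to the Lambda handler as an "event". The message contains the S3 bucket name and the object key. The sample code shows that the object key is obtained by "key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')".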

    import json
    import urllib.parse
    import boto3
    
    print('Loading function')
    
    s3 = boto3.client('s3')
    
    
    def lambda_handler(event, context):
        #print("Received event: " + json.dumps(event, indent=2))
    
        # Get the object from the event and show its content type
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
        try:
            response = s3.get_object(Bucket=bucket, Key=key)
            print("CONTENT TYPE: " + response['ContentType'])
            return response['ContentType']
        except Exception as e:
            print(e)
            print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
            raise e
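
Once the key is available, the date and time parts can be read back out of it. The sketch below assumes keys of the form "hostData/yyyy/mm/dd/hh/mm/<file name>", as in the question; the helper names `parse_key` and `keys_from_event` are hypothetical. It also loops over all entries in "Records" rather than only the first.

```python
import re
import urllib.parse
from datetime import datetime

# Assumed key layout: hostData/yyyy/mm/dd/hh/mm/<file name>
KEY_PATTERN = re.compile(
    r"^hostData/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2})/(.+)$"
)


def parse_key(key):
    """Return (datetime, file name) for a matching key, else None.

    datetime() raises ValueError if the digits are not a valid date/time.
    """
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    year, month, day, hour, minute = (int(g) for g in match.groups()[:5])
    return datetime(year, month, day, hour, minute), match.group(6)


def keys_from_event(event):
    """Return the URL-decoded object keys from every record in an S3 event."""
    return [
        urllib.parse.unquote_plus(record["s3"]["object"]["key"], encoding="utf-8")
        for record in event["Records"]
    ]
```

Inside the handler, `for key in keys_from_event(event): parsed = parse_key(key)` would give the timestamp folder for each new object, and keys that do not match the layout can be skipped when `parsed` is None.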
    
  • Thanks a lot Riku. I am able to invoke the API via Lambda.
