Lambda function invokes correctly locally but crashes on AWS


Running a local invoke:

!sam local invoke "export-redshift-s3" --event event.json

Returns a 200 response, indicating the Lambda function executed correctly:

{"statusCode": 200, "body": "Data successfully exported to S3"}
Invoking Container created from asdf.dkr.ecr.region1.amazonaws.com/lambda-redshift:latest
Building image.................

Using local image: asdf.dkr.ecr.region1.amazonaws.com/lambda-redshift:rapid-x86_64.

START RequestId: asdf-9410-47a6-abb0-1c9fdb1ed422 Version: $LATEST
END RequestId: asdf-a82a-474b-b2b9-b569e464db71
REPORT RequestId: asdf-a82a-474b-b2b9-b569e464db71	Init Duration: 0.14 ms	Duration: 9436.90 ms	Billed Duration: 9437 ms	Memory Size: 128 MB	Max Memory Used: 128 MB

The Lambda function code inside the container:

from io import StringIO
import os
import boto3
import json
import pandas as pd
import sqlalchemy
from datetime import datetime
import utils
import logging

logging.basicConfig(level=logging.INFO)

def handler(event, context):
    
    ID_PROJECT = event['ID_PROJECT']
    
    logging.info(f"ID_PROJECT: {ID_PROJECT}")
        
    db_credentials = utils.get_secret()
    
    logging.debug("db_credentials retrieved from Secrets Manager")

    conn_string = (
        f"redshift+psycopg2://{db_credentials['user']}:{db_credentials['password']}@"
        f"{db_credentials['endpoint']}:{db_credentials['port']}/{db_credentials['name']}"
    )

    engine = sqlalchemy.create_engine(conn_string)

    with engine.connect() as conn:

        ID_EXPERIMENT = conn.execute('''
        SELECT
            asdf
        FROM
            asdf
        ''').fetchone()[0] + 1
        
        now = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
        
        insert_query = f"""
            INSERT INTO
                asdf
                    asdf
                VALUES
                    asdf
        """

        conn.execute(insert_query)
        
        query = f'''
        UNLOAD (
            'SELECT
                *
            FROM
                asdf
            WHERE
                asdf'
        )

        TO 's3://asdf'
        IAM_ROLE 'arn:aws:iam::asdf:role/lambda-export-redshift-s3'
        ALLOWOVERWRITE
        PARALLEL OFF    
        FORMAT AS PARQUET;
        '''

        conn.execute(query)
    
    return {
        'statusCode': 200,
        'body': 'Data successfully exported to S3'
    }
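
For reference, utils.get_secret() is not shown above. A minimal sketch of what it might look like, assuming the credentials are stored in Secrets Manager (the secret name and region below are placeholders):

import json
import boto3

def get_secret():
    # Placeholder secret name and region; adjust to your setup
    client = boto3.client('secretsmanager', region_name='region1')
    response = client.get_secret_value(SecretId='redshift/db-credentials')
    # Expected keys: user, password, endpoint, port, name (as used in the handler)
    return json.loads(response['SecretString'])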

The Dockerfile:

FROM public.ecr.aws/lambda/python:3.12

# Copy requirements.txt
COPY requirements.txt ${LAMBDA_TASK_ROOT}

# Update setuptools and install specified packages
RUN pip install --upgrade setuptools && \
    pip install -r requirements.txt

# Copy function code
COPY lambda_function.py ${LAMBDA_TASK_ROOT}
COPY utils.py ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler
CMD [ "lambda_function.handler" ]
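
After the image is pushed to ECR, the function has to be pointed at the new image. A minimal sketch with boto3, assuming the function is named export-redshift-s3 (as in the local invoke) and the image URI is the one shown in the logs above:

import boto3

lambda_client = boto3.client('lambda')

# Point the function at the freshly pushed container image
lambda_client.update_function_code(
    FunctionName='export-redshift-s3',
    ImageUri='asdf.dkr.ecr.region1.amazonaws.com/lambda-redshift:latest',
)

# Wait until the update finishes before invoking again
lambda_client.get_waiter('function_updated').wait(FunctionName='export-redshift-s3')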

From the X-Ray traces, the initialization of the container seems to be correct.

X-Ray traces

I even gave Lambda's role admin access:

Lambda's role

Since I'm using the Lambda function to UNLOAD data from Redshift to S3, the IAM role is also attached to the Redshift cluster (and because the local invocation works, that part of the configuration seems fine).

Any guidance on why the Lambda function fails when invoked on AWS?


On AWS, the invocation runs for more than 4 minutes and then times out, even though the local invocation shows it should take less than 15 seconds.
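
For reference, the deployed configuration can be read back with boto3 (the function name is the one from the local invoke):

import boto3

cfg = boto3.client('lambda').get_function_configuration(FunctionName='export-redshift-s3')
print('Timeout (s): ', cfg['Timeout'])
print('Memory (MB): ', cfg['MemorySize'])
# An empty or missing VpcConfig means the function is not attached to a VPC
print('VpcConfig:   ', cfg.get('VpcConfig', {}))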

Thanks in advance for your willingness to help out!

  • Can you share the error when crashing? Any log outputs?

3 Answers

Apparently your function is in a VPC. If a VPC is really necessary, check whether the subnet is public or private: a private subnet must reach the internet through a NAT Gateway, or you will need VPC Endpoints (which have a cost) for the AWS services the function uses, such as ECR, S3, and Redshift. Also check the security group rules applied to the function to confirm they allow the necessary connections. Are you using ECR for your image? Is the function at least able to pull the image?
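
As a quick check, the endpoints that already exist in the VPC can be listed with boto3 (the VPC ID below is a placeholder):

import boto3

ec2 = boto3.client('ec2')
# Replace with the VPC the function (or the Redshift cluster) lives in
resp = ec2.describe_vpc_endpoints(
    Filters=[{'Name': 'vpc-id', 'Values': ['vpc-0123456789abcdef0']}]
)
for ep in resp['VpcEndpoints']:
    print(ep['ServiceName'], ep['VpcEndpointType'], ep['State'])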

answered 3 months ago

Ahh, I think I know what's happening.

Your Lambda function may not have access to the S3 or Redshift endpoints. Can you confirm whether your Lambda function is connected to a VPC? If it is VPC-connected, you will either need the Lambda function in a private subnet with a route to a NAT Gateway, or to have VPC Endpoints set up for each service the Lambda function needs to consume.

If it's not in a VPC, then connectivity to S3 will be fine.
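
If the function does need to reach Redshift privately, it can be attached to the cluster's VPC, roughly like this (the subnet and security group IDs are placeholders):

import boto3

boto3.client('lambda').update_function_configuration(
    FunctionName='export-redshift-s3',
    VpcConfig={
        # Private subnets with a route to a NAT Gateway (or VPC Endpoints for S3 / Secrets Manager)
        'SubnetIds': ['subnet-0123456789abcdef0'],
        'SecurityGroupIds': ['sg-0123456789abcdef0'],
    },
)

The execution role also needs the EC2 network interface permissions (for example the AWSLambdaVPCAccessExecutionRole managed policy) for the VPC attachment to work.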

answered 3 months ago

The Lambda function wasn't connected to a VPC.

Since the Redshift cluster is in a VPC, could that be why the Lambda function is not working?

I'll work on the NAT Gateway and get back here. Thanks for the kind guidance!

datons
answered 3 months ago
  • If your Redshift cluster is only accessible privately through the VPC, your Lambda function must be in the same VPC for the connection to work. With that in place, follow the other tips mentioned above.

    When the Lambda function is not in a VPC, it gets a temporary public IP and works directly over the internet.

  • Gotcha! Thanks for the confirmation. I'll follow the steps you have proposed.
