Skip to content

How do I create an AWS Lambda function that runs queries in Amazon Redshift?

3 minute read
0

I want to create an AWS Lambda function that runs queries on Amazon Redshift.

Resolution

Prerequisite: To set up the Amazon Virtual Private Cloud endpoints, complete the following steps.

  1. Create a VPC with a private subnet.
  2. Create a subnet group. Add the VPC and subnet that you just created.
  3. Create a private Amazon Redshift cluster. Select the VPC and subnet group that you just created.
  4. Create a new secret for Amazon Redshift with AWS Secrets Manager. For your secret name, enter redshift.

Create your Lambda function

To create a Lambda function that queries your Amazon Redshift cluster, complete the following steps:

  1. Open the Lambda console.
  2. Choose Create function.
  3. Choose the Author from Scratch option.
  4. Update the following fields:
    For Function name, enter a custom name.
    For Runtime, enter your code environment.
    Note: The examples in this resolution are compatible with Python 3.9.
    For Architecture, enter your system architecture.
    Note: The examples in this resolution are compatible with x86_64.
    For Permissions, choose Create a new role with basic Lambda permissions.
  5. Choose Create function.

Set the permissions for your Lambda function

Complete the following steps:

  1. Open the Lambda console.
  2. Choose Configuration.
  3. Choose Permissions.
  4. Choose the role created for your Lambda function.
  5. Choose Add Permissions.
  6. Choose Attach policies.
  7. Add the AmazonRedshiftDataFullAccess and SecretsManagerReadWrite policies to your Lambda execution role.
    Note: It's a best practice to grant least privilege for only the permissions required to perform a task.

Add Python code to your Lambda function

In the following example, Lambda connects to the Amazon Redshift database and creates a lambda_func table in the public schema.

Complete the following steps:

  1. Open the Lambda console.

  2. Choose Code.

  3. Enter the following code into the Code box:

    import os 
    import json 
    import boto3 
    import botocore  
    import botocore.session as bc 
    from botocore.client import Config 
     
    print('Loading function') 
     
    secret_name=os.environ['SecretId'] # getting SecretId from Environment varibales 
    session = boto3.session.Session() 
    region = session.region_name 
     
    # Initializing Secret Manager's client     
    client = session.client( 
        service_name='secretsmanager', 
            region_name=region 
        ) 
     
    get_secret_value_response = client.get_secret_value( 
            SecretId=secret_name 
        ) 
    secret_arn=get_secret_value_response['ARN'] 
     
    secret = get_secret_value_response['SecretString'] 
     
    secret_json = json.loads(secret) 
     
    cluster_id=secret_json['dbClusterIdentifier'] 
     
    # Initializing Botocore client 
    bc_session = bc.get_session() 
     
    session = boto3.Session( 
            botocore_session=bc_session, 
            region_name=region 
        ) 
     
    # Initializing Redshift's client    
    config = Config(connect_timeout=5, read_timeout=5) 
    client_redshift = session.client("redshift-data", config = config) 
     
    def lambda_handler(event, context): 
        print("Entered lambda_handler") 
     
        query_str = "create table public.lambda_func (id int);" 
        try: 
            result = client_redshift.execute_statement(Database= 'dev', SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id) 
            print("API successfully executed") 
             
        except botocore.exceptions.ConnectionError as e: 
            client_redshift_1 = session.client("redshift-data", config = config) 
            result = client_redshift_1.execute_statement(Database= 'dev', SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id) 
            print("API executed after reestablishing the connection") 
            return str(result) 
             
        except Exception as e: 
            raise Exception(e) 
             
        return str(result) 

    Note: Replace dev with the name of your database.

  4. Add an environment variable in the Lambda function's configuration section for the key SecretId and secret_name.

AWS OFFICIALUpdated 15 days ago
7 Comments

This code is missing the important part of making describe_statement request to actually see the results of your statement execution.

replied 3 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
EXPERT
replied 3 years ago

I am able to run the lambda against a serverless redshift cluster. The execute statement command works, but I am not able to see the returned result. result = client_redshift.execute_statement(Database= 'dev', SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id)

I am running Boto3 version 1.24.65

Logging the results end up blank. With this statement "logger.info('response', response)" I see this in CloudWatch:

[INFO] 2023-05-10T03:51:56.094Z a01a4549-5dc6-4ead-8060-c004b331f6ed response

The response object is never logged. And when trying to convert "result" to string with str(result) gives the following error:

[ERROR] TypeError: not all arguments converted during string formatting

But when I check the cluster via the AWS console, I see the new table was created. What needs to be done to be able to see the result of the query execution?

replied 3 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
EXPERT
replied 3 years ago

As mentioned you'd need something to get the result of the execute command. I have a snippet below that I did, but you also have to import the time library as the time it takes the command to run can vary:

    statement_id = result['Id']

    describe_statement = client_redshift.describe_statement (
        Id = statement_id
    )
    while True:
        describe_statement = client_redshift.describe_statement (Id = statement_id)
        if (describe_statement["Status"] == "FINISHED"):
            print("Query Status - " + describe_statement["Status"])
            break;
        else:
            print("Query Status - " + describe_statement["Status"])
            time.sleep(2)
    statement_result = client_redshift.get_statement_result(
        Id = statement_id
    )

You can then take statement_result and pull out information you may need. For instance if you were running a SELECT query, you could see that by calling/printing statement_result["Records"]

replied 3 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
EXPERT
replied 3 years ago

This article was reviewed and updated on 2026-03-31.

AWS
MODERATOR
replied 15 days ago