Redshift serverless data api sql calls

0

Hi,

I am new to Redshift serverless and working on a Lambda function to connect to serverless database using Python redshift Data API and execute bunch of SQL and stored procedures that are stored in a Nexus repository artifact. I am seeing errors when I try to execute SQL statements read from a file as a string. Here is an example of a DDL from one of the scripts.

-- Table Definitions


-- test1 -----------------------


DROP TABLE IF EXISTS test1; CREATE TABLE test1 ( id varchar(32), name varchar(64) DEFAULT NULL, grade varchar(64) DEFAULT NULL, zip varchar(5) DEFAULT NULL

-- test2 --------------------------


DROP TABLE IF EXISTS test2; CREATE TABLE test2( id varchar(32), user_id varchar(32) DEFAULT NULL, hnum varchar(6), ts_created timestamp DEFAULT NULL, ts_updated timestamp DEFAULT NULL

-- and few other tables in the same script

The function runs fine if I hard code the sql query in the code and I don't see any syntax or other errors with the sql file contents since I could run those using Redshift query editor by manually copy n pasting all the DDLs. Am I missing anything or using data API is not the right approach for this use case?

Error and Traceback from the lambda function execution:

During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/var/runtime/bootstrap.py", line 60, in <module> main() File "/var/runtime/bootstrap.py", line 57, in main awslambdaricmain.main([os.environ["LAMBDA_TASK_ROOT"], os.environ["_HANDLER"]]) File "/var/runtime/awslambdaric/main.py", line 21, in main bootstrap.run(app_root, handler, lambda_runtime_api_addr) File "/var/runtime/awslambdaric/bootstrap.py", line 405, in run handle_event_request( File "/var/runtime/awslambdaric/bootstrap.py", line 165, in handle_event_request xray_fault = make_xray_fault(etype.name, str(value), os.getcwd(), tb_tuples) FileNotFoundError: [Errno 2] No such file or directory

1 Answer
0

Hello,

Thank you for writing on re:Post

Kindly know that “FileNotFoundError” error could be caused due to the following reasons:

  • The path of .SQL files in Nexus repository is not correct.
  • It could also be due to a python module missing that the function is trying to connect.

Also, if you are using Boto3, Python SDK, please note that the "execute_statement" API, which helps to run SQL statement, is not yet supported with Boto3 for Redshift-Serverless. So, we would recommend connecting to Redshift Serverless cluster, from Lambda, using redshift-connector.

The Amazon Redshift Connector module in python can help in connecting to Redshift-Serverless cluster and it also easily integrates with pandas and numpy. The connector supports numerous Amazon Redshift specific features that helps you get the most out of your data.

Kindly find below the steps to make the connection to cluster using Redshift Connector:

  1. In order to connect to Redshift Serverless, from a Lambda Function, you would first require to add the Redshift connector and the AWSDataWrangler-Python39 layers for the AWS Lambda Function as the Redshift Connector module is not an in-built module.

Required Layers:

a) Redshift Connector : This has to be downloaded in the form of a zip file and added as a layer. You can download the zip file from the following GitHub page: https://github.com/aws/amazon-redshift-python-driver

b) AWSDataWrangler-Python39 which is an AWS layer (a layer from a list of layers provided by AWS). Data Wrangler is built on top of other open-source projects like pandas, apache, arrow and boto3, it offers abstracted functions to execute usual ETL tasks like load/unload data from Data Lakes, Data Warehouse and Databases.

  1. Below is a sample code to connect to Redshift Serverless using a python script in AWS Lambda Service.

import redshift_connector

def lambda_handler(event, context):

conn = redshift_connector.connect

(
     host='HOST_NAME', database='DATABASE_NAME',

     user='DATABASE_USER',

     SecretArn: '',

     port=5439

)

cursor=conn.cursor()

cursor.execute("YOUR SQL STATEMENT")

result: tuple = cursor.fetchall()

print(result)

Note: HOST_NAME in the format : workgroup-name.account-ID.region.redshift-serverless.amazonaws.com . Also, please note that this sample code has been shared on a best-effort basis and we would request that you test the code thoroughly before deploying to your production environment, if you decide to use this code snippet.

Kindly try the above suggestions. Hope they will improve your experience. However, if the issue persists and you would like us to investigate the issue further then please create a technical support case with Amazon RedShift team or AWS Lambda team.

In order to open a Technical Support Case, you will need to have a Developer, Business or Enterprise Support Plan. If you do not have a Support Plan in place, please kindly visit the link provided below on how to sign up for one: https://aws.amazon.com/premiumsupport/knowledge-center/sign-up-support/

AWS
answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions