How do I create a schema/table in Redshift cluster from inside a Glue script?

0

I have established a Redshift connection in AWS Glue already. My project would require me to create tables and schemas dynamically on the fly using the Glue script. Supposing that I have already got the field names to populate to column names in the table, how do I go about that?

I tried below script in Jupyter Notebook platform of Glue.

%idle_timeout 2880
%glue_version 4.0
%worker_type G.1X
%number_of_workers 5
%connections redshift

import sys, boto3
from awsglue.transforms import *
from pyspark.context import SparkContext
from awsglue.context import GlueContext
  
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session

redshift_connection = boto3.client('redshift')

spark.sql('CREATE TABLE IF NOT EXISTS public.test (id int)')

The last command gives below error.

AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to glue.ap-south-1.amazonaws.com:443 [glue.ap-south-1.amazonaws.com/43.204.239.230, glue.ap-south-1.amazonaws.com/13.234.208.37, glue.ap-south-1.amazonaws.com/13.232.151.228] failed: connect timed out

What do I do to achieve the goal? Assist me with the approaches. Thanks in advance.

posta 7 mesi fa332 visualizzazioni
2 Risposte
1

Hello.

The error message you received is related to a connectivity timeout, which may not be directly related to your SQL query but instead due to issues related to networking or permissions. This needs to be resolved first before executing SQL statements.

Ensure that your Glue jobs/development endpoint has network access to your Redshift cluster. You may need to adjust your VPC/security group rules.

Make sure your Glue IAM role has the necessary permissions to interact with Redshift. Attach the necessary policies related to Redshift to the IAM role assumed by Glue.

Regards, Andrii

profile picture
ESPERTO
con risposta 7 mesi fa
profile picture
ESPERTO
verificato 2 giorni fa
1

Glue cannot reach the catalog, the "redshift" connection has a VPC without connectivity to the internet.
The best solution is to add the regional glue service endpoint to the VPC.

profile pictureAWS
ESPERTO
con risposta 7 mesi fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande