How do I create a schema/table in Redshift cluster from inside a Glue script?

0

I have established a Redshift connection in AWS Glue already. My project would require me to create tables and schemas dynamically on the fly using the Glue script. Supposing that I have already got the field names to populate to column names in the table, how do I go about that?

I tried below script in Jupyter Notebook platform of Glue.

%idle_timeout 2880
%glue_version 4.0
%worker_type G.1X
%number_of_workers 5
%connections redshift

import sys, boto3
from awsglue.transforms import *
from pyspark.context import SparkContext
from awsglue.context import GlueContext
  
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session

redshift_connection = boto3.client('redshift')

spark.sql('CREATE TABLE IF NOT EXISTS public.test (id int)')

The last command gives below error.

AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to glue.ap-south-1.amazonaws.com:443 [glue.ap-south-1.amazonaws.com/43.204.239.230, glue.ap-south-1.amazonaws.com/13.234.208.37, glue.ap-south-1.amazonaws.com/13.232.151.228] failed: connect timed out

What do I do to achieve the goal? Assist me with the approaches. Thanks in advance.

demandé il y a 7 mois333 vues
2 réponses
1

Hello.

The error message you received is related to a connectivity timeout, which may not be directly related to your SQL query but instead due to issues related to networking or permissions. This needs to be resolved first before executing SQL statements.

Ensure that your Glue jobs/development endpoint has network access to your Redshift cluster. You may need to adjust your VPC/security group rules.

Make sure your Glue IAM role has the necessary permissions to interact with Redshift. Attach the necessary policies related to Redshift to the IAM role assumed by Glue.

Regards, Andrii

profile picture
EXPERT
répondu il y a 7 mois
profile picture
EXPERT
vérifié il y a 3 jours
1

Glue cannot reach the catalog, the "redshift" connection has a VPC without connectivity to the internet.
The best solution is to add the regional glue service endpoint to the VPC.

profile pictureAWS
EXPERT
répondu il y a 7 mois

Vous n'êtes pas connecté. Se connecter pour publier une réponse.

Une bonne réponse répond clairement à la question, contient des commentaires constructifs et encourage le développement professionnel de la personne qui pose la question.

Instructions pour répondre aux questions