How do I create a schema/table in Redshift cluster from inside a Glue script?

0

I have established a Redshift connection in AWS Glue already. My project would require me to create tables and schemas dynamically on the fly using the Glue script. Supposing that I have already got the field names to populate to column names in the table, how do I go about that?

I tried below script in Jupyter Notebook platform of Glue.

%idle_timeout 2880
%glue_version 4.0
%worker_type G.1X
%number_of_workers 5
%connections redshift

import sys, boto3
from awsglue.transforms import *
from pyspark.context import SparkContext
from awsglue.context import GlueContext
  
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session

redshift_connection = boto3.client('redshift')

spark.sql('CREATE TABLE IF NOT EXISTS public.test (id int)')

The last command gives below error.

AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to glue.ap-south-1.amazonaws.com:443 [glue.ap-south-1.amazonaws.com/43.204.239.230, glue.ap-south-1.amazonaws.com/13.234.208.37, glue.ap-south-1.amazonaws.com/13.232.151.228] failed: connect timed out

What do I do to achieve the goal? Assist me with the approaches. Thanks in advance.

feita há 7 meses333 visualizações
2 Respostas
1

Hello.

The error message you received is related to a connectivity timeout, which may not be directly related to your SQL query but instead due to issues related to networking or permissions. This needs to be resolved first before executing SQL statements.

Ensure that your Glue jobs/development endpoint has network access to your Redshift cluster. You may need to adjust your VPC/security group rules.

Make sure your Glue IAM role has the necessary permissions to interact with Redshift. Attach the necessary policies related to Redshift to the IAM role assumed by Glue.

Regards, Andrii

profile picture
ESPECIALISTA
respondido há 7 meses
profile picture
ESPECIALISTA
avaliado há 3 dias
1

Glue cannot reach the catalog, the "redshift" connection has a VPC without connectivity to the internet.
The best solution is to add the regional glue service endpoint to the VPC.

profile pictureAWS
ESPECIALISTA
respondido há 7 meses

Você não está conectado. Fazer login para postar uma resposta.

Uma boa resposta responde claramente à pergunta, dá feedback construtivo e incentiva o crescimento profissional de quem perguntou.

Diretrizes para responder a perguntas