Hi. I am trying to run an AWS Glue job where I transfer data from S3 to Amazon Redshift. However, I am receiving the following error:
Error Category: UNCLASSIFIED_ERROR; An error occurred while calling o107.pyWriteDynamicFrame. Exception thrown in awaitResult:
I'm not sure where to even begin troubleshooting this error.
Here is the script for the Glue job that I am trying to run:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue import DynamicFrame
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)
# Script generated for node Amazon S3
AmazonS3_node1704977097218 = glueContext.create_dynamic_frame.from_options(
    format_options={"quoteChar": '"', "withHeader": True, "separator": ","},
    connection_type="s3",
    format="csv",
    connection_options={"paths": ["s3://smedia-data-raw-dev/google/"], "recurse": True},
    transformation_ctx="AmazonS3_node1704977097218",
)
# Script generated for node Amazon Redshift
AmazonRedshift_node1704977113638 = glueContext.write_dynamic_frame.from_options(
    frame=AmazonS3_node1704977097218,
    connection_type="redshift",
    connection_options={
        "redshiftTmpDir": "s3://aws-glue-assets-191965435652-us-east-1/temporary/",
        "useConnectionProperties": "true",
        "dbtable": "public.google_test",
        "connectionName": "Redshift connection",
        "preactions": "DROP TABLE IF EXISTS public.google_test; CREATE TABLE IF NOT EXISTS public.google_test (resourcename VARCHAR, status VARCHAR, basecampaign VARCHAR, name VARCHAR, id VARCHAR, campaignbudget VARCHAR, startdate VARCHAR, enddate VARCHAR, adservingoptimizationstatus VARCHAR, advertisingchanneltype VARCHAR, advertisingchannelsubtype VARCHAR, experimenttype VARCHAR, servingstatus VARCHAR, biddingstrategytype VARCHAR, domainname VARCHAR, languagecode VARCHAR, usesuppliedurlsonly VARCHAR, positivegeotargettype VARCHAR, negativegeotargettype VARCHAR, paymentmode VARCHAR, optimizationgoaltypes VARCHAR, date VARCHAR, averagecost VARCHAR, clicks VARCHAR, costmicros VARCHAR, impressions VARCHAR, useaudiencegrouped VARCHAR, activeviewmeasurablecostmicros VARCHAR, costperallconversions VARCHAR, costperconversion VARCHAR, invalidclicks VARCHAR, publisherpurchasedclicks VARCHAR, averagepageviews VARCHAR, videoviews VARCHAR, allconversionsbyconversiondate VARCHAR, allconversionsvaluebyconversiondate VARCHAR, conversionsbyconversiondate VARCHAR, conversionsvaluebyconversiondate VARCHAR, valueperallconversionsbyconversiondate VARCHAR, valueperconversionsbyconversiondate VARCHAR, allconversions VARCHAR, absolutetopimpressionpercentage VARCHAR, searchabsolutetopimpressionshare VARCHAR, averagecpc VARCHAR, searchimpressionshare VARCHAR, searchtopimpressionshare VARCHAR, activeviewctr VARCHAR, ctr VARCHAR, relativectr VARCHAR);",
    },
    transformation_ctx="AmazonRedshift_node1704977113638",
)
job.commit()
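Side note: the preactions string above is long and easy to mistype, so I also sketched building it from a column list instead of maintaining the DDL by hand (plain Python; the column list here is abbreviated, the real one matches the table definition above):

```python
# Sketch: build the DROP/CREATE preactions string from a column list
# instead of hand-editing the long DDL. Column list abbreviated here;
# the real job uses all the columns from the table definition above.
TABLE = "public.google_test"
COLUMNS = [
    "resourcename", "status", "basecampaign", "name", "id",
    "campaignbudget", "startdate", "enddate",
    # ... remaining columns omitted for brevity ...
    "ctr", "relativectr",
]

ddl_cols = ", ".join(f"{col} VARCHAR" for col in COLUMNS)
preactions = (
    f"DROP TABLE IF EXISTS {TABLE}; "
    f"CREATE TABLE IF NOT EXISTS {TABLE} ({ddl_cols});"
)
print(preactions)
```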
Any help on what I can do to resolve this issue would be greatly appreciated. Thank you!
Hi. Thank you for your response.
I checked the logs but couldn't find anything around the error, or even the specific error message itself. Also, I'm using DBeaver to connect to my Redshift cluster, and I queried stl_load_errors, but it returns zero rows. Maybe I am doing something wrong. Do you have any suggestions?
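For reference, this is roughly what I queried (kept here as a plain Python string so the exact SQL is visible; the column selection and limit are just what I tried):

```python
# Roughly the diagnostic query I ran against Redshift in DBeaver,
# kept as a plain string. Note: stl_load_errors only shows rows for
# loads run by the querying user unless you query as a superuser,
# which could explain getting zero results.
STL_LOAD_ERRORS_QUERY = """
SELECT starttime, filename, line_number, colname, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 20;
"""
print(STL_LOAD_ERRORS_QUERY.strip())
```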