How can I run SQL statements in my AWS Glue script?


Hello,

I am using AWS Glue to read data from S3 and write it to a table in Redshift.

I want to run some SQL statements in my auto-generated AWS Glue script.

I import from pyspark.sql.functions import * and then pass my SQL to spark.sql.

spark.sql(""" update datahub_source.dos.DemographicsSCPR set ETL_CURR_REC='Y',ETL_CREATED_DT=GETDATE(),ETL_UPDATED_DT=GETDATE() WHERE ETL_CURR_REC IS NULL""")

It throws the error below. Please help, as I am new to Glue and doing a POC:

ParseException: "\nmismatched input 'update' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 1)\n\n== SQL ==\n update datahub_source.dos.DemographicsSCPR set ETL_CURR_REC='Y',ETL_CREATED_DT=GETDATE(),ETL_UPDATED_DT=GETDATE() WHERE ETL_CURR_REC IS NULL \n-^^^\n"

asked 2 years ago, 3754 views
1 Answer

Hello,

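Spark SQL in your Glue job does not support UPDATE statements, so the parser does not recognize the update keyword; that is why the ParseException lists the statements it does accept. Also note that spark.sql runs against Spark's own catalog, not against your Redshift database. You can find the Spark SQL syntax [here](https://spark.apache.org/docs/latest/sql-ref-syntax.html).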

If you are using AWS Glue to load your data into Redshift, I suggest checking this article, which shows how to run SQL queries on the Redshift database using Glue pre and post actions (the preactions and postactions connection options).

AWS
SUPPORT ENGINEER
answered 2 years ago
