Thanks for your answer! You are right, Glue doesn't support recursive functions, so I created a view in SQL Server, imported that view as a table in Glue, and replaced the recursive function in the script ;)
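To make the workaround concrete: a recursive CTE typically walks a parent-child hierarchy, and the SQL Server view precomputes that walk so Glue only has to read a flat table. A minimal conceptual sketch of what such a view computes, using hypothetical parent-child rows (not the poster's actual data):

```python
# Hypothetical parent-child rows that a recursive CTE in SQL Server might
# flatten into a hierarchy view before Glue imports it as a plain table.
rows = [("A", None), ("B", "A"), ("C", "A"), ("D", "B")]

def levels(rows):
    # Iteratively compute each node's depth, as the recursive view would,
    # so no recursion is needed on the Glue/Spark side.
    parent = {child: p for child, p in rows}
    out = {}
    for node in parent:
        depth, cur = 0, node
        while parent[cur] is not None:
            cur = parent[cur]
            depth += 1
        out[node] = depth
    return out

print(levels(rows))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```

Once the view materializes results like these, Glue can crawl it and read it like any other catalog table.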
I would like to inform you that Glue/Spark doesn't support recursive queries. However, you can use a Spark DataFrame with wildcards to load the data recursively.
Let's say you have data inside the following S3 path: s3://kg-testing/test2/2018/07/10/*.files.
Then you can use the following Glue code to read it recursively using wildcards (*).
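To see why the `*/*/*/*` pattern picks up every dated partition, here is a small local illustration using Python's `fnmatch` against hypothetical object keys (note: `fnmatch`'s `*` also crosses `/`, unlike Hadoop's glob, but for this layout the result is the same):

```python
from fnmatch import fnmatch

# Hypothetical object keys laid out as year/month/day partitions under test2/.
keys = [
    "test2/2018/07/10/part-0000.csv",
    "test2/2018/07/11/part-0001.csv",
    "test2/2019/01/05/part-0002.csv",
    "test2/manifest.json",
]

# The Glue script's path s3://kg-testing/test2/*/*/*/* matches objects four
# path segments below test2/, i.e. the files in every dated partition.
pattern = "test2/*/*/*/*"
matched = [k for k in keys if fnmatch(k, pattern)]
print(matched)  # the three dated part files; manifest.json is skipped
```

So one wildcard path covers all years, months, and days at once instead of requiring a recursive listing.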
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

print("**********DF printing*********")
# Wildcards in the path make Spark read every matching partition directory.
spark.read.csv("s3://kg-testing/test2/*/*/*/*").createOrReplaceTempView("testsample")
sqlDF = spark.sql("select * from testsample")
sqlDF.show()
In order for us to troubleshoot further by looking at the backend logs, please feel free to open a support case with AWS, attaching the sanitized script and the job run ID, and we would be happy to help.
- asked a year ago