Questions tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

1732 results
Hi, I have been using a Docker image from amazon/aws-glue-libs:glue_libs_4.0.0_image_01 to run Glue Spark jobs locally. I also want to test Ray locally in the same manner, by inspecting the image...
0 answers, 0 votes, 224 views, asked 2 months ago
We are using Step Functions for our ETL pipeline. The first step kicks off 21 jobs that each take about 1-3 minutes and consume 2 DPUs. The Step Function fails with the below error when trying to...
1 answer, 0 votes, 276 views, asked 2 months ago
We are trying to read a CSV file and process the data using AWS Glue, and we get the error message below: Py4JJavaError: An error occurred while calling o91.schema. :...
1 answer, 0 votes, 302 views, asked by sravan 2 months ago
How can I monitor who is querying which Glue tables? After some trial and error I found that the BatchGetTable Glue API event is recorded by CloudTrail every time I run an Athena query, and...
2 answers, 0 votes, 493 views, asked by Soumaya 2 months ago
I have an Athena Iceberg table. The table has 2 partitions. Each hour I update it with `MERGE` and `DELETE` commands. ``` SELECT count(*) FROM "my_table$files" ``` now **gives 16. Meanwhile data...
1 answer, 0 votes, 448 views, asked by Smotrov 2 months ago
I have a bunch of parquet files in a flat S3 folder, no partitions:...
1 answer, 0 votes, 177 views, asked by ecmons 2 months ago
Hi. I had a table that was created by a crawler, then I deleted the table (in Athena) and created it by DDL. After running the crawler, it could not find the table and create a new table. Note: The S3...
1 answer, 0 votes, 361 views, asked by gh02 2 months ago
I have a few text files on S3 that I need to add to the Glue Catalog in order to use them in a job. None of them have separators; they are all fixed-width files. I have the schemas, but the crawler...
Accepted Answer, tagged: AWS Glue
1 answer, 0 votes, 96 views, asked 2 months ago
Hello, I am currently working with AWS Glue ETL Jobs and encountered an issue where the "Push to repository" and "Pull from repository" options are disabled when trying to push the script/job to...
1 answer, 0 votes, 209 views, asked 2 months ago
I created a custom visual transform component and put the needed JSON and Python files in S3. The component loaded as expected. Later, I needed to make some more adjustments to the parameters...
2 answers, 0 votes, 171 views, asked by EdwardR 2 months ago
I have a Glue PySpark script that processes DDB data exported to S3 and writes it to Redshift. Initially, it was using the logic below: ``` redshiftConnectionOptions = { "postactions": "BEGIN; MERGE...
1 answer, 0 votes, 229 views, asked 2 months ago
I just can't understand what I'm doing wrong. I have a table. ``` CREATE EXTERNAL TABLE test ( originalrequest string, requeststarted string ) PARTITIONED BY ( req_start_partition...
Accepted Answer, tagged: Amazon Athena, AWS Glue
2 answers, 0 votes, 367 views, asked by Smotrov 2 months ago