Questions tagged with AWS Glue
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.
1736 results
Hello,
I am facing a weird issue with AWS Glue. I have a NAT Gateway in the VPC, which should take care of networking, so I am not sure why the networking issue persists. I...
I'm running a Visual ETL job in AWS Glue. I'm testing the service through the visual editor, and I started with a data source pointing to a DynamoDB table (beforehand I made a crawler, ran it, then I aws...
I have an EventBridge rule that triggers when a new file is added to an S3 bucket, with the EventBridge target being a Glue workflow. Now I want to pass event data from EventBridge to my Glue workflow...
I zipped my modules into a zip file, uploaded it to S3, and added it to PySpark and Shell jobs under the `Python library path` parameter...
Hi, I have been using a Docker image from amazon/aws-glue-libs:glue_libs_4.0.0_image_01 to run Glue Spark jobs locally.
I also want to test Ray locally in the same manner, by inspecting the image...
We are using Step Functions for our ETL pipeline. The first step kicks off 21 jobs that each take about 1-3 minutes and consume 2 DPUs. The Step Function fails with the below error when trying to...
We are trying to read a CSV file to process the data using AWS Glue, and we are getting the error message below:
Py4JJavaError: An error occurred while calling o91.schema.
:...
How can I monitor who is querying which Glue tables?
After some trial and error I found that the BatchGetTable Glue API event is recorded using CloudTrail every time I run an Athena query, and...
I have an Athena Iceberg table.
The table has 2 partitions.
Each hour I update it with `MERGE` and `DELETE` commands.
```
SELECT count(*) FROM "my_table$files"
```
now gives 16. Meanwhile data...
I have a bunch of parquet files in a flat S3 folder, no partitions:...
Hi. I had a table that was created by a crawler. Then I deleted the table (in Athena) and created it by DDL. After running the crawler, it could not find the table and create a new table.
Note: The S3...
I have a few text files on S3 that I need to add to the Glue Catalog in order to use them in a job. None of them have separators, they are all fixed-width files. I have the schemas, but the crawler...
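For readers with the same fixed-width problem, here is a minimal sketch of slicing such lines by a known schema in plain Python. The column names, offsets, and sample record are hypothetical and not taken from the question. In a Glue Spark job, one comparable approach would be to read each line as a single string column and cut it with Spark's `substring` function, but that is an assumption about the setup rather than something the question states.

```python
from typing import Dict, List, Tuple

# Hypothetical schema for illustration only: (column name, start offset, width).
SCHEMA: List[Tuple[str, int, int]] = [
    ("customer_id", 0, 6),
    ("name", 6, 10),
    ("balance", 16, 8),
]


def parse_fixed_width(line: str, schema: List[Tuple[str, int, int]]) -> Dict[str, str]:
    """Slice one fixed-width line into named fields, trimming padding spaces."""
    return {name: line[start:start + width].strip() for name, start, width in schema}


# Hypothetical sample record: 6 + 10 + 8 = 24 characters wide.
sample = "000042Jane Doe  00123.45"
record = parse_fixed_width(sample, SCHEMA)
print(record)
```

All values come back as strings; casting to numeric or date types would be a separate step once the slicing is confirmed to match the real file layout.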