Questions tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

1733 results
This code works for some of the assets in DataZone but returns an error for others. Do I need to change the code according to the asset type, or what else can I change to fix the error? Content = ...
1
answers
0
votes
45
views
Harsh
asked 2 months ago
I get this error when I run a Glue ETL job: `Error Category: RESOURCE_NOT_FOUND_ERROR; An error occurred while calling o228.pyWriteDynamicFrame....
1
answers
0
votes
52
views
asked 2 months ago
* **Glue version**: 4.0 * **The Python code that causes the error:** ``` df.select([col(c).cast("string") for c in df.columns]).repartition(1).write.mode('overwrite').option('header',...
1
answers
0
votes
72
views
Chahin
asked 2 months ago
I have a crawler that I want to extract headers and data from a CSV file. When I run the crawler and then use Athena to query the table, it returns no data. It seems to only extract the...
1
answers
0
votes
70
views
asked 2 months ago
Invoking a Glue workflow from Step Functions produced the following error when deploying with CloudFormation: ``` Resource handler returned message: "Invalid State Machine Definition:...
2
answers
0
votes
85
views
asked 2 months ago
I have 4 CSVs with the same columns, and I am able to crawl them as 1 data table. The issue I am facing is that even after adding areColumnsQuoted = true, I am seeing each column value enclosed with double...
Accepted Answer
Amazon Athena
AWS Glue
1
answers
0
votes
113
views
akash
asked 2 months ago
I have the code below for setting up an alarm for an AWS Glue job using CDK: ``` import { aws_cloudwatch as cloudwatch, aws_events as events } from 'aws-cdk-lib'; // jobName. This is our AWS Glue script to...
5
answers
0
votes
154
views
RahulD
asked 2 months ago
I want to create a crawler on my RDS database, but I cannot create the role needed as it is disabled. The AWS console user I am using has an admin-level role. ![Enter image description...
1
answers
0
votes
160
views
Remiby
asked 2 months ago
I have written an ETL job in AWS Glue using the interactive notebook, and I want to enable job bookmarks to avoid reprocessing already processed data. The source data are in an S3 bucket, a Glue data...
2
answers
0
votes
194
views
Vas
asked 2 months ago
Hi, I am using an S3 bucket for data shuffling. The Glue job failed with the following error: "An error occurred while calling o147.saveAsTable. Job aborted due to stage failure: ResultStage 5...
2
answers
0
votes
177
views
mykc
asked 2 months ago
Hello, I want to add a file pattern in an AWS Glue ETL job Python script so that it generates files in the S3 bucket matching the pattern dostrp*.csv.gz, but I could not find a way to provide this file...
Accepted Answer
AWS Glue
1
answers
0
votes
143
views
RahulD
asked 2 months ago
I am developing a data pipeline for building a Redshift data warehouse as below: 1. Export DynamoDB data to S3 using the 'export to S3' feature 2. In Glue, create a Spark DataFrame on the S3 exported...
0
answers
0
votes
159
views
asked 2 months ago