Questions tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.


1732 results
I get this error when I run a Glue ETL job: `Error Category: RESOURCE_NOT_FOUND_ERROR; An error occurred while calling o228.pyWriteDynamicFrame....
1 answer, 0 votes, 50 views, asked a month ago
* **Glue version**: 4.0 * **Python code that raises the error:** ``` df.select([col(c).cast("string") for c in df.columns]).repartition(1).write.mode('overwrite').option('header',...
1 answer, 0 votes, 69 views, Chahin, asked 2 months ago
I have a crawler that I'm trying to use to extract headers and data from a CSV file. When I run the crawler and then use Athena to query the table, it returns no data. It seems to only extract the...
1 answer, 0 votes, 65 views, asked 2 months ago
Invoking a Glue Workflow from Step Functions, I got the following error when deploying with CloudFormation: ``` Resource handler returned message: "Invalid State Machine Definition:...
2 answers, 0 votes, 83 views, asked 2 months ago
I have 4 CSVs that have the same columns, and I am able to crawl them as 1 data table. The issue I am facing is that even after adding areColumnsQuoted = true, I am seeing each column value enclosed with double...
Accepted Answer, Amazon Athena, AWS Glue
1 answer, 0 votes, 112 views, akash, asked 2 months ago
I have the code below for setting up an alarm for an AWS Glue job using CDK: ``` import { aws_cloudwatch as cloudwatch, aws_events as events } from 'aws-cdk-lib'; // jobName. This is our AWS Glue script to...
5 answers, 0 votes, 152 views, RahulD, asked 2 months ago
I want to create a crawler on my RDS database, but I cannot create the required role because the option is disabled. The AWS console user I am using has an admin-level role. ![Enter image description...
1 answer, 0 votes, 156 views, Remiby, asked 2 months ago
I have written an ETL job in AWS Glue using the interactive notebook, and I want to enable job bookmarks to avoid reprocessing already processed data. The source data are in an S3 bucket, a Glue data...
2 answers, 0 votes, 193 views, Vas, asked 2 months ago
Hi, I am using an S3 bucket for data shuffling. The Glue job failed with the following error: "An error occurred while calling o147.saveAsTable. Job aborted due to stage failure: ResultStage 5...
2 answers, 0 votes, 176 views, mykc, asked 2 months ago
Hello, I want to add a file pattern in an AWS Glue ETL job Python script so that it generates files in the S3 bucket matching the pattern dostrp*.csv.gz, but I could not find a way to provide this file...
Accepted Answer, AWS Glue
1 answer, 0 votes, 141 views, RahulD, asked 2 months ago
I am developing a data pipeline for building a Redshift data warehouse as below: 1. Export DynamoDB data to S3 using the 'export to S3' feature 2. In Glue, create a Spark DataFrame on the S3 exported...
0 answers, 0 votes, 159 views, asked 2 months ago
I'm trying to build a regular expression for a Grok pattern. The goal is: given a sequence of digits, I need to assign the first eleven digits to a named capturing group, and then digits from...
0 answers, 0 votes, 146 views, asked 2 months ago