Questions tagged with AWS Glue

Support for Record Matching transforms in Glue 3.0/4.0

I'm working on a project that uses Glue Record Matching transforms which, as far as I can tell from the AWS docs, are only supported in Glue 2.0 jobs (and additionally, the maximum Glue version I...

AWS Glue

bryanhart

asked 4 months ago

Delete old parquet files of overwritten Iceberg table

I am trying to write a PySpark DataFrame to S3 and the AWS Glue Data Catalog in the Iceberg format, using pyspark.sql.DataFrameWriterV2 with the createOrReplace function. When I write the same...

Amazon Athena, AWS Glue

639 views

Thomas Mueller

asked 4 months ago

Getting Error Category: UNCLASSIFIED_ERROR; An error occurred while calling o107.pyWriteDynamicFrame. Exception thrown in awaitResult: when running Glue job to transfer data from S3 to Redshift

Hi. I am trying to run an AWS Glue job where I transfer data from S3 to Amazon Redshift. However, I am receiving the following error: Error Category: UNCLASSIFIED_ERROR; An error occurred while...

AWS Glue, Extract Transform & Load Data, Amazon Redshift

1124 views

Matt_J

asked 4 months ago

Redshift Serverless -> Aurora Serverless Postgres (using AWS Glue)

I have a data pipeline built in Redshift Serverless that produces some final tables. We are also running a web app, for which I have set up an Aurora Serverless Postgres DB. The idea...

Aurora PostgreSQL, AWS Glue, Amazon Aurora, Amazon Redshift Serverless

121 views

Dan

asked 4 months ago

HIVE_UNSUPPORTED_FORMAT: Unable to create input format

Can someone please help with this error? I have a CSV file in an S3 bucket and created a crawler to update a table in Glue. The crawler runs, but when I try to view the data in Athena I get this...

Accepted Answer

Amazon Simple Storage Service, Amazon Athena, AWS Glue

573 views

Jessica Awesome

asked 4 months ago

AWS Glue DynamicFrame: where to get corrupt records?

Hi, this question is about corrupt or malformed records in Glue ETL. Spark DataFrames have an option for a designated _corrupt_record column when this happens, and the entire record is...

AWS Glue

210 views

Ric

asked 4 months ago

DataBrew - Iceberg Tables Support

Hello, I would like to know if there is a way to query Iceberg tables (backed by Parquet files in S3) cataloged in the AWS Glue Data Catalog using AWS Glue DataBrew (maybe through Athena?). Also, is it...

Accepted Answer

Amazon Athena, AWS Glue DataBrew, AWS Glue

576 views

Miguel Garcia

asked 4 months ago

AWS Data Catalog: AWS tag format ":" produces malformed metadata

Hi. Crawling Connect logs creates bad metadata, with fields like this inside the table: struct<connect\:Subtype:struct<ValueString:string>>. Obviously, running this struct inside Athena results in a...

Amazon Athena, AWS Glue

428 views

Elvin

asked 4 months ago

Launching the Spark history server and viewing the Spark UI using Docker

Hi, I have followed the documentation below to set up the Spark history server to view Spark UI logs. I am able to run the container but cannot access the URL http://localhost:18080. docker run...

AWS Glue

229 views

rePost-User-0428096

asked 5 months ago

Athena federated query to Timestream slow

We connected Timestream to Athena using the [Athena Timestream connector](https://docs.aws.amazon.com/athena/latest/ug/connectors-timestream.html). When running a federated query through Athena to...

Amazon Athena, Analytics, Database, AWS Glue, Amazon Timestream

778 views

Drew

asked 5 months ago

Setting up a connection between AWS Glue and MongoDB Atlas

I have a MongoDB Atlas cluster outside AWS. I want to use AWS Glue with my MongoDB databases, so I created a connection, but I'm getting "InvalidInputException: Unable to resolve any valid connection". ...

AWS Glue

268 views

ignacio

asked 5 months ago

Is it possible to convert a Spark DataFrame to a DynamicFrame and then use the bookmark feature on the S3 folder used to read the data into the Spark DataFrame?

```
df = spark.read.parquet("s3://folder/")
df = df.withColumn('filename', input_file_name())
AmazonS3_node1697616892615 = DynamicFrame.fromDF(df, glueContext, "s3sparkread")
```

if this is the code...

Amazon Simple Storage Service, AWS Glue, Extract Transform & Load Data

365 views

asked 5 months ago