Unanswered Questions tagged with AWS Glue

GLUE Custom transform: pyspark.sql.functions does not import mode

Hello, I'm writing a custom transform where I want to use mode within pyspark.sql.functions but I get the same issue irrespective of whether I use * or import the specific module. How can I resolve...

Analytics AWS Glue Extract Transform & Load Data

Rohan

asked 10 months ago

How to create SQL applications (legacy) in Amazon Kinesis analytical studio

Hello all, I'm doing an AWS workshop (Data Engineering Immersion Day). As part of the workshop, I have to create a SQL application (legacy), but AWS has now deprecated the ability to create new applications. ...

Analytics Database AWS Glue Amazon Kinesis Data Analytics Amazon Kinesis

121 views

Sai

asked 10 months ago

Glue column values are null when values contain #

I am trying to use the AWS DynamoDB export to S3, but when I read the data in Glue, the value of an entire column comes back as null. I have tried the same thing multiple times. And I ran a...

Amazon Simple Storage Service AWS Glue Amazon DynamoDB

102 views

harshitgupta9715

asked 10 months ago

[CDK] Create a Glue Trigger that triggers Glue Crawler after Glue Job is finished successfully.

I want to build a Glue Trigger that triggers a Glue Crawler after a Glue Job has finished successfully. I looked over the cfnTrigger and wrote code for it. After CDK DEPLOY and finishing Glue Job...

AWS Glue AWS Cloud Development Kit (CDK)

112 views

rePost-User-1921592

asked 10 months ago

class org.apache.parquet.io.GroupColumnIO cannot be cast to class org.apache.parquet.io.PrimitiveColumnIO

I'm using Athena on an S3 bucket that's been crawled and get the error: class org.apache.parquet.io.GroupColumnIO cannot be cast to class org.apache.parquet.io.PrimitiveColumnIO. I've narrowed down the...

Amazon Athena AWS Glue

Marc

asked 10 months ago

AWS Glue Job error "AnalysisException: "

Hi team, I am trying to archive MongoDB data to S3 in Parquet format, so I created a Spark script for that. When I execute the Spark script, I get the error below. How to resolve this...

AWS Glue DataBrew Amazon DocumentDB AWS Glue Parameter Store AWS Glue for Ray

125 views

Krishnakumar

asked 10 months ago

Suggested enhancements for Glue Workflows

Hi all, I've noticed some limitations while using Glue Workflows, and I'd like to suggest enhancements or hear whether there are alternatives. 1) Suppose you have job C depending on both jobs A and B...

AWS Glue AWS Batch Operational Excellence

225 views

MarcosNagato-Araujo

asked 10 months ago

How to read data from a raw bucket and write to a discovery bucket using an AWS Glue job

I have a Glue job that reads from a raw bucket and writes to a discovery bucket. In this process I’m facing an error: the job is not able to process the files present in the raw bucket location (from logs...

Amazon Simple Storage Service AWS Glue

Madhu

asked 10 months ago

CloudFormation to Update Existing AWS Glue Crawler for DocumentDB Collections

I'm trying to update an existing AWS Glue Crawler for a DocumentDB instance. Given that it won't take a wildcard to add all the collections to the crawler, I'm looking for an easy way to add several...

AWS CloudFormation AWS Glue

texnoob

asked 10 months ago

AWS Glue Job detects schema changes, but they don't appear in Redshift.

Hello, we have an S3 bucket with various CSV files, an AWS Glue crawler to update the Data Catalog, and finally an AWS Glue job to move the data to Redshift. The handling of data and target table is...

AWS Glue Extract Transform & Load Data Amazon Redshift

132 views

Isabel

asked 10 months ago

Athena Federated Queries to DocumentDB: COLUMN_NOT_FOUND: line 1:8: Column cannot be resolved or requester is not authorized to access requested resources

I've got Athena set up to query a DocumentDB instance, with the Lambda function built and AWS Glue configured. The setup was done through the data source connector for DocumentDB. I can see the database...

Amazon Athena Amazon DocumentDB AWS Glue

191 views

texnoob

asked 10 months ago

Data processing best practices

Hello everyone. JSON data from a REST API is loaded daily by Lambda into s3-bucket-1. This data should then be stored in s3-bucket-2 as a flat Parquet table. I did it in...

Product Design AWS Glue Extract Transform & Load Data

Anton

asked 10 months ago
