Questions tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

1922 results
I have a large DynamoDB table with 7 TB of data and 25 billion rows. This is a production table. I need to scan the table and add or update a column on each row. The table has a daily export to S3. I am conside...
2 answers, 0 votes, 77 views
asked a month ago
I am trying to set identifier-field-ids on Iceberg tables so that Firehose can perform update/delete operations on them, since I cannot add unique keys on a dynamic database. I am creating I...
1 answer, 0 votes, 241 views
asked a month ago
I have set up SageMaker Unified Studio using the manual setup method and created a project within the domain with the "All Capabilities" project profile. I have also created and added new compute of em...
2 answers, 0 votes, 108 views
asked a month ago
Hi, I'm attempting to create a Glue for Ray interactive session. Based on the [announcement blog post](https://aws.amazon.com/blogs/big-data/introducing-aws-glue-for-ray-scaling-your-data-integration-wor...
1 answer, 0 votes, 51 views
asked a month ago
I have created a compute connection for an existing Redshift Serverless workgroup in SageMaker Unified Studio and added the necessary tag to the workgroup as described in the documentation. ![Enter image description here](/...
1 answer, 0 votes, 65 views
asked 2 months ago
When using Athena to query a Redshift table, the query fails on a column of type timestamptz: ICEBERG_BAD_DATA: Field created_at's type INT96 in parquet file s3://AAAAAAAAAAAAAAA-5b5dc388-103a-4130-bab4-c1508e...
2 answers, 0 votes, 70 views
asked 2 months ago
I have a Glue Schema Registry, Registry A, created in AWS account A, and I want to give resources in an AWS account access to retrieve schemas from Registry A.
1 answer, 0 votes, 75 views
asked 2 months ago
Hi, I have been looking into a solution that uses the Athena invoker_principal to pass the ARN of the IAM role in use into the SQL query. Is there a way to do the same if EMR or Redshift...
1 answer, 0 votes, 61 views
asked 2 months ago
I am trying to get the metadata of a database by running a query in Athena: SELECT 'DEV' DF_ENVIRONMENT, 'Source Layer' DATA_LAYER, CAST(TABLE_CATALOG AS VARCHAR) DATABASE_NAME, CAST(TABL...
2 answers, 0 votes, 78 views
asked 2 months ago
Whenever I run a data quality job on a Glue table that was created via a Spark SQL CTAS command in a Glue job, I get the following error: *Exception in User Class: java.lang.RuntimeException : Failed...
1 answer, 0 votes, 55 views
asked 2 months ago
Hello, I have an AWS Glue 5.0 job where I specify `--additional-python-modules s3://my-dev/other-dependencies /MyPackage-0.1.1-py3-none-any.whl` in my job options. My Glue job itself is just a `...
2 answers, 0 votes, 67 views
asked 2 months ago
Hi AWS, our organization generates a monthly report that we need to share with a partner AWS account. The report is in Parquet format, stored in one of the Glue tables under the Glue database...
1 answer, 0 votes, 94 views
asked 2 months ago