Questions tagged with Data Lakes

92 results
Hi, I'm relatively new to AWS Glue and am having trouble with the following transformation code: `DataSource4 = glueContext.create_dynamic_frame.from_catalog(database = "beta", table_name =`...
1 answer · 0 votes · 4460 views · asked 3 years ago
We are exploring use cases where we want to achieve in-place transformation and querying of S3 data lake data. We don't want to provision a database and create tables (so we are not keen to consider...
5 answers · 0 votes · 1171 views · Mayura · asked 3 years ago
I have a table with columns A, B, C, D, ..., where A is a partition key. In a Glue job I want to group the records of this table by column A. Is there a way to make the Glue workers aware of the...
1 answer · 0 votes · 776 views · asked 3 years ago
What is the best way to scale cross-account AWS KMS–encrypted Amazon S3 bucket access using ABAC? Tag Name – scaling-cross-account-kms-encrypted-s3-access-using-ABAC
1 answer · 0 votes · 493 views · AWS · asked 3 years ago
My blueprint requires an S3 PutObject event to start a Glue ETL job. I only see on-demand and schedule-based triggers as options when creating a blueprint. Does anyone know of a method to...
1 answer · 0 votes · 366 views · asked 3 years ago
I have an AWS data lake that is ready to be used. My use case for the data lake is to be able to ingest data from different API connectors (coming from other data vendors and service...
1 answer · 0 votes · 1880 views · Dinesh · asked 3 years ago
I ran a Lake Formation blueprint and the result was as follows:

| method to run | result |
| --- | --- |
| by hand | COMPLETED |
| scheduled | IMPORT FAILED(NoCredentialsError: Unable to locate... |

0 answers · 0 votes · 171 views · asked 3 years ago
Hi all, I'm using a Glue job that reads CSV files from S3 and loads the data into MySQL RDS. At the end of my PySpark Glue script, I call a stored procedure. The issue is that the Glue job status ends by...
1 answer · 0 votes · 1584 views · Jess · asked 3 years ago
I am following the AWS blog post [Dynamic Partitioning](https://aws.amazon.com/blogs/big-data/kinesis-data-firehose-now-supports-dynamic-partitioning-to-amazon-s3/). I have a Firehose delivery...
2 answers · 1 vote · 3689 views · AWS-LDD · asked 3 years ago
Hi Team, I run an AWS Glue job that reads data from a CSV file in an S3 bucket into my Aurora MySQL DB. My job fails because it interprets an empty string from the CSV ("") as a null value then...
1 answer · 0 votes · 4256 views · Jess · asked 3 years ago
Does AWS currently offer, or plan to offer in the future, a service equivalent to "NewSQL" services such as TiDB, Vitess, Yugabyte, and Oracle Heatwave? My understanding is that these services...
1 answer · 0 votes · 2048 views · asked 3 years ago
Could you please tell me whether there is any AWS service that can mask the content of S3 files (data masking/anonymization)? For example: 1. Masking Name from John to 'abcd' 2. Masking Phone...
3 answers · 1 vote · 2817 views · asked 3 years ago