Questions tagged with Data Lakes
Hi, I'm relatively new to AWS Glue and am having trouble with the following transformation code:
```
DataSource4 = glueContext.create_dynamic_frame.from_catalog(database = "beta", table_name =...
```
We are exploring use cases where we want to achieve in-place transformation and querying of S3 data lake data. We don't want to provision a database and create tables (so we are not keen to consider...
I have a table with columns A, B, C, D, ..., where A is a partition key. In a Glue job I want to group records of this table by column A. Is there a way to make the Glue workers aware of the...
What is the best way to scale cross-account AWS KMS–encrypted Amazon S3 bucket access using ABAC?
Tag Name – scaling-cross-account-kms-encrypted-s3-access-using-ABAC
My blueprint needs an S3 PutObject event to start a Glue ETL job. I only see on-demand and schedule-based triggers as options when creating a blueprint. Does anyone know of a method to...
I have an AWS data lake that is ready to be used.
My use case for the data lake is to be able to ingest data from different API connectors (coming from other data vendors and service...
I ran a Lake Formation blueprint and the result was as follows:
| method to run | result |
| --- | --- |
| by hand | COMPLETED |
| scheduled | IMPORT FAILED(NoCredentialsError: Unable to locate...
Hi all,
I'm using a Glue job that reads CSV files from S3 and loads the data into MySQL RDS.
At the end of my PySpark Glue script, I call a stored procedure. The issue is that the Glue job status ends by...
I am following the AWS blog post [Dynamic Partitioning](https://aws.amazon.com/blogs/big-data/kinesis-data-firehose-now-supports-dynamic-partitioning-to-amazon-s3/).
So I have a Firehose delivery...
Hi Team,
I run an AWS Glue job that loads data from a CSV file located in an S3 bucket into my Aurora MySQL DB.
My job fails because it interprets an empty string from the CSV ("") as a null value, then...
Does AWS currently offer, or plan to offer in the future, a service equivalent to "NewSQL" services such as TiDB, Vitess, Yugabyte, and Oracle Heatwave? My understanding is that these services...
Could you please tell me whether there is any AWS service that can mask the content of S3 files (data masking/anonymization)?
For example:
1. Masking Name from John to 'abcd'
2. Masking Phone...