Questions tagged with Data Lakes
92 results
Hi team, I have an AWS Glue job that reads data from a CSV file in S3 and loads it into a table in a MySQL RDS Aurora DB.
The escapeChar used in the CSV file is the backslash (\).
I have 2...
Hi team,
I have an AWS Glue job that reads data from a CSV file in S3 and loads it into a table in a MySQL RDS Aurora DB.
The issue is all lines in the CSV file with escaped characters are...
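For questions like the two above, it can help to first confirm locally how a backslash-escaped CSV is expected to parse, before looking at the Glue job itself. The following is only a minimal sketch using Python's standard `csv` module, not the Glue API; the function name `parse_escaped_csv` and the sample data are hypothetical and not taken from the original questions.

```python
import csv
import io


def parse_escaped_csv(text, escapechar="\\"):
    """Parse CSV text whose special characters are escaped with `escapechar`.

    Hypothetical helper for local checks only; it does not touch S3 or Glue.
    """
    reader = csv.reader(
        io.StringIO(text),
        escapechar=escapechar,  # backslash escapes the next character
        quotechar='"',
        doublequote=False,      # quotes are escaped as \" rather than ""
    )
    return list(reader)


# Illustrative sample: an escaped quote inside a quoted field,
# and an escaped comma inside an unquoted field.
sample = 'id,comment\n1,"say \\"hi\\""\n2,a\\,b\n'
rows = parse_escaped_csv(sample)

for row in rows:
    print(row)
```

If the rows printed here differ from what lands in the database, the escape character is probably not being applied when the file is read by the job.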
I'm syncing data written to S3 using Apache Hudi with Hive & Glue.
Hudi options:
```
hudi_options:
'hoodie.table.name': mytable
'hoodie.datasource.write.recordkey.field': Id
...
```
Hi,
I am signed in with a userId that has IAM Administrator access and DataLake Administrator access. The userId also has permissions to table columns set up in Lake Formation. However, when...
Business users regularly host data on Microsoft SharePoint. Is there any way to get CSV or XLSX data into SPICE that doesn't involve using a proprietary solution?
Hello,
As part of a SaaS solution, I'm currently setting up the structure for an S3 bucket which will contain multiple clients' data.
The idea is to use one access point per client, in order to...
Are governed tables insert/append only? Is it possible to update data already in the table? ...
With the Glue Console (Glue 3.0 - Python and Spark), I need to overwrite the data of an S3 bucket in an automated daily process. I tried with the `glueContext.purge_s3_path( ...
I've been trying this for a week but I'm starting to give up - I need some help understanding this. I have an S3 bucket full of XML files, and I am creating a PySpark ETL job to convert them to...
We have a BI feature in which a web app that uses non-AWS authentication queries Athena for data that is Hive-partitioned by customer. Currently any BI query gets modified to filter data down to just...
When defining blueprints in AWS Lake Formation, can we specify a particular snapshot? Does Lake Formation always use the most recent snapshot by default?
A customer is interested in doing analytics using the data stored in multiple platforms like NetSuite ERP and Magenta (RDS MariaDB db backend) in AWS. They are looking to integrate the data (about 8...