Questions tagged with Data Lakes

Hi team, I have an AWS Glue job that reads data from a CSV file in S3 and inserts the data into a table in a MySQL RDS Aurora DB. The issue is all lines in the CSV file with escaped characters are...
1
answers
0
votes
2091
views
Jess
asked 3 years ago
I'm syncing data written to S3 using Apache Hudi with Hive & Glue. Hudi options: ``` hudi_options: 'hoodie.table.name': mytable 'hoodie.datasource.write.recordkey.field': Id ...
1
answers
0
votes
4718
views
spree
asked 3 years ago
Hi, I am signed in with a userId which has IAM Administrator access and Data Lake Administrator access. The userId also has permissions on table columns set up in Lake Formation. However, when...
2
answers
0
votes
595
views
AWS
asked 3 years ago
Business users regularly host data on Microsoft SharePoint. Is there any way to get CSV or XLSX data into SPICE that doesn't involve using a proprietary solution?
1
answers
0
votes
1183
views
PaulS
asked 3 years ago
Hello, as part of a SaaS solution, I'm currently setting up the structure for an S3 bucket which will contain multiple clients' data. The idea is to use one access point per client, in order to...
1
answers
1
votes
645
views
asked 3 years ago
Are governed tables insert/append only? Is it possible to update data already in the table? ...
2
answers
0
votes
1678
views
asked 3 years ago
Using the Glue console (Glue 3.0, Python and Spark), I need to overwrite the data in an S3 bucket in an automated daily process. I tried with the `glueContext.purge_s3_path( ...
2
answers
0
votes
6405
views
asked 3 years ago
I've been trying this for a week and I'm starting to give up. I need some help understanding this. I have an S3 bucket full of XML files, and I am creating a PySpark ETL job to convert them to...
3
answers
0
votes
2002
views
Eelviny
asked 3 years ago
We have a BI feature where a web app that uses non-AWS authentication queries Athena for data that is Hive-partitioned by customer. Currently any BI query gets modified to filter data down to just...
2
answers
0
votes
1127
views
asked 3 years ago
When defining blueprints in AWS Lake Formation, can we specify a particular snapshot? Does Lake Formation always use the most recent snapshot by default?
2
answers
0
votes
591
views
asked 4 years ago
A customer is interested in doing analytics using data stored in multiple platforms like NetSuite ERP and Magenta (RDS MariaDB backend) in AWS. They are looking to integrate the data (about 8...
1
answers
0
votes
828
views
AWS
asked 4 years ago
I have a customer that uses Ab Initio at enterprise scale for on-prem ETL workloads. They now want to build a data lake on AWS, and would prefer using this already established tool to write from source...
1
answers
0
votes
2463
views
AWS
asked 4 years ago