Questions tagged with Data Lakes

We are using an AWS EMR Hadoop cluster on which PrestoDB is running, and we want to secure PrestoDB. We need a complete step-by-step implementation guide using Apache Ranger (open source) or LDAP. Application...
0 answers, 0 votes, 171 views
asked 2 years ago
I have created a templated job (with parameters) to ingest data from different tables (passing the database and table as parameters) and write the data to S3 (passing the destination bucket as...
1 answer, 1 vote, 462 views
asked 2 years ago
I am using EMR 6.6.0, which has Hudi 0.10.1. I am trying to bulk insert and do inline clustering using Hudi, but it does not seem to cluster the files according to the configured file size. But it is still...
1 answer, 0 votes, 574 views
AWS
Zahid
asked 2 years ago
I was trying to go to the documentation for HealthLake, and this is what I get: Bad Request. Your browser sent a request that the server could not understand. Size of a request header field exceeds server...
1 answer, 0 votes, 373 views
asked 2 years ago
When I try to run the query `describe gitlab.issues` via the Athena JDBC driver, I get the following error: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS...
0 answers, 0 votes, 332 views
asked 2 years ago
I am working on a POC to understand how compaction of Governed Tables works. After a governed table is created and data is loaded into it using a Glue job, compaction is getting triggered...
1 answer, 0 votes, 370 views
asked 2 years ago
When developing some Glue scripts from a successful Crawler run from a JDBC Oracle data source, I am encountering an error that I cannot resolve. ``` An error occurred while calling...
0 answers, 0 votes, 141 views
asked 2 years ago
I have created a DMS task to migrate data from MongoDB to S3 in Parquet format, and will be using the Parquet files in Glue. But the column names contain spaces, due to which the Parquet files are...
1 answer, 0 votes, 1975 views
asked 2 years ago
Hi, I would like to know: when crawling data from S3 in order to create a database, must the database be a relational database? Can it have tables that have no relation to other tables?
1 answer, 0 votes, 319 views
posix
asked 2 years ago
Hi, I was trying to work on a simple problem where I take data from 3 CSV files in S3 (source) and, after combining them, append them to a table in PostgreSQL, where my database is by...
1 answer, 0 votes, 436 views
asked 2 years ago
The _temp tables of a Lake Formation blueprint pipeline appear to an IAM user in the Athena editor, although I didn't give this user permission on them. Below is the policy granted to this IAM user; also in lake...
1 answer, 0 votes, 427 views
asked 2 years ago
Hello. Development endpoints only support Glue versions <= 1.0. With newer Glue versions available, will Glue version 1.0 eventually be deprecated? I saw the following post related to development under Glue...
1 answer, 0 votes, 727 views
asked 2 years ago