Questions tagged with Data Lakes

92 results
I need to pre-process some data on S3 before the Glue Crawler crawls the data. For this I created an S3 Object Lambda to do the pre-processing. If I test the Object Lambda using the CLI, it provides...
0 answers, 1 vote, 201 views
JannesH
asked 2 years ago
We are running PrestoDB on an AWS EMR Hadoop cluster and want to secure it. We need a complete step-by-step implementation guide using Apache Ranger (open source) or LDAP. Application...
0 answers, 0 votes, 171 views
asked 2 years ago
I have created a templated job (with parameters) to ingest data from different tables (passing the database and table as parameters) and write the data to S3 (passing the destination bucket as...
1 answer, 1 vote, 473 views
asked 2 years ago
I am using EMR 6.6.0, which ships Hudi 0.10.1. I am trying to bulk insert with inline clustering using Hudi, but it does not seem to cluster the files according to the file size specified. But it is still...
1 answer, 0 votes, 585 views
AWS
Zahid
asked 2 years ago
I was trying to open the documentation for HealthLake, and this is what I get: Bad Request. Your browser sent a request that the server could not understand. Size of a request header field exceeds server...
1 answer, 0 votes, 383 views
asked 2 years ago
When I try to run the query `describe gitlab.issues` via the Athena JDBC driver, I get the following error: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS...
0 answers, 0 votes, 336 views
asked 2 years ago
I am working on a POC to understand how compaction works for governed tables. After a governed table is created and data is loaded into it using a Glue job, compaction is getting triggered...
1 answer, 0 votes, 377 views
asked 2 years ago
While developing Glue scripts based on a successful Crawler run against a JDBC Oracle data source, I am encountering an error that I cannot resolve: An error occurred while calling...
0 answers, 0 votes, 141 views
asked 2 years ago
I have created a DMS task to migrate data from MongoDB to S3 in Parquet format, and I will be using the Parquet files in Glue. However, the column names contain spaces, due to which the Parquet files are...
1 answer, 0 votes, 2008 views
asked 2 years ago
Hi, when crawling data from S3 in order to create a database, must the database be a relational database? Can it have tables that have no relation to other tables?
1 answer, 0 votes, 330 views
posix
asked 2 years ago
Hi, I was working on a simple problem where I take data from 3 CSV files in S3 (the source) and, after combining them, append them to a table in PostgreSQL, where my database is by...
1 answer, 0 votes, 448 views
asked 2 years ago
The _temp Lake Formation blueprint pipeline tables appear to an IAM user in the Athena editor, although I didn't give this user permission on them. Below is the policy granted to this IAM user; also in Lake...
1 answer, 0 votes, 434 views
asked 2 years ago