Questions tagged with Data Lakes
92 results
I need to pre-process some data on S3 before the Glue Crawler crawls the data. For this I created an S3 Object Lambda to do the pre-processing. If I test the Object Lambda using the CLI, it provides...
We are using an AWS EMR Hadoop cluster on which PrestoDB is running, and we want to secure PrestoDB.
We need a complete step-by-step implementation guide using Apache Ranger (open source) or LDAP.
Application...
I have created a templated job (with parameters) to ingest data from different tables (passing the database and table as parameters) and write the data to S3 (passing the destination bucket as...
I am using EMR 6.6.0, which has Hudi 0.10.1. I am trying to bulk insert and do inline clustering using Hudi, but it seems it is not clustering the files according to the file size specified. It is still...
I was trying to open the documentation for HealthLake, and this is what I get:
Bad Request Your browser sent a request that the server could not understand. size of request header field excess server...
When I try to run the following query via the Athena JDBC Driver:
```sql
describe gitlab.issues
```
I get the following error:
> [Simba][AthenaJDBC](100071) An error has been thrown from the AWS...
I am working on a POC to understand how compaction of governed tables works. After a governed table is created and data is loaded into the table using a Glue job,
compaction is getting triggered...
When developing some Glue scripts from a successful Crawler run from a JDBC Oracle data source, I am encountering an error that I cannot resolve.
```
An error occurred while calling...
```
I have created a DMS task to migrate data from MongoDB to S3 in Parquet format, and will be using the Parquet files in Glue. But the column names contain spaces, due to which the Parquet files are...
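For the column-name problem described in the question above, one possible workaround is to rename the columns before the files are consumed downstream. The snippet below is only a minimal sketch of that idea in plain Python: the helper names `sanitize_column_name` and `sanitize_columns` are hypothetical, and the choice of an underscore as the replacement character is an assumption, not something stated in the question.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Trim the name and replace each run of whitespace with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


def sanitize_columns(names):
    """Map each original column name to a whitespace-free name.

    If two original names collapse to the same sanitized name, later ones
    get a numeric suffix (_2, _3, ...) so the result stays unique.
    """
    mapping = {}
    seen = set()
    for old in names:
        new = sanitize_column_name(old)
        candidate = new
        n = 1
        while candidate in seen:
            n += 1
            candidate = f"{new}_{n}"
        seen.add(candidate)
        mapping[old] = candidate
    return mapping


if __name__ == "__main__":
    # Hypothetical column names, for illustration only.
    print(sanitize_columns(["order id", "customer name", "total"]))
```

The resulting mapping could then be applied with whatever rename facility the processing engine offers, for example one `withColumnRenamed` call per entry on a Spark DataFrame.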
Hi,
I would like to know: when crawling data from S3 in order to create a database, must the database be a relational database? Can it have tables that have no relation to other tables?
Hi
I was trying to work on a simple problem where I am taking data from 3 CSV files in S3 (source) and, after combining them, appending them to a table in PostgreSQL where my database is by...
The _temp tables of a Lake Formation blueprint pipeline appear to an IAM user in the Athena editor, although I didn't give this user permission on them. Below is the policy granted to this IAM user; also in lake...