Questions tagged with Data Lakes
92 results
Hello. Development endpoints only support Glue versions <= 1.0. With newer Glue versions available, will Glue version 1.0 eventually be deprecated?
I saw the following post related to development under Glue...
1. What is the best way to create a subset of factory location data?
Current process: query location data for specific factories and save it in a new Athena table with a direct INSERT statement.
2. Get...
I have two S3 buckets with data tables, namely A and B, and a Glue job that transforms data from A to B. Both tables contain a column called x. The Glue job performs a GroupBy operation on this...
**ERROR MESSAGE:** An error occurred while calling o518.pyWriteDynamicFrame. Unsupported case of DataType: com.amazonaws.services.glue.schema.types.StringType@235d3a6f and DynamicNode: integernode.
I...
Hi,
I have a database with around 40 tables. However, some end users don't need to see all tables in the database. I'm using Lake Formation Tagging and know that if a tag is added to the database...
Hi everyone,
I have 270 GB of data on my NAS. Right now we have set up bidirectional sync from Dropbox. Through Windows Explorer, I have given all users access to the NAS....
I have implemented Lake Formation on my data bucket.
I have a step function in which one step runs a Glue job that reads from and writes to the Data Catalog.
I have upgraded my DataLake...
## Problem
I want to understand and correct my knowledge of, and approach to, setting up a data ingestion pipeline that collects "events" or "data" from any possible external application sources...
Hi Team,
I couldn't find a list or details of the tools that Amazon S3 integrates with through an S3 connector. Which tools integrate with S3 to provide in-place querying of S3 data (i.e. data...
I need to load data from multiple tables in SQL Server to S3 for some batch processing. Can AWS Glue read data from different SQL Server tables, generate CSV files, and write them zipped to S3?
And can AWS...
Hey, I am trying to learn what others do and what the best practices are with Glue for development automation and testing/validation.
I have a large dataset (table) with >1e9 records (rows) in Glue. The tables are partitioned by column A, which is an n-letter substring of column B. For example:
| A (partition key) | B | ... |
| ---...