All Content tagged with Extract Transform & Load Data
Hi everyone,
sorry for the basic questions but I haven't been able to find any answer online yet.
I am regularly running time-consuming data analytics jobs on AWS Batch. Each job is essentially a...
We have data stored in Cosmos DB NoSQL and need to migrate it to Snowflake using AWS Glue with a Change Data Capture (CDC) approach.
Our objective is to perform CRUD operations based on CDC to handle...
I'm aware of the Zero-ETL integration. But the integration fails because I have foreign keys with a CASCADE constraint.
I tried DMS but I have the exact same problem.
> Tables in task scope have...
Came across the following useful documentation:
https://docs.aws.amazon.com/prescriptive-guidance/latest/apache-iceberg-on-aws/best-practices-read.html#read-sort-order
I have a large table where I...
Currently sourcing Facebook Ads data through AppFlow, specifically the **Insights - Ad Sets** object, but the data is grouped into total metrics for the lifespan of an Ad Set (if ads were running from...
I have a crawler that is supposed to extract headers and data from a CSV file. When I run the crawler and then query the table with Athena, it returns no data. It seems to only extract the...
I have written an ETL job in AWS Glue using the interactive notebook and I want to enable job bookmarks to avoid reprocessing already processed data. The source data are in an S3 bucket, a Glue data...
I am developing a data pipeline for building a Redshift data warehouse as below:
1. Export DynamoDB data to S3 using 'export to S3' feature
2. In Glue, create a Spark DataFrame on the S3 exported...
Hello,
I’m trying to create a transformation rule in AWS DMS to add a new column based on the email column’s value. Specifically, I want to check if the email starts with “sample”. Below is the JSON...
I have an EventBridge rule that triggers when a new file is added to an S3 bucket, with the EventBridge target being a Glue workflow. Now I want to pass event data from EventBridge to my Glue workflow...
We are trying to read a CSV file and process the data using AWS Glue, and we are getting the following error message:
Py4JJavaError: An error occurred while calling o91.schema.
:...
I have been implementing a small ETL job using PySpark.
**Once it is ready, I plan to deploy it to AWS Glue and use an S3 bucket to read and write my files instead of local files.**
This ETL...