- Newest
- Most votes
- Most comments
One of the options to identify incremental data is to create a view on top of the source table where the create or update timestamp is greater than the last job run time. You can create the job metadata table to capture the last job run time or any other information based on your requirement to process incremental data.
Since both source and target clusters are Redshift, you may want to explore Data Sharing, which is available in the Ra3 instance type and will allow you to read data directly from the source cluster. In that case, you do ELT in the source cluster and share live data with the target cluster. For additional information, please refer to
https://aws.amazon.com/redshift/features/data-sharing/ https://docs.aws.amazon.com/redshift/latest/dg/datashare-overview.html
Unless I am missing something, I would not use AWS Glue for a Redshift to Redshift ETL process. Glue would be great for file based (likely parquet) data in S3. I would write a Redshift stored procedure and use a Step Function to parameterize and schedule it. For some info on how to incrementally process in Redshift, see this article: https://docs.aws.amazon.com/redshift/latest/dg/t_updating-inserting-using-staging-tables-.html
AWS Glue jobs support bookmarks. You need to have a compound key with those key columns (I'd try only with ID and UpdatedDate first) and use it as job bookmark key in your ETL job.
Is there a reason to ETL from one cluster to another?
If your clusters are using RA3 nodes, using data sharing between the clusters will remove any ETL requirements as you will get fast access to up to date, transactionally consistent data without needing to move data between your clusters.
If there are additional processing needed, you can use native stored procedures on top of datasharing to accomplish the task.
Relevant content
- asked 2 years ago
- Accepted Answerasked 3 years ago
- AWS OFFICIALUpdated 3 years ago
- AWS OFFICIALUpdated a year ago
- AWS OFFICIALUpdated a year ago
- AWS OFFICIALUpdated 2 years ago