AWS Glue: ETL Jobs Connection bug


I have found that when multiple connections are used in a Glue ETL job, the order of the connections relative to the data source and data target nodes matters. For example, I have a SQL Server data source node and a Redshift target node. Whenever I make any change to the job, the connection order swaps, so that the first connection in the Job details tab becomes my Redshift connection and the second becomes the SQL Server connection. The ETL job then fails with a "cannot connect" error when it tries to reach SQL Server, because the SQL Server connection is no longer the first connection in the job details. The only fix I've found is to delete the target node and add it back.

Solomon
asked 6 months ago, 188 views
2 Answers
Accepted Answer

That's correct. A job can use multiple connections in terms of driver and configuration, but only one network configuration can be used (the job cannot be on multiple VPCs/subnets at the same time).
https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#vpc-failover-behavior-error-10
To reach both systems, you would need a single VPC that can connect to both of them (e.g. via peering).
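
Until the networking is consolidated, it may help to detect the swapped order before a run rather than after a failure. The following is a minimal sketch, not an official procedure: it assumes the job definition returned by boto3's `get_job` lists connection names under `Job["Connections"]["Connections"]` in the same order shown in the Job details tab, and that the first entry is the one whose network settings the job ends up using, as described in the question. The connection names and the job name are hypothetical.

```python
from typing import Any, Dict, List


def connection_names(job: Dict[str, Any]) -> List[str]:
    """Return the job's connection names in the order they are stored."""
    return list(job.get("Connections", {}).get("Connections", []))


def is_first_connection(job: Dict[str, Any], expected_first: str) -> bool:
    """True if `expected_first` is the first connection listed on the job."""
    names = connection_names(job)
    return bool(names) and names[0] == expected_first


def fetch_job(job_name: str) -> Dict[str, Any]:
    """Read a job definition from Glue (read-only call, needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    glue = boto3.client("glue")
    return glue.get_job(JobName=job_name)["Job"]
```

For example, `is_first_connection(fetch_job("my-etl-job"), "sqlserver-conn")` would return `False` after the swap described in the question, which could be used to block a scheduled run until the order is corrected.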

AWS
EXPERT
answered 6 months ago
EXPERT
reviewed a month ago

Thank you for clarifying.

Solomon
answered 6 months ago
