Glue ETL job using multiple connections with multiple subnets

Hi,

I'm using Glue ETL jobs and have created several data connections. The number of jobs is growing quickly, and we occasionally run out of available IP addresses in the subnet when several jobs run at the same time. I therefore decided to move some of the connections to a new subnet. The documentation says: "Currently, an ETL job can use JDBC connections within only one subnet. If you have multiple data stores in a job, they must be on the same subnet, or accessible from the subnet." <https://docs.aws.amazon.com/glue/latest/dg/connection-properties.html>
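To put a number on the IP shortage described above, here is a minimal sketch that estimates how many addresses a subnet can hand out. It relies on the fact that AWS reserves five addresses in every subnet; the CIDR block and the function name `usable_ip_count` are made up for illustration and do not come from this post.

```python
import ipaddress

# AWS reserves 5 addresses in every subnet (the first four and the last one).
AWS_RESERVED_PER_SUBNET = 5


def usable_ip_count(cidr: str) -> int:
    """Return how many addresses in the subnet are available for assignment."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


# Hypothetical CIDR, for illustration only.
print(usable_ip_count("10.0.1.0/24"))
```

Comparing this figure with the peak number of addresses that concurrently running jobs consume gives a rough idea of whether a second subnet is needed.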

I also found another post about this issue (the link it references is archived, so I can't read further): "When you use a JDBC connection as the data source, an ENI is launched in the subnet which is defined in the Connection. Glue resources uses this ENI for all of the traffic to your data sources. When you add multiple connections to a job, it will always launch the ENI in the subnet specified with the first connection that is added to the job." <https://stackoverflow.com/questions/60497935/is-it-possible-to-use-one-aws-glue-job-to-write-data-into-different-databases>

I have researched this for a while but cannot find any other documentation that covers it in detail. Combining the two sources above, my understanding is that I can create a job with connections in multiple subnets, and that I only need to allow traffic from the IP range of the first connection's subnet to the second database (the one specified in the second connection).
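As a rough sketch of what "allowing the IP range of the first subnet" could look like, the snippet below builds a security group ingress rule for the second database. The CIDR, port, security group ID, and the helper name `build_ingress_rule` are all hypothetical. It covers only the security group side and assumes the two subnets can already route traffic to each other.

```python
import ipaddress


def build_ingress_rule(source_cidr: str, port: int) -> dict:
    """Build one IpPermissions entry allowing TCP traffic from source_cidr."""
    # Validate the CIDR early so a typo fails here rather than in the API call.
    ipaddress.ip_network(source_cidr, strict=True)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": source_cidr, "Description": "Glue job subnet"}
        ],
    }


# Hypothetical values: CIDR of the first connection's subnet, PostgreSQL port.
rule = build_ingress_rule("10.0.1.0/24", 5432)
print(rule)

# With boto3 installed and credentials configured, the rule could be applied
# to the second database's security group (the group ID is a placeholder):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(
#       GroupId="sg-0123456789abcdef0",
#       IpPermissions=[rule],
#   )
```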

Is my understanding correct, and do I need to make any further configuration changes to the subnets in those two connections?

Thanks in advance. It would be great if you could point me to other documentation with detailed information or best practices for managing the networking of Glue jobs and Glue connections.

asked 7 months ago, 967 views
1 Answer

A Glue job runs in a single subnet only. It checks the job's connections that have a network configuration, in order, until it finds a suitable one, and then uses the network configuration of that connection only. You can see the details here: https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#vpc-failover-behavior-error-10
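The selection order described above can be pictured with a small toy model. This is not Glue's actual implementation: the `Connection` class, the `pick_network_connection` function, and the rule that "suitable" means "has enough free IP addresses" are all assumptions made for illustration. The real criteria are in the linked documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Connection:
    name: str
    subnet_id: Optional[str]  # None models a connection without network config
    free_ips: int = 0         # assumed measure of whether the subnet is usable


def pick_network_connection(
    connections: List[Connection], needed_ips: int
) -> Optional[Connection]:
    """Return the first connection, in order, whose subnet looks usable."""
    for conn in connections:
        if conn.subnet_id is None:
            continue  # no network configuration, so it is skipped
        if conn.free_ips >= needed_ips:
            return conn  # only this connection's network settings are used
    return None


# Hypothetical connections attached to one job, in the order they were added.
conns = [
    Connection("conn-a", "subnet-aaa", free_ips=2),
    Connection("conn-b", "subnet-bbb", free_ips=50),
]
chosen = pick_network_connection(conns, needed_ips=10)
print(chosen.name if chosen else "no suitable connection")
```

In this model the job ends up in the subnet of `conn-b`, so every data store it uses, including the one behind `conn-a`, has to be reachable from that subnet.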

AWS EXPERT, answered 7 months ago
