AWS Glue supports attaching multiple connections to a job, but it creates an Elastic Network Interface (ENI) in only one subnet. This article explains how Glue selects a connection when several are attached, how it validates that connection, how job runs and retries behave, and how to reach resources that live in different VPCs.
Connection Selection Process:
- Glue checks the connections that have network configurations, in the order they're listed, until it finds one that passes validation.
- The selected connection's network configuration is used for the job run.
- Connection selection occurs when a job run is submitted.
- When multiple connections (e.g., RedshiftConnection, MysqlConnection, PostgreSQLConnection) are attached to the same job, Glue creates an ENI in the subnet of the first suitable connection it finds.
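The selection order described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Glue's implementation: the `Connection` class, the `select_connection` function, the `is_suitable` callback, and the subnet IDs are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for a Glue connection; not a Glue API type."""
    name: str
    subnet_id: Optional[str] = None  # None means no network configuration


def select_connection(
    connections: Sequence[Connection],
    is_suitable: Callable[[Connection], bool],
) -> Optional[Connection]:
    """Return the first connection, in listed order, that has a network
    configuration and passes the supplied suitability check."""
    for conn in connections:
        if conn.subnet_id is None:
            continue  # only connections with network configuration are checked
        if is_suitable(conn):
            return conn
    return None


# Made-up subnet IDs, listed in the order the connections are attached.
attached = [
    Connection("RedshiftConnection", "subnet-aaa"),
    Connection("MysqlConnection", "subnet-bbb"),
    Connection("PostgreSQLConnection", "subnet-ccc"),
]

# Assume every subnet is usable: the first listed connection is chosen.
chosen = select_connection(attached, lambda c: True)
print(chosen.name, chosen.subnet_id)
```

In this model, the ENI would be created in `subnet-aaa`. If the first subnet failed the check, the same call would fall through to `MysqlConnection`.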
Connection Validation:
AWS Glue validates connections by:
- Verifying the Amazon VPC ID and subnet are valid
- Confirming a NAT gateway or Amazon VPC endpoint exists
- Ensuring the subnet has available IP addresses
- Checking that the Availability Zone (AZ) is healthy
Note: These checks do not test connectivity. AWS Glue cannot verify that a data store is actually reachable at job run submission time.
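As a rough illustration of the four checks listed above, the sketch below evaluates them against a snapshot of a subnet's state. The `SubnetState` class, its field names, and the `validation_failures` function are assumptions made for this example; how Glue gathers this information internally is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubnetState:
    """Hypothetical snapshot of the facts the four checks depend on."""
    vpc_and_subnet_valid: bool
    has_nat_or_vpc_endpoint: bool
    available_ips: int
    az_healthy: bool


def validation_failures(state: SubnetState) -> List[str]:
    """Return the failed checks; an empty list means the connection passes."""
    failures = []
    if not state.vpc_and_subnet_valid:
        failures.append("invalid VPC ID or subnet")
    if not state.has_nat_or_vpc_endpoint:
        failures.append("no NAT gateway or VPC endpoint")
    if state.available_ips <= 0:
        failures.append("no available IP addresses")
    if not state.az_healthy:
        failures.append("unhealthy Availability Zone")
    return failures


healthy = SubnetState(True, True, available_ips=12, az_healthy=True)
exhausted = SubnetState(True, True, available_ips=0, az_healthy=True)

print(validation_failures(healthy))
print(validation_failures(exhausted))
```

None of the checks opens a connection to the data store, which mirrors the note above: a connection can pass validation and the job can still fail later on routing or connectivity.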
Job Execution Behavior:
- All drivers and executors for VPC-based jobs are created in the same AZ as the selected connection.
- If a job run encounters issues (e.g., lack of IP addresses, connectivity problems, routing issues), it will fail.
Retry Behavior:
- If retries are configured, AWS Glue uses the same connection for retry attempts, because connection problems may be temporary.
- In case of an AZ failure, existing job runs in that AZ may fail depending on their stage.
- In that case, a retry should detect the AZ failure and choose another AZ for the new run.
Availability Zone Failures:
- During connection health checks, connections from a failed AZ will be skipped.
- This lets new job runs be placed in a healthy AZ, provided the job has a connection whose subnet is in one.
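The effect of skipping connections in a failed AZ can be sketched as follows. This is a simplified model under stated assumptions: `pick_for_run` is a hypothetical function, the AZ names are examples, and the other validation checks are left out.

```python
from typing import Dict, Optional, Set


def pick_for_run(
    connection_azs: Dict[str, str],
    failed_azs: Set[str],
) -> Optional[str]:
    """Return the first listed connection whose AZ is not in failed_azs."""
    for name, az in connection_azs.items():  # dicts preserve insertion order
        if az in failed_azs:
            continue  # connections from a failed AZ are skipped
        return name
    return None


# Example mapping of connection name to the AZ of its subnet.
connection_azs = {
    "RedshiftConnection": "us-east-1a",
    "MysqlConnection": "us-east-1b",
}

first_run = pick_for_run(connection_azs, failed_azs=set())
retry_after_az_failure = pick_for_run(connection_azs, failed_azs={"us-east-1a"})
print(first_run, retry_after_az_failure)
```

In this model, a job with connections in only one AZ has nowhere to go when that AZ fails, so `pick_for_run` returns `None`.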
Challenges with Multiple VPC Resources:
- If Glue creates an ENI in one connection's subnet (e.g., RedshiftConnection), resources in other connections' subnets (e.g., MysqlConnection) may not be reachable, and vice versa.
- To work with resources in different VPCs, establish VPC peering connections between the VPCs used in the Glue connections, and add routes for the peered CIDR ranges to the route tables of the subnets involved.
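VPC peering cannot be set up between VPCs whose IPv4 CIDR blocks match or overlap, so it can help to check the ranges first. The sketch below uses Python's standard `ipaddress` module; the function name `overlapping_vpcs`, the VPC labels, and the CIDR ranges are made up for illustration.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def overlapping_vpcs(vpc_cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of VPC labels whose CIDR blocks overlap."""
    pairs = []
    for (a, cidr_a), (b, cidr_b) in combinations(vpc_cidrs.items(), 2):
        if ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b)):
            pairs.append((a, b))
    return pairs


# Example ranges only; substitute the CIDR blocks of your own VPCs.
vpc_cidrs = {
    "redshift-vpc": "10.0.0.0/16",
    "mysql-vpc": "10.1.0.0/16",
    "postgres-vpc": "10.0.128.0/20",  # falls inside 10.0.0.0/16
}

conflicts = overlapping_vpcs(vpc_cidrs)
print(conflicts)  # [('redshift-vpc', 'postgres-vpc')]
```

Any pair reported here would need to be re-addressed, or connected some other way, before peering is an option.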
VPC Peering Solution:
VPC peering enables direct communication between different VPCs using private IP addresses. This allows resources in peered VPCs to interact as if they were on the same network, without traversing the public internet. By implementing VPC peering:
- Resources in different VPCs become reachable from all connections attached to the Glue job.
- Traffic between peered VPCs remains secure and doesn't traverse the public internet.
- You can connect VPCs within the same AWS Region or across different Regions.
- Traffic between peered VPCs stays on the AWS network, which avoids the extra hops of an internet-based or VPN-based path.
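A minimal sketch of setting up peering programmatically is shown below. It assumes both VPCs are in the same account and Region, and that `ec2` behaves like `boto3.client("ec2")`. The function name `peer_and_route` and its parameters are hypothetical; the three client calls are the standard EC2 operations for requesting a peering connection, accepting it, and adding a route.

```python
from typing import Any


def peer_and_route(
    ec2: Any,
    vpc_id: str,
    peer_vpc_id: str,
    route_table_id: str,
    peer_cidr: str,
) -> str:
    """Request and accept a peering connection, then route peer_cidr through it.

    Assumes `ec2` behaves like boto3.client("ec2") and that the same client
    can accept the request (same account, same Region).
    """
    response = ec2.create_vpc_peering_connection(
        VpcId=vpc_id,
        PeerVpcId=peer_vpc_id,
    )
    pcx_id = response["VpcPeeringConnection"]["VpcPeeringConnectionId"]

    # The owner of the peer VPC must accept before traffic can flow.
    ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)

    # Send traffic for the peer VPC's range through the peering connection.
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock=peer_cidr,
        VpcPeeringConnectionId=pcx_id,
    )
    return pcx_id
```

With boto3 installed and credentials configured, this could be called as `peer_and_route(boto3.client("ec2"), ...)` using the route table of the subnet where Glue creates its ENI. A matching return route in the peer VPC and security group rules that allow the traffic are also needed. A production script would likely wait for the peering connection to become active and handle errors.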
Conclusion:
When an AWS Glue job needs resources in more than one VPC, it helps to understand how Glue selects a connection and where it creates the ENI. Because the ENI is created in a single subnet, resources reachable only from the other connections' subnets are cut off unless the networks are linked. Peering the VPCs and routing between them lets the job reach those resources over private IP addresses, without sending traffic across the public internet.