
Snowflake -> Glue Connection + Network Configuration (VPC, subnet, security group, NAT gateway) using Key Pair


Hello AWS Support,

I need clarification regarding the networking setup for AWS Glue when connecting to Snowflake.

Normally, when creating a Glue Snowflake connection in the console, I can attach it to a VPC, Subnet, and Security Group. This makes Glue traffic route through that VPC setup, and if I use a NAT Gateway, I can achieve a static IP for Snowflake allow-listing.

However, in my Glue job, I am bypassing the Glue-managed connection and instead directly using the Snowflake Spark Connector with sfOptions (example below):

SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
sfOptions = {
    "sfURL": args['URL'],
    "sfUser": args['USERNAME'],
    "pem_private_key": secretstring,
    "sfDatabase": args['DB'],
    "sfSchema": args['SCHEMA'],
    "sfWarehouse": args['WAREHOUSE'],
    "application": "AWSGlue"
}

We currently authenticate with a key pair, and it seems that using sfOptions bypasses the network configuration from the Glue connection. My questions are:

  1. If I bypass the Glue connection this way, does Glue still use the same networking configuration (VPC, Subnet, Security Group) that I attached to the Glue job? Or does bypassing mean Glue ignores the VPC/subnet/security group config and just connects directly to Snowflake’s endpoint over the public internet?

  2. If I want Glue’s outbound IP to remain static via NAT Gateway, do I still need to define the Glue connection in the console and attach networking there, or will the NAT configuration apply regardless of whether I use the Glue connection vs. manual sfOptions?

  3. Is there an AWS best practice for handling static IP requirements when using Snowflake with Glue?

Thanks in advance for clarifying.

Best regards, Ray Charles Villaruel

2 Answers

When you use the Snowflake Spark Connector with sfOptions directly in your AWS Glue job instead of using a Glue-managed connection, the networking behavior depends on how your Glue job is configured.

If your AWS Glue job has network configurations (VPC, subnet, security group) attached to it, then the job will still use that network configuration regardless of whether you're using a Glue-managed connection or directly using the Snowflake Spark Connector with sfOptions. The network configuration is applied at the job level, not at the connection level.

When your Glue job runs with a VPC configuration, it creates an Elastic Network Interface (ENI) in the specified subnet with the specified security group. All network traffic from your Glue job, including connections to Snowflake, will flow through this ENI. If the subnet's route table sends internet-bound traffic to a NAT Gateway, the outbound traffic will leave through that NAT Gateway, and its Elastic IP gives you a static address that you can use for Snowflake allow-listing.

If you don't attach any network configuration to your Glue job, then your job will connect directly to Snowflake over the public internet, and you won't have a static IP address for Snowflake allow-listing.
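One way to confirm which path a job actually takes is to log its public egress IP from inside the job and compare it with the NAT Gateway's Elastic IP. The following is a minimal sketch, not an official AWS procedure; the helper names (`fetch_text`, `egress_ip`, `uses_expected_ip`) are made up for illustration, and it assumes the job can reach `https://checkip.amazonaws.com`.

```python
import urllib.request
from typing import Callable

# Public AWS endpoint that returns the caller's public IP as plain text.
CHECK_IP_URL = "https://checkip.amazonaws.com"


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def egress_ip(fetch: Callable[[str], str] = fetch_text) -> str:
    """Return the public IP that this process's outbound traffic appears to come from."""
    return fetch(CHECK_IP_URL).strip()


def uses_expected_ip(expected_ip: str, fetch: Callable[[str], str] = fetch_text) -> bool:
    """True if the observed egress IP equals the expected one (e.g. the NAT Gateway's Elastic IP)."""
    return egress_ip(fetch) == expected_ip
```

Calling `print(egress_ip())` near the start of the job script writes the address to the job logs. If the job runs in a private subnet with no route to the internet, the request will time out rather than return an address, which is itself a useful signal.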

For handling static IP requirements when using Snowflake with Glue, the AWS best practice is to:

  1. Configure your Glue job to run in a private subnet within a VPC
  2. Set up a NAT Gateway in a public subnet of the same VPC
  3. Configure route tables to direct outbound traffic from the private subnet through the NAT Gateway
  4. Use the Elastic IP address of the NAT Gateway for Snowflake allow-listing

This approach provides a static IP for outbound connections while maintaining security by keeping your Glue job in a private subnet. This configuration works regardless of whether you're using a Glue-managed connection or directly using the Snowflake Spark Connector with sfOptions.
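The steps above can be checked programmatically. The sketch below assumes `boto3` is available and that the caller has permission for the EC2 `DescribeRouteTables` and `DescribeNatGateways` calls; the function names are hypothetical. It looks up the route table explicitly associated with a subnet, finds a default route that targets a NAT Gateway, and returns that gateway's public IP addresses. A subnet that relies on the VPC's main route table has no explicit association, so this filter would return nothing for it.

```python
from typing import Any, Dict, List


def nat_gateway_ids_for_default_route(route_tables_response: Dict[str, Any]) -> List[str]:
    """Collect NAT Gateway IDs targeted by 0.0.0.0/0 routes in a DescribeRouteTables response."""
    ids: List[str] = []
    for table in route_tables_response.get("RouteTables", []):
        for route in table.get("Routes", []):
            if route.get("DestinationCidrBlock") == "0.0.0.0/0" and "NatGatewayId" in route:
                ids.append(route["NatGatewayId"])
    return ids


def public_ips_of_nat_gateways(nat_gateways_response: Dict[str, Any]) -> List[str]:
    """Collect public IPs from a DescribeNatGateways response."""
    ips: List[str] = []
    for gateway in nat_gateways_response.get("NatGateways", []):
        for address in gateway.get("NatGatewayAddresses", []):
            if "PublicIp" in address:
                ips.append(address["PublicIp"])
    return ips


def allow_list_ips_for_subnet(subnet_id: str) -> List[str]:
    """Return the NAT Gateway public IPs that traffic from the given subnet should egress through."""
    import boto3  # imported here so the pure helpers above work without boto3 installed

    ec2 = boto3.client("ec2")
    tables = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    nat_ids = nat_gateway_ids_for_default_route(tables)
    if not nat_ids:
        return []  # no default route via a NAT Gateway was found
    gateways = ec2.describe_nat_gateways(NatGatewayIds=nat_ids)
    return public_ips_of_nat_gateways(gateways)
```

The addresses returned by `allow_list_ips_for_subnet` for the Glue job's private subnet are the ones to add to the Snowflake allow-list.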
Sources
Building a secure data pipeline - AWS Glue Best Practices: Building a Secure and Reliable Data Pipeline
AWS Glue Job Networking: Managing Multiple Connections | AWS re:Post

answered 6 months ago
AWS
SUPPORT ENGINEER
reviewed 6 months ago
  • What about when I am using key-pair authentication to connect?


Hi,

The key pair is used only for authentication. The network path remains the same as described above when you connect with a key pair.
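To illustrate, here is a minimal sketch of building the connector options for key-pair authentication. It reuses the argument names from the question (`URL`, `USERNAME`, `DB`, `SCHEMA`, `WAREHOUSE`); `pem_body` and `build_sf_options` are hypothetical helpers. It assumes the connector expects the key body without the BEGIN/END armor lines, as Snowflake's published examples do, so check that against the connector version in use. Note that none of the options describe networking.

```python
import re


def pem_body(pem_text: str) -> str:
    """Strip the BEGIN/END armor lines and all whitespace from a PEM private key."""
    without_armor = re.sub(r"-----(BEGIN|END)[A-Z ]*PRIVATE KEY-----", "", pem_text)
    return re.sub(r"\s+", "", without_armor)


def build_sf_options(args: dict, private_key_pem: str) -> dict:
    """Build Snowflake Spark connector options: identity and authentication only."""
    return {
        "sfURL": args["URL"],
        "sfUser": args["USERNAME"],
        "pem_private_key": pem_body(private_key_pem),
        "sfDatabase": args["DB"],
        "sfSchema": args["SCHEMA"],
        "sfWarehouse": args["WAREHOUSE"],
        "application": "AWSGlue",
    }
```

The resulting dictionary would be passed to the connector in the same way as the `sfOptions` in the question.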

Thank you!

AWS
SUPPORT ENGINEER
answered 6 months ago
