This article will help you set up an AWS Glue Connection to data sources that are running in VMware Cloud on AWS.
In this article, I will show you how to create AWS Glue Connections that provide connectivity between data sources (e.g. databases) in VMware Cloud on AWS and AWS Glue.
This becomes useful when you want to use AWS Glue with VMware Cloud on AWS as a data source. AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development.
Prerequisites
Method
- Open the AWS Console and open the AWS Glue service
- Select Connections from the AWS Glue menu
- Select Create connection
- Fill in the Connection properties (only select "Require SSL" if you have configured SSL on the data source)
- Fill in the Connection access properties. (If you are using AWS Secrets Manager, you can select the secret from the drop-down; otherwise you can use a username and password)
- JDBC Driver Class and JDBC Driver S3 Path are optional.
- Fill in the Network options. The console marks these as optional, but they are essential if you are using VMware Cloud on AWS
- Select the correct VPC and Subnet (these will need to have connectivity / be able to route to your VMware Cloud on AWS data source)
- For the Security Group, see the next step
- To enable AWS Glue to communicate between its components, either select or create a security group with a self-referencing inbound rule for all TCP ports.
- Add a self-referencing inbound rule to allow AWS Glue components to communicate. Specifically, add or confirm that there is a rule where Type is All TCP, Protocol is TCP, Port Range includes all ports, and Source is the security group itself (its own Group ID)
- Add a rule for outbound traffic as well. Either open outbound traffic to all ports, or create a self-referencing rule where Type is All TCP, Protocol is TCP, Port Range includes all ports, and Destination is the security group itself (its own Group ID)
- Select Create connection
- Once the connection is created, we need to test it for connectivity
- Select the Connection you have just created, select the Actions drop down and select Test connection
- Select an IAM role (I have used the AWSGlueServiceRole), then select Confirm
- The connectivity test should take 1 to 2 minutes
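If you would rather script these steps than click through the console, they can be roughly sketched with boto3. This is a minimal sketch under assumptions, not the method used in this article: it assumes a JDBC connection type, and the helper names (`self_referencing_all_tcp`, `jdbc_connection_input`, `apply`), the IDs, the connection name and the JDBC URL are all hypothetical placeholders. The two builder functions only assemble the request payloads, so you can inspect them before anything is sent to AWS.

```python
import json


def self_referencing_all_tcp(security_group_id):
    """Build an IpPermissions entry: all TCP ports, peer is the same group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 0,
        "ToPort": 65535,
        "UserIdGroupPairs": [{"GroupId": security_group_id}],
    }


def jdbc_connection_input(name, jdbc_url, subnet_id, security_group_id,
                          availability_zone, secret_id=None,
                          username=None, password=None, require_ssl=False):
    """Build the ConnectionInput payload for a JDBC Glue connection."""
    props = {
        "JDBC_CONNECTION_URL": jdbc_url,
        # Only enforce SSL if SSL is configured on the data source.
        "JDBC_ENFORCE_SSL": "true" if require_ssl else "false",
    }
    if secret_id is not None:
        # Credentials come from AWS Secrets Manager.
        props["SECRET_ID"] = secret_id
    elif username is not None and password is not None:
        props["USERNAME"] = username
        props["PASSWORD"] = password
    else:
        raise ValueError("provide secret_id, or both username and password")
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": props,
        # The network options: these must be able to route to the
        # data source in VMware Cloud on AWS.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": [security_group_id],
            "AvailabilityZone": availability_zone,
        },
    }


def apply(connection_input, security_group_id):
    """Send the requests to AWS. Needs boto3 and valid credentials."""
    import boto3  # imported here so the builders work without the SDK

    rule = self_referencing_all_tcp(security_group_id)
    ec2 = boto3.client("ec2")
    # Each call raises a ClientError if an identical rule already exists.
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id, IpPermissions=[rule])
    ec2.authorize_security_group_egress(
        GroupId=security_group_id, IpPermissions=[rule])
    boto3.client("glue").create_connection(ConnectionInput=connection_input)


if __name__ == "__main__":
    # Placeholder values only; replace them with your own before calling apply().
    payload = jdbc_connection_input(
        name="vmc-database",
        jdbc_url="jdbc:sqlserver://10.0.0.10:1433;databaseName=example",
        subnet_id="subnet-0123456789abcdef0",
        security_group_id="sg-0123456789abcdef0",
        availability_zone="eu-west-1a",
        secret_id="example/vmc/database",
    )
    print(json.dumps(payload, indent=2))
```

Running the script as-is only prints the payload. Calling `apply(payload, "sg-0123456789abcdef0")` would add the self-referencing rules and create the connection; you would still test the connection afterwards as described above.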
Now that you have your AWS Glue Connection created, you can start to create AWS Glue Crawlers and AWS Glue Jobs.