Why does my AWS Glue test connection fail?


I want to troubleshoot a failed test connection in AWS Glue.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

To troubleshoot your failed test connection in AWS Glue, check your network and authentication connections.

Network issues

Check connectivity to JDBC data stores

AWS Glue creates elastic network interfaces with private IP addresses in the connection's subnet. To reach a data store that's outside the Amazon Virtual Private Cloud (Amazon VPC), the subnet's route table must include a route to a NAT gateway that's in a public subnet. Otherwise, the connection times out.

Note: A data store that's outside the Amazon VPC is, for example, an on-premises data store or an Amazon Relational Database Service (Amazon RDS) resource with a public hostname.

Check that the connection's security groups and network access control list (network ACL) allow traffic to the data in the VPC. Then, use the AWSSupport-TroubleshootGlueConnection runbook in AWS Systems Manager. For more information, see How do I troubleshoot errors with an AWS Glue connection that has a JDBC source?

If the connection requires access to AWS Secrets Manager or AWS Security Token Service (AWS STS), then provide a route to a NAT gateway or attach VPC endpoints for those services. For more information, see Connecting to data.

Check the connection's security groups

One of the security groups that's associated with the connection must have a self-referenced inbound rule that's open to all TCP ports. One of the security groups must be open to all outbound traffic. You can use a self-referenced rule to restrict outbound traffic to the VPC. For more information, see Setting up Amazon VPC for JDBC connections to Amazon RDS data stores from AWS Glue.
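You can check the self-referenced inbound rule programmatically. The following is a minimal sketch, assuming that you already have the security group as a dictionary in the shape that the Amazon EC2 DescribeSecurityGroups API returns. The function name has_self_referencing_tcp_rule is hypothetical.

```python
def has_self_referencing_tcp_rule(group: dict) -> bool:
    """Return True if an inbound rule allows all TCP ports from the group itself."""
    group_id = group["GroupId"]
    for rule in group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        all_traffic = protocol == "-1"  # "-1" means all protocols and all ports
        all_tcp = (
            protocol == "tcp"
            and rule.get("FromPort") == 0
            and rule.get("ToPort") == 65535
        )
        if not (all_traffic or all_tcp):
            continue
        # A self-referenced rule lists the group's own ID as a source.
        sources = {pair.get("GroupId") for pair in rule.get("UserIdGroupPairs", [])}
        if group_id in sources:
            return True
    return False
```

If the function returns False for every security group on the connection, then add the self-referenced rule to one of them.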

Check the number of free IP addresses

The number of free IP addresses in the subnet must be greater than the number of workers that you specify for the job. This allows AWS Glue to create network interfaces in the specified subnet.
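The comparison can be sketched as follows. This example assumes that you read the free address count from the AvailableIpAddressCount field that the Amazon EC2 DescribeSubnets API returns. The function names are hypothetical. The usable_addresses helper only estimates the capacity of an empty subnet: AWS reserves five addresses in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # the first four addresses and the last address


def usable_addresses(cidr: str) -> int:
    """Upper bound on assignable addresses in an empty subnet with this CIDR."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def has_enough_free_ips(available_ip_count: int, workers: int) -> bool:
    """The free address count must be greater than the number of workers."""
    return available_ip_count > workers
```

For example, a /28 subnet has at most 11 usable addresses, so it can't host a job that specifies 11 or more workers.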

Check that the subnet can access Amazon S3

Provide an Amazon Simple Storage Service (Amazon S3) endpoint or a route to a NAT gateway in your subnet's route table. For more information, see Error: Could not find S3 endpoint or NAT Gateway for subnetId in VPC.
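To inspect the subnet's route table, you can use a check like the following sketch. It assumes that the route table is a dictionary in the shape that the Amazon EC2 DescribeRouteTables API returns. The function name is hypothetical. The sketch detects any gateway endpoint route and doesn't confirm that the prefix list belongs to Amazon S3, so treat a True result as a hint, not proof.

```python
def has_nat_or_gateway_endpoint_route(route_table: dict) -> bool:
    """Return True if an active route targets a NAT gateway or a gateway VPC endpoint."""
    for route in route_table.get("Routes", []):
        if route.get("State", "active") != "active":
            continue  # skip blackhole routes
        if route.get("NatGatewayId"):
            return True
        # Gateway endpoint routes point a prefix list at a "vpce-" target.
        gateway_id = str(route.get("GatewayId") or "")
        if route.get("DestinationPrefixListId") and gateway_id.startswith("vpce-"):
            return True
    return False
```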

Check whether you have an AWS KMS VPC endpoint

To encrypt connections for your AWS Glue Data Catalog, be sure that you have a route to AWS Key Management Service (AWS KMS). For example, the route can be an AWS KMS VPC interface endpoint. For more information, see Connecting to AWS KMS through a VPC endpoint.

Check whether the AWS Glue connection and the database use different VPCs

Your test connection fails with a timeout error when the following conditions are true:

  • The database isn't publicly accessible.
  • You attached the AWS Glue job to a connection that uses a different VPC without VPC peering.

To resolve a test connection failure, create a dedicated AWS Glue VPC and set up the associated VPC peering with your other VPCs.

Check the connectivity to the on-premises data store

To check connectivity to an on-premises database, run the following commands from an Amazon Elastic Compute Cloud (Amazon EC2) instance that connects to the database:

$ telnet hostname port
$ nc -zv hostname port
$ dig hostname
$ traceroute -AnT -p port IP

Check your VPN, VPC, subnet, security group, and network ACL configurations. Be sure that the configurations don't block connectivity from the VPC to your on-premises database, and that a firewall on the on-premises side doesn't block the traffic. For more information, see How to access and analyze on-premises data stores using AWS Glue.
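If telnet and nc aren't installed on the instance, then a short script can run the same TCP reachability test. The following is a minimal sketch that uses only the Python standard library. The function name can_connect and the 5-second default timeout are assumptions.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A False result doesn't show which layer blocked the traffic, so still review the route tables, security groups, and network ACLs.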

Authentication issues

Choose the correct IAM role

The AWS Identity and Access Management (IAM) role that you select for the test connection must have a trust relationship with AWS Glue. Choose a service role whose trust policy allows the glue.amazonaws.com service principal to assume the role, and that has the AWSGlueServiceRole managed policy attached.
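A trust relationship with AWS Glue means that the role's trust policy allows the glue.amazonaws.com service principal to call sts:AssumeRole. The following sketch checks a trust policy document for that statement. It assumes that you already have the policy as a parsed dictionary, and the function name trusts_glue is hypothetical.

```python
GLUE_PRINCIPAL = "glue.amazonaws.com"


def _as_list(value):
    """IAM policy fields can hold one value or a list of values."""
    return value if isinstance(value, list) else [value]


def trusts_glue(trust_policy: dict) -> bool:
    """Return True if a statement allows AWS Glue to assume the role."""
    for statement in _as_list(trust_policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(statement.get("Action", [])):
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # for example "*", which this sketch doesn't treat as a match
        if GLUE_PRINCIPAL in _as_list(principal.get("Service", [])):
            return True
    return False
```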

Check the connection's IAM role

If you encrypted the connection password with AWS KMS, then check that the connection's IAM role allows the kms:Decrypt action for the key. For more information, see Setting up encryption in AWS Glue.

Check the connection logs

Check the logs for error messages. You can find logs from test connections in Amazon CloudWatch Logs under /aws-glue/testconnection/output.
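To pull error lines from the log group without opening the console, you can use a sketch like the following. It assumes that you pass in a CloudWatch Logs client, such as the one that boto3.client("logs") creates, and it calls the client's filter_log_events operation. The function name, the "ERROR" filter pattern, and the limit of 50 events are assumptions.

```python
def find_error_messages(
    logs_client,
    log_group: str = "/aws-glue/testconnection/output",
    pattern: str = "ERROR",
    limit: int = 50,
) -> list:
    """Return the messages of log events in the group that match the pattern."""
    response = logs_client.filter_log_events(
        logGroupName=log_group,
        filterPattern=pattern,
        limit=limit,
    )
    return [event["message"] for event in response.get("events", [])]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   for line in find_error_messages(boto3.client("logs")):
#       print(line)
```

The sketch reads only the first page of results. For busy log groups, follow the nextToken value in the response to fetch more events.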

Check the SSL settings

If the data store requires SSL connectivity for the specified user, then select Require SSL connection when you create the connection on the console. Select this option only when the data store supports SSL.

Check the JDBC username and password

Users must have sufficient permissions to access the Java Database Connectivity (JDBC) data store. For example, AWS Glue crawlers require the SELECT permission. A job that writes to a data store requires INSERT, UPDATE, and DELETE permissions.

Check the JDBC URL syntax

Syntax requirements vary by database engine. For more information, see AWS Glue JDBC connection properties and review the examples under JDBC URL.
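As a quick sanity check before you test the connection, you can parse the URL. The following sketch covers only the jdbc:protocol://host:port/db_name shape that MySQL, PostgreSQL, and Amazon Redshift URLs share. Other engines, such as Oracle and Microsoft SQL Server, use different syntax that this sketch doesn't handle. The function name parse_jdbc_url is hypothetical.

```python
import re

_JDBC_PATTERN = re.compile(
    r"^jdbc:(?P<engine>mysql|postgresql|redshift)://"
    r"(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<database>[^?;\s]+)"
)


def parse_jdbc_url(url: str):
    """Return the URL's engine, host, port, and database, or None if it doesn't match."""
    match = _JDBC_PATTERN.match(url)
    if match is None:
        return None
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts
```

A None result for a MySQL, PostgreSQL, or Amazon Redshift URL usually points to a missing slash, port, or database name.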

Additional tips

Check the connection type

Be sure to choose the correct connection type. When you choose Amazon RDS or Amazon Redshift for Connection type, AWS Glue automatically populates the VPC, subnet, and security group.

The test connection feature works only for MySQL 5.x versions because the built-in AWS Glue JDBC driver doesn't support MySQL version 8. If you test the connection against a MySQL version that's later than 5.x, then you might get a connection timeout error. You can still use the AWS Glue connection in an extract, transform, and load (ETL) job that connects to MySQL version 8 or later. To do this, provide a compatible driver Java Archive (JAR) file, and then load the JAR file into your job. For more information, see Connection types and options for ETL in AWS Glue for Apache Spark.

Verify that DNS isn't causing the issues

To verify that DNS isn't causing the issues, replace the hostname in the AWS Glue connection's JDBC URL with the data store's public or private IP address. Clear the Require SSL connection field because the URL no longer uses a domain name.
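The substitution can be sketched as follows, using only the Python standard library. The function names resolve_host and with_ip are hypothetical. Run the lookup from an instance in the connection's subnet, because a lookup from another network might return a different address.

```python
import socket


def resolve_host(hostname: str) -> list:
    """Return the sorted IP addresses that the hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def with_ip(jdbc_url: str, hostname: str, ip_address: str) -> str:
    """Replace the first occurrence of the hostname in the JDBC URL with an IP address."""
    return jdbc_url.replace(hostname, ip_address, 1)
```

An empty list from resolve_host suggests that the subnet can't resolve the hostname, which points to DNS as the cause.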

Check whether the driver is incompatible

Provide the correct driver as an extra JAR file in the job properties along with the failed connection name. When you specify the connection name as a job property, AWS Glue uses the connection's network settings, such as the VPC and subnets. Create the Spark DataFrame with the JAR file in the job properties to override the default AWS Glue data store drivers.

You can also convert the DataFrame into an AWS Glue DynamicFrame. For more information, see fromDF.
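The following sketch shows one way to build the Spark JDBC reader options for a custom driver. It assumes MySQL Connector/J 8, whose driver class is com.mysql.cj.jdbc.Driver, and that the JAR file is already listed in the job properties. The function name, table name, and frame name are hypothetical, and the commented lines run only inside an AWS Glue job.

```python
def jdbc_read_options(
    url: str,
    table: str,
    user: str,
    password: str,
    driver: str = "com.mysql.cj.jdbc.Driver",
) -> dict:
    """Build the options for spark.read.format("jdbc") with an explicit driver class."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": driver,  # overrides the default AWS Glue data store driver
    }


# Inside a Glue job where `spark` and `glueContext` already exist:
#   from awsglue.dynamicframe import DynamicFrame
#   options = jdbc_read_options(url, "orders", user, password)
#   df = spark.read.format("jdbc").options(**options).load()
#   dyf = DynamicFrame.fromDF(df, glueContext, "orders_frame")
```

In a real job, retrieve the credentials from a secure store such as AWS Secrets Manager, and don't hardcode them in the script.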

Check whether the JDBC data store is publicly accessible

Use MySQL Workbench and the JDBC URL to connect to the data store. Or, launch an Amazon EC2 instance that has SSH access and uses the same subnet and security groups that you use for the connection. Then, use SSH to connect to the instance, and run the following commands to test connectivity:

$ dig hostname
$ nc -zv hostname port

Related information

Troubleshooting Spark errors

AWS OFFICIAL
Updated 3 months ago
6 Comments

I followed all the tips above, but the error when I connect to the on-premises database continues. Can anyone help? Failed to test connection Neuroteks due to FAILED status.

replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS EXPERT
replied a year ago

Is there any way to get more error or debug logs on connection errors? The number of potential causes for "Test connection failed for connection" is quite high.

replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS EXPERT
replied a year ago

Can you help fix the problem below?

CloudWatch Log groups /aws-glue/testconnection/error/staging 18-July-2024-3-38-AM-UTC

2024-07-18T03:38:04.697Z ERROR StatusLogger Unrecognized format specifier [d]
2024-07-18T03:38:04.701Z ERROR StatusLogger Unrecognized conversion specifier [d] starting at position 16 in conversion pattern.
2024-07-18T03:38:04.701Z ERROR StatusLogger Unrecognized format specifier [thread]
2024-07-18T03:38:04.702Z ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
2024-07-18T03:38:04.702Z ERROR StatusLogger Unrecognized format specifier [level]
2024-07-18T03:38:04.702Z ERROR StatusLogger Unrecognized conversion specifier [level] starting at position 35 in conversion pattern.
2024-07-18T03:38:04.702Z ERROR StatusLogger Unrecognized format specifier [logger]
2024-07-18T03:38:04.702Z ERROR StatusLogger Unrecognized conversion specifier [logger] starting at position 47 in conversion pattern.

replied 10 months ago

If none of the above works, you can create a Secrets Manager VPC endpoint. Also check the CloudWatch logs from a JDBC connection. This worked for me.

replied 4 months ago