Can a Glue connection access an OpenSearch VPC domain?


I am blocked trying to use the OpenSearch Glue connection to connect to an OpenSearch domain that is hosted in a VPC. I have configured the connection to OpenSearch with a security group that allows incoming traffic on all ports, but the following error messages appear in the ETL job logs. I have also added a VPC endpoint to the OpenSearch domain and tried associating it with both the public and private subnets in the VPC.

ExceptionErrorMessage failureReason: An error occurred while calling o115.pyWriteDynamicFrame. Cannot detect OpenSearch version - typically this happens if the network/OpenSearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'opensearch.nodes.wan.only'

...

Caused by: org.opensearch.hadoop.rest.OpenSearchHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[https://MY_VPC_DOMAIN.us-east-1.es.amazonaws.com:9200]] 

...

ERROR NetworkClient: Node [https://MY_VPC_DOMAIN.us-east-1.es.amazonaws.com:9200] failed (org.opensearch.hadoop.thirdparty.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 60000 ms); no other nodes left - aborting...

Also, I have toggled the "WAN only enabled" setting on the connection. When the setting is disabled, the timeout error messages log the private IP addresses of the OpenSearch nodes. These are the IP addresses returned when I dig / nslookup the domain endpoint hostname.

NetworkClient: Node [https://10.150.74.223:9200] failed (org.opensearch.hadoop.thirdparty.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 60000 ms); no other nodes left - aborting...

Is this a known issue with the underlying client for Glue connections to OpenSearch domains?

2 Answers
Accepted Answer

Yes, the Glue job is connected to the VPC. I had overlooked the port we were using to connect: my coworker pointed out that it needed to be 443 instead of 9200 because we are using an HTTPS endpoint. After I changed the port, the connection worked, and it still worked after I removed the VPC endpoint from the OpenSearch domain.
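For anyone hitting the same timeout, a quick way to confirm which port the domain actually accepts is a plain TCP probe run from something inside the same VPC and subnets as the Glue connection. This is only a minimal sketch using the Python standard library; the helper name `can_connect` is made up for illustration and is not part of Glue or the OpenSearch connector.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only. It checks TCP reachability,
    not TLS, authentication, or the OpenSearch API itself.
    """
    try:
        # create_connection resolves the hostname and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling `can_connect("MY_VPC_DOMAIN.us-east-1.es.amazonaws.com", 443)` and then the same with `9200` should show the difference described above: in this case 443 would be expected to return `True` and 9200 to return `False` after the timeout.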

answered 2 months ago

Hello.

Since a Glue ETL job can itself run within a VPC, it should be possible to connect to an OpenSearch domain running within a VPC.
Is your Glue ETL job connected to the VPC?
https://docs.aws.amazon.com/glue/latest/dg/getting-started-vpc-config.html

EXPERT
answered 2 months ago
