I am blocked trying to use the OpenSearch Glue connection to connect to an OpenSearch domain that is hosted in a VPC. I have configured the connection with a security group that allows inbound traffic on all ports, but the ETL job logs show the error messages below. I have also added a VPC endpoint for the OpenSearch domain and tried associating it with both the public and private subnets in the VPC.
ExceptionErrorMessage failureReason: An error occurred while calling o115.pyWriteDynamicFrame. Cannot detect OpenSearch version - typically this happens if the network/OpenSearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'opensearch.nodes.wan.only'
...
Caused by: org.opensearch.hadoop.rest.OpenSearchHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[https://MY_VPC_DOMAIN.us-east-1.es.amazonaws.com:9200]]
...
ERROR NetworkClient: Node [https://MY_VPC_DOMAIN.us-east-1.es.amazonaws.com:9200] failed (org.opensearch.hadoop.thirdparty.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 60000 ms); no other nodes left - aborting...
Also, I have toggled the "WAN only enabled" setting on the connection. When the setting is disabled, the timeout error messages log the private IP addresses of the OpenSearch nodes. These are the IP addresses returned when I run dig / nslookup against the domain endpoint hostname.
NetworkClient: Node [https://10.150.74.223:9200] failed (org.opensearch.hadoop.thirdparty.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 60000 ms); no other nodes left - aborting...
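The dig / nslookup lookup and the connection attempt from the logs can be reproduced outside of Glue with a short script, which may help separate a network-path problem from a client problem. This is only a minimal sketch using the Python standard library: the helper names `resolve_host` and `can_connect` are hypothetical, the hostname is the placeholder from the logs above, and the script is assumed to run from a host in the same VPC and subnet as the Glue connection. Port 9200 is the one shown in the error messages; port 443 is included on the assumption that the domain endpoint serves HTTPS there, which is worth checking but is not confirmed by anything in the logs.

```python
import socket


def resolve_host(hostname):
    """Return the sorted, unique IPv4 addresses that `hostname` resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


if __name__ == "__main__":
    # Placeholder hostname copied from the log excerpt; replace with the real endpoint.
    endpoint = "MY_VPC_DOMAIN.us-east-1.es.amazonaws.com"
    for ip in resolve_host(endpoint):
        # 9200 is the port in the Glue error; 443 is an assumption to compare against.
        for port in (443, 9200):
            status = "open" if can_connect(ip, port) else "unreachable"
            print(f"{ip}:{port} {status}")
```

If both ports are unreachable from inside the VPC, the problem is more likely in routing or security groups than in the Glue client itself. If only one of them opens, the port the connection targets would be the next thing to look at.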
Is this a known issue with the underlying client for Glue connections to OpenSearch domains?