My client application gets connection timeout or socket errors when I try to connect to Amazon OpenSearch Service.
Resolution
Troubleshoot timeout issues
Take the following actions:
Troubleshoot "Connection timed out" errors
If your connection times out, then you receive an error message similar to the following example:
"curl: (7) Failed to connect to vpc-acbdefxyz.us-east-1.es.amazonaws.com port 443: Connection timed out
curl: (28) Operation timed out after 1001 milliseconds with 0 out of 0 bytes received"
Based on the type of domain that you use, take the following troubleshooting actions.
Public domains
Public domains are accessible over the internet when the client has connectivity or routes to the internet. If the client doesn't have connectivity or routes to the internet, then you might receive the following output when you run a telnet or curl command:
Trying xyz.xyz.xyz.xyz...
telnet: connect to address xyz.xyz.xyz.xyz: Connection timed out
-or-
* Trying xyz.xyz.xyz.xyz:443...
* connect to xyz.xyz.xyz.xyz port 443 failed: Operation timed out
* Failed to connect to search-domain-name-someid.aws-region.es.amazonaws.com port 443 after 75243 ms: Couldn't connect to server
* Closing connection 0
curl: (28) Failed to connect to search-domain-name-someid.aws-region.es.amazonaws.com port 443 after 75243 ms: Couldn't connect to server
To resolve this issue, make sure that the client has routes to the internet and doesn't block outbound requests to the search endpoint.
To test your connection, run the following command:
telnet search-domain-name-someid.aws-region.es.amazonaws.com 443
Note: Replace search-domain-name-someid with your domain name and aws-region with your AWS Region.
Example of a successful response from the search endpoint:
Trying xyz.xyz.xyz.xyz...
Connected to search-domain-name-someid.aws-region.es.amazonaws.com.
Escape character is '^]'.
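If telnet isn't installed on the client, you can run the same TCP check with the Python standard library. The following is a minimal sketch, not an AWS-provided tool. The function name check_tcp, the returned status strings, and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def check_tcp(host, port=443, timeout=5.0):
    """Try to open a TCP connection and report what happened."""
    try:
        # create_connection resolves the name and connects, much like telnet.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer within the timeout: often a routing or firewall problem.
        return "timed out"
    except socket.gaierror:
        # The host name could not be resolved.
        return "dns failure"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return "refused"
    except OSError as err:
        # Any other network error, such as "network is unreachable".
        return f"error: {err}"
```

For example, call check_tcp("search-domain-name-someid.aws-region.es.amazonaws.com") after you replace the placeholder with your own endpoint. A "timed out" result corresponds to the "Connection timed out" output shown earlier.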
Domains inside a VPC
For OpenSearch Service domains that you create inside a virtual private cloud (VPC), each data node in the VPC has an elastic network interface. The network interfaces forward network traffic to your domain.
To check connectivity between the network interfaces and your domain, complete the following steps:
- Run one of the following commands to get the network interface IP addresses in your VPC:
nslookup -q=A vpc-domain-name-id.aws-region.es.amazonaws.com
-or-
dig +short vpc-domain-name-id.aws-region.es.amazonaws.com
Note: Replace domain-name-id with your domain name and aws-region with your Region.
In the output, note each network interface IP address.
- To test the connection, run one of the following commands for each network interface IP address:
telnet ip-address 443
-or-
curl -v telnet://ip-address:443
Note: Replace ip-address with the network interface IP address.
Example timeout response:
Trying xyz.xyz.xyz.xyz...
telnet: connect to address xyz.xyz.xyz.xyz: Connection timed out
Example successful response:
Trying xyz.xyz.xyz.xyz...
Connected to xyz.xyz.xyz.xyz.
Escape character is '^]'.
- If the connection times out, then confirm that your VPC's security groups, route tables, and network access control list (network ACL) allow access to the domain. If you can connect to some network interfaces but others time out, then create an AWS Support case.
Note: You can't access your OpenSearch Service domains from outside the VPC. For more information, see Launching your OpenSearch Service domains within a VPC.
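The lookup and per-address test in the preceding steps can also be combined in one script. The following is a minimal sketch that uses only the Python standard library and assumes that you run it from a client inside the VPC. The function names resolve_ipv4 and check_each_address and the 5-second timeout are hypothetical choices for illustration.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for a host name."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})


def check_each_address(hostname, port=443, timeout=5.0):
    """Map each resolved IP address to True (connected) or False (failed)."""
    results = {}
    for ip in resolve_ipv4(hostname):
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                results[ip] = True
        except OSError:
            # Covers timeouts, refused connections, and unreachable networks.
            results[ip] = False
    return results
```

If check_each_address("vpc-domain-name-id.aws-region.es.amazonaws.com") returns True for some addresses and False for others, that matches the case where only some network interfaces time out.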
Troubleshoot "HTTP 504 gateway timeout" error
The load balancer distributes inbound traffic to the data nodes. If OpenSearch Service doesn't return a response within the load balancer's idle timeout period, then the load balancer closes the TCP connection. As a result, you receive an error message similar to the following example:
"error msg: org.elasticsearch.client.ResponseException:
method[POST], host[https://acbdefjhxyz.eu-central-1.es.amazonaws.com:443],
URI [/eks*/_search?typed_keys=true],status line [HTTP/1.1 504 Gateway Time-out]"
Typically, the "HTTP 504 Gateway timeout" error occurs when OpenSearch Service receives too many simultaneous requests or requests are complex. The error doesn't necessarily indicate that there's an issue in the cluster. For more information about causes and troubleshooting steps, see HTTP 504: Gateway timeout.
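Because the error often reflects temporary load rather than a cluster fault, one possible client-side mitigation is to retry with an increasing delay. The following is a minimal sketch under stated assumptions and is not a step that this article prescribes. The callable send_request is hypothetical and stands in for whatever code sends your request and returns the HTTP status code. The attempt count and delays are illustrative. Retries might not help when the request itself is too complex to finish within the idle timeout period.

```python
import time


def retry_on_504(send_request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send_request() until it returns a status other than 504.

    send_request is a hypothetical zero-argument callable that returns an
    HTTP status code. Returns a (status, attempts_used) tuple.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status = send_request()
        if status != 504:
            return status, attempt
        if attempt < max_attempts:
            # Wait base_delay, then 2x, then 4x, so retries don't pile on load.
            sleep(base_delay * 2 ** (attempt - 1))
    return status, max_attempts
```

The sleep parameter exists only so that the delay can be replaced in tests. In normal use, leave it at the default.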
Troubleshoot "SocketTimeoutException" errors
Socket timeout errors typically occur when a client sends too many requests or sends complex requests. As a result, the OpenSearch Service domain might experience high resource usage and respond to the client with a delay. Example error message:
"j.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-5083 [ACTIVE]
java.net.SocketTimeoutException: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out"
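A read timeout differs from a connection timeout: the TCP connection succeeds, but the response doesn't arrive before the client's socket timeout expires. The following Python sketch reproduces that situation locally as an analogy to the Java exception above. The local listener that never replies is a stand-in for a busy domain, and the function names and the 0.5-second timeout are assumptions for illustration.

```python
import socket


def read_with_timeout(host, port, timeout):
    """Connect, send a request, and wait up to `timeout` seconds for data."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n")
        try:
            return conn.recv(4096)
        except socket.timeout:
            # The connection was established, but no response arrived in time.
            return None


def demo():
    """Show a read timeout against a listener that never sends a reply."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # Port 0 lets the OS pick a free port.
    server.listen(1)
    try:
        port = server.getsockname()[1]
        return read_with_timeout("127.0.0.1", port, timeout=0.5)
    finally:
        server.close()
```

Here demo() returns None because the read times out even though the connection itself succeeds. Against a real domain, this pattern points to slow responses rather than to a network path problem.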
To resolve this issue, take the following actions:
Related information
Troubleshooting OpenSearch Service
How do I troubleshoot search latency spikes in my OpenSearch Service cluster?
How can I improve the indexing performance on my OpenSearch Service cluster?
Analyzing OpenSearch Service slow logs using Amazon CloudWatch Logs streaming and Kibana