How do I troubleshoot retry and timeout issues when invoking a Lambda function using an AWS SDK?
When I invoke my AWS Lambda function using an AWS SDK, the function times out, the API request stops responding, or an API action is duplicated. How do I troubleshoot these issues?
Short description
There are three reasons why retry and timeout issues occur when invoking a Lambda function with an AWS SDK:
- A remote API is unreachable or takes too long to respond to an API call.
- The API call doesn't get a response within the socket timeout.
- The API call doesn't get a response within the Lambda function's timeout period.
Note: API calls can take longer than expected when network connection issues occur. Network issues can also cause retries and duplicated API requests. To prepare for these occurrences, make sure that your Lambda function is idempotent.
If you make an API call using an AWS SDK and the call fails, the AWS SDK automatically retries the call. The number of retries and the length of each timeout are determined by settings that vary by AWS SDK.
Default AWS SDK retry settings
Note: Some values may be different for other AWS services.
AWS SDK | Maximum retry count | Connection timeout | Socket timeout |
Python (Boto 3) | depends on service | 60 seconds | 60 seconds |
JavaScript/Node.js | depends on service | N/A | 120 seconds |
Java | 3 | 10 seconds | 50 seconds |
.NET | 4 | 100 seconds | 300 seconds |
Go | 3 | N/A | N/A |
To troubleshoot the retry and timeout issues, first review the logs of the API call to find the problem. Then, change the retry count and timeout settings of the AWS SDK as needed for each use case. To allow enough time for a response to the API call, add time to the Lambda function timeout setting.
Resolution
Log the API calls made by the AWS SDK
Use Amazon CloudWatch Logs to get details about failed connections and the number of attempted retries for each. For more information, see Accessing Amazon CloudWatch logs for AWS Lambda. Or, see the following instructions for the AWS SDK that you're using:
- AWS Lambda function logging in Python
- Logging AWS SDK for JavaScript calls
- Logging AWS SDK for Java calls
- Logging with the AWS SDK for .NET
- Logging service calls (AWS SDK for Go)
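For Python functions, one possible way to surface retry and connection details is to raise the log level of the botocore loggers through the standard logging module. This is a minimal sketch and not the only approach. It assumes that the Lambda Python runtime forwards standard logging records to CloudWatch Logs. DEBUG output is verbose and can include request details, so consider turning it on only while troubleshooting.

```python
import logging

# Keep the function's own logging at INFO.
logging.getLogger().setLevel(logging.INFO)

# botocore (the library underneath Boto3) logs through loggers named "botocore.*".
# Setting the parent logger to DEBUG includes its retry and connection messages.
logging.getLogger("botocore").setLevel(logging.DEBUG)
```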
Example error log where the API call failed to establish a connection (connection timeout)
START RequestId: b81e56a9-90e0-11e8-bfa8-b9f44c99e76d Version: $LATEST
2018-07-26T14:32:27.393Z b81e56a9-90e0-11e8-bfa8-b9f44c99e76d [AWS ec2 undefined 40.29s 3 retries] describeInstances({})
2018-07-26T14:32:27.393Z b81e56a9-90e0-11e8-bfa8-b9f44c99e76d { TimeoutError: Socket timed out without establishing a connection
...
Example error log where the API call connection was successful, but timed out after the API response took too long (socket timeout)
START RequestId: 3c0523f4-9650-11e8-bd98-0df3c5cf9bd8 Version: $LATEST
2018-08-02T12:33:18.958Z 3c0523f4-9650-11e8-bd98-0df3c5cf9bd8 [AWS ec2 undefined 30.596s 3 retries] describeInstances({})
2018-08-02T12:33:18.978Z 3c0523f4-9650-11e8-bd98-0df3c5cf9bd8 { TimeoutError: Connection timed out after 30s
Note: These logs aren't generated if the API request doesn't get a response within your Lambda function's timeout. If the API request ends because of a function timeout, try one of the following:
- Change the retry settings in the SDK so that all retries are made within the timeout.
- Increase the Lambda function timeout setting temporarily to allow enough time to generate SDK logs.
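To check whether all retries can complete within the function timeout, you can estimate how many retries fit into the time that's available. The following Python sketch is one way to do that. The function name and the 20-second default margin are illustrative assumptions, and the estimate ignores any backoff delay that the SDK adds between attempts. In a Python function, the available time could come from context.get_remaining_time_in_millis() divided by 1,000.

```python
def max_retries_that_fit(available_seconds, connect_timeout, read_timeout, margin=20):
    """Estimate how many SDK retries fit after the first attempt.

    Each attempt is assumed to take at most connect_timeout + read_timeout.
    Returns 0 when only the first attempt fits, or when even that doesn't fit.
    """
    per_attempt = connect_timeout + read_timeout
    attempts = int((available_seconds - margin) // per_attempt)
    return max(attempts - 1, 0)


# Example: 180 seconds available, 10-second connection and 30-second socket timeouts.
retries = max_retries_that_fit(180, connect_timeout=10, read_timeout=30)
```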
Change the AWS SDK's settings
The retry count and timeout settings of the AWS SDK should allow enough time for your API call to get a response. To determine the right values for each setting, test different configurations and get the following information:
- Average time to establish a successful connection
- Average time that a full API request takes (until it's successfully returned)
- Whether retries should be made by the AWS SDK or by your own code
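One simple way to collect timing data is to wrap the call in a small helper and average several runs. The following Python sketch measures the full request time only. It doesn't separate connection time from response time. The average_latency helper and the fake_api_call stand-in are hypothetical; replace the stand-in with your own SDK call.

```python
import time
from statistics import mean


def average_latency(call, runs=5):
    """Return the mean wall-clock time, in seconds, of call() over several runs."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def fake_api_call():
    # Hypothetical stand-in for an SDK call.
    time.sleep(0.01)


avg_seconds = average_latency(fake_api_call, runs=3)
```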
For more information on changing retry count and timeout settings, see the client configuration documentation for the AWS SDK that you're using.
The following are some example commands that change retry count and timeout settings for each runtime.
Important: Before using any of the following commands, replace the example values for each setting with the values for your use case.
Example Python (Boto 3) command to change retry count and timeout settings
# max_attempts: retry count / read_timeout: socket timeout / connect_timeout: new connection timeout
from botocore.session import Session
from botocore.config import Config
s = Session()
c = s.create_client('s3', config=Config(connect_timeout=5, read_timeout=60, retries={'max_attempts': 2}))
Example JavaScript/Node.js command to change retry count and timeout settings
// maxRetries: retry count / timeout: socket timeout / connectTimeout: new connection timeout
var AWS = require('aws-sdk');
AWS.config.update({
  maxRetries: 2,
  httpOptions: {
    timeout: 30000,
    connectTimeout: 5000
  }
});
Example Java command to change retry count and timeout settings
// setMaxErrorRetry(): retry count / setSocketTimeout(): socket timeout / setConnectionTimeout(): new connection timeout
ClientConfiguration clientConfig = new ClientConfiguration();
clientConfig.setSocketTimeout(60000);
clientConfig.setConnectionTimeout(5000);
clientConfig.setMaxErrorRetry(2);
AmazonDynamoDBClient ddb = new AmazonDynamoDBClient(credentialsProvider, clientConfig);
Example .NET command to change retry count and timeout settings
// MaxErrorRetry: retry count / ReadWriteTimeout: socket timeout / Timeout: new connection timeout
var client = new AmazonS3Client(
    new AmazonS3Config
    {
        Timeout = TimeSpan.FromSeconds(5),
        ReadWriteTimeout = TimeSpan.FromSeconds(60),
        MaxErrorRetry = 2
    });
Example Go command to change retry count settings
// Create Session with MaxRetry configuration to be shared by multiple service clients.
sess := session.Must(session.NewSession(&aws.Config{
    MaxRetries: aws.Int(3),
}))
// Create S3 service client with a specific Region.
svc := s3.New(sess, &aws.Config{
    Region: aws.String("us-west-2"),
})
Example Go command to change request timeout settings
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// SQS ReceiveMessage
params := &sqs.ReceiveMessageInput{ ... }
req, resp := s.ReceiveMessageRequest(params)
req.HTTPRequest = req.HTTPRequest.WithContext(ctx)
err := req.Send()
(Optional) Change your Lambda function's timeout setting
A low Lambda function timeout can cause healthy connections to be dropped early. If that's happening in your use case, increase the function timeout setting to allow enough time for your API call to get a response.
Use the following formula to estimate the base time needed for the function timeout:
First attempt (connection timeout + socket timeout) + Number of retries x (connection timeout + socket timeout) + 20 seconds additional code runtime margin = Required Lambda function timeout
Example Lambda function timeout calculation
Note: The following calculation is for an AWS SDK that's configured for three retries, a 10-second connection timeout, and a 30-second socket timeout.
First attempt (10 seconds + 30 seconds) + Number of retries [3 * (10 seconds + 30 seconds)] + 20 seconds additional code runtime margin = 180 seconds
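The same estimate can be expressed as a small helper so that it's easy to rerun with different settings. The following Python sketch mirrors the formula above. The function name is illustrative, and the 20-second margin is kept as an adjustable default.

```python
def required_function_timeout(retries, connect_timeout, read_timeout, margin=20):
    """Estimate the Lambda function timeout, in seconds, for the given SDK settings."""
    per_attempt = connect_timeout + read_timeout
    first_attempt = per_attempt
    return first_attempt + retries * per_attempt + margin


# Three retries, 10-second connection timeout, 30-second socket timeout.
estimate = required_function_timeout(retries=3, connect_timeout=10, read_timeout=30)
print(estimate)  # 180
```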
Related information
Invoke (Lambda API reference)