Recently, our AWS Lambda microservices have been failing intermittently with an OS-level error: getaddrinfo EBUSY. The error occurs during the DNS lookup made by the Redis client and causes the function to time out.
To work around these intermittent failures, I created a wrapper function whose sole responsibility is to invoke the other function, which has a shorter timeout than the wrapper.
This does not work either: now the LambdaClient invoke fails for the same reason!
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

export const handler = async (event) => {
  const lambda = new LambdaClient();
  try {
    const params = {
      FunctionName: '<anotherfunction>', // the Lambda function we are going to invoke
      Payload: JSON.stringify(event)
    };
    const command = new InvokeCommand(params);
    const ret = await lambda.send(command);
    // The payload comes back as bytes, so decode it to a string before parsing
    if (ret && ret.Payload) {
      const payload = JSON.parse(Buffer.from(ret.Payload).toString());
      if (payload.statusCode == 200) {
        return payload;
      }
    }
  } catch (e) {
    console.error(e);
  }
...
The failure I get is:
ERROR Error: getaddrinfo EBUSY lambda.eu-west-1.amazonaws.com
    at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:118:26) {
  errno: -16,
  code: 'EBUSY',
  syscall: 'getaddrinfo',
  hostname: 'lambda.eu-west-1.amazonaws.com',
  '$metadata': { attempts: 1, totalRetryDelay: 0 }
}
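The error object exposes `code: 'EBUSY'` and `syscall: 'getaddrinfo'`, and `attempts: 1` suggests the call was not retried. For illustration, a retry keyed on those two fields might look like the sketch below. The helper names `isDnsBusyError` and `retryOnDnsBusy`, the retry count, and the delay are all hypothetical, and this is only an assumption about a possible workaround, not a confirmed fix.

```javascript
// Hypothetical helper: true when an error looks like the transient DNS failure shown above.
function isDnsBusyError(err) {
  return Boolean(err) && err.code === "EBUSY" && err.syscall === "getaddrinfo";
}

// Hypothetical helper: calls fn() and retries only on the DNS error,
// waiting a little longer before each new attempt (linear backoff).
async function retryOnDnsBusy(fn, { retries = 3, delayMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow anything that is not the DNS error, or once retries are used up.
      if (!isDnsBusyError(err) || attempt >= retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
    }
  }
}
```

In the handler above, this would be used as `const ret = await retryOnDnsBusy(() => lambda.send(command));`.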
As far as I know, the getaddrinfo EBUSY error comes from the underlying OS and is caused by too many open files. My evidence is that I got rid of it (for a while) by changing the function's memory setting slightly, so that the function was moved to another tenant.
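One way to test the open-files hypothesis would be to log the number of open file descriptors at the start of each invocation. The following is a minimal sketch, assuming the runtime exposes `/proc/self/fd` (usual on Linux, but an assumption here). `countOpenFileDescriptors` is a hypothetical helper name.

```javascript
// Hypothetical diagnostic: counts this process's open file descriptors.
// Returns null when /proc/self/fd is not available or not readable.
async function countOpenFileDescriptors() {
  // Dynamic import keeps the snippet usable from both ESM and CommonJS code.
  const { readdir } = await import("node:fs/promises");
  try {
    const entries = await readdir("/proc/self/fd");
    return entries.length;
  } catch {
    return null;
  }
}
```

Calling `console.log("open fds:", await countOpenFileDescriptors());` at the top of the handler would show whether the count grows across invocations before the error appears.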
Is there anything I can do to prevent this? Is there anything AWS can do to prevent this?
BR, Joni
So what are our options? Ditch AWS Lambda? This is causing severe degradation in our production environment.