Since July 29, we have been unable to stop at least two EC2 instances: i-eb4f3b88 and i-af40dccc. Both are located in the us-east-1b availability zone, but many other instances we have in the same availability zone are not affected by this issue. When we try to stop either instance, we get an "internal error".
Using the SDK and CLI results in similar error messages. I can confirm I have the IAM permissions necessary to call the StopInstances API on the two instances mentioned. I am aware that there were increased error rates for EC2 APIs reported between 1:03PM and 1:49PM PDT, but this issue had been affecting us long before then, and it is still affecting us as of 3:50PM PDT.
Could someone please look at it from the AWS side, to see if we are impacted by a strange issue?
The SDK error message is "message=An internal error has occurred, code=InternalError, time=Thu Aug 03 2023 10:07:55 GMT+0000 (Coordinated Universal Time), requestId=13a56eb6-c9e1-4a80-8c9c-26bfeb472bc9, statusCode=500, retryable=true"
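Since the SDK flags this error as retryable=true, here is a minimal Python sketch of how a caller might retry such a call with exponential backoff. The names RetryableError and call_with_backoff are hypothetical and not part of any AWS SDK; the operation argument stands in for whatever function issues the StopInstances request. In our case retrying did not help, so this only illustrates the pattern.

```python
import time


class RetryableError(Exception):
    """Hypothetical stand-in for an SDK error flagged retryable=true."""


def call_with_backoff(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying on RetryableError with exponential backoff.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Re-raises the last error once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep parameter is injectable only so the delay can be replaced in tests; a real caller would leave it at the default.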
Bottom part of the CLI error message (after running aws --debug ec2 stop-instances --instance-ids i-eb4f3b88) is:
2023-08-03 20:05:53,377 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://ec2.us-east-1.amazonaws.com/, headers={'Content-Type': b'application/x-www-form-urlencoded; charset=utf-8', 'User-Agent': b'aws-cli/2.13.3 Python/3.11.4 Linux/5.19.0-50-generic exe/x86_64.ubuntu.22 prompt/off command/ec2.stop-instances', 'X-Amz-Date': b'20230803T223553Z', 'X-Amz-Security-Token': b'redacted security token', 'Authorization': b'AWS4-HMAC-SHA256 Credential=redacted role name/20230803/us-east-1/ec2/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-security-token, Signature=redacted signature', 'Content-Length': '63'}>
2023-08-03 20:05:53,377 - MainThread - botocore.httpsession - DEBUG - Certificate path: /usr/local/aws-cli/v2/2.13.3/dist/awscli/botocore/cacert.pem
2023-08-03 20:05:53,377 - MainThread - urllib3.connectionpool - DEBUG - Resetting dropped connection: ec2.us-east-1.amazonaws.com
2023-08-03 20:05:53,861 - MainThread - urllib3.connectionpool - DEBUG - https://ec2.us-east-1.amazonaws.com:443 "POST / HTTP/1.1" 500 None
2023-08-03 20:05:53,861 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-RequestId': 'b6f5bc4f-44ef-4016-84de-2adf68c1be68', 'Cache-Control': 'no-cache, no-store', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains', 'vary': 'accept-encoding', 'Content-Type': 'text/xml;charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Date': 'Thu, 03 Aug 2023 22:35:53 GMT', 'Connection': 'close', 'Server': 'AmazonEC2'}
2023-08-03 20:05:53,862 - MainThread - botocore.parsers - DEBUG - Response body:
b'<?xml version="1.0" encoding="UTF-8"?>\n<Response><Errors><Error><Code>InternalError</Code><Message>An internal error has occurred</Message></Error></Errors><RequestID>b6f5bc4f-44ef-4016-84de-2adf68c1be68</RequestID></Response>'
2023-08-03 20:05:53,862 - MainThread - botocore.hooks - DEBUG - Event needs-retry.ec2.StopInstances: calling handler <bound method RetryHandler.needs_retry of <botocore.retries.standard.RetryHandler object at 0x7f95db8f65d0>>
2023-08-03 20:05:53,862 - MainThread - botocore.retries.standard - DEBUG - Max attempts of 3 reached.
2023-08-03 20:05:53,862 - MainThread - botocore.retries.standard - DEBUG - Not retrying request.
2023-08-03 20:05:53,862 - MainThread - botocore.hooks - DEBUG - Event after-call.ec2.StopInstances: calling handler <bound method RetryQuotaChecker.release_retry_quota of <botocore.retries.standard.RetryQuotaChecker object at 0x7f95e29ae010>>
2023-08-03 20:05:53,862 - MainThread - awscli.clidriver - DEBUG - Exception caught in main()
Traceback (most recent call last):
File "awscli/clidriver.py", line 460, in main
File "awscli/clidriver.py", line 595, in __call__
File "awscli/clidriver.py", line 798, in __call__
File "awscli/clidriver.py", line 929, in invoke
File "awscli/clidriver.py", line 941, in _make_client_call
File "awscli/botocore/client.py", line 341, in _api_call
File "awscli/botocore/client.py", line 697, in _make_api_call
botocore.exceptions.ClientError: An error occurred (InternalError) when calling the StopInstances operation (reached max retries: 2): An internal error has occurred
An error occurred (InternalError) when calling the StopInstances operation (reached max retries: 2): An internal error has occurred
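For anyone who needs to pass the request ID to AWS Support, the error code and request ID can be pulled out of the XML response body in the log above. This is a rough sketch using only the Python standard library; the function name summarize_ec2_error is made up for illustration, and it assumes the body has the same un-namespaced Response/Errors/Error layout as the one logged here.

```python
import xml.etree.ElementTree as ET

# Response body copied from the debug log above.
BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<Response><Errors><Error><Code>InternalError</Code>'
    b'<Message>An internal error has occurred</Message></Error></Errors>'
    b'<RequestID>b6f5bc4f-44ef-4016-84de-2adf68c1be68</RequestID></Response>'
)


def summarize_ec2_error(body: bytes) -> dict:
    """Extract the error code, message, and request ID from an error body.

    Assumes the element layout seen in the log; missing elements yield None.
    """
    root = ET.fromstring(body)
    return {
        "code": root.findtext("./Errors/Error/Code"),
        "message": root.findtext("./Errors/Error/Message"),
        "request_id": root.findtext("./RequestID"),
    }


if __name__ == "__main__":
    print(summarize_ec2_error(BODY))
```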
If you have any additional questions, feel free to comment here. I am happy to assist.