I want to troubleshoot errors that I receive when I use Amazon Elastic Container Service (Amazon ECS) Exec on my AWS Fargate tasks.
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
When you use ECS Exec on Fargate tasks, you might receive one of the following error messages:
- "An error occurred (InvalidParameterException) when calling the ExecuteCommand operation: The execute command failed because execute command was not enabled when the task was run or the execute command agent isn't running. Wait and try again or run a new task with execute command enabled and try again."
- "An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later."
It's a best practice to use AWS CloudShell to troubleshoot ECS Exec on Fargate tasks. CloudShell comes preinstalled with the AWS Systems Manager Session Manager plugin and the AWS CLI.
InvalidParameterException error
If the ExecuteCommand option for your Fargate task is deactivated, then you receive the InvalidParameterException error.
To resolve this issue, complete the following steps:
- To check whether the enableExecuteCommand parameter is set to true or false, run the describe-tasks command:
aws ecs describe-tasks --cluster example-cluster-name --tasks example-task-id | grep enableExecuteCommand
Note: Replace example-cluster-name with your cluster and example-task-id with your task ID.
- If the enableExecuteCommand parameter is false, then run the following update-service command to update the parameter to true:
aws ecs update-service --cluster example-cluster-name --service example-service --region example-region --enable-execute-command --force-new-deployment
Note: Replace example-cluster-name with your cluster, example-service with your service, and example-region with your AWS Region. The force-new-deployment option creates a new deployment that starts new tasks and stops earlier tasks based on the service's deployment configuration. If your services use the AWS CodeDeploy blue/green deployment, then initiate a CODE_DEPLOY deployment instead of force-new-deployment. You can't use force-new-deployment for a blue/green deployment because a forced deployment launches a rolling update.
- To check the status of ExecuteCommandAgent, run the following describe-tasks command:
aws ecs describe-tasks --cluster example-cluster-name --tasks example-task-id | grep -A 6 managedAgents
Note: Replace example-cluster-name with your cluster and example-task-id with your task ID.
- Check the command's output for the state of ExecuteCommandAgent. If lastStatus of ExecuteCommandAgent isn't RUNNING, then check the ExecuteCommandAgent logs to identify the root cause. To generate the logs, complete the steps in the Generate logs for ECS Exec to identify issues section.
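If ExecuteCommandAgent can't retrieve credentials because you configured a proxy in the container, then add the following NO_PROXY option to your container instance's configuration files. The option makes requests to the instance metadata endpoint (169.254.169.254) and the task credentials endpoint (169.254.170.2) bypass the proxy: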
env no_proxy=169.254.169.254,169.254.170.2
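The two describe-tasks checks in the preceding steps can also be scripted. The following is a minimal sketch, not an official tool. It assumes that the AWS CLI is configured with a default Region, and it uses the --query option instead of grep to read the fields directly. The function names exec_enabled and exec_agent_status are hypothetical.

```shell
# Return 0 if enableExecuteCommand is true for the task, 1 otherwise.
# $1 = cluster name, $2 = task ID
exec_enabled() {
  case "$(aws ecs describe-tasks --cluster "$1" --tasks "$2" \
      --query 'tasks[0].enableExecuteCommand' --output text)" in
    [Tt]rue) return 0 ;;
    *) return 1 ;;
  esac
}

# Print lastStatus of ExecuteCommandAgent for each container in the task.
# $1 = cluster name, $2 = task ID
exec_agent_status() {
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query "tasks[0].containers[].managedAgents[?name=='ExecuteCommandAgent'][].lastStatus" \
    --output text
}

# Example usage (replace the placeholder values):
#   exec_enabled example-cluster-name example-task-id && echo "ECS Exec is turned on"
#   exec_agent_status example-cluster-name example-task-id
```

If exec_agent_status prints nothing or a value other than RUNNING, then continue with the agent logs as described in the preceding steps.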
TargetNotConnectedException error
To resolve a TargetNotConnectedException error, take the following actions.
Add the required permissions and validate the networking configuration
Complete the following steps:
- Add the required permissions to your Amazon ECS task's AWS Identity and Access Management (IAM) role. If the task's IAM role already has the required permissions, then make sure that service control policies (SCPs) aren't blocking the task's connection to SSM Agent.
- If you use Amazon Virtual Private Cloud (Amazon VPC) interface endpoints with Amazon ECS, then create the following endpoints:
ec2messages.region.amazonaws.com
ssm.region.amazonaws.com
ssmmessages.region.amazonaws.com
Note: Replace region with your Region.
- To confirm that your AWS CLI environment and Amazon ECS cluster or task are ready for ECS Exec, run the check-ecs-exec.sh script. The output of the check-ecs-exec.sh script shows what you must resolve before you use ECS Exec. For information about prerequisites and usage, see Amazon ECS Exec Checker on the GitHub website.
The following example output shows that ECS Exec is turned off for the task and the task role doesn't have the required Systems Manager permissions:
Prerequisites for check-ecs-exec.sh v0.7
-------------------------------------------------------------
jq | OK (/usr/bin/jq)
AWS CLI | OK (/usr/local/bin/aws)
-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
AWS CLI Version | OK (aws-cli/2.11.0 Python/3.11.2 Linux/4.14.255-291-231.527.amzn2.x86_64 exec-env/CloudShell exe/x86_64.amzn.2 prompt/off)
Session Manager Plugin | OK (1.2.398.0)
-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : us-east-1
Cluster: Fargate-Testing
Task : ca27e41ea3f54fd1804ca00feffa178d
-------------------------------------------------------------
Cluster Configuration | Audit Logging Not Configured
Can I ExecuteCommand? | arn:aws:iam::12345678:role/Admin
ecs:ExecuteCommand: allowed
ssm:StartSession denied?: allowed
Task Status | RUNNING
Launch Type | Fargate
Platform Version | 1.4.0
Exec Enabled for Task | NO
Container-Level Checks |
----------
Managed Agent Status - SKIPPED
----------
----------
Init Process Enabled (Exec-check:2)
----------
1. Disabled - "nginx"
----------
Read-Only Root Filesystem (Exec-check:2)
----------
1. Disabled - "nginx"
Task Role Permissions | arn:aws:iam::12345678:role/L3-session
ssmmessages:CreateControlChannel: implicitDeny
ssmmessages:CreateDataChannel: implicitDeny
ssmmessages:OpenControlChannel: implicitDeny
ssmmessages:OpenDataChannel: implicitDeny
VPC Endpoints | SKIPPED (vpc-abcd - No additional VPC endpoints required)
Environment Variables | (Exec-check:2)
1. container "nginx"
- AWS_ACCESS_KEY: not defined
- AWS_ACCESS_KEY_ID: not defined
- AWS_SECRET_ACCESS_KEY: not defined
Note: To run ECS Exec, you must set the readonlyRootFilesystem parameter to false in the task definition. If readonlyRootFilesystem is true, then SSM Agent can't create the required directories.
- Check whether you configured IAM user credentials at the container level, such as an access key or secret access key.
Note: SSM Agent uses the AWS SDK's default credential provider chain for authentication, and that chain checks environment variables before it checks the task role. If you configure the access key or secret access key in the container as environment variables, then you override task-level permissions. To use ECS Exec, the IAM credentials at the container level must provide permissions for SSM Agent.
- In your task definition, confirm that pidMode isn't set to task.
Note: You can have only one ECS Exec session for each process ID (PID) namespace. If you share a PID namespace in a task, then you can start ECS Exec sessions in only one container.
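The example checker output in the preceding list reports four ssmmessages actions as implicitDeny for the task role. The following is a minimal sketch of one way to grant them. The function name ecs_exec_task_role_policy, the role name example-task-role, and the policy name example-ecs-exec-policy are hypothetical. Scope the policy to your own security requirements before you use it.

```shell
# Print an IAM policy document that allows the four ssmmessages actions
# that the checker output lists under "Task Role Permissions".
ecs_exec_task_role_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Example usage (example-task-role and example-ecs-exec-policy are placeholders):
#   ecs_exec_task_role_policy > policy.json
#   aws iam put-role-policy --role-name example-task-role \
#     --policy-name example-ecs-exec-policy --policy-document file://policy.json
```

After you attach the policy, run the check-ecs-exec.sh script again to confirm that the four actions show as allowed.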
Use ECS Exec to get into the container with the correct shell
Different base images can have different shells within them. If you use the incorrect shell, then you receive errors. Make sure that you use the correct shell based on your application image.
To use ECS Exec to get into the container, run the execute-command command:
aws ecs execute-command --region example-region --cluster example-cluster --container example-container --task example-task --command "example_shell" --interactive
Note: Replace example-region with your Region, example-cluster with your cluster name, example-container with your container name, example-task with your task ID, and example_shell with a shell that exists in your image, such as /bin/sh or /bin/bash.
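If you often switch between images with different shells, then a small wrapper can make the shell an optional argument. The following is a sketch only. The function name ecs_exec_shell is hypothetical, the /bin/sh default is an assumption about what your image provides, and the function relies on the Region from your AWS CLI configuration instead of the --region option.

```shell
# Open an interactive ECS Exec session in a container.
# $1 = cluster, $2 = task ID, $3 = container name,
# $4 = shell to start (optional, defaults to /bin/sh)
ecs_exec_shell() {
  aws ecs execute-command --cluster "$1" --task "$2" --container "$3" \
    --command "${4:-/bin/sh}" --interactive
}

# Example usage (replace the placeholder values):
#   ecs_exec_shell example-cluster example-task example-container
#   ecs_exec_shell example-cluster example-task example-container /bin/bash
```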
Generate logs for ECS Exec to identify issues
To generate SSM Agent logs with information about why ECS Exec isn't working, set the following as the container command in the environment section of the container definition. The command waits 2 minutes and then prints the SSM Agent log file.
Console command:
/bin/bash,-c,sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log
JSON command:
"/bin/bash","-c","sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"
Note: Different applications have different shells and editors. Modify the preceding command parameters for your application's requirements.
If you use the awslogs log driver, then the preceding commands generate SSM Agent logs and transfer them to the Amazon CloudWatch log group. If you use other log drivers or logging endpoints, then the SSM Agent logs transfer to those locations.
JSON example:
"entryPoint": [],
"portMappings": [],
"command": [
    "/bin/bash",
    "-c",
    "sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"
],
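If the task uses the awslogs log driver, then you can read the generated SSM Agent logs from the AWS CLI. The following is a minimal sketch that assumes AWS CLI version 2, which provides the logs tail command. The function name tail_exec_logs and the log group name /ecs/example-task are hypothetical.

```shell
# Print recent events from the log group that receives the container output.
# $1 = CloudWatch log group name
# $2 = how far back to read (optional, defaults to 10m)
tail_exec_logs() {
  aws logs tail "$1" --since "${2:-10m}"
}

# Example usage (replace /ecs/example-task with your log group):
#   tail_exec_logs /ecs/example-task
#   tail_exec_logs /ecs/example-task 1h
```

Because the container command sleeps for 2 minutes before it prints the log file, wait at least that long after the task starts before you read the log group.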