I have an ECS task running on Fargate with three containers. The task is configured with pid_mode=task because I want to collect process metrics via an agent in one of the containers. However, when I try to connect to my containers via "ecs execute-command", it only works for the container that is defined first in my task. For the other two containers I always get the error "An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later". If I remove the pid_mode=task setting, I can connect to all three containers of my task without any problem.
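For reference, the per-container attempts described above can be sketched roughly as follows. This is only an illustration: the helper name `build_exec_commands`, the `<task-id>` placeholder, and the `/bin/sh` command are assumptions, not the exact invocation used.

```python
import shlex


def build_exec_commands(cluster, task, containers, command="/bin/sh"):
    """Return one `aws ecs execute-command` argument list per container.

    Hypothetical helper: it only builds the command lines and does not run them.
    """
    return [
        [
            "aws", "ecs", "execute-command",
            "--cluster", cluster,
            "--task", task,
            "--container", name,
            "--interactive",
            "--command", command,
        ]
        for name in containers
    ]


if __name__ == "__main__":
    # "<task-id>" is a placeholder; substitute the real task ID or ARN.
    commands = build_exec_commands(
        "ecstest-stage",
        "<task-id>",
        ["ecstest-webserver", "log-router", "datadog-agent"],
    )
    for argv in commands:
        # Print a copy-pasteable shell command for each container.
        print(shlex.join(argv))
```

With pid_mode=task enabled, only the command for the first container (ecstest-webserver) succeeds; the other two return the TargetNotConnectedException quoted above.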
The "check-ecs-exec.sh" utility also doesn't report any issues; every check is either green or at least yellow:
-------------------------------------------------------------
Prerequisites for check-ecs-exec.sh v0.7
-------------------------------------------------------------
jq | OK (/usr/bin/jq)
AWS CLI | OK (/usr/local/bin/aws)
-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
AWS CLI Version | OK (aws-cli/1.27.44 Python/3.10.12 Linux/5.15.0-91-generic botocore/1.29.44)
Session Manager Plugin | OK (1.2.536.0)
-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : eu-west-1
Cluster: ecstest-stage
Task : 97d1a…
-------------------------------------------------------------
Cluster Configuration | Audit Logging Not Configured
Can I ExecuteCommand? | arn:aws:iam::…
ecs:ExecuteCommand: allowed
ssm:StartSession denied?: allowed
Task Status | RUNNING
Launch Type | Fargate
Platform Version | 1.4.0
Exec Enabled for Task | OK
Container-Level Checks |
----------
Managed Agent Status
----------
1. RUNNING for "datadog-agent"
2. RUNNING for "log-router"
3. RUNNING for "ecstest-webserver"
----------
Init Process Enabled (ecstest-stage:28)
----------
1. Enabled - "ecstest-webserver"
2. Enabled - "log-router"
3. Enabled - "datadog-agent"
----------
Read-Only Root Filesystem (ecstest-stage:28)
----------
1. Disabled - "ecstest-webserver"
2. Disabled - "log-router"
3. Disabled - "datadog-agent"
Task Role Permissions | arn:aws:iam::…
ssmmessages:CreateControlChannel: allowed
ssmmessages:CreateDataChannel: allowed
ssmmessages:OpenControlChannel: allowed
ssmmessages:OpenDataChannel: allowed
VPC Endpoints |
Found existing endpoints for vpc-0d10e…:
- com.amazonaws.eu-west-1.s3
SSM PrivateLink "com.amazonaws.eu-west-1.ssmmessages" not found. You must ensure your task has proper outbound internet connectivity.
Environment Variables | (ecstest-stage:28)
1. container "ecstest-webserver"
- AWS_ACCESS_KEY: not defined
- AWS_ACCESS_KEY_ID: not defined
- AWS_SECRET_ACCESS_KEY: not defined
2. container "log-router"
- AWS_ACCESS_KEY: not defined
- AWS_ACCESS_KEY_ID: not defined
- AWS_SECRET_ACCESS_KEY: not defined
3. container "datadog-agent"
- AWS_ACCESS_KEY: not defined
- AWS_ACCESS_KEY_ID: not defined
- AWS_SECRET_ACCESS_KEY: not defined
Yes, "enableExecuteCommand" is indeed set to true for the task, which I have confirmed by running "describe-tasks". Additionally, I am able to log in to the first container in my example above (ecstest-webserver); only the other two containers fail while "pid_mode=task" is enabled.
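A minimal sketch of how the "describe-tasks" output can be checked per container, assuming the JSON response has already been parsed into a dict. The function name `summarize_exec_readiness` and the sample data are hypothetical; the field names (`enableExecuteCommand`, `containers`, `managedAgents`, `lastStatus`) follow the describe-tasks response shape.

```python
def summarize_exec_readiness(describe_tasks_response):
    """Return (container name, exec enabled, agent status) for each container."""
    rows = []
    for task in describe_tasks_response.get("tasks", []):
        exec_enabled = task.get("enableExecuteCommand", False)
        for container in task.get("containers", []):
            # Default when no ExecuteCommandAgent entry is reported at all.
            agent_status = "MISSING"
            for agent in container.get("managedAgents", []):
                if agent.get("name") == "ExecuteCommandAgent":
                    agent_status = agent.get("lastStatus", "UNKNOWN")
            rows.append((container.get("name"), exec_enabled, agent_status))
    return rows


if __name__ == "__main__":
    # Hypothetical, trimmed-down response for illustration only.
    sample = {
        "tasks": [
            {
                "enableExecuteCommand": True,
                "containers": [
                    {
                        "name": "ecstest-webserver",
                        "managedAgents": [
                            {"name": "ExecuteCommandAgent", "lastStatus": "RUNNING"}
                        ],
                    },
                    {"name": "log-router", "managedAgents": []},
                ],
            }
        ]
    }
    for name, enabled, status in summarize_exec_readiness(sample):
        print(f"{name}: exec enabled={enabled}, agent={status}")
```

In my case this kind of check reports the agent as RUNNING for all three containers, so the reported status alone does not explain the TargetNotConnectedException.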