ecs execute-command not working for pid_mode=task


I have an ECS task running on Fargate with three containers. The task is configured with pid_mode=task because I want to collect process metrics via an agent in one of the containers. However, when I try to connect to my containers via "ecs execute-command", this only works for the container that is defined first in my task. For the other two containers I always get the error "An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later". If I remove the setting pid_mode=task, I can connect to all three containers of my task without any problem.
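For reference, the behavior can be probed with a small loop over the containers. This is only a sketch: `try_exec_all` is a hypothetical helper name, the cluster, task, and container arguments are placeholders, and it assumes that the Session Manager plugin is installed, that each image ships a `true` binary, and that the CLI returns a non-zero exit code when the session cannot be opened.

```shell
#!/bin/sh
# Sketch: try ECS Exec against every container of one task and report
# which ones accept a session. All names passed in are placeholders.
try_exec_all() {
  cluster=$1
  task=$2
  shift 2
  for container in "$@"; do
    # "true" is assumed to exist in the image; it exits immediately.
    if aws ecs execute-command \
         --cluster "$cluster" \
         --task "$task" \
         --container "$container" \
         --interactive \
         --command "true" >/dev/null 2>&1; then
      echo "$container: reachable"
    else
      echo "$container: NOT reachable"
    fi
  done
}
```

Called as `try_exec_all ecstest-stage <TASK_ID> ecstest-webserver log-router datadog-agent`, it prints one line per container. With pid_mode=task set, the problem described above would presumably show up as a single reachable container.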

The utility "check-ecs-exec.sh" doesn't show any issues either; everything is green or at least yellow:

-------------------------------------------------------------
Prerequisites for check-ecs-exec.sh v0.7
-------------------------------------------------------------
  jq      | OK (/usr/bin/jq)
  AWS CLI | OK (/usr/local/bin/aws)

-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
  AWS CLI Version        | OK (aws-cli/1.27.44 Python/3.10.12 Linux/5.15.0-91-generic botocore/1.29.44)
  Session Manager Plugin | OK (1.2.536.0)

-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : eu-west-1
Cluster: ecstest-stage
Task   : 97d1a…
-------------------------------------------------------------
  Cluster Configuration  | Audit Logging Not Configured
  Can I ExecuteCommand?  | arn:aws:iam::…
     ecs:ExecuteCommand: allowed
     ssm:StartSession denied?: allowed
  Task Status            | RUNNING
  Launch Type            | Fargate
  Platform Version       | 1.4.0
  Exec Enabled for Task  | OK
  Container-Level Checks | 
    ----------
      Managed Agent Status
    ----------
         1. RUNNING for "datadog-agent"
         2. RUNNING for "log-router"
         3. RUNNING for "ecstest-webserver"
    ----------
      Init Process Enabled (ecstest-stage:28)
    ----------
         1. Enabled - "ecstest-webserver"
         2. Enabled - "log-router"
         3. Enabled - "datadog-agent"
    ----------
      Read-Only Root Filesystem (ecstest-stage:28)
    ----------
         1. Disabled - "ecstest-webserver"
         2. Disabled - "log-router"
         3. Disabled - "datadog-agent"
  Task Role Permissions  | arn:aws:iam::…
     ssmmessages:CreateControlChannel: allowed
     ssmmessages:CreateDataChannel: allowed
     ssmmessages:OpenControlChannel: allowed
     ssmmessages:OpenDataChannel: allowed
  VPC Endpoints          | 
    Found existing endpoints for vpc-0d10e…:
      - com.amazonaws.eu-west-1.s3
    SSM PrivateLink "com.amazonaws.eu-west-1.ssmmessages" not found. You must ensure your task has proper outbound internet connectivity.
  Environment Variables  | (ecstest-stage:28)
       1. container "ecstest-webserver"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
       2. container "log-router"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
       3. container "datadog-agent"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
MMoench
asked 4 months ago, 422 views
3 Answers
Accepted Answer

At this time, the behavior of Amazon ECS is non-deterministic with respect to enableExecuteCommand when pidMode is set to task. The AWS SSM agent (which powers the feature) will be running in one of the containers only, but right now you cannot specify which container is the one in which it will run, nor can you specify that you want it to run in all of them.

The ECS service team is aware of this limitation. If you'd like to track the progress of the issue, I'd recommend you create a GitHub Issue at https://github.com/aws/containers-roadmap/issues and discuss your use case.

AWS
EXPERT
answered 4 months ago

Hi MMoench,

Did you make sure that "enableExecuteCommand" is set to true for your task? You can check it in the JSON returned by:

aws ecs describe-tasks --cluster <CLUSTER> --tasks <TASK_ARN>

If not, try enabling it.

Best,

Didier

AWS
EXPERT
answered 4 months ago
  • Yes, "enableExecuteCommand" is indeed set to true in the task definition, which I have confirmed by running "describe-tasks". Additionally I am able to log in to the first container in my example above (ecstest-webserver), just the other two containers don't work while "pid_mode=task" is enabled.
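The check discussed in this exchange can be narrowed down with `--query`. A minimal sketch, assuming the AWS CLI is configured: `exec_agent_status` is a hypothetical helper name and its two arguments are placeholders. It prints the task-level `enableExecuteCommand` flag, followed by each container's name and the last status of its `ExecuteCommandAgent` managed agent.

```shell
#!/bin/sh
# Sketch: show whether ECS Exec is enabled for a task and what state the
# managed agent reports for each container. Arguments are placeholders.
exec_agent_status() {
  cluster=$1
  task=$2

  # Task-level flag that the answer above asks about.
  aws ecs describe-tasks --cluster "$cluster" --tasks "$task" \
    --query "tasks[0].enableExecuteCommand" \
    --output text

  # One row per container: name and ExecuteCommandAgent status.
  aws ecs describe-tasks --cluster "$cluster" --tasks "$task" \
    --query "tasks[0].containers[].[name, managedAgents[?name=='ExecuteCommandAgent'] | [0].lastStatus]" \
    --output text
}
```

Note that the check-ecs-exec.sh output in the question already reports the managed agent as RUNNING for all three containers, so a RUNNING status alone evidently does not guarantee that a session can be opened while pid_mode=task is set.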


Hello, is there a solution to this? I have the same issue. I'm using a Datadog sidecar for metrics and logging and need to be able to send a flare as described here: https://docs.datadoghq.com/agent/troubleshooting/send_a_flare/?tab=agentv6v7#ecs-fargate

answered 2 months ago
