ecs execute-command not working for pid_mode=task


I have an ECS task running on Fargate with three containers. The task is configured with pid_mode=task because I want to collect process metrics via an agent in one of the containers. However, when I try to connect to my containers via "ecs execute-command", it only works for the container that is defined first in my task. For the other two containers I always get the error "An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later". If I remove the pid_mode=task setting, I can connect to all three containers of my task without any problem.

The utility "check-ecs-exec.sh" doesn't show any issues either; everything is green or at least yellow:

-------------------------------------------------------------
Prerequisites for check-ecs-exec.sh v0.7
-------------------------------------------------------------
  jq      | OK (/usr/bin/jq)
  AWS CLI | OK (/usr/local/bin/aws)

-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
  AWS CLI Version        | OK (aws-cli/1.27.44 Python/3.10.12 Linux/5.15.0-91-generic botocore/1.29.44)
  Session Manager Plugin | OK (1.2.536.0)

-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : eu-west-1
Cluster: ecstest-stage
Task   : 97d1a…
-------------------------------------------------------------
  Cluster Configuration  | Audit Logging Not Configured
  Can I ExecuteCommand?  | arn:aws:iam::…
     ecs:ExecuteCommand: allowed
     ssm:StartSession denied?: allowed
  Task Status            | RUNNING
  Launch Type            | Fargate
  Platform Version       | 1.4.0
  Exec Enabled for Task  | OK
  Container-Level Checks | 
    ----------
      Managed Agent Status
    ----------
         1. RUNNING for "datadog-agent"
         2. RUNNING for "log-router"
         3. RUNNING for "ecstest-webserver"
    ----------
      Init Process Enabled (ecstest-stage:28)
    ----------
         1. Enabled - "ecstest-webserver"
         2. Enabled - "log-router"
         3. Enabled - "datadog-agent"
    ----------
      Read-Only Root Filesystem (ecstest-stage:28)
    ----------
         1. Disabled - "ecstest-webserver"
         2. Disabled - "log-router"
         3. Disabled - "datadog-agent"
  Task Role Permissions  | arn:aws:iam::…
     ssmmessages:CreateControlChannel: allowed
     ssmmessages:CreateDataChannel: allowed
     ssmmessages:OpenControlChannel: allowed
     ssmmessages:OpenDataChannel: allowed
  VPC Endpoints          | 
    Found existing endpoints for vpc-0d10e…:
      - com.amazonaws.eu-west-1.s3
    SSM PrivateLink "com.amazonaws.eu-west-1.ssmmessages" not found. You must ensure your task has proper outbound internet connectivity.
  Environment Variables  | (ecstest-stage:28)
       1. container "ecstest-webserver"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
       2. container "log-router"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
       3. container "datadog-agent"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined
MMoench
asked 4 months ago, 424 views
3 Answers

Accepted Answer

At this time, the behavior of Amazon ECS is non-deterministic with respect to enableExecuteCommand when pidMode is set to task. The AWS SSM agent (which powers the feature) will be running in one of the containers only, but right now you cannot specify which container is the one in which it will run, nor can you specify that you want it to run in all of them.

The ECS service team is aware of this limitation. If you'd like to track the progress of the issue, I'd recommend you create a GitHub Issue at https://github.com/aws/containers-roadmap/issues and discuss your use case.
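
Until that changes, one way to find out which container ended up with the agent is to try ExecuteCommand against each container and record which calls raise TargetNotConnectedException. The following is only a minimal sketch under assumptions: `probe_exec_targets` and `FakeEcsClient` are hypothetical names introduced here, the fake client exists solely so the logic can be run without AWS access, and with a real boto3 ECS client every successful call opens an SSM session that is simply left unused.

```python
# Sketch: probe which containers of a task accept ECS Exec.
# "probe_exec_targets" and "FakeEcsClient" are hypothetical helpers, not AWS APIs.


def probe_exec_targets(ecs_client, cluster, task, containers, command="/bin/sh"):
    """Return {container_name: True/False} depending on whether
    ExecuteCommand succeeds or raises TargetNotConnectedException."""
    reachable = {}
    for name in containers:
        try:
            ecs_client.execute_command(
                cluster=cluster,
                task=task,
                container=name,
                command=command,
                interactive=True,
            )
            reachable[name] = True
        except ecs_client.exceptions.TargetNotConnectedException:
            reachable[name] = False
    return reachable


class _FakeExceptions:
    class TargetNotConnectedException(Exception):
        pass


class FakeEcsClient:
    """Offline stand-in that mimics only the part of the client used above."""

    exceptions = _FakeExceptions

    def __init__(self, connected):
        self._connected = set(connected)

    def execute_command(self, *, cluster, task, container, command, interactive):
        if container not in self._connected:
            raise self.exceptions.TargetNotConnectedException(container)
        return {"containerName": container, "session": {}}


if __name__ == "__main__":
    # With real credentials you would pass boto3.client("ecs") instead.
    client = FakeEcsClient(connected={"ecstest-webserver"})
    result = probe_exec_targets(
        client,
        cluster="example-cluster",  # placeholder
        task="example-task-id",     # placeholder
        containers=["ecstest-webserver", "log-router", "datadog-agent"],
    )
    for name, ok in result.items():
        print(f"{name}: {'reachable' if ok else 'TargetNotConnectedException'}")
```

Because the placement is non-deterministic, the result may differ from one task launch to the next, so the probe would have to be repeated for each new task.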

AWS
EXPERT
answered 4 months ago

Hi MMoench,

Did you make sure that "enableExecuteCommand" is set to true for your task? You can check it in the JSON returned by:

aws ecs describe-tasks --cluster <CLUSTER> --tasks <TASK_ARN>

If not, try enabling it.
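
The JSON returned by the describe-tasks command above can also be checked programmatically. The sketch below is a rough illustration, not an official tool: `exec_readiness` is a hypothetical helper, and `SAMPLE` is a hand-written, heavily trimmed stand-in for a real response (the ARN is a placeholder). It reads the task-level `enableExecuteCommand` flag and the `lastStatus` of the `ExecuteCommandAgent` managed agent for each container.

```python
import json


def exec_readiness(describe_tasks_response):
    """Summarise ECS Exec state for each task in a describe-tasks response."""
    summary = []
    for task in describe_tasks_response.get("tasks", []):
        agents = {}
        for container in task.get("containers", []):
            status = "NOT PRESENT"
            for agent in container.get("managedAgents", []):
                if agent.get("name") == "ExecuteCommandAgent":
                    status = agent.get("lastStatus", "UNKNOWN")
            agents[container.get("name")] = status
        summary.append(
            {
                "taskArn": task.get("taskArn"),
                "enableExecuteCommand": task.get("enableExecuteCommand", False),
                "agents": agents,
            }
        )
    return summary


# Hypothetical, trimmed response; a real one could come from
# `aws ecs describe-tasks ...` or boto3.client("ecs").describe_tasks(...).
SAMPLE = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:eu-west-1:111122223333:task/example/0123456789abcdef",
            "enableExecuteCommand": True,
            "containers": [
                {
                    "name": name,
                    "managedAgents": [
                        {"name": "ExecuteCommandAgent", "lastStatus": "RUNNING"}
                    ],
                }
                for name in ("ecstest-webserver", "log-router", "datadog-agent")
            ],
        }
    ]
}

if __name__ == "__main__":
    print(json.dumps(exec_readiness(SAMPLE), indent=2))
```

As the check-ecs-exec.sh output in the question shows, the managed agent can be reported as RUNNING for every container even though ExecuteCommand fails for some of them, so a clean result here does not rule out the pidMode problem.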

Best,

Didier

AWS
EXPERT
answered 4 months ago
  • Yes, "enableExecuteCommand" is indeed set to true in the task definition, which I have confirmed by running "describe-tasks". Additionally, I am able to log in to the first container in my example above (ecstest-webserver); only the other two containers don't work while "pid_mode=task" is enabled.


Hello, is there a solution to this? I have the same issue. I'm using a Datadog sidecar for metrics and logging and need to be able to send a flare as per: https://docs.datadoghq.com/agent/troubleshooting/send_a_flare/?tab=agentv6v7#ecs-fargate

answered 2 months ago
