Strange certificate errors when launching Fargate tasks


Sometimes, when we launch a new ECS task on Fargate, the task is created but never reaches RUNNING. It keeps trying for a while and then stops. This happens rarely, but it affects several of our accounts.

When this happens, the task's stop event has the following stoppedReason:

"CannotPullContainerError: ref pull has been retried 1 time(s): failed to copy: httpReadSeeker: failed open: failed to do request: Get [account-id].dkr.ecr.eu-central-1.amazonaws.com/[application-name]:1.22.38: tls: failed to verify certificate: x509: certificate is valid for *.s3.amazonaws.com, s3.amazonaws.com, not prod-eu-central-1-starport-layer-bucket.s3.eu-central-1.amazonaws.com"

This error seems to be outside our control. Why does it happen, and how can we prevent it? Do we have to monitor every Fargate task we start to make sure it actually reaches RUNNING, or is there a better way to ensure that the tasks we start are run?

2 Answers

From the error message you shared, it looks like ECS sent a request to the prod-eu-central-1-starport-layer-bucket.s3.eu-central-1.amazonaws.com bucket to download a container image layer, but the request instead reached prod-eu-central-1-starport-layer-bucket.s3.amazonaws.com, which causes the certificate mismatch.
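To illustrate why the certificate in the error is rejected, here is a minimal sketch of the usual wildcard rule, in which a leading "*." stands for exactly one DNS label. The function name san_matches is hypothetical, and the matching is simplified compared with what a real TLS client does. Under that rule, neither name on the certificate covers the regional hostname, while the wildcard would cover the non-regional one.

```shell
# Return 0 if HOST ($1) is covered by certificate name PATTERN ($2), else 1.
# Simplified rule (an assumption of this sketch): a leading "*." matches
# exactly one non-empty DNS label; anything else must match exactly.
san_matches() {
  _host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  _pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$_pattern" in
    '*.'*)
      _suffix=${_pattern#'*'}            # e.g. ".s3.amazonaws.com"
      case "$_host" in
        *"$_suffix") _label=${_host%"$_suffix"} ;;
        *) return 1 ;;                   # host does not end with the suffix
      esac
      case "$_label" in
        ''|*.*) return 1 ;;              # empty, or more than one label
        *) return 0 ;;
      esac
      ;;
    *)
      [ "$_host" = "$_pattern" ]
      ;;
  esac
}

# Hostname and certificate names copied from the error message.
requested="prod-eu-central-1-starport-layer-bucket.s3.eu-central-1.amazonaws.com"
for name in '*.s3.amazonaws.com' 's3.amazonaws.com'; do
  if san_matches "$requested" "$name"; then
    echo "$name covers $requested"
  else
    echo "$name does not cover $requested"
  fi
done
```

Both names are reported as not covering the regional hostname, because it has two extra labels (s3.eu-central-1) where the wildcard allows only one.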

It looks like something in your environment manipulates the DNS.

  1. Are you using VPC endpoints to reach ECR and S3 in the VPC where your ECS cluster is deployed?
  2. Check whether you have a Route 53 private hosted zone attached to your VPC with a CNAME record that performs this redirection:
aws route53 list-hosted-zones-by-vpc \
    --vpc-id <your-vpc-id> \
    --vpc-region <your-vpc-region>
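Beyond listing hosted zones, it may help to look at what the name resolves to and which certificate is served. The following is a rough sketch, assuming you can run commands on a host in the same VPC and subnets as the tasks (for example an EC2 instance) and that dig and openssl are installed. The function names are hypothetical. Treating a private address as a sign that something inside the VPC answers for the name (such as an interface endpoint or a private hosted zone) is an assumption, not a definitive diagnosis.

```shell
# Hostname taken from the error message in the question.
LAYER_HOST="prod-eu-central-1-starport-layer-bucket.s3.eu-central-1.amazonaws.com"

# Print the records the local resolver returns for a name.
resolve_host() {
  dig +short "$1"
}

# Print the subject alternative names of the certificate served for a name.
show_cert_sans() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name'
}

# Return 0 if an IPv4 address lies in an RFC 1918 private range.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Resolve the layer bucket hostname, flag private addresses,
# then show which names the served certificate is valid for.
diagnose() {
  resolve_host "$LAYER_HOST" | while read -r answer; do
    if is_private_ipv4 "$answer"; then
      echo "$answer (private address)"
    else
      echo "$answer"
    fi
  done
  show_cert_sans "$LAYER_HOST"
}
```

Calling diagnose from that host prints the resolved records followed by the certificate names. Since the failure is intermittent, a single run may well look healthy, so repeating it over time is likely to be more informative.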
AWS EXPERT
answered a month ago
EXPERT
reviewed a month ago
  1. Yes, we are.
  2. We do have a couple of hosted zones, but both are public.

I'm sorry I can't provide more information. I've looked around and haven't found anything about our setup that differs from the documentation pages I've read. We seem to have a very basic ECS and ECR setup.

Kevin
answered 24 days ago
