What is the best strategy for determining ECS Task CPU size and number of tasks while minimizing latency and costs?


Background

I have a containerized web app that I want to deploy with ECS Fargate. With one user, the program runs well on 1 task with 1 vCPU. I expect the number of concurrent users to rise to about 50 on one particular day. I believe that adding a load balancer and enabling ECS autoscaling is enough to reliably handle any increase in load. However, I have no experience here and want to be sure my architecture design method is adequate.

Current Method

As described above, I designed my method as follows:

  1. Determine the minimum number of resources necessary to minimize latency for one user.
  2. Set the ECS Task Definition to the resources discovered in step one.
  3. Add a load balancer to the ECS service.
  4. Enable ECS autoscaling based on the limiting factor (CPU utilization, Memory, etc.)
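
Step 4 could be expressed as a target tracking policy in Application Auto Scaling. The sketch below only builds the request parameters as plain dictionaries and does not call AWS. The cluster name, service name, capacity bounds, target value, and cooldowns are placeholder assumptions for illustration, not recommendations.

```python
def build_scaling_requests(cluster, service, min_tasks, max_tasks,
                           cpu_target_percent):
    """Build parameters for an ECS service target tracking policy.

    All argument values are chosen by the caller; nothing here is
    derived from a real deployment.
    """
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    # Bounds on how many tasks the service may run.
    scalable_target = {
        **common,
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }
    # Keep average CPU utilization of the service near the target.
    scaling_policy = {
        **common,
        "PolicyName": f"{service}-cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(cpu_target_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
            # Cooldowns in seconds; assumed values, tune after load testing.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }
    return scalable_target, scaling_policy


# Hypothetical names and limits, for illustration only.
target, policy = build_scaling_requests("demo-cluster", "web-app", 1, 10, 70)
```

With boto3, these dictionaries would be passed to the `application-autoscaling` client as `register_scalable_target(**target)` followed by `put_scaling_policy(**policy)`. If memory is the limiting factor, `ECSServiceAverageMemoryUtilization` is the corresponding predefined metric.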

My Question

If my method above has flaws, please comment for correction or directly answer the question in the title:

What is the best strategy for determining ECS Task CPU size and number of tasks while minimizing latency and costs?

1 Answer

Hello there,

I understand that you are asking about best practices for allocating CPU to your ECS tasks efficiently while optimising costs.

================

  • [+] To begin with, your workflow is well planned and conceptually sound. I would still recommend adding one more step: monitor resource utilisation and application behaviour under a higher number of users. Scale the number of tasks and the memory/CPU values up and down, and observe whether the improvement in latency is worth the extra resource cost. This kind of testing is the only reliable way to determine suitable resource requirements, because they are application dependent and there are no predetermined values or methodology for them. [1]

  • [+] Furthermore, I would set the task definition resource values to a constant, fairly small size. Choose a value that ensures your resources are not heavily underutilised at any time, but that is not so small that you have to scale the number of tasks too often. Scaling too often can lead to disruptions, because tasks need time to register with and deregister from your ELB. Striking that balance is the tough part. Other than that, let service autoscaling handle the load by scaling accordingly. [2]

For example, suppose you expect between 1 and 50 concurrent users, and one task with 1 vCPU and 2 GB of memory can serve about 10 concurrent users before resource constraints call for a new task. A reasonable strategy would be to keep 1 vCPU and 2 GB of memory as the task limits and scale out the number of tasks at somewhere between 65% and 75% vCPU and/or memory usage. This avoids overpaying while leaving headroom to scale out at short notice when more concurrent users arrive.
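
The arithmetic behind this example can be sketched as follows. This is a rough estimate under stated assumptions: 10 concurrent users is taken as the point where one task is fully loaded, load grows linearly with the number of users, and the load balancer spreads users evenly across tasks. The function name and parameters are illustrative only.

```python
def tasks_needed(concurrent_users, users_per_full_task, target_percent):
    """Estimate the task count that keeps utilization at or below the target.

    Assumes load is linear in users and evenly spread across tasks.
    target_percent is an integer such as 70 (meaning 70%).
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range 1..100")
    # At the target, each task serves users_per_full_task * target_percent / 100
    # users. Integer ceiling division avoids floating-point rounding surprises.
    numerator = concurrent_users * 100
    denominator = users_per_full_task * target_percent
    return max(1, -(-numerator // denominator))


# Task counts at the 50-user peak for the 65-75% target range.
for percent in (65, 70, 75):
    print(percent, tasks_needed(50, 10, percent))
```

Under these assumptions, the 65-75% range corresponds to 7 or 8 tasks at the 50-user peak, compared with 5 tasks if every task ran fully loaded. The difference is the headroom being paid for; real behaviour should be confirmed with load testing.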

  • [+] Moreover, there are still limitations. Fargate does not cache images, so a large image can cause longer task startup times (much longer for Windows images) and can cause disruptions. If your images are small, you can skip this; otherwise, there are several opportunities for improvement:
  1. Reduce the size of the Docker image, if possible.
  2. Use a VPC endpoint for the ECR registry.
  3. Note that network throughput depends on the CPU allocated to the Fargate task, i.e. more vCPU means more network bandwidth for faster downloads.
  4. Store the Docker images in the same region as the cluster/task.

Check the attached documentation regarding ECS task launch time. [3]

  • [+] Additionally, consider a few other things. If you have a steady or predictable workload, EC2 can be more cost-effective than Fargate. Identifying the correct metric and threshold for scaling the tasks can also have a significant impact on costs; Container Insights can help you analyse resource utilisation, throughput and other metrics. You could also check whether metrics such as "ActiveConnectionCount" for your ALB can be used for scaling. I would strongly recommend going through the best-practices documentation published by AWS, which provides good insight into how to approach optimising your deployment. [4]

================

[1] Cost Optimisation Checklist for Amazon ECS and AWS Fargate: https://aws.amazon.com/blogs/containers/cost-optimization-checklist-for-ecs-fargate/

[2] How Amazon ECS manages CPU and memory resources: https://aws.amazon.com/blogs/containers/how-amazon-ecs-manages-cpu-and-memory-resources/

[3] Optimise Amazon ECS task launch time: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-recommendations.html

[4] Optimise Amazon ECS service auto scaling: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/capacity-autoscaling.html

[5] Amazon ECS best practices: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-best-practices.html

AWS
answered 2 months ago
