Hi Tharunkumar,
Please try the solution below; I hope it helps resolve your issue.
Implement an ECS Task Rebalancer:
- Create a Lambda Function: This function will check the task placement and stop tasks that need to be redistributed.
- Invoke Lambda Function Periodically: Use CloudWatch Events to trigger the Lambda function at regular intervals.
- CloudFormation Template: Use a CloudFormation template to create the Lambda function and set up the CloudWatch Event rule.
Lambda Function (Python)
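This function lists the tasks in your ECS service, groups them by container instance, and stops the task on any instance that is running only one task, so that the service scheduler can launch a replacement according to the placement strategy: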
import boto3

ecs_client = boto3.client('ecs')

def lambda_handler(event, context):
    cluster_name = 'your-cluster-name'
    service_name = 'your-service-name'

    # List the service's tasks (a single call returns at most 100 task ARNs)
    tasks = ecs_client.list_tasks(cluster=cluster_name, serviceName=service_name)['taskArns']
    if not tasks:
        # describe_tasks rejects an empty task list, so return early
        return {'statusCode': 200, 'body': 'No tasks found'}

    # Describe tasks
    tasks_details = ecs_client.describe_tasks(cluster=cluster_name, tasks=tasks)['tasks']

    # Group task ARNs by container instance ARN
    tasks_by_instance = {}
    for task in tasks_details:
        instance_id = task.get('containerInstanceArn')
        if instance_id is None:
            continue  # tasks not placed on a container instance have no ARN here
        tasks_by_instance.setdefault(instance_id, []).append(task['taskArn'])

    # Example logic: stop tasks from under-utilized instances
    for instance_id, task_arns in tasks_by_instance.items():
        if len(task_arns) == 1:  # Adjust this threshold based on your binpack strategy
            ecs_client.stop_task(cluster=cluster_name, task=task_arns[0])

    return {
        'statusCode': 200,
        'body': 'Rebalanced tasks'
    }
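The example logic above stops every lone task in a single run, which could stop all of a service's tasks at once if each instance happens to hold one task. As a rough sketch of a more cautious selection step, the helper below caps how many tasks are picked per run and does nothing when there is only one instance. The names pick_tasks_to_stop, sparse_threshold, and max_stops are hypothetical and are not part of the original answer or of boto3; the input is assumed to be the 'tasks' list returned by describe_tasks.

```python
from collections import defaultdict


def group_by_instance(tasks):
    """Map container instance ARN -> list of task ARNs.

    `tasks` is assumed to look like the 'tasks' list from describe_tasks.
    Tasks without a containerInstanceArn are skipped.
    """
    groups = defaultdict(list)
    for task in tasks:
        instance_arn = task.get("containerInstanceArn")
        if instance_arn:
            groups[instance_arn].append(task["taskArn"])
    return dict(groups)


def pick_tasks_to_stop(tasks, sparse_threshold=1, max_stops=1):
    """Return at most `max_stops` task ARNs from sparsely used instances.

    Hypothetical policy (an assumption, not an ECS rule): an instance is
    "sparse" if it runs `sparse_threshold` tasks or fewer. Nothing is
    selected when fewer than two instances are in use, because the task
    would have nowhere else to go.
    """
    groups = group_by_instance(tasks)
    if len(groups) < 2:
        return []
    # Emptiest instances first, so they are the first to be freed up.
    sparse = sorted(
        (arns for arns in groups.values() if len(arns) <= sparse_threshold),
        key=len,
    )
    candidates = [arn for arns in sparse for arn in arns]
    return candidates[:max_stops]
```

In the handler, the returned ARNs could be passed one by one to ecs_client.stop_task in place of the loop over tasks_by_instance. Whether a cap of one task per run suits you depends on how quickly your service replaces stopped tasks.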
CloudFormation Template
This template sets up the Lambda function and the CloudWatch Event rule to trigger it periodically:
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  MyLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: LambdaExecutionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                  - ecs:ListTasks
                  - ecs:DescribeTasks
                  - ecs:StopTask
                Resource: '*'
  MyLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: MyRebalanceFunction
      Handler: index.lambda_handler
      Role: !GetAtt MyLambdaExecutionRole.Arn
      Code:
        ZipFile: |
          import boto3

          ecs_client = boto3.client('ecs')

          def lambda_handler(event, context):
              cluster_name = 'your-cluster-name'
              service_name = 'your-service-name'
              # List tasks
              tasks = ecs_client.list_tasks(cluster=cluster_name, serviceName=service_name)['taskArns']
              if not tasks:
                  # describe_tasks rejects an empty task list, so return early
                  return {'statusCode': 200, 'body': 'No tasks found'}
              # Describe tasks
              tasks_details = ecs_client.describe_tasks(cluster=cluster_name, tasks=tasks)['tasks']
              # Group tasks by instance
              tasks_by_instance = {}
              for task in tasks_details:
                  instance_id = task.get('containerInstanceArn')
                  if instance_id is None:
                      continue
                  tasks_by_instance.setdefault(instance_id, []).append(task['taskArn'])
              # Example logic: Stop tasks from under-utilized instances
              for instance_id, task_arns in tasks_by_instance.items():
                  if len(task_arns) == 1:  # Adjust this threshold based on your binpack strategy
                      ecs_client.stop_task(cluster=cluster_name, task=task_arns[0])
              return {
                  'statusCode': 200,
                  'body': 'Rebalanced tasks'
              }
      Runtime: python3.12
  CloudWatchEventRule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: rate(5 minutes)
      Targets:
        - Arn: !GetAtt MyLambdaFunction.Arn
          Id: "TargetFunctionV1"
  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref MyLambdaFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt CloudWatchEventRule.Arn
Please refer to the following AWS documentation for the services involved:
1. AWS Lambda
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-function.html
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-permission.html
2. AWS IAM
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-iam-role.html
3. Amazon CloudWatch Events
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-events-rule.html
4. Amazon ECS
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ListTasks.html
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_DescribeTasks.html
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_StopTask.html
Are you using a Capacity Provider for the ASG? If so, do you have the Managed Termination Protection feature enabled? It prevents instances from being scaled in as long as any replica tasks are running on them.
If you want tasks to be killed and replaced on new instances so that they binpack better, disable Managed Termination Protection, enable Managed Draining instead, and set the target capacity value to 100.
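The capacity provider change suggested above could be sketched with boto3 roughly as follows. This assumes an existing Auto Scaling group capacity provider and a boto3 version recent enough to accept the managedDraining field. The helper names and the provider name 'my-capacity-provider' are placeholders, not values from this thread.

```python
def build_capacity_provider_update(name, target_capacity=100):
    """Build keyword arguments for the ECS UpdateCapacityProvider call.

    Turns managed termination protection off, turns managed draining on,
    and sets the managed scaling target capacity (1-100).
    """
    if not 1 <= target_capacity <= 100:
        raise ValueError("target_capacity must be between 1 and 100")
    return {
        "name": name,
        "autoScalingGroupProvider": {
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": target_capacity,
            },
            "managedTerminationProtection": "DISABLED",
            "managedDraining": "ENABLED",
        },
    }


def apply_capacity_provider_update(name, target_capacity=100):
    """Send the update to ECS (requires AWS credentials and permissions)."""
    import boto3  # imported here so the builder above works without boto3

    ecs_client = boto3.client("ecs")
    kwargs = build_capacity_provider_update(name, target_capacity)
    return ecs_client.update_capacity_provider(**kwargs)


# Example (placeholder name, not run automatically):
# apply_capacity_provider_update("my-capacity-provider")
```

The same settings can also be changed in the console or in a CloudFormation AWS::ECS::CapacityProvider resource; the script is only one way to apply them.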
I did this, but it is not working.