Hello.
I set up EventBridge Scheduler and AWS Batch in my AWS account to run a Batch job every 5 minutes, but the revision number of the ECS task definition stayed at 1.
Could you share the settings you are using in EventBridge?
Below is the configuration of a sample EventBridge Scheduler schedule that I created.
aws scheduler get-schedule --name batch-test
{
    "ActionAfterCompletion": "NONE",
    "Arn": "arn:aws:scheduler:ap-northeast-1:111111111111:schedule/default/batch-test",
    "CreationDate": "2024-10-25T00:35:42.536000+00:00",
    "Description": "",
    "FlexibleTimeWindow": {
        "Mode": "OFF"
    },
    "GroupName": "default",
    "LastModificationDate": "2024-10-25T00:35:42.536000+00:00",
    "Name": "batch-test",
    "ScheduleExpression": "rate(5 minutes)",
    "ScheduleExpressionTimezone": "Asia/Tokyo",
    "State": "ENABLED",
    "Target": {
        "Arn": "arn:aws:scheduler:::aws-sdk:batch:submitJob",
        "Input": "{\n \"JobDefinition\": \"arn:aws:batch:ap-northeast-1:111111111111:job-definition/getting-started-wizard-job-definition:1\",\n \"JobName\": \"getting-started-wizard-job-test\",\n \"JobQueue\": \"arn:aws:batch:ap-northeast-1:111111111111:job-queue/getting-started-wizard-job-queue\"\n}",
        "RetryPolicy": {
            "MaximumEventAgeInSeconds": 86400,
            "MaximumRetryAttempts": 185
        },
        "RoleArn": "arn:aws:iam::111111111111:role/service-role/Amazon_EventBridge_Scheduler"
    }
}
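If it helps to compare schedules, here is a minimal Python sketch that reads a get-schedule response like the one above and reports which job definition revision the target pins and whether the input carries container overrides. The function name and the returned keys are my own choices for illustration, and the sample dictionary is a trimmed copy of the output above.

```python
import json


def job_definition_from_schedule(schedule: dict) -> dict:
    """Summarize the Batch target of an EventBridge Scheduler schedule."""
    # Target.Input is itself a JSON document stored as a string.
    target_input = json.loads(schedule["Target"]["Input"])
    job_definition = target_input["JobDefinition"]
    # A name or ARN ending in ":<digits>" pins one specific revision.
    _, _, suffix = job_definition.rpartition(":")
    return {
        "job_definition": job_definition,
        "revision": int(suffix) if suffix.isdigit() else None,
        "has_overrides": "ContainerOverrides" in target_input,
    }


# Trimmed copy of the get-schedule output shown above.
schedule = {
    "Target": {
        "Arn": "arn:aws:scheduler:::aws-sdk:batch:submitJob",
        "Input": json.dumps({
            "JobDefinition": "arn:aws:batch:ap-northeast-1:111111111111:job-definition/getting-started-wizard-job-definition:1",
            "JobName": "getting-started-wizard-job-test",
            "JobQueue": "arn:aws:batch:ap-northeast-1:111111111111:job-queue/getting-started-wizard-job-queue",
        }),
    }
}

result = job_definition_from_schedule(schedule)
print(result)
```

You could feed it the saved output of `aws scheduler get-schedule --name batch-test` after loading it with `json.load`.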
Also, if you search CloudTrail's event history for "RegisterTaskDefinition", you can find the events that register new task definition revisions, which may help you identify the cause.
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/tutorial-event-history.html
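As a rough sketch of that search done from a script instead of the console: the code below uses boto3's CloudTrail `lookup_events` paginator to fetch events by name, then counts registrations per caller and task definition family. I am assuming that the `CloudTrailEvent` payload carries the family under `requestParameters.family`, so check that against one of your own events. The sample events are made up for illustration.

```python
import json
from collections import Counter


def fetch_events(event_name: str, region: str) -> list:
    """Fetch CloudTrail events with the given name (needs boto3 and credentials)."""
    import boto3  # imported lazily so count_registrations works without boto3

    client = boto3.client("cloudtrail", region_name=region)
    paginator = client.get_paginator("lookup_events")
    events = []
    for page in paginator.paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": event_name}]
    ):
        events.extend(page["Events"])
    return events


def count_registrations(events: list) -> Counter:
    """Count events per (username, task definition family)."""
    counts = Counter()
    for event in events:
        detail = json.loads(event.get("CloudTrailEvent") or "{}")
        # Assumption: the family name sits in requestParameters.family.
        params = detail.get("requestParameters") or {}
        counts[(event.get("Username", "unknown"), params.get("family", "unknown"))] += 1
    return counts


# Hypothetical events in the shape returned by lookup_events.
sample_events = [
    {"EventName": "RegisterTaskDefinition", "Username": "aws-batch",
     "CloudTrailEvent": json.dumps({"requestParameters": {"family": "my-app"}})},
    {"EventName": "RegisterTaskDefinition", "Username": "aws-batch",
     "CloudTrailEvent": json.dumps({"requestParameters": {"family": "my-app"}})},
    {"EventName": "RegisterTaskDefinition", "Username": "deploy-role",
     "CloudTrailEvent": json.dumps({"requestParameters": {"family": "other-app"}})},
]

counts = count_registrations(sample_events)
for (username, family), total in counts.most_common():
    print(username, family, total)
```

Against a real account you would call `count_registrations(fetch_events("RegisterTaskDefinition", "ap-northeast-1"))`.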
The behavior you're experiencing with AWS Batch creating new ECS Task Definition revisions for each job run is not typical and suggests there might be a misconfiguration in your setup.
Normally, AWS Batch should use an existing Task Definition revision rather than creating a new one each time a job runs. The Task Definition specifies the container properties, resource requirements, and other configurations for your batch job. It should remain consistent unless you intentionally update it.
There are a few potential reasons why this might be happening:
- Your EventBridge rule or AWS Batch job definition might be configured to create a new Task Definition revision each time it runs. Check your EventBridge rule configuration and Batch job definition to ensure they're not set to generate new revisions.
- If you're using Terraform or another Infrastructure as Code tool to manage your AWS resources, it's possible that your code is set to create a new Task Definition revision on each apply or run.
- There might be a misunderstanding in how you're specifying the Task Definition in your Batch job or EventBridge rule. Instead of referencing a specific revision, you should typically reference the Task Definition family without a revision number.
To resolve this issue and use an existing Task Definition revision:
- In your AWS Batch job definition or EventBridge rule, make sure you're specifying the Task Definition family name without a revision number. This allows AWS Batch to use the latest active revision automatically.
- Review your EventBridge scheduler configuration. Ensure it's not set to create a new Task Definition revision each time it runs.
- If you're using Terraform or another IaC tool, check your code to make sure it's not inadvertently creating new revisions.
- Consider using the AWS Batch console or CLI to manually deactivate unnecessary Task Definition revisions to clean up your environment.
By making these adjustments, you should be able to prevent the creation of unnecessary Task Definition revisions and have AWS Batch use an existing revision for your nightly process.
Sources
EventBridge scheduler schedules wrong ECS task definition | AWS re:Post
AWS Batch Features
According to AWS Support the answer is:
"When a new version of the JobDefinition is created, then during the next “SubmitJob” API Call, AWS Batch automatically creates a new version of the associated TaskDefinition to ensure that the task definition remains compatible with the updated job definition. This behavior is by design and is a fundamental aspect of AWS Batch's versioning system."
This kinda makes sense I guess, but it's still a bit surprising. I didn't realize that ECS Tasks and AWS Batch were so closely related. In fact, the only relation that I can find is that the TaskDefinition and JobDefinition share the same container image.
Part of my confusion on this was that my ECS TaskDefinition and Batch JobDefinition resources had the same name (e.g. "my-app"). So, every time Batch updated the definition, it "overwrote" the TaskDefinition by creating a new revision under that name. This was super frustrating and not very clear from the documentation I found.
One possible solution is to simply use different names: one for the ECS definition (e.g. "my-app") and another for the Batch definition (e.g. "batch-my-app"). At least it keeps ECS from trying to execute "batch" definitions.
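To spot this kind of name clash up front, something like the sketch below might do. It only compares two lists of names that you supply yourself, for example taken from `aws ecs list-task-definition-families` and `aws batch describe-job-definitions`. Both helper names and the `batch-` prefix are just illustrations of the naming idea above, not anything AWS requires.

```python
def shared_names(ecs_families, batch_job_definition_names):
    """Return names used both as an ECS family and as a Batch job definition."""
    return sorted(set(ecs_families) & set(batch_job_definition_names))


def batch_name_for(app_name, prefix="batch-"):
    """Suggest a Batch job definition name that cannot clash with the ECS family."""
    return app_name if app_name.startswith(prefix) else prefix + app_name


# Hypothetical names for illustration.
ecs_families = ["my-app", "web"]
batch_job_definitions = ["my-app", "report"]

clashes = shared_names(ecs_families, batch_job_definitions)
for name in clashes:
    print(name, "->", batch_name_for(name))
```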
My configuration is similar but the Target:Input includes container overrides. Maybe that is causing the new task definitions to be defined?
That was a good tip to look for "RegisterTaskDefinition" in CloudTrail. I found the events that were causing the problem quickly, and indeed it appears to be the "aws-batch" user.
Assuming that it's the container overrides that are causing the duplicate Task Definitions, it's interesting that none of the overrides actually appear in the new Task Definitions. I'm stumped!
I tried running AWS Batch by setting "ContainerOverrides" in my AWS account, but no new revisions were created in the ECS task definition. Is it possible for you to share the details of the "RegisterTaskDefinition" event recorded in CloudTrail?
@Riku_Kobayashi I updated the question with more details. Hopefully that helps resolve the issue.
Isn't the AWS Batch job definition being re-registered with "RegisterJobDefinition" each time, before "RegisterTaskDefinition" occurs? If the job definition is updated on every run, the task definition will be updated on every run as well. I suspect "RegisterJobDefinition" is the cause, so I would look for that event in CloudTrail to find out who is issuing it. For example, do you have some kind of process that updates the job definition when batch processing ends?
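One way to test that theory, sketched in Python: take the CloudTrail events for both names, sort them by time, and pair each "RegisterTaskDefinition" with the "RegisterJobDefinition" that came before it, if there was one. The events here are hypothetical dictionaries with the top-level `EventName`, `EventTime` and `Username` fields that CloudTrail event lookups return. The "deploy-role" username is invented.

```python
from datetime import datetime, timezone


def pair_registrations(events):
    """Pair each RegisterTaskDefinition with the preceding RegisterJobDefinition.

    Returns a list of (task_event, job_event_or_None) tuples in time order.
    """
    pairs = []
    last_job_registration = None
    for event in sorted(events, key=lambda e: e["EventTime"]):
        if event["EventName"] == "RegisterJobDefinition":
            last_job_registration = event
        elif event["EventName"] == "RegisterTaskDefinition":
            pairs.append((event, last_job_registration))
            # Each job registration is matched to at most one task registration.
            last_job_registration = None
    return pairs


def at(minute):
    """Build a timestamp on an arbitrary example day."""
    return datetime(2024, 10, 25, 0, minute, tzinfo=timezone.utc)


# Hypothetical events, deliberately out of order.
sample_events = [
    {"EventName": "RegisterTaskDefinition", "EventTime": at(5), "Username": "aws-batch"},
    {"EventName": "RegisterJobDefinition", "EventTime": at(4), "Username": "deploy-role"},
    {"EventName": "RegisterTaskDefinition", "EventTime": at(10), "Username": "aws-batch"},
]

pairs = pair_registrations(sample_events)
for task_event, job_event in pairs:
    caller = job_event["Username"] if job_event else "no preceding RegisterJobDefinition"
    print(task_event["EventTime"].isoformat(), caller)
```

If nearly every task registration has a matching job registration from the same caller, that caller is probably the process to look at.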