Why does my EventBridge Scheduler schedule run but usually not start my ECS task?

I have an EventBridge Scheduler schedule that has an ECS task as its target. The schedule uses the cron expression cron(8 0 ? * 2-6 *), evaluated in UTC. I can see Scheduler invocation attempts in CloudWatch metrics, but my ECS task only starts once or twice a week. There are no errors on the days the task fails to start. Happy to share any other info, but it would be nice if someone at AWS would just take a look at my setup and see if they can tell what the problem is.

2 Answers

Based on your description, it sounds like your ECS tasks might be failing to start due to capacity unavailability in AWS Fargate. This is a common issue that doesn't always generate visible errors in your standard monitoring.

When Fargate capacity is unavailable in a particular availability zone at the time your schedule triggers, the RunTask API call can fail with a reason like "Capacity is unavailable at this time. Please try again later or in a different availability zone." However, this failure might not be immediately visible in your standard CloudWatch metrics.
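
As a rough illustration of what such a failure looks like, here is a minimal sketch that scans a RunTask-style response for capacity-related failure reasons. It assumes the response is a dict with `tasks` and `failures` lists, as returned by the ECS RunTask API. The helper name `capacity_failures`, the hint strings, and the sample response are hypothetical and only for illustration.

```python
# Substrings that suggest a capacity problem. Both come from the reasons quoted
# in this thread; treat the list as an assumption, not an exhaustive catalogue.
CAPACITY_HINTS = ("capacity is unavailable", "resource:fargate")


def capacity_failures(run_task_response):
    """Return the failure entries whose reason looks capacity related."""
    failures = run_task_response.get("failures", [])
    return [
        failure
        for failure in failures
        if any(hint in (failure.get("reason") or "").lower() for hint in CAPACITY_HINTS)
    ]


# Hypothetical response: no task started, one capacity failure, one unrelated failure.
sample_response = {
    "tasks": [],
    "failures": [
        {
            "reason": "Capacity is unavailable at this time. "
            "Please try again later or in a different availability zone."
        },
        {"reason": "MISSING"},
    ],
}

for failure in capacity_failures(sample_response):
    print("Capacity-related failure:", failure["reason"])
```

An empty `tasks` list combined with a non-empty `failures` list means the call itself was accepted but no task was placed, which would be consistent with a schedule that fires without a task ever starting.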

To troubleshoot and resolve this issue:

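  1. Set up an EventBridge rule to capture ECS service action events, specifically SERVICE_TASK_PLACEMENT_FAILURE events with RESOURCE:FARGATE as the reason. Note that ECS emits these events for tasks managed by a service; a standalone task launched directly by the schedule may not produce them, so it is also worth configuring a dead-letter queue on the schedule to capture failed invocations.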

  2. Configure a retry mechanism for when capacity is unavailable. You could use AWS Step Functions with EventBridge integration to implement retry logic that can:
     • Wait and retry the task launch
     • Try different subnets to target different availability zones

  3. Check your network configuration:
     • If using awsvpc network mode, ensure proper NetworkConfiguration
     • For tasks needing internet access, either use public subnets with "Auto-assign public IP" enabled or private subnets with a NAT Gateway

  4. If you're using the EventBridge UI to configure your schedule, there might be UI issues with the "Auto-assign public IP" option. Some users have reported only seeing the "Disabled" option. You can work around this by:
     • Using the "Universal target definition" option and manually editing the JSON
     • Adding necessary parameters like "LaunchType": "FARGATE" and setting the public IP assignment

  5. Monitor CloudWatch metrics like InvocationAttemptCount and TargetErrorCount to verify if your schedule is triggering and if invocations are failing.
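
The retry idea from the list above (wait and retry, then try other subnets) can be sketched in plain Python. This is only an outline of the control flow, not a Step Functions definition. The function `run_with_retry`, its parameters, the subnet IDs, and the fake launcher are all hypothetical. In practice `run_task` would wrap a real ECS RunTask call that returns the usual `tasks` and `failures` lists.

```python
import time


def run_with_retry(run_task, subnet_groups, attempts_per_group=2,
                   base_delay=30.0, sleep=time.sleep):
    """Try each subnet group in turn, backing off exponentially between attempts.

    run_task: callable that takes a list of subnet IDs and returns a
              RunTask-style dict with "tasks" and "failures" keys.
    subnet_groups: list of subnet ID lists, ideally one group per availability zone.
    """
    total_attempts = len(subnet_groups) * attempts_per_group
    tried = 0
    for subnets in subnet_groups:
        for attempt in range(attempts_per_group):
            response = run_task(subnets)
            tried += 1
            if response.get("tasks"):
                return response  # at least one task was placed
            if tried < total_attempts:
                # Wait before the next attempt: base_delay, then double it.
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError("task did not start in any subnet group")


# Hypothetical launcher: the first subnet has no capacity, the second one does.
def fake_run_task(subnets):
    if subnets == ["subnet-aaa"]:
        return {"tasks": [], "failures": [{"reason": "Capacity is unavailable at this time."}]}
    return {"tasks": [{"taskArn": "arn:aws:ecs:example"}], "failures": []}


waits = []  # record the delays instead of actually sleeping
result = run_with_retry(fake_run_task, [["subnet-aaa"], ["subnet-bbb"]], sleep=waits.append)
print("started:", result["tasks"][0]["taskArn"], "after waits", waits)
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays. The same shape maps onto a Step Functions state machine with a Retry block and a fallback state per subnet group.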

Since your schedule works occasionally but fails most of the time without errors, this strongly suggests a capacity or networking issue rather than a scheduling problem.
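
For the networking side, the universal target workaround mentioned in the list above might look roughly like the sketch below. It builds the target definition as a plain dict. The sketch assumes that the universal target ARN follows the `arn:aws:scheduler:::aws-sdk:ecs:runTask` form and that the input takes RunTask parameters in PascalCase, consistent with the "LaunchType": "FARGATE" example. Verify both against the EventBridge Scheduler documentation before relying on them. The function name and every ARN, subnet, and security group value are placeholders.

```python
import json


def build_ecs_universal_target(cluster_arn, task_definition_arn, subnets,
                               security_groups, role_arn, assign_public_ip=True):
    """Build a Scheduler target dict that calls ECS RunTask as a universal target."""
    run_task_input = {
        "Cluster": cluster_arn,
        "TaskDefinition": task_definition_arn,
        "LaunchType": "FARGATE",
        "NetworkConfiguration": {
            "AwsvpcConfiguration": {
                "Subnets": list(subnets),
                "SecurityGroups": list(security_groups),
                # Public subnets without a NAT Gateway need a public IP for internet access.
                "AssignPublicIp": "ENABLED" if assign_public_ip else "DISABLED",
            }
        },
    }
    return {
        "Arn": "arn:aws:scheduler:::aws-sdk:ecs:runTask",  # assumed universal target ARN form
        "RoleArn": role_arn,
        "Input": json.dumps(run_task_input),  # Scheduler expects the input as a JSON string
    }


# Placeholder values only; substitute your own resources.
target = build_ecs_universal_target(
    cluster_arn="arn:aws:ecs:us-east-1:111122223333:cluster/example-cluster",
    task_definition_arn="arn:aws:ecs:us-east-1:111122223333:task-definition/example-task",
    subnets=["subnet-aaa", "subnet-bbb"],
    security_groups=["sg-example"],
    role_arn="arn:aws:iam::111122223333:role/example-scheduler-role",
)
print(json.dumps(target, indent=2))
```

Listing subnets from more than one availability zone in the same definition also gives Fargate more than one place to look for capacity.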
Sources
ECS task trigger from EventBridge, but ECS capacity unavailable | AWS re:Post
Troubleshooting Amazon EventBridge Scheduler - EventBridge Scheduler
AWS EventBridge issue when try to scehdule AWS ECS task to process | AWS re:Post

answered 2 months ago
EXPERT
reviewed 2 months ago

Are you running the ECS task on Spot capacity? If so, Spot capacity may be unavailable when the schedule fires.

EXPERT
answered 2 months ago
