
Automatically Select Cost-Effective Instances in AWS Batch with New Default Categories

7 minute read
Content level: Intermediate

AWS Batch introduces new default instance categories (default_x86_64 and default_arm64), eliminating manual instance type management. Organizations can leverage architecture-specific optimizations, including up to 40% cost savings with Graviton processors, while simplifying operations. Users can build multi-architecture images once and deploy them across separate compute environments, with Batch automatically selecting appropriate instances as new generations become available.

AWS Batch has introduced two new instance type categories (default_x86_64 and default_arm64) that make it easier for customers to run workloads on cost-effective, up-to-date Amazon EC2 instances. These options complement the existing 'optimal' category, which will continue to be supported. With this enhancement, AWS Batch customers gain more flexibility in selecting the right compute resources for their workloads, whether they use x86 or Arm-based processors. The new categories are just the starting point, though; getting the most value from them involves a few additional strategies:

  • Strategic implementation that can deliver up to 40% cost savings
  • Combining with Spot instances for up to 90% savings
  • Multi-architecture strategies that future-proof your workloads

What Changed?

Until now, AWS Batch supported a single shorthand category for instanceTypes: 'optimal'. The 'optimal' option maps to a static set of instance families in the M, C, and R series (for example, c5, m5, r5). While this has been convenient, it did not always align with customer expectations of accessing the latest, most cost-effective instances. It also excluded AWS Graviton-based families, which offer strong price-performance benefits.

With the launch of the new categories, default_x86_64 and default_arm64, AWS Batch now automatically selects from a pool of the latest generation of instance families, depending on your workload's architecture. These pools will evolve over time, ensuring your workloads benefit from the newest EC2 offerings without manual updates.

How It Works

When you configure your Batch compute environment, you can now specify either 'default_x86_64' (the default) or 'default_arm64' in the instanceTypes parameter. These settings automatically select cost-effective instance types across families and generations that match your job queue requirements. For example, selecting default_arm64 directs AWS Batch to launch instances from AWS Graviton-based families such as m6g, c6g, r6g, or c7g, depending on availability in your Region. Meanwhile, default_x86_64 covers the latest Intel- and AMD-based families.

Note: At this time, AWS Batch does not support mixing architecture types within a single compute environment. You'll need separate compute environments for x86_64 and arm64 workloads, which can then be ordered within the same job queue for fallback scenarios.

To configure the new default categories through the AWS Management Console, navigate to AWS Batch > Compute environments > Create. In the instance types section, you can now select from the new default options:
[Console screenshot: Allowed instance types]
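
If you prefer to script the setup, the snippet below is a minimal sketch using the AWS CLI. The environment name, subnet, security group, and IAM ARNs are placeholders to replace with your own values.

# Minimal sketch: create an EC2 compute environment that uses the new
# default_arm64 category. All names, IDs, and ARNs are placeholders.
aws batch create-compute-environment \
  --compute-environment-name graviton-default-ce \
  --type MANAGED \
  --state ENABLED \
  --compute-resources '{
    "type": "EC2",
    "allocationStrategy": "BEST_FIT_PROGRESSIVE",
    "instanceTypes": ["default_arm64"],
    "minvCpus": 0,
    "maxvCpus": 256,
    "subnets": ["subnet-0123456789abcdef0"],
    "securityGroupIds": ["sg-0123456789abcdef0"],
    "instanceRole": "arn:aws:iam::123456789012:instance-profile/ecsInstanceRole"
  }' \
  --service-role arn:aws:iam::123456789012:role/AWSBatchServiceRole

A second environment using "instanceTypes": ["default_x86_64"] follows the same pattern.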

With compute environments configured to use the new default categories, the next step is preparing your applications for multi-architecture deployment. This foundation enables the advanced cost optimization strategies covered in subsequent sections.

Multi-Architecture Strategy

Since AWS Batch compute environments cannot mix architecture types, you'll need separate compute environments for x86_64 and ARM64 workloads. However, you can use the same multi-architecture container image across both environments.

Container Image Preparation

Building multi-architecture container images ensures compatibility with both x86 and ARM instances. The Dockerfile uses build arguments to detect the target platform and install architecture-specific optimizations.

# Pull the base image for the platform being built (not the build host),
# so each architecture variant contains matching binaries.
FROM --platform=$TARGETPLATFORM public.ecr.aws/docker/library/python:3.9-slim

ARG TARGETPLATFORM
ARG BUILDPLATFORM

# Select an architecture-appropriate math library. Note: libmkl-dev may not be
# available in the default Debian repositories of every base image release;
# enable the non-free component or substitute another BLAS package if needed.
RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
      echo "Installing ARM64 optimizations" && \
      apt-get update && apt-get install -y libopenblas-dev; \
    else \
      echo "Installing x86_64 optimizations" && \
      apt-get update && apt-get install -y libmkl-dev; \
    fi

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app.py .
CMD ["python", "app.py"]

The BUILDPLATFORM variable represents the architecture of the build machine, while TARGETPLATFORM specifies the architecture being built for (linux/amd64 or linux/arm64). Pinning the base image to $TARGETPLATFORM ensures each variant contains binaries for the architecture it will run on, and the conditional RUN step applies different optimizations: ARM64 uses OpenBLAS for mathematical operations, while x86_64 leverages Intel MKL for better performance.

Build and Push Multi-Arch Images

Docker buildx enables building images for multiple architectures simultaneously. This creates a single image manifest that automatically selects the correct architecture when pulled by the Batch jobs.

docker buildx create --use

docker buildx build --platform linux/amd64,linux/arm64 \
  -t your-account.dkr.ecr.region.amazonaws.com/your-app:latest \
  --push .

docker buildx imagetools inspect your-account.dkr.ecr.region.amazonaws.com/your-app:latest

The --platform flag specifies both target architectures, and the --push flag uploads directly to the registry. The final inspect command verifies that both architecture variants were successfully created and uploaded.

Job Definition Configuration

The same job definition works with both x86_64 and ARM64 compute environments - Docker automatically pulls the correct architecture variant:

{
  "jobDefinitionName": "multi-arch-job",
  "type": "container",
  "containerProperties": {
    "image": "your-account.dkr.ecr.region.amazonaws.com/your-app:latest",
    "vcpus": 2,
    "memory": 4096,
    "jobRoleArn": "arn:aws:iam::account:role/BatchJobRole"
  },
  "platformCapabilities": ["EC2"],
  "retryStrategy": {
    "attempts": 3
  }
}
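
Because the manifest covers both architectures, a single job queue can reference both compute environments, preferring Graviton and falling back to x86_64 as described earlier. Below is a minimal sketch with the AWS CLI, where both compute environment names are placeholders.

# Minimal sketch: order a Graviton environment ahead of an x86_64 environment
# in one job queue. Replace the names with your own compute environments.
aws batch create-job-queue \
  --job-queue-name multi-arch-queue \
  --state ENABLED \
  --priority 1 \
  --compute-environment-order \
    order=1,computeEnvironment=graviton-default-ce \
    order=2,computeEnvironment=x86-default-ce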

Cost Optimization

Spot Instance Integration

Spot instances provide up to 90% cost savings compared to On-Demand pricing. The SPOT_PRICE_CAPACITY_OPTIMIZED allocation strategy selects instance types with the lowest likelihood of interruption.

resource "aws_batch_compute_environment" "spot_default" {
  compute_environment_name = "spot-default-x86"
  type                    = "MANAGED"
  state                   = "ENABLED"

  compute_resources {
    type                = "SPOT"
    allocation_strategy = "SPOT_PRICE_CAPACITY_OPTIMIZED"
    instance_type      = ["default_x86_64"]
    spot_iam_fleet_request_role = aws_iam_role.aws_ec2_spot_fleet_role.arn

    # instance_role, subnets, and security_group_ids are also required
    # for a managed EC2/Spot environment; they are omitted here for brevity.
    
    bid_percentage = 50
    
    min_vcpus     = 0
    max_vcpus     = 1000
    desired_vcpus = 0
  }
}

The bid_percentage sets the maximum Spot price as a percentage of On-Demand pricing; setting it to 50 means you'll pay at most half the On-Demand price. Spot instances can be interrupted with a two-minute interruption notice, so ensure jobs are fault-tolerant or implement checkpointing for long-running tasks. One way to absorb interruptions is to retry jobs whose host instance was reclaimed, as sketched below.
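
If your tasks are idempotent, one common mitigation is to retry only when a job fails because its host instance was terminated. The sketch below re-registers the job definition from the previous section with such a retry strategy; treat it as a starting point rather than a drop-in configuration.

# Minimal sketch: retry when the status reason indicates the host EC2 instance
# was terminated (typical of Spot reclamation); exit on any other failure.
aws batch register-job-definition \
  --job-definition-name multi-arch-job \
  --type container \
  --container-properties '{
    "image": "your-account.dkr.ecr.region.amazonaws.com/your-app:latest",
    "vcpus": 2,
    "memory": 4096
  }' \
  --retry-strategy '{
    "attempts": 3,
    "evaluateOnExit": [
      {"onStatusReason": "Host EC2*", "action": "RETRY"},
      {"onReason": "*", "action": "EXIT"}
    ]
  }'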

Benefits

  • Simplified configuration – Use a single shorthand to stay current with evolving EC2 families.
  • Cost optimization – Automatically leverage newer, lower-cost, higher-performance instance generations.
  • Flexibility – Choose between Arm-based (Graviton) or x86-based compute depending on workload compatibility.
  • Future-proof – As AWS launches new instance types, they will automatically be included in the default pools.

Tracking Cost Savings

One of the biggest advantages of using the new default_x86_64 and default_arm64 categories is that AWS Batch will automatically place your workloads on newer, cost-effective instance types as they become available in your Region. To validate the benefits, you can track cost savings and performance improvements with a few simple steps:

  1. Use AWS Cost Explorer or CUR (Cost and Usage Reports) - Filter costs by service = "Amazon Elastic Compute Cloud - Compute" and usage type containing "Batch" to compare spend before and after enabling the new defaults. Look specifically at changes in average hourly rate per vCPU when jobs are placed on newer instance generations or Graviton-based families.
  2. Monitor per-job efficiency in CloudWatch - Batch automatically publishes metrics such as vCPU and memory usage. By combining job runtime with cost data, you can evaluate the cost per job. Many customers find that Graviton-based jobs complete faster and at a lower hourly price, resulting in a double savings effect.
  3. Compare workloads across architectures - Since you'll need separate compute environments for each architecture, you can run representative jobs in each and compare cost per output unit (e.g., cost per dataset processed, cost per simulation). This gives you concrete evidence of the value of adopting Graviton.
  4. Automate tracking with tags - Add cost allocation tags such as "Architecture=x86_64" or "Architecture=arm64" to your compute environments (see the sketch after this list). This makes it easy to track costs by architecture in Cost Explorer or custom dashboards.

By regularly monitoring these signals, you'll be able to show tangible improvements from adopting the new defaults. For many workloads, customers have observed up to 40% better price/performance when shifting suitable jobs to AWS Graviton processors.
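
The snippet below is a rough sketch of steps 1 and 4 with the AWS CLI; the compute environment ARN and dates are placeholders. Remember that a tag must be activated as a cost allocation tag in the Billing console before it appears in Cost Explorer, and that tags set under a compute environment's computeResources propagate to the EC2 instances Batch launches if you want instance-level attribution.

# Minimal sketch: tag a compute environment by architecture, then query EC2
# compute spend grouped by that tag. The ARN and dates are placeholders.
aws batch tag-resource \
  --resource-arn arn:aws:batch:us-east-1:123456789012:compute-environment/graviton-default-ce \
  --tags Architecture=arm64

aws ce get-cost-and-usage \
  --time-period Start=2025-09-01,End=2025-10-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter '{"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Elastic Compute Cloud - Compute"]}}' \
  --group-by Type=TAG,Key=Architecture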

Getting Started

To take advantage of the new instance categories, specify default_x86_64 or default_arm64 when creating or updating your compute environments. You don't need to create new compute environments; existing ones can be updated in place, and configurations using 'optimal' remain valid. The behavior of 'optimal' will stay the same until early November 2025; after that, 'optimal' will behave the same as the default_x86_64 instance category. For more information on troubleshooting AWS Batch environments, visit the AWS Batch Troubleshooting Guide.
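
For an existing environment, the update itself is a one-line change. The sketch below assumes a hypothetical environment name; depending on the environment's service role and update policy, changing instance types triggers a Batch infrastructure update.

# Minimal sketch: switch an existing compute environment from 'optimal'
# to the new default_x86_64 category. The environment name is a placeholder.
aws batch update-compute-environment \
  --compute-environment my-existing-ce \
  --compute-resources '{"instanceTypes": ["default_x86_64"]}'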

Conclusion

With these new instance type categories, AWS Batch makes it even easier to run cost-effective, high-performance workloads without manual intervention. Whether you choose x86 or Arm, your jobs will run on the most up-to-date and efficient EC2 instances available in your Region.
