EKS Auto Mode: Karpenter could not find available resources for a NodePool that worked in the past

Nodepool definition

(fifm) (base) ubuntu@ip-:~/machine_learning$ kubectl describe nodepool p4d-24xlarge-spot
Name:         p4d-24xlarge-spot
Namespace:    
Labels:       <none>
Annotations:  karpenter.sh/nodepool-hash: 14804313366754770785
              karpenter.sh/nodepool-hash-version: v3
API Version:  karpenter.sh/v1
Kind:         NodePool
Metadata:
  Creation Timestamp:  2025-06-16T18:15:38Z
  Generation:          1
  Resource Version:    18098774
  UID:                 bc34d606-99df-4244-8859-243a4b16c6e5
Spec:
  Disruption:
    Budgets:
      Nodes:               10%
    Consolidate After:     600s
    Consolidation Policy:  WhenEmptyOrUnderutilized
  Template:
    Metadata:
      Labels:
        ray.io/node-type:  p4d-24xlarge-spot
    Spec:
      Expire After:  336h
      Node Class Ref:
        Group:  eks.amazonaws.com
        Kind:   NodeClass
        Name:   default
      Requirements:
        Key:       karpenter.sh/capacity-type
        Operator:  In
        Values:
          spot
        Key:       kubernetes.io/arch
        Operator:  In
        Values:
          amd64
        Key:       eks.amazonaws.com/instance-family
        Operator:  In
        Values:
          p4d
        Key:       eks.amazonaws.com/instance-size
        Operator:  In
        Values:
          24xlarge
        Key:       eks.amazonaws.com/instance-gpu-count
        Operator:  Gt
        Values:
          0
      Taints:
        Effect:                  NoSchedule
        Key:                     ray.io/node-type
        Value:                   p4d-24xlarge-spot
      Termination Grace Period:  600s
Status:
  Conditions:
    Last Transition Time:          2025-06-16T18:15:38Z
    Message:                       
    Observed Generation:           1
    Reason:                        ValidationSucceeded
    Status:                        True
    Type:                          ValidationSucceeded
    Last Transition Time:          2025-06-16T18:15:38Z
    Message:                       
    Observed Generation:           1
    Reason:                        NodeClassReady
    Status:                        True
    Type:                          NodeClassReady
    Last Transition Time:          2025-06-16T18:20:32Z
    Message:                       
    Observed Generation:           1
    Reason:                        NodeRegistrationHealthy
    Status:                        True
    Type:                          NodeRegistrationHealthy
    Last Transition Time:          2025-06-16T18:15:38Z
    Message:                       
    Observed Generation:           1
    Reason:                        Ready
    Status:                        True
    Type:                          Ready
  Node Class Observed Generation:  1
  Resources:
    Cpu:                  0
    Ephemeral - Storage:  0
    Memory:               0
    Nodes:                0
    Pods:                 0
Events:
  Type     Reason                     Age                  From       Message
  ----     ------                     ----                 ----       -------
  Warning  NoCompatibleInstanceTypes  13s (x888 over 25h)  karpenter  NodePool requirements filtered out all compatible available instance types

This NodePool was able to find instances initially. At some point, however, it started reporting "NodePool requirements filtered out all compatible available instance types", and it has repeated that warning for the past 25 hours.

I checked my quota, and it is sufficient.

[Image: quota]

1 Answer

The issue you're experiencing with your EKS Auto Mode nodepool not finding compatible instance types despite having sufficient quota could be due to several factors:

  1. Instance Availability: P4d instances are specialized GPU instances that may have limited availability in your region, especially for spot instances. Even with sufficient quota, the actual availability of these instances can fluctuate based on demand.

  2. Spot Capacity Constraints: Since you're requesting spot instances specifically, there might not be any p4d.24xlarge spot instances available at the moment. Spot capacity can be particularly limited for specialized GPU instances.

  3. Region/AZ Limitations: p4d instances are not offered in every Availability Zone. Your NodePool does not restrict zones, so the eligible AZs are determined by the subnets your NodeClass selects. Check that those subnets cover the AZs where p4d.24xlarge is available.

  4. Price Constraints: There might be implicit price constraints in the default NodeClass that are preventing Karpenter from provisioning these instances if the current spot price is too high.
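
The availability and Spot capacity factors above can be checked from the EC2 side with the AWS CLI. The following is a minimal sketch, not a definitive diagnostic: the function names are hypothetical helpers, and the region is a placeholder argument because the question does not say which region the cluster runs in.

```shell
# Hypothetical helpers; the region and instance type arguments are placeholders.

# List the Availability Zones in a region that offer the instance type at all.
list_offering_zones() {
  region="$1"
  instance_type="${2:-p4d.24xlarge}"
  aws ec2 describe-instance-type-offerings \
    --region "$region" \
    --location-type availability-zone \
    --filters "Name=instance-type,Values=$instance_type" \
    --query 'InstanceTypeOfferings[].Location' \
    --output text
}

# Ask EC2 for a Spot placement score (1 to 10) per Availability Zone.
# A low score suggests a Spot request is unlikely to succeed right now.
spot_scores() {
  region="$1"
  instance_type="${2:-p4d.24xlarge}"
  aws ec2 get-spot-placement-scores \
    --region "$region" \
    --region-names "$region" \
    --instance-types "$instance_type" \
    --target-capacity 1 \
    --single-availability-zone \
    --query 'SpotPlacementScores[].[AvailabilityZoneId,Score]' \
    --output text
}
```

For example, `list_offering_zones us-east-1` followed by `spot_scores us-east-1` (with your own region substituted) shows whether the zones your subnets cover offer the instance type and how likely a Spot request is to be fulfilled there.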

To troubleshoot this issue:

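  1. Check NodeDiagnostic: Use the Kubernetes NodeDiagnostic resource to retrieve node-level logs. It collects logs from a node that already exists, so it helps only if a node was launched and then failed; it will not explain why no instance type matched.

  2. Review EC2 Console Output: If an instance was launched, use the AWS EC2 CLI command 'get-console-output' with its instance ID to get more information about the failed launch attempt.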

  3. Modify Your NodePool: Consider temporarily relaxing some requirements to see if instances become available:

     • Try adding more instance families that have similar GPU capabilities
     • Consider using on-demand instances temporarily
     • Check if there are specific AZ constraints that might be limiting availability

  4. Check AWS Events: There might be capacity issues in your region for these specific instance types. Check AWS Health Dashboard or Service Health Dashboard for any relevant announcements.
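
The suggestion to allow on-demand instances temporarily could be tried with a JSON patch against the existing NodePool. This is a sketch under assumptions: `allow_on_demand` is a hypothetical helper name, and the patch assumes that the capacity-type requirement is the first entry in the requirements list, as in the `kubectl describe` output in the question. The leading `test` operation makes the patch fail instead of overwriting a different requirement if that assumption does not hold.

```shell
# Hypothetical helper: let the NodePool fall back to on-demand capacity.
# Assumes karpenter.sh/capacity-type is requirement index 0; the "test" op
# aborts the patch if it is not.
allow_on_demand() {
  nodepool="$1"
  kubectl patch nodepool "$nodepool" --type=json -p '[
    {"op": "test", "path": "/spec/template/spec/requirements/0/key", "value": "karpenter.sh/capacity-type"},
    {"op": "replace", "path": "/spec/template/spec/requirements/0/values", "value": ["spot", "on-demand"]}
  ]'
}
```

Calling `allow_on_demand p4d-24xlarge-spot` keeps Spot as an option while also permitting on-demand capacity. If nodes then appear, the original problem was most likely Spot capacity rather than the NodePool definition. Remember to revert the change afterwards if on-demand pricing is not acceptable.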

Remember that in EKS Auto Mode, AWS manages Karpenter for you, so you don't have direct access to all Karpenter logs. If the issue persists, you may need to contact AWS support for more detailed diagnostics, as they have access to the underlying Karpenter logs and infrastructure.
Sources
How to provision GPU nodes in EKS Auto mode type cluster | AWS re:Post
EKS Karpenter nodeclaim error | AWS re:Post
How to debug Auto-Mode custom NodePool | AWS re:Post

answered 6 months ago
