The issue you're experiencing with your EKS Auto Mode nodepool not finding compatible instance types despite having sufficient quota could be due to several factors:
- Instance Availability: P4d instances are specialized GPU instances that may have limited availability in your region, especially as spot capacity. A quota only caps how many instances you may run; it does not guarantee that capacity exists, and actual availability fluctuates with demand.
- Spot Capacity Constraints: Because you are requesting spot instances specifically, there may be no p4d.24xlarge spot capacity available at the moment. Spot capacity can be particularly limited for specialized GPU instances.
- Region/AZ Limitations: Your NodePool might be targeting Availability Zones where p4d instances are not offered or are currently unavailable. If you have restricted the NodePool to specific zones, consider widening the selection.
- Price Constraints: It is also possible that implicit price constraints in the default NodeClass are preventing Karpenter from provisioning these instances when the current spot price is high.
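To check the Region/AZ point above, the EC2 `DescribeInstanceTypeOfferings` API lists the zones in which an instance type is offered. The following is a minimal sketch, assuming a boto3 EC2 client is passed in; `zones_offering` is a hypothetical helper name, not part of any AWS tooling. An offering only means the type is sold in that zone. It says nothing about whether spot or on-demand capacity is free right now.

```python
from typing import Any, Dict, List


def zones_offering(ec2_client: Any, instance_type: str) -> List[str]:
    """Return the sorted Availability Zones (in the client's region)
    where `instance_type` is offered.

    `ec2_client` is assumed to be a boto3 EC2 client, e.g.
    boto3.client("ec2", region_name="us-east-1").
    """
    zones: List[str] = []
    kwargs: Dict[str, Any] = {
        "LocationType": "availability-zone",
        "Filters": [{"Name": "instance-type", "Values": [instance_type]}],
    }
    # Follow NextToken manually so every page of results is collected.
    while True:
        page = ec2_client.describe_instance_type_offerings(**kwargs)
        zones.extend(
            offering["Location"]
            for offering in page.get("InstanceTypeOfferings", [])
        )
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return sorted(set(zones))
```

Calling `zones_offering(boto3.client("ec2"), "p4d.24xlarge")` and comparing the result with the zones of the subnets your cluster uses shows whether the NodePool can reach any zone that offers the type at all.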
To troubleshoot this issue:
- Check NodeDiagnostic: Use the Kubernetes NodeDiagnostic resource to retrieve more detailed node logs. It targets an existing node, so it is most useful when an instance launched but failed to join the cluster or become ready.
- Review EC2 Console Output: Use the AWS CLI command 'aws ec2 get-console-output' to inspect any instance that was launched but did not become a healthy node. If no instance was ever launched, there is no console output to read.
- Modify Your NodePool: Consider temporarily relaxing some requirements to see if instances become available:
  - Try adding more instance families that have similar GPU capabilities
  - Consider using on-demand instances temporarily
  - Check if there are specific AZ constraints that might be limiting availability
- Check AWS Events: There might be capacity issues in your region for these specific instance types. Check the AWS Health Dashboard for any relevant announcements.
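The NodePool relaxations suggested above can be sketched as a small manifest builder. This is an illustration under assumptions, not the asker's actual configuration: the pool name, the extra instance families, and the `relaxed_nodepool` helper are hypothetical, and the label keys follow the EKS Auto Mode and Karpenter conventions as I understand them (`eks.amazonaws.com/instance-family`, `karpenter.sh/capacity-type`, `topology.kubernetes.io/zone`). Verify them against the current EKS documentation before applying.

```python
import json
from typing import Any, Dict, List, Optional


def relaxed_nodepool(
    name: str,
    families: List[str],
    capacity_types: List[str],
    zones: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a NodePool manifest with widened requirements.

    Leaving `zones` unset means no zone requirement is added, so any
    zone reachable through the NodeClass subnets can be considered.
    """
    requirements: List[Dict[str, Any]] = [
        {"key": "eks.amazonaws.com/instance-family",
         "operator": "In", "values": list(families)},
        {"key": "karpenter.sh/capacity-type",
         "operator": "In", "values": list(capacity_types)},
    ]
    if zones:
        requirements.append(
            {"key": "topology.kubernetes.io/zone",
             "operator": "In", "values": list(zones)}
        )
    return {
        "apiVersion": "karpenter.sh/v1",
        "kind": "NodePool",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    # Assumes the built-in Auto Mode NodeClass named "default".
                    "nodeClassRef": {
                        "group": "eks.amazonaws.com",
                        "kind": "NodeClass",
                        "name": "default",
                    },
                    "requirements": requirements,
                }
            }
        },
    }


# Illustrative values only: allow both capacity types and extra GPU families.
manifest = relaxed_nodepool(
    "gpu-relaxed", ["p4d", "p4de", "p5"], ["spot", "on-demand"]
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest can be saved to a file and applied directly. If nodes appear once on-demand or another family is allowed, the original failure was most likely a spot capacity shortage for p4d.24xlarge rather than a configuration error.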
Remember that in EKS Auto Mode, AWS manages Karpenter for you, so you don't have direct access to all Karpenter logs. If the issue persists, you may need to contact AWS support for more detailed diagnostics, as they have access to the underlying Karpenter logs and infrastructure.
Sources
How to provision GPU nodes in EKS Auto mode type cluster | AWS re:Post
EKS Karpenter nodeclaim error | AWS re:Post
How to debug Auto-Mode custom NodePool | AWS re:Post
