Description
Observed Behavior:
We run EKS with Security Groups for Pods enabled (ENABLE_POD_ENI=true, enforcing mode standard). During scheduling, Karpenter provisioned a new node with instance type r8a.xlarge. The node then emitted an event:
Unsupported: The instance type r8a.xlarge is not supported yet by the vpc resource controller
As a result, pods requiring Pod ENIs could not be scheduled and remained Pending with:
FailedScheduling: Insufficient vpc.amazonaws.com/pod-eni
Karpenter provisions nodes (e.g. r8a.xlarge) that later cannot be trunk-enabled by the EKS-managed VPC Resource Controller. Pods requiring vpc.amazonaws.com/pod-eni remain Pending with Insufficient vpc.amazonaws.com/pod-eni.
Expected Behavior:
Karpenter should identify the EKS VPC Resource Controller version of the current cluster and should not provision a node whose instance type is missing from that version's supported list - https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/v1.7.15/pkg/aws/vpc/limits.go
Ideally, Karpenter would help avoid provisioning instance types that are incompatible with SG-for-Pods ENI trunking in the current control plane, or at least provide a clearer/first-class mechanism to prevent this mismatch.
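To illustrate the kind of check described above, here is a minimal sketch that tests whether an instance type is listed in a local copy of limits.go. The helper name is_in_vpc_rc_limits is hypothetical, and the sketch assumes each listed type appears in the file as a quoted map key followed by a colon; it only says whether the type is listed, nothing more.

```shell
#!/usr/bin/env bash
# Sketch under assumptions: is_in_vpc_rc_limits is a hypothetical helper.
# It assumes limits.go lists each instance type as a quoted map key
# followed by a colon, e.g.  "m5.large": {

is_in_vpc_rc_limits() {
  local instance_type="$1" limits_file="$2"
  # -F matches the string literally, so the dot in the type name is not
  # treated as a regex wildcard; -q suppresses output and only sets the
  # exit status (0 = listed, 1 = not listed).
  grep -qF "\"${instance_type}\":" "$limits_file"
}

# Example usage (requires network access, so it is left commented out):
#   curl -fsSL -o limits.go \
#     https://raw.githubusercontent.com/aws/amazon-vpc-resource-controller-k8s/v1.7.15/pkg/aws/vpc/limits.go
#   is_in_vpc_rc_limits r8a.xlarge limits.go || echo "r8a.xlarge is not listed"
```

The file has to match the controller version reported on the node, since the list changes between releases.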
Reproduction Steps (Please include YAML):
- EKS cluster with Security Groups for Pods enabled:
  ENABLE_POD_ENI=true
  POD_SECURITY_GROUP_ENFORCING_MODE=standard
  Command:
  kubectl -n kube-system describe ds aws-node | grep -E 'ENABLE_POD_ENI|POD_SECURITY_GROUP_ENFORCING_MODE'
- Karpenter NodePool allows the new family (example: r8a is not excluded).
- Deploy a workload that requires Pod ENIs (Security Groups for Pods).
- Karpenter provisions r8a.xlarge, which is not in the supported list of EKS VPC Resource Controller version 1.7.15.
- Node event shows VPC RC incompatibility:
  Unsupported: The instance type r8a.xlarge is not supported yet by the vpc resource controller
- Pod remains pending with:
  FailedScheduling: Insufficient vpc.amazonaws.com/pod-eni
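The node-side symptoms in these steps can be inspected with two small kubectl wrappers. This is a sketch only: node_events and node_allocatable are hypothetical helper names, and the node name passed to them is whatever Karpenter assigned to the new node.

```shell
# Sketch only: node_events and node_allocatable are hypothetical helpers.

# List events attached to one node. Node events are not tied to a workload
# namespace, so all namespaces are searched.
node_events() {
  kubectl get events --all-namespaces \
    --field-selector "involvedObject.kind=Node,involvedObject.name=$1"
}

# Print the node's allocatable resources. On a trunk-enabled node this map
# is expected to include vpc.amazonaws.com/pod-eni.
node_allocatable() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable}'
}
```

Calling node_events with the affected node's name should surface the Unsupported event quoted above, and node_allocatable shows whether any vpc.amazonaws.com/pod-eni capacity was advertised.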
Mitigation
We mitigated by excluding r8a from our Karpenter NodePool requirements so Karpenter cannot provision that family for SG-for-Pods workloads.
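A minimal sketch of that exclusion, assuming the NodePool uses Karpenter's karpenter.k8s.aws/instance-family label in its requirements. The helper exclude_family_requirement is hypothetical and just prints the YAML fragment; it is not our exact NodePool.

```shell
# Sketch: print a NodePool requirement that excludes one instance family.
# exclude_family_requirement is a hypothetical helper name.
exclude_family_requirement() {
  local family="$1"
  # Unquoted heredoc: ${family} is expanded, the double quotes stay literal.
  cat <<EOF
- key: karpenter.k8s.aws/instance-family
  operator: NotIn
  values: ["${family}"]
EOF
}

exclude_family_requirement r8a
```

The printed entry goes under spec.template.spec.requirements of the NodePool. It has to be extended by hand each time a new family appears that the running VPC Resource Controller version does not list.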
Versions:
- Chart Version: 1.8.5
- Kubernetes Version (kubectl version): 1.34
Re-opened in this repo as suggested in kubernetes-sigs/karpenter#2863 (comment)
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Evidence / references
- EKS Security Groups for Pods docs:
  https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
- Node event reporting controller version:
  ControllerVersionNotice: The node is managed by VPC resource controller version v1.7.15
- Node event reporting incompatibility:
  Unsupported: The instance type r8a.xlarge is not supported yet by the vpc resource controller
- VPC resource controller compatibility list (limits.go):
  https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/v1.7.15/pkg/aws/vpc/limits.go