/kind bug
What happened?
- I am using Terraform to manage AWS resources.
- I tried to deploy, via Terraform, an EKS cluster with no nodes, but with the EFS CSI Add-On (and others). Nodes to be provisioned by Karpenter. The Karpenter controller itself is deployed to Fargate.
- Karpenter provisions EC2 nodes on demand to run Kubernetes Pods.
- I want the Pods (on EC2, provisioned by Karpenter) to have access to EFS.
- Terraform fails to deploy the EKS cluster because the EFS Add-On never becomes ready (it reports status "Degraded"). I believe this is similar to EBS CSI issue #1801: the controller pods need to be running for the Add-On to report healthy, but they have nowhere to run.
- I added a Fargate profile targeting the label `app = "efs-csi-controller"`, so that the EFS controller would be launched on Fargate.
- The Add-On still would not become healthy because the communication sockets were never created/available, and it still reported status "Degraded".
- After Karpenter was deployed, it started nodes, and the `efs-csi-node` DaemonSet deployed successfully to the EC2 nodes, but the `efs-csi-controller` Pods were still in CrashLoopBackOff and the Add-On still reported status "Degraded".
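For context, the Fargate profile was created with Terraform roughly like this (a minimal sketch; the resource names, role, and subnet references are placeholders, and `kube-system` is the assumed namespace of the controller):

```hcl
# Sketch only: names and references are illustrative, not my exact config.
resource "aws_eks_fargate_profile" "efs_csi_controller" {
  cluster_name           = aws_eks_cluster.this.name
  fargate_profile_name   = "efs-csi-controller"
  pod_execution_role_arn = aws_iam_role.fargate_pod_execution.arn
  subnet_ids             = var.private_subnet_ids

  # Match the controller Pods by the label the chart applies to them.
  selector {
    namespace = "kube-system"
    labels = {
      app = "efs-csi-controller"
    }
  }
}
```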
What you expected to happen?
The controller pods would be deployed to Fargate and work without the Node component, and the Add-On would report status "Active". As EC2 Nodes were provisioned, controller Pods would keep running on Fargate while Node Pods ran properly on the EC2 Nodes.
How to reproduce it (as minimally and precisely as possible)?
See "What happened" above.
Anything else we need to know?:
The failure that is reported to Kubernetes comes from the `efs-plugin` container exiting with an error. IMHO it should not try to run on Fargate, and probably should not be deployed as part of the controller for this reason.
Environment
- Kubernetes version (use `kubectl version`): v1.27.4-eks-2d98532
- Driver version: v1.5.8-eksbuild.1
Please also attach debug logs to help us better diagnose
Log excerpts (each one just keeps repeating the quoted excerpt):
`efs-csi-controller` `csi-provisioner`:

```
W0816 04:26:59.779601 1 connection.go:183] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
```

`efs-csi-controller` `liveness-probe`:

```
W0816 04:27:00.989300 1 connection.go:173] Still connecting to unix:///csi/csi.sock
```

`efs-csi-controller` `efs-plugin`:

```
I0816 05:54:46.413768 1 config_dir.go:63] Mounted directories do not exist, creating directory at '/etc/amazon/efs'
I0816 05:54:46.418766 1 metadata.go:63] getting MetadataService...
I0816 05:54:52.757469 1 metadata.go:71] retrieving metadata from Kubernetes API
F0816 05:54:52.773395 1 driver.go:56] could not get metadata: did not find aws instance ID in node providerID string
```
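The fatal `driver.go:56` error is consistent with how Fargate nodes are identified: their `providerID` does not contain an `i-…` EC2 instance ID. A minimal sketch of the distinction (the providerID strings below are illustrative, not taken from my cluster):

```shell
# Illustrative providerID values: the driver's metadata lookup expects an
# EC2 instance ID ("i-..."), which Fargate-backed nodes do not have.
ec2_id="aws:///us-east-1a/i-0abc123def4567890"
fargate_id="aws:///us-east-1a/fargate-ip-10-0-1-5.ec2.internal"

extract_instance_id() {
  case "$1" in
    */i-*) echo "${1##*/}" ;;            # last path segment is the instance ID
    *)     echo "no instance ID found" ;;
  esac
}

extract_instance_id "$ec2_id"       # -> i-0abc123def4567890
extract_instance_id "$fargate_id"   # -> no instance ID found
```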