
Unable to deploy efs-csi-controller to Fargate to support Karpenter-provisioned EKS cluster #1100

@Nuru

Description

/kind bug

What happened?

  • I am using Terraform to manage AWS resources.
  • I tried to deploy, via Terraform, an EKS cluster with no nodes but with the EFS CSI Add-On (among others). Nodes are to be provisioned by Karpenter, and the Karpenter controller itself is deployed to Fargate.
    • Karpenter provisions EC2 nodes on demand to run Kubernetes Pods.
    • I want the Pods (on EC2, provisioned by Karpenter) to have access to EFS.
    • Terraform fails to deploy the EKS cluster because the EFS Add-On never becomes ready and reports its status as "Degraded". I believe this is similar to EBS CSI issue #1801: the controller Pods need to be running for the Add-On to report healthy, but they have no place to run.
  • I added a Fargate profile targeting the label app = "efs-csi-controller", so that the EFS controller Pods would be launched on Fargate.
  • The Add-On still would not become healthy because the communication sockets were never created, and it still reported its status as "Degraded".
  • After Karpenter was deployed, it started nodes, and the efs-csi-node DaemonSet deployed successfully to the EC2 nodes, but the efs-csi-controller Pods remained in CrashLoopBackOff and the Add-On still reported its status as "Degraded".
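For reference, the Fargate profile described above can be sketched in Terraform roughly as follows. This is illustrative only: the resource names, role, and subnet variable are placeholders, not the actual configuration from my stack; the selector matches the `app = "efs-csi-controller"` label on the controller Pods.

```hcl
# Illustrative sketch; names, role, and subnets are placeholders.
resource "aws_eks_fargate_profile" "efs_csi_controller" {
  cluster_name           = aws_eks_cluster.this.name
  fargate_profile_name   = "efs-csi-controller"
  pod_execution_role_arn = aws_iam_role.fargate_pod_execution.arn
  subnet_ids             = var.private_subnet_ids

  # Only Pods matching both the namespace and the labels are scheduled to Fargate.
  selector {
    namespace = "kube-system"
    labels = {
      app = "efs-csi-controller"
    }
  }
}
```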

What you expected to happen?

The controller Pods would be deployed to Fargate and work without the Node component, and the Add-On would report status as "Active". As EC2 Nodes were provisioned, controller Pods would keep running on Fargate while Node Pods ran properly on the EC2 Nodes.

How to reproduce it (as minimally and precisely as possible)?

See "What happened" above.

Anything else we need to know?:

The failure reported to Kubernetes comes from the efs-plugin container exiting with an error. IMHO it should not try to run on Fargate, and for that reason probably should not be deployed as part of the controller.

Environment

  • Kubernetes version (use kubectl version): v1.27.4-eks-2d98532
  • Driver version: v1.5.8-eksbuild.1

Please also attach debug logs to help us better diagnose

Log excerpts (each one just keeps repeating the quoted excerpt):

efs-csi-controller csi-provisioner

W0816 04:26:59.779601       1 connection.go:183] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock

efs-csi-controller liveness-probe

W0816 04:27:00.989300       1 connection.go:173] Still connecting to unix:///csi/csi.sock

efs-csi-controller efs-plugin

I0816 05:54:46.413768       1 config_dir.go:63] Mounted directories do not exist, creating directory at '/etc/amazon/efs'
I0816 05:54:46.418766       1 metadata.go:63] getting MetadataService...
I0816 05:54:52.757469       1 metadata.go:71] retrieving metadata from Kubernetes API
F0816 05:54:52.773395       1 driver.go:56] could not get metadata: did not find aws instance ID in node providerID string
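The fatal line above suggests the metadata lookup parses an EC2 instance ID (i-...) out of the node's providerID, and a Fargate node's providerID does not contain one. The snippet below illustrates that mismatch with made-up providerID values (not taken from my cluster) and a simplified pattern check, assuming the driver expects an i-... segment:

```shell
# Hypothetical providerID values; Fargate nodes lack the i-... EC2 instance ID
# that the efs-plugin metadata lookup appears to expect.
ec2_provider_id="aws:///us-east-1a/i-0abc123def456"
fargate_provider_id="aws:///us-east-1a/1234567890abcdef-0123456789"

for id in "$ec2_provider_id" "$fargate_provider_id"; do
  # Simplified stand-in for the driver's parsing: look for a trailing /i-... segment.
  if echo "$id" | grep -qE '/i-[0-9a-f]+$'; then
    echo "$id: instance ID found"
  else
    echo "$id: no instance ID (metadata lookup would fail)"
  fi
done
```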
