This was found by @gauravkghildiyal, see the doc for more details.
In DRANET, a DRA driver for Kubernetes networking, we implement an architecture with two legs for the driver:
- Kubelet via DRA
- Container runtime via NRI
This architecture lets the driver handle complex workloads that need actions at different stages of Pod creation. It also guarantees that Kubernetes will not schedule anything onto the node until the DRA driver on that node is running. The driver can store state in memory during the kubelet DRA NodePrepareResources() callback, and use the NRI callback to execute the actions on the Pod during the container runtime stage.
However, there is still a race in this process: if the runtime is restarted for any reason, the DRA driver will not notice, and the Pod can end up running without the NRI callback ever being called. We need to ensure that a specific driver is always called for a specific Pod.
NRI added the concept of required plugins in #278, which means drivers can get this behavior either globally, by modifying the configuration, or per Pod via annotations.
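To make the race concrete, here is a minimal, self-contained Go sketch. It is only a model and does not use the real DRA or NRI APIs: `driver`, `containerRuntime`, `prepareResources`, `runPodSandbox`, `startPod` and the `required` flag are all hypothetical names, and `required` is a simplified stand-in for the required-plugins behavior, not how NRI implements it.

```go
package main

import (
	"fmt"
	"sync"
)

// podConfig is a hypothetical stand-in for whatever the driver computes
// for a Pod during the kubelet leg.
type podConfig struct {
	Interface string
}

// driver models the two legs: state is stored in memory by the kubelet
// leg and consumed later by the runtime leg.
type driver struct {
	mu       sync.Mutex
	prepared map[string]podConfig // keyed by Pod UID
	applied  map[string]bool
}

func newDriver() *driver {
	return &driver{prepared: map[string]podConfig{}, applied: map[string]bool{}}
}

// prepareResources models the kubelet leg (the NodePrepareResources() callback).
func (d *driver) prepareResources(podUID string, cfg podConfig) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.prepared[podUID] = cfg
}

// runPodSandbox models the runtime leg (the NRI callback).
func (d *driver) runPodSandbox(podUID string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.prepared[podUID]; !ok {
		return fmt.Errorf("no prepared state for pod %s", podUID)
	}
	d.applied[podUID] = true
	return nil
}

// isApplied reports whether the runtime leg ran for the Pod.
func (d *driver) isApplied(podUID string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.applied[podUID]
}

// containerRuntime models the runtime. plugin is nil while the driver is
// not connected, for example right after a runtime restart.
type containerRuntime struct {
	plugin   *driver
	required bool // assumption: the Pod is marked as requiring the plugin
}

// startPod calls the plugin hook if a plugin is connected. Without one,
// the Pod starts anyway unless the plugin is marked as required.
func (r *containerRuntime) startPod(podUID string) error {
	if r.plugin == nil {
		if r.required {
			return fmt.Errorf("required plugin not connected, refusing to start pod %s", podUID)
		}
		return nil // the race: the Pod runs and the hook never fires
	}
	return r.plugin.runPodSandbox(podUID)
}

func main() {
	d := newDriver()
	d.prepareResources("pod-a", podConfig{Interface: "eth1"})
	d.prepareResources("pod-b", podConfig{Interface: "eth2"})

	// Normal case: both legs run.
	healthy := &containerRuntime{plugin: d}
	fmt.Println("pod-a err:", healthy.startPod("pod-a"), "applied:", d.isApplied("pod-a"))

	// Runtime restarted, driver not reconnected yet: the Pod starts unconfigured.
	restarted := &containerRuntime{}
	fmt.Println("pod-b err:", restarted.startPod("pod-b"), "applied:", d.isApplied("pod-b"))

	// Same situation with the plugin marked as required: the start fails instead.
	strict := &containerRuntime{required: true}
	fmt.Println("pod-b err:", strict.startPod("pod-b"), "applied:", d.isApplied("pod-b"))
}
```

In this model the second case is the bug described above: the kubelet leg succeeded, the runtime leg never ran, and nothing reported an error. Marking the plugin as required turns that silent gap into a visible failure.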
Since the global option is an admin setting that requires changing the containerd config, it is hard for independent drivers to modify, so we need to use the annotations path. However, the DRA hook in kubelet does not have a way to plumb annotations through (the device plugin API does).
I raised this with the NRI maintainers @klihub and @kad during KubeCon. They mentioned that it should be possible to use the existing NodePrepareResources() hook together with CDI to enable this functionality, but it may require some work in a few places.
@klihub @gauravkghildiyal @samuelkarp @chrishenzie