Version
2.6.0
Describe the bug.
The Helm chart templates for embedding and reranking NIMs do not pass through the model.profiles configuration to the NIMCache custom resource, causing the NIM Operator to download all available profiles instead of only the user-specified profile.
Expected Behavior:
When a user specifies a single profile in their values override file:
nimOperator:
  nvidia-nim-llama-nemotron-embed-vl-1b-v2:
    model:
      profiles:
        - f7391ddbcb95b2406853526b8e489fedf20083a2420563ca3e65358ff417b10f
Only that profile should be downloaded by the NIMCache job.
Actual Behavior:
All available profiles are downloaded, consuming:
- Significantly more storage
- Longer cache job execution time
- Unnecessary network bandwidth
This impacts deployments in resource-constrained environments where storage is a critical factor.
Minimum reproducible example
Step 1: Create minimal values override file (test-values.yaml):
imagePullSecret:
  name: "ngc-secret"
  create: true
  password: "<YOUR_NGC_API_KEY>"
ngcApiSecret:
  name: "ngc-api"
  create: true
  password: "<YOUR_NGC_API_KEY>"
nimOperator:
  nim-llm:
    enabled: false
  nvidia-nim-llama-nemotron-embed-1b-v2:
    enabled: false
  nvidia-nim-llama-nemotron-embed-vl-1b-v2:
    enabled: true
    model:
      profiles:
        - f7391ddbcb95b2406853526b8e489fedf20083a2420563ca3e65358ff417b10f
  nvidia-nim-llama-nemotron-rerank-1b-v2:
    enabled: false
Step 2: Deploy the chart:
helm upgrade --install rag -n test-nim-bug https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.6.0.tgz \
  --username '$oauthtoken' \
  --password "${NGC_API_KEY}" \
  -f test-values.yaml \
  --create-namespace
Step 3: Verify the issue:
Check the NIMCache spec (should have model.profiles but doesn't):
kubectl get nimcache nemotron-vlm-embedding-ms-cache -n test-nim-bug -o yaml | grep -A 10 "spec:"
Expected output:
spec:
  source:
    ngc:
      modelPuller: "nvcr.io/nim/nvidia/llama-nemotron-embed-vl-1b-v2:1.12.0"
      pullSecret: ngc-secret
      authSecret: ngc-api
      model:
        profiles:
          - f7391ddbcb95b2406853526b8e489fedf20083a2420563ca3e65358ff417b10f
Actual output:
spec:
  source:
    ngc:
      modelPuller: "nvcr.io/nim/nvidia/llama-nemotron-embed-vl-1b-v2:1.12.0"
      pullSecret: ngc-secret
      authSecret: ngc-api
      # model section is missing
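Because `grep -A 10 "spec:"` can also match other `spec:` keys in the YAML, a narrower check may be easier to read. The sketch below queries the field directly with a JSONPath expression. The helper name `show_nimcache_profiles` is hypothetical, the field path is taken from the expected output above, and it assumes `kubectl get` prints nothing for a missing field (its default behavior).

```shell
# Print the profiles set on a NIMCache, or a notice when the field is absent.
# Arguments: <namespace> <nimcache-name>
# Note: the helper name is illustrative and not part of the chart or operator.
show_nimcache_profiles() {
  profiles=$(kubectl get nimcache "$2" -n "$1" \
    -o jsonpath='{.spec.source.ngc.model.profiles}')
  if [ -z "$profiles" ]; then
    echo "model.profiles is not set"
  else
    echo "$profiles"
  fi
}

# Usage with the names from this report:
#   show_nimcache_profiles test-nim-bug nemotron-vlm-embedding-ms-cache
```

With the bug present, this should report that the field is not set. Once the templates pass the value through, it should print the single configured profile ID.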
Check how many profiles are being downloaded:
kubectl get job -n test-nim-bug nemotron-vlm-embedding-ms-cache-job \
  -o jsonpath='{.spec.template.spec.containers[0].args}' | jq
Expected: Single profile ID in args
["--profiles", "f7391ddbcb95b2406853526b8e489fedf20083a2420563ca3e65358ff417b10f"]
Actual: All 11 profile IDs
[
  "--profiles",
  "30deb3f6ab82ce062835b4e97894ad0ea76b5459ecd7f1f113ee7f4a920f81e9",
  "448d9749eede748c122f71cb0d0ad009557b5f6d7c1599cce7dfe24af4474d83",
  "44d1f6e283eb0365efb617f8c255e42f9d8c97fe039774c2837cdfcef54296ce",
  "8f04bf773931e5b983e6812a1bae11bbccf68f4cb62644421c42c3174c0ae143",
  "91b1800959ef14ba7340038e06e3ac51d7fa71975ba6e89c580affaf3cce7788",
  "a1c13d988e661ffcff9de9a919de942e2257c8a1990280e18f1fcfbf1bdd39e5",
  "a2a98d9d648d970f17ebca4b8ddacf96aaf7266ec66c74c83c6b42e3f5f5b97c",
  "ac9898291cd6590f16956c43779ee8a191bd9ff1c091cf2ddecf51ea13f72877",
  "c10e0d36e7605cdd0ccbb0e25f0be6d97c00b4113cc9242b577d550b92d83da8",
  "c3caccfe9fcbc6c3d8d911cfa19992e6bf3398d38298cfa97da79ad8f5f35723",
  "f7391ddbcb95b2406853526b8e489fedf20083a2420563ca3e65358ff417b10f"
]
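To avoid counting the IDs by eye, the job args can be piped through a small counter. This is a minimal sketch that assumes every profile ID is a 64-character lowercase hex string, as in the output above. The function name `count_profiles` is hypothetical.

```shell
# Count the 64-character hex profile IDs in a cache job's args (JSON on stdin).
# Note: assumes profile IDs are always 64 lowercase hex characters.
count_profiles() {
  grep -oE '"[0-9a-f]{64}"' | wc -l | tr -d ' '
}

# Usage with the names from this report:
#   kubectl get job -n test-nim-bug nemotron-vlm-embedding-ms-cache-job \
#     -o jsonpath='{.spec.template.spec.containers[0].args}' | count_profiles
```

For the values file in Step 1, the expected count is 1. The behavior reported here yields 11.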