
[vLLM] Parametrize embedding benchmarks into EMBEDDING_CONFIGS #5473

Open
jazpurTT wants to merge 1 commit into main from jazpur/vllm-embedding


@jazpurTT

Contributor

Problem Description

  • Embedding benchmarks were defined as one hand-written test function per model/batch variant (test_vllm_bge_m3_batch1, test_vllm_bge_m3_batch32, test_vllm_qwen3_embedding_4b_batch1), unlike generative models, which parametrize over the SINGLE_DEVICE_CONFIGS / TP_CONFIGS arrays. This doesn't scale as we expand the embedding config matrix: every new model/config would need another bespoke function.

What's Changed

  • Added an EMBEDDING_CONFIGS array (mirroring SINGLE_DEVICE_CONFIGS / TP_CONFIGS) with stable pytest.param(..., id=...) IDs.
  • Replaced the three per-model functions with a single parametrized test_vllm_embedding_benchmark.
  • Updated the three embedding pytest node IDs in .github/workflows/perf-bench-matrix.json to the new parametrized form (e.g. test_vllm_embedding_benchmark[bge-m3-batch1]).
  • No behavior change: same 3 configs and args; pure refactor to ready the infra before adding new embedding models/configs.
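
For readers unfamiliar with the pattern, here is a minimal sketch of what the parametrized form might look like. It is not the code from this PR. The argument values, the run_embedding_benchmark helper, and the two IDs other than bge-m3-batch1 are assumptions for illustration; only the names EMBEDDING_CONFIGS and test_vllm_embedding_benchmark and the ID bge-m3-batch1 come from the description above.

```python
import pytest

# Hypothetical config values: the PR description names the models and batch
# sizes but not the actual arguments passed to the benchmark.
# Only the ID "bge-m3-batch1" is confirmed; the other two IDs are assumed.
EMBEDDING_CONFIGS = [
    pytest.param("bge-m3", 1, id="bge-m3-batch1"),
    pytest.param("bge-m3", 32, id="bge-m3-batch32"),
    pytest.param("qwen3-embedding-4b", 1, id="qwen3-embedding-4b-batch1"),
]


def run_embedding_benchmark(model: str, batch_size: int) -> dict:
    """Placeholder for the real benchmark runner (hypothetical helper)."""
    return {"model": model, "batch_size": batch_size}


@pytest.mark.parametrize("model,batch_size", EMBEDDING_CONFIGS)
def test_vllm_embedding_benchmark(model, batch_size):
    # One test body serves every entry in EMBEDDING_CONFIGS; pytest builds
    # node IDs such as test_vllm_embedding_benchmark[bge-m3-batch1].
    result = run_embedding_benchmark(model, batch_size)
    assert result["batch_size"] == batch_size
```

With explicit id= values, the node IDs stay stable even if the argument values change later, which matters because the workflow matrix JSON refers to tests by node ID. Adding a new model or batch size then becomes a one-line addition to the array.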

Checklist

  • New/Existing tests provide coverage for changes

@codecov-commenter


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 33.82%. Comparing base (9b0c875) to head (7280820).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5473      +/-   ##
==========================================
- Coverage   33.84%   33.82%   -0.03%     
==========================================
  Files          37       37              
  Lines        4990     4990              
==========================================
- Hits         1689     1688       -1     
- Misses       3301     3302       +1     




4 participants