Skip to content

tests: sync v1 kv cache tests for vllm 0.23#292

Open
xzh25 wants to merge 2 commits into
MetaX-MACA:masterfrom
xzh25:fix-vllm023-kv-cache-tests
Open

tests: sync v1 kv cache tests for vllm 0.23#292
xzh25 wants to merge 2 commits into
MetaX-MACA:masterfrom
xzh25:fix-vllm023-kv-cache-tests

Conversation

@xzh25

@xzh25 xzh25 commented Jun 21, 2026

Copy link
Copy Markdown

Summary

  • sync v1 KV cache CPU tests with the vLLM 0.23 API surface
  • update shared test imports for moved config/logprob/input-processing helpers
  • mark the synced pure CPU KV cache tests to skip global accelerator cleanup, avoiding MetaX torch 2.8 torch.accelerator.empty_cache teardown failures

Validation

Run on MetaX C500 remote workspace /data/vllm-metax-contribution/src/vLLM-metax with isolated vLLM 0.23 environment /data/vllm-metax-contribution/envs/vllm023:

git diff --check
python -m py_compile   tests/conftest.py   tests/models/registry.py   tests/models/utils.py   tests/v1/core/test_kv_cache_utils.py   tests/v1/core/test_single_type_kv_cache_manager.py
PYTHONPATH="$VLLM_METAX_SRC:${PYTHONPATH:-}" python -m pytest --collect-only -q   tests/v1/core/test_kv_cache_utils.py   tests/v1/core/test_single_type_kv_cache_manager.py   --tb=short
PYTHONPATH="$VLLM_METAX_SRC:${PYTHONPATH:-}" python -m pytest -q   tests/v1/core/test_kv_cache_utils.py::test_none_hash   tests/v1/core/test_single_type_kv_cache_manager.py   --tb=short --disable-warnings

Results:

  • git diff --check: passed
  • py_compile: passed
  • collect-only: 66 tests collected
  • CPU smoke: 10 passed, 5 warnings in 0.06s

Signed-off-by: xzh25 <xzh25@users.noreply.github.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors and expands the test suite for the KV cache management system, updating imports and adding comprehensive tests for multi-worker configurations, pipeline parallel sharding, prompt embeddings, and auto-fitting of the maximum model length. Feedback on the changes highlights multiple instances in the test suite where ModelConfig is instantiated without its required first positional argument (model), which will lead to TypeError exceptions during test execution.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated


def test_get_kv_cache_configs_multiple_workers():
model_config = ModelConfig(max_model_len=16)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16)
model_config = ModelConfig("dummy", max_model_len=16)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
ids=["symmetric", "asymmetric"],
)
def test_get_kv_cache_configs_pp_sharding(asymmetric_memory):
model_config = ModelConfig(max_model_len=512)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=512)
model_config = ModelConfig("dummy", max_model_len=512)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
def test_get_kv_cache_config():
def test_get_kv_cache_config_one_worker():
# pass max_model_len to pass check_enough_kv_cache_memory
model_config = ModelConfig(max_model_len=16)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16)
model_config = ModelConfig("dummy", max_model_len=16)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated

def test_get_kv_cache_configs_attention_free():
kv_cache_specs: dict[str, KVCacheSpec] = {}
vllm_config = VllmConfig(model_config=ModelConfig(max_model_len=16))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
vllm_config = VllmConfig(model_config=ModelConfig(max_model_len=16))
vllm_config = VllmConfig(model_config=ModelConfig("dummy", max_model_len=16))

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
def test_auto_fit_max_model_len():
"""Test that max_model_len=-1 auto-fits to available GPU memory."""
# Create config with original_max_model_len=-1 to trigger auto-fit
model_config = ModelConfig(max_model_len=1024)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=1024)
model_config = ModelConfig("dummy", max_model_len=1024)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
def test_auto_fit_max_model_len_with_hybrid():
"""Test that auto-fit works with hybrid KV cache specs."""
# Create config with original_max_model_len=-1 to trigger auto-fit
model_config = ModelConfig(max_model_len=8192)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=8192)
model_config = ModelConfig("dummy", max_model_len=8192)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated

def test_auto_fit_max_model_len_not_triggered():
"""Test that auto-fit is not triggered when original_max_model_len is not -1."""
model_config = ModelConfig(max_model_len=16)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16)
model_config = ModelConfig("dummy", max_model_len=16)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
the raw `available_memory`. Without this, auto-fit could pick a
max_model_len that no longer fits once `num_gpu_blocks_override` is applied.
"""
model_config = ModelConfig(max_model_len=16384)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16384)
model_config = ModelConfig("dummy", max_model_len=16384)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
`available_memory`. Without this, startup could accept a max_model_len
that does not actually fit in `num_gpu_blocks_override` blocks.
"""
model_config = ModelConfig(max_model_len=16384)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16384)
model_config = ModelConfig("dummy", max_model_len=16384)

Comment thread tests/v1/core/test_kv_cache_utils.py Outdated
to False (i.e. HMA remains enabled) when no other condition requires it
to be disabled.
"""
model_config = ModelConfig(max_model_len=16)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ModelConfig constructor requires model (the model name/path) as its first positional argument. Instantiating it without this argument will raise a TypeError when the test is executed. Please provide a dummy model name like "dummy".

Suggested change
model_config = ModelConfig(max_model_len=16)
model_config = ModelConfig("dummy", max_model_len=16)

Signed-off-by: xzh25 <xzh25@users.noreply.github.com>
@xzh25

xzh25 commented Jun 21, 2026

Copy link
Copy Markdown
Author

Addressed the Gemini review feedback on ModelConfig(max_model_len=...) in tests/v1/core/test_kv_cache_utils.py.

Changes:

  • Replaced validation-triggering ModelConfig/VllmConfig construction in lightweight KV cache utility tests with local SimpleNamespace helpers.
  • Kept the helper limited to fields read by KV cache config/memory sizing paths.
  • Removed the temporary dummy model approach so tests do not depend on HF model validation.

Validation on the MetaX remote vLLM 0.23 environment:

  • git diff --check: passed
  • python -m py_compile tests/v1/core/test_kv_cache_utils.py: passed
  • Review-targeted KV cache utils cases: 11 passed
  • Previous smoke coverage (test_none_hash + test_single_type_kv_cache_manager.py): 10 passed

Remote log: /data/vllm-metax-contribution/ops/logs/058_verify_pr292_modelconfig_followup_scheduler_tokens.txt

@xzh25

xzh25 commented Jun 22, 2026

Copy link
Copy Markdown
Author

Added a stronger clean-worktree validation pass for this PR on the MetaX C500 remote environment.

Validation details:

  • Verified PR head from a clean worktree: 425a76942c6ee48ccb78a1460d22fdb54d366f17
  • Device/runtime: MetaX C500, torch==2.8.0+metax3.5.3.9, vllm==0.23.0
  • git diff --check: passed
  • py_compile for both target files: passed
  • Full target collection: 66 tests collected
  • Unfiltered full pytest reached 33 passed, then blocked in urllib3 while resolving external HuggingFace model config for existing ModelConfig("Qwen/Qwen1.5-7B", ...) tests.
  • Local CPU-safe target run excluding only those external model-config resolution cases: 63 passed, 3 deselected, 5 warnings

The excluded cases are the existing real-model config tests (test_estimate_max_model_len and test_get_max_concurrency_for_kv_cache_config), not the lightweight KV cache utility paths changed by this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant