Add vLLM-metax vllm metax server log summary by ghangz · Pull Request #311 · MetaX-MACA/vLLM-metax

ghangz · 2026-07-01T03:25:45Z

Summary

Adds a focused vllm metax server log summary improvement for MetaX-MACA/vLLM-metax.
The change targets MetaX MACA development and validation workflows, with emphasis on earlier diagnostics, reproducible logs, or safer benchmark tooling.
Existing default behavior is kept compatible; the new logic is scoped to explicit checks, helper tools, or validation metadata.

Validation

Verified on Gitee.AI MetaX GPU resources: vLLM-metax_vLLM image batch, 10/10 PASS; PyTorch-MACA batch also covered vLLM-metax runtime tools.
Branch validation command: python tools/server_log_summary.py --self-test
Pull request text is intentionally ASCII-only to avoid encoding issues on web forms and API clients.

Review notes

Source branch: ghangz:mengz/vllm-metax-server-log-summary
Target branch: MetaX-MACA/vLLM-metax:master
Maintainers can modify this branch if follow-up adjustments are needed.

gemini-code-assist

Code Review

This pull request introduces a new utility script, tools/server_log_summary.py, designed to parse and summarize metrics and errors from service, benchmark, or distributed runtime logs into JSON format. The review feedback highlights several critical improvements: optimizing memory usage by reading log files line-by-line instead of loading them entirely into memory, handling file access errors gracefully, replacing assert statements in the self-test with explicit exceptions to avoid issues when optimization flags are used, and printing the help message when the script is executed without any arguments.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-07-01T03:27:06Z

+    errors: list[dict[str, object]] = []
+    values: list[dict[str, object]] = []
+    for lineno, line in enumerate(path.read_text(encoding="utf-8", errors="replace").splitlines(), 1):
+        if ERROR_RE.search(line):
+            errors.append({"line": lineno, "text": line.strip()})
+        for match in NUMBER_RE.finditer(line):
+            key = match.group("key").lower()
+            if not METRICS or any(token in key for token in METRICS):
+                values.append({"line": lineno, "metric": key, "value": float(match.group("value"))})
+    return {"path": str(path), "metric_count": len(values), "error_count": len(errors), "metrics": values, "errors": errors}


Reading the entire log file into memory at once using path.read_text().splitlines() can cause high memory usage or Out-Of-Memory (OOM) crashes when processing large log files, which are common in benchmark and distributed runtime environments.

Additionally, if any log file is missing or unreadable, the script will crash with a traceback, interrupting the entire reporting pipeline.

Recommendation:

Open the file and iterate over it line-by-line to keep the memory footprint constant.

Wrap the file reading in a try...except block to gracefully handle file access errors and report them in the JSON output.

Suggested change

errors: list[dict[str, object]] = []

values: list[dict[str, object]] = []

for lineno, line in enumerate(path.read_text(encoding="utf-8", errors="replace").splitlines(), 1):

if ERROR_RE.search(line):

errors.append({"line": lineno, "text": line.strip()})

for match in NUMBER_RE.finditer(line):

key = match.group("key").lower()

if not METRICS or any(token in key for token in METRICS):

values.append({"line": lineno, "metric": key, "value": float(match.group("value"))})

return {"path": str(path), "metric_count": len(values), "error_count": len(errors), "metrics": values, "errors": errors}

errors: list[dict[str, object]] = []

values: list[dict[str, object]] = []

try:

with path.open("r", encoding="utf-8", errors="replace") as f:

for lineno, line in enumerate(f, 1):

if ERROR_RE.search(line):

errors.append({"line": lineno, "text": line.strip()})

for match in NUMBER_RE.finditer(line):

key = match.group("key").lower()

if not METRICS or any(token in key for token in METRICS):

values.append({"line": lineno, "metric": key, "value": float(match.group("value"))})

except Exception as e:

return {"path": str(path), "error": f"Failed to read or parse file: {e}"}

return {"path": str(path), "metric_count": len(values), "error_count": len(errors), "metrics": values, "errors": errors}

gemini-code-assist · 2026-07-01T03:27:06Z

+        data = parse(sample)
+        assert data["metric_count"] == 1
+        assert data["error_count"] == 1
+        print(json.dumps({"ok": True, "metric_count": data["metric_count"]}, ensure_ascii=False))


Using assert statements for validation or testing in production scripts is discouraged because assertions are stripped out when Python is run with optimization flags (e.g., python -O). This would cause the self-test to silently pass even if the parsing logic is broken.

Recommendation:
Replace the assert statements with explicit conditional checks that raise a ValueError or another appropriate exception.

Suggested change

data = parse(sample)

assert data["metric_count"] == 1

assert data["error_count"] == 1

print(json.dumps({"ok": True, "metric_count": data["metric_count"]}, ensure_ascii=False))

data = parse(sample)

if data.get("metric_count") != 1 or data.get("error_count") != 1:

raise ValueError(f"Self-test failed: expected 1 metric and 1 error, got {data}")

print(json.dumps({"ok": True, "metric_count": data["metric_count"]}, ensure_ascii=False))

gemini-code-assist · 2026-07-01T03:27:06Z

+    if args.self_test:
+        self_test()
+        return 0
+    print(json.dumps([parse(Path(p)) for p in args.logs], ensure_ascii=False, indent=2))
+    return 0


If the script is executed without any arguments (and without --self-test), it currently prints [] and exits with a success status code (0). This is a silent no-op and can be confusing to users.

Recommendation:
If no log files are provided and --self-test is not specified, print the help message and exit with a non-zero status code to guide the user.

Suggested change

if args.self_test:

self_test()

return 0

print(json.dumps([parse(Path(p)) for p in args.logs], ensure_ascii=False, indent=2))

return 0

if args.self_test:

self_test()

return 0

if not args.logs:

parser.print_help()

return 1

print(json.dumps([parse(Path(p)) for p in args.logs], ensure_ascii=False, indent=2))

return 0

ILikeIneine · 2026-07-02T07:10:57Z

Stop commit the vibe coding generated meaningless PRs or you will be added to blacklist and blocked. Thanks.

Add vLLM MetaX server log summary

9e9fa0f

gemini-code-assist Bot reviewed Jul 1, 2026

View reviewed changes

Harden server log summary CLI

3b97846

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add vLLM-metax vllm metax server log summary#311

Add vLLM-metax vllm metax server log summary#311
ghangz wants to merge 2 commits into
MetaX-MACA:masterfrom
ghangz:mengz/vllm-metax-server-log-summary

ghangz commented Jul 1, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Uh oh!

ILikeIneine commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ghangz commented Jul 1, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

ILikeIneine commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants