Skip to content

CI でテスト / 継続ベンチマークとその結果をホストする#174

Merged
abap34 merged 4 commits intomainfrom
ci/quality-dashboard
Apr 8, 2026
Merged

CI でテスト / 継続ベンチマークとその結果をホストする#174
abap34 merged 4 commits intomainfrom
ci/quality-dashboard

Conversation

@abap34
Copy link
Copy Markdown
Owner

@abap34 abap34 commented Apr 8, 2026

Co-author @codex 🤖

概要

  • make test make bench make coverage を追加し、GitHub Actions から実行できるようにしました
  • PR にテスト・ベンチマーク・カバレッジの要約コメントを付ける CI を追加しました
  • gh-pages/dev/bench/ にベンチマーク履歴を公開する workflow を追加しました
  • ベンチマーク履歴ページは外部ライブラリに依存せず、HTML/CSS/JS の自前実装にしています
  • 最低限の C++ テストランナーとベンチマークランナーを追加しました

変更点

  • Makefiletest bench coverage ターゲットを追加
  • .github/workflows/ci.yml で quality check と PR コメント生成を実施
  • .github/workflows/benchmark-pages.yml で benchmark 履歴を gh-pages に反映
  • scripts/render_ci_comment.py で PR / step summary 用の Markdown を生成
  • scripts/update_benchmark_history.pyhistory.json を更新
  • scripts/benchmark_pages/ に自前の benchmark dashboard を追加
  • tests/test_main.cpptests/benchmark_main.cpp を追加

確認

  • make test
  • make bench BENCH_ARGS='--json --output build/benchmark.json --iterations 2'
  • make coverage
  • act pull_request -W .github/workflows/ci.yml -j quality --container-architecture linux/amd64 --artifact-server-path /tmp/act-artifacts --artifact-server-addr 127.0.0.1
  • act push -W .github/workflows/benchmark-pages.yml -j publish-benchmarks --container-architecture linux/amd64 --artifact-server-path /tmp/act-artifacts --artifact-server-addr 127.0.0.1

act 実行時の結果

  • Tests: 4 passed / 0 failed
  • Benchmarks: 3 benchmarks captured
  • Coverage: 44.1% (606 of 1375 lines)

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

CI Summary

Check Status Details
Tests Success 4 passed / 0 failed
Benchmarks Success 3 benchmark(s) captured
Coverage Success 44.1% (606 of 1375 lines)

Benchmark Means

Benchmark Mean (ms) Min (ms) Max (ms) Iterations
parse_small_document 10.828 10.598 11.040 5
parse_large_document 211.900 204.726 222.362 5
render_large_document 244.540 233.598 266.314 5

Benchmark dashboard: https://abap34.github.io/almo/dev/bench/?pr=174
PR preview: https://abap34.github.io/almo/dev/previews/pr-174/

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

CI からテスト・ベンチ・カバレッジを実行し、PR へ品質サマリーをコメント投稿、さらにベンチマーク履歴を GitHub Pages(gh-pages/dev/bench/)へ公開する仕組みを追加する PR です。

Changes:

  • make test / bench / coverage と最小限の C++ テスト/ベンチランナーを追加
  • GitHub Actions で quality 実行+PR/Step Summary 向け Markdown 生成&投稿を追加
  • ベンチ履歴(history.json)更新と、外部依存なしのベンチダッシュボードを gh-pages へデプロイ

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
Makefile test/bench/coverage ターゲットと CI 向けビルド設定を追加
.github/workflows/ci.yml テスト/ベンチ/カバレッジ実行、サマリー生成、PR への sticky コメントを追加
.github/workflows/benchmark-pages.yml main push 時にベンチ実行→履歴更新→gh-pages へ反映する workflow を追加
scripts/render_ci_comment.py ログ/JSON から CI サマリー Markdown を生成
scripts/update_benchmark_history.py ベンチ JSON を history.json に追記/更新(最大件数制限あり)
scripts/benchmark_pages/index.html ベンチ履歴ダッシュボードの HTML を追加
scripts/benchmark_pages/styles.css ダッシュボードのスタイルを追加
scripts/benchmark_pages/app.js history.json 読み込み・タブ/チャート/テーブル描画を実装
scripts/setup.sh ビルド生成物の raw string 化処理を微修正(改行付与)
tests/test_main.cpp 最小テストランナーと基本テストケースを追加
tests/benchmark_main.cpp 最小ベンチランナー(JSON 出力対応)を追加
.gitignore CI 生成物 .ci を ignore 追加

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/benchmark_main.cpp
Comment on lines +71 to +99
BenchmarkResult run_benchmark(const std::string& name,
const std::function<void()>& body,
int iterations, std::size_t work_units) {
std::vector<double> samples;
samples.reserve(iterations);

body();

for (int i = 0; i < iterations; ++i) {
const auto start = std::chrono::steady_clock::now();
body();
const auto end = std::chrono::steady_clock::now();
const std::chrono::duration<double, std::milli> elapsed = end - start;
samples.push_back(elapsed.count());
}

const auto [min_it, max_it] =
std::minmax_element(samples.begin(), samples.end());
const double total =
std::accumulate(samples.begin(), samples.end(), 0.0);

return {
name,
iterations,
work_units,
*min_it,
total / static_cast<double>(samples.size()),
*max_it,
};
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_benchmark() assumes iterations >= 1, but --iterations is parsed from CLI without validation. If a user passes 0 (or a negative value), samples becomes empty and std::minmax_element(samples.begin(), samples.end()) is undefined behavior; negative values can also make reserve(iterations) request a huge allocation due to signed-to-unsigned conversion. Validate iterations after parsing (and/or inside run_benchmark) and return a clear error before benchmarking when it is < 1.

Copilot uses AI. Check for mistakes.
Comment thread tests/benchmark_main.cpp
Comment on lines +102 to +116
std::string to_json(const std::vector<BenchmarkResult>& results) {
std::ostringstream oss;
oss << std::fixed << std::setprecision(3);
oss << "{\n \"benchmarks\": [\n";

for (std::size_t i = 0; i < results.size(); ++i) {
const auto& result = results[i];
oss << " {\n";
oss << " \"name\": \"" << result.name << "\",\n";
oss << " \"iterations\": " << result.iterations << ",\n";
oss << " \"work_units\": " << result.work_units << ",\n";
oss << " \"min_ms\": " << result.min_ms << ",\n";
oss << " \"mean_ms\": " << result.mean_ms << ",\n";
oss << " \"max_ms\": " << result.max_ms << '\n';
oss << " }";
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON writer prints result.name directly into a quoted JSON string without escaping. If benchmark names ever contain ", newlines, or other control characters, the output becomes invalid JSON (and can break downstream parsing/rendering). Consider adding a minimal JSON string escaping helper for the name field (and any other string fields you may add later).

Copilot uses AI. Check for mistakes.
Comment thread scripts/update_benchmark_history.py Outdated
}

entries = history_payload.get("entries", [])
if entries and entries[-1]["commit"] == args.commit and entries[-1]["branch"] == args.branch:
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If history.json exists but was produced by an older schema (or was manually edited), entries[-1]["commit"] / entries[-1]["branch"] can raise KeyError. Using .get() (or validating required keys) would make the updater more robust and prevent the Pages workflow from failing unexpectedly on pre-existing history files.

Suggested change
if entries and entries[-1]["commit"] == args.commit and entries[-1]["branch"] == args.branch:
if (
entries
and entries[-1].get("commit") == args.commit
and entries[-1].get("branch") == args.branch
):

Copilot uses AI. Check for mistakes.
Comment thread scripts/update_benchmark_history.py Outdated
entries.append(entry)

history_payload["repo"] = args.repo
history_payload["entries"] = entries[-args.max_entries :]
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

history_payload["entries"] = entries[-args.max_entries :] behaves unexpectedly when --max-entries 0 is passed (because -0 is 0, the slice keeps all entries instead of none). If max-entries is meant to allow disabling history retention, handle 0 explicitly (e.g., store an empty list).

Suggested change
history_payload["entries"] = entries[-args.max_entries :]
history_payload["entries"] = [] if args.max_entries == 0 else entries[-args.max_entries :]

Copilot uses AI. Check for mistakes.
Comment thread scripts/benchmark_pages/app.js Outdated
Comment on lines +42 to +50
function renderHeader(history) {
const latest = history.entries[history.entries.length - 1];
document.getElementById("repo-name").textContent = history.repo;
document.getElementById("latest-commit").innerHTML =
latest.run_url
? `<a href="${latest.run_url}">${shortCommit(latest.commit)}</a>`
: shortCommit(latest.commit);
document.getElementById("latest-run").textContent = new Date(latest.timestamp).toLocaleString();
}
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code assumes history.entries is non-empty. If history.json is missing entries (e.g., first-time publish mistake or a corrupted file), history.entries[history.entries.length - 1] becomes undefined and subsequent property access throws, leaving a blank page. Add a guard for empty/missing entries and render a friendly message instead of crashing.

Copilot uses AI. Check for mistakes.
Comment thread scripts/benchmark_pages/app.js Outdated
Comment on lines +45 to +66
document.getElementById("latest-commit").innerHTML =
latest.run_url
? `<a href="${latest.run_url}">${shortCommit(latest.commit)}</a>`
: shortCommit(latest.commit);
document.getElementById("latest-run").textContent = new Date(latest.timestamp).toLocaleString();
}

function renderLatestTable(entries) {
const latest = entries[entries.length - 1];
const tbody = document.getElementById("latest-table");
tbody.innerHTML = "";

latest.benchmarks.forEach((benchmark) => {
const row = document.createElement("tr");
row.innerHTML = `
<td>${benchmark.name}</td>
<td>${benchmark.mean_ms.toFixed(3)}</td>
<td>${benchmark.min_ms.toFixed(3)}</td>
<td>${benchmark.max_ms.toFixed(3)}</td>
<td>${benchmark.iterations}</td>
`;
tbody.appendChild(row);
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several places build HTML via innerHTML using values loaded from history.json (e.g. latest.run_url, latest.commit, benchmark.name). If that JSON is ever malformed or attacker-controlled, this becomes an XSS vector on the published GitHub Pages site. Prefer creating DOM nodes and assigning textContent/href directly (and avoid string interpolation into HTML) to keep the dashboard safe by construction.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
Comment on lines +9 to +12
permissions:
contents: read
pull-requests: write

Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Workflow permissions grant pull-requests: write for both pull_request and push events. Since the workflow only posts PR comments on pull_request, this is broader than necessary on push runs. Consider reducing default permissions (e.g., keep contents: read globally and move pull-requests: write to a dedicated PR-only job/workflow) to follow least-privilege.

Copilot uses AI. Check for mistakes.
Comment thread scripts/benchmark_pages/styles.css Outdated
}

.page-shell {
width: min(100vw - 20px, 1120px);
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the mobile media query, width: min(100vw - 20px, 1120px); is not valid CSS because 100vw - 20px must be wrapped in calc(...) when used as a length expression. As written, browsers may drop the whole declaration, breaking the responsive layout. Use min(1120px, calc(100vw - 20px)) (or similar) for a valid value.

Suggested change
width: min(100vw - 20px, 1120px);
width: min(1120px, calc(100vw - 20px));

Copilot uses AI. Check for mistakes.
@abap34
Copy link
Copy Markdown
Owner Author

abap34 commented Apr 8, 2026

fix #84 , #168

@abap34 abap34 changed the title CI に品質レポートとベンチマークダッシュボードを追加 CI でテスト / 継続ベンチマークとその結果をホストする Apr 8, 2026
@abap34 abap34 merged commit dcfaf41 into main Apr 8, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants