CI でテスト / 継続ベンチマークとその結果をホストする by abap34 · Pull Request #174 · abap34/almo

abap34 · 2026-04-08T15:00:04Z

Co-author @codex 🤖

概要

make test make bench make coverage を追加し、GitHub Actions から実行できるようにしました
PR にテスト・ベンチマーク・カバレッジの要約コメントを付ける CI を追加しました
gh-pages/dev/bench/ にベンチマーク履歴を公開する workflow を追加しました
ベンチマーク履歴ページは外部ライブラリに依存せず、HTML/CSS/JS の自前実装にしています
最低限の C++ テストランナーとベンチマークランナーを追加しました

変更点

Makefile に test bench coverage ターゲットを追加
.github/workflows/ci.yml で quality check と PR コメント生成を実施
.github/workflows/benchmark-pages.yml で benchmark 履歴を gh-pages に反映
scripts/render_ci_comment.py で PR / step summary 用の Markdown を生成
scripts/update_benchmark_history.py で history.json を更新
scripts/benchmark_pages/ に自前の benchmark dashboard を追加
tests/test_main.cpp と tests/benchmark_main.cpp を追加

確認

make test
make bench BENCH_ARGS='--json --output build/benchmark.json --iterations 2'
make coverage
act pull_request -W .github/workflows/ci.yml -j quality --container-architecture linux/amd64 --artifact-server-path /tmp/act-artifacts --artifact-server-addr 127.0.0.1
act push -W .github/workflows/benchmark-pages.yml -j publish-benchmarks --container-architecture linux/amd64 --artifact-server-path /tmp/act-artifacts --artifact-server-addr 127.0.0.1

act 実行時の結果

Tests: 4 passed / 0 failed
Benchmarks: 3 benchmarks captured
Coverage: 44.1% (606 of 1375 lines)

github-actions · 2026-04-08T15:01:29Z

CI Summary

Check	Status	Details
Tests	Success	4 passed / 0 failed
Benchmarks	Success	3 benchmark(s) captured
Coverage	Success	44.1% (606 of 1375 lines)

Benchmark Means

Benchmark	Mean (ms)	Min (ms)	Max (ms)	Iterations
parse_small_document	10.828	10.598	11.040	5
parse_large_document	211.900	204.726	222.362	5
render_large_document	244.540	233.598	266.314	5

Benchmark dashboard: https://abap34.github.io/almo/dev/bench/?pr=174
PR preview: https://abap34.github.io/almo/dev/previews/pr-174/

Copilot

Pull request overview

CI からテスト・ベンチ・カバレッジを実行し、PR へ品質サマリーをコメント投稿、さらにベンチマーク履歴を GitHub Pages（gh-pages/dev/bench/）へ公開する仕組みを追加する PR です。

Changes:

make test / bench / coverage と最小限の C++ テスト/ベンチランナーを追加
GitHub Actions で quality 実行＋PR/Step Summary 向け Markdown 生成＆投稿を追加
ベンチ履歴（history.json）更新と、外部依存なしのベンチダッシュボードを gh-pages へデプロイ

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
Makefile	`test/bench/coverage` ターゲットと CI 向けビルド設定を追加
.github/workflows/ci.yml	テスト/ベンチ/カバレッジ実行、サマリー生成、PR への sticky コメントを追加
.github/workflows/benchmark-pages.yml	main push 時にベンチ実行→履歴更新→gh-pages へ反映する workflow を追加
scripts/render_ci_comment.py	ログ/JSON から CI サマリー Markdown を生成
scripts/update_benchmark_history.py	ベンチ JSON を `history.json` に追記/更新（最大件数制限あり）
scripts/benchmark_pages/index.html	ベンチ履歴ダッシュボードの HTML を追加
scripts/benchmark_pages/styles.css	ダッシュボードのスタイルを追加
scripts/benchmark_pages/app.js	`history.json` 読み込み・タブ/チャート/テーブル描画を実装
scripts/setup.sh	ビルド生成物の raw string 化処理を微修正（改行付与）
tests/test_main.cpp	最小テストランナーと基本テストケースを追加
tests/benchmark_main.cpp	最小ベンチランナー（JSON 出力対応）を追加
.gitignore	CI 生成物 `.ci` を ignore 追加

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-08T15:07:42Z

+BenchmarkResult run_benchmark(const std::string& name,
+                              const std::function<void()>& body,
+                              int iterations, std::size_t work_units) {
+    std::vector<double> samples;
+    samples.reserve(iterations);
+
+    body();
+
+    for (int i = 0; i < iterations; ++i) {
+        const auto start = std::chrono::steady_clock::now();
+        body();
+        const auto end = std::chrono::steady_clock::now();
+        const std::chrono::duration<double, std::milli> elapsed = end - start;
+        samples.push_back(elapsed.count());
+    }
+
+    const auto [min_it, max_it] =
+        std::minmax_element(samples.begin(), samples.end());
+    const double total =
+        std::accumulate(samples.begin(), samples.end(), 0.0);
+
+    return {
+        name,
+        iterations,
+        work_units,
+        *min_it,
+        total / static_cast<double>(samples.size()),
+        *max_it,
+    };


run_benchmark() assumes iterations >= 1, but --iterations is parsed from CLI without validation. If a user passes 0 (or a negative value), samples becomes empty and std::minmax_element(samples.begin(), samples.end()) is undefined behavior; negative values can also make reserve(iterations) request a huge allocation due to signed-to-unsigned conversion. Validate iterations after parsing (and/or inside run_benchmark) and return a clear error before benchmarking when it is < 1.

Copilot · 2026-04-08T15:07:43Z

+std::string to_json(const std::vector<BenchmarkResult>& results) {
+    std::ostringstream oss;
+    oss << std::fixed << std::setprecision(3);
+    oss << "{\n  \"benchmarks\": [\n";
+
+    for (std::size_t i = 0; i < results.size(); ++i) {
+        const auto& result = results[i];
+        oss << "    {\n";
+        oss << "      \"name\": \"" << result.name << "\",\n";
+        oss << "      \"iterations\": " << result.iterations << ",\n";
+        oss << "      \"work_units\": " << result.work_units << ",\n";
+        oss << "      \"min_ms\": " << result.min_ms << ",\n";
+        oss << "      \"mean_ms\": " << result.mean_ms << ",\n";
+        oss << "      \"max_ms\": " << result.max_ms << '\n';
+        oss << "    }";


The JSON writer prints result.name directly into a quoted JSON string without escaping. If benchmark names ever contain ", newlines, or other control characters, the output becomes invalid JSON (and can break downstream parsing/rendering). Consider adding a minimal JSON string escaping helper for the name field (and any other string fields you may add later).

Copilot · 2026-04-08T15:07:43Z

+    }
+
+    entries = history_payload.get("entries", [])
+    if entries and entries[-1]["commit"] == args.commit and entries[-1]["branch"] == args.branch:


If history.json exists but was produced by an older schema (or was manually edited), entries[-1]["commit"] / entries[-1]["branch"] can raise KeyError. Using .get() (or validating required keys) would make the updater more robust and prevent the Pages workflow from failing unexpectedly on pre-existing history files.

Suggested change

if entries and entries[-1]["commit"] == args.commit and entries[-1]["branch"] == args.branch:

if (

entries

and entries[-1].get("commit") == args.commit

and entries[-1].get("branch") == args.branch

):

Copilot · 2026-04-08T15:07:43Z

+        entries.append(entry)
+
+    history_payload["repo"] = args.repo
+    history_payload["entries"] = entries[-args.max_entries :]


history_payload["entries"] = entries[-args.max_entries :] behaves unexpectedly when --max-entries 0 is passed (because -0 is 0, the slice keeps all entries instead of none). If max-entries is meant to allow disabling history retention, handle 0 explicitly (e.g., store an empty list).

Suggested change

history_payload["entries"] = entries[-args.max_entries :]

history_payload["entries"] = [] if args.max_entries == 0 else entries[-args.max_entries :]

Copilot · 2026-04-08T15:07:44Z

+function renderHeader(history) {
+  const latest = history.entries[history.entries.length - 1];
+  document.getElementById("repo-name").textContent = history.repo;
+  document.getElementById("latest-commit").innerHTML =
+    latest.run_url
+      ? `<a href="${latest.run_url}">${shortCommit(latest.commit)}</a>`
+      : shortCommit(latest.commit);
+  document.getElementById("latest-run").textContent = new Date(latest.timestamp).toLocaleString();
+}


This code assumes history.entries is non-empty. If history.json is missing entries (e.g., first-time publish mistake or a corrupted file), history.entries[history.entries.length - 1] becomes undefined and subsequent property access throws, leaving a blank page. Add a guard for empty/missing entries and render a friendly message instead of crashing.

Copilot · 2026-04-08T15:07:44Z

+  document.getElementById("latest-commit").innerHTML =
+    latest.run_url
+      ? `<a href="${latest.run_url}">${shortCommit(latest.commit)}</a>`
+      : shortCommit(latest.commit);
+  document.getElementById("latest-run").textContent = new Date(latest.timestamp).toLocaleString();
+}
+
+function renderLatestTable(entries) {
+  const latest = entries[entries.length - 1];
+  const tbody = document.getElementById("latest-table");
+  tbody.innerHTML = "";
+
+  latest.benchmarks.forEach((benchmark) => {
+    const row = document.createElement("tr");
+    row.innerHTML = `
+      <td>${benchmark.name}</td>
+      <td>${benchmark.mean_ms.toFixed(3)}</td>
+      <td>${benchmark.min_ms.toFixed(3)}</td>
+      <td>${benchmark.max_ms.toFixed(3)}</td>
+      <td>${benchmark.iterations}</td>
+    `;
+    tbody.appendChild(row);


Several places build HTML via innerHTML using values loaded from history.json (e.g. latest.run_url, latest.commit, benchmark.name). If that JSON is ever malformed or attacker-controlled, this becomes an XSS vector on the published GitHub Pages site. Prefer creating DOM nodes and assigning textContent/href directly (and avoid string interpolation into HTML) to keep the dashboard safe by construction.

Copilot · 2026-04-08T15:07:44Z

+permissions:
+  contents: read
+  pull-requests: write
+


Workflow permissions grant pull-requests: write for both pull_request and push events. Since the workflow only posts PR comments on pull_request, this is broader than necessary on push runs. Consider reducing default permissions (e.g., keep contents: read globally and move pull-requests: write to a dedicated PR-only job/workflow) to follow least-privilege.

Copilot · 2026-04-08T15:07:44Z

+  }
+
+  .page-shell {
+    width: min(100vw - 20px, 1120px);


In the mobile media query, width: min(100vw - 20px, 1120px); is not valid CSS because 100vw - 20px must be wrapped in calc(...) when used as a length expression. As written, browsers may drop the whole declaration, breaking the responsive layout. Use min(1120px, calc(100vw - 20px)) (or similar) for a valid value.

Suggested change

width: min(100vw - 20px, 1120px);

width: min(1120px, calc(100vw - 20px));

abap34 · 2026-04-08T16:10:59Z

fix #84 , #168

Add CI quality reporting and benchmark dashboard

4b1bafc

abap34 requested a review from Copilot April 8, 2026 15:01

Copilot started reviewing on behalf of abap34 April 8, 2026 15:02 View session

Copilot AI reviewed Apr 8, 2026

View reviewed changes

abap34 added 3 commits April 9, 2026 00:21

Improve benchmark dashboard for PR review

9165f05

Publish coverage reports with the dashboard

1e07403

Publish PR preview sites on Pages

112ed73

abap34 changed the title ~~CI に品質レポートとベンチマークダッシュボードを追加~~ CI でテスト / 継続ベンチマークとその結果をホストする Apr 8, 2026

abap34 merged commit dcfaf41 into main Apr 8, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI でテスト / 継続ベンチマークとその結果をホストする#174

CI でテスト / 継続ベンチマークとその結果をホストする#174
abap34 merged 4 commits intomainfrom
ci/quality-dashboard

abap34 commented Apr 8, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 8, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

Copilot AI Apr 8, 2026

Uh oh!

abap34 commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    if entries and entries[-1]["commit"] == args.commit and entries[-1]["branch"] == args.branch:
+    if (
+        entries
+        and entries[-1].get("commit") == args.commit
+        and entries[-1].get("branch") == args.branch
+    ):

	history_payload["entries"] = entries[-args.max_entries :]
	history_payload["entries"] = [] if args.max_entries == 0 else entries[-args.max_entries :]

	width: min(100vw - 20px, 1120px);
	width: min(1120px, calc(100vw - 20px));

Conversation

abap34 commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

概要

変更点

確認

act 実行時の結果

Uh oh!

github-actions bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CI Summary

Benchmark Means

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

abap34 commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

abap34 commented Apr 8, 2026 •

edited

Loading

github-actions bot commented Apr 8, 2026 •

edited

Loading