Consolidate SKILL-REPORT filename matching into shared API module #1765
Conversation
Pull request overview
Adds a new “Confidence Level per Skill” panel to the repository health dashboard by extracting the “Average Confidence” metric from per-skill SKILL-REPORT markdown files and surfacing it via the dashboard API.
Changes:
- Adds a new dashboard panel section for “Confidence Level per Skill”.
- Extends the dashboard client to fetch/render the per-skill confidence metric with pass/fail filtering and a progress-bar visualization.
- Updates the `/api/test-results/{date}` function to parse SKILL-REPORT.md blobs and include `averageConfidence` in the per-skill JSON response.
Summary per file:
| File | Description |
|---|---|
| dashboard/index.html | Adds the new “Confidence Level per Skill” panel container to the dashboard layout. |
| dashboard/assets/dashboard.js | Implements loading + rendering logic for the confidence-per-skill panel and hooks it into initialization. |
| dashboard/api/src/functions/getTestResults.ts | Parses SKILL-REPORT.md files to compute averageConfidence and returns it alongside existing skill stats. |
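The API change parses the "Average Confidence" metric out of SKILL-REPORT.md markdown. The exact report line format is not shown in this thread, so the following is only a minimal sketch of such an extraction, assuming a line shaped like `**Average Confidence:** 87.5%`; the function name is hypothetical.

```typescript
// Hypothetical sketch: extract "Average Confidence" from a SKILL-REPORT.md blob.
// The assumed line shape ("**Average Confidence:** 87.5%") is not confirmed by this PR.
function parseAverageConfidence(markdown: string): number | null {
  const match = markdown.match(/Average Confidence[:*\s]*([\d.]+)\s*%/i);
  if (!match) return null;
  const value = parseFloat(match[1]);
  // Return the value on the 0–100 scale the API response uses, or null if unparsable.
  return Number.isFinite(value) ? value : null;
}
```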
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 3
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Adds a new Dashboard panel that surfaces “Confidence Level per Skill” by extracting the Average Confidence metric from per-skill *-SKILL-REPORT.md artifacts and exposing it via the existing test-results API.
Changes:
- Added a new “Confidence Level per Skill” panel section to the dashboard HTML layout.
- Implemented client-side loading/rendering for the confidence table (including filtering/collapsing behavior).
- Extended `GET /api/test-results/{date}` to parse `*-SKILL-REPORT.md` files and return `averageConfidence` per skill.
Summary per file:
| File | Description |
|---|---|
| dashboard/index.html | Adds a new dashboard panel container for “Confidence Level per Skill”. |
| dashboard/assets/dashboard.js | Fetches averageConfidence from /api/test-results/{date} and renders a sortable, filterable list with progress bars. |
| dashboard/api/src/functions/getTestResults.ts | Collects SKILL-REPORT blobs, extracts “Average Confidence”, and returns it as averageConfidence in the API response. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 2
dashboard/assets/dashboard.js (outdated)
```js
 * A skill is considered passing when its average rate is >= SIR_THRESHOLD (80%).
 */
async function loadConfidenceLevelPerSkill() {
  const section = document.getElementById("panel-confidence-level");
  if (!section) return;

  try {
    const { latestDate, skillResults } = await fetchLatestTestResults();

    if (!latestDate) {
      renderConfidenceLevelPanel(section, [], "skip", 0, 0, null);
      return;
    }

    // Build one entry per skill using its averageConfidence from the SKILL-REPORT
    const skills = [];
    for (const [skillName, stats] of Object.entries(skillResults)) {
      if (stats.averageConfidence === null || stats.averageConfidence === undefined) continue;
      // averageConfidence is stored as 0–100 in the API response
      skills.push({ skillName, rate: stats.averageConfidence / 100 });
    }

    // Sort: below threshold first (worst first), then ascending within each group
    skills.sort(function (a, b) {
      const aPass = a.rate >= SIR_THRESHOLD;
      const bPass = b.rate >= SIR_THRESHOLD;
      if (aPass !== bPass) return aPass ? 1 : -1;
      return a.rate - b.rate;
    });
```
This panel reuses SIR_THRESHOLD / SIR_THRESHOLD_PCT (documented as the minimum passing rate for skill invocation tests) to judge/label “confidence level”. Even if the numeric threshold is intentionally the same (80%), it couples two different metrics and makes the code harder to reason about. Consider introducing a dedicated CONFIDENCE_THRESHOLD (and % constant) or renaming the shared constant to something metric-agnostic if it’s meant to apply to both.
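The suggested decoupling could be sketched as follows. `CONFIDENCE_THRESHOLD` and `CONFIDENCE_THRESHOLD_PCT` are hypothetical names for the proposed dedicated constants; only `SIR_THRESHOLD` exists in the current code.

```typescript
// Sketch of the review suggestion: a dedicated constant for the confidence
// metric, even though the value currently matches SIR_THRESHOLD (80%).
// The constant and function names here are hypothetical.
const CONFIDENCE_THRESHOLD = 0.8;                             // minimum passing confidence (0–1)
const CONFIDENCE_THRESHOLD_PCT = CONFIDENCE_THRESHOLD * 100;  // for display labels

function isConfidencePassing(rate: number): boolean {
  return rate >= CONFIDENCE_THRESHOLD;
}
```

With a dedicated constant, the two metrics can later diverge without touching shared code.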
Agent-Logs-Url: https://github.com/microsoft/GitHub-Copilot-for-Azure/sessions/58a5e2d6-7576-4192-9b1e-c91d17393cef
Co-authored-by: msalaman <28122166+msalaman@users.noreply.github.com>
Pull request overview
Adds a new dashboard panel to surface per-skill “Average Confidence” values (sourced from SKILL-REPORT artifacts) alongside existing skill health panels.
Changes:
- Add a new “Confidence Level per Skill” panel to the dashboard UI and initialize it on page load.
- Extend the dashboard API's `GET /api/test-results/{date}` response to include `averageConfidence` per skill by parsing `*-SKILL-REPORT.md` blobs.
- Render a per-skill confidence table with pass/fail thresholding and sorting.
Summary per file:
| File | Description |
|---|---|
| dashboard/index.html | Adds a new panel section for “Confidence Level per Skill”. |
| dashboard/assets/dashboard.js | Fetches averageConfidence and renders the new confidence panel with threshold-based status. |
| dashboard/api/src/functions/getTestResults.ts | Parses SKILL-REPORT markdown to compute and return averageConfidence per skill. |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 1
```ts
const stats = computeSkillStats(results);
const confValues = confidenceBySkill.get(skillName);
if (confValues && confValues.length > 0) {
  stats.averageConfidence = confValues.reduce((a, b) => a + b, 0) / confValues.length;
}
```
averageConfidence is computed as an unweighted average across all matching *-SKILL-REPORT.md files for a skill. On dates with multiple runs (or multiple azure-deploy test groups), this can skew the final value because each report contributes equally regardless of how many tests/runs it represents. Consider either selecting a single canonical report per skill (e.g., most recent) or computing a weighted average (e.g., weight by Total Individual Runs / Total Test Cases extracted from the report).
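The reviewer's weighted-average alternative could look like the sketch below. The `{ confidence, runs }` sample shape is an assumption; the actual reports expose "Total Individual Runs" / "Total Test Cases", which would need to be parsed alongside the confidence value.

```typescript
// Sketch of the weighted-average suggestion: weight each report's confidence
// by the number of runs it represents, instead of averaging reports equally.
// The ReportSample shape is hypothetical.
interface ReportSample {
  confidence: number; // 0–100, from the report's "Average Confidence"
  runs: number;       // e.g. the report's "Total Individual Runs"
}

function weightedAverageConfidence(samples: ReportSample[]): number | null {
  const totalRuns = samples.reduce((sum, s) => sum + s.runs, 0);
  if (totalRuns === 0) return null; // no runs recorded → no meaningful average
  const weighted = samples.reduce((sum, s) => sum + s.confidence * s.runs, 0);
  return weighted / totalRuns;
}
```

A report covering nine runs then contributes nine times the weight of a single-run report, instead of skewing the result 50/50.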
This PR is a follow-up to the dashboard confidence-level work (which reads the Average Confidence metric from per-skill `*-SKILL-REPORT.md` files). It addresses review feedback by removing duplicated filename-matching logic in the API to avoid drift between endpoints.

What changed
- Added dashboard/api/src/skillReport.ts with a single exported SKILL_REPORT_PATTERN.
- Updated getReports.ts to import the shared pattern instead of defining a local regex.
- Updated getTestResults.ts to import the same shared pattern.

Result
- Both endpoints now match SKILL-REPORT blobs with the same shared pattern, reducing maintenance risk as report naming evolves.
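The shared module could be as small as the sketch below. The PR only states that a single `SKILL_REPORT_PATTERN` is exported from dashboard/api/src/skillReport.ts; the exact regex and the helper function are assumptions for illustration.

```typescript
// Sketch of the shared skillReport.ts module. The regex body is an assumption
// based on the "*-SKILL-REPORT.md" naming mentioned in this thread.
export const SKILL_REPORT_PATTERN = /-SKILL-REPORT\.md$/i;

// Hypothetical convenience helper: both endpoints can filter blob names
// through the one shared pattern instead of maintaining local copies.
export function isSkillReportBlob(blobName: string): boolean {
  return SKILL_REPORT_PATTERN.test(blobName);
}
```

Centralizing the pattern means a future rename of the report files is a one-line change instead of a multi-endpoint hunt.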