Add regression checks for AI Credits math in Go and JavaScript by Copilot · Pull Request #38638 · github/gh-aw

Copilot · 2026-06-11T15:41:17Z

This PR strengthens confidence in AI Credits computation by validating both Go and JavaScript pricing paths against real Copilot usage data from run 27355745225 / job 80833046748. It targets arithmetic correctness, Copilot model-price mapping behavior, and cache-read billing coverage.

Scope: AI Credits computation parity
- Added regression coverage in both runtimes for the same live token-usage samples.
- Ensures both implementations compute identical AIC outcomes for equivalent inputs.
Go: live arithmetic assertions
- Added TestComputeModelInferenceAICLiveCopilotData in pkg/cli/model_costs_test.go.
- Verifies computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", ...) against ai_credits_this_response values from the live artifact.
JavaScript: live arithmetic assertions
- Added matches live Copilot billing arithmetic for gpt-5.4 dated model in actions/setup/js/model_costs.test.cjs.
- Verifies computeInferenceAIC({ provider: "copilot", model: "gpt-5.4-2026-03-05", ... }) against the same live values.
Cache-read regression coverage (Go + JavaScript)
- Added TestComputeModelInferenceAICCopilotCacheRead in pkg/cli/model_costs_test.go.
- Added includes cache read pricing for copilot gpt-5.4 dated model in actions/setup/js/model_costs.test.cjs.
- Verifies cache-read token pricing is included correctly for Copilot (cache_read path), with expected 0.56 AIC for input=1000, output=200, cacheRead=400.
Pricing-model resolution behavior covered
- Validates dated model resolution (gpt-5.4-2026-03-05 → gpt-5.4 pricing entry) and Copilot provider alias handling (copilot → github-copilot) in both paths.

got := computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", 20314, 970, 0, 0, 0)
// expected from live token_usage.jsonl ai_credits_this_response
// want: 6.5335

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-06-11T15:48:39Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #38638 does not have the 'implementation' label and has only 24 new lines of code in business logic directories (threshold is 100). No custom config present.

github-actions · 2026-06-11T15:48:39Z

🧪 Test Quality Sentinel completed test quality analysis.

Copilot

Pull request overview

This PR adds regression tests that validate AI Credits (AIC) inference-cost arithmetic and Copilot provider/model normalization behavior using a fixed set of token-usage samples sourced from a specific GitHub Actions run artifact.

Changes:

Added a Go unit test (TestComputeModelInferenceAICLiveCopilotData) that asserts AIC results for the copilot provider and dated model gpt-5.4-2026-03-05 match expected live-derived values.
Added a JavaScript unit test that asserts the same live-derived samples match computeInferenceAIC behavior, including dated-model resolution and provider aliasing.

Show a summary per file

File	Description
pkg/cli/model_costs_test.go	Adds a table-driven Go regression test verifying AIC math for Copilot usage samples and dated model resolution.
actions/setup/js/model_costs.test.cjs	Adds a Vitest regression test verifying the JS AIC computation matches the same Copilot usage samples and alias/model resolution behavior.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 2/2 changed files
Comments generated: 0

github-actions · 2026-06-11T15:48:53Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-11T15:49:26Z

✅ PR Code Quality Reviewer completed the code quality review.

pelikhan · 2026-06-11T15:51:29Z

@copilot Add tests that involve cache read

github-actions · 2026-06-11T15:55:19Z

🧪 Test Quality Sentinel Report

⚠️ Test Quality Score: 60/100 — Acceptable, with suggestions

Analyzed 2 test(s): 2 design, 0 implementation, 0 critical guideline violations. Both tests validate billing arithmetic against live Copilot API data.

📊 Metrics & Test Classification (2 tests analyzed)

Metric	Value
New/modified tests analyzed	2
✅ Design tests (behavioral contracts)	2 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	0 (0%)
Duplicate test clusters	0
Test inflation detected	Yes (test-only PR: +63 test lines, 0 production lines added)
🚨 Coding-guideline violations	1 (missing assertion message in Go)

Test Classification Details

Test	File	Classification	Issues Detected
`TestComputeModelInferenceAICLiveCopilotData` (4 table rows)	`pkg/cli/model_costs_test.go:65`	✅ Design	No error/edge cases; missing assertion message on `assert.InDelta`
`it("matches live Copilot billing arithmetic for gpt-5.4 dated model")`	`actions/setup/js/model_costs.test.cjs:135`	✅ Design	No error/edge cases

Language Support

Tests analyzed:

🐹 Go (*_test.go): 1 test function (table-driven, 4 rows) — unit (//go:build !integration)
🟨 JavaScript (*.test.cjs): 1 test — vitest

⚠️ Flagged Tests — Requires Review (2 issue(s))

⚠️ `TestComputeModelInferenceAICLiveCopilotData` (`pkg/cli/model_costs_test.go:84`)

Classification: Design test (behavioral contract)
Issue 1 — Missing assertion message (guideline violation): assert.InDelta(t, tt.wantAIC, got, 1e-9) has no descriptive message argument. When this fails, the output is cryptic.
Suggested improvement: Add a message for context:

assert.InDelta(t, tt.wantAIC, got, 1e-9, "AIC for %s should match live billing data", tt.name)

Issue 2 — Happy-path only: All 4 table rows use valid positive token counts. No error paths (e.g., unknown model, zero tokens) are covered.
Suggested improvement: Add an expectError column or a separate test for invalid inputs such as an unrecognised model name.

⚠️ `it("matches live Copilot billing arithmetic for gpt-5.4 dated model")` (`actions/setup/js/model_costs.test.cjs:135`)

Classification: Design test (behavioral contract)
Issue — Happy-path only: All 4 samples use well-formed inputs. No edge cases (e.g., unknown provider/model alias, cacheReadTokens > 0) are tested.
Suggested improvement: Add one sample with cacheReadTokens > 0 to verify the cache-read pricing path, and one with an unknown model to confirm graceful fallback behaviour.

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). No critical guideline violations detected (no mock libraries, build tags present). The missing assertion message and lack of error-case coverage are noted but do not block merge — address in a follow-up for improved debuggability.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §27359120423

🧪 Test quality analysis by Test Quality Sentinel · 152.5 AIC · ⌖ 20.4 AIC · ◷

github-actions

✅ Test Quality Sentinel: 60/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). Both new tests are behavioral design tests that validate AI Credits math against live Copilot API data. Minor suggestions: add assertion messages to the Go assert.InDelta call and consider adding edge-case coverage (cache-read tokens, unknown model) in a follow-up.

github-actions

Skills-Based Review 🧠

Applied /tdd — no blocking issues, but 6 observations worth addressing before this anchors future arithmetic regression suites.

📋 Key Themes & Highlights

Key Themes

Test-fixture inconsistency (Go vs JS): The JS test pins exact pricing in a fixture; the Go test reads from the embedded data/models.json. If gpt-5.4 prices are ever updated, the Go wantAIC values go stale silently.
Debuggability: Generic test names (request_1–request_4) and a for loop in JS make failure triage harder than it needs to be — a one-line naming change and switching to it.each() would give precise failing-sample context.
Precision asymmetry: toBeCloseTo(9) in JS is 2× stricter than InDelta(1e-9) in Go, which subtly undermines the cross-runtime parity guarantee.
Coverage gap: cache_read pricing is present in both fixtures but no sample exercises a non-zero cacheReadTokens — the cache billing path is unvalidated.

Positive Highlights

✅ Excellent idea to anchor regression tests against live run artifacts with exact URLs — future readers know exactly where the expected values came from.
✅ Four samples with varying input/output ratios provide good arithmetic diversity.
✅ Both runtimes covering the same samples is a strong parity signal.
✅ Dated model resolution (gpt-5.4-2026-03-05 → gpt-5.4) and provider alias (copilot → github-copilot) are both exercised, which was the stated goal.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 308 AIC · ⌖ 34.6 AIC

github-actions · 2026-06-11T15:57:35Z

+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", tt.inputTokens, tt.outputTokens, 0, 0, 0)


[/tdd] Go test relies on the embedded production data/models.json, unlike the JS counterpart which injects an explicit fixture — if gpt-5.4 pricing is updated in the catalog, these wantAIC values become stale and the test fails with a misleading arithmetic delta rather than a clear "pricing changed" signal.

💡 Suggestion: use a fixture like the JS test

The JS test uses writeModelsFixture({ "github-copilot": { models: { "gpt-5.4": { cost: { ... } } } } }) to pin the exact prices it is testing. Doing the same in Go (via os.Setenv("GH_AW_MODELS_JSON_PATH", ...)) would make this test a pure arithmetic regression rather than a combined pricing-plus-arithmetic check.

This also means any future price update to gpt-5.4 in data/models.json will require updating wantAIC here — a non-obvious maintenance burden if the relationship is not documented.

github-actions · 2026-06-11T15:57:35Z

+		outputTokens int
+		wantAIC      float64
+	}{
+		{name: "request_1", inputTokens: 20314, outputTokens: 970, wantAIC: 6.5335},


[/tdd] Test case names request_1–request_4 do not read as specifications — a failing test run just says TestComputeModelInferenceAICLiveCopilotData/request_3 with no hint of what makes that case distinctive.

💡 Suggested naming

Encode the distinguishing characteristic in the name. For example:

{name: "gpt54_20k_input_1k_output", inputTokens: 20314, outputTokens: 970, wantAIC: 6.5335}, {name: "gpt54_24k_input_15k_output", inputTokens: 24162, outputTokens: 1534, wantAIC: 8.3415}, {name: "gpt54_54k_input_4k_output", inputTokens: 54255, outputTokens: 4411, wantAIC: 20.18025}, {name: "gpt54_66k_input_low_output", inputTokens: 65750, outputTokens: 224, wantAIC: 16.7735},

This lets a failing test message immediately surface the input range, making it faster to correlate back to the live artifact.

github-actions · 2026-06-11T15:57:35Z

+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", tt.inputTokens, tt.outputTokens, 0, 0, 0)
+			assert.InDelta(t, tt.wantAIC, got, 1e-9)


[/tdd] assert.InDelta is called without a failure message — every other assertion in this file includes a diagnostic string. Without it, a failure just shows the numeric delta, leaving the reader to infer which field or sample was wrong.

💡 Add a failure message

assert.InDelta(t, tt.wantAIC, got, 1e-9, "AIC mismatch for %s (input=%d, output=%d)", tt.name, tt.inputTokens, tt.outputTokens)

github-actions · 2026-06-11T15:57:36Z

+      { inputTokens: 65750, outputTokens: 224, wantAIC: 16.7735 },
+    ];
+
+    for (const sample of samples) {


[/tdd] A plain for loop over samples means the test fails on the first bad value and does not identify which sample caused it — expect errors will just show "at line 171" with no indication of which token-count pair is implicated.

💡 Use it.each() for per-sample test IDs

const samples = [ { inputTokens: 20314, outputTokens: 970, wantAIC: 6.5335 }, { inputTokens: 24162, outputTokens: 1534, wantAIC: 8.3415 }, { inputTokens: 54255, outputTokens: 4411, wantAIC: 20.18025 }, { inputTokens: 65750, outputTokens: 224, wantAIC: 16.7735 }, ]; it.each(samples)( "gpt-5.4 live: $inputTokens in / $outputTokens out → $wantAIC AIC", async ({ inputTokens, outputTokens, wantAIC }) => { const { computeInferenceAIC } = await import("./model_costs.cjs"); const got = computeInferenceAIC({ provider: "copilot", model: "gpt-5.4-2026-03-05", inputTokens, outputTokens, cacheReadTokens: 0, cacheWriteTokens: 0 }); expect(got).toBeCloseTo(wantAIC, 9); } );

Each sample becomes a named subtest, surfacing exactly which input data failed.

github-actions · 2026-06-11T15:57:36Z

+        cacheReadTokens: 0,
+        cacheWriteTokens: 0,
+      });
+      expect(got).toBeCloseTo(sample.wantAIC, 9);


[/tdd] Precision asymmetry with the Go counterpart: toBeCloseTo(value, 9) checks |got − want| < 5e-10, but the Go test uses assert.InDelta(..., 1e-9) which allows up to 1e-9. A JS floating-point computation differing by 6e-10 would fail JS and pass Go — quietly undermining the cross-runtime parity guarantee these tests are meant to provide.

💡 Align tolerances

Choose one tolerance and use it consistently:

JS stricter: change Go to assert.InDelta(t, tt.wantAIC, got, 5e-10, ...)

Go looser: change JS to toBeCloseTo(sample.wantAIC, 8) (which checks within 5e-9)

Since the values are exact floating-point arithmetic with the given prices, 1e-9 or tighter should be achievable on both sides.

github-actions · 2026-06-11T15:57:36Z

+            cost: {
+              input: "0.0000025",
+              output: "0.000015",
+              cache_read: "0.00000025",


[/tdd] The fixture includes cache_read: "0.00000025" but every sample passes cacheReadTokens: 0 — the cache-hit cost path is never exercised. This is a coverage gap in a test explicitly designed as an arithmetic regression.

💡 Add a cache-read sample

Derive a wantAIC that includes cache read tokens (e.g., from the live artifact if a cache hit is present, or synthetically with a known token count):

// synthetic: 10000 cache-read tokens at 0.00000025 = 0.0025 USD = 0.25 AIC { inputTokens: 10000, outputTokens: 0, cacheReadTokens: 10000, wantAIC: /* input+cache */ }

Without this, a bug that zeroes out cache-read billing would pass all four existing assertions.

github-actions

Non-blocking observations on the new live-data tests

Two test-only files changed, adding live-billing regression tests for gpt-5.4-2026-03-05. The math checks out (verified manually), the model-alias resolution is correct, and the JS test is properly fixture-isolated. Two non-blocking observations inline.

📋 Review summary

What was checked

Arithmetic correctness of all four wantAIC values against the fixture/embedded pricing ✓
Model fuzzy-match path: gpt-5.4-2026-03-05 → gpt-5.4 via longest-prefix logic ✓
Provider alias: "copilot" → "github-copilot" ✓
cacheWriteTokens=0 / cacheReadTokens=0 edge cases ✓

Issues found

(medium) Go test couples wantAIC values to live models.json pricing — no fixture isolation
(low) JS test uses precision=9 vs. precision=6 everywhere else in the file

Neither is blocking.

🔎 Code quality review by PR Code Quality Reviewer · ⌖ 13.7 AIC

github-actions · 2026-06-11T16:02:57Z

+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", tt.inputTokens, tt.outputTokens, 0, 0, 0)


Go test is not fixture-isolated: the wantAIC values are coupled to the live models.json embedded at compile time, not to a pinned fixture.

💡 Why this matters and how to fix it

The JS counterpart writes its own fixture with writeModelsFixture, so its expectations are independent of models.json. This Go test does not — it relies on gpt-5.4 pricing in the embedded data/models.json matching the hardcoded expected values exactly.

When gpt-5.4 pricing is updated in models.json (corrected rates, model deprecation, etc.), this test fails with a numeric mismatch that has no obvious connection to the pricing file. The failure message will show something like expected 6.5335 but got 7.2100, leaving the developer wondering why a live-data test broke after an unrelated JSON change.

Consider using an environment-variable override analogous to GH_AW_MODELS_JSON_PATH (the JS side) or temporarily swapping the embedded bytes, e.g.:

// write a temp models.json with the known gpt-5.4 prices dir := t.TempDir() modelsPath := filepath.Join(dir, "models.json") // ... write fixture ... t.Setenv("GH_AW_MODELS_JSON_PATH", modelsPath)

If the Go implementation has no equivalent override, the test comment should at least say which prices it depends on and point to data/models.json so the coupling is explicit.

github-actions · 2026-06-11T16:02:57Z

+        cacheReadTokens: 0,
+        cacheWriteTokens: 0,
+      });
+      expect(got).toBeCloseTo(sample.wantAIC, 9);


Inconsistent toBeCloseTo precision: this test uses precision=9 (tolerance 5e-10), 1000× tighter than the precision=6 used everywhere else in this file.

💡 Details

The two existing toBeCloseTo calls use precision=6:

line 56: expect(aic).toBeCloseTo(0.54825, 6)

line 104: expect(aicViaEmbedded).toBeCloseTo(aicViaProvider, 6)

The new test uses precision=9. The arithmetic here is straightforward (a few multiplications + additions on ~20k–65k token counts), so there is no correctness reason to demand sub-nanosecond precision. Using a tighter-than-needed tolerance:

Creates an unexplained inconsistency that will confuse the next person writing a test in this file.

Could cause spurious failures if a future refactor introduces a legitimate intermediate rounding step.

precision=6 (or even 4) is sufficient to verify the billing arithmetic is correct.

github-actions · 2026-06-11T16:02:57Z

+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := computeModelInferenceAIC("copilot", "gpt-5.4-2026-03-05", tt.inputTokens, tt.outputTokens, 0, 0, 0)


Implicit prefix-match resolution is not documented and silently breakable: gpt-5.4-2026-03-05 resolves to gpt-5.4 today via longest-HasPrefix, but nothing in the test documents this or guards against it changing.

💡 Details

findModelPricing selects the catalog entry with the longest matching prefix of the normalized model ID. If data/models.json gains a new entry like gpt-5.4-2026 in the github-copilot provider, it becomes the new longest prefix for gpt-5.4-2026-03-05 and silently overrides the pricing. The wantAIC values would then be wrong, but the test would fail for a completely opaque reason.

Minimally, add a comment stating which pricing record is expected (gpt-5.4, input=2.5e-6, output=1.5e-5) and how the wantAIC values were derived. Better yet, this is another reason to add fixture isolation: a test-local models.json with only gpt-5.4 removes the ambiguity entirely.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-11T16:09:25Z

@copilot Add tests that involve cache read

Added cache-read coverage in both runtimes:

Go: TestComputeModelInferenceAICCopilotCacheRead in pkg/cli/model_costs_test.go
JS: includes cache read pricing for copilot gpt-5.4 dated model in actions/setup/js/model_costs.test.cjs

Addressed in commits 311e5d8 and c111ec0.

test: validate AI credits arithmetic with live Copilot usage data

670d047

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 11, 2026

Copilot AI changed the title ~~test: verify AI credits arithmetic against live Copilot usage data~~ Add live-data regression checks for AI Credits math in Go and JavaScript Jun 11, 2026

Copilot created this pull request from a session on behalf of pelikhan June 11, 2026 15:42 View session

Copilot finished work on behalf of pelikhan June 11, 2026 15:42

Copilot AI requested a review from pelikhan June 11, 2026 15:42

pelikhan marked this pull request as ready for review June 11, 2026 15:44

Copilot AI review requested due to automatic review settings June 11, 2026 15:44

Copilot started reviewing on behalf of pelikhan June 11, 2026 15:45 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

Copilot started work on behalf of pelikhan June 11, 2026 15:51 View session

github-actions Bot mentioned this pull request Jun 11, 2026

[aw] No-Op Runs #38496

Closed

github-actions Bot approved these changes Jun 11, 2026

View reviewed changes

github-actions Bot reviewed Jun 11, 2026

View reviewed changes

Copilot AI and others added 2 commits June 11, 2026 16:05

test: cover copilot cache-read AIC math in Go and JS

311e5d8

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

test: document cache-read expectation math in AIC regression tests

c111ec0

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~Add live-data regression checks for AI Credits math in Go and JavaScript~~ Add regression checks for AI Credits math in Go and JavaScript Jun 11, 2026

Copilot finished work on behalf of pelikhan June 11, 2026 16:10

pelikhan closed this Jun 11, 2026

Conversation

Copilot AI commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pelikhan commented Jun 11, 2026

Uh oh!

github-actions Bot commented Jun 11, 2026

🧪 Test Quality Sentinel Report

Test Classification Details

Language Support

⚠️ TestComputeModelInferenceAICLiveCopilotData (pkg/cli/model_costs_test.go:84)

⚠️ it("matches live Copilot billing arithmetic for gpt-5.4 dated model") (actions/setup/js/model_costs.test.cjs:135)

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Skills-Based Review 🧠

Key Themes

Positive Highlights

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Non-blocking observations on the new live-data tests

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

Copilot AI commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

⚠️ `TestComputeModelInferenceAICLiveCopilotData` (`pkg/cli/model_costs_test.go:84`)

⚠️ `it("matches live Copilot billing arithmetic for gpt-5.4 dated model")` (`actions/setup/js/model_costs.test.cjs:135`)