[aw-failures] P1: Codex gpt-5-codex experiment arm 404s on gpt-5-codex-alpha-2025-11-07 — Daily Cache Strategy Analyzer fail [Content truncated due to length] #39140

@github-actions

Recommendation

Fix the gpt-5-codex experiment arm — it 404s 100% of the time because Codex CLI 0.137.0 requests gpt-5-codex-alpha-2025-11-07, a model the api-proxy does not serve. Pin the arm to a provisioned codex model id (or drop the arm) so the Daily Cache Strategy Analyzer stops hard-failing whenever the experiment selects this variant.

Problem statement

The Daily Cache Strategy Analyzer runs an A/B experiment over model_size variants [gpt-5.4, gpt-5-codex]; every run that picks the gpt-5-codex arm fails after exhausting all retries.

Affected workflow and run IDs

  • Workflow: .github/workflows/daily-cache-strategy-analyzer.md (engine codex, model ${{ needs.activation.outputs.model_size }}, experiment arm gpt-5-codex)
  • Representative failed run: §27475817126 — step "Execute Codex CLI", all 4 attempts failed (exitCode=1 totalDuration=1m 7s)

Root cause

The proxy allowlist maps gpt-5-codex → copilot/gpt-5*codex* / openai/gpt-5*codex*, but Codex CLI 0.137.0 internally resolves the arm to the concrete model id gpt-5-codex-alpha-2025-11-07, which the api-proxy (172.30.0.30:10000/responses) rejects with 404 Not Found: Model not found gpt-5-codex-alpha-2025-11-07. The harness fallback re-runs with --model gpt-5-codex, but the CLI still emits the alpha id and 404s again, so retries cannot recover.

Evidence (api-proxy 404, all retries exhausted)
{"type":"error","message":"unexpected status 404 Not Found: Model not found gpt-5-codex-alpha-2025-11-07, url: (172.30.0.30/redacted) ..."}
{"type":"turn.failed","error":{"message":"... Model not found gpt-5-codex-alpha-2025-11-07 ..."}}
WARN codex_models_manager::model_info: Unknown model gpt-5-codex is used. This will use fallback model metadata.
[codex-harness] all 3 retries exhausted — giving up (exitCode=1)
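
To confirm which model id the proxy is rejecting, the JSON events above can be scanned for "Model not found" messages. This is a minimal sketch, assuming the CLI output is available as one event per line; the function name `rejected_models` and the regular expression are illustrative and not part of the harness.

```python
import json
import re

# Sample lines copied from the evidence above (the "..." elisions are kept as-is).
LOG_LINES = [
    '{"type":"error","message":"unexpected status 404 Not Found: Model not found gpt-5-codex-alpha-2025-11-07, url: (172.30.0.30/redacted) ..."}',
    '{"type":"turn.failed","error":{"message":"... Model not found gpt-5-codex-alpha-2025-11-07 ..."}}',
    "WARN codex_models_manager::model_info: Unknown model gpt-5-codex is used. This will use fallback model metadata.",
]

# Assumption: model ids consist only of letters, digits, dots, underscores and hyphens.
MODEL_NOT_FOUND = re.compile(r"Model not found ([A-Za-z0-9._-]+)")


def rejected_models(lines):
    """Return the set of model ids named in 'Model not found' error events."""
    found = set()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text lines (WARN, harness output) are skipped
        if not isinstance(event, dict):
            continue
        # "error" events carry the message at the top level;
        # "turn.failed" events nest it under "error".
        message = event.get("message") or event.get("error", {}).get("message", "")
        found.update(MODEL_NOT_FOUND.findall(message))
    return found


print(rejected_models(LOG_LINES))
```

On the sample lines this reports only the alpha id, which matches the diagnosis that the CLI never sends the plain gpt-5-codex alias.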

Note: Smoke Codex's agent step succeeds in the same window, so the proxy/codex path is healthy for provisioned ids — only the gpt-5-codex→alpha resolution is broken.

Proposed remediation (pick one)

  1. Explicitly set the model id the CLI sends to a provisioned codex model (e.g. -c model=<available-id>) instead of relying on the CLI's gpt-5-codex→alpha default.
  2. Provision gpt-5-codex-alpha-2025-11-07 upstream and add it to the api-proxy model allowlist/pricing table.
  3. Pin/upgrade the Codex CLI to a version whose gpt-5-codex alias resolves to a served model.
  4. Drop the gpt-5-codex arm from the experiment model_size variants until the model is available.
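
Option 4 could also be applied as a pre-flight guard instead of a manual edit: filter the experiment variants down to those whose resolved model id is actually served. The sketch below is hypothetical throughout. `PROVISIONED`, `RESOLVED_IDS` and `usable_variants` are illustrative names, the provisioned set is an assumption (the issue only shows that the gpt-5-codex arm fails), and a real guard would read the proxy's allowlist rather than a hard-coded set.

```python
# Hypothetical: model ids assumed to be served by the api-proxy.
PROVISIONED = {"gpt-5.4"}

# Alias -> concrete id the CLI actually sends, as observed in this issue.
RESOLVED_IDS = {"gpt-5-codex": "gpt-5-codex-alpha-2025-11-07"}


def usable_variants(variants, provisioned, resolved_ids):
    """Keep only the arms whose resolved model id is in the provisioned set.

    Variants without an entry in resolved_ids are assumed to be sent unchanged.
    """
    return [v for v in variants if resolved_ids.get(v, v) in provisioned]


variants = ["gpt-5.4", "gpt-5-codex"]
print(usable_variants(variants, PROVISIONED, RESOLVED_IDS))
```

Under these assumptions the gpt-5-codex arm is dropped until its resolved id is provisioned, at which point adding that id to the set re-enables the arm without touching the variant list.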

Success criteria / verification

  • No 404 ... Model not found gpt-5-codex-alpha-2025-11-07 entries in api-proxy logs for codex runs.
  • Daily Cache Strategy Analyzer gpt-5-codex arm run_success_rate >= 0.90 (the experiment's own guardrail), or the arm is removed.
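
The second criterion can be checked mechanically once per-arm run outcomes are collected. The following is a rough sketch, assuming outcomes are available as (arm, succeeded) pairs; the data shape, the function names and the sample runs are all made up for illustration, and only the 0.90 threshold comes from the criterion above.

```python
from collections import defaultdict

GUARDRAIL = 0.90  # run_success_rate threshold named in the success criteria


def run_success_rate(runs):
    """Compute the success rate per arm from (arm, succeeded) pairs."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for arm, succeeded in runs:
        totals[arm] += 1
        if succeeded:
            successes[arm] += 1
    return {arm: successes[arm] / totals[arm] for arm in totals}


def arms_below_guardrail(runs, threshold=GUARDRAIL):
    """Return the arms whose success rate falls below the threshold, sorted by name."""
    rates = run_success_rate(runs)
    return sorted(arm for arm, rate in rates.items() if rate < threshold)


# Made-up sample: one arm at exactly 0.90, the other failing every run.
sample_runs = (
    [("gpt-5.4", True)] * 9
    + [("gpt-5.4", False)]
    + [("gpt-5-codex", False)] * 3
)
print(arms_below_guardrail(sample_runs))
```

An arm sitting exactly at 0.90 passes, since the criterion is stated as >= 0.90.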

6h Failure Investigation: full window context (2026-06-13 19:13Z, 31 runs)

Failure clusters

| Cluster | Workflow(s) | Run IDs | Class | Tracking |
|---|---|---|---|---|
| Codex gpt-5-codex→alpha 404 | Daily Cache Strategy Analyzer | 27475817126 | config/model · P1 | this issue |
| Claude log-parser guardrail false-fail | Avenger ×4 | 27473035084, 27471579514, 27470367219, 27468988707 | bug · P1 | #39141 |
| upload_artifact 400 → safe_outputs fail | Smoke Copilot/Codex/Claude, Design Decision Gate, PR Sous Chef | 27471858644, 27471836485, 27471836462, 27471832454, 27471203716 | bug · P1 | #38998 (open, recurring) |
| Daily-AIC guardrail false-failure | PR Code Quality Reviewer ×6 | 27474598971, 27471832435, 27471799544, 27471709386, 27469570698, 27468597131 | noise (by-design) | #39079, #39077 (open) |
| Credit-limit test (intentional) | Daily Credit Limit Test | 27467921501 | expected | n/a |
| Copilot CLI: node missing in AWF chroot (exit 127) | Daily Issues Report Generator ×2 | 27472163434, 27470398413 | infra · P2 | roadmap |
| Copilot CLI tool-denial threshold | Daily Formal Spec Verifier | 27471596105 | bug | FIXED by f7fb96b / #39101 (run predates fix) |
| Daily SPDD Spec Planner exec fail | Daily SPDD Spec Planner | 27472157373 | unverified | roadmap |
| Doc build: Git LFS pointer not hydrated | Documentation Unbloat | 27473025009 | infra · P2 | roadmap |
| Smoke Gemini: model has no AI-credits pricing | Smoke Gemini | 27471836499 | config · P2 | roadmap |
| PR-queue GraphQL 502 Bad Gateway | PR Sous Chef ×2 | 27475734213, 27473979242 | transient | none |
| Release: Defender sig update hr=0x80070652 | Release | 27469593101 | transient (external) | none |
| Cancelled / superseded | Smoke CI ×4, Q | 27475351300, 27475289369, 27468719491, 27467752372, 27471149654 | not a failure | n/a |

Existing-issue correlation

Fix roadmap

References: §27475817126 · §27473035084 · §27471858644
