feat(otel): ax otel — OTLP receiver coverage + freshness view#609
Merged
The OTLP receiver was write-only: harness telemetry landed in `otel_metric_point` / `otel_log_event` / `otel_span` and only enriched existing insights via `telemetry_of`, with no surface to inspect whether telemetry was even flowing or being correlated.

`ax otel [--days=N] [--json]` adds the read path:

- per (harness, signal) all-time row count + freshness → health verdict (✓ flowing <6h / ⚠ stale <48h / ✗ cold / · none);
- session correlation coverage (share of windowed sessions carrying a `telemetry_of` edge); a live 0% loudly flags telemetry arriving but the correlation pass drawing no edges;
- OTLP claude cost metric vs transcript cost as an independent cross-check (per-event log token sums intentionally not surfaced, since they double-count).

Read-only `db` query (deref-free; signals all-time, coverage+cost windowed), MCP tool `otel`, and the 4 doc gates (CLAUDE.md, llms.txt, cli-reference, VISIBLE_COMMANDS).

OTLP stays content-stripped on purpose: the prompts/tool I/O another tool would scrape from request bodies are already in turn/tool_call from transcript parsing.

Deferred: `ax doctor` OTLP nudge (doctor is runtime:"none", no DB) and a studio trace waterfall.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
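The freshness thresholds in the commit message (flowing <6h, stale <48h, otherwise cold, none when there are no rows) can be sketched as a small pure function. This is a minimal illustration only: `freshnessVerdict`, the `Verdict` type, and the reading of "none" as "no rows at all" are assumptions made for the example, not the code in this PR.

```typescript
// Hypothetical sketch of the (harness, signal) health verdict.
// Thresholds come from the commit message; names and types are illustrative.
type Verdict = "flowing" | "stale" | "cold" | "none";

const HOUR_MS = 60 * 60 * 1000;

// lastSeen: timestamp of the newest row for one (harness, signal) pair,
// or null when the pair has no rows (assumed meaning of "none").
function freshnessVerdict(lastSeen: Date | null, now: Date): Verdict {
  if (lastSeen === null) return "none";
  const ageMs = now.getTime() - lastSeen.getTime();
  if (ageMs < 6 * HOUR_MS) return "flowing"; // ✓ newest row under 6h old
  if (ageMs < 48 * HOUR_MS) return "stale"; // ⚠ newest row under 48h old
  return "cold"; // ✗ rows exist, but nothing recent
}
```

Passing `now` in explicitly keeps the function deterministic, so the thresholds can be unit-tested without touching the database or the clock.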
Deploying ax with
| Latest commit: | b45db53 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://1af94773.ax-62d.pages.dev |
| Branch Preview URL: | https://feat-608-feat-otel-ax-otel-o.ax-62d.pages.dev |
… edge

The first `ax otel` cut counted `telemetry_of` edges for coverage, which read a hard 0% - the edge is empty (separate bug, #610). But the edge is also the wrong thing to measure: no enrichment query reads it (telemetry-rollup.ts joins `session_id` directly), so coverage should too.

Coverage is now: share of windowed TOP-LEVEL sessions whose uuid matches an otel `session_id`. otel stores a bare uuid; `session.id` is `session:⟨uuid⟩`, so a `bareUuid` helper compares uuids in JS. Subagents (`*-subagent` ids, no uuid) are excluded - OTLP is emitted at the top-level session, never per-subagent.

Live: 153/279 top-level sessions (54.8%), a true number, replacing the edge-dependent 0%. Docs (CLAUDE.md / llms.txt / cli-reference) updated to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
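The uuid-to-uuid comparison described above can be sketched as follows. This is a hedged illustration: the commit names a `bareUuid` helper but does not show its body, so the implementation here, the `coverage` function, and the exact id shapes (the `session:⟨uuid⟩` wrapper and `-subagent` suffix) are assumptions based only on the description.

```typescript
// Hypothetical sketch: coverage = share of top-level sessions whose bare uuid
// appears among the otel session_id values. Not the PR's actual code.
const SESSION_PREFIX = "session:";
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the bare uuid for a top-level session id, or null for ids that
// carry no uuid (for example subagent ids, which are excluded from coverage).
function bareUuid(sessionId: string): string | null {
  if (!sessionId.startsWith(SESSION_PREFIX)) return null;
  // Strip the record-id escaping (⟨…⟩ or backticks) if present.
  const body = sessionId
    .slice(SESSION_PREFIX.length)
    .replace(/^[⟨`]/, "")
    .replace(/[⟩`]$/, "");
  return UUID_RE.test(body) ? body.toLowerCase() : null;
}

function coverage(
  sessionIds: readonly string[],
  otelSessionIds: readonly string[],
): { covered: number; total: number; pct: number } {
  const otel = new Set(otelSessionIds.map((id) => id.toLowerCase()));
  let total = 0;
  let covered = 0;
  for (const id of sessionIds) {
    const uuid = bareUuid(id);
    if (uuid === null) continue; // subagent or malformed id: not counted
    total += 1;
    if (otel.has(uuid)) covered += 1;
  }
  const pct = total === 0 ? 0 : (covered / total) * 100;
  return { covered, total, pct };
}
```

With the live numbers from the commit, 153 covered out of 279 top-level sessions gives `(153 / 279) * 100 ≈ 54.8`.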
Necmttn added a commit that referenced this pull request Jun 25, 2026
…emental (#610)

The pass drew ZERO edges: `type::record("session:" + session_id)` evaluated the concat as arithmetic for hyphenated uuids -> `session:019fbf3f` (dropped everything after the first hyphen), so the IN-check never matched. otel `session_id` is a bare uuid while `session.id` is the escaped `session:⟨uuid⟩` record, so we now match uuid-to-uuid in JS instead of trusting type::record.

Also reshaped to be cheap on the ingest hot path (this runs after EVERY ingest, including the watcher's `--since=1`):

- SESSION-GRAIN: one edge per top-level session that has telemetry (the edge means "session has telemetry"; no data query reads it - enrichment joins session_id directly), not one per otel row. Codex emits ~1.5M log rows for a few hundred sessions; row-grain would write millions of edges nobody consumes.
- INCREMENTAL: candidates = existing-but-unlinked sessions, probed with `session_id IN [...]` over the schema's session_id index, chunked at 500 (the telemetry-rollup.ts pattern). The old full `GROUP BY session_id` enumerated all 1.5M log rows (~8s) on every ingest; this scales with new sessions only.
- idempotency is the in-memory `linked` set (drives candidates), replacing the per-row `count(<-telemetry_of)=0` graph traversal.

Live (session-grain): 155 edges, idempotent re-run adds 0. Coverage itself reads session_id directly (PR #609) and never depended on this edge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
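The chunked `IN [...]` probe and the in-memory `linked` set can be illustrated with two small helpers. This is a sketch under stated assumptions: `chunk` and `planEdges` are hypothetical names, the database probe itself is left out, and only the batch size of 500 and the session-grain, idempotent behaviour are taken from the commit message.

```typescript
// Hypothetical sketch of the two ideas above; not the PR's actual code.

// Split candidate ids into batches so each `session_id IN [...]` probe
// stays bounded (the commit chunks at 500).
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Session-grain + idempotent: given session uuids that have telemetry
// (possibly repeated, one per otel row), return each unlinked uuid once
// and record it in `linked` so a re-run plans no further edges.
function planEdges(withTelemetry: readonly string[], linked: Set<string>): string[] {
  const toLink: string[] = [];
  for (const uuid of withTelemetry) {
    if (linked.has(uuid)) continue;
    linked.add(uuid);
    toLink.push(uuid);
  }
  return toLink;
}
```

Running `planEdges` twice over the same input returns an empty list the second time, which mirrors the "idempotent re-run adds 0" result reported above.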
The verify gate requires every visible subcommand to appear in README.md or docs/cli.md. Add the OTLP receiver health section + the `otel` MCP tool entry. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Necmttn added a commit that referenced this pull request Jun 25, 2026
…emental (#611)

* fix(otel): correlate telemetry_of by uuid match, session-grain + incremental (#610)

  The pass drew ZERO edges: `type::record("session:" + session_id)` evaluated the concat as arithmetic for hyphenated uuids -> `session:019fbf3f` (dropped everything after the first hyphen), so the IN-check never matched. otel `session_id` is a bare uuid while `session.id` is the escaped `session:⟨uuid⟩` record, so we now match uuid-to-uuid in JS instead of trusting type::record.

  Also reshaped to be cheap on the ingest hot path (this runs after EVERY ingest, including the watcher's `--since=1`):

  - SESSION-GRAIN: one edge per top-level session that has telemetry (the edge means "session has telemetry"; no data query reads it - enrichment joins session_id directly), not one per otel row. Codex emits ~1.5M log rows for a few hundred sessions; row-grain would write millions of edges nobody consumes.
  - INCREMENTAL: candidates = existing-but-unlinked sessions, probed with `session_id IN [...]` over the schema's session_id index, chunked at 500 (the telemetry-rollup.ts pattern). The old full `GROUP BY session_id` enumerated all 1.5M log rows (~8s) on every ingest; this scales with new sessions only.
  - idempotency is the in-memory `linked` set (drives candidates), replacing the per-row `count(<-telemetry_of)=0` graph traversal.

  Live (session-grain): 155 edges, idempotent re-run adds 0. Coverage itself reads session_id directly (PR #609) and never depended on this edge.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(otel): window correlation by observed_at + CONCURRENTLY index builds

  Two follow-ups after live-verifying the first cut of this PR:

  1. The "candidates = existing - linked" scan was actually slower (32s cold / 3.4s warm): transcript-only sessions never get telemetry, so they stay candidates and get re-probed every ingest forever - O(all sessions), not O(new). Replaced with a windowed scan: only telemetry observed in the last 2 days, over the `observed_at` index (range scan, ~30ms), then filtered to existing + unlinked sessions in JS. OTLP arrives with its transcript, so recent telemetry is exactly what a fresh ingest needs to link. (Keep the WHERE observed_at-only - a leading `session_id != NONE` defeated the index and full-scanned, 8s.)
  2. Root-caused every DB wedge in this work: a plain `DEFINE INDEX` takes a TABLE LOCK while building, so re-applying the schema (`ax install`) onto an already-large otel_log_event (~1.5M codex log rows) wedges the daemon. All otel index builds are now `CONCURRENTLY` (background build, no lock). Added matching `observed_at` indexes for the windowed scan above.

  Live: 2d window = ~30ms (index used); a 5d replay over real data created 33 correct edges (proper escaping), confirming the relate path. 7 unit tests + 83 otel/schema tests pass.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
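The windowed scan in follow-up 1 can be sketched as a pure filter over recent telemetry rows. This is an illustration only: `TelemetryRow`, `windowedCandidates`, and the in-memory row array are assumptions standing in for the indexed `observed_at` range query, which is not shown in this PR conversation.

```typescript
// Hypothetical sketch of the JS-side filtering after the observed_at range scan.
interface TelemetryRow {
  session_id: string | null; // bare uuid, or null when the row has no session
  observed_at: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function windowedCandidates(
  rows: readonly TelemetryRow[],
  existing: Set<string>, // uuids of sessions already ingested from transcripts
  linked: Set<string>, // uuids that already have a telemetry_of edge
  now: Date,
  windowDays = 2,
): string[] {
  const cutoff = now.getTime() - windowDays * DAY_MS;
  const candidates = new Set<string>();
  for (const row of rows) {
    // In the real pass the index range scan would apply this bound; it is
    // repeated here only so the sketch is self-contained.
    if (row.observed_at.getTime() < cutoff) continue;
    // The missing-session check happens in JS rather than in the WHERE
    // clause, since the commit notes that a leading session_id predicate
    // defeated the index.
    const id = row.session_id;
    if (id === null) continue;
    if (existing.has(id) && !linked.has(id)) candidates.add(id);
  }
  return Array.from(candidates);
}
```

The work done here scales with the amount of recent telemetry rather than with the total number of sessions, which is the property the commit describes.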
Closes #608.
Why
Comparing ax to latitude-llm's OTLP story surfaced a gap: ax's OTLP receiver was write-only. Telemetry lands in `otel_metric_point` / `otel_log_event` / `otel_span` and only enriches existing insights via `telemetry_of`; there was no CLI, no MCP, no way to answer "is telemetry even flowing, and is it being correlated to my sessions?"

(latitude scrapes full prompt/response/tool I/O from OTLP request bodies → cloud. ax deliberately keeps OTLP content-stripped: that content is already in `turn.text` / `tool_call.input_json|output_json` from transcript parsing, so capturing bodies would only duplicate + re-leak it. No change there.)

What

`ax otel [--days=N] [--json]`:

- per `(harness, signal)` all-time row count + freshness → verdict (✓ flowing <6h / ⚠ stale <48h / ✗ cold / · none).
- session correlation coverage: share of windowed sessions carrying a `telemetry_of` edge. A live 0% loudly flags telemetry arriving but the correlation pass drawing no edges.
- OTLP `claude_code.cost.usage` vs transcript cost over the window (per-event log token sums NOT surfaced, since they double-count).

Live output (this machine: receiver currently stale, correlation broken)

The view immediately earns its keep: `telemetry_of` has 0 edges across 5,355 sessions. Filed as follow-up.

ship-checklist

- `ax otel` CLI + `--json`. ✓
- MCP tool `otel` (3 roster pins updated). ✓
- `ax improve` generator / dojo item: not wired (health is a check-on-demand, not a recurring proposal). Skipped intentionally.
- `tsc` 0 errors.
- Deferred: `ax doctor` OTLP-freshness line (doctor is `runtime:"none"`, no DB; health lives in the command); studio trace waterfall.

🤖 Generated with Claude Code