Commit 847bb6a

Necmttn and claude authored
fix(ingest): clear-then-derive friction/diagnostic events on full re-derive (#549) (#602)
Derived event rows are written via stable-key UPSERT, so when the current derivation logic stops emitting a given row (e.g. a classifier change like deriveCorrectionEdges no longer flagging Codex image-only messages), the old row is never re-touched - it orphans with a stale ts (the friction_event 1970-epoch residue). A full re-derive could neither overwrite nor remove it.

On a FULL re-derive (no --since window) every friction/diagnostic row is re-emitted from a complete scan, so cmdSignals now clears those two standalone derived tables before re-inserting - orphans vanish. A bare DELETE (no WHERE) sidesteps the DELETE-WHERE-on-indexed-field ghost-row footgun (PR #141). A --since run keeps the UPSERT-only path: clearing there would drop rows whose source falls outside the window.

The gate is the existing full-derive predicate (shouldDeriveAllTimeSkillPairs), the same one that guards all-time skill-pair writes.

Scoped to the two tables named in the issue (friction_event, diagnostic_event); other UPSERT-keyed derived edges can get the same treatment as a follow-up.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
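The orphaning described in the message can be reproduced with a small in-memory model. This is only a sketch: the `Map` stands in for a derived table, and `EventRow`, `upsertAll`, and `clearThenDerive` are hypothetical names that do not appear in the repository.

```typescript
// Hypothetical stand-in for a derived table keyed by a stable id.
type EventRow = { key: string; ts: string };

// UPSERT by stable key: inserts or overwrites a row, but never deletes one.
const upsertAll = (table: Map<string, EventRow>, rows: readonly EventRow[]): void => {
  for (const row of rows) table.set(row.key, row);
};

// Full re-derive: clear the table, then re-insert everything the current logic emits.
const clearThenDerive = (table: Map<string, EventRow>, rows: readonly EventRow[]): void => {
  table.clear();
  upsertAll(table, rows);
};

// The first derivation emits two rows; "b" carries a stale epoch timestamp.
const firstRun: EventRow[] = [
  { key: "a", ts: "2026-05-09T10:00:00.000Z" },
  { key: "b", ts: "1970-01-01T00:00:00.000Z" },
];
// After a classifier change, the logic no longer emits "b".
const secondRun: EventRow[] = [{ key: "a", ts: "2026-05-09T10:00:00.000Z" }];

// UPSERT-only path: row "b" is never touched again, so it survives as an orphan.
const upsertOnly = new Map<string, EventRow>();
upsertAll(upsertOnly, firstRun);
upsertAll(upsertOnly, secondRun);

// Clear-then-derive path: the table ends up holding exactly what secondRun emitted.
const cleared = new Map<string, EventRow>();
upsertAll(cleared, firstRun);
clearThenDerive(cleared, secondRun);

console.log("upsert-only keys:", [...upsertOnly.keys()]);
console.log("clear-then-derive keys:", [...cleared.keys()]);
```

The same model suggests why clearing is unsafe on a windowed run: if `secondRun` covered only part of the source data, `clearThenDerive` would also drop rows that are still valid but fall outside the window.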
1 parent 6d53e32 commit 847bb6a

3 files changed

Lines changed: 39 additions & 0 deletions

apps/axctl/src/ingest/derive-signals.ts

Lines changed: 12 additions & 0 deletions
@@ -5,6 +5,7 @@ import { AppLayer } from "@ax/lib/layers";
 import type { DbError } from "@ax/lib/errors";
 import { executeStatementsWith } from "@ax/lib/shared/statement-exec";
 import {
+  buildClearDiagnosticEventStatement, buildClearFrictionEventStatement,
   buildCorrectedByStatements, buildDiagnosticEventStatements,
   buildFrictionEventStatements, buildProposedStatements,
   buildRecoveredStatements, buildSkillPairStatements,
@@ -248,6 +249,17 @@ export const deriveSignals = Effect.fn("derive.signals")(
         attributes: { "signals.count": recoveryBatch.length },
       }),
     );
+    // On a FULL re-derive, clear the standalone derived-event tables before
+    // re-inserting so a row the current logic no longer emits cannot orphan
+    // with a stale ts (#549). `shouldWriteSkillPairs` is exactly the
+    // full-derive gate (sinceDays undefined / <=0); a --since run keeps the
+    // UPSERT-only path so it never drops rows whose source is out of window.
+    if (shouldWriteSkillPairs) {
+      yield* exec([
+        buildClearFrictionEventStatement(),
+        buildClearDiagnosticEventStatement(),
+      ]).pipe(Effect.withSpan("signals.clear.derived-events"));
+    }
     yield* exec(buildFrictionEventStatements(frictionBatch)).pipe(
       Effect.withSpan("signals.write.friction", {
         attributes: { "signals.count": frictionBatch.length },
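The gate in the hunk above can be sketched in isolation. This is a minimal illustration, assuming the full-derive condition is exactly what the code comment states (`sinceDays` undefined or `<= 0`); `isFullDerive` and `planEventWrites` are hypothetical helpers rather than functions from the repository, and the write statement is a placeholder string, not real query text.

```typescript
// Hypothetical predicate mirroring the comment in the diff: a derive is "full"
// when no --since window is set (sinceDays undefined or <= 0).
const isFullDerive = (sinceDays: number | undefined): boolean =>
  sinceDays === undefined || sinceDays <= 0;

// Clear statements as they appear in the commit.
const CLEAR_STATEMENTS: readonly string[] = [
  "DELETE friction_event RETURN NONE;",
  "DELETE diagnostic_event RETURN NONE;",
];

// Prepend the clears only on a full derive; a windowed run keeps the
// UPSERT-only path so rows whose source is out of window are not dropped.
const planEventWrites = (
  sinceDays: number | undefined,
  writes: readonly string[],
): string[] => (isFullDerive(sinceDays) ? [...CLEAR_STATEMENTS, ...writes] : [...writes]);

// Placeholder standing in for the real UPSERT statements (assumption).
const writes = ["<upsert statements elided>"];

console.log(planEventWrites(undefined, writes)); // full derive: clears, then writes
console.log(planEventWrites(7, writes)); // --since 7: writes only
```

Ordering matters in this sketch: the clears come before the writes in a single list, so rows re-emitted by the current run are inserted into already-empty tables.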

apps/axctl/src/ingest/signals/statements.test.ts

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
import { describe, expect, test } from "bun:test";
22
import type { CorrectionEdge, DerivedDiagnosticEvent, DerivedFrictionEvent } from "./types.ts";
33
import {
4+
buildClearDiagnosticEventStatement,
5+
buildClearFrictionEventStatement,
46
buildCorrectedByStatements,
57
buildDiagnosticEventStatements,
68
buildFrictionEventStatements,
@@ -157,3 +159,14 @@ describe("buildDiagnosticEventStatements", () => {
157159
expect(stmt).toContain('ts: d"2026-05-09T10:00:00.000Z" };');
158160
});
159161
});
162+
163+
describe("clear-table statements (#549 orphan sweep)", () => {
164+
test("bare DELETE (no WHERE) so a full re-derive removes orphaned rows", () => {
165+
// Bare DELETE avoids the DELETE-WHERE-on-indexed-field ghost-row footgun
166+
// (PR #141); only run on a FULL re-derive that re-emits every row.
167+
expect(buildClearFrictionEventStatement()).toBe("DELETE friction_event RETURN NONE;");
168+
expect(buildClearDiagnosticEventStatement()).toBe("DELETE diagnostic_event RETURN NONE;");
169+
expect(buildClearFrictionEventStatement()).not.toContain("WHERE");
170+
expect(buildClearDiagnosticEventStatement()).not.toContain("WHERE");
171+
});
172+
});

apps/axctl/src/ingest/signals/statements.ts

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,20 @@ export const buildRecoveredStatements = (edges: readonly RecoveryEdge[]): string
100100
return `RELATE turn:\`${e.fromTurnKey}\` -> recovered_by:\`${edgeId}\` -> skill:\`${e.skillKey}\` SET ts = d"${e.ts}", error_excerpt = ${excerpt};`;
101101
});
102102

103+
/**
104+
* Truncate the standalone derived event tables before a FULL re-derive (#549).
105+
* These rows are written via stable-key UPSERT, so when the current derivation
106+
* logic stops emitting a given row (e.g. a classifier change), the old row is
107+
* never re-touched and orphans with a stale `ts` (the 1970-epoch residue). On a
108+
* full re-derive every row is re-emitted from a complete scan, so clearing
109+
* first removes orphans safely. A bare `DELETE` (no WHERE) sidesteps the
110+
* DELETE-WHERE-on-indexed-field ghost-row footgun (PR #141). Only safe on a
111+
* full derive - a `--since` run must NOT clear (it would drop rows whose source
112+
* falls outside the window).
113+
*/
114+
export const buildClearFrictionEventStatement = (): string => "DELETE friction_event RETURN NONE;";
115+
export const buildClearDiagnosticEventStatement = (): string => "DELETE diagnostic_event RETURN NONE;";
116+
103117
export const buildFrictionEventStatements = (
104118
events: readonly DerivedFrictionEvent[],
105119
): string[] =>
