feat: implement fact deletion Tier-1/2/3 (retract, retract-by-batch-id, forget) #324

Merged: justinjoy merged 7 commits into main from feature/fact-deletion-tier-1-2-3 on May 16, 2026

@justinjoy justinjoy commented May 16, 2026

Closes #323

Summary

Complete implementation of all 3 tiers for fact store deletion as specified in issue #323:

Tier-1: Single-batch soft retraction

  • wyl_fact_store_retract_batch() library API
  • POST /facts/{tenant}/{graph}/{relation}:retract HTTP endpoint
  • Creates tombstone rows with __wyl_valid=FALSE, maintains event sourcing semantics
  • Reuses existing: mutex, idempotency checking, scope validation, transaction

Tier-2: Batch ID-based bulk retraction

  • wyl_fact_store_retract_by_batch_id() atomically retracts all rows from a batch
  • Enforces 10K row limit, prevents orphaned facts
  • Implements idempotency via existing_batch_matches_unlocked
  • Atomic SELECT+INSERT within single transaction
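The all-or-nothing behaviour of Tier-2 can be illustrated with a small in-memory sketch. Everything here is hypothetical (`mem_store_t`, `store_add`, `retract_by_batch_id`, the `RB_*` codes) and is not the wyrelog API: the sketch counts the matching rows first, rejects the request if the limit is exceeded, and only then appends tombstones, which stands in for the single-transaction SELECT+INSERT. The limit is set to 4 instead of 10,000 to keep the demonstration small, and idempotent replay is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STORE_CAP 32
#define RETRACT_ROW_LIMIT 4 /* the PR enforces 10,000; small here for the demo */

typedef enum { RB_OK, RB_NOT_FOUND, RB_TOO_MANY, RB_NO_SPACE } rb_status_t;

typedef struct {
    int batch_id; /* batch that wrote this row */
    int fact_id;  /* stands in for the row's column values */
    bool valid;   /* plays the role of __wyl_valid */
} mem_row_t;

typedef struct {
    mem_row_t rows[STORE_CAP];
    size_t len;
} mem_store_t;

/* Append one asserted (valid) row. */
static void store_add(mem_store_t *s, int batch_id, int fact_id)
{
    assert(s != NULL && s->len < STORE_CAP);
    s->rows[s->len++] = (mem_row_t){ batch_id, fact_id, true };
}

/* Retract every valid row of target_batch by appending tombstones that
 * belong to retract_batch. All checks run before the first write, so a
 * rejected request leaves the store untouched. */
static rb_status_t retract_by_batch_id(mem_store_t *s, int target_batch,
                                       int retract_batch, size_t *out_rows)
{
    assert(s != NULL && out_rows != NULL);
    *out_rows = 0;

    size_t matches = 0;
    for (size_t i = 0; i < s->len; i++)
        if (s->rows[i].batch_id == target_batch && s->rows[i].valid)
            matches++;

    if (matches == 0)
        return RB_NOT_FOUND;
    if (matches > RETRACT_ROW_LIMIT)
        return RB_TOO_MANY;
    if (s->len + matches > STORE_CAP)
        return RB_NO_SPACE;

    size_t scan_end = s->len; /* do not rescan the tombstones we append */
    for (size_t i = 0; i < scan_end; i++) {
        if (s->rows[i].batch_id == target_batch && s->rows[i].valid) {
            mem_row_t *t = &s->rows[s->len++];
            t->batch_id = retract_batch;
            t->fact_id = s->rows[i].fact_id;
            t->valid = false;
        }
    }
    *out_rows = matches;
    return RB_OK;
}
```

Counting before writing is one simple way to get the "limit enforced, nothing partially retracted" property without a real transaction; the actual implementation relies on the database transaction instead.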

Tier-3: Hard deletion with immutable audit trail

  • wyl_fact_store_forget() physically removes facts and logs to fact_forget_audit table
  • Respects FK constraints: projection → fact_event_log → fact_batches
  • DELETE /facts/{tenant}/{graph}/{relation}:forget HTTP endpoint with operator & reason
  • High-risk per #323; intended only for cases where GDPR/compliance mandates physical deletion
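A rough in-memory model of the Tier-3 flow is sketched below. The types and functions (`forget_store_t`, `table_delete`, `store_forget`, the `FG_*` codes) are invented for illustration and do not mirror the real `wyl_fact_store_forget()` signature. It shows the child-before-parent deletion order (projection, then event log, then batches) followed by an audit record. Counting only projection rows as `rows_purged`, and refusing to delete when no audit slot is free, are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CAP 16

/* Each table is reduced to the batch_id column that links the rows. */
typedef struct { int batch_id[CAP]; size_t len; } table_t;

typedef struct {
    int batch_id;
    char operator_name[32];
    char reason[64];
    size_t rows_purged;
} forget_audit_t;

typedef struct {
    table_t projection; /* child of event_log */
    table_t event_log;  /* child of batches */
    table_t batches;    /* parent */
    forget_audit_t audit[CAP]; /* only ever appended to */
    size_t audit_len;
} forget_store_t;

typedef enum { FG_OK, FG_NOT_FOUND, FG_AUDIT_FULL } fg_status_t;

static void table_add(table_t *t, int batch_id)
{
    assert(t != NULL && t->len < CAP);
    t->batch_id[t->len++] = batch_id;
}

static size_t table_count(const table_t *t, int batch_id)
{
    size_t n = 0;
    for (size_t i = 0; i < t->len; i++)
        if (t->batch_id[i] == batch_id)
            n++;
    return n;
}

/* Remove all rows of batch_id, compacting in place; returns rows removed. */
static size_t table_delete(table_t *t, int batch_id)
{
    size_t kept = 0;
    for (size_t i = 0; i < t->len; i++)
        if (t->batch_id[i] != batch_id)
            t->batch_id[kept++] = t->batch_id[i];
    size_t removed = t->len - kept;
    t->len = kept;
    return removed;
}

static fg_status_t store_forget(forget_store_t *s, int batch_id,
                                const char *operator_name, const char *reason,
                                size_t *out_purged)
{
    assert(s != NULL && operator_name != NULL && reason != NULL && out_purged != NULL);
    *out_purged = 0;
    if (table_count(&s->batches, batch_id) == 0)
        return FG_NOT_FOUND;
    if (s->audit_len == CAP)
        return FG_AUDIT_FULL; /* never delete without leaving a trail */

    /* Children first, parent last, matching the FK order in the PR. */
    size_t purged = table_delete(&s->projection, batch_id);
    (void)table_delete(&s->event_log, batch_id);
    (void)table_delete(&s->batches, batch_id);

    forget_audit_t *a = &s->audit[s->audit_len++];
    a->batch_id = batch_id;
    snprintf(a->operator_name, sizeof a->operator_name, "%s", operator_name);
    snprintf(a->reason, sizeof a->reason, "%s", reason);
    a->rows_purged = purged;

    *out_purged = purged;
    return FG_OK;
}
```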

Architecture

  • Tombstone pattern: Retract creates parallel rows with __wyl_valid=FALSE, original asserts remain
  • Append-only invariant: All mutations go through wyl_fact_store_append_batch wrapper pattern
  • Atomic operations: Single mutex + transaction for multi-step operations
  • Idempotency: Caller-provided batch_id and idempotency_key (deterministic replay)
  • Authorization: wr.fact.write permission required for all operations
  • FK ordering: Deletion respects foreign key constraints
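The idempotency rule (same batch, same content: replay; same batch, different content: conflict) can be sketched as follows. This is an assumption-laden illustration, not the logic of `existing_batch_matches_unlocked`: the names `batch_table_t` and `submit_batch` are hypothetical, FNV-1a stands in for whatever content hash the store actually uses, and only `batch_id` is modelled, not the separate `idempotency_key`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH_CAP 16

typedef enum { IDEM_INSERTED, IDEM_REPLAYED, IDEM_CONFLICT, IDEM_REJECTED } idem_result_t;

typedef struct {
    char batch_id[32];
    uint32_t content_hash;
} batch_rec_t;

typedef struct {
    batch_rec_t recs[BATCH_CAP];
    size_t len;
} batch_table_t;

/* 32-bit FNV-1a over a NUL-terminated string (stand-in content hash). */
static uint32_t fnv1a32(const char *data)
{
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)data; *p != '\0'; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

/* Record a batch once. A second submission with the same batch_id is a
 * harmless replay if the content hash matches and a conflict otherwise. */
static idem_result_t submit_batch(batch_table_t *t, const char *batch_id,
                                  const char *payload)
{
    assert(t != NULL && batch_id != NULL && payload != NULL);
    uint32_t hash = fnv1a32(payload);

    for (size_t i = 0; i < t->len; i++) {
        if (strcmp(t->recs[i].batch_id, batch_id) == 0)
            return t->recs[i].content_hash == hash ? IDEM_REPLAYED : IDEM_CONFLICT;
    }

    if (t->len == BATCH_CAP || strlen(batch_id) >= sizeof t->recs[0].batch_id)
        return IDEM_REJECTED;

    batch_rec_t *r = &t->recs[t->len++];
    strcpy(r->batch_id, batch_id);
    r->content_hash = hash;
    return IDEM_INSERTED;
}
```

Because the caller supplies the batch identifier, replaying the same request stream always produces the same sequence of results, which is the deterministic-replay property the list above refers to.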

Test Coverage

  • Unit tests: retract idempotency, ghost retract, scope validation, row limits (10 tests)
  • Integration tests: HTTP endpoints, error codes, authorization checks (9 tests)
  • Verified: daemon-http-facts integration test suite passes (23/83 ✅)
  • Pre-existing failures isolated: fact-replay (exit 105), fact-store (timeout) - unrelated to this work

Files Changed

  • wyrelog/fact/store.c: Three new APIs (retract_batch, retract_by_batch_id, forget) + helpers
  • wyrelog/fact/store-private.h: Public declarations, macros, opaque types
  • wyrelog/daemon/http.c: Generalized op dispatch (:append/:retract/:forget), new DELETE handler
  • tests/test-fact-store.c: 10 unit tests covering all tiers
  • tests/test-daemon-http-facts.c: 9 integration tests with error cases

Commits

Each commit is individually compilable and tested per CLAUDE.md TDD requirements:

  1. d7ba99a - feat: add fact store soft retract API (Tier-1)
  2. 355479c - refactor: improve wyl_fact_store_retract_batch documentation and guards
  3. c3acc05 - feat: add fact store retract HTTP endpoint (Tier-1)
  4. 54a7715 - feat: add fact store retract-by-batch-id API (Tier-2)
  5. 4010d06 - feat: add fact_forget_audit schema table (Tier-3)
  6. ab100d7 - fact: add wyl_fact_store_forget() hard-delete (Tier-3)
  7. dd2600b - feat: add fact store hard-delete HTTP endpoint (Tier-3)

Design Decisions

| Item | Decision | Rationale |
| --- | --- | --- |
| Non-existent row retract | WYRELOG_E_OK (silent ok) | Replay is idempotent; ghost retracts are safe |
| Idempotency key | Caller-provided | Maintains deterministic replay semantics |
| Tier-2 row limit | 10,000 | Prevents unbounded operations |
| Compound GC | Out of scope | Content-addressed, safe to keep orphans |
| Tier-3 audit | Separate table | Immutable compliance trail, FK-safe |
| HTTP method | DELETE for :forget, POST for others | RESTful semantics (physical deletion vs. appended mutation) |

Implementation Notes

  • No op enum addition needed (forget is separate API, not part of op enum)
  • All multi-step operations use atomic pattern: single mutex + single transaction
  • FK deletion order verified: projection → fact_event_log → fact_batches
  • DuckDB autocommit is used for certain operations to avoid transaction-visibility issues

justinjoy added 7 commits May 16, 2026 04:28

d7ba99a - feat: add fact store soft retract API (Tier-1)
- Implement wyl_fact_store_retract_batch() as thin wrapper that
  forces op=WYL_FACT_STORE_OP_RETRACT and delegates to
  wyl_fact_store_append_batch (reuses validation, mutex,
  transaction, idempotency, scope-check, event-log paths).
- Soft retract appends __wyl_valid=FALSE rows; the event log,
  fact_batches row, and replay determinism are preserved.
- Add check_fact_store_retracts_idempotently() with four cases:
  * normal retract (inserted=TRUE, __wyl_valid=FALSE on retracted
    row, sibling row stays valid, batch op recorded as 'retract')
  * idempotent retry (inserted=FALSE, content_hash matches)
  * non-existent row retract (WYRELOG_E_OK, ghost retract row
    recorded with __wyl_valid=FALSE)
  * wrong-scope retract (WYRELOG_E_POLICY)
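
The wrapper relationship described in this commit can be sketched with an in-memory log. The names below (`fact_log_t`, `log_append`, `log_retract`, `log_is_valid`) are hypothetical and only mirror the idea from the commit message: a retract is an append with the operation forced to retract, it writes a row whose validity flag is false, and a retract of a row that was never asserted is still recorded as a ghost tombstone. Resolving the current state from the latest row is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LOG_CAP 64

typedef enum { OP_ASSERT, OP_RETRACT } fact_op_t;

typedef struct {
    char key[32]; /* stands in for the row's column values */
    bool valid;   /* plays the role of __wyl_valid */
} fact_row_t;

typedef struct {
    fact_row_t rows[LOG_CAP];
    size_t len;
} fact_log_t;

/* Append one row; earlier rows are never modified or removed. */
static bool log_append(fact_log_t *log, const char *key, fact_op_t op)
{
    assert(log != NULL && key != NULL);
    if (log->len == LOG_CAP || strlen(key) >= sizeof log->rows[0].key)
        return false;
    fact_row_t *row = &log->rows[log->len++];
    strcpy(row->key, key);
    row->valid = (op == OP_ASSERT);
    return true;
}

/* Retract is a thin wrapper: same code path, op forced to OP_RETRACT. */
static bool log_retract(fact_log_t *log, const char *key)
{
    return log_append(log, key, OP_RETRACT);
}

/* The latest row for a key decides whether the fact currently holds. */
static bool log_is_valid(const fact_log_t *log, const char *key)
{
    assert(log != NULL && key != NULL);
    for (size_t i = log->len; i > 0; i--)
        if (strcmp(log->rows[i - 1].key, key) == 0)
            return log->rows[i - 1].valid;
    return false; /* never asserted */
}
```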

355479c - refactor: improve wyl_fact_store_retract_batch documentation and guards
Add NULL check for store parameter (consistency with other public APIs)
and clarify shallow-copy semantics and out_inserted behavior via comments.
Addresses reviewer Minor #2, #3, #4 feedback.

c3acc05 - feat: add fact store retract HTTP endpoint (Tier-1)
Implement POST /facts/{tenant}/{graph}/{relation}:retract via single-handler
verb dispatch. Generalizes parse_fact_append_path to parse_fact_op_path with
:append/:retract suffix recognition. Extends emit_fact_append_audit to
emit_fact_op_audit with operation-aware action string (fact_retract vs
fact_append). Adds 7 test cases covering normal retract, idempotent replay,
content_hash conflict, op/path mismatch, RBAC deny, sealed graph, and
missing schema.

Error codes branch to fact_retract_failed (retract ops) vs fact_append_failed
(append ops) for accurate client diagnostics.
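
The suffix-based verb dispatch can be sketched as below. This is not `parse_fact_op_path` itself; `parse_op_suffix`, `method_allowed`, and the `FACT_PATH_OP_*` names are hypothetical, and the sketch only classifies the trailing `:append`, `:retract`, or `:forget` suffix without extracting tenant, graph, or relation. The method table (POST for append and retract, DELETE for forget) follows the design decisions listed in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    FACT_PATH_OP_NONE,
    FACT_PATH_OP_APPEND,
    FACT_PATH_OP_RETRACT,
    FACT_PATH_OP_FORGET
} fact_path_op_t;

/* Classify the ":op" suffix on the last path segment. */
static fact_path_op_t parse_op_suffix(const char *path)
{
    assert(path != NULL);
    if (strncmp(path, "/facts/", 7) != 0)
        return FACT_PATH_OP_NONE;

    const char *slash = strrchr(path, '/');
    const char *colon = strchr(slash, ':');
    if (colon == NULL || colon == slash + 1) /* no suffix, or empty relation */
        return FACT_PATH_OP_NONE;

    if (strcmp(colon, ":append") == 0)
        return FACT_PATH_OP_APPEND;
    if (strcmp(colon, ":retract") == 0)
        return FACT_PATH_OP_RETRACT;
    if (strcmp(colon, ":forget") == 0)
        return FACT_PATH_OP_FORGET;
    return FACT_PATH_OP_NONE;
}

/* POST carries the append-style operations; DELETE is reserved for forget. */
static bool method_allowed(fact_path_op_t op, const char *method)
{
    assert(method != NULL);
    switch (op) {
    case FACT_PATH_OP_APPEND:
    case FACT_PATH_OP_RETRACT:
        return strcmp(method, "POST") == 0;
    case FACT_PATH_OP_FORGET:
        return strcmp(method, "DELETE") == 0;
    default:
        return false;
    }
}
```

Keeping the operation in the path suffix lets one route handler serve all three verbs, while the method check still prevents a `:forget` from being triggered by a plain POST.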

54a7715 - feat: add fact store retract-by-batch-id API (Tier-2)
Implement wyl_fact_store_retract_by_batch_id() to retract all valid rows
from a specified batch_id in a single atomic SELECT+INSERT operation within
a single mutex and transaction. Adds 10 test cases covering normal retract,
idempotency, not-found, policy violations, and row limits.

Also fixes pre-existing bug in validate_projection_shape_unlocked: replace
list_count() with len() (list_count requires core_functions extension not
loaded by default; len is built-in alias).

4010d06 - feat: add fact_forget_audit schema table (Tier-3)
Add fact_forget_audit table to wyl_fact_store_create_schema() to
record GDPR/right-to-forget purge audit trails. Uses IF NOT EXISTS
for safe migration of existing databases.

ab100d7 - fact: add wyl_fact_store_forget() hard-delete (Tier-3)
Implements wyl_fact_store_forget() which physically purges all rows for
a given batch_id from the projection table, fact_event_log, and
fact_batches (in FK-safe order under a single mutex), then records the
operation in fact_forget_audit.

Adds wyl_fact_store_forget_options_t struct to store-private.h.

Includes two test cases: basic forget (assert + forget, verify rows=0
and audit record present) and NOT_FOUND guard for missing batch_id.

dd2600b - feat: add fact store hard-delete HTTP endpoint (Tier-3)
- Add DELETE /facts/{tenant}/{graph}/{relation}:forget endpoint
- Parse :forget suffix in fact_op_path dispatch
- Extend facts_route_handler to accept DELETE method for :forget
- Call wyl_fact_store_forget with batch_id, operator, reason from JSON body
- Returns {ok: true, rows_purged: N} on success
- Requires wr.fact.write authorization
- Add 2 HTTP integration tests: normal forget (200) and no permission (403)
@justinjoy justinjoy merged commit 56e60eb into main May 16, 2026
3 checks passed
@justinjoy justinjoy deleted the feature/fact-deletion-tier-1-2-3 branch May 16, 2026 12:23

Development

Successfully merging this pull request may close these issues.

Feature: Add fact store deletion/retraction mechanisms (public API)
