feat: companion_insights snapshot table for time-series history

## Summary

`engine.companion_insights` stores one row per user (PK `user_id`) and `InsightRepo::merge` does a JSONB shallow-merge + UPSERT (`crates/eros-engine-store/src/insight.rs:84-112`). The row reflects only the latest state — there is no record of how the JSONB evolved over time.

Anything downstream that wants to observe the timeline (dedup logic, drift analysis, audit trails, etc.) currently has no observation point: by the time a consumer reads, intermediate states are gone.
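To make the loss concrete, here is a minimal std-only sketch of what a shallow-merge UPSERT retains. It is not the code in `insight.rs`: the `Insights` alias, `shallow_merge`, and `final_state` are hypothetical names, and JSONB values are modelled as plain strings so no JSON crate is needed.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the top-level keys of the `insights` JSONB object.
/// Values are raw JSON text; a real implementation would use a JSON type.
type Insights = BTreeMap<String, String>;

/// Shallow merge: each top-level key in `patch` overwrites the same key in `current`.
/// Keys absent from `patch` are left untouched.
fn shallow_merge(current: &mut Insights, patch: &Insights) {
    for (key, value) in patch {
        current.insert(key.clone(), value.clone());
    }
}

/// Applies the patches in order and returns only the final merged state,
/// which is all a single-row UPSERT keeps around.
fn final_state(patches: &[Insights]) -> Insights {
    let mut row = Insights::new();
    for patch in patches {
        shallow_merge(&mut row, patch);
    }
    row
}
```

If one merge sets `mood` to `"curious"` and a later one sets it to `"tired"`, a reader of the row only ever sees `"tired"`; the earlier value cannot be recovered from the table.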

## Proposal

Add an append-only snapshot table plus a periodic sweeper that mirrors `pipeline::dreaming::sweeper` in shape but makes **no** LLM call and performs **no** transformation.

### Schema

```sql
CREATE TABLE engine.companion_insights_snapshot (
    id              UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID        NOT NULL,
    insights        JSONB       NOT NULL,
    training_level  DOUBLE PRECISION NOT NULL,
    captured_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_companion_insights_snapshot_user_time
    ON engine.companion_insights_snapshot (user_id, captured_at DESC);
```

### Sweeper

Background `tokio::spawn`'d task:

- Tick: `SNAPSHOT_TICK_SECS` (default 300), `SNAPSHOT_DISABLED=1` to disable
- Each tick: for every user whose `companion_insights.updated_at` is newer than that user's `MAX(captured_at)` (or who has no snapshot yet), insert one row carrying the current `insights` + `training_level`
- Optional cheap optimisation: skip the insert if the JSONB equals the latest existing snapshot — the UPSERT in `InsightRepo::merge` bumps `updated_at` even when the merged JSONB didn't actually change, so polling alone would produce duplicate rows
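The per-tick decision can be sketched as a pure function, independent of the database layer. This is an illustration under assumptions, not the proposed implementation: `InsightRow`, `LatestSnapshot`, and `plan_tick` are hypothetical names, user ids are `u32` instead of UUIDs, timestamps are epoch seconds, and JSONB is compared as text (Postgres `jsonb` equality is semantic, so a real query would compare the columns directly). A user with no snapshot yet is treated as due, since comparing `updated_at` against a NULL `MAX(captured_at)` would not match in SQL.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory view of one `companion_insights` row.
#[derive(Clone, Debug, PartialEq)]
struct InsightRow {
    insights: String,    // JSONB rendered as text, for this sketch only
    training_level: f64, // carried into the snapshot, not used for the decision
    updated_at: u64,     // seconds since epoch
}

/// Hypothetical view of the newest snapshot row for one user.
#[derive(Clone, Debug, PartialEq)]
struct LatestSnapshot {
    insights: String,
    captured_at: u64,
}

/// Returns the sorted user ids that should get a new snapshot row this tick.
///
/// A user is due when there is no snapshot yet, or when `updated_at` is newer
/// than the latest `captured_at`. With `skip_unchanged`, a user whose insights
/// match the latest snapshot is skipped even if `updated_at` was bumped.
fn plan_tick(
    current: &HashMap<u32, InsightRow>,
    latest: &HashMap<u32, LatestSnapshot>,
    skip_unchanged: bool,
) -> Vec<u32> {
    let mut due: Vec<u32> = current
        .iter()
        .filter(|(user_id, row)| match latest.get(*user_id) {
            None => true,
            Some(snap) => {
                row.updated_at > snap.captured_at
                    && !(skip_unchanged && row.insights == snap.insights)
            }
        })
        .map(|(user_id, _)| *user_id)
        .collect();
    due.sort();
    due
}
```

Following the bullet above, the skip rule looks only at `insights`; a change to `training_level` alone would not produce a snapshot while `skip_unchanged` is on, which may or may not be the desired behaviour.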

Explicitly **not** in scope:

- No LLM calls
- No dedup / merge / classification logic
- No writes to `companion_insights` or `human_insights`

This table is pure write-through history capture. Whatever policy a consumer wants (collapsing near-duplicates, semantic merge, retention) is theirs to build on top.

## Why a separate table

`companion_insights` is the "freshest merged JSONB the prompt-builder should read" view, and that role wants single-row UPSERT semantics. Turning it into a stream would either bloat the prompt read path or force a more expensive query. Splitting reads (`companion_insights`) from history (`companion_insights_snapshot`) keeps each table single-purpose.

## Open questions

1. **Polling vs UPSERT trigger.** Polling matches the dreaming-lite cadence story and is simpler; a trigger has zero lag but adds write-path complexity. Polling preferred unless there's a known sub-minute-granularity need.
2. **Retention.** My instinct is to leave TTL to operators rather than ship a built-in cleaner. Open to either.
3. **Initial cadence.** 300s mirrors dreaming-lite; happy to take input on a saner default.
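Whatever default is chosen, the two knobs from the Sweeper section could be resolved roughly as below. This is a sketch under assumptions: `snapshot_tick` and `DEFAULT_TICK_SECS` are hypothetical names, and falling back to the default on an unparsable or zero value is one possible policy, not something the proposal specifies. The function takes the raw values as arguments so it can be exercised without touching the process environment.

```rust
use std::time::Duration;

/// Default cadence from the proposal (300 seconds).
const DEFAULT_TICK_SECS: u64 = 300;

/// Resolves the sweeper interval from the raw values of `SNAPSHOT_TICK_SECS`
/// and `SNAPSHOT_DISABLED`.
///
/// Returns `None` when the sweeper is disabled (`SNAPSHOT_DISABLED=1`).
/// Missing, unparsable, or zero tick values fall back to the default
/// (an assumption of this sketch).
fn snapshot_tick(tick_secs: Option<&str>, disabled: Option<&str>) -> Option<Duration> {
    if disabled == Some("1") {
        return None;
    }
    let secs = tick_secs
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .filter(|secs| *secs > 0)
        .unwrap_or(DEFAULT_TICK_SECS);
    Some(Duration::from_secs(secs))
}
```

At startup the arguments would typically come from `std::env::var("SNAPSHOT_TICK_SECS").ok()` and `std::env::var("SNAPSHOT_DISABLED").ok()`, passed through `as_deref()`.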

Happy to put up a PR if the approach lands.
