ShadowBrain's AI layer has two responsibilities: semantic search (find thoughts by meaning) and nightly compilation (turn raw captures into structured journal entries). This document covers the embedding pipeline, the vector storage layer, and the planned nightly job architecture.
Status: The vector storage and search layer is implemented. The embedding generation pipeline and nightly AI compilation job are planned (see phases.md). This document describes both the implemented storage layer and the designed architecture so contributors know what exists and what is coming.

```text
┌──────────────────────────────────────────────────────┐
│  Content item created (API, Discord, import)         │
│           │                                          │
│           ▼                                          │
│  ┌─────────────────┐    ┌──────────────────────┐     │
│  │ Embedding       │───▶│ content_vectors      │     │
│  │ generation      │    │ (sqlite-vec vec0)    │     │
│  │ (all-MiniLM-L6) │    │ 384-dim float vector │     │
│  └─────────────────┘    └──────────┬───────────┘     │
│                                    │                 │
│  ┌─────────────────┐               │                 │
│  │ FTS5 index      │◀──────────────┘                 │
│  │ (full-text)     │    Both index the same row      │
│  └─────────────────┘                                 │
│                                                      │
│  ┌─────────────────────────────────────────────────┐ │
│  │ Nightly AI job (planned)                        │ │
│  │ Raw captures → journal entry + title + tags     │ │
│  │ via OpenRouter (configurable model)             │ │
│  └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
```

AI processing is configured via environment variables (synced into the
settings table on startup):

| Variable | Default | Purpose |
|---|---|---|
| `OPENROUTER_API_KEY` | (empty) | API key for OpenRouter LLM calls |
| `AI_MODEL` | `mistralai/mistral-7b-instruct` | Model for nightly compilation |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Local sentence-transformers model |

Get an OpenRouter API key at https://openrouter.ai/keys. See available models at https://openrouter.ai/models.
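
As a rough illustration of how these variables and their defaults might be resolved, the sketch below reads them from a plain key/value map. `loadAiConfig` and the `AiConfig` shape are hypothetical names invented for this example and are not part of the ShadowBrain codebase; only the variable names and default values come from the configuration table.

```typescript
// Hypothetical helper: resolves AI settings from an env-like map.
// Only the variable names and defaults are taken from the documentation.
interface AiConfig {
  openRouterApiKey: string; // empty string means "not configured"
  aiModel: string;
  embeddingModel: string;
}

function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  return {
    openRouterApiKey: env.OPENROUTER_API_KEY ?? "",
    // `||` (not `??`) so that an empty string also falls back to the default.
    aiModel: env.AI_MODEL || "mistralai/mistral-7b-instruct",
    embeddingModel: env.EMBEDDING_MODEL || "all-MiniLM-L6-v2",
  };
}

// Example: only the API key is set, so both models fall back to defaults.
const exampleConfig = loadAiConfig({ OPENROUTER_API_KEY: "sk-example" });
console.log(exampleConfig.aiModel); // "mistralai/mistral-7b-instruct"
```

In a Node.js process the same function could be called with `process.env`. How the values are then synced into the settings table is not shown here.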
The vector storage and search layer is fully implemented in `src/db/vectors.ts`, backed by the sqlite-vec `vec0` virtual table.

- Each `content_items` row has a `rowid` (SQLite internal integer).
- The `content_vectors` virtual table stores a 384-dimensional float vector keyed by the same `rowid`.
- Vectors are generated by the `all-MiniLM-L6-v2` sentence-transformers model (384 dimensions).
- Search uses L2 (Euclidean) distance via sqlite-vec's `MATCH` + `k` syntax.
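
To make the distance metric concrete, here is a minimal brute-force sketch of L2 nearest-neighbor ranking in plain TypeScript. It is not how sqlite-vec executes a query, and `l2Distance` and `nearest` are illustrative names only. The 3-dimensional vectors stand in for the real 384-dimensional ones.

```typescript
// Euclidean (L2) distance between two vectors of equal length.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

interface StoredVector {
  rowid: number;
  vector: number[];
}

// Brute-force k-nearest-neighbor search: score every row, sort ascending.
function nearest(
  rows: StoredVector[],
  query: number[],
  k: number,
): Array<{ rowid: number; distance: number }> {
  return rows
    .map((row) => ({ rowid: row.rowid, distance: l2Distance(row.vector, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

const storedRows: StoredVector[] = [
  { rowid: 1, vector: [0, 0, 0] },
  { rowid: 2, vector: [1, 0, 0] },
  { rowid: 3, vector: [3, 4, 0] },
];
console.log(nearest(storedRows, [0, 0, 0], 2));
// rowid 1 (distance 0) ranks first, then rowid 2 (distance 1)
```

The point of the sketch is the ordering rule: a smaller distance means a closer match, which is the same convention the `distance` field of a search result follows.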

```typescript
import {
  upsertEmbedding,
  getEmbedding,
  vectorSearch,
  deleteEmbedding,
  isVecExtensionLoaded,
  getVectorCount,
} from "@/db/index";
```

| Function | Purpose |
|---|---|
| `upsertEmbedding(db, contentId, embedding)` | Store/update a vector |
| `getEmbedding(db, contentId)` | Retrieve a stored vector (`number[] \| null`) |
| `vectorSearch(db, queryEmbedding, opts)` | K-nearest-neighbor search |
| `deleteEmbedding(db, contentId)` | Remove a vector (called on item delete) |
| `isVecExtensionLoaded(db)` | Check if vec0 is available |
| `getVectorCount(db)` | Count stored vectors |

```typescript
const db = getDb();

// 1. Generate embedding for the query (requires the embedding model)
const queryEmbedding = await generateEmbedding("docker networking");

// 2. Search by similarity
const results = vectorSearch(db, queryEmbedding, {
  limit: 10,
  type: "note", // optional type filter
});

// results: Array<{ ...contentItemFields, distance: number }>
// Lower distance = more similar
```

When a content item is deleted, its embedding must be removed manually, because the vec0 virtual table doesn't support foreign-key cascades. The `DELETE /api/items/[id]` route handles this:

```typescript
// vec0 has no foreign-key cascade, so remove the vector explicitly.
if (isVecExtensionLoaded(db)) {
  deleteEmbedding(db, id);
}
contentItems.delete(db, id);
```

Alongside vector search, every content item is indexed in an FTS5 virtual table (`content_items_search`) for keyword-based search with BM25 ranking. The search helpers in `src/db/search.ts` support:

- Full-text matching with snippet highlighting (`<mark>…</mark>`)
- Type and tag filters
- Pagination
- Visibility-aware filtering

See Database > Full-text search for usage.

| Feature | FTS5 (full-text) | sqlite-vec (semantic) |
|---|---|---|
| Matches by | Keywords / tokens | Meaning / similarity |
| Ranking | BM25 (term frequency) | L2 distance (vector proximity) |
| Requires ext | Built into SQLite | vec0.so (must be built) |
| Query input | Text string | Pre-computed embedding array |
| Good for | Exact phrases, names | Concepts, paraphrases |
Both index the same content_items rows and can be combined for hybrid
search.
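
One common way to combine two ranked result lists is reciprocal rank fusion (RRF), sketched below. This is an assumed approach chosen for illustration; the source does not say how ShadowBrain merges FTS5 and vector results. `fuseRankings` is a hypothetical helper, and the constant 60 is the value conventionally used with RRF, not a project setting.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// so items ranked highly by both FTS5 and vector search rise to the top.
function fuseRankings(rankings: number[][], k = 60): number[] {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

const ftsIds = [7, 3, 9]; // rowids ordered by BM25 relevance
const vectorIds = [3, 5, 7]; // rowids ordered by ascending L2 distance
console.log(fuseRankings([ftsIds, vectorIds]));
// order: 3, 7, 5, 9 (items found by both searches come first)
```

RRF only needs rank positions, which sidesteps the problem that BM25 scores and L2 distances are on unrelated scales.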
Not yet implemented. The architecture below is the designed plan from architecture.md and phases.md.
The nightly job processes the day's raw captures and produces:
- Journal entries — AI-compiled daily summaries with a title and tags.
- Auto-tagging — suggests tags for new content based on existing taxonomy.
- Auto-titling — generates titles for untitled captures.
- Link suggestions (optional) — detects relationships between items (contradictions, builds-upon, inspired-by).

- Trigger: cron job or systemd timer (the architecture diagram shows a `shadowbrain-cron` container alongside the app).
- Model: configurable via `AI_MODEL` (default Mistral 7B via OpenRouter).
- Grounding: all prompts are grounded in the user's own data, so the AI only sees content the user created.
- Output: journal entries are stored as `type: "journal"` content items with typed links back to the source captures.

- User-owned data only: no external context injection.
- Respect visibility flags: `is_private` items are excluded from AI context unless explicitly opted in per thread.
- Idempotent: re-running the job for a day that already has a journal entry should update, not duplicate.
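
Since the nightly job is not implemented yet, the following is only a sketch of how the visibility and idempotency principles could look in code. Every name in it (`Capture`, `JournalEntry`, `selectForAi`, `upsertJournal`, the `threadId` field, and the in-memory `Map` used as storage) is a hypothetical stand-in; only the `is_private` flag comes from the text above.

```typescript
// Hypothetical shapes; the real content_items schema may differ.
interface Capture {
  id: string;
  content: string;
  is_private: boolean;
  threadId?: string; // assumed field used for per-thread opt-in
}

interface JournalEntry {
  date: string; // ISO day, e.g. "2024-05-01"
  title: string;
  body: string;
  sourceIds: string[]; // captures the entry was compiled from
}

// Respect visibility flags: private captures are dropped unless their
// thread has been explicitly opted in.
function selectForAi(captures: Capture[], optedInThreads: Set<string>): Capture[] {
  return captures.filter(
    (c) => !c.is_private || (c.threadId !== undefined && optedInThreads.has(c.threadId)),
  );
}

// Idempotent write: at most one journal entry per day, replaced on re-run.
function upsertJournal(store: Map<string, JournalEntry>, entry: JournalEntry): void {
  store.set(entry.date, entry);
}

const journals = new Map<string, JournalEntry>();
const day = "2024-05-01";
upsertJournal(journals, { date: day, title: "Draft", body: "...", sourceIds: ["a"] });
upsertJournal(journals, { date: day, title: "Final", body: "...", sourceIds: ["a", "b"] });
console.log(journals.size); // 1: the second run replaced the first
```

Keying the write on the day is what makes a re-run safe. A real implementation would need an equivalent uniqueness rule in the database rather than in memory.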
Not yet implemented. The storage layer (`upsertEmbedding`, `vectorSearch`) is ready; the generation step (calling the `all-MiniLM-L6-v2` model to produce vectors) is a planned feature.

The designed flow:

- A content item is created (via API, Discord capture, or import).
- An embedding is generated from the item's `content` (and optionally `title`) using the local `all-MiniLM-L6-v2` model.
- The 384-dim float vector is stored via `upsertEmbedding(db, contentId, vector)`.
- On content update, the embedding is regenerated and upserted.
- On delete, `deleteEmbedding` cleans up the vector.

The generation step can run synchronously (on create/update) or in a background job (batch processing new items). The storage API is designed to support either approach.
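
One way to keep both options open is to turn item events into small tasks that can be run immediately or queued for a batch job. The sketch below assumes this design; it is not the planned implementation. `ItemEvent`, `EmbeddingTask`, `toEmbeddingTask`, `runTask`, `Embedder`, and `VectorStore` are hypothetical names, with `VectorStore` standing in for `upsertEmbedding` and `deleteEmbedding`.

```typescript
// Hypothetical event and task shapes for the designed flow.
type ItemEvent =
  | { kind: "created" | "updated"; id: number; content: string; title?: string }
  | { kind: "deleted"; id: number };

type EmbeddingTask =
  | { action: "embed-and-upsert"; id: number; text: string }
  | { action: "delete"; id: number };

// Stand-ins for the embedding model and the storage helpers.
type Embedder = (text: string) => Promise<number[]>;
interface VectorStore {
  upsert(id: number, vector: number[]): void;
  remove(id: number): void;
}

// Pure mapping from an event to the work it implies. Create and update
// share one path because the storage call is an upsert.
function toEmbeddingTask(event: ItemEvent): EmbeddingTask {
  if (event.kind === "deleted") {
    return { action: "delete", id: event.id };
  }
  const text = event.title ? `${event.title}\n\n${event.content}` : event.content;
  return { action: "embed-and-upsert", id: event.id, text };
}

// Executes one task; callable inline from a request handler or from a
// background loop that drains a queue of tasks.
async function runTask(task: EmbeddingTask, embed: Embedder, store: VectorStore): Promise<void> {
  if (task.action === "delete") {
    store.remove(task.id);
    return;
  }
  const vector = await embed(task.text);
  if (vector.length !== 384) {
    throw new Error(`expected 384 dimensions, got ${vector.length}`);
  }
  store.upsert(task.id, vector);
}
```

Because `toEmbeddingTask` is synchronous and side-effect free, the choice between inline and batched execution only affects where `runTask` is called.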