AI Processing

ShadowBrain's AI layer has two responsibilities: semantic search (find thoughts by meaning) and nightly compilation (turn raw captures into structured journal entries). This document covers the embedding pipeline, the vector storage layer, and the planned nightly job architecture.

Status: The vector storage and search layer is implemented. The embedding generation pipeline and nightly AI compilation job are planned (see phases.md). This document describes both the implemented storage layer and the designed architecture so contributors know what exists and what is coming.

Overview

┌──────────────────────────────────────────────────────┐
│  Content item created (API, Discord, import)         │
│                      │                               │
│                      ▼                               │
│  ┌─────────────────┐    ┌──────────────────────┐    │
│  │ Embedding       │───▶│ content_vectors      │    │
│  │ generation      │    │ (sqlite-vec vec0)    │    │
│  │ (all-MiniLM-L6) │    │ 384-dim float vector │    │
│  └─────────────────┘    └──────────┬───────────┘    │
│                                   │                  │
│  ┌─────────────────┐               │                 │
│  │ FTS5 index      │◀──────────────┘                 │
│  │ (full-text)     │  Both index the same row         │
│  └─────────────────┘                                 │
│                                                       │
│  ┌─────────────────────────────────────────────────┐ │
│  │  Nightly AI job (planned)                        │ │
│  │  Raw captures → journal entry + title + tags     │ │
│  │  via OpenRouter (configurable model)             │ │
│  └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘

Configuration

AI processing is configured via environment variables (synced into the settings table on startup):

Variable	Default	Purpose
`OPENROUTER_API_KEY`	(empty)	API key for OpenRouter LLM calls
`AI_MODEL`	`mistralai/mistral-7b-instruct`	Model for nightly compilation
`EMBEDDING_MODEL`	`all-MiniLM-L6-v2`	Local sentence-transformers model

Get an OpenRouter API key at https://openrouter.ai/keys. See available models at https://openrouter.ai/models.
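The defaults in the table can be resolved with a small helper. This is a minimal sketch, not the project's actual startup code: the `AiConfig` shape and the function names are hypothetical, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical config shape; the field names are illustrative, not taken from the codebase.
interface AiConfig {
  openRouterApiKey: string;
  aiModel: string;
  embeddingModel: string;
}

// Resolve AI settings from an environment-like map, applying the documented defaults.
// Empty strings are treated the same as unset variables.
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  return {
    openRouterApiKey: env["OPENROUTER_API_KEY"] || "",
    aiModel: env["AI_MODEL"] || "mistralai/mistral-7b-instruct",
    embeddingModel: env["EMBEDDING_MODEL"] || "all-MiniLM-L6-v2",
  };
}

// The nightly job calls OpenRouter, so it needs a non-empty API key.
function canRunNightlyJob(config: AiConfig): boolean {
  return config.openRouterApiKey.trim().length > 0;
}
```

In a Node process the helper would typically be called as `loadAiConfig(process.env)`.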

Embedding storage layer (implemented)

The vector storage and search layer is fully implemented in src/db/vectors.ts, backed by the sqlite-vec vec0 virtual table.

How it works

Each content_items row has a rowid (SQLite internal integer).
The content_vectors virtual table stores a 384-dimensional float vector keyed by the same rowid.
Vectors are generated by the all-MiniLM-L6-v2 sentence-transformers model (384 dimensions).
Search uses L2 (Euclidean) distance via sqlite-vec's MATCH + k syntax.
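To make the ranking concrete, here is an in-memory sketch of what a k-nearest-neighbor search by L2 distance computes. It is a brute-force illustration only; sqlite-vec performs the equivalent work inside the vec0 table, and the function names here are hypothetical.

```typescript
// Euclidean (L2) distance between two equal-length vectors.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Brute-force k-nearest-neighbor search over an in-memory map of rowid -> vector.
// Results are sorted by ascending distance, so the closest match comes first.
function nearest(
  query: number[],
  vectors: Map<number, number[]>,
  k: number,
): Array<{ rowid: number; distance: number }> {
  return Array.from(vectors.entries())
    .map(([rowid, vec]) => ({ rowid, distance: l2Distance(query, vec) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}
```

Real embeddings have 384 dimensions; the arithmetic is the same for any length.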

API

import {
  upsertEmbedding,
  getEmbedding,
  vectorSearch,
  deleteEmbedding,
  isVecExtensionLoaded,
  getVectorCount,
} from "@/db/index";

Function	Purpose
`upsertEmbedding(db, contentId, embedding)`	Store/update a vector
`getEmbedding(db, contentId)`	Retrieve a stored vector (`number[] | null`)
`vectorSearch(db, queryEmbedding, opts)`	K-nearest-neighbor search
`deleteEmbedding(db, contentId)`	Remove a vector (called on item delete)
`isVecExtensionLoaded(db)`	Check if `vec0` is available
`getVectorCount(db)`	Count stored vectors
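One detail worth knowing when comparing a vector passed to `upsertEmbedding` with the one returned by `getEmbedding`: sqlite-vec float columns hold 32-bit floats, so values may come back slightly rounded. The sketch below illustrates that round trip with typed arrays. It assumes a compact float32 byte layout; whether src/db/vectors.ts serializes vectors this way is not confirmed here, and both helper names are hypothetical.

```typescript
// Pack a number[] into float32 bytes (4 bytes per dimension).
function toFloat32Bytes(vector: number[]): Uint8Array {
  const f32 = Float32Array.from(vector);
  return new Uint8Array(f32.buffer);
}

// Unpack float32 bytes back into a number[].
function fromFloat32Bytes(bytes: Uint8Array): number[] {
  if (bytes.length % 4 !== 0) {
    throw new Error(`byte length ${bytes.length} is not a multiple of 4`);
  }
  // Copy first so the view starts at offset 0 of its own buffer.
  const copy = bytes.slice();
  return Array.from(new Float32Array(copy.buffer, 0, copy.length / 4));
}
```

A 384-dimensional vector occupies 1536 bytes in this layout. Equality checks in tests should therefore compare against `Math.fround(value)` or use a tolerance rather than strict equality with the original doubles.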

Example: semantic search

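import { vectorSearch } from "@/db/index";

// getDb() is assumed to be the project's database accessor.
const db = getDb();

// 1. Generate an embedding for the query. generateEmbedding belongs to the
//    planned generation pipeline and does not exist yet; any function that
//    returns a 384-dim number[] from all-MiniLM-L6-v2 can stand in for it.
const queryEmbedding = await generateEmbedding("docker networking");

// 2. Search by similarity
const results = vectorSearch(db, queryEmbedding, {
  limit: 10,
  type: "note", // optional type filter
});

// results: Array<{ ...contentItemFields, distance: number }>
// Lower L2 distance = more similar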

Lifecycle hooks

When a content item is deleted, its embedding must be removed explicitly, because the vec0 virtual table doesn't support foreign-key cascades. The DELETE /api/items/[id] route handles this, skipping the vector cleanup when the extension is not loaded:

if (isVecExtensionLoaded(db)) {
  deleteEmbedding(db, id);
}
contentItems.delete(db, id);

Full-text search (FTS5)

Alongside vector search, every content item is indexed in an FTS5 virtual table (content_items_search) for keyword-based search with BM25 ranking. The search helpers in src/db/search.ts support:

Full-text matching with snippet highlighting (<mark>…</mark>)
Type and tag filters
Pagination
Visibility-aware filtering

See Database > Full-text search for usage.

FTS5 vs. vector search

Feature	FTS5 (full-text)	sqlite-vec (semantic)
Matches by	Keywords / tokens	Meaning / similarity
Ranking	BM25 (term frequency)	L2 distance (vector proximity)
Extension required	None (FTS5 is compiled into most SQLite builds)	`vec0.so` (must be built)
Query input	Text string	Pre-computed embedding array
Good for	Exact phrases, names	Concepts, paraphrases

Both index the same content_items rows and can be combined for hybrid search.
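The document does not specify how the two result lists would be merged. One common option is reciprocal rank fusion (RRF), which needs only the rank of each item in each list and so avoids comparing BM25 scores with L2 distances directly. The sketch below is an illustration of that technique, not the project's implementation; the function name and the use of string ids are assumptions.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// where rank starts at 1 for the best hit in each list.
// k = 60 is the constant conventionally used with RRF.
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: ids ordered best-first by each search.
const ftsHits = ["a", "b", "c"];
const vectorHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([ftsHits, vectorHits]);
```

Items that appear in both lists (`a` and `b` here) rise above items found by only one search.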

Nightly AI compilation (planned)

Not yet implemented. The architecture below is the designed plan from architecture.md and phases.md.

The nightly job processes the day's raw captures and produces:

Journal entries — AI-compiled daily summaries with a title and tags.
Auto-tagging — suggests tags for new content based on existing taxonomy.
Auto-titling — generates titles for untitled captures.
Link suggestions (optional) — detects relationships between items (contradictions, builds-upon, inspired-by).

Planned architecture

Trigger: cron job or systemd timer (the architecture diagram shows a shadowbrain-cron container alongside the app).
Model: configurable via AI_MODEL (default Mistral 7B via OpenRouter).
Grounding: all prompts are grounded in the user's own data — the AI only sees content the user created.
Output: journal entries are stored as type: "journal" content items with typed links back to the source captures.
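As a rough illustration of the planned model call, the sketch below builds a request for OpenRouter's OpenAI-compatible chat completions endpoint from a day's captures. The `Capture` shape, the prompt wording, and the function name are hypothetical; only the endpoint, the bearer-token header, and the `model`/`messages` body follow OpenRouter's API.

```typescript
// Hypothetical capture shape for illustration.
interface Capture {
  id: string;
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    messages: Array<{ role: "system" | "user"; content: string }>;
  };
}

// Build (but do not send) the request, so the prompt can be inspected and tested.
function buildCompilationRequest(
  captures: Capture[],
  model: string,
  apiKey: string,
): ChatRequest {
  // Grounding: the user message contains only the user's own captures.
  const notes = captures.map((c) => `- [${c.id}] ${c.content}`).join("\n");
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: {
      model,
      messages: [
        {
          role: "system",
          content:
            "Compile the notes into one journal entry with a title and tags. Use only the notes provided.",
        },
        { role: "user", content: notes },
      ],
    },
  };
}
```

Sending it would look like `fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) })`. Keeping the capture ids in the prompt is one way to let the job create the typed links back to the source captures.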

Prompt design principles

User-owned data only — no external context injection.
Respect visibility flags — is_private items are excluded from AI context unless explicitly opted in per thread.
Idempotent — re-running the job for a day that already has a journal entry should update, not duplicate.
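Two of these principles translate directly into code: filtering private items out of the AI context, and keying journal entries by day so a re-run updates instead of duplicating. The following is a minimal in-memory sketch. The `threadId` field, the opt-in set, and the `Map`-based store are assumptions made for illustration; only `is_private` comes from the text above.

```typescript
// Hypothetical shapes; threadId and the Map-based store are illustrative only.
interface CaptureItem {
  id: string;
  threadId: string;
  content: string;
  is_private: boolean;
}

interface JournalEntry {
  date: string; // one entry per calendar day, e.g. "2024-05-01"
  title: string;
  body: string;
}

// Keep public items, plus private ones whose thread was explicitly opted in.
function selectAiContext(
  items: CaptureItem[],
  optedInThreads: Set<string>,
): CaptureItem[] {
  return items.filter(
    (item) => !item.is_private || optedInThreads.has(item.threadId),
  );
}

// Keyed by date, so re-running the job for a day replaces the entry
// instead of adding a second one.
function upsertJournal(
  journal: Map<string, JournalEntry>,
  entry: JournalEntry,
): "created" | "updated" {
  const outcome = journal.has(entry.date) ? "updated" : "created";
  journal.set(entry.date, entry);
  return outcome;
}
```

In the real database the same effect could come from a uniqueness constraint on the journal date, but that schema detail is not specified here.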

Embedding generation pipeline (planned)

Not yet implemented. The storage layer (upsertEmbedding, vectorSearch) is ready; the generation step (calling the all-MiniLM-L6-v2 model to produce vectors) is a planned feature.

The designed flow:

A content item is created (via API, Discord capture, or import).
An embedding is generated from the item's content (and optionally title) using the local all-MiniLM-L6-v2 model.
The 384-dim float vector is stored via upsertEmbedding(db, contentId, vector).
On content update, the embedding is regenerated and upserted.
On delete, deleteEmbedding cleans up the vector.

The generation step can run synchronously (on create/update) or in a background job (batch processing new items). The storage API is designed to support either approach.
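Because the model call is not implemented yet, the sketch below takes the embedder and the store as injected functions. That keeps it runnable without the model and works for either mode: `indexItem` fits the synchronous create/update path, and `indexBatch` fits a background job. All names and the `NewItem` shape are hypothetical; in the real code the store would be a thin wrapper around `upsertEmbedding(db, contentId, vector)`.

```typescript
// Injected dependencies, so the flow can be exercised without the real model or database.
type Embedder = (text: string) => Promise<number[]>;
type VectorStore = (contentId: number, vector: number[]) => void;

// Hypothetical item shape for illustration.
interface NewItem {
  id: number;
  title?: string;
  content: string;
}

const EMBEDDING_DIM = 384; // output size of all-MiniLM-L6-v2

// Embed the content, optionally prefixed by the title.
function embeddingInput(item: NewItem): string {
  return item.title ? `${item.title}\n\n${item.content}` : item.content;
}

// Create/update path: generate one vector and store it.
async function indexItem(
  item: NewItem,
  embed: Embedder,
  store: VectorStore,
): Promise<void> {
  const vector = await embed(embeddingInput(item));
  if (vector.length !== EMBEDDING_DIM) {
    throw new Error(`expected ${EMBEDDING_DIM} dimensions, got ${vector.length}`);
  }
  store(item.id, vector);
}

// Background path: process a backlog and return the ids that failed,
// so one bad item does not stop the rest of the batch.
async function indexBatch(
  items: NewItem[],
  embed: Embedder,
  store: VectorStore,
): Promise<number[]> {
  const failed: number[] = [];
  for (const item of items) {
    try {
      await indexItem(item, embed, store);
    } catch {
      failed.push(item.id);
    }
  }
  return failed;
}
```

Since storing is an upsert, running the same item through `indexItem` again after a content update simply overwrites its vector.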
