AI Processing

ShadowBrain's AI layer has two responsibilities: semantic search (find thoughts by meaning) and nightly compilation (turn raw captures into structured journal entries). This document covers the embedding pipeline, the vector storage layer, and the planned nightly job architecture.

Status: The vector storage and search layer is implemented. The embedding generation pipeline and nightly AI compilation job are planned (see phases.md). This document describes both the implemented storage layer and the designed architecture so contributors know what exists and what is coming.


Overview

┌──────────────────────────────────────────────────────┐
│  Content item created (API, Discord, import)         │
│                      │                               │
│                      ▼                               │
│  ┌─────────────────┐    ┌──────────────────────┐    │
│  │ Embedding       │───▶│ content_vectors      │    │
│  │ generation      │    │ (sqlite-vec vec0)    │    │
│  │ (all-MiniLM-L6) │    │  384-dim float vector │    │
│  └─────────────────┘    └──────────┬───────────┘    │
│                                   │                  │
│  ┌─────────────────┐               │                 │
│  │ FTS5 index      │◀──────────────┘                 │
│  │ (full-text)     │  Both index the same row         │
│  └─────────────────┘                                 │
│                                                       │
│  ┌─────────────────────────────────────────────────┐ │
│  │  Nightly AI job (planned)                        │ │
│  │  Raw captures → journal entry + title + tags     │ │
│  │  via OpenRouter (configurable model)             │ │
│  └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘

Configuration

AI processing is configured via environment variables (synced into the settings table on startup):

| Variable | Default | Purpose |
| --- | --- | --- |
| OPENROUTER_API_KEY | (empty) | API key for OpenRouter LLM calls |
| AI_MODEL | mistralai/mistral-7b-instruct | Model for nightly compilation |
| EMBEDDING_MODEL | all-MiniLM-L6-v2 | Local sentence-transformers model |

Get an OpenRouter API key at https://openrouter.ai/keys. See available models at https://openrouter.ai/models.
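
As an illustration of how these variables and their defaults might be resolved, here is a minimal sketch. The names AiConfig, loadAiConfig, and canCallOpenRouter are hypothetical and not part of ShadowBrain; the actual sync into the settings table is not shown here. The function takes the environment as a plain record so it can be tried without any runtime-specific globals.

```typescript
// Hypothetical shape for the three AI settings listed in the table above.
interface AiConfig {
  openRouterApiKey: string;
  aiModel: string;
  embeddingModel: string;
}

// Defaults copied from the configuration table.
const AI_DEFAULTS: AiConfig = {
  openRouterApiKey: "",
  aiModel: "mistralai/mistral-7b-instruct",
  embeddingModel: "all-MiniLM-L6-v2",
};

// Resolve each variable, treating unset and blank values as "use the default".
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  const pick = (name: string, fallback: string): string => {
    const value = env[name]?.trim();
    return value ? value : fallback;
  };
  return {
    openRouterApiKey: pick("OPENROUTER_API_KEY", AI_DEFAULTS.openRouterApiKey),
    aiModel: pick("AI_MODEL", AI_DEFAULTS.aiModel),
    embeddingModel: pick("EMBEDDING_MODEL", AI_DEFAULTS.embeddingModel),
  };
}

// LLM calls need a key; the key's default is empty, so this is false out of the box.
function canCallOpenRouter(config: AiConfig): boolean {
  return config.openRouterApiKey.length > 0;
}
```

In a Node process this would be called as loadAiConfig(process.env).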


Embedding storage layer (implemented)

The vector storage and search layer is fully implemented in src/db/vectors.ts, backed by the sqlite-vec vec0 virtual table.

How it works

  • Each content_items row has a rowid (SQLite internal integer).
  • The content_vectors virtual table stores a 384-dimensional float vector keyed by the same rowid.
  • Vectors are generated by the all-MiniLM-L6-v2 sentence-transformers model (384 dimensions).
  • Search uses L2 (Euclidean) distance via sqlite-vec's MATCH + k syntax.
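
To make the ranking concrete, the following sketch computes a k-nearest-neighbor result by brute force in plain TypeScript. It is not how sqlite-vec is implemented and it does not touch the database; l2Distance and bruteForceKnn are hypothetical names used only to show what "lowest L2 distance first" means. Real embeddings have 384 components; the usage note uses two for readability.

```typescript
// Euclidean (L2) distance between two vectors of equal length.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

interface Neighbor {
  rowid: number;
  distance: number;
}

// Score every stored vector against the query and keep the k closest.
function bruteForceKnn(
  vectors: Map<number, number[]>,
  query: number[],
  k: number,
): Neighbor[] {
  const scored: Neighbor[] = [];
  vectors.forEach((vector, rowid) => {
    scored.push({ rowid, distance: l2Distance(vector, query) });
  });
  scored.sort((x, y) => x.distance - y.distance); // lower distance = more similar
  return scored.slice(0, k);
}
```

For example, with stored vectors [0, 0], [3, 4], and [1, 0] under rowids 1, 2, and 3, a query of [0, 0] with k = 2 returns rowid 1 (distance 0) followed by rowid 3 (distance 1); rowid 2 is at distance 5 and is dropped.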

API

import {
  upsertEmbedding,
  getEmbedding,
  vectorSearch,
  deleteEmbedding,
  isVecExtensionLoaded,
  getVectorCount,
} from "@/db/index";

| Function | Purpose |
| --- | --- |
| upsertEmbedding(db, contentId, embedding) | Store/update a vector |
| getEmbedding(db, contentId) | Retrieve a stored vector (number[] \| null) |
| vectorSearch(db, queryEmbedding, opts) | K-nearest-neighbor search |
| deleteEmbedding(db, contentId) | Remove a vector (called on item delete) |
| isVecExtensionLoaded(db) | Check if vec0 is available |
| getVectorCount(db) | Count stored vectors |
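
Because the table is fixed at 384 dimensions, a caller may want to check a vector's shape before handing it to upsertEmbedding. The sketch below is one way to do that; toFloat32Embedding is a hypothetical helper, and how src/db/vectors.ts actually validates or serializes vectors is not shown in this document. Packing into a Float32Array is included because 384 float32 values occupy exactly 1,536 bytes, which is a handy sanity check.

```typescript
// Dimension of all-MiniLM-L6-v2 embeddings, as stated in this document.
const EMBEDDING_DIM = 384;

// Validate an embedding and pack it as 32-bit floats.
// Throws on a wrong length or on NaN/Infinity components.
function toFloat32Embedding(values: number[]): Float32Array {
  if (values.length !== EMBEDDING_DIM) {
    throw new Error(
      `expected ${EMBEDDING_DIM} dimensions, got ${values.length}`,
    );
  }
  for (let i = 0; i < values.length; i++) {
    if (!Number.isFinite(values[i])) {
      throw new Error(`component ${i} is not a finite number`);
    }
  }
  return new Float32Array(values); // 384 * 4 bytes = 1536 bytes
}
```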

Example: semantic search

const db = getDb();

// 1. Generate an embedding for the query. generateEmbedding() belongs to the
//    planned generation pipeline and is not implemented yet.
const queryEmbedding = await generateEmbedding("docker networking");

// 2. Search by similarity
const results = vectorSearch(db, queryEmbedding, {
  limit: 10,
  type: "note", // optional type filter
});

// results: Array<{ ...contentItemFields, distance: number }>
// Lower distance = more similar

Lifecycle hooks

When a content item is deleted, its embedding must be removed manually — the vec0 virtual table doesn't support foreign-key cascades. The DELETE /api/items/[id] route handles this:

if (isVecExtensionLoaded(db)) {
  deleteEmbedding(db, id);
}
contentItems.delete(db, id);

Full-text search (FTS5)

Alongside vector search, every content item is indexed in an FTS5 virtual table (content_items_search) for keyword-based search with BM25 ranking. The search helpers in src/db/search.ts support:

  • Full-text matching with snippet highlighting (<mark>…</mark>)
  • Type and tag filters
  • Pagination
  • Visibility-aware filtering

See Database > Full-text search for usage.

FTS5 vs. vector search

| Feature | FTS5 (full-text) | sqlite-vec (semantic) |
| --- | --- | --- |
| Matches by | Keywords / tokens | Meaning / similarity |
| Ranking | BM25 (term frequency) | L2 distance (vector proximity) |
| Requires ext | Built into SQLite | vec0.so (must be built) |
| Query input | Text string | Pre-computed embedding array |
| Good for | Exact phrases, names | Concepts, paraphrases |

Both index the same content_items rows and can be combined for hybrid search.
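
This document does not specify how the two result lists would be merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as ShadowBrain's method. It needs only the rank of each item in each list, which sidesteps the fact that BM25 scores and L2 distances are on different scales. The function name and the constant k = 60 (a conventional choice for RRF) are illustrative.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// with rank starting at 1. Items found by both searches accumulate score.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  const fused: FusedResult[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  fused.sort((a, b) => b.score - a.score); // highest fused score first
  return fused;
}
```

Given keyword results ["a", "b", "c"] and semantic results ["b", "d", "a"], the fused order is b, a, d, c: the two items found by both searches rise above the items found by only one.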


Nightly AI compilation (planned)

Not yet implemented. The architecture below is the designed plan from architecture.md and phases.md.

The nightly job processes the day's raw captures and produces:

  1. Journal entries — AI-compiled daily summaries with a title and tags.
  2. Auto-tagging — suggests tags for new content based on existing taxonomy.
  3. Auto-titling — generates titles for untitled captures.
  4. Link suggestions (optional) — detects relationships between items (contradictions, builds-upon, inspired-by).

Planned architecture

  • Trigger: cron job or systemd timer (the architecture diagram shows a shadowbrain-cron container alongside the app).
  • Model: configurable via AI_MODEL (default Mistral 7B via OpenRouter).
  • Grounding: all prompts are grounded in the user's own data — the AI only sees content the user created.
  • Output: journal entries are stored as type: "journal" content items with typed links back to the source captures.
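
As a rough sketch of the model call, the function below builds (but does not send) a request for OpenRouter's OpenAI-compatible chat completions endpoint. The prompt wording, the JSON reply shape, and the names buildCompilationRequest and CompilationRequest are all assumptions made for illustration; the real prompts are not defined yet.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Plain description of an HTTP POST, so the sketch needs no fetch types.
interface CompilationRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

// Build the request for one day's captures. Only the user's own captures
// are placed in the prompt, matching the grounding rule above.
function buildCompilationRequest(
  apiKey: string,
  model: string,
  day: string,
  captures: string[],
): CompilationRequest {
  const numbered = captures.map((text, i) => `${i + 1}. ${text}`).join("\n");
  const messages: ChatMessage[] = [
    {
      role: "system",
      // Hypothetical instruction text, not the project's actual prompt.
      content:
        "Compile the captures into one journal entry using only the text " +
        'provided. Reply with JSON: {"title": string, "tags": string[], "body": string}.',
    },
    { role: "user", content: `Captures for ${day}:\n${numbered}` },
  ];
  return {
    url: OPENROUTER_CHAT_URL,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}
```

The returned object can be passed to any HTTP client as a POST; the model argument would come from AI_MODEL and the key from OPENROUTER_API_KEY.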

Prompt design principles

  • User-owned data only — no external context injection.
  • Respect visibility flags: is_private items are excluded from AI context unless explicitly opted in per thread.
  • Idempotent — re-running the job for a day that already has a journal entry should update, not duplicate.
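
The visibility and idempotency rules above can be sketched as follows. Everything here is hypothetical: the Capture and JournalEntry shapes, the per-thread opt-in set, and the in-memory JournalStore stand in for whatever the real schema and storage turn out to be. The point is only to show the filter condition and the "one entry per day" keying.

```typescript
interface Capture {
  id: string;
  content: string;
  isPrivate: boolean; // mirrors the is_private flag
  threadId?: string;
}

// Keep public captures, plus private ones whose thread was explicitly opted in.
function selectAiContext(
  captures: Capture[],
  optedInThreads: Set<string>,
): Capture[] {
  return captures.filter(
    (c) =>
      !c.isPrivate ||
      (c.threadId !== undefined && optedInThreads.has(c.threadId)),
  );
}

interface JournalEntry {
  date: string; // e.g. "2024-01-02"; one journal entry per day
  title: string;
  body: string;
  sourceIds: string[];
}

// In-memory stand-in for journal storage, keyed by date so that a re-run
// for the same day replaces the entry instead of adding a second one.
class JournalStore {
  private byDate = new Map<string, JournalEntry>();

  upsert(entry: JournalEntry): "created" | "updated" {
    const outcome = this.byDate.has(entry.date) ? "updated" : "created";
    this.byDate.set(entry.date, entry);
    return outcome;
  }

  get(date: string): JournalEntry | undefined {
    return this.byDate.get(date);
  }

  count(): number {
    return this.byDate.size;
  }
}
```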

Embedding generation pipeline (planned)

Not yet implemented. The storage layer (upsertEmbedding, vectorSearch) is ready; the generation step (calling the all-MiniLM-L6-v2 model to produce vectors) is a planned feature.

The designed flow:

  1. A content item is created (via API, Discord capture, or import).
  2. An embedding is generated from the item's content (and optionally title) using the local all-MiniLM-L6-v2 model.
  3. The 384-dim float vector is stored via upsertEmbedding(db, contentId, vector).
  4. On content update, the embedding is regenerated and upserted.
  5. On delete, deleteEmbedding cleans up the vector.

The generation step can run synchronously (on create/update) or in a background job (batch processing new items). The storage API is designed to support either approach.
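
The flow above could be wired up roughly as follows. This is a sketch under stated assumptions: EmbedFn stands in for the not-yet-written model call, VectorStore stands in for upsertEmbedding and deleteEmbedding, and the Item shape and all function names are hypothetical. The same embedOne function serves both modes, called inline on create/update or looped over in a batch.

```typescript
// Placeholder for the planned all-MiniLM-L6-v2 call.
type EmbedFn = (text: string) => Promise<number[]>;

// Minimal stand-in for the storage API described earlier.
interface VectorStore {
  upsert(contentId: string, vector: number[]): void;
  delete(contentId: string): void;
}

interface Item {
  id: string;
  title?: string;
  content: string;
}

// Step 2: embed the content, prefixed by the title when there is one.
function embeddingInput(item: Item): string {
  return item.title ? `${item.title}\n\n${item.content}` : item.content;
}

// Steps 2-4: generate and upsert. Used for both create and update,
// since an upsert overwrites any previous vector.
async function embedOne(
  item: Item,
  embed: EmbedFn,
  store: VectorStore,
): Promise<void> {
  const vector = await embed(embeddingInput(item));
  store.upsert(item.id, vector);
}

// Background mode: process a backlog sequentially; returns the count embedded.
async function embedBatch(
  items: Item[],
  embed: EmbedFn,
  store: VectorStore,
): Promise<number> {
  let done = 0;
  for (const item of items) {
    await embedOne(item, embed, store);
    done++;
  }
  return done;
}

// Step 5: remove the vector when the item is deleted.
function removeEmbedding(itemId: string, store: VectorStore): void {
  store.delete(itemId);
}
```

Running synchronously keeps search results current at the cost of slower writes; batching keeps writes fast but leaves new items unsearchable by meaning until the job runs.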