Releases: Sardor-M/lumen
v0.3.0: Graph connectivity + managed sync daemon
v0.3.0: Graph connectivity + managed sync daemon
Two additive, backward-compatible features. The compiler now keeps the knowledge graph connected across sources instead of forming per-source islands, and cross-device sync gets a first-class managed daemon so you no longer hand-edit launchd/systemd templates.
npm install -g lumen-kb@0.3.0Connect the graph across sources (#52)
Previously each compile pass coined slugs in isolation and cross-source edges formed only by coincidence - the graph fragmented into disconnected subgraphs. Now:
- Known-concepts hint — before compiling a source, the most-mentioned active concepts (top 80, retired excluded) are inlined into the LLM prompt as a "reuse these slugs verbatim" block, so the same idea gets the same slug across sources. Prompt overhead stays under ~2k tokens.
- Global edge resolver — every edge endpoint resolves against the whole brain: exact in-pass match → alias-aware DB match → fuzzy slug match (same Levenshtein threshold as dedup, with a length pre-filter). Edges pointing at concepts from prior sources are no longer silently dropped.
edges_droppedis now reported alongsideedges_createdin the compilation result.
Managed sync daemon (Tier 6, #26 / #28 / #29)
A first-class daemon replaces the manual launchd/systemd templates from 0.2.0:
lumen sync daemon install # detect platform, generate + load the unit
lumen sync daemon status # installed? managed vs. manual? PID alive?
lumen sync daemon uninstall # bootout/disable + clean up- Adaptive cadence — Active (~30s) when there's pending push work or recent pulls returned rows; Idle (~300s) after a few empty pulls.
- Push debounce — bursts of journal writes (e.g. during a long
lumen compile) coalesce into a single push; pull stays on schedule. - Manual-shape detection —
--replace-manualdetects the PR #27 templates (StartInterval/Type=oneshot), unloads them, and installs the managed daemon in their place.
Upgrade
npm install -g lumen-kb@0.3.0
# If you copied the manual sync templates from 0.2.0:
lumen sync daemon install --replace-manualFully backward compatible — no schema migration, no config changes required.
Full changelog: https://github.com/Sardor-M/lumen/blob/main/CHANGELOG.md
v0.2.0 — Tier 5 cross-device sync
v0.2.0 — Tier 5 cross-device sync
End-to-end-encrypted multi-device sync. Five additive sub-tiers (5a–5e) shipping as a single coherent feature. Local-first remains the default — sync is opt-in, every payload is sealed with X25519 + XChaCha20-Poly1305 on-device, and the relay holds opaque ciphertext only.
npm install -g lumen-kb@0.2.0What you can do now
# Device A:
lumen sync init --relay https://lumen-relay.<your-account>.workers.dev
lumen sync enable
lumen sync show-key --reveal # share securely
# Device B:
lumen sync import-key "<base64>" --relay <same URL>
lumen sync enable
lumen sync run # push → pull → applyConcepts, compiled-truth updates, +1/−1 feedback, retirements, and trajectories all sync across your laptops. New devices coming online a year later replay the entire history from one master key.
What's in the box
- Append-only sync journal (
sync_journal, schema v15) — every concept-touching mutation atomically journals alongside its entity write - X25519 + XChaCha20-Poly1305 encryption envelope — pure-TS, audited noble-suite deps, no native bindings
- Relay HTTP client + driver with retry/backoff + circuit-breaker after 5 consecutive failures
- Reference Cloudflare Worker relay (
apps/relay/) — D1-backed, ~600 LOC, deployable in threewranglercommands, zero-knowledge by construction - Per-op apply rules (schema v16) —
applyConceptCreate,applyTrajectory,applyFeedback,applyTruthUpdate,applyRetire— last-write-wins ontruth_updatewithconcept_truth_historyaudit trail lumen syncCLI —init/enable/disable/push/pull/apply/run/status/reset-error/show-key/import-key/forget-key- 23 MCP tools (was 19) —
brain_feedback,retire_skill,capture_trajectory,replay_skilljoin the existing surface
Tests
- 887 passing, 0 regressions (was 740 in 0.1.4)
- 132 new sync tests across 7 files (journal, crypto, keyring, relay-client, driver, apply, e2e)
- 25 relay tests via
@cloudflare/vitest-pool-workers(real workerd + miniflare D1/KV, no mocks) - 25 edge-case tests covering LWW chains, multi-device feedback/retire, mixed-op same-slug batches, orchestrator robustness, scope-aware apply
Schema migrations
Both purely additive — no rewrite of existing rows, existing tables and prior tests untouched.
- v15 —
sync_statesingleton +sync_journalappend-only log - v16 —
concept_truth_history(LWW audit) +concept_feedback.sync_idpartial UNIQUE INDEX
Honest limitations
- What syncs: concepts, feedback, truth updates, retirements, trajectories
- What doesn't:
lumen addsource content, embeddings, chunks — local-first stays local-first - No realtime: sync runs on demand via
lumen sync run; a daemon is Tier 6 - LWW only: no vector clocks / CRDTs in v1; flat-timestamp last-write-wins is honest enough at our scale
- Multi-device key share: currently
show-key --reveal+import-key <base64>; QR / BIP39 / age-file is Tier 6 - Tombstone propagation:
DELETEremoves the relay blob and tombstones the relay row, but other devices learn about the deletion only via their ownDELETE(Tier 6)
Migration
lumen upgrades automatically — schema v16 applies on first DB open after upgrade. No action required unless you want to enable sync (opt-in):
npm install -g lumen-kb@0.2.0
lumen sync init --relay <url> # only if you want syncExisting concept_feedback rows keep sync_id = NULL (the partial unique index doesn't apply to them).
Acknowledgements
Crypto primitives via @noble/ciphers and @noble/curves — pure-TS, audited, no native deps.
v0.1.4 — code/dataset/image ingest + Obsidian connector
What's in v0.1.4
- Code repositories (
source_type: code) —lumen add <github-url>shallow-clones intotmpdirand cleans up;
/.next/target, truncates files over 50 KB, caps at 800 files per repo. Regex signature extraction for
JS/TS/Python/Go/Rust/Java/C/C++/Ruby/C#.README.md/CONTRIBUTING.md/docs/ordered before source sections. - Datasets (
source_type: dataset) —lumen add ./data.csvorlumen add https://huggingface.co/datasets/<id>.
Native CSV / TSV / JSONL parsing with delimiter auto-detection, quoted-field + doubled-quote escapes; HuggingFace datasets
fetched via/api/datasets/plus/raw/main/README.mdfor the card. Schema table (column, inferred type, null count) +
20-row preview rendered as markdown. ColocatedREADME.md/dataset-card.mdauto-inlined. - Images (
source_type: image) —lumen add screenshot.png. SHA-256 of bytes into metadata, MIME inferred from
extension. Optional OCR shells out to a localtesseractbinary when onPATH;--no-ocrstores metadata only. Missing
binary produces abrew installhint instead of failing. Supports.png/.jpg/.webp/.gif/.bmp/.tiff. - Obsidian connector (
ConnectorType: obsidian) —lumen watch add obsidian ~/vault. Parses Web Clipper YAML
frontmatter, promotes thesource:URL into theExtractionResultso re-clippings dedup by URL. Skips.obsidianand
other dotfiles. Flat keys, inline arrays, block-list arrays all supported. - Auto-detection in
detectSourceType: GitHub repo URL →code, HuggingFace dataset URL →dataset,.csv/.tsv
/.jsonl/.ndjson→dataset, image extensions →image, directory containing.git→code. - New CLI flags:
--as-dataset(force dataset handling for ambiguous text),--no-ocr(skip OCR on images).
lib/add.tsAddInputobject form accepts anoptionsfield threadingIngestOptionsthrough the programmatic path. - 77 new tests across
tests/code.test.ts(20),tests/dataset.test.ts(20),tests/image.test.ts(12, one
conditionally skipped whentesseractis absent),tests/obsidian.test.ts(19). Full suite: 536 passing, 2 skipped.
Changed
SourceTypeunion widened to'url' | 'pdf' | 'youtube' | 'arxiv' | 'file' | 'folder' | 'code' | 'dataset' | 'image'.
No schema migration — existingsource_type TEXT+metadata TEXTcolumns accept the new shapes.ConnectorTypeunion widened with'obsidian'.ingestInput(input, options?)— optionalIngestOptionssecond argument, backwards-compatible.
Fixes
parseJsonl:SCHEMA_SAMPLE_ROWScap now counts only successfully-parsed object rows (previously incremented on
malformed lines too).datasettests:global.fetchstubs switched tovi.stubGlobal+vi.unstubAllGlobalsso they don't leak across test
files.obsidianconnector:parseTargetandpullreject subdir paths that escape the vault via..or absolute paths.addcommand:--typeflag is now actually threaded to the extractor asforcedType.addcommand: action body wrapped in top-leveltry/catchfor consistent error handling.
Deferred
Tracked in docs/docs-temp/INGEST-EXPANSION-PLAN.md:
- Tree-sitter-based code parsing — avoids native-build dependency for now.
- Claude Vision caption pass on
compilefor images —metadata.captionreserved as placeholder. - Native Parquet support — errors with a
duckdbconversion hint instead of parsing. - In-house browser extension — Obsidian path validates the flow first.
Install
npm install -g lumen-kbPRs
- #10 — code/dataset/image ingest formats + Obsidian connector + v0.1.4 CHANGELOG
Full changelog: https://github.com/Sardor-M/Lumen/blob/main/CHANGELOG.md
v0.1.3
Landing page, shared packages, benchmarks runner, CI and lint fixes.
What's in v0.1.3
- Landing page app with interactive graph visualization, knowledge model demo, and agent wiring walkthrough
- Shared packages:
@lumen/ui(useClipboard),@lumen/eslint-config,@lumen/tsconfig,@lumen/brand - Benchmarks runner with 6 test suites: search quality, search latency, graph ops, ingest, adversarial, MCP contract
- npm/GitHub badges in README,
npm install -g lumen-kbas primary install - Auto-review diff limit raised to 4000 lines
Fixes
useClipboard: awaitwriteTextPromise before setting copied state@eslint/eslintrcdeclared as explicit dependencybenchscript routes through workspace tsx@claudeworkflow:allowed_toolsmoved intoclaude_argsas--allowedTools- Landing JSX comment lint errors fixed
.gitignoreiCloud pattern escaped
Install
npm install -g lumen-kb
PRs
- #8 — npm badges, landing lint fixes, benchmark plan
- #7 — CHANGELOG, CONTRIBUTING, SECURITY, .env.example, knip
Full changelog: https://github.com/Sardor-M/Lumen/blob/main/CHANGELOG.md
v0.1.2 — First published release
Local-first knowledge compiler. Ingest articles, papers, PDFs, YouTube transcripts into a searchable knowledge graph — then wire it into your AI assistant.
Install
npm install -g lumen-kb
lumen init
echo 'ANTHROPIC_API_KEY=sk-ant-...' > ~/.lumen/.env
Then in Claude Code:
lumen install claude
What's in v0.1.2
Ingestion — URL, PDF, YouTube (Innertube captions), arXiv, local files and folders. SHA-256 dedup. Structural chunking (Markdown/HTML/plain text, merge <50t, split >1000t).
Search — 3-signal hybrid: BM25 (FTS5 Porter stemmed) + TF-IDF (in-memory cosine) + Vector ANN (sqlite-vec). Fused via Reciprocal Rank Fusion (k=60). Token budget flag: lumen search "query" -b 4000.
Compilation — LLM extracts concepts + weighted edges with compiled truth + timeline per concept. Parallel: lumen compile -c 5. Model override: --model claude-haiku-4-5-20251001. Prompt caching cuts cost 60-80%.
Enrichment — Concepts auto-escalate: Tier 3 (stub) → Tier 2 (enriched, 3+ mentions) → Tier 1 (full, 6+ mentions). lumen enrich --status.
Graph — PageRank, BFS shortest path, N-hop neighborhood, label propagation communities. lumen graph pagerank, lumen graph path <a> <b>.
Agent wiring — lumen install claude generates 5 files: CLAUDE.md (brain-first protocol), .mcp.json, skill, PreToolUse hook, Stop hook. Agent checks KB before web search, cites [Source: title], captures ideas after every response.
MCP server — 19 tools via stdio: search, brain_ops, compile, capture, add, concept, path, neighbors, god_nodes, communities, pagerank, query, status, profile, session_summary, add_link, backlinks, links, community.
Streaming — lumen ask streams tokens to stdout as they arrive.
Tests — 28 tests: compress pipeline (13), graph engine + PageRank + communities (15), delta stubs (10 todos).
Web UI — Next.js 15 dashboard with search, concept browser, graph visualization.
PRs
- #3 — Signal capture + tiered enrichment (schema v9)
- #4 — Streaming responses + always-on brain wiring
- #5 — Parallel compile, brain-first CLAUDE.md, CLI fixes
- #6 — MCP compile tool, tests, npm publish, MIT license
- #7 — CHANGELOG, CONTRIBUTING, SECURITY, .env.example, knip
What's next (v0.2.0)
- Vector embeddings wired into default search path
- Delta module: smart recompilation of only changed sources
- Browser extension for capturing highlights and URLs