29 May 08:55

Sardor-M

fb2ab9d

v0.3.0: Graph connectivity + managed sync daemon Latest

Latest

v0.3.0: Graph connectivity + managed sync daemon

Two additive, backward-compatible features. The compiler now keeps the knowledge graph connected across sources instead of forming per-source islands, and cross-device sync gets a first-class managed daemon so you no longer hand-edit launchd/systemd templates.

npm install -g lumen-kb@0.3.0

Connect the graph across sources (#52)

Previously each compile pass coined slugs in isolation and cross-source edges formed only by coincidence - the graph fragmented into disconnected subgraphs. Now:

Known-concepts hint — before compiling a source, the most-mentioned active concepts (top 80, retired excluded) are inlined into the LLM prompt as a "reuse these slugs verbatim" block, so the same idea gets the same slug across sources. Prompt overhead stays under ~2k tokens.
Global edge resolver — every edge endpoint resolves against the whole brain: exact in-pass match → alias-aware DB match → fuzzy slug match (same Levenshtein threshold as dedup, with a length pre-filter). Edges pointing at concepts from prior sources are no longer silently dropped.
edges_dropped is now reported alongside edges_created in the compilation result.

Managed sync daemon (Tier 6, #26 / #28 / #29)

A first-class daemon replaces the manual launchd/systemd templates from 0.2.0:

lumen sync daemon install      # detect platform, generate + load the unit
lumen sync daemon status       # installed? managed vs. manual? PID alive?
lumen sync daemon uninstall    # bootout/disable + clean up

Adaptive cadence — Active (~30s) when there's pending push work or recent pulls returned rows; Idle (~300s) after a few empty pulls.
Push debounce — bursts of journal writes (e.g. during a long lumen compile) coalesce into a single push; pull stays on schedule.
Manual-shape detection — --replace-manual detects the PR #27 templates (StartInterval / Type=oneshot), unloads them, and installs the managed daemon in their place.

Upgrade

npm install -g lumen-kb@0.3.0

# If you copied the manual sync templates from 0.2.0:
lumen sync daemon install --replace-manual

Fully backward compatible — no schema migration, no config changes required.

Full changelog: https://github.com/Sardor-M/lumen/blob/main/CHANGELOG.md

Assets 2

07 May 14:22

Sardor-M

v0.2.0

aad1468

v0.2.0 — Tier 5 cross-device sync

End-to-end-encrypted multi-device sync. Five additive sub-tiers (5a–5e) shipping as a single coherent feature. Local-first remains the default — sync is opt-in, every payload is sealed with X25519 + XChaCha20-Poly1305 on-device, and the relay holds opaque ciphertext only.

npm install -g lumen-kb@0.2.0

What you can do now

# Device A:
lumen sync init --relay https://lumen-relay.<your-account>.workers.dev
lumen sync enable
lumen sync show-key --reveal       # share securely

# Device B:
lumen sync import-key "<base64>" --relay <same URL>
lumen sync enable
lumen sync run                     # push → pull → apply

Concepts, compiled-truth updates, +1/−1 feedback, retirements, and trajectories all sync across your laptops. New devices coming online a year later replay the entire history from one master key.

What's in the box

Append-only sync journal (sync_journal, schema v15) — every concept-touching mutation atomically journals alongside its entity write
X25519 + XChaCha20-Poly1305 encryption envelope — pure-TS, audited noble-suite deps, no native bindings
Relay HTTP client + driver with retry/backoff + circuit-breaker after 5 consecutive failures
Reference Cloudflare Worker relay (apps/relay/) — D1-backed, ~600 LOC, deployable in three wrangler commands, zero-knowledge by construction
Per-op apply rules (schema v16) — applyConceptCreate, applyTrajectory, applyFeedback, applyTruthUpdate, applyRetire — last-write-wins on truth_update with concept_truth_history audit trail
lumen sync CLI — init / enable/disable / push/pull/apply/run / status / reset-error / show-key/import-key/forget-key
23 MCP tools (was 19) — brain_feedback, retire_skill, capture_trajectory, replay_skill join the existing surface

Tests

887 passing, 0 regressions (was 740 in 0.1.4)
132 new sync tests across 7 files (journal, crypto, keyring, relay-client, driver, apply, e2e)
25 relay tests via @cloudflare/vitest-pool-workers (real workerd + miniflare D1/KV, no mocks)
25 edge-case tests covering LWW chains, multi-device feedback/retire, mixed-op same-slug batches, orchestrator robustness, scope-aware apply

Schema migrations

Both purely additive — no rewrite of existing rows, existing tables and prior tests untouched.

v15 — sync_state singleton + sync_journal append-only log
v16 — concept_truth_history (LWW audit) + concept_feedback.sync_id partial UNIQUE INDEX

Honest limitations

What syncs: concepts, feedback, truth updates, retirements, trajectories
What doesn't: lumen add source content, embeddings, chunks — local-first stays local-first
No realtime: sync runs on demand via lumen sync run; a daemon is Tier 6
LWW only: no vector clocks / CRDTs in v1; flat-timestamp last-write-wins is honest enough at our scale
Multi-device key share: currently show-key --reveal + import-key <base64>; QR / BIP39 / age-file is Tier 6
Tombstone propagation: DELETE removes the relay blob and tombstones the relay row, but other devices learn about the deletion only via their own DELETE (Tier 6)

Migration

lumen upgrades automatically — schema v16 applies on first DB open after upgrade. No action required unless you want to enable sync (opt-in):

npm install -g lumen-kb@0.2.0
lumen sync init --relay <url>          # only if you want sync

Existing concept_feedback rows keep sync_id = NULL (the partial unique index doesn't apply to them).

Acknowledgements

Crypto primitives via @noble/ciphers and @noble/curves — pure-TS, audited, no native deps.

Assets 2

23 Apr 08:16

Sardor-M

v0.1.4

a8767ec

v0.1.4 — code/dataset/image ingest + Obsidian connector

What's in v0.1.4

Code repositories (source_type: code) — lumen add <github-url> shallow-clones into tmpdir and cleans up;
/ .next / target, truncates files over 50 KB, caps at 800 files per repo. Regex signature extraction for
JS/TS/Python/Go/Rust/Java/C/C++/Ruby/C#. README.md / CONTRIBUTING.md / docs/ ordered before source sections.
Datasets (source_type: dataset) — lumen add ./data.csv or lumen add https://huggingface.co/datasets/<id>.
Native CSV / TSV / JSONL parsing with delimiter auto-detection, quoted-field + doubled-quote escapes; HuggingFace datasets
fetched via /api/datasets/ plus /raw/main/README.md for the card. Schema table (column, inferred type, null count) +
20-row preview rendered as markdown. Colocated README.md / dataset-card.md auto-inlined.
Images (source_type: image) — lumen add screenshot.png. SHA-256 of bytes into metadata, MIME inferred from
extension. Optional OCR shells out to a local tesseract binary when on PATH; --no-ocr stores metadata only. Missing
binary produces a brew install hint instead of failing. Supports .png / .jpg / .webp / .gif / .bmp / .tiff.
Obsidian connector (ConnectorType: obsidian) — lumen watch add obsidian ~/vault. Parses Web Clipper YAML
frontmatter, promotes the source: URL into the ExtractionResult so re-clippings dedup by URL. Skips .obsidian and
other dotfiles. Flat keys, inline arrays, block-list arrays all supported.
Auto-detection in detectSourceType: GitHub repo URL → code, HuggingFace dataset URL → dataset, .csv / .tsv
/ .jsonl / .ndjson → dataset, image extensions → image, directory containing .git → code.
New CLI flags: --as-dataset (force dataset handling for ambiguous text), --no-ocr (skip OCR on images).
lib/add.ts AddInput object form accepts an options field threading IngestOptions through the programmatic path.
77 new tests across tests/code.test.ts (20), tests/dataset.test.ts (20), tests/image.test.ts (12, one
conditionally skipped when tesseract is absent), tests/obsidian.test.ts (19). Full suite: 536 passing, 2 skipped.

Changed

SourceType union widened to 'url' | 'pdf' | 'youtube' | 'arxiv' | 'file' | 'folder' | 'code' | 'dataset' | 'image'.
No schema migration — existing source_type TEXT + metadata TEXT columns accept the new shapes.
ConnectorType union widened with 'obsidian'.
ingestInput(input, options?) — optional IngestOptions second argument, backwards-compatible.

Fixes

parseJsonl: SCHEMA_SAMPLE_ROWS cap now counts only successfully-parsed object rows (previously incremented on
malformed lines too).
dataset tests: global.fetch stubs switched to vi.stubGlobal + vi.unstubAllGlobals so they don't leak across test
files.
obsidian connector: parseTarget and pull reject subdir paths that escape the vault via .. or absolute paths.
add command: --type flag is now actually threaded to the extractor as forcedType.
add command: action body wrapped in top-level try/catch for consistent error handling.

Deferred

Tracked in docs/docs-temp/INGEST-EXPANSION-PLAN.md:

Tree-sitter-based code parsing — avoids native-build dependency for now.
Claude Vision caption pass on compile for images — metadata.caption reserved as placeholder.
Native Parquet support — errors with a duckdb conversion hint instead of parsing.
In-house browser extension — Obsidian path validates the flow first.

Install

npm install -g lumen-kb

PRs

#10 — code/dataset/image ingest formats + Obsidian connector + v0.1.4 CHANGELOG

Full changelog: https://github.com/Sardor-M/Lumen/blob/main/CHANGELOG.md

Assets 2

22 Apr 01:49

Sardor-M

v0.1.3

0833bf4

v0.1.3

Landing page, shared packages, benchmarks runner, CI and lint fixes.

What's in v0.1.3

Landing page app with interactive graph visualization, knowledge model demo, and agent wiring walkthrough
Shared packages: @lumen/ui (useClipboard), @lumen/eslint-config, @lumen/tsconfig, @lumen/brand
Benchmarks runner with 6 test suites: search quality, search latency, graph ops, ingest, adversarial, MCP contract
npm/GitHub badges in README, npm install -g lumen-kb as primary install
Auto-review diff limit raised to 4000 lines

Fixes

useClipboard: await writeText Promise before setting copied state
@eslint/eslintrc declared as explicit dependency
bench script routes through workspace tsx
@claude workflow: allowed_tools moved into claude_args as --allowedTools
Landing JSX comment lint errors fixed
.gitignore iCloud pattern escaped

Install

npm install -g lumen-kb

PRs

#8 — npm badges, landing lint fixes, benchmark plan
#7 — CHANGELOG, CONTRIBUTING, SECURITY, .env.example, knip

Full changelog: https://github.com/Sardor-M/Lumen/blob/main/CHANGELOG.md

Assets 2

21 Apr 01:41

Sardor-M

v0.1.2

f147d62

v0.1.2 — First published release

Local-first knowledge compiler. Ingest articles, papers, PDFs, YouTube transcripts into a searchable knowledge graph — then wire it into your AI assistant.

Install

npm install -g lumen-kb
lumen init
echo 'ANTHROPIC_API_KEY=sk-ant-...' > ~/.lumen/.env

Then in Claude Code:

lumen install claude

What's in v0.1.2

Ingestion — URL, PDF, YouTube (Innertube captions), arXiv, local files and folders. SHA-256 dedup. Structural chunking (Markdown/HTML/plain text, merge <50t, split >1000t).

Search — 3-signal hybrid: BM25 (FTS5 Porter stemmed) + TF-IDF (in-memory cosine) + Vector ANN (sqlite-vec). Fused via Reciprocal Rank Fusion (k=60). Token budget flag: lumen search "query" -b 4000.

Compilation — LLM extracts concepts + weighted edges with compiled truth + timeline per concept. Parallel: lumen compile -c 5. Model override: --model claude-haiku-4-5-20251001. Prompt caching cuts cost 60-80%.

Enrichment — Concepts auto-escalate: Tier 3 (stub) → Tier 2 (enriched, 3+ mentions) → Tier 1 (full, 6+ mentions). lumen enrich --status.

Graph — PageRank, BFS shortest path, N-hop neighborhood, label propagation communities. lumen graph pagerank, lumen graph path <a> <b>.

Agent wiring — lumen install claude generates 5 files: CLAUDE.md (brain-first protocol), .mcp.json, skill, PreToolUse hook, Stop hook. Agent checks KB before web search, cites [Source: title], captures ideas after every response.

MCP server — 19 tools via stdio: search, brain_ops, compile, capture, add, concept, path, neighbors, god_nodes, communities, pagerank, query, status, profile, session_summary, add_link, backlinks, links, community.

Streaming — lumen ask streams tokens to stdout as they arrive.

Tests — 28 tests: compress pipeline (13), graph engine + PageRank + communities (15), delta stubs (10 todos).

Web UI — Next.js 15 dashboard with search, concept browser, graph visualization.

PRs

#3 — Signal capture + tiered enrichment (schema v9)
#4 — Streaming responses + always-on brain wiring
#5 — Parallel compile, brain-first CLAUDE.md, CLI fixes
#6 — MCP compile tool, tests, npm publish, MIT license
#7 — CHANGELOG, CONTRIBUTING, SECURITY, .env.example, knip

What's next (v0.2.0)

Vector embeddings wired into default search path
Delta module: smart recompilation of only changed sources
Browser extension for capturing highlights and URLs

Assets 2

Releases: Sardor-M/lumen

v0.3.0: Graph connectivity + managed sync daemon

v0.3.0: Graph connectivity + managed sync daemon

Connect the graph across sources (#52)

Managed sync daemon (Tier 6, #26 / #28 / #29)

Upgrade

Uh oh!

v0.2.0 — Tier 5 cross-device sync

v0.2.0 — Tier 5 cross-device sync

What you can do now

What's in the box

Tests

Schema migrations

Honest limitations

Migration

Acknowledgements

Uh oh!

v0.1.4 — code/dataset/image ingest + Obsidian connector

What's in v0.1.4

Changed

Fixes

Deferred

Install

Uh oh!

v0.1.3

What's in v0.1.3

Fixes

Install

PRs

Uh oh!

v0.1.2 — First published release

Install

What's in v0.1.2

PRs

What's next (v0.2.0)

Uh oh!