ContextHub: Unified Context Management for Multi-Agent Collaboration

A context-governance engine built on a filesystem paradigm with LLM-native commands. Agents navigate memories, skills, documents, and data-lake metadata through familiar operations (ls, read, grep, stat) over ctx:// URIs, with version control, visibility boundaries, change propagation, and cross-agent sharing.

Built on FastAPI + PostgreSQL. Single database. No external vector store. No message queue.

English | 中文


Why ContextHub? 🔎

When multiple AI agents collaborate on the same business entities, their contexts are siloed, unversioned, and disconnected:

  • 79% of multi-agent failures stem from coordination problems, not technical bugs (Zylos Research, 2026).
  • 36.9% of failures come from inter-agent misalignment: agents ignoring, duplicating, or contradicting each other's work (Cemri et al., 2025).

These are structural deficits in system architecture; they cannot be fixed by improving individual model capabilities. ContextHub addresses this by unifying four types of context under one governance layer.

What Does ContextHub Manage? 📦

| Context Type | What It Is | Example |
| --- | --- | --- |
| Memory | Facts, patterns, and decisions an agent learns during conversations | A SQL query pattern that worked for monthly sales reports |
| Skill | Reusable capabilities that agents publish, version, and subscribe to | A "SQL Generator" skill; subscribers get notified on breaking changes |
| Resource | Documents that agents read, understand, and retrieve | API docs, runbooks, or policy documents referenced during tasks |
| Data-Lake Metadata | Structured metadata for lakehouse tables: schemas, columns, lineage | Table orders(user_id, amount, created_at) and its upstream/downstream dependencies |

All four are managed under a unified ctx:// URI namespace with the same versioning, visibility, and propagation semantics.
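
To make the namespace idea concrete, here is a minimal sketch of how a ctx:// URI could be split into filesystem-like parts. It is not ContextHub's actual parser: the `CtxUri` class, the `parse_ctx_uri` helper, and the reading of the first segment as a "scope" are assumptions made for illustration, based only on the example URI that appears later in this README.

```python
from dataclasses import dataclass

SCHEME = "ctx://"


@dataclass(frozen=True)
class CtxUri:
    """Hypothetical parsed form of a ctx:// URI: a scope plus path segments."""

    scope: str
    segments: tuple[str, ...]

    @property
    def name(self) -> str:
        # Last segment, analogous to a file name; a bare scope names itself.
        return self.segments[-1] if self.segments else self.scope

    def parent(self) -> "CtxUri":
        # Directory-style parent: drop the last segment.
        return CtxUri(self.scope, self.segments[:-1])

    def __str__(self) -> str:
        return SCHEME + "/".join((self.scope, *self.segments))


def parse_ctx_uri(uri: str) -> CtxUri:
    """Split a ctx:// URI into scope and segments, ignoring empty segments."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a ctx:// URI: {uri!r}")
    parts = [p for p in uri[len(SCHEME):].split("/") if p]
    if not parts:
        raise ValueError("ctx:// URI has no scope")
    return CtxUri(scope=parts[0], segments=tuple(parts[1:]))


uri = parse_ctx_uri("ctx://team/engineering/shared_knowledge/monthly-sales-pattern")
print(uri.scope, uri.name)
print(uri.parent())
```

Treating the URI as a path is what lets operations such as ls and stat apply uniformly: listing a "directory" amounts to matching on a shared prefix, whatever the context type behind it.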

For a detailed analysis of research gaps in each context type, see Research Positioning.

Core Capabilities ✨

| Capability | What It Solves |
| --- | --- |
| Filesystem Paradigm | All context types managed as files under ctx:// URIs: one model for memories, skills, documents, and table metadata |
| LLM-native Commands | Agents use ls, read, grep, stat; LLMs already understand file operations, no custom API needed |
| Multi-Agent Collaboration | Team hierarchy with visibility inheritance (child reads parent, parent doesn't see child); memory promotion private → team → org with derived_from lineage |
| Version Management | Pin agents to stable versions; is_breaking flag prevents silent breakage; immutable published versions |
| Change Propagation | Upstream changes auto-notify all downstream dependents; no polling, no "latest version wins" |
| L0/L1/L2 Layered Retrieval | Vector search → BM25 rerank → on-demand full content; 60–80% token reduction vs. flat retrieval |
| Tenant Isolation | Row-Level Security on all tables; request-scoped tenant binding |
| PostgreSQL-centric Single DB | ACID + RLS + LISTEN/NOTIFY + pgvector in one database; no dual-write, no message queue |

Architecture 🏛️

         Agents (via OpenClaw Plugin / SDK)
              │
              ▼
    ContextHub Server (FastAPI)
    ├── ContextStore       - ctx:// URI routing
    ├── MemoryService      - promote, lineage, team sharing
    ├── SkillService       - publish, subscribe, version resolution
    ├── RetrievalService   - pgvector + BM25 rerank
    ├── PropagationEngine  - outbox, retry, dependency dispatch
    └── ACLService         - visibility / write permissions
              │
              ▼
    PostgreSQL + pgvector  (single DB: metadata + content + vectors + events)

Single database. No external vector store. No message queue. This eliminates dual-write consistency problems and minimizes infrastructure complexity for on-premise deployment.
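
The RetrievalService stage in the diagram (vector search, then BM25 rerank, then full content on demand) can be sketched in pure Python. This is a toy illustration of the three-stage idea, not the server's implementation: the real system uses pgvector, whereas the hand-written two-dimensional vectors, the `retrieve` function, the index layout, and the two runbook and policy URIs below are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bm25_scores(query, texts, k1=1.5, b=0.75):
    """Okapi BM25 score of `query` against each text (whitespace tokenization)."""
    docs = [t.lower().split() for t in texts]
    if not docs:
        return []
    avgdl = sum(len(d) for d in docs) / len(docs)
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores


def retrieve(query, query_vec, index, load_full, recall_k=2, final_k=1):
    # Stage 1: vector recall over embeddings.
    recalled = sorted(index, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)[:recall_k]
    # Stage 2: lexical rerank of the recalled candidates using short summaries.
    scores = bm25_scores(query, [e["summary"] for e in recalled])
    order = sorted(range(len(recalled)), key=lambda i: scores[i], reverse=True)
    # Stage 3: fetch full content only for the final survivors.
    return [(recalled[i]["uri"], load_full(recalled[i]["uri"])) for i in order[:final_k]]


SALES = "ctx://team/engineering/shared_knowledge/monthly-sales-pattern"
INDEX = [
    {"uri": "ctx://team/engineering/shared_knowledge/failover-runbook",
     "vec": [1.0, 0.0], "summary": "database failover runbook"},
    {"uri": SALES, "vec": [0.9, 0.2], "summary": "monthly sales summary sql pattern"},
    {"uri": "ctx://team/engineering/shared_knowledge/expense-policy",
     "vec": [0.1, 0.9], "summary": "expense policy document"},
]
loaded = []


def load_full(uri):
    loaded.append(uri)  # record which full documents were actually fetched
    return f"<full content of {uri}>"


hits = retrieve("monthly sales summary", [1.0, 0.0], INDEX, load_full)
print(hits[0][0])
print(len(loaded))
```

In this example the runbook wins the vector stage but the lexical rerank moves the sales pattern to the top, and only that one document's full content is loaded. Deferring the expensive fetch to the last stage is the mechanism behind any token savings; the sketch does not attempt to reproduce the reduction figure quoted above.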


Quick Start 🚀

Prerequisites

  • Python 3.12+
  • PostgreSQL 16 with pgvector extension

Step 1: Install PostgreSQL + pgvector

macOS (Homebrew):

brew install postgresql@16
brew install pgvector
brew services start postgresql@16

Linux (Ubuntu / Debian):

# Add PostgreSQL APT repository
sudo apt install -y curl ca-certificates
sudo install -d /usr/share/postgresql-common/pgdg
sudo curl -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc \
  --fail https://www.postgresql.org/media/keys/ACCC4CF8.asc
echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] \
  https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \
  | sudo tee /etc/apt/sources.list.d/pgdg.list

sudo apt update
sudo apt install -y postgresql-16 postgresql-16-pgvector
sudo systemctl start postgresql

Verify PostgreSQL is running:

pg_isready
# Expected: "accepting connections"

Step 2: Create Database

# macOS (Homebrew): psql postgres
# Linux: sudo -u postgres psql
psql postgres

Inside the psql shell:

CREATE USER contexthub WITH PASSWORD 'contexthub' SUPERUSER;
CREATE DATABASE contexthub OWNER contexthub;
\c contexthub
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
\q

SUPERUSER is required because the schema uses FORCE ROW LEVEL SECURITY. This is fine for local development.

Step 3: Install & Start ContextHub

git clone https://github.com/The-AI-Framework-and-Data-Tech-Lab-HK/ContextHub.git
cd ContextHub

python3 -m venv .venv
source .venv/bin/activate

pip install -e ".[dev]"
pip install greenlet
pip install -e sdk/

# Run database migrations
alembic upgrade head

# Start the server
uvicorn contexthub.main:app --port 8000

Verify:

curl http://localhost:8000/health
# {"status":"ok"}

API docs available at http://localhost:8000/docs.
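
When scripting the setup, the curl check can be replaced by a small polling helper that waits for the server to come up. The sketch below uses only the standard library and assumes nothing beyond the /health endpoint and the {"status":"ok"} body shown above; the function names and the injectable `fetch` parameter are illustrative choices.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url: str) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(
    url: str = "http://localhost:8000/health",
    attempts: int = 10,
    delay: float = 0.5,
    fetch=fetch_health,
) -> bool:
    """Poll `url` until it reports status "ok" or the attempts run out."""
    for _ in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (urllib.error.URLError, ConnectionError, TimeoutError, ValueError):
            pass  # server not listening yet, or the body was not valid JSON
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` right after launching uvicorn returns True once the server answers, which is handy before running the SDK example in the next step.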

Step 4: Try the Python SDK

import asyncio

from contexthub_sdk import ContextHubClient


async def main() -> None:
    client = ContextHubClient(base_url="http://localhost:8000", api_key="changeme")

    # Store a private memory
    memory = await client.add_memory(
        content="SELECT date_trunc('month', created_at), SUM(amount) FROM orders GROUP BY 1",
        tags=["sql", "sales"],
    )

    # Promote to team-shared knowledge
    promoted = await client.promote_memory(uri=memory.uri, target_team="engineering")

    # Semantic search across all visible contexts
    results = await client.search("monthly sales summary", top_k=5)
    print(results)


# The SDK methods are coroutines, so they must run inside an event loop
asyncio.run(main())

ContextHub also integrates directly with agent frameworks like OpenClaw as a drop-in context engine, making context governance transparent to agent code. See Integration with OpenClaw below.

For the full E2E demo and integration tests, see Local Setup & E2E Verification Guide.


Integration with OpenClaw 🦞

ContextHub is designed as the context engine for OpenClaw, replacing its built-in engine with enterprise-grade context governance.

# One-command install
pnpm openclaw plugins install -l /path/to/ContextHub/bridge

What happens automatically (no agent code changes):

| Event | ContextHub Action |
| --- | --- |
| Agent receives a prompt | assemble(): searches all visible contexts and injects relevant ones into the system prompt |
| Agent completes a response | afterTurn(): extracts reusable facts and stores them as private memories |
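
The two hooks can be pictured with a toy in-memory engine. This is only a sketch of the lifecycle, written in Python for consistency with the SDK example. It is not the plugin's code: the class name, the word-overlap relevance test, and the "FACT:" line convention are stand-ins for whatever search and extraction the real engine performs.

```python
class ToyContextEngine:
    """In-memory stand-in for the assemble / afterTurn lifecycle."""

    def __init__(self) -> None:
        self.memories: list[str] = []

    def assemble(self, prompt: str, base_system_prompt: str = "You are a helpful agent.") -> str:
        # Naive relevance: keep memories that share at least one word with the prompt.
        words = set(prompt.lower().split())
        relevant = [m for m in self.memories if words & set(m.lower().split())]
        if not relevant:
            return base_system_prompt
        context = "\n".join(f"- {m}" for m in relevant)
        return f"{base_system_prompt}\n\nRelevant context:\n{context}"

    def after_turn(self, response: str) -> None:
        # Naive extraction: treat lines starting with "FACT:" as reusable facts.
        for line in response.splitlines():
            if line.startswith("FACT:"):
                fact = line[len("FACT:"):].strip()
                if fact and fact not in self.memories:
                    self.memories.append(fact)


engine = ToyContextEngine()
engine.after_turn("Report generated.\nFACT: monthly sales use date_trunc on created_at")
print(engine.assemble("How to query monthly sales?"))
```

Because both steps run around the agent's turn and not inside it, the agent code itself never has to call a store or search function, which is the sense in which no agent code changes are needed.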

7 agent tools available in every session:

ls · read · grep · stat · contexthub_store · contexthub_promote · contexthub_skill_publish

Multi-Agent Collaboration in Action

Org: engineering/backend  ← query-agent        Org: data/analytics  ← analysis-agent
                                                     (also engineering member)
1. query-agent stores a SQL pattern as private memory

2. query-agent promotes it to engineering team
   → ctx://team/engineering/shared_knowledge/monthly-sales-pattern

3. analysis-agent asks "How to query monthly sales?"
   → ContextHub auto-recalls the promoted pattern via assemble()
   → zero manual sharing needed

4. query-agent publishes breaking Skill v2
   → analysis-agent (pinned to v1) continues using v1 stably
   → advisory: "v2 available with breaking changes"

What makes this different from a shared document? ContextHub enforces visibility boundaries, tracks derived_from lineage, and propagates changes through dependency graphs, not just "latest version wins."

For full setup instructions, see the OpenClaw Integration Guide.


Roadmap 🗺️

  • Phase 1: MVP Core ✅. Context store (ctx:// URI routing), memory / skill / retrieval / propagation services, ACL with RLS + team hierarchy, Python SDK, OpenClaw context-engine plugin, data lake carrier, Tier 3 integration tests (P-1~P-8, C-1~C-5, A-1~A-4)
  • Phase 2: Explicit ACL & Audit. ACL allow/deny/field mask overlay, audit logging, cross-team sharing
  • Phase 3: Feedback & Lifecycle. Quality signals, automatic lifecycle transitions, long doc retrieval
  • Phase 4: Quantitative Evaluation (ECMB). SQL accuracy benchmarks, L0/L1/L2 vs. flat RAG A/B experiments
  • Phase 5: Production Hardening. Multi-instance (SKIP LOCKED), MCP Server, real catalog connectors

Documentation 📄

| Document | Description |
| --- | --- |
| OpenClaw Integration Guide | Full 5-terminal setup for ContextHub + OpenClaw |
| Local Setup & E2E Verification | Dev environment, migrations, E2E demo |
| MVP Verification Plan | Three-layer verification: tests → API demo → runtime contract |
| Developer Guide | API overview, SDK reference, tech stack, project structure |

References 📚

License ⚖️

Apache License 2.0
