
🧠 Gateway-Level Memory System — Persistent Project Context Across Engines and Terminals #91

@realDuang


💡 Vision

CodeMux sits at the crossroads of every message — all user inputs, AI outputs, and tool calls flow through EngineManager. Yet every new conversation starts from scratch. The AI doesn't know what decisions were made last session, what conventions this project follows, or how the user prefers to work.

This hurts most in three scenarios:

| Scenario | Pain |
|---|---|
| Engine switching | Discovered "tests must use `--runInBand`" while using Engine A, switched to Engine B — knowledge lost |
| IM Bot access | Asking questions via Feishu/DingTalk — the AI has zero project context, no local config files to fall back on |
| New conversations | Repeating "this project uses bun", "deploy target is staging", "API uses snake_case" every single time |

CodeMux is the only layer that can provide cross-engine memory continuity. Any individual engine's memory system serves only itself — switch engines and the context breaks.

🎯 Design Principles

1. Piggyback, Don't Drive

Routine memory updates should NOT require a separate LLM call.

There are two approaches to memory extraction:

  • Post-hoc analysis: After a conversation ends, spin up a separate LLM session to analyze the dialogue and extract memories. This costs extra tokens every turn and assumes the user's API is always available.
  • Inline update: Tell the AI in the prompt "after completing your task, update the memory file." The AI adds one file-edit tool call within the conversation turn it's already running. Zero extra cost.

CodeMux should use the latter. The LLM is already running — let it handle memory maintenance as part of the task. Never consume tokens behind the user's back.
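The inline approach amounts to one extra block in a prompt that is already being sent. A minimal TypeScript sketch of what that could look like; the constant, the function name, and the `<memory-policy>` tag are illustrative assumptions, not an existing CodeMux API:

```typescript
// Hypothetical instruction appended to the system prompt. The wording and the
// <memory-policy> wrapper are assumptions for illustration.
const MEMORY_INSTRUCTION = [
  "After completing the task, if you discovered a durable fact about this",
  "project or the user's preferences, append it as a one-line bullet to",
  "today's daily note using your file-edit tool.",
  "Do not store facts that can be derived from the code itself.",
].join("\n");

function withMemoryInstruction(systemPrompt: string): string {
  // One extra block in a prompt that is sent anyway: zero additional LLM calls.
  return `${systemPrompt}\n\n<memory-policy>\n${MEMORY_INSTRUCTION}\n</memory-policy>`;
}
```

The engine then performs the memory write as an ordinary file-edit tool call inside the turn it is already running.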

2. Thin Memory Layer — Complement Engines, Don't Compete

Some engines have their own sophisticated memory systems. CodeMux should not duplicate that work. Instead, it should fill the gap for engines that lack memory and provide cross-engine continuity.

Only store knowledge that cannot be derived from code: user preferences, project decisions, lessons learned, workflow habits. Never store file paths, function signatures, or architecture details — the AI can retrieve those in real-time via tools.

3. Plain Files, No Complex Retrieval

No vector databases, embeddings, or SQLite. Markdown files + AI reading on demand is sufficient. Users can open and edit memory files with any text editor β€” fully transparent and controllable.

πŸ—‚οΈ Storage Structure

~/.codemux/memory/
├── global.md                              # Global user preferences (cross-project)
│
└── projects/
    └── <sanitized-project-path>/
        ├── MEMORY.md                      # Project memory index (<200 lines)
        │                                  # One-line pointers: - [Title](topics/file.md) — hook
        ├── daily/                         # Daily notes (write buffer)
        │   ├── 2026-04-04.md
        │   └── 2026-04-03.md
        └── topics/                        # Topic files (consolidated long-term knowledge)
            ├── testing-patterns.md
            └── deployment-notes.md
  • daily/*.md — Low-friction write buffer. The AI appends discoveries during conversation without worrying about breaking existing structure
  • topics/*.md — Consolidated long-term knowledge, produced by the "Organize" feature
  • MEMORY.md — Navigation index. The AI reads the titles and decides whether to open a specific topic file
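The layout above can be captured in a small path helper. This is a hypothetical sketch of what `memory-store.ts` might expose; the sanitization scheme (path separators collapsed to dashes) and both function names are assumptions:

```typescript
// Hypothetical sketch of the storage layout; the sanitizer and helper names
// are assumptions, not the shipped memory-store.ts API.
function sanitizeProjectPath(projectPath: string): string {
  // e.g. "/Users/me/dev/codemux" -> "Users-me-dev-codemux"
  return projectPath.replace(/^[/\\]+/, "").replace(/[/\\:]+/g, "-");
}

function memoryPaths(home: string, projectPath: string, today: Date) {
  const root = `${home}/.codemux/memory`;
  const project = `${root}/projects/${sanitizeProjectPath(projectPath)}`;
  const day = today.toISOString().slice(0, 10); // e.g. "2026-04-04"
  return {
    global: `${root}/global.md`,
    index: `${project}/MEMORY.md`,
    daily: `${project}/daily/${day}.md`,
    topics: `${project}/topics`,
  };
}
```

Because every tier is a plain file path, the same helper serves the injector, the Organize step, and the Memory Panel.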

💉 Injection Strategy

In EngineManager.sendMessage(), inject memory context before sending to the engine:

| Content | Injection |
|---|---|
| global.md | Full (typically <20 lines) |
| MEMORY.md | Full (<200 lines) |
| daily/today.md | Full |
| daily/yesterday.md | Full (if exists) |
| topics/*.md | Not injected — AI reads on demand |

Total budget: ≤500 lines / ≤50KB.

Adaptive injection by engine capability: For engines with their own memory system, only inject global.md (cross-engine user preferences) to avoid duplication and potential contradictions. For engines without memory, do full injection.
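Putting the budget and the adaptive rule together, the injection step could look like the following sketch. The `MemoryTier` shape, the function name, and the boolean capability flag are assumptions; the tier order and limits come from the table and budget above:

```typescript
// Assumed shape: one loaded memory file per tier, in injection priority order.
interface MemoryTier {
  name: string;    // e.g. "global.md", "MEMORY.md", "daily/2026-04-04.md"
  content: string;
}

const MAX_LINES = 500;        // total budget from the design above
const MAX_BYTES = 50 * 1024;

function buildMemoryContext(tiers: MemoryTier[], engineHasOwnMemory: boolean): string {
  // Engines with their own memory system only receive cross-engine user
  // preferences, to avoid duplication and contradictions.
  const selected = engineHasOwnMemory
    ? tiers.filter((t) => t.name === "global.md")
    : tiers;

  const encoder = new TextEncoder();
  let lines = 0;
  let bytes = 0;
  const parts: string[] = [];
  for (const tier of selected) {
    const tierLines = tier.content.split("\n").length;
    const tierBytes = encoder.encode(tier.content).length;
    // Stop before exceeding the budget; later tiers are lower priority.
    if (lines + tierLines > MAX_LINES || bytes + tierBytes > MAX_BYTES) break;
    lines += tierLines;
    bytes += tierBytes;
    parts.push(`## ${tier.name}\n${tier.content}`);
  }
  return parts.join("\n\n");
}
```

The resulting string would then be prepended to the outgoing message in `EngineManager.sendMessage()` (exact placement is Open Question 1).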

🖥️ Frontend Interaction

Memory Tab — Third Tab in the Right Panel

Add a "Memory" tab alongside "Files" / "Changes" in the existing File Explorer panel:

┌──────────────────────────────────┐
│  Files │ Changes │ Memory        │
├──────────────────────────────────┤
│                                  │
│  📌 Global Preferences           │
│  ┌────────────────────────────┐  │
│  │ • Respond in Chinese       │  │
│  │ • Always use pnpm          │  │
│  │                    [Edit]  │  │
│  └────────────────────────────┘  │
│                                  │
│  📁 Project: codemux             │
│  ┌────────────────────────────┐  │
│  │ Index (12 entries)         │  │
│  │ ├ testing-patterns     [→] │  │
│  │ ├ deployment-notes     [→] │  │
│  │ └ api-conventions      [→] │  │
│  │                            │  │
│  │ Today's Notes (3 entries)  │  │
│  │ ├ CI requires Node 20      │  │
│  │ ├ User prefers Tailwind    │  │
│  │ └ Staging config changed   │  │
│  │                            │  │
│  │        [Edit] [Organize]   │  │
│  └────────────────────────────┘  │
└──────────────────────────────────┘

Why the right panel instead of the sidebar: The sidebar is for navigation (projects, sessions, scheduled tasks). Memory is content — closer in nature to Files/Changes. Users need to reference memory while chatting, and the right panel is visible alongside the conversation.

View & Edit

  • Click any entry to expand full content
  • [Edit] opens an inline Markdown editor
  • Hover reveals a delete button
  • [+ Add] button at the top for manual entries

Organize — Button in the Memory Panel

Clicking [Organize] opens a modal with two modes:

| Mode | What It Does | Cost |
|---|---|---|
| Smart Organize | Creates a temporary session, uses AI to distill daily notes into topic files, updates the index | ~5-10K tokens (user's conscious choice) |
| Quick Organize | Pure file operations: deduplicate lines, truncate old dailies, preserve index | Zero |

The modal clearly shows estimated token cost so the user makes an informed decision. Progress is displayed within the Memory tab.
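The zero-cost mode is just text manipulation. A sketch of the deduplication half of Quick Organize as a pure function over a daily note's content (the function name is illustrative; truncation of old dailies is omitted):

```typescript
// Hypothetical Quick Organize step: drop repeated bullet lines from a daily
// note while preserving order. Pure string in, string out - no LLM involved.
function dedupeDaily(content: string): string {
  const seen = new Set<string>();
  return content
    .split("\n")
    .filter((line) => {
      const key = line.trim();
      // Only deduplicate bullets; headings and blank lines pass through.
      if (key === "" || !key.startsWith("- ")) return true;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .join("\n");
}
```

Running this over each `daily/*.md` file (plus truncating notes older than some cutoff) is enough to keep the write buffer tidy without spending a single token.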

Gentle Reminders

When daily notes accumulate for 7+ days without organizing, show a notification badge on the Memory tab:

Files │ Changes │ Memory 🔴

Inside the panel, a top banner:

⚠ 7 days since last organize — consider tidying up for cleaner context  [Organize] [Dismiss]

📱 IM Bot Scenario

IM Bots are where memory provides the most value — the AI in a Feishu group has no local filesystem to rely on. Memory injection is the only source of persistent context.

  • Full injection of all memory tiers (IM users can't browse files themselves)
  • Since the AI can't directly edit files via IM, it outputs structured blocks (e.g., <memory-update>...</memory-update>), and the channel adapter parses them and writes them to disk
  • IM Bot slash commands could include /memory to view the current project's memory
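The channel-adapter side of the structured-block scheme can be sketched as follows. The `<memory-update>` tag comes from the bullet above; the function name and return shape are assumptions:

```typescript
// Hypothetical channel-adapter helper: pull <memory-update> blocks out of an
// IM reply so they can be written to disk, and strip them from the text the
// IM user actually sees.
function extractMemoryUpdates(reply: string): { updates: string[]; visible: string } {
  const updates: string[] = [];
  const visible = reply.replace(
    /<memory-update>([\s\S]*?)<\/memory-update>/g,
    (_match, body: string) => {
      updates.push(body.trim());
      return ""; // remove the block from the user-facing message
    },
  );
  return { updates, visible: visible.trim() };
}
```

Each extracted block would then be appended to the project's daily note by the gateway, mirroring what the file-edit tool call does in terminal sessions.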

πŸ›‘οΈ Compact Protection

No special hooks needed. Since memory updates happen inline (piggybacking on the conversation), once the AI writes to a daily file, it's already on disk.

Optional enhancement: when a compact event is detected, append a one-line system hint to the next message — "Context was just compressed. If you have unsaved discoveries, please write them to the memory file now." This is not an extra LLM call, just an extra line in an already-flowing message.
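As a sketch of how cheap this enhancement is, assuming a `compactDetected` flag surfaced by the engine adapter (the flag and function name are hypothetical):

```typescript
// Hypothetical compact-protection hint: one prepended line on the next
// message that is being sent anyway, never a separate LLM call.
function withCompactHint(userMessage: string, compactDetected: boolean): string {
  if (!compactDetected) return userMessage;
  return (
    "[system] Context was just compressed. If you have unsaved discoveries, " +
    "please write them to the memory file now.\n\n" + userMessage
  );
}
```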

πŸ—οΈ Why CodeMux Is Uniquely Positioned

| Existing Infrastructure | How It Enables Memory |
|---|---|
| EngineManager (message hub) | Natural injection point — all messages pass through here |
| ConversationStore | Full conversation history already persisted — data source is ready |
| Identity prompt injection | System prompt augmentation pattern already exists |
| Engine adapter abstraction | Adaptive injection per engine capability is straightforward |
| Channel adapters (IM Bots) | Memory parsing + write-back for non-filesystem environments |
| WebSocket gateway | Real-time Memory Panel updates across all connected clients |
| File Explorer panel | UI container for the Memory tab already exists |

📋 Implementation Path

| Phase | Content | Extra LLM? |
|---|---|---|
| P0 | memory-store.ts — file read/write + directory management | ❌ |
| P0 | engine-manager.ts — inject memory on sendMessage | ❌ |
| P0 | Append memory-update instructions to system prompt | ❌ |
| P1 | Memory Panel UI (third tab in right panel) | ❌ |
| P1 | Adaptive injection by engine type | ❌ |
| P2 | Smart Organize (frontend-triggered, temporary session) | ✅ User-initiated |
| P2 | Quick Organize (pure file ops fallback) | ❌ |
| P2 | IM Bot adapter: memory output parsing + write-back | ❌ |
| P3 | Memory entry provenance (link back to originating conversation) | ❌ |
| P3 | Compact event detection + reminder injection | ❌ |

P0 estimate: ~3 files, 300–500 lines. A memory-store.ts for file I/O, injection logic in engine-manager.ts, and memory-update instructions appended to the system prompt.

🤔 Open Questions

  1. Injection format — Should memory be injected as a system message prefix, a separate system message block, or prepended to the user's first message?
  2. Engine detection — How to reliably detect whether an engine has its own memory system? Hardcode per engine type, or probe at runtime?
  3. Multi-user IM groups — When multiple users share a Feishu group session, should global preferences be per-user or per-group?
  4. Memory size governance — What happens when daily notes grow very large before the user organizes? Auto-truncate, warn, or let it grow?
  5. Conflict resolution — If the AI writes contradictory information to memory across sessions, how should the Organize step handle it?
