
MCP server fails on multi-repo cwd + 3 related bugs (ambiguity, path-as-name, case-sensitivity, unsafe consolidate) #283

@Maicololiveras


Summary

Engram MCP server (engram mcp) keeps returning ambiguity errors and creating garbage projects when the cwd has multiple .git children, and several related issues compound the problem. After a few months of use on Windows, my user store had accumulated 9 garbage projects with absolute paths as project names plus several unintended duplicates, while every mem_save from the MCP returned ambiguous_project.

This issue bundles 4 related bugs found while debugging on engram 1.14.12 (Windows 11, MCP integrated with Claude Code).


Bug 1 — mem_save fails with ambiguous_project when cwd has multiple .git children

Severity: High — blocks mem_save from MCP entirely whenever the user's home (or any parent directory) contains multiple repos.

Repro:

  1. Have multiple git repos directly under $HOME (very common on Windows: e.g. ~/engram, ~/agent-teams-lite, ~/dotfiles).
  2. Start Engram MCP server with default args (no --project).
  3. Call mem_save with title + content (no project param — schema doesn't accept one anyway).

Expected: save succeeds, project resolved to general (or last-used, or --project default).

Actual:

{
  "available_projects": ["agent-teams-lite", "dotfiles", "engram", ...],
  "error_code": "ambiguous_project",
  "hint": "Use mem_current_project to inspect detection results, or cd into one of the listed repositories.",
  "message": "Cannot determine project: ambiguous project: multiple git repos found in cwd"
}

Why it's blocking: when Engram MCP is launched by Claude Code / OpenCode, the server's cwd is fixed (typically the user's home directory). The user cannot cd into a specific repo from inside the agent — the MCP process inherits the parent agent's cwd. So the suggested workaround in hint is impossible to apply.

The mem_save tool schema also doesn't accept project as an argument, so the project cannot be passed explicitly either.

Suggested fix: when ambiguity is detected, fall back to (in order):

  1. The --project NAME flag if provided to the MCP server.
  2. The most recently used project (latest last_activity_at).
  3. general.

Don't error out — the server has the info to make a sensible default.
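The fallback order above could be sketched roughly as follows. This is only an illustration in Python, not Engram's actual code: the `ProjectInfo` record and the `resolve_ambiguous_project` function are hypothetical names, and only `--project`, `last_activity_at` and `general` come from this report.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ProjectInfo:
    """Hypothetical record; only `last_activity_at` is named in the issue."""
    name: str
    last_activity_at: float  # e.g. a Unix timestamp


def resolve_ambiguous_project(
    cli_default: Optional[str],
    known_projects: Sequence[ProjectInfo],
) -> str:
    """Pick a project when cwd detection is ambiguous, instead of erroring."""
    # 1. An explicit --project NAME given to the MCP server wins.
    if cli_default:
        return cli_default
    # 2. Otherwise use the project with the latest activity.
    if known_projects:
        return max(known_projects, key=lambda p: p.last_activity_at).name
    # 3. Last resort: the catch-all project.
    return "general"
```

The point of the sketch is that every branch returns a usable name, so mem_save never has to surface ambiguous_project to an agent that cannot change its cwd.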


Bug 2 — When auto-detection fails, projects get created with the absolute path as the project name

Severity: Medium — produces garbage data over time. Workaround = manual SQL cleanup.

After several months of usage I had 9 garbage project entries in my Engram store, all with Windows-style absolute paths as the project name and 0 observations but lots of empty session rows. Sample shape (paths anonymized):

Project name pattern                   obs   sessions
C:\Users\<user>                          0        296
C:\Users\<user>\.opencode                0         53
C:\<workspace-A>                         0         50
C:\Users\<user>\source\repos\<repo>      0         11
D:\<workspace-B>\<sub>\<sub>             0         17
...and 4 more similar                    0     varied

These were created in earlier 1.14.x versions where the MCP fell back to "use cwd as the project name" when detection failed. Newer versions error out instead (Bug 1), but the garbage from old versions remains.

Cleanup query (worked safely — 434 empty sessions deleted, 0 observations lost):

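-- Delete sessions whose project is a C:\ absolute path with no live observations.
-- SQLite string literals do not treat backslash as an escape character, so a
-- single backslash is used here. Repeat with 'D:\%' for other drives.
DELETE FROM sessions
WHERE project LIKE 'C:\%'
  AND project NOT IN (
    SELECT DISTINCT project FROM observations
    WHERE deleted_at IS NULL AND project IS NOT NULL
  );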

Suggested fix: validate project names. Reject any name that:

  • Contains \ or / (filesystem separators) — these are paths, not project names
  • Is an absolute path pattern (^[A-Z]: on Windows, ^/ on Unix)

Fall back to general instead of using the path verbatim.
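A minimal sketch of that validation, in Python for illustration only (the function names are hypothetical and this is not Engram's code). It follows the two rules above, with one assumption added: lowercase drive letters such as `c:` are also rejected.

```python
import re

# Matches a Windows drive prefix such as "C:" or "d:" at the start of the name.
_DRIVE_PREFIX = re.compile(r"^[A-Za-z]:")


def looks_like_path(name: str) -> bool:
    """True if the name contains a path separator or starts with a drive prefix."""
    # A leading "/" (Unix absolute path) is already covered by the "/" check.
    return "\\" in name or "/" in name or bool(_DRIVE_PREFIX.match(name))


def sanitize_project_name(name: str, fallback: str = "general") -> str:
    """Return the name unchanged if valid, otherwise the fallback project."""
    name = name.strip()
    if not name or looks_like_path(name):
        return fallback
    return name
```

Under these rules `C:\Users\<user>` and `/home/<user>/repo` both resolve to general, while names such as `engram` or `E3` pass through untouched.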

Optionally: engram doctor cleanup-paths command that detects and deletes these in one shot.


Bug 3 — mem_search filter project: "E3" is lowercased internally and not found

Severity: Medium — case sensitivity inconsistency.

Repro:

  1. Have a project saved with name E3 (uppercase).
  2. Call mem_search(query: "...", project: "E3") from MCP.

Expected: returns observations under project E3.

Actual:

{
  "available_projects": [..., "E3", ...],
  "error_code": "unknown_project",
  "hint": "Use one of the available_projects values, or omit project to auto-detect.",
  "message": "Project \"e3\" not found in store"
}

The error message shows the input was lowercased to "e3" before lookup, even though available_projects lists E3 (uppercase) in the same response.

Suggested fix: case-insensitive lookup against the available_projects set, OR preserve case as-is. Current behavior is inconsistent: the server preserves case in storage but lowercases the input filter.

The mismatch can be confirmed against the SQLite backend directly: SELECT * FROM observations WHERE project = 'e3' returns nothing, because the comparison is case-sensitive and the stored name is E3. Storage is therefore consistent; it is the filter arriving via MCP that gets lowercased before lookup, so this is specifically an input normalization issue at the MCP boundary.
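One way the case-insensitive lookup could work is sketched below in Python. `match_project` is a hypothetical helper, not part of Engram. It resolves the requested name against the stored names and returns the stored spelling, so the rest of the query can keep using exact comparison.

```python
from typing import Iterable, Optional


def match_project(requested: str, available: Iterable[str]) -> Optional[str]:
    """Return the stored project name matching `requested`, ignoring case."""
    names = list(available)
    # An exact match wins, so two names differing only in case stay distinct.
    if requested in names:
        return requested
    wanted = requested.casefold()
    candidates = [n for n in names if n.casefold() == wanted]
    # Only accept an unambiguous case-insensitive match.
    return candidates[0] if len(candidates) == 1 else None
```

With this approach both `"E3"` and `"e3"` resolve to the stored `E3`, and a store that happens to contain both `AB` and `Ab` still requires the exact spelling.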


Bug 4 — engram projects consolidate --all fuzzy-matching groups unrelated projects

Severity: High — runs the risk of catastrophic data merge.

Repro:

engram projects consolidate --all --dry-run

Expected: groups of projects with truly similar names (typos, casing variants, kebab vs snake case).

Actual: returns 4 groups, but Group 2 contains 46 completely unrelated projects with no shared substring or naming pattern, and proposes merging all of them into general. Sample shape (project names anonymized — what matters is the count and that they are clearly distinct projects):

Group 2:
    [1]  <internal-tool-A>                47 obs
    [2]  <absolute-path-B>                 0 obs
    [10] <product-C>                       1 obs
    [11] <product-C-variant>               1 obs
    [12] <presentation-D>                 13 obs
    [14] <mcp-E>                           4 obs
    [16] <plugin-F>                        6 obs
    [17] <daemon-G>                        1 obs
    [22] <training-H>                     10 obs
    [23] <runtime-I>                      12 obs
    [24] <plugin-J>                       90 obs
  → [27] general                         262 obs
    [29] <industrial-K>                   46 obs
    [35] <agent-L>                       119 obs
    [38] <user-home>                     229 obs
    [40] <opencode-plugin-M>              69 obs
    ...46 entries total — every one of these is a distinct project...
  Suggested canonical: "general"
  [dry-run] Would merge into "general"

These are fully distinct projects with their own observation history (up to 229 obs each in the sample above, not counting general). If a user runs engram projects consolidate --all (without --dry-run) trusting the algorithm, they would lose project separation across most of their store.

Suggested fix: tighten the similarity threshold significantly. A reasonable heuristic:

  • Levenshtein distance ≤ 3 on the canonical (lowercased, dehyphenated) form, OR
  • Substring containment where the shorter name is more than 50% of the longer name's length, OR
  • A shared normalized prefix of ≥ 4 characters.
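The heuristic above could look roughly like this. It is a Python sketch under stated assumptions, not the current Engram algorithm: the three rules are treated as alternatives, "normalized" is taken to mean lowercased with hyphens and underscores removed, and all function names are hypothetical.

```python
def normalize(name: str) -> str:
    """Canonical form: lowercased, with hyphens and underscores removed."""
    return name.lower().replace("-", "").replace("_", "")


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (0 if equal)
            ))
        prev = cur
    return prev[-1]


def similar(a: str, b: str, max_dist: int = 3, min_prefix: int = 4) -> bool:
    """True if two project names are close enough to propose a merge."""
    x, y = normalize(a), normalize(b)
    # Rule 1: small edit distance on the canonical form.
    if levenshtein(x, y) <= max_dist:
        return True
    short, long_ = sorted((x, y), key=len)
    # Rule 2: containment where the shorter name is over half the longer one.
    if short and short in long_ and len(short) * 2 > len(long_):
        return True
    # Rule 3: shared normalized prefix of at least `min_prefix` characters.
    return len(short) >= min_prefix and x[:min_prefix] == y[:min_prefix]
```

For example, `similar("my-project", "my_project")` and `similar("course-branding", "course-presentation")` are true, while `similar("engram", "dotfiles")` is false. One caveat: an absolute bound of 3 still matches any two names of three characters or fewer, so a length-relative bound may be safer for short names.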

The current algorithm seems to use something much looser (maybe just "ends with vowel-consonant" or "shares 1+ token") which is far too permissive.

For reference, Group 1 in the same report is correct (course-branding, course-presentation), so the algorithm has the right shape, just calibrated way too loose for Group 2.

Suggested additional safeguard: add an interactive --confirm-each flag that requires explicit y/N per group. The current --dry-run is good but a user who skips dry-run could lose data instantly.
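A per-group confirmation loop might be as simple as the following Python sketch. `--confirm-each` is the flag proposed above, not an existing one, and `confirm_each` along with its `ask` parameter are hypothetical names. The prompt function is injectable so the logic can be exercised without a terminal.

```python
from typing import Callable, Dict, List


def confirm_each(
    groups: Dict[str, List[str]],
    ask: Callable[[str], str] = input,
) -> Dict[str, List[str]]:
    """Return only the merge groups the user explicitly approved."""
    approved = {}
    for canonical, members in groups.items():
        answer = ask(f'Merge {len(members)} projects into "{canonical}"? [y/N] ')
        # Default to "no": anything other than y/yes skips the group.
        if answer.strip().lower() in ("y", "yes"):
            approved[canonical] = members
    return approved
```

Because an empty answer counts as N, pressing Enter through the prompts merges nothing.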


Environment

  • OS: Windows 11 Pro 10.0.22631
  • Engram: 1.14.12 (also reproduced on 1.14.4)
  • Shell: Git Bash + PowerShell 7
  • Integration: Claude Code (MCP via ~/.claude/mcp/engram.json)
  • Multi-repo home directory: 3+ git repos directly under $HOME (and growing)

Impact summary

These 4 bugs together produce a degrading user experience over time:

  1. Bug 1 prevents normal mem_save from MCP → users do SQL workarounds → data ends up inconsistently saved
  2. Bug 2 silently accumulates garbage entries that pollute the project list
  3. Bug 3 surfaces as confusing "Project not found" when the project clearly exists
  4. Bug 4 turns a "cleanup" tool into a data-loss footgun

I'd be glad to provide more diagnostics or test fixes — happy to PR the validation logic for Bug 2 if you'd like.

Thanks for the great tool 🙏
