fix: resolve multi-repo cwd ambiguity + 3 related bugs (Closes #283) #284
Fixes 4 related issues that compound for users with multiple git repos in $HOME or large project stores accumulated over months. All fixes are opt-in or strictly additive, with no breaking changes.

Bug #1: mem_save ambiguity fallback (--default-project flag)
resolveWriteProject now consults a configurable fallback when cwd-based detection returns ErrAmbiguousProject. Operators opt in by launching the MCP server with `engram mcp --default-project=NAME`. Empty default preserves the historical fail-fast contract. The fallback response carries Source="default_fallback" + Warning so callers can see what happened.

Bug #2: Reject path-like project names
NormalizeProject now detects filesystem paths (Windows drive paths, UNC, Unix absolute, anything with separators) and reduces them to the basename with a warning. A new helper sanitizePathLike isolates the detection for testability. This prevents future "C:\Users\foo" artifacts from polluting the project list when detection fails.

Bug #3: Case-insensitive project resolution
New Store.ResolveProjectName(input) tries case-sensitive first, falls back to COLLATE NOCASE, and returns the canonical stored casing. The MCP layer uses it from resolveReadProject so a search filter like project="E3" finds legacy data stored as "E3" (previously reported as not found because the store normalized the input to "e3" before the case-sensitive lookup). Distinct casings remain individually addressable: exact-case match always wins.

Bug #4: Tighten consolidate similarity grouping
groupSimilarProjects now skips directories shared by more than 3 distinct projects (parameterized by maxSharedProjectsForDirMatch). This eliminates the catastrophic transitive cluster where every project sharing $HOME got merged into a single component (in one user's store: 46 unrelated projects proposed for merge into "general"). Adds new flag --strict-similarity that disables shared-directory union entirely for fully name-based grouping.

Tests
- 8 cases for NormalizeProject_PathLike (Windows/UNC/Unix/relative).
- ResolveProjectName covers exact match, case-insensitive fallback, distinct-casing addressability, unknown, and empty.
- resolveWriteProject_AmbiguousFallback + SetDefaultProjectFallback normalization.
- groupSimilarProjects regression (noisy dir not merged) + legitimate-rename-still-merges + strict-mode skips dir union.

Closes Gentleman-Programming#283
CI environment note for the maintainer reviewing this PR: Confirmed locally that
The test packages exercised by this PR ( ( The CI on Linux likely won't hit the same hang, but flagging it in case it surfaces and the reviewer wonders if the PR introduced it. Happy to dig into the cmdServe goroutine leak as a follow-up issue if it'd be useful. Thanks again 🙏
Summary
Fixes the 4 related bugs reported in #283:
1. `mem_save` blocks with `ambiguous_project` when cwd has multiple `.git` children (very common when the MCP server starts from `$HOME`).
2. Path-like project names pollute the project list when detection fails (`C:\Users\…` artifacts).
3. `mem_search` with `project="E3"` reports "Project 'e3' not found" even though the store has a row under `E3` (case-sensitivity inconsistency for legacy data).
4. `engram projects consolidate --all` groups dozens of unrelated projects into a single component because every project transitively shares `$HOME` as a "directory fingerprint" → unsafe merge proposed.

Closes #283
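To make the `consolidate` problem concrete, here is a minimal sketch of how a shared `$HOME` collapses unrelated projects under union-find, and how skipping directories shared by more than 3 projects (the threshold this PR describes) avoids it. It is not the PR's code: `groupByDir`, the `dirs` input shape, and the sample project names are hypothetical, and the name-similarity edges of the real `groupSimilarProjects` are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// maxSharedProjects mirrors the PR's maxSharedProjectsForDirMatch = 3.
const maxSharedProjects = 3

// find returns the representative of x, halving the path as it walks up.
func find(parent map[string]string, x string) string {
	for parent[x] != x {
		parent[x] = parent[parent[x]]
		x = parent[x]
	}
	return x
}

// groupByDir unions projects that share a directory and returns every group
// with at least two members. dirs maps a directory to the distinct projects
// seen in it (hypothetical input shape). When capNoisy is true, a directory
// shared by more than maxSharedProjects projects is ignored.
func groupByDir(dirs map[string][]string, capNoisy bool) [][]string {
	parent := map[string]string{}
	for _, projects := range dirs {
		for _, p := range projects {
			parent[p] = p
		}
	}
	for _, projects := range dirs {
		if capNoisy && len(projects) > maxSharedProjects {
			continue // too noisy to act as a fingerprint
		}
		for i := 1; i < len(projects); i++ {
			parent[find(parent, projects[i])] = find(parent, projects[0])
		}
	}

	// Collect members per root in sorted order so the output is deterministic.
	names := make([]string, 0, len(parent))
	for p := range parent {
		names = append(names, p)
	}
	sort.Strings(names)
	members := map[string][]string{}
	for _, p := range names {
		root := find(parent, p)
		members[root] = append(members[root], p)
	}
	var groups [][]string
	for _, m := range members {
		if len(m) > 1 {
			groups = append(groups, m)
		}
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i][0] < groups[j][0] })
	return groups
}

func main() {
	dirs := map[string][]string{
		"/home/u":         {"api", "blog", "cli", "docs", "web"}, // shared parent
		"/home/u/src/web": {"web", "web-old"},                    // genuine rename
	}
	fmt.Println(groupByDir(dirs, false)) // [[api blog cli docs web web-old]]
	fmt.Println(groupByDir(dirs, true))  // [[web web-old]]
}
```

Without the cap, the single shared parent links all six names into one component; with it, only the pair that shares a specific directory is still proposed for merging.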
Changes by bug
Bug #1: `mem_save` ambiguity fallback
- `engram mcp --default-project=NAME` configures a project to fall back to when cwd-based detection returns `ErrAmbiguousProject`.
- `mcp.SetDefaultProjectFallback(name)` in `internal/mcp/mcp.go`. The fallback is opt-in: empty string preserves the historical "fail fast" contract.
- `available_projects` and adds `Source: "default_fallback"` plus a `Warning` so callers can see what happened.

Files: `internal/mcp/mcp.go`, `cmd/engram/main.go` (flag parsing + help text).

Bug #2: Reject path-like project names
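As a rough illustration of the basename reduction for this bug, the sketch below treats any name containing `/` or `\` as path-like and keeps only its last component. It assumes a simplified detection rule and a hypothetical helper name, `reducePathLike`; the PR's actual `sanitizePathLike` and the rest of `NormalizeProject` (for example lowercasing) are not reproduced, and the sketch returns `"unknown"` directly rather than recursing.

```go
package main

import (
	"fmt"
	"strings"
)

// isDrive reports whether s is a bare Windows drive such as "C:".
func isDrive(s string) bool {
	if len(s) != 2 || s[1] != ':' {
		return false
	}
	c := s[0]
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// reducePathLike is a hypothetical stand-in for the PR's sanitizePathLike.
// It returns the cleaned name and whether the input looked like a path, so
// the caller can decide to emit a warning.
func reducePathLike(name string) (string, bool) {
	const seps = `/\` // both separators count, regardless of host OS
	if !strings.ContainsAny(name, seps) {
		return name, false // plain names pass through untouched
	}
	base := strings.TrimRight(name, seps)
	if i := strings.LastIndexAny(base, seps); i >= 0 {
		base = base[i+1:]
	}
	// A root such as "/" or `C:\` leaves nothing usable as a project name.
	if base == "" || isDrive(base) {
		return "unknown", true
	}
	return base, true
}

func main() {
	inputs := []string{`C:\Users\foo`, `\\server\share\proj`, "/home/u/engram/", "src/app", `C:\`, "engram"}
	for _, in := range inputs {
		out, pathLike := reducePathLike(in)
		fmt.Printf("%q -> %q (path-like: %v)\n", in, out, pathLike)
	}
}
```

Handling both separators explicitly, instead of relying on `path/filepath`, keeps the result the same whether the server runs on Windows or Unix.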
- `NormalizeProject` now detects filesystem paths (Windows drive paths, UNC, Unix absolute, anything with `\` or `/`) and reduces them to the basename with a warning.
- New helper `sanitizePathLike` extracted for testability.
- A drive root (`C:\`) yields an empty basename → recursion bottoms out at `"unknown"` instead of leaving an empty project name.

Files: `internal/store/store.go`.

Bug #3: Case-insensitive project resolution
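The resolution order for this bug (exact-case match first, case-insensitive fallback second, always returning the stored spelling) can be sketched with an in-memory slice standing in for the store. This is an assumption-laden illustration, not the PR's implementation: `resolveProjectName` and its slice argument are hypothetical, `strings.EqualFold` only approximates SQLite's ASCII-only `COLLATE NOCASE`, and taking the first stored match when several casings collide is a choice made here because the PR text does not specify tie-breaking.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveProjectName returns the stored spelling that matches input, plus
// whether anything matched. stored is a hypothetical stand-in for the
// project names held in the store.
func resolveProjectName(stored []string, input string) (string, bool) {
	if input == "" {
		return "", false
	}
	// Fast path: an exact-case match always wins.
	for _, name := range stored {
		if name == input {
			return name, true
		}
	}
	// Fallback: case-insensitive comparison, returning the canonical casing.
	for _, name := range stored {
		if strings.EqualFold(name, input) {
			return name, true
		}
	}
	return "", false
}

func main() {
	legacy := []string{"E3", "engram"}
	name, ok := resolveProjectName(legacy, "e3")
	fmt.Println(name, ok) // E3 true

	both := []string{"E3", "e3"}
	name, ok = resolveProjectName(both, "e3")
	fmt.Println(name, ok) // e3 true
}
```

Returning the stored spelling rather than the caller's input is what lets later queries match the exact row.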
- `Store.ResolveProjectName(input)` method: tries a case-sensitive match first (fast path), falls back to `COLLATE NOCASE`, and returns the canonical stored casing. Distinct casings (e.g. both `E3` and `e3` in the store) remain individually addressable: exact-case match wins.
- `resolveReadProject` in `internal/mcp/mcp.go` now uses `ResolveProjectName` instead of `ProjectExists`, with two attempts: normalized form first, then the raw user override (in case the user passed exact stored casing that survives normalization).
- `DetectionResult.Project` is the canonical stored name, not the normalized input, so downstream queries match exact rows.

Files: `internal/store/store.go`, `internal/mcp/mcp.go`.

Bug #4: Tighten `consolidate` similarity grouping
- `groupSimilarProjects` is now a thin wrapper over `groupSimilarProjectsWithMode(projects, strictSimilarity bool)`.
- `maxSharedProjectsForDirMatch = 3`: when a directory is touched by more than 3 distinct projects, it is treated as too noisy to be a fingerprint and is skipped in the union-find. This is what eliminates the catastrophic transitive cluster where every project sharing `$HOME` got merged into one component.
- `engram projects consolidate --strict-similarity`: disables shared-directory union entirely. Only name-similarity edges are formed. Recommended for stores where the user knows many projects share noisy parent paths.

Files: `cmd/engram/main.go`.

Tests added
- `TestNormalizeProject_PathLike` (8 cases: Windows drive paths, UNC, Unix absolute, relative paths with separators, drive root edge case, plain-name passthrough).
- `TestResolveProjectName_CaseInsensitive` (legacy `E3` row inserted directly via SQL, covers exact match, lowercase fallback, unknown, empty).
- `TestResolveProjectName_PrefersExactCaseWhenBothExist` (asserts addressability of distinct casings).
- `TestResolveWriteProject_AmbiguousFallback` (fallback to `general` when `--default-project` is set, asserts envelope fields).
- `TestSetDefaultProjectFallback_NormalizesAndTrims` (confirms input normalization).
- `TestGroupSimilarProjects_NoisyDirectoryNotMerged` (5 unrelated projects sharing `$HOME` → 0 groups).
- `TestGroupSimilarProjects_LegitimateRenameStillMerges` (2 projects with unique shared dir → 1 group, canonical chosen by obs count).
- `TestGroupSimilarProjectsWithMode_StrictDisablesDirUnion` (strict mode skips dir union even for legitimate renames).
- `TestResolveWriteProject_AmbiguousError` was updated to explicitly clear the fallback so it continues to assert the historical error path.

Backward compatibility
- `--default-project` is empty by default → ambiguous cwd still errors as before.
- `--strict-similarity` is opt-in → consolidate behavior unchanged for users who do not pass it.
- `NormalizeProject` only activates when the input contains path separators; names without separators (the overwhelming majority of inputs) are unchanged.
- `ResolveProjectName` is a new method; existing `ProjectExists` is untouched. The MCP layer now prefers the new method but case-sensitive matches still work identically.

Test plan
- `go build ./...`: clean.
- `go test ./internal/project/ ./internal/store/ ./internal/mcp/ ./cmd/engram/ -count=1`: all passing locally (existing + new tests).

Notes
- `status:needs-review`. This PR is opened ahead of approval as a courtesy starting point so the maintainer can review the proposed shape of the fixes alongside the bug report. Happy to revise scope, split into smaller PRs, or hold until approval, whatever works best for the project.

🙏 Thanks for the great tool.