Skip to content

feat(store): inject implicit graph-linking metadata via topic_key siblings#266

Open
Snakeblack wants to merge 1 commit intoGentleman-Programming:mainfrom
Snakeblack:feat/implicit-graph-linking
Open

feat(store): inject implicit graph-linking metadata via topic_key siblings#266
Snakeblack wants to merge 1 commit intoGentleman-Programming:mainfrom
Snakeblack:feat/implicit-graph-linking

Conversation

@Snakeblack
Copy link
Copy Markdown

Summary / Resumen

Adds implicit graph-linking to mem_search: alongside FTS5 lexical results, the response now appends sibling observations that share topic_key with any returned hit. This gives agents the relational neighbourhood of a memory without an embedding model — staying true to Engram's single binary, zero dependencies, local-first philosophy.

Changes

Store Layer

  • RelatedByTopicKeys(keys, project, scope, excludeIDs) (map[string][]RelatedObservation, error): returns up to 5 sibling rows per topic_key, filtered by project + scope, excluding the IDs already in the search hit set and rows where deleted_at IS NOT NULL. Uses the existing idx_obs_topic(topic_key, project, scope, updated_at DESC) composite index.
  • InsertObservationForTest(...): low-level test helper that bypasses AddObservation's topic_key UPSERT path so tests can seed multiple sibling rows sharing (topic_key, project, scope).

MCP Layer

  • handleSearch groups returned topic_keys by (project, scope) so the lookup respects the same partition the search ran in. After listing the primary results, it appends:
    ---
    Related Context (from graph):
    Related (topic "auth/jwt"): ID 4 (Title), ID 12 (Title)
    
  • The existing direct topic_key route (Rank=-1000) is untouched.

Tests

  • Store (5): siblings returned, deleted excluded, project+scope respected, 5-per-key limit, empty-keys short-circuit.
  • MCP (2): Related Context block formatted correctly when results carry topic_key; absent when no result has one.

Migration impact

  • No DB schema change. No new columns.
  • Reuses existing topic_key, project, scope columns and idx_obs_topic index.
  • Backwards compatible — only enriches the mem_search response text.

Closes #243

Inject related observations sharing topic_key into mem_search responses
so agents see the lexical neighbourhood without an embedding model.

- store: RelatedByTopicKeys looks up up to 5 sibling rows per topic_key
  filtered by project + scope, excluding the just-returned IDs.
- store: InsertObservationForTest helper bypasses AddObservation's
  topic_key UPSERT path so tests can seed multiple sibling rows.
- mcp: handleSearch groups topic_keys by (project, scope) and appends
  a 'Related Context (from graph)' block to the response.

Closes Gentleman-Programming#243
Copilot AI review requested due to automatic review settings April 27, 2026 20:50
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enriches mem_search results with an implicit “graph neighborhood” by appending up to 5 sibling observations that share a topic_key with any primary search hit, scoped to the same (project, scope) partition.

Changes:

  • Add Store.RelatedByTopicKeys(...) to fetch sibling observations (excluding soft-deleted rows and already-returned IDs).
  • Extend mcp.handleSearch to append a Related Context (from graph) block to mem_search responses.
  • Add store- and MCP-level tests covering sibling lookup behavior and output formatting.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
internal/store/store.go Adds RelatedByTopicKeys query + supporting types/helpers for sibling lookup.
internal/store/store_test.go Adds unit tests covering sibling retrieval rules (deleted exclusion, project/scope filter, per-key limit, empty keys).
internal/mcp/mcp.go Appends related-context section to mem_search output, grouping topic keys by (project, scope).
internal/mcp/mcp_test.go Adds tests ensuring related-context block appears (or not) in mem_search responses.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/store/store.go
Comment on lines +5749 to +5758
query := fmt.Sprintf(`
SELECT id, title, topic_key
FROM observations
WHERE topic_key IN (%s)
AND deleted_at IS NULL
AND project = ?
AND scope = ?
%s
ORDER BY updated_at DESC
`, strings.Join(keysPlaceholders, ","), excludeClause)
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RelatedByTopicKeys currently selects all rows matching the topic_key set and only enforces the “max 5 siblings per topic_key” limit in Go. If a topic_key has many observations, this can scan and allocate far more rows than needed and slow down mem_search. Consider enforcing the per-topic limit in SQL (e.g., via a window function ROW_NUMBER() OVER (PARTITION BY topic_key ORDER BY updated_at DESC) and filtering rn<=5) so SQLite can stop earlier and do less work.

Copilot uses AI. Check for mistakes.
Comment on lines +7029 to +7031
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 3'").Scan(&id3)
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These QueryRow().Scan calls ignore the returned error. If an insert failed or the row isn’t found, the test will proceed with id=0 and produce misleading failures. Please check Scan errors (and fail fast) when loading id1/id2/id3.

Suggested change
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 3'").Scan(&id3)
if err := s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1); err != nil {
t.Fatalf("load id1: %v", err)
}
if err := s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2); err != nil {
t.Fatalf("load id2: %v", err)
}
if err := s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 3'").Scan(&id3); err != nil {
t.Fatalf("load id3: %v", err)
}

Copilot uses AI. Check for mistakes.
Comment on lines +7087 to +7090
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2)

s.DeleteObservation(id2, false)
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test ignores errors from QueryRow().Scan and from DeleteObservation. If DeleteObservation fails (or IDs are zero due to a failed scan), the assertion about deleted rows can become unreliable. Please assert errors for the Scan calls and for DeleteObservation.

Suggested change
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1)
s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2)
s.DeleteObservation(id2, false)
err = s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 1'").Scan(&id1)
if err != nil {
t.Fatalf("select id1: %v", err)
}
err = s.db.QueryRow("SELECT id FROM observations WHERE title = 'Obs 2'").Scan(&id2)
if err != nil {
t.Fatalf("select id2: %v", err)
}
if err := s.DeleteObservation(id2, false); err != nil {
t.Fatalf("delete observation: %v", err)
}

Copilot uses AI. Check for mistakes.
Comment on lines +7110 to +7121
s.CreateSession("sess-1", "proj", "/tmp/proj")
s.CreateSession("sess-2", "other", "/tmp/other")

s.db.Exec(`
INSERT INTO observations (sync_id, session_id, type, title, content, project, scope, topic_key, normalized_hash, revision_count, duplicate_count, created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, 1, 1, datetime('now'), datetime('now'))
`, "sync-1", "sess-1", "bugfix", "Proj 1 Obs", "content", "proj", "project", "auth/jwt", "hash1")

s.db.Exec(`
INSERT INTO observations (sync_id, session_id, type, title, content, project, scope, topic_key, normalized_hash, revision_count, duplicate_count, created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, 1, 1, datetime('now'), datetime('now'))
`, "sync-2", "sess-2", "bugfix", "Other Proj Obs", "content", "other", "project", "auth/jwt", "hash2")
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test drops errors from CreateSession and both INSERT Exec calls. If session creation fails or FK constraints reject the inserts, the test can fail later with confusing symptoms. Please handle and assert the returned errors for these calls.

Copilot uses AI. Check for mistakes.
Comment on lines +7140 to +7147
s.CreateSession("sess-1", "proj", "/tmp/proj")

for i := 0; i < 10; i++ {
s.db.Exec(`
INSERT INTO observations (sync_id, session_id, type, title, content, project, scope, topic_key, normalized_hash, revision_count, duplicate_count, created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, 1, 1, datetime('now'), datetime('now'))
`, fmt.Sprintf("sync-%d", i), "sess-1", "bugfix", fmt.Sprintf("Obs %d", i), "content", "proj", "project", "auth/jwt", fmt.Sprintf("hash%d", i))
}
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errors returned by CreateSession and the INSERT Exec calls are ignored here. If any of these fail, the test may pass/fail nondeterministically (e.g., zero rows inserted) without a clear cause. Please check and fail on errors for these DB operations.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(store): inject implicit graph-linking metadata in search results via topic_key siblings

2 participants