feat: /understand-crossrepo — multi-repo interlinked knowledge graph by Hafiz408 · Pull Request #522 · Egonex-AI/Understand-Anything

Hafiz408 · 2026-06-26T17:23:28Z

`/understand-crossrepo` — multi-repo interlinked knowledge graph

Understand Anything analyzes one repo at a time. A microservice platform is many repos that communicate at runtime (HTTP calls, an SSO/auth provider, iframe embedding, a shared data hub) — links invisible to static import analysis. This adds a new command that combines several interlinked repos into one namespaced, per-repo-layered, cross-linked knowledge graph, explorable in the existing dashboard with no dashboard changes.

Pipeline (new `skills/understand-crossrepo/SKILL.md`, Phases 0–7)

- Select repos (args or interactive) and an output dir.
- Reuse-or-fill: reuse each repo's fresh `/understand` graph (git commit unchanged) or build it.
- Extract signals: `extract-crossrepo-signals.mjs`, a deterministic, stdlib-only scanner for outbound API hosts, Keycloak clients, iframe/`ct_token` embeds, Pub/Sub topics, and GCS buckets, plus each repo's served identity.
- Combine: `combine-graphs.py` namespaces every node id (`<type>:<repo>/<path>[:member]`), unions and dedups, and builds one `layer:<repo>` plus a `module:<repo>` anchor per repo.
- Link: `agents/crossrepo-linker.md` (LLM) matches one repo's outbound signals to another's identity and emits typed cross-repo edges (`calls`/`authenticates_via`/`embeds`/`publishes`/…) with confidence and evidence; self-grounds (no fabrication).
- Apply: `apply-interlinks.py` synthesizes shared-infra nodes (`service:external/<svc>` + `layer:external-shared-infra`), applies edges (dedup, drop-dangling, low-confidence tag), assembles, and validates.
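
The Combine step's id scheme can be illustrated with a minimal sketch. This is not the code in `combine-graphs.py`. It assumes per-repo ids look like `<type>:<path>[:member]` and that graphs are dicts with `nodes` and `edges` lists whose edges carry `source`, `target`, and `type`; those field names and the helper names `namespace_id` and `combine` are hypothetical. Layer membership is left out here.

```python
def namespace_id(node_id: str, repo: str) -> str:
    """Rewrite '<type>:<path>[:member]' as '<type>:<repo>/<path>[:member]'."""
    node_type, sep, rest = node_id.partition(":")
    if not sep:
        # Assumed fallback: an id without a type prefix is treated as a bare path.
        return f"{repo}/{node_id}"
    return f"{node_type}:{repo}/{rest}"


def combine(graphs: dict) -> dict:
    """Union per-repo graphs under namespaced ids, deduplicating nodes and edges."""
    nodes = {}
    edges = {}
    layers = []
    for repo, graph in graphs.items():
        for node in graph["nodes"]:
            new_id = namespace_id(node["id"], repo)
            nodes.setdefault(new_id, {**node, "id": new_id})
        # One anchor node and one layer per repo (membership omitted in this sketch).
        anchor_id = f"module:{repo}"
        nodes.setdefault(anchor_id, {"id": anchor_id, "type": "module", "name": repo})
        layers.append({"id": f"layer:{repo}", "name": repo})
        for edge in graph["edges"]:
            src = namespace_id(edge["source"], repo)
            tgt = namespace_id(edge["target"], repo)
            # Dedup key: same endpoints and same edge type count as one edge.
            edges.setdefault((src, tgt, edge["type"]), {**edge, "source": src, "target": tgt})
    return {"nodes": list(nodes.values()), "edges": list(edges.values()), "layers": layers}
```

Because the repo name is part of every id, two repos that both contain `src/app.ts` produce distinct nodes rather than colliding.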

Verified

31 unit tests (extractor 16 node:test, combine 7 + apply 8 pytest).
Live e2e on savo_pricing_ui + savo_pricing_service + savo_bridge_service: 508-node / 761-edge / 4-layer combined graph, 0 id collisions, 10 cross-repo edges (incl. pricing_ui → bridge calls and all repos → Keycloak), rendered in the dashboard with 0 graph/validation errors.

The graph is validated field-by-field against packages/core/src/schema.ts; the dashboard is untouched.

Lock files (package-lock.json, pnpm-lock.yaml, …) are huge and full of registry/VCS URLs (github.com) that are never real integration signals. Found during e2e: pricing_ui emitted 260 outbound signals, nearly all of them github.com URLs from package-lock.json.

Fix: skip lock files by exact basename (their .json/.yaml extensions slip past the existing .lock BINARY_EXTS guard). pricing_ui drops from 260 to 18 signals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
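
A rough illustration of why an exact-basename check is needed on top of an extension guard. The real extractor is `extract-crossrepo-signals.mjs`; this is a Python rendition for illustration only. The contents of `BINARY_EXTS` and `LOCK_FILE_BASENAMES` below are assumptions beyond the two lock files named in the commit message, and `should_scan` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Assumed subset of the extension guard; only ".lock" is confirmed by the message.
BINARY_EXTS = {".lock", ".png", ".jpg"}

# The two basenames named in the commit message; the full list is elided there.
LOCK_FILE_BASENAMES = {"package-lock.json", "pnpm-lock.yaml"}


def should_scan(path: str) -> bool:
    """Return False for files that should never contribute outbound signals."""
    p = PurePosixPath(path)
    # Exact basename match: catches lock files whose extension is .json or .yaml.
    if p.name in LOCK_FILE_BASENAMES:
        return False
    # Extension guard: catches *.lock and binary assets.
    return p.suffix.lower() not in BINARY_EXTS
```

Matching on the exact basename keeps ordinary `.json` and `.yaml` config files scannable, which an extension-based rule could not do.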

…yers

combine-graphs.py picked layer members by id-prefix (file/endpoint/service), but the canonical graph validator treats config/document/pipeline/table/schema/resource as file-level too, and every such node must be in exactly one layer. Found during e2e: 59 'File node not in any layer' validation issues (config, document, pipeline, resource).

Fix: select layer members by node.type against the validator's full FILE_LEVEL_TYPES set (a pipeline node can carry a step: id, so type, not id-prefix, is authoritative). Regression test covers a config node and a pipeline node with a step: id. The final graph now validates with 0 issues.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
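
The difference between the two selection rules can be sketched as follows. The `FILE_LEVEL_TYPES` contents are assembled from the types named in this message and may not match the validator's actual set; the node shape and both function names are hypothetical.

```python
# Assumed from the commit message: the original three types plus the six
# the validator also treats as file-level. The validator's real set may differ.
FILE_LEVEL_TYPES = {
    "file", "endpoint", "service",
    "config", "document", "pipeline", "table", "schema", "resource",
}


def layer_members_by_prefix(nodes: list) -> list:
    """Old behaviour: decide membership from the id prefix."""
    prefixes = ("file", "endpoint", "service")
    return [n["id"] for n in nodes if n["id"].split(":", 1)[0] in prefixes]


def layer_members_by_type(nodes: list) -> list:
    """Fixed behaviour: decide membership from node['type']."""
    return [n["id"] for n in nodes if n.get("type") in FILE_LEVEL_TYPES]


nodes = [
    {"id": "file:svc/main.py", "type": "file"},
    {"id": "config:svc/app.yaml", "type": "config"},
    # A pipeline node whose id starts with 'step:', as described in the message.
    {"id": "step:svc/ci.yml:build", "type": "pipeline"},
    {"id": "function:svc/main.py:run", "type": "function"},
]
```

With these sample nodes the prefix rule selects only the file node, while the type rule also picks up the config node and the `step:`-prefixed pipeline node.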

The assembler validated only against the inline .cjs, which doesn't check the dashboard's stricter core/schema.ts. The e2e dashboard load dropped 270 edges/nodes. Three real gaps, all fixed in apply-interlinks.py:

- project metadata: add analyzedAt + gitCommitHash (ProjectMetaSchema requires them; without them the graph fails fatally to load).
- node normalization: ensure every node has a valid complexity, and drop a string lineRange (some per-repo graphs store lineRange as a string, which fails node validation → the node is dropped → its edges cascade-drop).
- edge types: map linker types not in EdgeTypeSchema (authenticates_via, embeds) to depends_on, preserving the original semantic in description (which survives schema stripping; label does not). All other linker types already pass.

Dashboard now loads with 0 validation issues; all 4 layers + 10 cross-repo edges render. Regression test test_dashboard_schema_requirements locks all three.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
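
The node and edge-type fixes described above might look roughly like this. It is a sketch, not the code in apply-interlinks.py: the allowed complexity values, the default complexity, the description format, and the function names are all assumptions. Only the two remapped edge types and the `depends_on` target come from the message.

```python
# Assumed set of valid complexity values; the real schema may differ.
VALID_COMPLEXITY = {"simple", "moderate", "complex"}

# From the commit message: linker types missing from EdgeTypeSchema.
UNSUPPORTED_EDGE_TYPES = {"authenticates_via": "depends_on", "embeds": "depends_on"}


def normalize_node(node: dict) -> dict:
    """Return a copy of the node that should survive strict node validation."""
    out = dict(node)
    if out.get("complexity") not in VALID_COMPLEXITY:
        out["complexity"] = "moderate"  # assumed default
    # A string lineRange fails validation, so it is dropped rather than parsed.
    if isinstance(out.get("lineRange"), str):
        del out["lineRange"]
    return out


def remap_edge_type(edge: dict) -> dict:
    """Map unsupported linker types to depends_on, keeping the original in description."""
    out = dict(edge)
    original = out["type"]
    if original in UNSUPPORTED_EDGE_TYPES:
        out["type"] = UNSUPPORTED_EDGE_TYPES[original]
        desc = out.get("description", "")
        # Hypothetical format: prefix the original type so the semantic is not lost.
        out["description"] = f"{original}: {desc}" if desc else original
    return out
```

Writing the original type into `description` follows the message's observation that `description` survives schema stripping while `label` does not.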

… label

Ponytail trims (no behavior change, both verified by tests + e2e):

- tour steps no longer compute inline `order` (the sequential renumber after assembly is the single source of truth); removes misleading len()+2 arithmetic.
- cross-repo edges set only `description` (kept by the dashboard schema), not a duplicate `label` (stripped on load, read by nothing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rection

QA across gemba/market_pulse/product_lens repo sets surfaced that some base /understand graphs emit edges with weight>1 (e.g. 2) or invalid direction. The dashboard's core/schema clamps them on load and shows an 'N auto-corrections' banner (240/123 on gemba_ui/product_lens).

The assembler already normalizes nodes (lineRange/complexity); do the same for edges so the combined graph renders clean on ANY source repos. Sets now validate with 0 issues (was 240/10/123).
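
An edge normalizer in the spirit of this change could be sketched as below. The valid direction values, the lower weight bound of 0, the defaults for missing or invalid fields, and the name `normalize_edge` are assumptions. The message only confirms that weights above 1 and invalid directions get corrected.

```python
# Assumed direction vocabulary; the dashboard schema may define a different one.
VALID_DIRECTIONS = {"forward", "backward", "bidirectional"}


def normalize_edge(edge: dict) -> dict:
    """Return a copy of the edge with weight clamped and direction made valid."""
    out = dict(edge)
    weight = out.get("weight")
    if isinstance(weight, (int, float)) and not isinstance(weight, bool):
        # Clamp into [0, 1]; the upper bound is implied by the message, the lower is assumed.
        out["weight"] = min(1.0, max(0.0, float(weight)))
    else:
        out["weight"] = 0.5  # assumed default for a missing or non-numeric weight
    if out.get("direction") not in VALID_DIRECTIONS:
        out["direction"] = "forward"  # assumed default
    return out
```

Normalizing in the assembler means the written graph already satisfies the schema, so the dashboard has nothing to auto-correct on load.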

Hafiz408 and others added 16 commits June 26, 2026 19:08

feat(crossrepo): skill scaffold + repo selection + output dir

ee6dc0c

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(crossrepo): deterministic cross-repo signal extractor + tests

c6e348c

fix(crossrepo): dedup api signals, catch yaml bucket/topic signals, m…

7a9e318

…ark router ceiling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(crossrepo): cover k8s two-line yaml signals + mark pendingEnvKey…

7c08df3

… ceiling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(crossrepo): combine graphs with per-repo namespacing + layering

be5ab70

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(crossrepo): tag module anchor, tighten trash-path skip, cover end…

d0a24cf

…point/service layer membership Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(crossrepo): LLM cross-repo linker agent

604cba6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(crossrepo): apply interlinks, synthesize external infra, assembl…

feb16d0

…e + validate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(crossrepo): tag lowConfidence for zero-confidence cross-repo edges

6e757b4

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(crossrepo): wire end-to-end orchestration in SKILL.md

da4585b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(crossrepo): expand OUT_DIR in phase6, count cross-repo edges from…

cf6ef57

… source, bind phase2 vars Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(crossrepo): correct phase6 low-confidence/edge counts, unique tou…

1edc1ad

…r order, drop dead regexes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Hafiz408 added a commit to Hafiz408/Understand-Anything that referenced this pull request Jun 26, 2026

docs(readme): point cross-repo PR link at Egonex-AI#522

ea11c0b

Hafiz408 wants to merge 17 commits into Egonex-AI:main from Hafiz408:feat/understand-crossrepo
