feat: /understand-crossrepo — multi-repo interlinked knowledge graph #522
Open
Hafiz408 wants to merge 17 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ark router ceiling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… ceiling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…point/service layer membership Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e + validate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… source, bind phase2 vars Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lock files (package-lock.json, pnpm-lock.yaml, …) are huge and full of registry/VCS URLs (github.com) that are never real integration signals.

Found during e2e: pricing_ui emitted 260 outbound signals, nearly all of them github.com URLs from package-lock.json. Skip lock files by exact basename (their .json/.yaml extensions slip past the existing .lock BINARY_EXTS guard). pricing_ui 260→18 signals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
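The basename skip described in this commit can be sketched as follows. This is a minimal illustration in Python, not the scanner's actual code (the scanner is `extract-crossrepo-signals.mjs`); `LOCKFILE_BASENAMES`, `should_scan`, and the contents of `BINARY_EXTS` are assumptions made for the example.

```python
from pathlib import PurePosixPath

# Assumed contents: only the name BINARY_EXTS and its ".lock" entry come from the commit.
BINARY_EXTS = {".lock", ".png", ".jpg"}
# Hypothetical constant: lock files whose extension is an ordinary .json/.yaml.
LOCKFILE_BASENAMES = {"package-lock.json", "pnpm-lock.yaml"}


def should_scan(path: str) -> bool:
    """Return False for files that should never contribute outbound signals."""
    p = PurePosixPath(path)
    # Exact basename match: a suffix check alone would let these files through.
    if p.name in LOCKFILE_BASENAMES:
        return False
    return p.suffix.lower() not in BINARY_EXTS
```

An extension check alone cannot catch `package-lock.json` without also excluding every other `.json` file, which is why the match is on the full basename.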
…yers

combine-graphs.py picked layer members by id-prefix (file/endpoint/service), but the canonical graph validator treats config/document/pipeline/table/schema/resource as file-level too: every such node must be in exactly one layer.

Found during e2e: 59 'File node not in any layer' validation issues (config, document, pipeline, resource).

Fix: select layer members by node.type against the validator's full FILE_LEVEL_TYPES set (a pipeline node can carry a step: id, so type, not id-prefix, is authoritative). Regression test covers a config node and a pipeline node with a step: id. Final graph now validates 0 issues.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
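To make the difference between the two selection rules concrete, here is a small sketch. The set contents are assumed from the types this commit names (the validator's real `FILE_LEVEL_TYPES` may differ), and both function names are hypothetical.

```python
# Assumed contents, taken from the types listed in the commit message.
FILE_LEVEL_TYPES = {
    "file", "endpoint", "service",
    "config", "document", "pipeline", "table", "schema", "resource",
}


def layer_members_by_prefix(nodes: list[dict]) -> list[str]:
    """Earlier approach: select by id prefix, which misses other file-level types."""
    prefixes = ("file:", "endpoint:", "service:")
    return [n["id"] for n in nodes if n["id"].startswith(prefixes)]


def layer_members_by_type(nodes: list[dict]) -> list[str]:
    """Fixed approach: select by node type, which is authoritative."""
    return [n["id"] for n in nodes if n.get("type") in FILE_LEVEL_TYPES]


# Illustrative nodes only; the ids and paths are made up.
nodes = [
    {"id": "file:repo_a/src/main.py", "type": "file"},
    {"id": "config:repo_a/app.yaml", "type": "config"},
    {"id": "step:repo_a/ci.yml:build", "type": "pipeline"},  # pipeline node with a step: id
    {"id": "function:repo_a/src/main.py:run", "type": "function"},
]
```

With these nodes, the prefix rule selects only the `file:` node, while the type rule also picks up the config node and the pipeline node whose id starts with `step:`.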
The assembler validated only against the inline .cjs, which doesn't check the dashboard's stricter core/schema.ts. The e2e dashboard load dropped 270 edges/nodes.

Three real gaps, all fixed in apply-interlinks.py:
- project metadata: add analyzedAt + gitCommitHash (ProjectMetaSchema requires them; without them the graph fails fatally to load).
- node normalization: ensure every node has a valid complexity, and drop a string lineRange (some per-repo graphs store lineRange as a string, which fails node validation → the node is dropped → its edges cascade-drop).
- edge types: map linker types not in EdgeTypeSchema (authenticates_via, embeds) to depends_on, preserving the original semantic in description (which survives schema stripping; label does not). All other linker types already pass.

Dashboard now loads with 0 validation issues; all 4 layers + 10 cross-repo edges render. Regression test test_dashboard_schema_requirements locks all three.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
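The three fixes above could look roughly like this. It is a sketch under stated assumptions, not the code in apply-interlinks.py: the allowed complexity values, the default complexity, the description format, and all function names are hypothetical, since core/schema.ts is not shown here.

```python
from datetime import datetime, timezone

# Assumed enum values and default; the real ones are defined in core/schema.ts.
VALID_COMPLEXITY = {"simple", "moderate", "complex"}
DEFAULT_COMPLEXITY = "moderate"
# Linker edge types the commit says are missing from EdgeTypeSchema.
EDGE_TYPE_FALLBACK = {"authenticates_via": "depends_on", "embeds": "depends_on"}


def normalize_project(project: dict, commit_hash: str) -> dict:
    """Fill in the metadata fields that ProjectMetaSchema requires."""
    out = dict(project)
    out.setdefault("analyzedAt", datetime.now(timezone.utc).isoformat())
    out.setdefault("gitCommitHash", commit_hash)
    return out


def normalize_node(node: dict) -> dict:
    """Guarantee a valid complexity and drop a lineRange stored as a string."""
    out = dict(node)
    if out.get("complexity") not in VALID_COMPLEXITY:
        out["complexity"] = DEFAULT_COMPLEXITY
    if isinstance(out.get("lineRange"), str):
        del out["lineRange"]
    return out


def normalize_edge(edge: dict) -> dict:
    """Map unsupported edge types to depends_on, keeping the original in description."""
    out = dict(edge)
    original = out["type"]
    if original in EDGE_TYPE_FALLBACK:
        note = out.get("description", "")
        out["description"] = f"{original}: {note}" if note else original
        out["type"] = EDGE_TYPE_FALLBACK[original]
    return out
```

Dropping the malformed `lineRange` rather than the whole node is the point of the second fix: a node that fails validation takes all of its edges with it.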
…r order, drop dead regexes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… label

Ponytail trims (no behavior change, both verified by tests + e2e):
- tour steps no longer compute inline `order` (the sequential renumber after assembly is the single source of truth); removes misleading len()+2 arithmetic.
- cross-repo edges set only `description` (kept by the dashboard schema), not a duplicate `label` (stripped on load, read by nothing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
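A sequential renumber of the kind the first trim relies on might look like this. The function name and the 1-based numbering are assumptions; the commit only says that the renumber after assembly is the single source of truth for `order`.

```python
def renumber_tour(steps: list[dict]) -> list[dict]:
    """Assign `order` purely by position, overriding any inline value."""
    # Assumption: orders start at 1.
    return [{**step, "order": i} for i, step in enumerate(steps, start=1)]
```

Because every step's `order` is overwritten here, any value computed earlier while the steps were being built has no effect on the result.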
Hafiz408 added a commit to Hafiz408/Understand-Anything that referenced this pull request on Jun 26, 2026
…rection

QA across gemba/market_pulse/product_lens repo sets surfaced that some base /understand graphs emit edges with weight>1 (e.g. 2) or an invalid direction. The dashboard's core/schema clamps them on load and shows an 'N auto-corrections' banner (240/123 on gemba_ui/product_lens). The assembler already normalizes nodes (lineRange/complexity); do the same for edges so the combined graph renders clean on ANY source repos. Sets now validate with 0 issues (was 240/10/123).
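Edge normalization along these lines could be sketched as below. The upper weight bound of 1 follows from the commit text; the lower bound of 0, the default weight, the set of valid directions, and the fallback direction are all assumptions, since the dashboard schema is not reproduced here.

```python
# Assumed values; the real constraints are defined by the dashboard's core/schema.
VALID_DIRECTIONS = {"forward", "backward", "bidirectional"}
DEFAULT_DIRECTION = "forward"
DEFAULT_WEIGHT = 0.5


def normalize_edge_fields(edge: dict) -> dict:
    """Clamp weight into [0, 1] and replace an invalid direction."""
    out = dict(edge)
    weight = out.get("weight")
    if isinstance(weight, (int, float)) and not isinstance(weight, bool):
        out["weight"] = min(1.0, max(0.0, float(weight)))
    else:
        out["weight"] = DEFAULT_WEIGHT  # missing or non-numeric weight
    if out.get("direction") not in VALID_DIRECTIONS:
        out["direction"] = DEFAULT_DIRECTION
    return out
```

Doing this in the assembler means the dashboard has nothing left to auto-correct on load, which is what removes the banner.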
/understand-crossrepo — multi-repo interlinked knowledge graph

Understand Anything analyzes one repo at a time. A microservice platform is many repos that communicate at runtime (HTTP calls, an SSO/auth provider, iframe embedding, a shared data hub), and those links are invisible to static import analysis. This adds a new command that combines several interlinked repos into one namespaced, per-repo-layered, cross-linked knowledge graph, explorable in the existing dashboard with no dashboard changes.

Pipeline (new `skills/understand-crossrepo/SKILL.md`, Phases 0–7)

- … `/understand` graph (git-commit unchanged) or build it.
- `extract-crossrepo-signals.mjs`, a deterministic, stdlib-only scanner for outbound API hosts, Keycloak clients, iframe/ct_token embeds, Pub/Sub topics, GCS buckets, plus each repo's served identity.
- `combine-graphs.py` namespaces every node id (`<type>:<repo>/<path>[:member]`), unions + dedups, and builds one `layer:<repo>` + a `module:<repo>` anchor per repo.
- `agents/crossrepo-linker.md` (LLM) matches one repo's outbound signals to another's identity and emits typed cross-repo edges (`calls`/`authenticates_via`/`embeds`/`publishes`/…) with confidence + evidence; self-grounds (no fabrication).
- `apply-interlinks.py` synthesizes shared-infra nodes (`service:external/<svc>` + `layer:external-shared-infra`), applies edges (dedup, drop-dangling, low-confidence tag), assembles, and validates.

Verified

- … `node:test`, combine 7 + apply 8 `pytest`).
- `savo_pricing_ui` + `savo_pricing_service` + `savo_bridge_service`: 508-node / 761-edge / 4-layer combined graph, 0 id collisions, 10 cross-repo edges (incl. `pricing_ui → bridge` calls and all repos → Keycloak), rendered in the dashboard with 0 graph/validation errors.

The graph is validated field-by-field against `packages/core/src/schema.ts`; the dashboard is untouched.
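As an illustration of the combine and apply steps in the pipeline above, the following sketch namespaces node ids into the `<type>:<repo>/<path>[:member]` form and then applies cross-repo edges with dedup, dangling-edge removal, and a low-confidence tag. It is not the code from `combine-graphs.py` or `apply-interlinks.py`: the per-repo id shape `<type>:<path>[:member]`, the `repo` and `tags` fields, the 0.5 threshold, and every function name are assumptions.

```python
def namespace_id(node_id: str, repo: str) -> str:
    """Rewrite '<type>:<path>[:member]' as '<type>:<repo>/<path>[:member]'."""
    node_type, sep, rest = node_id.partition(":")
    if not sep:
        raise ValueError(f"node id has no type prefix: {node_id!r}")
    return f"{node_type}:{repo}/{rest}"


def combine(graphs: dict[str, list[dict]]) -> dict[str, dict]:
    """Union namespaced nodes from every repo, keeping the first of any duplicate id."""
    combined: dict[str, dict] = {}
    for repo, nodes in graphs.items():
        for node in nodes:
            new_id = namespace_id(node["id"], repo)
            combined.setdefault(new_id, {**node, "id": new_id, "repo": repo})
    return combined


def apply_edges(node_ids: set[str], edges: list[dict], min_confidence: float = 0.5) -> list[dict]:
    """Dedup edges, drop those with a missing endpoint, and tag low-confidence ones."""
    seen: set[tuple] = set()
    kept: list[dict] = []
    for edge in edges:
        key = (edge["source"], edge["target"], edge["type"])
        if key in seen:
            continue  # duplicate edge
        if edge["source"] not in node_ids or edge["target"] not in node_ids:
            continue  # dangling edge
        seen.add(key)
        tagged = dict(edge)
        if edge.get("confidence", 1.0) < min_confidence:
            tagged["tags"] = [*edge.get("tags", []), "low-confidence"]
        kept.append(tagged)
    return kept
```

Prefixing every id with its repo is what lets two repos that both contain, say, `src/app.py` coexist in one graph without id collisions.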