fix: recognize batch-existing.json in incremental merge (graph no longer collapses to changed files)#527
Open
rodcon wants to merge 1 commit into
Open
Conversation
The incremental /understand path (SKILL.md Phase 2) writes the pruned prior graph to batch-existing.json and states the merge will combine it with the fresh batch-*.json files. But merge-batch-graphs.py filters batch files with the regex batch-(\d+)(?:-part-(\d+))?\.json, which batch-existing.json does not match - so it is classed 'unrecognized' and skipped at load. Every retained node/edge is dropped and the incremental update collapses the graph down to only the changed files. Fix: recognize batch-existing.json explicitly and sort it first (key -1) so the fresh per-file batches still override it during dedup (keep-last-occurrence). Verified with a standalone repro (batch-existing.json with files a,b,c + an edge, plus batch-0.json with changed file d): before, the assembled graph has only file:d.py; after, all of a,b,c,d and the edge are preserved.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The incremental
/understandpath collapses the knowledge graph down to only the changed files.skills/understand/SKILL.md(Phase 2, "Incremental update path") writes the pruned prior graph tobatch-existing.jsonand states:But
merge-batch-graphs.pydiscovers batch files and filters them with:batch-existing.jsonhas no numeric index, so it does not match, is appended tounrecognized_batch_files, and is skipped at load:Result: on every incremental update, all retained nodes/edges from the prior graph are dropped and the assembled graph shrinks to just the files that changed in that commit. The script prints a generic "unrecognized filenames will be DROPPED" warning, but the file it is dropping is the one the skill's own incremental path is documented to produce.
Fix
Recognize
batch-existing.jsonexplicitly and sort it first (key = -1) so the fresh per-file batches still override it during dedup (the merge keeps the last occurrence per node id):_batch_sort_keyreturns-1forbatch-existing.jsonso it loads before the numbered batches.batch-existing.jsonas a valid batch instead of marking it unrecognized.Minimal change, one file, no behavior change for full analysis or for the existing
batch-<N>.json/batch-<N>-part-<K>.jsonhandling.Verification
Standalone repro against the patched script:
batch-existing.json= filesa,b,c+ animportsedge (the retained prior graph)batch-0.json= changed filedfile:d.pyonly (1)file:a.py, file:b.py, file:c.py, file:d.py(4)Dedup still works (fresh batch overrides existing for a changed file, because existing loads first).
Notes
Happy to add a test in whatever harness you prefer for the Python scripts (I didn't see an existing pytest setup alongside the vitest suites - let me know and I'll wire one up).