fix: auto-correct dash-to-underscore filename mismatches in path validation (#48) #160
williamp44 wants to merge 2 commits into
📝 Walkthrough
Attempts a filename-only dash→underscore correction for missing Clojure source files: validate-path-with-client checks existence and, if the file is absent and its extension is a Clojure source extension, replaces dashes with underscores in the filename (not in directories), re-validates, and returns the corrected path when found. Unit tests and integration coverage are added.
Sequence Diagram
sequenceDiagram
participant Client
participant Validator as validate-path-with-client
participant FS as FileSystem
participant Helper as CorrectionHelper
Client->>Validator: validate-path-with-client(path, nrepl-client)
Validator->>FS: stat(path)
FS-->>Validator: not found
Validator->>Helper: try-dash-to-underscore-correction(path)
Helper->>Helper: check extension ∈ clojure-source-extensions
alt Clojure source
Helper->>Helper: replace '-' → '_' in filename only
Helper->>FS: stat(corrected-path)
FS-->>Helper: found / not found
Helper-->>Validator: corrected-path or nil
else Non-Clojure or no correction
Helper-->>Validator: nil
end
alt corrected-path
Validator->>FS: stat(corrected-path)
FS-->>Validator: exists
Validator-->>Client: return corrected-path
else fallback
Validator-->>Client: original validation result (error)
end
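The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name try-dash-to-underscore-correction comes from the walkthrough, but its body here is assumed, and `resolve-existing-path` is a hypothetical stand-in for the existence check and fallback (the real validate-path-with-client also performs path validation that is not shown).

```clojure
(ns example.dash-correction
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

;; Assumption: only these extensions are eligible for correction.
(def clojure-source-extensions #{".clj" ".cljs" ".cljc"})

(defn try-dash-to-underscore-correction
  "Returns a corrected path when replacing dashes with underscores in the
  filename (directories are left untouched) yields an existing file;
  otherwise returns nil."
  [path]
  (let [lower (str/lower-case path)]
    (when (some #(str/ends-with? lower %) clojure-source-extensions)
      (let [file      (io/file path)
            filename  (.getName file)
            corrected (str/replace filename "-" "_")]
        ;; Skip the filesystem check when there was nothing to replace.
        (when (not= filename corrected)
          (let [candidate (io/file (.getParentFile file) corrected)]
            (when (.exists candidate)
              (.getPath candidate))))))))

(defn resolve-existing-path
  "Hypothetical wrapper: returns path if it exists, a corrected path if the
  underscore variant exists, and nil otherwise."
  [path]
  (if (.exists (io/file path))
    path
    (try-dash-to-underscore-correction path)))
```

Because only the last path segment is rewritten, a request for src/my-app/core-stuff.clj would look for src/my-app/core_stuff.clj and leave the my-app directory name alone.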
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
|
Thanks for this! The dash-to-underscore correction is a real pain point with LLMs. I think we can simplify this significantly by moving the correction into That way every tool that validates paths (read_file, the edit tools, etc.) gets the correction for free, and we avoid the re-indentation of Would you be open to reworking it with that approach? |
|
yes, of course. will review and work through it. |
fef9282 to a69383b
🧹 Nitpick comments (1)
src/clojure_mcp/utils/valid_paths.clj (1)
61-69: Deduplicate source-extension matching to prevent drift.
clojure-source-ext? introduces a second extension list that overlaps with clojure-file?. Consider centralizing the .clj/.cljs/.cljc list in one private constant/predicate and reusing it in both places.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/clojure_mcp/utils/valid_paths.clj` around lines 61 - 69, Introduce a single private constant or predicate (e.g., def ^:private clojure-extensions or a private function clojure-ext?) that lists/checks the Clojure source extensions (.clj, .cljs, .cljc) and replace the duplicate logic in clojure-source-ext? and clojure-file? to call that shared symbol; update clojure-source-ext? to delegate to the new predicate/constant (preserving the nil check and lower-case handling) so extension matching is centralized and cannot drift.
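As a rough illustration of the suggested deduplication, the sketch below keeps one private extension list and has both predicates read from it. It is an assumption-laden outline rather than the repository's code: `ends-with-any?` is a hypothetical helper, the extra extensions in clojure-file? follow the predicate quoted elsewhere in this thread, and the babashka shebang check is left out to keep the example self-contained.

```clojure
(ns example.shared-extensions
  (:require [clojure.string :as str]))

;; Single source of truth for the extensions eligible for correction.
(def ^:private clojure-source-extensions [".clj" ".cljs" ".cljc"])

(defn- ends-with-any?
  "Hypothetical helper: true when lower-path ends with any of exts."
  [lower-path exts]
  (boolean (some #(str/ends-with? lower-path %) exts)))

(defn clojure-source-ext?
  "True for .clj/.cljs/.cljc paths; nil when file-path is nil."
  [file-path]
  (when file-path
    (ends-with-any? (str/lower-case file-path) clojure-source-extensions)))

(defn clojure-file?
  "Broader classification that reuses clojure-source-ext? instead of
  repeating its extension list. The shebang check is omitted here."
  [file-path]
  (when file-path
    (or (clojure-source-ext? file-path)
        (ends-with-any? (str/lower-case file-path)
                        [".cljd" ".bb" ".lpy" ".edn"]))))
```

With this shape, adding or removing a source extension touches one definition, and the broader predicate picks up the change automatically.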
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: df08ff63-bedd-4f14-8289-4818f791e369
📒 Files selected for processing (3)
src/clojure_mcp/utils/valid_paths.clj
test/clojure_mcp/tools/unified_read_file/tool_test.clj
test/clojure_mcp/utils/valid_paths_test.clj
…dation (bhauman#48)
LLMs frequently request Clojure files using dashes (core-stuff.clj) when the filesystem uses underscores (core_stuff.clj). This adds transparent correction in validate-path-with-client so all tools (read_file, edit, grep, etc.) get the fix for free.
Only corrects .clj/.cljs/.cljc files. Directory components are preserved. Corrected path is re-validated for defense-in-depth (symlink protection).
a69383b to 63ea7ac
|
Bruce, not sure what I did wrong, but there was a merge conflict in the GitHub UI. I merged the change I expected, but I'm not sure how this happened.
UPDATES: FAIL in (clojure-file?-test) (valid_paths_test.clj:109)
Please discard this PR and I will review locally to see what went wrong. |
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/clojure_mcp/utils/valid_paths.clj (1)
152-156: ⚠️ Potential issue | 🔴 Critical
Restore .cljd detection in clojure-file? to fix regression.
The refactor dropped .cljd recognition, which is now failing CI (Detect ClojureDart extension assertions). Keep correction scoped to .clj/.cljs/.cljc, but preserve .cljd classification in clojure-file?.
💡 Suggested fix
 (defn clojure-file?
@@
   (when file-path
     (let [lower-path (str/lower-case file-path)]
       (or (clojure-source-ext? file-path)
+          (str/ends-with? lower-path ".cljd")
           (str/ends-with? lower-path ".bb")
           (str/ends-with? lower-path ".lpy")
           (str/ends-with? lower-path ".edn")
           (babashka-shebang? file-path)))))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/clojure_mcp/utils/valid_paths.clj` around lines 152 - 156, The clojure-file? predicate lost recognition for the .cljd extension; update the OR branch in clojure-file? to include (str/ends-with? lower-path ".cljd") alongside the existing checks (e.g., .clj, .cljs, .cljc, .bb, .lpy, .edn) so files with .cljd are classified as Clojure; locate the clojure-file? function (uses file-path and lower-path and babashka-shebang?) and add the .cljd ends-with check in the same style as the other extension checks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a5a1cb31-f6c6-4fc7-bc33-42916deb5373
📒 Files selected for processing (2)
src/clojure_mcp/utils/valid_paths.clj
test/clojure_mcp/utils/valid_paths_test.clj
|
Thanks for working on this and for being proactive about the test failures — appreciate the quick communication. Go ahead and sort it out locally and open a new PR when it's ready. Looking forward to seeing the reworked approach! |
|
Also, I'm sorry — I realize you did a bunch of work on this today and it sounds like it was your only day to get things done. I really do appreciate the effort, and the approach is solid. Looking forward to the follow-up PR. |
|
Yes, the dash→underscore correction should apply to
(defn ns-to-paths [ns-name]
  (let [base (replace-all (name ns-name) #"[.-]" {"." "/" "-" "_"})]
    [(str base ".cljd") (str base ".cljc")]))
So |


Summary
- -and return file content along with corrected filename. #48: LLMs commonly request Clojure files using namespace-style dashes (core-stuff.clj) when the filesystem uses underscores (core_stuff.clj)
- Adds transparent correction in validate-path-with-client so all tools (read_file, edit, grep, glob, etc.) get the fix for free
- Only corrects .clj, .cljs, .cljc files: extensions that follow the Clojure namespace-to-filename convention
Approach
Per your review feedback: correction lives in validate-path-with-client in valid_paths.clj, not in the read_file tool. After validate-path returns, if the file doesn't exist and has a Clojure source extension, it tries the underscore version. The tool already returns the path it read, so the LLM sees the corrected filename without needing an explicit notice.
Changes
- utils/valid_paths.clj: clojure-source-extensions constant (shared with clojure-file?), clojure-source-ext? predicate, try-dash-to-underscore-correction helper, updated validate-path-with-client
- utils/valid_paths_test.clj: 12 unit test scenarios
- tools/unified_read_file/tool_test.clj: 7 integration tests through full tool pipeline + recursive fixture cleanup
Test plan
Summary by CodeRabbit
New Features
Tests