fix(discover): reorder leading rg flags in rewrite#1394
fix(discover): reorder leading rg flags in rewrite#1394lawrence3699 wants to merge 1 commit intortk-ai:developfrom
Conversation
|
|
📊 Automated PR Analysis
SummaryFixes the rg/grep rewrite path in the discover module to reorder leading flags (like -n, -U) behind the required pattern and path arguments, since rtk grep expects Review Checklist
Linked issues: #1367 Analyzed automatically by wshm · This is an automated analysis, not a human review. |
There was a problem hiding this comment.
Pull request overview
This PR fixes an rtk rewrite edge case for rg/grep commands by ensuring leading flags don’t get placed into rtk grep’s required positional argument slots, preventing rewritten commands from failing (Fixes #1367).
Changes:
- Adds a specialized
rewrite_grep_likepath for thertk greprule that moves simple leadingrg/grepflags to the end of the rewritten command. - Skips rewriting when encountering leading flags that require a separate value (to avoid emitting broken rewrites).
- Adds unit tests covering basic leading-flag reordering, default-path behavior, and value-flag skip behavior.
| let args: Vec<String> = tokenize(cmd_clean) | ||
| .into_iter() | ||
| .filter(|t| t.kind == TokenKind::Arg) | ||
| .map(|t| t.value) | ||
| .collect(); | ||
|
|
There was a problem hiding this comment.
rewrite_grep_like builds args by tokenizing and then filtering to TokenKind::Arg. In lexer::tokenize, characters like *, {}, !, etc. are emitted as TokenKind::Shellism, so this rewrite path will silently drop them and corrupt arguments (e.g., rg foo src/*.rs would lose the * and change the effective paths/pattern). Consider either (a) returning None (skip rewrite) if any non-Arg tokens occur within what should be a single shell word, or (b) reconstructing shell words by grouping contiguous Arg+Shellism tokens using offsets so globs/brace expansions are preserved exactly.
| let args: Vec<String> = tokenize(cmd_clean) | |
| .into_iter() | |
| .filter(|t| t.kind == TokenKind::Arg) | |
| .map(|t| t.value) | |
| .collect(); | |
| let tokens = tokenize(cmd_clean); | |
| if tokens.iter().any(|t| t.kind != TokenKind::Arg) { | |
| return None; | |
| } | |
| let args: Vec<String> = tokens.into_iter().map(|t| t.value).collect(); |
| while idx < args.len() { | ||
| let arg = &args[idx]; | ||
| if arg == "-" || !arg.starts_with('-') { | ||
| break; | ||
| } | ||
|
|
||
| if arg == "--" { | ||
| return None; | ||
| } | ||
|
|
||
| if grep_flag_requires_separate_value(arg) { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
The safety check for leading flags assumes a hard-coded denylist of options that take a separate value (grep_flag_requires_separate_value). Any missing value-taking flag will be treated as a “simple” flag and can shift idx so that the next value is mis-identified as the pattern/path (e.g. ripgrep flags like --max-depth 2 would make 2 become the pattern). To ensure the rewrite is actually safe, consider flipping this to an allowlist of valueless flags that you are confident don’t consume the next arg, or conservatively skip rewrite for unknown -X/--long leading flags unless they’re in a known-safe set or use --flag=value form.
| let (path, has_explicit_path) = match args.get(idx) { | ||
| Some(arg) if arg != "-" && !arg.starts_with('-') => { | ||
| idx += 1; | ||
| (arg.clone(), true) | ||
| } | ||
| _ => (".".to_string(), false), | ||
| }; |
There was a problem hiding this comment.
Path parsing treats "-" as not being an explicit path (Some(arg) if arg != "-" && !arg.starts_with('-')). For grep/rg, - is commonly used to mean stdin; with the current logic rg PATTERN - rewrites to rtk grep PATTERN . -, changing behavior (searching . and stdin instead of only stdin). Consider accepting - as a valid explicit path (set has_explicit_path = true), or if stdin paths aren’t supported, return None when args[idx] == "-" so the original command is preserved.
| parts.extend(leading_flags); | ||
| Some(format!("{}{}", parts.join(" "), redirect_suffix)) | ||
| } | ||
|
|
There was a problem hiding this comment.
The new rewrite_grep_like behavior isn’t covered for a few important edge cases that the implementation explicitly tries to be “safe” about: (1) shell-glob/brace arguments (e.g. src/*.rs), (2) stdin path -, and (3) unknown leading flags that take a separate value. Adding tests for these would help prevent regressions, especially since this rewrite path is intentionally conservative (returning None when unsafe).
| #[cfg(test)] | |
| mod tests { | |
| use super::rewrite_grep_like; | |
| #[test] | |
| fn rewrite_grep_like_preserves_shell_glob_path_argument() { | |
| assert_eq!( | |
| rewrite_grep_like("", r#"rg foo "src/*.rs""#, ""), | |
| Some(r#"rtk grep foo src/*.rs"#.to_string()) | |
| ); | |
| } | |
| #[test] | |
| fn rewrite_grep_like_preserves_brace_style_path_argument() { | |
| assert_eq!( | |
| rewrite_grep_like("", r#"rg foo "src/{lib,bin}""#, ""), | |
| Some(r#"rtk grep foo src/{lib,bin}"#.to_string()) | |
| ); | |
| } | |
| #[test] | |
| fn rewrite_grep_like_keeps_stdin_dash_as_remaining_argument() { | |
| assert_eq!( | |
| rewrite_grep_like("", "rg foo -", ""), | |
| Some("rtk grep foo . -".to_string()) | |
| ); | |
| } | |
| #[test] | |
| fn rewrite_grep_like_returns_none_for_long_flag_with_separate_value() { | |
| assert_eq!(rewrite_grep_like("", "rg --glob *.rs foo src", ""), None); | |
| } | |
| #[test] | |
| fn rewrite_grep_like_returns_none_for_short_flag_with_separate_value() { | |
| assert_eq!(rewrite_grep_like("", "grep -e foo src", ""), None); | |
| } | |
| } |
Fixes #1367
rtk rewritecurrently turns commands like:rg -n -U "document:\s*\{\s*create:" api/src scriptsinto:
rtk grep -n -U "document:\s*\{\s*create:" api/src scriptsBut
rtk grepexpectspattern path [extra_args...], so those leading flags get parsed into the wrong slots and the rewritten command fails.This updates the grep/rg rewrite path to move simple leading flags behind the required
pattern patharguments. If a leading flag needs its own separate value, RTK now skips the rewrite instead of emitting a broken command.Validation:
cargo fmt --all --checkcargo test rewrite_rg_cargo testcargo clippy --all-targetscargo run -- rewrite 'rg -n -U "document:\s*\{\s*create:" /tmp/rtk-rg-repro/sample.txt'