fix(ls): LC_ALL=C + fallback to raw on unrecognized locale #1338
lumincui wants to merge 3 commits into rtk-ai:develop
- Force LC_ALL=C so ls always outputs English month names regardless of system locale
- When no lines are parsed (e.g., non-English locale where the regex fails to match), fall back to raw output instead of returning '(empty)'
- This prevents silent data loss for users in zh_CN/ja/ko/etc. locales
- Fixes rtk-ai#1276
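As a rough illustration of the first bullet, pinning the locale for a child process in Rust could look like the sketch below. The helper name `build_ls_command` is hypothetical and is not taken from the PR; only `std::process::Command` and its `env`, `arg` and `get_envs` methods are real standard-library API.

```rust
use std::process::Command;

/// Hypothetical helper (not the PR's code): build `ls -la <path>` with the
/// locale pinned to C.
fn build_ls_command(path: &str) -> Command {
    let mut cmd = Command::new("ls");
    cmd.arg("-la").arg(path);
    // LC_ALL takes precedence over LANG and the other LC_* variables, so the
    // date column uses the C locale's English month abbreviations.
    cmd.env("LC_ALL", "C");
    cmd
}

fn main() {
    let cmd = build_ls_command("/tmp");
    // Inspect the explicitly set environment without spawning the process.
    for (key, value) in cmd.get_envs() {
        println!("{:?} = {:?}", key, value);
    }
}
```

Setting the variable on the `Command` affects only the spawned `ls`, not the parent process or other commands.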
Thank you all for reporting and working around this issue! 🙏 This PR implements the fix.
Would any of you like to review the PR? Your firsthand experience with the bug would be especially valuable.

Hey, I'm in favor of this one over #1358: it handles parse failures better and is better scoped.

Follow-up to do: some commands may use the same type of parsing and could hit the same issue under non-English locales. Thanks for contributing!
Thanks for the fix. Just tested and found a regression on empty directories: the fallback falsely triggers, so raw ls output is printed instead of (empty).
Reproduce
mkdir /tmp/empty-dir
# On develop → correct
rtk ls /tmp/empty-dir
# (empty)
# On this branch → regression
rtk ls /tmp/empty-dir
# total 28
# drwxr-xr-x 2 user user 4096 Apr 17 14:51 .
# drwxrwxrwt 16 root root 20480 Apr 17 14:54 ..
Note
The existing unit test
- Move . and .. detection before date parsing (is_dotdir) for non-English locale compatibility
- Add dotdirs counter to distinguish an empty dir (only . and ..) from real content that failed to parse
- Fix test_compact_empty to use real ls -la output (includes . and ..)
- Add test_compact_empty_chinese_locale for the Chinese-locale empty dir case
- Closes regression where fallback falsely triggered on empty directories
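The decision described in this commit can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: `parse_name` is a simplified stand-in for the real regex-based parse_ls_line, `compact` is a hypothetical wrapper, and treating any line with fewer than three fields as the "total" header is a shortcut chosen for this sketch.

```rust
const MONTHS: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// True for the `.` and `..` rows, judged by the last field only, so no date
/// parsing is involved. (Simplification: ignores names containing spaces.)
fn is_dotdir(line: &str) -> bool {
    matches!(line.split_whitespace().last(), Some(".") | Some(".."))
}

/// Stand-in parser (assumption): succeeds only when the sixth column is an
/// English month abbreviation, mimicking a regex that hardcodes them.
fn parse_name(line: &str) -> Option<&str> {
    let fields: Vec<&str> = line.split_whitespace().collect();
    if fields.len() >= 9 && MONTHS.contains(&fields[5]) {
        fields.last().copied()
    } else {
        None
    }
}

/// Hypothetical compaction: names when parsing works, "(empty)" when only
/// `.` and `..` were seen, raw output when real entries failed to parse.
fn compact(raw: &str) -> String {
    let mut names: Vec<&str> = Vec::new();
    let mut entries = 0usize;
    let mut dotdirs = 0usize;
    for line in raw.lines() {
        // Skip the "total N" header (or its localized form) and blank lines.
        if line.split_whitespace().count() < 3 {
            continue;
        }
        entries += 1;
        if is_dotdir(line) {
            dotdirs += 1;
            continue;
        }
        if let Some(name) = parse_name(line) {
            names.push(name);
        }
    }
    if !names.is_empty() {
        names.join("\n")
    } else if entries > dotdirs {
        // Something other than `.`/`..` was listed but could not be parsed.
        raw.to_string()
    } else {
        "(empty)".to_string()
    }
}

fn main() {
    let zh_empty = "总用量 28\n\
                    drwxr-xr-x 2 user user 4096 4月 17 14:51 .\n\
                    drwxrwxrwt 16 root root 20480 4月 17 14:54 ..\n";
    println!("{}", compact(zh_empty));
}
```

Checking for `.` and `..` before any date parsing is what keeps the empty-directory case independent of the locale.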
@aeppling Thank you for the incredibly thorough regression testing! Your detailed reproduction steps were spot-on: the fallback logic was incorrectly treating empty directories (which contain only . and .. entries) as 'content that failed to parse.'
Fix Summary
The root cause: parse_ls_line returns None for . and .. entries under non-English locales, and the fallback condition could not distinguish between an empty directory (only . and .. entries) and real content that failed to parse.
Solution: Added is_dotdir() to detect . and .. entries before the date regex check, and a dotdirs counter to track whether all unparseable lines were just . and .. entries.
Test Coverage
Both tests (test_compact_empty and test_compact_empty_chinese_locale) now pass. Your regression case is fully covered.
Note on Follow-up
Your suggestion about a global LC_ALL=C in run_filtered is noted; that would be a separate improvement for consistency across all commands. Happy to discuss further if you would like to open a follow-up issue.
Summary
Fixes rtk ls returning empty output for non-English locales (zh_CN, ja, ko, etc.).
Root Cause
The LS_DATE_RE regex hardcodes English month names (Jan|Feb|Mar|...). When ls -la runs under a non-English locale, it outputs native month names (e.g., 1月), causing the regex to match nothing → parse_ls_line returns None for every line → (empty) output.
Changes
- LC_ALL=C: Force English output for ls regardless of system locale (src/cmds/system/ls.rs:38)
- Fall back to raw ls output instead of (empty) when no lines are parsed (src/cmds/system/ls.rs:79-84)
- test_compact_chinese_locale_fallback verifies the fallback path
Behavior
[Behavior table not recovered; surviving cells: (empty), ls -la output, (empty), (empty)]
No token savings for non-English locale users, but no silent data loss: the LLM still sees the full directory listing.
Closes #1276
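To make the root cause above concrete, the sketch below mimics a pattern that only knows English month abbreviations. It is a rough stand-in for LS_DATE_RE written with the standard library only. The function name `month_token` and the sample lines are assumptions for illustration, and unlike the real regex it scans every field without checking the day or time that follows.

```rust
const MONTHS: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// Return the first whitespace-separated field that is an English month
/// abbreviation, if any (hypothetical stand-in for the date regex).
fn month_token(line: &str) -> Option<&str> {
    line.split_whitespace().find(|field| MONTHS.contains(field))
}

fn main() {
    // Same entry as it might appear under the C locale and under zh_CN.
    let c_locale = "-rw-r--r-- 1 user user 12 Jan 5 09:30 notes.txt";
    let zh_locale = "-rw-r--r-- 1 user user 12 1月 5 09:30 notes.txt";

    println!("{:?}", month_token(c_locale));
    println!("{:?}", month_token(zh_locale));
}
```

The second line carries the same information, but no field matches the English list, which is how every row can end up unparsed.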