Skip to content

fix(ls): LC_ALL=C + fallback to raw on unrecognized locale#1338

Open
lumincui wants to merge 3 commits intortk-ai:developfrom
lumincui:fix/ls-locale-fallback
Open

fix(ls): LC_ALL=C + fallback to raw on unrecognized locale#1338
lumincui wants to merge 3 commits intortk-ai:developfrom
lumincui:fix/ls-locale-fallback

Conversation

@lumincui
Copy link
Copy Markdown

Summary

Fixes rtk ls returning empty output for non-English locales (zh_CN, ja, ko, etc.).

Root Cause

The LS_DATE_RE regex hardcodes English month names (Jan|Feb|Mar|...). When ls -la runs under a non-English locale, it outputs native month names (e.g., 1月, 1月), causing the regex to match nothing → parse_ls_line returns None for every line → (empty) output.

Changes

  1. LC_ALL=C: Force English output for ls regardless of system locale (src/cmds/system/ls.rs:38)
  2. Fallback to raw: When zero lines are parsed but the directory is non-empty, return the original ls output instead of (empty) (src/cmds/system/ls.rs:79-84)
  3. Unit test: test_compact_chinese_locale_fallback verifies the fallback path

Behavior

Scenario Before After
English locale ✅ Works ✅ Works
Non-English locale (e.g. zh_CN) (empty) ✅ Raw ls -la output
Empty directory (empty) (empty)

No token savings for non-English locale users, but no silent data loss — the LLM still sees the full directory listing.

Closes #1276

- Force LC_ALL=C so ls always outputs English month names regardless of system locale
- When no lines are parsed (e.g., non-English locale where regex fails to match),
  fall back to raw output instead of returning '(empty)'
- This prevents silent data loss for users in zh_CN/ja/ko/etc. locales
- Fixes rtk-ai#1276
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 16, 2026

CLA assistant check
All committers have signed the CLA.

@lumincui
Copy link
Copy Markdown
Author

Thank you all for reporting and working around this issue! 🙏

This PR implements the fix:

  • @daydreaming666 — your shell wrapper workaround pointed us directly at LC_ALL=C as the solution
  • @NycolasAbreu — your alias rtk='LC_ALL=C.UTF-8 rtk' workaround confirmed the locale root cause
  • @Petrusreno — your detailed macOS 15 + pt_BR.UTF-8 reproduction and wrapper script were invaluable
  • @pszymkowiak — thanks for the automated triage

Would any of you like to review the PR? Your firsthand experience with the bug would be especially valuable.

@aeppling
Copy link
Copy Markdown
Contributor

Hey, i'm in favor of this one over #1358

Handling better parse failure + better scoped

@aeppling
Copy link
Copy Markdown
Contributor

Follow-up to do :

Some commands may also be using same type of parsing, and may encounter issue like those with non EN locales.
Maybe a global set of this var in run_filtered would be better for eg.

Thanks for contributing !

@aeppling
Copy link
Copy Markdown
Contributor

aeppling commented Apr 17, 2026

@lumincui

Thanks for the fix , the LC_ALL=C approach and the fallback logic look solid in intent.

Just tested, found a regression on empty directories. The fallback falsely triggers, causing raw ls -la output instead of (empty).

Reproduce

mkdir /tmp/empty-dir

# On develop → correct
rtk ls /tmp/empty-dir
# (empty)

# On this branch → regression
rtk ls /tmp/empty-dir
# total 28
# drwxr-xr-x  2 user user  4096 Apr 17 14:51 .
# drwxrwxrwt 16 root root 20480 Apr 17 14:54 ..

The -v flag also confirms the filter never runs (no Chars: line).

Note

The existing unit test test_compact_empty uses "total 0\n" as input (no ./.. lines), which doesn't match real ls -la output on actual empty directories — that's why CI passes.

@aeppling aeppling self-assigned this Apr 17, 2026
- Move . and .. detection before date parsing (is_dotdir) for non-English locale compatibility
- Add dotdirs counter to distinguish empty dir (only . and ..) from real content that failed to parse
- Fix test_compact_empty to use real ls -la output (includes . and ..)
- Add test_compact_empty_chinese_locale for Chinese locale empty dir case
- Closes regression where fallback falsely triggered on empty directories
@lumincui
Copy link
Copy Markdown
Author

@aeppling Thank you for the incredibly thorough regression testing!

Your detailed reproduction steps were spot-on — the fallback logic was incorrectly treating empty directories (which only contain . and .. entries) as 'content that failed to parse.'

Fix Summary

The root cause: parse_ls_line returns None for . and .. entries under non-English locales, and the fallback condition could not distinguish between:

  • Empty directory (only . and ..) → should return (empty)
  • Real content that failed to parse → should fallback to raw

Solution: Added is_dotdir() to detect . and .. entries before the date regex check, and a dotdirs counter to track whether all unparseable lines were just . and .. entries.

Test Coverage

  • test_compact_empty — fixed to use real ls -la output (includes . and ..)
  • test_compact_empty_chinese_locale — new test for Chinese locale empty directory

Both tests now pass. Your regression case is fully covered.

Note on Follow-up

Your suggestion about a global LC_ALL=C in run_filtered is noted — that would be a separate improvement for consistency across all commands. Happy to discuss further if you would like to open a follow-up issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants