util: extend string search with user-defined printable characters#6161

Open
cheese-cakee wants to merge 3 commits into rizinorg:dev from cheese-cakee:feature/string-search-configurable

@cheese-cakee
Contributor

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository.
  • I made sure to follow the project's coding style.
  • I've documented every RZ_API function and struct this PR changes.
  • I've added tests that prove my changes are effective (required for changes to RZ_API).
  • I've updated the Rizin book with the relevant information (if needed).
  • I have NOT used AI tools to generate these code changes.

Detailed description

This PR extends the string search functionality in Rizin to allow users to define custom unprintable character sets via the new str.unprintable configuration option.

Users can specify a comma-separated list of Unicode code points to treat as non-printable:

e str.unprintable=0x09,0x0a,0x0d,0x1b
e str.unprintable=0x200B  # zero-width space
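
As a rough illustration of how a comma-separated value like the ones above could be turned into a list of code points, here is a minimal, self-contained sketch. It is not the code from this PR: `demo_parse_code_points` is a hypothetical helper, and the choices to reject values above U+10FFFF and to reject empty entries are assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Rizin: parses a comma-separated list of
 * code points such as "0x09,0x0a,0x0d,0x1b" into out[]. Returns the number
 * of values stored, or -1 on malformed input or if out[] is too small. */
static int demo_parse_code_points(const char *list, uint32_t *out, size_t cap) {
	size_t n = 0;
	const char *p = list;
	while (*p) {
		char *end = NULL;
		errno = 0;
		/* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
		unsigned long v = strtoul(p, &end, 0);
		if (end == p || errno == ERANGE || v > 0x10FFFF) {
			return -1; /* not a number, or outside the Unicode range */
		}
		if (n >= cap) {
			return -1; /* caller's buffer is full */
		}
		out[n++] = (uint32_t)v;
		if (*end == ',') {
			p = end + 1;
			if (!*p) {
				return -1; /* trailing comma (assumed to be an error) */
			}
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1; /* unexpected character after the number */
		}
	}
	return (int)n;
}
```

Under these assumptions, `"0x09,0x0a,0x0d,0x1b"` yields four entries and `"0x200B"` yields one, while an input with an empty entry such as `"0x09,,0x0a"` is rejected.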

Test plan

  1. Build Rizin with the changes
  2. Test string search with default unprintable chars
  3. Test string search with custom unprintable chars
  4. Run the test suite: rz-test test/db/cmd/cmd_search_z

Closing issues

closes #4930

@codecov

codecov bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 63.30275% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.23%. Comparing base (6131a0e) to head (f990ec5).

Files with missing lines | Patch % | Lines
librz/core/cconfig.c | 38.46% | 24 Missing and 8 partials ⚠️
librz/util/unicode.c | 0.00% | 5 Missing ⚠️
librz/util/str_search.c | 83.33% | 0 Missing and 2 partials ⚠️
librz/util/str.c | 96.77% | 0 Missing and 1 partial ⚠️
Additional details and impacted files
Files with missing lines | Coverage Δ
librz/bin/bfile_string.c | 61.83% <100.00%> (+0.25%) ⬆️
librz/bin/bin.c | 60.25% <100.00%> (+0.04%) ⬆️
librz/core/canalysis.c | 61.78% <100.00%> (+0.01%) ⬆️
librz/core/cmd/cmd_print.c | 44.29% <100.00%> (+0.01%) ⬆️
librz/core/cmeta.c | 70.52% <100.00%> (+0.09%) ⬆️
librz/core/csearch.c | 48.38% <100.00%> (+0.18%) ⬆️
librz/include/rz_bin.h | 43.47% <ø> (ø)
librz/include/rz_util/rz_str.h | 60.00% <ø> (ø)
librz/util/str.c | 56.57% <96.77%> (+0.61%) ⬆️
librz/util/str_search.c | 87.09% <83.33%> (-0.25%) ⬇️
... and 2 more

... and 7 files with indirect coverage changes

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@cheese-cakee
Contributor Author

ready for review @Rot127

Comment on lines +4266 to +4271
const RzCodePoint *user_unprintable = (const RzCodePoint *)rz_vector_head(option->user_unprintable);
for (size_t i = 0, count = rz_vector_len(option->user_unprintable); i < count; i++) {
	if (user_unprintable[i] == cp) {
		return true;
	}
}
Member

Please use rz_vector_foreach. It is way cleaner.
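
For readers unfamiliar with the pattern, the following self-contained sketch shows what an iterator-style loop of this kind could look like. It deliberately does not include Rizin headers: `DemoVector`, `DemoCodePoint`, and `demo_vector_foreach` are hypothetical stand-ins that only assume a (vector, iterator) macro shape similar to `rz_vector_foreach`, so the real revision of the reviewed loop may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoCodePoint; /* stand-in for RzCodePoint (assumption) */

/* Stand-in for a generic vector: raw storage, element count, element size. */
typedef struct {
	void *a;
	size_t len;
	size_t elem_size;
} DemoVector;

/* Hypothetical foreach macro: walks the storage one element at a time,
 * leaving `it` pointing at the current element. */
#define demo_vector_foreach(vec, it) \
	for ((it) = (vec)->a; \
		(const char *)(it) != (const char *)(vec)->a + (vec)->len * (vec)->elem_size; \
		(it) = (const void *)((const char *)(it) + (vec)->elem_size))

/* Returns true if cp is present in the user-defined set. */
static bool demo_is_user_unprintable(const DemoVector *set, DemoCodePoint cp) {
	if (!set || set->len == 0) {
		return false; /* no set, or an empty one: nothing to match */
	}
	const DemoCodePoint *it;
	demo_vector_foreach (set, it) {
		if (*it == cp) {
			return true;
		}
	}
	return false;
}
```

Compared with the index-based loop under review, the iterator form avoids the separate head pointer, the cast, and the explicit length bookkeeping.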

Comment on lines +96 to +101
const RzCodePoint *user_unprintable = (const RzCodePoint *)rz_vector_head(opt->user_unprintable);
for (size_t i = 0, count = rz_vector_len(opt->user_unprintable); i < count; i++) {
	if (user_unprintable[i] == cp) {
		return true;
	}
}
Member

rz_vector_foreach

* \param user_unprintable Array of user-defined non-printable code points.
* \param user_unprintable_count Number of user-defined non-printable code points.
*/
RZ_API bool rz_unicode_code_point_is_user_unprintable(const RzCodePoint c, const RzCodePoint *user_unprintable, size_t user_unprintable_count) {
Member

This function seems to be unused?

Member

Please remove it then, if I am not mistaken.

Development

Successfully merging this pull request may close these issues.

Extend string search

2 participants