fix: only strip the leading bullet from a ListItem first line #1056
Open
pablopupo wants to merge 1 commit into
replace_bullets ran re.sub without a count, so it removed every dash or hyphen surrounded by spaces in the first line instead of just the leading bullet. A list item whose text held a spaced dash mid-sentence lost that dash. Passing count=1 limits the substitution to the leading bullet. Fixes datalab-to#1024
Problem
In replace_bullets, the bullet-stripping re.sub ran without a count, so it replaced every bullet/dash that sits between spaces on a list item's first line, not just the leading bullet. The character class includes the en dash, em dash, and hyphen, so a list item whose text contains a spaced dash mid-line had that dash silently deleted.
For example, a dialogue line rendered as a list item lost its internal dash:
I saw him...he interrupted. (the mid-sentence dash was removed)
Fix
Add count=1 so only the leftmost match is replaced, which for a real list item is the leading bullet. Internal dashes are preserved.
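The effect of count can be sketched in isolation. The pattern below is a simplified, hypothetical stand-in for the one in replace_bullets (the real character class is larger and is not reproduced here), and both helper names are made up for illustration. The replacement keeps the two captured groups and drops only the bullet character.

```python
import re

# Hypothetical, simplified pattern: a bullet or dash that follows the start of
# the string, a newline, or a space, and is itself followed by a space.
# This is an assumption for illustration, not the project's actual pattern.
BULLET_PATTERN = r"(^|[\n ])[•–—-]( )"


def strip_every_match(text: str) -> str:
    """Old behaviour: no count, so every spaced dash is removed."""
    return re.sub(BULLET_PATTERN, r"\1\2", text)


def strip_leftmost_match(text: str) -> str:
    """Fixed behaviour: count=1 stops after the leftmost match."""
    return re.sub(BULLET_PATTERN, r"\1\2", text, count=1)


sample = "- I saw him - he interrupted."

# Without count, the mid-sentence dash disappears along with the bullet.
print(repr(strip_every_match(sample)))

# With count=1, only the leading bullet is removed.
print(repr(strip_leftmost_match(sample)))
```

re.sub scans left to right, so the first match on a real list item is the bullet at the start of the line. Any later spaced dash is left alone once count=1 is reached.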
Test
Adds tests/schema/blocks/test_listitem.py, a pure-logic regression test that runs replace_bullets on a minimal Line block and checks the leading bullet is removed while a spaced dash inside the text survives. It fails on the current code and passes with this change, and pulls in no model weights.
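The shape of such a regression test might look roughly like the sketch below. It does not reproduce the actual test file: FakeLine, replace_leading_bullet, and the simplified pattern are all hypothetical stand-ins, chosen so the example runs without importing the project or any model weights. It covers the three dash characters named in the problem description.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified pattern; the project's real character class differs.
BULLET_PATTERN = r"(^|[\n ])[•–—-]( )"


@dataclass
class FakeLine:
    """Hypothetical stand-in for a Line block; only the html field is modelled."""

    html: str


def replace_leading_bullet(line: FakeLine) -> None:
    # count=1 limits the substitution to the leftmost match.
    line.html = re.sub(BULLET_PATTERN, r"\1\2", line.html, count=1)


def test_internal_dash_survives() -> None:
    # Hyphen, en dash, and em dash are each tried as the mid-sentence dash.
    for dash in ("-", "–", "—"):
        line = FakeLine(html=f"• I saw him {dash} he interrupted.")
        replace_leading_bullet(line)
        # The leading bullet is gone.
        assert not line.html.lstrip().startswith("•")
        # The spaced dash inside the text is still there.
        assert f" {dash} " in line.html


test_internal_dash_survives()
```

Dropping count=1 from replace_leading_bullet makes the second assertion fail, which is the property a regression test for this change needs.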
Fixes #1024