fix: only strip the leading bullet from a ListItem first line #1056

Open
pablopupo wants to merge 1 commit into datalab-to:dev from pablopupo:fix/1024-listitem-dash-strip
@pablopupo

Problem

In replace_bullets, the bullet-stripping re.sub ran without a count, so it
replaced every bullet/dash that sits between spaces on a list item's first line,
not just the leading bullet. The character class includes the en dash, em dash,
and hyphen, so a list item whose text contains a spaced dash mid-line had that
dash silently deleted.

For example, a dialogue line rendered as a list item lost its internal dash:

  • before: I saw him ... he interrupted. (the mid-sentence dash was removed)
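The behavior described above can be reproduced with a small standalone sketch. The pattern here is a simplified illustration, not the one in the repository, and the sample sentence with an en dash is an assumed input.

```python
import re

# Illustrative pattern (an assumption, not copied from the repository):
# a bullet, en dash (\u2013), em dash (\u2014), or hyphen that follows the
# start of the string, a newline, or a space, and is followed by a space.
BULLET_PATTERN = r"(^|[\n ])[•\u2013\u2014-]( )"

# Assumed sample: a list item whose text contains a spaced en dash.
line = "• I saw him \u2013 he interrupted."

# Without a count, re.sub replaces every match on the line,
# so the mid-sentence dash is removed along with the leading bullet.
stripped_all = re.sub(BULLET_PATTERN, r"\1\2", line)
print(repr(stripped_all))
```

The second match happens because the dash inside the sentence is also preceded and followed by a space, which is all the pattern requires.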

Fix

Add count=1 so only the leftmost match is replaced, which for a real list item
is the leading bullet. Internal dashes are preserved.
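A minimal sketch of the change, using the same kind of simplified pattern. `strip_leading_bullet` and `BULLET_PATTERN` are hypothetical names for illustration; only the `count=1` argument to `re.sub` comes from this PR.

```python
import re

# Illustrative pattern (an assumption, not the repository's exact regex).
BULLET_PATTERN = r"(^|[\n ])[•\u2013\u2014-]( )"


def strip_leading_bullet(text: str) -> str:
    """Replace only the leftmost bullet-like match (hypothetical helper)."""
    # count=1 stops re.sub after the first (leftmost) substitution.
    return re.sub(BULLET_PATTERN, r"\1\2", text, count=1)


fixed = strip_leading_bullet("• I saw him \u2013 he interrupted.")
print(repr(fixed))
```

In this sketch the leftmost match is the leading bullet only when the line actually starts with one; a line with no leading bullet would still lose its first spaced dash.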

Test

Adds tests/schema/blocks/test_listitem.py, a pure-logic regression test that
runs replace_bullets on a minimal Line block and checks the leading bullet is
removed while a spaced dash inside the text survives. It fails on the current
code and passes with this change, and pulls in no model weights.
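The shape of such a regression test might look like the sketch below. It does not use the project's real `Line` block or `replace_bullets`; `FakeLine`, `replace_bullets_sketch`, and the pattern are hypothetical stand-ins so the example runs on its own.

```python
import re
from dataclasses import dataclass

# Illustrative pattern (an assumption, not the repository's exact regex).
BULLET_PATTERN = r"(^|[\n ])[•\u2013\u2014-]( )"


@dataclass
class FakeLine:
    """Hypothetical stand-in for a Line block; only the html field is modeled."""
    html: str


def replace_bullets_sketch(line: FakeLine) -> None:
    # Mirrors the fix described in this PR: substitute once, leftmost match only.
    line.html = re.sub(BULLET_PATTERN, r"\1\2", line.html, count=1)


def test_leading_bullet_removed_internal_dash_kept() -> None:
    line = FakeLine(html="• I saw him \u2013 he interrupted.")
    replace_bullets_sketch(line)
    assert "•" not in line.html          # leading bullet is stripped
    assert " \u2013 " in line.html       # spaced dash inside the text survives
```

Dropping `count=1` from the stand-in makes the second assertion fail, which is the failure mode the real test is meant to catch.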

Fixes #1024

replace_bullets ran re.sub without a count, so it removed every dash or
hyphen surrounded by spaces in the first line instead of just the leading
bullet. A list item whose text held a spaced dash mid-sentence lost that
dash. Passing count=1 limits the substitution to the leading bullet.

Fixes datalab-to#1024

github-actions Bot commented Jun 29, 2026

Contributor

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

@pablopupo

Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 29, 2026