Improved verse segmentation#985

Open
benjaminking wants to merge 8 commits into master from improved_verse_segmentation
benjaminking (Collaborator) commented Mar 31, 2026

This PR adds new features to the verse segmentation script and refactors it for better organization. New features include:

  • Greedy selection of verse breaks (instead of in-order)
  • A new alignment method that uses Fast Align to filter out erroneous Eflomal word alignment pairs
  • Preliminary splitting of passages into smaller sub-passages for better word alignments
  • Saving of sub-passages and their alignments
  • Improved heuristics for correcting verse breaks that are close but not quite correct
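
For readers unfamiliar with the first two items, here is a minimal sketch of what they could look like. It is not the code in this PR. It assumes that "filtering" means keeping only the Eflomal pairs that Fast Align also proposes (both tools emit Pharaoh-format lines such as `0-0 1-2`), and that greedy selection means accepting the highest-scoring candidate breaks first while keeping the breaks in verse order. The names `parse_pharaoh`, `filter_pairs`, and `greedy_verse_breaks`, and the candidate scores, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]  # (source token index, target token index)


def parse_pharaoh(line: str) -> Set[Pair]:
    """Parse one line of Pharaoh-format alignments, e.g. "0-0 1-2"."""
    pairs: Set[Pair] = set()
    for token in line.split():
        src, trg = token.split("-")
        pairs.add((int(src), int(trg)))
    return pairs


def filter_pairs(eflomal_line: str, fast_align_line: str) -> List[Pair]:
    """Keep only the Eflomal pairs that Fast Align also proposes.

    Assumption: the filter is a plain set intersection; the PR may use
    a different criterion.
    """
    return sorted(parse_pharaoh(eflomal_line) & parse_pharaoh(fast_align_line))


def greedy_verse_breaks(
    candidates: Dict[int, List[Tuple[int, float]]]
) -> Dict[int, int]:
    """Pick one break position per verse, best-scoring candidates first.

    `candidates` maps a verse index to (token position, score) options.
    A candidate is accepted only if it keeps the chosen breaks in verse
    order: earlier verses break before it, later verses break after it.
    """
    ranked = sorted(
        (
            (score, verse, pos)
            for verse, options in candidates.items()
            for pos, score in options
        ),
        key=lambda item: -item[0],
    )
    chosen: Dict[int, int] = {}
    for _score, verse, pos in ranked:
        if verse in chosen:
            continue  # this verse already has a higher-scoring break
        consistent = all(
            (pos > other_pos) if other_verse < verse else (pos < other_pos)
            for other_verse, other_pos in chosen.items()
        )
        if consistent:
            chosen[verse] = pos
    return dict(sorted(chosen.items()))


if __name__ == "__main__":
    print(filter_pairs("0-0 1-2 2-1 3-3", "0-0 2-1 3-4"))
    hypothetical_candidates = {
        1: [(5, 0.9), (12, 0.4)],
        2: [(4, 0.8), (11, 0.7)],
        3: [(15, 0.95)],
    }
    print(greedy_verse_breaks(hypothetical_candidates))
```

In this toy input, the best candidate for verse 2 (position 4) would fall before the break already chosen for verse 1 (position 5), so the greedy pass falls back to verse 2's next-best option instead of committing to an inconsistent break.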


TaperChipmunk32 (Collaborator) left a comment

:lgtm:

@TaperChipmunk32 partially reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: 3 of 17 files reviewed, all discussions resolved (waiting on Enkidu93).

2 participants