Add timestamp rules and constraints to decoder in Whisper example #3054
Merged
…initial timestamp index
Pull Request Overview
This PR implements comprehensive timestamp handling rules in the Whisper decoder to ensure accurate and well-formed timestamp generation during transcription, following OpenAI Whisper's specifications.
Key Changes:
- Added `apply_timestamp_rules()` method implementing 4 core timestamp constraints (pairing, non-decreasing order, forced initial timestamp, probability-based preference)
- Added `max_initial_timestamp_index` parameter for configurable initial timestamp limits
- Changed timestamps to be enabled by default and refactored decoder methods for consistent model access
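The first three constraints can be illustrated with a minimal sketch. This is not the code from this PR: it works on a plain `f32` slice rather than a candle tensor, and the function name `mask_timestamp_logits` and its parameters are hypothetical. It assumes token ids at or above `timestamp_begin` are timestamps, that `eot` lies below `timestamp_begin`, and that ordinary text tokens lie below `eot`, which mirrors the layout used by the referenced OpenAI logic.

```rust
/// Hypothetical helper (not the PR's API): masks logits in place for the
/// pairing, non-decreasing, and initial-timestamp rules.
/// `sampled` holds only the tokens generated so far (prompt excluded).
fn mask_timestamp_logits(
    logits: &mut [f32],
    sampled: &[u32],
    timestamp_begin: usize,
    eot: usize,
    max_initial_timestamp_index: Option<usize>,
) {
    let is_ts = |t: u32| t as usize >= timestamp_begin;
    let n = sampled.len();
    let last_was_ts = n >= 1 && is_ts(sampled[n - 1]);
    let penultimate_was_ts = n < 2 || is_ts(sampled[n - 2]);

    // Rule 1: timestamps appear in pairs, except directly before EOT.
    if last_was_ts {
        if penultimate_was_ts {
            // Two timestamps in a row (or a lone first one): no timestamp next.
            logits[timestamp_begin..].fill(f32::NEG_INFINITY);
        } else {
            // Text followed by one timestamp: ordinary text tokens are not allowed.
            logits[..eot].fill(f32::NEG_INFINITY);
        }
    }

    // Rule 2: timestamps must not decrease.
    if let Some(&last_ts) = sampled.iter().rev().find(|&&t| is_ts(t)) {
        let mut bound = last_ts as usize;
        if !(last_was_ts && !penultimate_was_ts) {
            // Also forbid repeating the last value so a segment has nonzero length.
            bound += 1;
        }
        let end = bound.min(logits.len());
        logits[timestamp_begin..end].fill(f32::NEG_INFINITY);
    }

    // Rule 3: the very first sampled token must be a timestamp, optionally capped.
    if sampled.is_empty() {
        logits[..timestamp_begin].fill(f32::NEG_INFINITY);
        if let Some(max_idx) = max_initial_timestamp_index {
            let first_forbidden = timestamp_begin + max_idx + 1;
            if first_forbidden < logits.len() {
                logits[first_forbidden..].fill(f32::NEG_INFINITY);
            }
        }
    }
}
```

With a toy vocabulary of 10 tokens (text ids 0 to 3, `eot = 4`, timestamps from 5), calling the helper with no sampled tokens and `Some(1)` as the cap leaves only tokens 5 and 6 unmasked.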
Thanks!
john-sharratt pushed a commit to john-sharratt/candle that referenced this pull request on May 7, 2026
…ggingface#3054)
* Apply timestamp rules in whisper decoder and add support for maximum initial timestamp index
* Optimize mask generation in decoder by pre-allocating a reusable buffer
* Refactor timestamp probability calculations in decoder to use log-softmax for numerical stability
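The probability-based preference mentioned in the last commit bullet can be sketched in log space as follows. This is an illustration under assumptions, not the PR's implementation: the helper names `log_sum_exp` and `prefer_timestamps` are hypothetical, and the logits are a plain `f32` slice rather than a candle tensor. The idea is to compare the total log-probability of all timestamp tokens against the single most likely text token, and to mask the text tokens when the timestamps win.

```rust
/// Log-sum-exp with the usual max shift so large logits do not overflow.
fn log_sum_exp(xs: &[f32]) -> f32 {
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    if max == f32::NEG_INFINITY {
        // Every entry is masked (or the slice is empty).
        return f32::NEG_INFINITY;
    }
    max + xs.iter().map(|&x| (x - max).exp()).sum::<f32>().ln()
}

/// Hypothetical helper (not the PR's API): if the summed timestamp probability
/// exceeds the most likely non-timestamp token, mask all non-timestamp tokens.
/// Returns true when the mask was applied.
fn prefer_timestamps(logits: &mut [f32], timestamp_begin: usize) -> bool {
    // Subtracting the normalizer turns logits into log-softmax values.
    let norm = log_sum_exp(logits);
    let ts_logprob = log_sum_exp(&logits[timestamp_begin..]) - norm;
    let max_text_logprob = logits[..timestamp_begin]
        .iter()
        .copied()
        .fold(f32::NEG_INFINITY, f32::max)
        - norm;

    if ts_logprob > max_text_logprob {
        logits[..timestamp_begin].fill(f32::NEG_INFINITY);
        true
    } else {
        false
    }
}
```

Working in log space avoids exponentiating raw logits directly, which is the stability concern the commit message refers to. The shared normalizer cancels in the comparison; it is kept here only to make the log-softmax step explicit.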
Summary
This PR implements timestamp handling rules in the Whisper decoder, following OpenAI's reference implementation, so that timestamps generated during transcription are accurate and well-formed: https://github.com/openai/whisper/blob/e8622f9afc4eba139bf796c210f5c01081000472/whisper/decoding.py#L439
New Example Output
Changes Made
Core Timestamp Rules Implementation
Configuration Enhancements
Code Structure Improvements
Benefits
✅ More accurate timestamp generation following OpenAI Whisper specifications
✅ Prevents malformed timestamp sequences
✅ Configurable initial timestamp constraints for better control
✅ Improved transcription quality with proper temporal alignment
Testing
The implementation follows the timestamp rules from the original OpenAI Whisper codebase, ensuring compatibility and correctness.