fix(tts): voice not working by Moskize91 · Pull Request #2 · oomol-lab/epub2speech

Moskize91 · 2025-09-25T07:48:17Z

No description provided.

coderabbitai · 2025-09-25T07:48:24Z

Caution

Review failed

The pull request is closed.

Summary by CodeRabbit

Breaking Changes
- You must now specify a voice when generating audio; no default voice is applied.
- Removed voice discovery and config validation methods from the public interface.
Bug Fixes
- More reliable detection of empty/silent audio with clearer error messages after synthesis.
Tests
- Added multi-voice test coverage and setup to validate TTS output across voices.
Chores
- Removed build artifact upload from CI workflow.
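
The post-synthesis silence check mentioned under Bug Fixes could look roughly like the sketch below. It is not the project's implementation: the names `wav_is_silent` and `write_pcm16`, the amplitude threshold, and the restriction to 16-bit PCM are all assumptions made for illustration, and only the standard library is used.

```python
import struct
import wave
from pathlib import Path


def write_pcm16(path: Path, samples: list) -> None:
    """Write mono 16-bit PCM samples to a WAV file (hypothetical test helper)."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def wav_is_silent(path: Path, threshold: int = 0) -> bool:
    """Return True if the WAV has no frames or no sample louder than threshold."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("This sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    # WAV PCM data is little-endian; unpack every 16-bit sample.
    samples = struct.unpack(f"<{len(frames) // 2}h", frames)
    # A header-only file yields no samples and is treated as silent.
    return all(abs(sample) <= threshold for sample in samples)
```

A caller would run such a check on the file the SDK just wrote and raise a descriptive error when it returns `True`.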

Walkthrough

The Azure TTS provider API was simplified: default_voice was removed and convert_text_to_audio now requires an explicit voice: str. Pre-synthesis file-existence/silence checks were removed; silence/empty-audio detection is performed after reading the generated WAV. Several public helpers were removed (get_available_voices, validate_config, create_azure_tts_from_config) and the TextToSpeechProtocol now requires a non-optional voice. Tests were refactored to set up an AzureTextToSpeech instance and validate audio generation across multiple voices. CI no longer uploads build artifacts.
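
As a rough illustration of the contract described in the walkthrough, the sketch below shows a protocol whose `convert_text_to_audio` takes a required `voice: str`. The parameter order, the return type, and the `FakeTextToSpeech` stand-in are assumptions; the actual definitions in `epub2speech/tts/protocol.py` may differ.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class TextToSpeechProtocol(Protocol):
    """Sketch of the protocol: voice is required, with no default value."""

    def convert_text_to_audio(self, text: str, output_path: Path, voice: str) -> None:
        ...


class FakeTextToSpeech:
    """Hypothetical stand-in that records calls instead of synthesizing audio."""

    def __init__(self) -> None:
        self.calls = []

    def convert_text_to_audio(self, text: str, output_path: Path, voice: str) -> None:
        self.calls.append((text, output_path, voice))
```

Because `voice` has no default, a caller that omits it fails immediately with a `TypeError` rather than falling back to some implicit voice.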

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix(tts): voice not working #2 — Implements the same API changes to AzureTextToSpeech and TextToSpeechProtocol (removal of default_voice, required voice parameter, and deletion of get_available_voices/validate_config).

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description is missing and does not provide any information about the changes, leaving reviewers without context to understand the update.	Please add a concise description summarizing the changes made to the TTS provider, protocol, and tests to help reviewers understand the intent and scope of this pull request.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The pull request title follows the `<type>(<scope>): <subject>` format in English and accurately reflects the main change of resolving the voice issue in the TTS component.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4563347 and 1610923.

📒 Files selected for processing (1)

.github/workflows/ci.yml (0 hunks)


coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f609db and 4563347.

📒 Files selected for processing (3)

epub2speech/tts/azure_provider.py (1 hunks)
epub2speech/tts/protocol.py (1 hunks)
tests/test_tts.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/test_tts.py (4)

tests/utils/config.py (3)
- TTSConfig (9-43)
- validate_config (32-43)
- get_azure_config (29-30)

tests/test_chapter_tts.py (2)
- validate_config (123-124)
- convert_text_to_audio (114-118)

epub2speech/tts/azure_provider.py (2)
- AzureTextToSpeech (10-71)
- convert_text_to_audio (36-71)

epub2speech/tts/protocol.py (1)
- convert_text_to_audio (7-13)

🪛 Ruff (0.13.1)

epub2speech/tts/azure_provider.py

43-43: Avoid specifying long messages outside the exception class (TRY003)
46-46: Avoid specifying long messages outside the exception class (TRY003)
62-62: Avoid specifying long messages outside the exception class (TRY003)
64-64: Avoid specifying long messages outside the exception class (TRY003)
68-68: Avoid specifying long messages outside the exception class (TRY003)
71-71: Avoid specifying long messages outside the exception class (TRY003)
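
One common way to address TRY003 is to move the message into a dedicated exception class, as in the sketch below. The names `EmptyTextError` and `require_text` are hypothetical and do not appear in the PR; whether the project wants extra exception classes is a separate design decision.

```python
class EmptyTextError(ValueError):
    """Raised when the text to synthesize is empty or whitespace-only."""

    def __init__(self) -> None:
        # The message lives inside the class, so call sites stay short.
        super().__init__("Empty text provided for conversion")


def require_text(text: str) -> str:
    """Return text unchanged, or raise EmptyTextError if it is blank."""
    if not text or not text.strip():
        raise EmptyTextError()
    return text
```

Subclassing `ValueError` keeps existing `except ValueError` handlers working.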

🔇 Additional comments (1)

epub2speech/tts/protocol.py (1)

11-12: Explicit voice requirement matches current engine needs.

Line 11 ensures every caller passes a concrete voice, which lines up with the updated Azure implementation and prevents silent fallbacks. No further action needed here.

coderabbitai · 2025-09-25T07:53:31Z

epub2speech/tts/azure_provider.py

        if not text or not text.strip():
            raise ValueError("Empty text provided for conversion")

-        output_path.parent.mkdir(parents=True, exist_ok=True)
-
-        voice = voice or self.default_voice
-
+        self._speech_config.speech_synthesis_voice_name = voice
        audio_config = speechsdk.audio.AudioOutputConfig(filename=str(output_path))


⚠️ Potential issue | 🟠 Major

Guard against missing/blank voice input before invoking Azure

With the default voice removed, upstream callers that still rely on configs (or env vars) may hand us None/blank strings. Passing that straight into speech_synthesis_voice_name now blows up inside the SDK with an opaque error, whereas previously we silently fell back to the default voice. Let's fail fast with a clear ValueError and strip accidental whitespace before assigning.

         if not text or not text.strip():
             raise ValueError("Empty text provided for conversion")
 
+        if not voice or not voice.strip():
+            raise ValueError("Voice is required and cannot be empty")
+        voice = voice.strip()
+
         self._speech_config.speech_synthesis_voice_name = voice

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        if not text or not text.strip():
-            raise ValueError("Empty text provided for conversion")
-        output_path.parent.mkdir(parents=True, exist_ok=True)
-        voice = voice or self.default_voice
-        self._speech_config.speech_synthesis_voice_name = voice
-        audio_config = speechsdk.audio.AudioOutputConfig(filename=str(output_path))
+        if not text or not text.strip():
+            raise ValueError("Empty text provided for conversion")
+        if not voice or not voice.strip():
+            raise ValueError("Voice is required and cannot be empty")
+        voice = voice.strip()
+        self._speech_config.speech_synthesis_voice_name = voice
+        audio_config = speechsdk.audio.AudioOutputConfig(filename=str(output_path))

🧰 Tools

🪛 Ruff (0.13.1)

46-46: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In epub2speech/tts/azure_provider.py around lines 45 to 49, add a guard that checks the provided voice before assigning it to self._speech_config.speech_synthesis_voice_name: if voice is None or not str(voice).strip(), raise a ValueError with a clear message (e.g., "No voice provided for Azure TTS"), otherwise strip() the voice string and then assign the cleaned value; ensure this check happens before any call into the Azure SDK so we fail fast with a clear error.
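
The guard described in this prompt can be exercised without the Azure SDK by pulling it into a small helper, as in the sketch below. The helper name `require_voice` is hypothetical, and the voice name in the usage note is only an example value.

```python
from typing import Optional


def require_voice(voice: Optional[str]) -> str:
    """Return the stripped voice name, or raise if it is missing or blank."""
    if voice is None or not str(voice).strip():
        raise ValueError("No voice provided for Azure TTS")
    return str(voice).strip()
```

For instance, `require_voice("  en-US-JennyNeural ")` returns the name without surrounding whitespace, while `None` or `"   "` raises `ValueError` before any SDK call is made.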

coderabbitai · 2025-09-25T07:53:31Z

tests/test_tts.py

+    def setUp(self):
+        """Set up test configuration"""
+        self.config_path = Path(__file__).parent / "tts_config.json"
+        if not self.config_path.exists():
            self.skipTest("TTS config file not found")

-        config = TTSConfig(config_path)
-        self.assertTrue(config.validate_config(), "TTS configuration is invalid")
-        print("✅ TTS configuration validated")
+        config = TTSConfig(self.config_path)
+        self.assertTrue(config.validate_config())

-        config = TTSConfig(config_path)
        azure_config = config.get_azure_config()
-        self.assertIsNotNone(azure_config, "No Azure configuration found in config file")
-
-        tts = AzureTextToSpeech(
+        self.tts = AzureTextToSpeech(
            subscription_key=azure_config["subscription_key"],
-            region=azure_config["region"],
+            region=azure_config["region"]
        )
-        self.assertTrue(tts.validate_config(), "Azure TTS validation failed")
-        print("✅ Azure TTS instance created successfully")

-        output_dir = Path(__file__).parent / "dist"
-        output_dir.mkdir(exist_ok=True)
+        self.output_dir = Path(__file__).parent / "dist"
+        self.output_dir.mkdir(exist_ok=True)

-        for old_file in output_dir.glob("*.wav"):
+        for old_file in self.output_dir.glob("*.wav"):
            old_file.unlink()


⚠️ Potential issue | 🟠 Major

Prevent cleanup from deleting shared WAV fixtures.

self.output_dir points to <repo>/tests/dist, which is also used by other test modules. Clearing *.wav inside setUp erases audio artifacts those tests rely on, causing racy failures when suites run together.

Tighten the cleanup scope to this test’s own files by either using a test‑specific subdirectory or filtering on a unique prefix:

-        self.output_dir = Path(__file__).parent / "dist"
-        self.output_dir.mkdir(exist_ok=True)
-
-        for old_file in self.output_dir.glob("*.wav"):
-            old_file.unlink()
+        self.output_dir = Path(__file__).parent / "dist" / "voice_param"
+        self.output_dir.mkdir(parents=True, exist_ok=True)
+
+        for old_file in self.output_dir.glob("voice_param_*.wav"):
+            old_file.unlink()

🤖 Prompt for AI Agents

In tests/test_tts.py around lines 16-35, setUp currently clears all WAVs in tests/dist which removes shared fixtures used by other tests; change to use a test-specific output directory (e.g. tests/dist/tts_test) or at minimum only remove files created by this test by using a unique filename prefix (e.g. tts_*) — update self.output_dir to a subdirectory, mkdir it if missing, and restrict the cleanup loop to unlink only files matching that prefix or inside that subdirectory so other tests' WAV files are not touched.
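
The scoped cleanup can be sketched as a standalone helper, shown below. The function name `prepare_output_dir` and its default `subdir` and `prefix` values are assumptions taken from the examples in the prompt above, not from the merged code.

```python
from pathlib import Path


def prepare_output_dir(base_dir: Path, subdir: str = "tts_test", prefix: str = "tts_") -> Path:
    """Create a test-specific directory and remove only this test's old WAVs."""
    output_dir = base_dir / subdir
    output_dir.mkdir(parents=True, exist_ok=True)
    # Only files inside the subdirectory that carry the prefix are deleted.
    for old_file in output_dir.glob(f"{prefix}*.wav"):
        old_file.unlink()
    return output_dir
```

In `setUp` this would be called as `self.output_dir = prepare_output_dir(Path(__file__).parent / "dist")`, leaving WAV files written by other test modules untouched.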

Moskize91 added 2 commits September 25, 2025 15:42

refactor(tts): remove useless codes

24002ce

fix(tts): voice not working

4563347

chore(project): upgrade CI build

1610923

Moskize91 force-pushed the fix branch from 608a928 to 1610923 Compare September 25, 2025 07:52

coderabbitai bot reviewed Sep 25, 2025

Moskize91 merged commit a57e423 into main Sep 25, 2025
2 checks passed

Moskize91 deleted the fix branch September 25, 2025 07:55

Moskize91 merged 3 commits into main from fix
