fix: text_to_speech requires output_path, removes inline audio #14

Merged
maharshi-smallest merged 1 commit into main from fix/tts-no-inline-audio on Apr 23, 2026

Conversation

@maharshi-smallest

Collaborator

Summary

Fixes the TTS tool creating 3 duplicate files on the Desktop when Claude Desktop can't render inline audio.

Changes

  • output_path required — LLM asks user where to save (suggests ~/Desktop by default)
  • Removed inline AudioContent — Claude Desktop doesn't support it, causing retries
  • Expands ~ to home directory
  • Description tells LLM not to retry on success
  • Added duration estimate
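
A minimal sketch of the required-path check and `~` expansion described in the list above. The helper names `expandTilde` and `resolveOutputPath` are hypothetical, and the actual code in this PR may be structured differently; the sketch only assumes Node's built-in `os` and `path` modules.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: replace a leading "~" with the user's home directory.
// The `home` parameter exists only so the function can be tested deterministically.
function expandTilde(p: string, home: string = os.homedir()): string {
  if (p === "~") return home;
  if (p.startsWith("~/")) return path.join(home, p.slice(2));
  return p; // "~user/..." and all other paths are returned unchanged
}

// Hypothetical helper: reject a missing or empty output_path, then expand and resolve it.
function resolveOutputPath(outputPath: unknown, home: string = os.homedir()): string {
  if (typeof outputPath !== "string" || outputPath.trim() === "") {
    throw new Error("output_path is required");
  }
  return path.resolve(expandTilde(outputPath.trim(), home));
}
```

With this shape, a call such as `resolveOutputPath("~/Desktop/speech.wav")` would yield an absolute path under the current user's home directory, and a call without a path would fail before any audio is generated.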

🤖 Generated with Claude Code

- output_path is now required — LLM must ask user where to save
- Removed inline AudioContent (Claude Desktop doesn't support it in
  tool responses, causing the LLM to retry and create duplicate files)
- Expands ~ to home directory
- Description tells LLM not to retry on success
- Added duration estimate in response

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
maharshi-smallest merged commit 461e258 into main on Apr 23, 2026
1 check passed
@entelligence-ai-pr-reviews


EntelligenceAI PR Summary

Refactors text_to_speech tool in src/tools/text-to-speech.ts to enforce caller-supplied output paths and remove inline audio data from responses.

  • output_path promoted to required parameter at top of schema with tilde (~) expansion support
  • Tool description updated to instruct LLM to always prompt the user for a save location before invoking the tool
  • Removed inline base64 audio response block and associated mimeMap lookup
  • Added durationEstimate field to the success response payload
  • Removed default Desktop path fallback logic, now redundant since output_path is always provided
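
The response change summarized above could look roughly like the following sketch: a text-only tool result that reports where the file was saved plus a `durationEstimate`, with no inline audio block. Everything except the `durationEstimate` field name is an assumption made for illustration, including the `savedTo` field, the result types, and the estimate itself, which here assumes uncompressed 16-bit mono PCM at 24 kHz. The PR does not state how the estimate is computed.

```typescript
// Hypothetical result types, modeled loosely on a text-only tool response.
interface TextContent { type: "text"; text: string; }
interface ToolResult { content: TextContent[]; }

// Assumption: raw PCM audio, so duration = bytes / (rate * channels * bytes per sample).
function estimatePcmDurationSeconds(
  byteLength: number,
  sampleRate = 24000,
  channels = 1,
  bytesPerSample = 2,
): number {
  return byteLength / (sampleRate * channels * bytesPerSample);
}

// Build a success payload that points at the saved file instead of embedding audio.
function buildSuccessResult(savedPath: string, byteLength: number): ToolResult {
  const seconds = estimatePcmDurationSeconds(byteLength);
  const payload = {
    success: true,
    savedTo: savedPath, // hypothetical field name
    durationEstimate: `${seconds.toFixed(1)}s`,
  };
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}
```

Returning a single text block keeps the response small and gives the LLM an unambiguous success signal, which is the behavior the PR relies on to prevent retries.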

Confidence Score: 5/5 - Safe to Merge

Safe to merge — this PR cleanly refactors text_to_speech in src/tools/text-to-speech.ts by promoting output_path to a required parameter with tilde expansion support, removing the inline base64 audio response and associated mimeMap lookup, and adding a durationEstimate field to the success payload. The automated review found zero issues across the changed file, and the changes reflect a coherent, well-scoped improvement to the tool's interface contract. No logic regressions, security concerns, or unresolved prior comments were identified.

Key Findings:

  • Promoting output_path to a required parameter in the schema is a correctness improvement — it eliminates a code path where the tool could silently omit saving audio, ensuring callers always supply an explicit destination.
  • Removing the inline base64 audio block and mimeMap lookup simplifies the response contract and reduces unnecessary data transfer in the tool's output, with no apparent loss of functionality since the file is written to disk.
  • The addition of durationEstimate to the success response is a backward-compatible enrichment with no risk of breaking existing consumers.
  • Zero automated review comments were generated and coverage spans the single changed file, giving high confidence that the change is internally consistent.

Files requiring special attention

  • src/tools/text-to-speech.ts
