fix: text_to_speech saves to Desktop and returns inline audio #13

Merged
maharshi-smallest merged 2 commits into main from fix/tts-desktop-save-inline-audio on Apr 23, 2026

@maharshi-smallest (Collaborator)

Summary

Fixes the text_to_speech tool so that generated audio is accessible to the user.

Changes

  • Save to ~/Desktop by default instead of temp dir
  • Return audio as MCP AudioContent (base64) inline — Claude Desktop can render or offer it directly
  • Both file path AND inline audio in response
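
The changes listed above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `buildTtsResult`, its parameters, and the `audio/wav` default MIME type are assumptions. The content-block shapes follow the MCP convention of an audio item carrying base64 data plus a MIME type.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Content-block shapes in the style of MCP tool results.
type TextContent = { type: "text"; text: string };
type AudioContent = { type: "audio"; data: string; mimeType: string };
type ToolResult = { content: Array<TextContent | AudioContent> };

// Hypothetical helper: writes the audio bytes to outputDir (default ~/Desktop)
// and returns both the saved file path and the audio inline as base64.
export function buildTtsResult(
  audio: Uint8Array,
  fileName: string,
  outputDir: string = path.join(os.homedir(), "Desktop"),
  mimeType: string = "audio/wav", // assumed format; the PR does not state it
): ToolResult {
  fs.mkdirSync(outputDir, { recursive: true });
  const filePath = path.join(outputDir, fileName);
  fs.writeFileSync(filePath, audio);

  return {
    content: [
      { type: "text", text: `Audio saved to ${filePath}` },
      { type: "audio", data: Buffer.from(audio).toString("base64"), mimeType },
    ],
  };
}
```

Returning both blocks means a client that cannot read the server's filesystem still receives the audio, while a client that can read it also gets a stable path.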

Problem

Claude Desktop runs the MCP server in a sandbox, so temp files written on the server side are inaccessible and users could not retrieve the generated audio.

🤖 Generated with Claude Code

- Save to ~/Desktop by default instead of temp dir (accessible to user)
- Return audio as MCP AudioContent (base64) so Claude Desktop can
  offer it directly without needing filesystem access
- Both file path AND inline audio returned

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Language is now required (no silent default) — description tells LLM
  to always ask the user what language the audio is in
- Description warns that chat sandbox paths don't work — LLM should ask
  for actual file path on user's machine or a URL
- Expands ~ to home directory in file paths
- Better error message for file not found (mentions sandbox limitation)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
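
As an illustration of the second commit's "language is now required" change, a tool input schema might look like the sketch below. The parameter names (`file_path`, `language`), the description wording, and the `missingRequired` helper are all hypothetical; the actual schema in the repository may be defined differently.

```typescript
// Hypothetical JSON Schema for the tool input. Note that language has no
// default and is listed under "required".
export const transcribeAudioInputSchema = {
  type: "object",
  properties: {
    file_path: {
      type: "string",
      description:
        "Path to the audio file on the user's machine, or a URL. " +
        "Chat sandbox paths do not work; ask the user for the real path.",
    },
    language: {
      type: "string",
      description:
        "Language of the audio. Always ask the user; do not guess a default.",
    },
  },
  required: ["file_path", "language"],
} as const;

// Returns the names of required parameters that are absent or empty.
export function missingRequired(args: Record<string, unknown>): string[] {
  return transcribeAudioInputSchema.required.filter(
    (key) => args[key] === undefined || args[key] === "",
  );
}
```

With no default in place, a call that omits the language is rejected up front instead of being transcribed in a silently assumed language.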
maharshi-smallest merged commit 418e6e6 into main on Apr 23, 2026
1 check passed
entelligence-ai-pr-reviews Bot commented Apr 23, 2026

EntelligenceAI PR Summary

Improves the transcribe_audio tool in src/tools/transcribe-audio.ts with UX, validation, and reliability enhancements.

  • Updated tool and parameter descriptions to explicitly instruct the LLM to ask for language and a real machine file path before invocation
  • Removed default value for language, making it a required parameter
  • Added tilde (~) expansion for file paths using process.env.HOME
  • Wrapped file read operations in try/catch with descriptive ENOENT-aware error messages
  • Converted file buffer to Uint8Array before sending as the API request body
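
The tilde expansion, ENOENT-aware error handling, and Uint8Array conversion summarized above could look something like this. It is a sketch under assumptions: the function names `expandTilde` and `readAudioFile` and the exact error wording are invented for illustration and are not taken from src/tools/transcribe-audio.ts.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Expands a leading "~" using the HOME environment variable.
// Paths such as "~otheruser/x" are returned unchanged.
export function expandTilde(
  filePath: string,
  home: string | undefined = process.env.HOME,
): string {
  if (!home) return filePath;
  if (filePath === "~") return home;
  if (filePath.startsWith("~/")) return path.join(home, filePath.slice(2));
  return filePath;
}

// Reads the file and returns its bytes as a plain Uint8Array, suitable for
// use as a request body. Missing files produce a message that mentions the
// sandbox limitation.
export function readAudioFile(filePath: string): Uint8Array {
  const resolved = expandTilde(filePath);
  try {
    const buffer = fs.readFileSync(resolved);
    return new Uint8Array(buffer);
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code === "ENOENT") {
      throw new Error(
        `File not found: ${resolved}. Paths inside the chat sandbox are not ` +
          `visible to the server; provide a path on your own machine or a URL.`,
      );
    }
    throw new Error(`Could not read ${resolved}: ${(err as Error).message}`);
  }
}
```

One caveat with this approach: `HOME` is often unset on Windows, so `os.homedir()` may be a more portable source for the home directory.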

Confidence Score: 5/5 - Safe to Merge

Safe to merge — this PR makes targeted, well-scoped improvements to src/tools/transcribe-audio.ts including tilde expansion via process.env.HOME, ENOENT-aware try/catch error handling, and stricter parameter validation by removing the default language value. The changes are defensive in nature and improve reliability without introducing new logic paths that could regress existing functionality. No review comments were generated and heuristic analysis flagged zero issues across the changed file.

Key Findings:

  • Tilde expansion using process.env.HOME is a common and safe pattern for resolving user home-directory paths, and the implementation avoids shell injection since it does not invoke a subprocess.
  • Wrapping file reads in try/catch with ENOENT-aware messaging improves UX by surfacing actionable errors rather than letting unhandled promise rejections propagate to the caller.
  • Making language a required parameter (removing its default) is a deliberate correctness improvement that forces the LLM to prompt users explicitly, reducing silent mis-transcriptions.
  • Coverage was complete (1/1 changed files reviewed) and the heuristic ceiling returned 5/5 with zero critical, significant, or medium issues identified.

Files requiring special attention

  • src/tools/transcribe-audio.ts
