feat(VoiceServer): local TTS support, voice personalities, and automatic ElevenLabs fallback #1101
salaheldinaz wants to merge 5 commits into
…lback

- Fix .env loading path: try ~/.config/PAI/.env first (PAI standard location), fall back to ~/.env; resolves api_key_configured always showing false
- Add playLocalSpeech() using macOS say command (no API key required)
- Add tts_provider config: set voiceServer.tts_provider = "local" in settings.json to use local TTS exclusively; defaults to "elevenlabs"
- Add localVoice config: voiceServer.local_voice sets the macOS voice (default: Samantha)
- Automatic fallback: when ElevenLabs fails (402, network error, etc.), server silently retries with local TTS so notifications always play
- Update /health endpoint: exposes tts_provider, local_voice, local_tts_available, renames api_key_configured → elevenlabs_api_key_configured for clarity
- Update startup log to show active TTS mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
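The lookup order described in this commit can be sketched as follows. This is a minimal illustration, not the PR's code: `resolveEnvPath` and the injected `exists` callback are hypothetical names, chosen so the ordering can be checked without touching the file system.

```typescript
// Hypothetical helper: return the first .env candidate that exists.
// `exists` is injected so the lookup order can be exercised without disk access.
function resolveEnvPath(
  homeDir: string,
  exists: (path: string) => boolean,
): string | null {
  const candidates = [
    `${homeDir}/.config/PAI/.env`, // PAI standard location, tried first
    `${homeDir}/.env`,             // older location, used only as a fallback
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return null; // neither file found, so no API key can be loaded
}

// Example: only the older file is present (paths are made up for illustration).
const present = new Set(["/Users/me/.env"]);
const chosen = resolveEnvPath("/Users/me", (p) => present.has(p));
```

In a real server the callback would presumably wrap a file-existence check; here it only demonstrates that the PAI location wins whenever both files exist.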
…local endpoint

- Add curated LOCAL_VOICE_CATALOGUE with 22 realistic English voices grouped by category (natural/classic) with accent, gender, and personality descriptions
- Add getInstalledLocalVoices() that cross-references the catalogue against voices actually installed on the system via `say -v ?`
- Add GET /voices/local endpoint: returns full catalogue with active voice marked and instructions for switching (set voiceServer.local_voice in settings.json)
- Surface /voices/local in the root response for discoverability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
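To make the cross-referencing step concrete, here is a rough sketch of how the text printed by `say -v ?` could be matched against a catalogue. It is an assumption-laden illustration: `parseSayVoices`, `markInstalled`, the `CatalogueEntry` shape, and the two catalogue entries are hypothetical, and the parser assumes each line looks like `<name>  <locale>  # <sample sentence>`.

```typescript
// Hypothetical catalogue entry; the real LOCAL_VOICE_CATALOGUE also carries
// accent, gender, and personality descriptions.
interface CatalogueEntry {
  name: string;
  category: "natural" | "classic";
}

// Extract voice names from `say -v ?` output.
// Assumed line shape: "<voice name>   <locale>   # <sample sentence>".
function parseSayVoices(output: string): string[] {
  const names: string[] = [];
  for (const line of output.split("\n")) {
    const match = line.match(/^(.+?)\s+[a-z]{2,3}_[A-Za-z0-9]+\s+#/);
    const name = match?.[1];
    if (name) names.push(name.trim());
  }
  return names;
}

// Keep only catalogue voices that are installed and flag the active one.
function markInstalled(
  catalogue: CatalogueEntry[],
  installed: string[],
  activeVoice: string,
) {
  const installedSet = new Set(installed);
  return catalogue
    .filter((entry) => installedSet.has(entry.name))
    .map((entry) => ({ ...entry, active: entry.name === activeVoice }));
}

// Sample input; the category labels below are illustrative guesses.
const sampleOutput = [
  "Alex                en_US    # Most people recognize me by my voice.",
  "Bad News            en_US    # The light you see is a train.",
  "Samantha            en_US    # Hello, my name is Samantha.",
].join("\n");
const catalogue: CatalogueEntry[] = [
  { name: "Samantha", category: "classic" },
  { name: "Daniel", category: "classic" },
];
const voices = markInstalled(catalogue, parseSayVoices(sampleOutput), "Samantha");
```

With this sample, `Daniel` is dropped because it does not appear in the `say` output, and `Samantha` is returned with `active: true`.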
Adds kokoro-fastapi as a third TTS provider alongside ElevenLabs and local macOS say. Configure via settings.json voiceServer.tts_provider, kokoro_url, and kokoro_voice. Falls back to local TTS on connection failure. Also extracts shared preprocessForTTS() helper (removes pronunciation log duplication between providers) and fixes a temp file leak in playAudio's error handler. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…roSpeech

- Extract shared preprocessForTTS() helper used by all TTS providers
- Add generateKokoroSpeech() using kokoro-fastapi OpenAI-compatible API
- Kokoro provider falls back to local TTS on failure
- Config: kokoro_url (default localhost:8880), kokoro_voice (default af_sky)
- Health endpoint now reports kokoro_url and kokoro_voice

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
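For readers unfamiliar with the OpenAI-compatible speech API, the request that a function like generateKokoroSpeech() would send might look roughly like this. The sketch assumes kokoro-fastapi accepts `POST /v1/audio/speech` with `model`, `input`, `voice`, and `response_format` fields; `buildKokoroRequest`, the `http://` scheme, the `"kokoro"` model name, and the `mp3` format are assumptions rather than details taken from this PR.

```typescript
// Hypothetical config shape mirroring kokoro_url and kokoro_voice.
interface KokoroConfig {
  kokoroUrl: string;   // assumed to include the scheme, e.g. "http://localhost:8880"
  kokoroVoice: string; // e.g. "af_sky"
}

// Plain description of the HTTP call, so it can be inspected before sending.
interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildKokoroRequest(text: string, config: KokoroConfig): SpeechRequest {
  const base = config.kokoroUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/v1/audio/speech`, // assumed OpenAI-compatible endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "kokoro",          // assumed model name
      input: text,
      voice: config.kokoroVoice,
      response_format: "mp3",   // assumed output format
    }),
  };
}

const kokoroRequest = buildKokoroRequest("Build finished", {
  kokoroUrl: "http://localhost:8880/",
  kokoroVoice: "af_sky",
});
```

The returned object can be passed to `fetch(kokoroRequest.url, kokoroRequest)`; on a connection error or non-2xx response the caller would fall back to local TTS, as the commit describes.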
Direct all voice notification traffic to the standalone VoiceServer (localhost:8888) instead of Pulse (localhost:31337) so that the custom kokoro-fastapi + local TTS fallback stack is used for voice output.

- hooks/handlers/VoiceNotification.ts: 31337 → 8888
- PAI/ALGORITHM/v6.3.0.md: phase-announcement curl 31337 → 8888

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hey @salaheldinaz, thanks for raising this, and sorry it sat for a while.

We're changing how LifeOS ships. Instead of cloning a full … That's aimed right at what you hit here. The old "one directory, one layout, hope it matches your setup" approach is exactly what broke for so many people, and the new model should handle it far better because your AI does the integration per machine instead of us guessing.

So we're closing this in prep for that release. If it still bites you once the skill-based version is out, reopen or file a fresh one and we'll jump on it. Appreciate you taking the time.
Summary
- Fix `.env` loading: the server read `~/.env`, but PAI stores the key at `~/.config/PAI/.env`, causing `elevenlabs_api_key_configured` to always report `false` and silently skipping all TTS audio
- Local TTS through the macOS `say` command (zero dependencies, no API key) via new `playLocalSpeech()`
- Curated catalogue of local voices in `natural` and `classic` categories
- `GET /voices/local` endpoint: lists voices installed on the system with the active voice marked and instructions for switching
- `/health` response: now exposes `tts_provider`, `local_voice`, `local_tts_available`; renames `api_key_configured` → `elevenlabs_api_key_configured`

Configuration
No breaking changes. Existing installs behave identically. To customise local TTS, set `voiceServer.tts_provider` and `voiceServer.local_voice` in `~/.claude/settings.json`.

`tts_provider` options: `"elevenlabs"` (default: ElevenLabs with local fallback) or `"local"` (always local, no API call).

Browse the voices available on your system with `GET /voices/local`.
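As an illustration of how these two settings and their defaults could be resolved, consider the sketch below. It is not the PR's implementation: `resolveVoiceConfig` and the `chain` field are hypothetical, and treating any unrecognised `tts_provider` value as `"elevenlabs"` is an assumption. Only the two options described here are modelled.

```typescript
type TtsProvider = "elevenlabs" | "local";

// Subset of settings.json relevant to local TTS.
interface VoiceServerSettings {
  tts_provider?: string;
  local_voice?: string;
}

interface ResolvedVoiceConfig {
  provider: TtsProvider;
  localVoice: string;
  chain: TtsProvider[]; // backends to try, in order
}

function resolveVoiceConfig(
  settings: { voiceServer?: VoiceServerSettings },
): ResolvedVoiceConfig {
  const raw = settings.voiceServer?.tts_provider;
  // Assumption: anything other than "local" falls back to the default provider.
  const provider: TtsProvider = raw === "local" ? "local" : "elevenlabs";
  // Samantha is the documented default macOS voice.
  const localVoice = settings.voiceServer?.local_voice ?? "Samantha";
  // "elevenlabs" keeps local TTS as a fallback; "local" never calls the API.
  const chain: TtsProvider[] =
    provider === "local" ? ["local"] : ["elevenlabs", "local"];
  return { provider, localVoice, chain };
}

// An install with no voiceServer section keeps the existing behaviour.
const defaults = resolveVoiceConfig({});
// Opting in to local-only TTS with a different voice.
const localOnly = resolveVoiceConfig({
  voiceServer: { tts_provider: "local", local_voice: "Daniel" },
});
```

Here `defaults.chain` is `["elevenlabs", "local"]` with voice `Samantha`, while `localOnly.chain` is `["local"]` with voice `Daniel`.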
Test plan
- `/health` returns `local_tts_available: true` and correct `tts_provider`
- `GET /voices/local` returns catalogue with active voice marked
- … `200`
- `tts_provider: "local"` → local TTS used directly, no ElevenLabs call
- `tts_provider: "elevenlabs"` with valid paid key → ElevenLabs used, unchanged behaviour
- `.env` loaded correctly from `~/.config/PAI/.env`

🤖 Generated with Claude Code