18 changes: 11 additions & 7 deletions text-to-speech/README.md
@@ -1,19 +1,21 @@
# Text-to-Speech

> **Powered by [Lightning TTS v3.1](https://waves-docs.smallest.ai/v4.0.0/content/api-references/lightning-v3.1)**
> **Powered by [Lightning TTS v3.1](https://docs.smallest.ai/waves/model-cards/text-to-speech/lightning-v-3-1) and the new [Lightning v3.1 Pro](https://docs.smallest.ai/waves/model-cards/text-to-speech/lightning-v-3-1-pro) pool.**

Generate natural-sounding speech from text using Smallest AI's Lightning TTS API. 80+ voices, 44.1 kHz native sample rate, ~200ms latency.
Generate natural-sounding speech from text using Smallest AI's Lightning TTS API. 80+ voices on standard Lightning v3.1, plus a curated Pro voice catalog across American, British, and Indian accents. 44.1 kHz native sample rate, ~200ms latency.

## Try It Now (Zero Install)

```bash
curl -X POST https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech \
curl -X POST https://api.smallest.ai/waves/v1/tts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "Hello from Smallest AI!", "voice_id": "sophia", "sample_rate": 24000, "output_format": "wav"}' \
-d '{"text": "Hello from Smallest AI!", "voice_id": "meher", "model": "lightning_v3.1_pro", "sample_rate": 24000, "output_format": "wav"}' \
--output hello.wav
```

Drop the `model` field (or set it to `"lightning_v3.1"`) to use the standard pool — that pool has more voices, the full 12-language catalog, plus voice cloning. The unified `/waves/v1/tts` route serves both.
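The pool selection described above can be sketched as a small payload builder. This is a minimal sketch, assuming only the request fields shown in the curl example; `build_tts_payload` and the two constants are hypothetical names for illustration, not part of any Smallest AI SDK.

```python
from typing import Optional

# Model identifiers as they appear in the README examples.
STANDARD_MODEL = "lightning_v3.1"
PRO_MODEL = "lightning_v3.1_pro"


def build_tts_payload(
    text: str,
    voice_id: str,
    model: Optional[str] = None,
    sample_rate: int = 24000,
    output_format: str = "wav",
) -> dict:
    """Build a JSON body for POST /waves/v1/tts (hypothetical helper)."""
    payload = {
        "text": text,
        "voice_id": voice_id,
        "sample_rate": sample_rate,
        "output_format": output_format,
    }
    # Per the README, leaving `model` out selects the standard pool,
    # so the key is only added when a model is given explicitly.
    if model is not None:
        payload["model"] = model
    return payload
```

Passing `model=PRO_MODEL` reproduces the body used in the curl example; calling the helper without `model` yields a body that targets the standard pool.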

Get your API key at [app.smallest.ai](https://app.smallest.ai/dashboard/settings/apikeys).

## Quickstart
@@ -75,16 +77,18 @@ uv run text-to-speech/getting-started/python/synthesize.py "Hello from Smallest

## Supported Languages

`en` English · `hi` Hindi · `es` Spanish · `ta` Tamil
Standard Lightning v3.1: `en` English · `hi` Hindi · `mr` Marathi · `kn` Kannada · `ta` Tamil · `bn` Bengali · `gu` Gujarati · `te` Telugu · `ml` Malayalam · `pa` Punjabi · `or` Odia · `es` Spanish · `auto`

Lightning v3.1 Pro: depends on the voice — Indian Pro voices speak `en` + `hi`; British and American Pro voices speak `en`. Query `GET /waves/v1/lightning-v3.1/get_voices` and read `tags.language` for the source of truth.
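The per-voice language check can be sketched as a lookup on a voice record. This is a rough sketch under a stated assumption: the exact shape of the `get_voices` response is not shown here, so the code assumes each voice is a dict whose `tags.language` is either a list of language codes or a single code. `voice_supports` and the sample records are hypothetical.

```python
def voice_supports(voice: dict, language: str) -> bool:
    """Return True if the voice record lists `language` under tags.language.

    Assumes `tags.language` is a list of codes or a single code string;
    the real response shape should be checked against the Voices API docs.
    """
    langs = (voice.get("tags") or {}).get("language") or []
    if isinstance(langs, str):
        langs = [langs]
    return language in langs


# Hypothetical records shaped like the assumed get_voices output.
indian_pro = {"voice_id": "example_indian_pro", "tags": {"language": ["en", "hi"]}}
british_pro = {"voice_id": "example_british_pro", "tags": {"language": ["en"]}}
```

Checking the voice before sending a request avoids asking a British or American Pro voice for `hi`.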

## Output Formats

`pcm` (raw) · `wav` · `mp3` · `mulaw`

## Documentation

- [Lightning v3.1 REST](https://waves-docs.smallest.ai/v4.0.0/content/api-references/lightning-v3.1)
- [Lightning v3.1 WebSocket](https://waves-docs.smallest.ai/v4.0.0/content/api-references/lightning-v3.1-ws)
- [Lightning v3.1 REST](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/synthesize-speech)
- [Lightning v3.1 WebSocket](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/synthesize-speech-ws)
- [Voices API](https://waves-docs.smallest.ai/v4.0.0/content/api-references/get-voices-api)
- [Voice Cloning](https://waves-docs.smallest.ai/v4.0.0/content/api-references/voice-cloning-api)
- [Pronunciation Dicts](https://waves-docs.smallest.ai/v4.0.0/content/api-references/pronunciation-dicts-api)
12 changes: 7 additions & 5 deletions text-to-speech/getting-started/README.md
@@ -7,7 +7,7 @@ The simplest way to generate speech from text using Smallest AI's Lightning TTS
- Generate speech from text with a single API call
- Save output as WAV file
- Choose voice, speed, and language
- Uses Lightning v3.1 for highest quality output
- Uses the unified `/waves/v1/tts` route. The script defaults to the Lightning v3.1 Pro pool; set `MODEL` to `"lightning_v3.1"` to use the standard pool instead

## Requirements

@@ -40,16 +40,18 @@ Output is saved to `output.wav` in the current directory.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `MODEL` | TTS model | `lightning-v3.1` |
| `VOICE_ID` | Voice to use (see [Voices](../voices/)) | `sophia` |
| `MODEL` | TTS pool (`lightning_v3.1_pro` or `lightning_v3.1`) | `lightning_v3.1_pro` |
| `VOICE_ID` | Voice to use (see [Voices](../voices/)) | `meher` |
| `SPEED` | Playback speed (0.5 to 2.0) | `1.0` |
| `SAMPLE_RATE` | Audio sample rate in Hz | `24000` |
| `LANGUAGE` | Language code (`en`, `hi`, `es`, `ta`) | `en` |
| `LANGUAGE` | Language code — Pro pool: `en` (all voices) or `hi` (Indian voices); switch `MODEL` to `lightning_v3.1` to use `es`, `ta`, and 9 more | `en` |
| `OUTPUT_FORMAT` | Output format (`wav`, `pcm`, `mp3`, `mulaw`) | `wav` |
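The constraints in the table above can be checked before a request is sent. A minimal sketch, assuming only the values the table lists (two model names, speed from 0.5 to 2.0, four output formats); `validate_config` is a hypothetical helper and the API may enforce further rules.

```python
VALID_MODELS = {"lightning_v3.1_pro", "lightning_v3.1"}
VALID_FORMATS = {"wav", "pcm", "mp3", "mulaw"}


def validate_config(model: str, speed: float, output_format: str) -> list[str]:
    """Return a list of human-readable problems; empty means the config looks valid."""
    errors = []
    if model not in VALID_MODELS:
        errors.append(f"unknown MODEL {model!r}")
    if not 0.5 <= speed <= 2.0:
        errors.append(f"SPEED {speed} is outside 0.5 to 2.0")
    if output_format not in VALID_FORMATS:
        errors.append(f"unknown OUTPUT_FORMAT {output_format!r}")
    return errors
```

Note that the old hyphenated name `lightning-v3.1` is reported as unknown, which catches configs that were not migrated to the underscore form.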

## API Reference

- [Lightning v3.1 API](https://waves-docs.smallest.ai/v4.0.0/content/api-references/lightning-v3.1)
- [Synthesize speech (unified `/waves/v1/tts`)](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/synthesize-speech)
- [Lightning v3.1 model card](https://docs.smallest.ai/waves/model-cards/text-to-speech/lightning-v-3-1)
- [Lightning v3.1 Pro model card](https://docs.smallest.ai/waves/model-cards/text-to-speech/lightning-v-3-1-pro)

## Next Steps

12 changes: 8 additions & 4 deletions text-to-speech/getting-started/javascript/synthesize.js
@@ -13,17 +13,20 @@
const fs = require("fs");

// Configuration
const MODEL = "lightning-v3.1";
const VOICE_ID = "sophia";
// Set MODEL to "lightning_v3.1_pro" for the Pro pool (curated voices, en + hi
// on Indian voices, en on British/American voices). Set to "lightning_v3.1"
// for the standard pool (more voices, full 12-language catalog, voice cloning).
const MODEL = "lightning_v3.1_pro";
const VOICE_ID = "meher";
const SPEED = 1.0;
const SAMPLE_RATE = 24000;
const LANGUAGE = "en"; // en, hi, es, ta
const LANGUAGE = "en"; // en, hi, es, ta (per voice; see voice tags via /get_voices)
const OUTPUT_FORMAT = "wav";

const API_BASE = "https://api.smallest.ai/waves/v1";

async function synthesize(text, apiKey) {
const response = await fetch(`${API_BASE}/${MODEL}/get_speech`, {
const response = await fetch(`${API_BASE}/tts`, {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
@@ -32,6 +35,7 @@ async function synthesize(text, apiKey) {
body: JSON.stringify({
text,
voice_id: VOICE_ID,
model: MODEL,
speed: SPEED,
sample_rate: SAMPLE_RATE,
language: LANGUAGE,
12 changes: 8 additions & 4 deletions text-to-speech/getting-started/python/synthesize.py
@@ -18,26 +18,30 @@
load_dotenv()

# Configuration
MODEL = "lightning-v3.1"
VOICE_ID = "sophia"
# Set MODEL to "lightning_v3.1_pro" for the Pro pool (curated voices, en + hi
# on Indian voices, en on British/American voices). Set to "lightning_v3.1"
# for the standard pool (more voices, full 12-language catalog, voice cloning).
MODEL = "lightning_v3.1_pro"
VOICE_ID = "meher"
SPEED = 1.0
SAMPLE_RATE = 24000
LANGUAGE = "en" # en, hi, es, ta
LANGUAGE = "en" # en, hi, es, ta (per voice; see voice tags via /get_voices)
OUTPUT_FORMAT = "wav"

API_BASE = "https://api.smallest.ai/waves/v1"


def synthesize(text: str, api_key: str) -> bytes:
response = requests.post(
f"{API_BASE}/{MODEL}/get_speech",
f"{API_BASE}/tts",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
},
json={
"text": text,
"voice_id": VOICE_ID,
"model": MODEL,
"speed": SPEED,
"sample_rate": SAMPLE_RATE,
"language": LANGUAGE,
8 changes: 5 additions & 3 deletions text-to-speech/quickstart-curl.sh
@@ -1,17 +1,19 @@
#!/bin/bash
# Text-to-Speech Quickstart — cURL
# Generate speech from text using Lightning v3.1.
# Generate speech from text using the unified Lightning TTS route.
# Targets the Lightning v3.1 Pro pool — drop the `model` field to use the
# standard Lightning v3.1 pool instead.
#
# Usage:
# export SMALLEST_API_KEY="your-api-key"
# bash quickstart-curl.sh
#
# Docs: https://docs.smallest.ai/waves/documentation/text-to-speech-lightning/quickstart

curl -X POST "https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech" \
curl -X POST "https://api.smallest.ai/waves/v1/tts" \
-H "Authorization: Bearer $SMALLEST_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text":"Modern problems require modern solutions.","voice_id":"magnus","sample_rate":24000,"speed":1.0,"language":"en","output_format":"wav"}' \
-d '{"text":"Modern problems require modern solutions.","voice_id":"meher","model":"lightning_v3.1_pro","sample_rate":24000,"speed":1.0,"language":"en","output_format":"wav"}' \
--output output.wav

echo "Saved output.wav ($(wc -c < output.wav) bytes)"
10 changes: 7 additions & 3 deletions text-to-speech/quickstart-javascript.js
@@ -1,6 +1,9 @@
/**
* Text-to-Speech Quickstart — JavaScript
* Generate speech from text using Lightning v3.1.
* Generate speech using the unified Lightning TTS route.
*
* Targets the Lightning v3.1 Pro pool — drop the `model` field (or set it to
* `"lightning_v3.1"`) to use the standard Lightning v3.1 pool instead.
*
* Usage:
* export SMALLEST_API_KEY="your-api-key"
@@ -14,7 +17,7 @@ const fs = require("fs");
const API_KEY = process.env.SMALLEST_API_KEY;

const response = await fetch(
"https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech",
"https://api.smallest.ai/waves/v1/tts",
{
method: "POST",
headers: {
@@ -23,7 +26,8 @@ const response = await fetch(
},
body: JSON.stringify({
text: "Modern problems require modern solutions.",
voice_id: "magnus",
voice_id: "meher",
model: "lightning_v3.1_pro",
sample_rate: 24000,
speed: 1.0,
language: "en",
10 changes: 7 additions & 3 deletions text-to-speech/quickstart-python.py
@@ -1,6 +1,9 @@
"""
Text-to-Speech Quickstart — Python
Generate speech from text using Lightning v3.1.
Generate speech using the unified Lightning TTS route.

Targets the Lightning v3.1 Pro pool — drop the `model` field (or set it to
`"lightning_v3.1"`) to use the standard Lightning v3.1 pool instead.

Usage:
export SMALLEST_API_KEY="your-api-key"
@@ -15,14 +18,15 @@
API_KEY = os.environ["SMALLEST_API_KEY"]

response = requests.post(
"https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech",
"https://api.smallest.ai/waves/v1/tts",
headers={
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
},
json={
"text": "Modern problems require modern solutions.",
"voice_id": "magnus",
"voice_id": "meher",
"model": "lightning_v3.1_pro",
"sample_rate": 24000,
"speed": 1.0,
"language": "en",
6 changes: 4 additions & 2 deletions text-to-speech/quickstart/README.md
@@ -7,15 +7,17 @@ Get your API key at [app.smallest.ai](https://app.smallest.ai/dashboard/settings
## curl (Fastest — zero install)

```bash
curl -X POST https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech \
curl -X POST https://api.smallest.ai/waves/v1/tts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "Hello from Smallest AI!", "voice_id": "sophia", "sample_rate": 24000, "output_format": "wav"}' \
-d '{"text": "Hello from Smallest AI!", "voice_id": "meher", "model": "lightning_v3.1_pro", "sample_rate": 24000, "output_format": "wav"}' \
--output hello.wav && echo "Done! Play hello.wav"
```

Replace `YOUR_API_KEY` with your key. That's it: you should have a WAV file within a couple of seconds.

> The example above uses the Lightning v3.1 Pro pool. Omit `"model"` (or set it to `"lightning_v3.1"`) to use the standard pool — that one has more voices, the full 12-language catalog, plus voice cloning.

## Python

```bash
4 changes: 2 additions & 2 deletions text-to-speech/quickstart/quickstart.js
@@ -7,10 +7,10 @@ if (!apiKey) {
process.exit(1);
}

fetch("https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech", {
fetch("https://api.smallest.ai/waves/v1/tts", {
method: "POST",
headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
body: JSON.stringify({ text: "Hello! Welcome to Smallest AI. This is your first text-to-speech generation.", voice_id: "sophia", sample_rate: 24000, output_format: "wav" }),
body: JSON.stringify({ text: "Hello! Welcome to Smallest AI. This is your first text-to-speech generation.", voice_id: "meher", model: "lightning_v3.1_pro", sample_rate: 24000, output_format: "wav" }),
})
.then((r) => {
if (!r.ok) return r.text().then((t) => { throw new Error(`API error (${r.status}): ${t}`); });
5 changes: 3 additions & 2 deletions text-to-speech/quickstart/quickstart.py
@@ -10,14 +10,15 @@
sys.exit(1)

response = requests.post(
"https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech",
"https://api.smallest.ai/waves/v1/tts",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
},
json={
"text": "Hello! Welcome to Smallest AI. This is your first text-to-speech generation.",
"voice_id": "sophia",
"voice_id": "meher",
"model": "lightning_v3.1_pro",
"sample_rate": 24000,
"output_format": "wav",
},
40 changes: 33 additions & 7 deletions text-to-speech/streaming-python.py
@@ -2,6 +2,10 @@
TTS Streaming — Python
Stream speech via SSE (Server-Sent Events) for real-time playback.

Targets the Lightning v3.1 Pro pool on the unified /waves/v1/tts/live
endpoint. Drop the `model` field (or set it to "lightning_v3.1") to use the
standard Lightning v3.1 pool instead.

Usage:
export SMALLEST_API_KEY="your-api-key"
pip install requests
@@ -10,35 +14,57 @@
Docs: https://docs.smallest.ai/waves/documentation/text-to-speech-lightning/streaming
"""

import base64
import json
import os
import wave

import requests

API_KEY = os.environ["SMALLEST_API_KEY"]

response = requests.post(
"https://api.smallest.ai/waves/v1/lightning-v3.1/stream",
"https://api.smallest.ai/waves/v1/tts/live",
headers={
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
},
json={
"text": "Modern problems require modern solutions.",
"voice_id": "magnus",
"voice_id": "meher",
"model": "lightning_v3.1_pro",
"sample_rate": 24000,
},
stream=True,
)

response.raise_for_status()

# SSE frames look like:
# event: audio
# data: {"audio": "<base64-encoded PCM chunk>"}
#
# data: {"done": true}
# We collect base64 PCM payloads then write them with a WAV header.
chunks: list[bytes] = []
for line in response.iter_lines():
if not line:
continue
decoded = line.decode("utf-8", "replace")
if not decoded.startswith("data:"):
continue
payload = json.loads(decoded[5:].strip())
if payload.get("done"):
break
audio_b64 = payload.get("audio")
if audio_b64:
chunks.append(base64.b64decode(audio_b64))

pcm = b"".join(chunks)
with wave.open("streamed.wav", "wb") as wf:
wf.setnchannels(1)
wf.setsampwidth(2)
wf.setframerate(24000)
total = 0
for chunk in response.iter_content(chunk_size=4096):
wf.writeframes(chunk)
total += len(chunk)
wf.writeframes(pcm)

print(f"Saved streamed.wav ({total:,} bytes)")
print(f"Saved streamed.wav ({len(pcm):,} PCM bytes from {len(chunks)} chunks)")
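The SSE parsing loop in this script can be exercised offline by factoring it into a function that takes any iterable of raw lines. A minimal sketch, assuming the frame shape described in the script's comments (`data:` lines carrying JSON with a base64 `audio` field, ending with `{"done": true}`); `collect_pcm` and the simulated frames are hypothetical.

```python
import base64
import json
from typing import Iterable


def collect_pcm(lines: Iterable[bytes]) -> bytes:
    """Concatenate base64-decoded audio payloads from SSE `data:` lines."""
    chunks = []
    for line in lines:
        if not line:
            continue  # blank lines separate SSE events
        decoded = line.decode("utf-8", "replace")
        if not decoded.startswith("data:"):
            continue  # skip `event:` and other non-data fields
        payload = json.loads(decoded[5:].strip())
        if payload.get("done"):
            break
        audio_b64 = payload.get("audio")
        if audio_b64:
            chunks.append(base64.b64decode(audio_b64))
    return b"".join(chunks)


# Simulated frames standing in for response.iter_lines().
frames = [
    b"event: audio",
    b'data: {"audio": "' + base64.b64encode(b"\x01\x02") + b'"}',
    b"",
    b'data: {"audio": "' + base64.b64encode(b"\x03\x04") + b'"}',
    b'data: {"done": true}',
]
```

In the script, `collect_pcm(response.iter_lines())` would stand in for the inline loop, and the same function can be tested without a network call.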
10 changes: 6 additions & 4 deletions text-to-speech/streaming/README.md
@@ -60,15 +60,17 @@ node stream_ws.js "This text will be streamed via WebSocket."

| Parameter | Description | Default |
|-----------|-------------|---------|
| `MODEL` | TTS model | `lightning-v3.1` |
| `VOICE_ID` | Voice to use | `sophia` |
| `MODEL` | TTS pool (`lightning_v3.1_pro` or `lightning_v3.1`) | `lightning_v3.1_pro` |
| `VOICE_ID` | Voice to use | `meher` |
| `SAMPLE_RATE` | Audio sample rate in Hz | `24000` |
| `SPEED` | Playback speed (0.5–2.0) | `1.0` |
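When sizing playback buffers, the relation between raw PCM byte count and audio duration at a given `SAMPLE_RATE` can be sketched as below. This assumes 16-bit mono PCM, matching the `setsampwidth(2)` and `setnchannels(1)` calls in the streaming quickstart script; `pcm_duration_seconds` is a hypothetical helper.

```python
def pcm_duration_seconds(
    num_bytes: int,
    sample_rate: int = 24000,
    sample_width: int = 2,  # bytes per sample (16-bit), an assumption
    channels: int = 1,      # mono, an assumption
) -> float:
    """Duration of raw PCM audio in seconds."""
    bytes_per_second = sample_rate * sample_width * channels
    return num_bytes / bytes_per_second
```

At the default 24000 Hz, one second of audio is 48,000 bytes, so a 4,800-byte chunk holds a tenth of a second.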

## API Reference

- [Lightning v3.1 SSE Streaming](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/stream-lightning-v-31-speech)
- [Lightning v3.1 WebSocket](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/text-to-speech-v-3-1)
- [Synthesize speech (REST sync)](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/synthesize-speech)
- [Stream speech (SSE)](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/synthesize-speech-sse)
- [Live TTS WebSocket](https://docs.smallest.ai/waves/api-reference/api-reference/text-to-speech/live-tts-web-socket)
- [Lightning v3.1 Pro model card](https://docs.smallest.ai/waves/model-cards/text-to-speech/lightning-v-3-1-pro)

## Next Steps
