feat(stt): interactive VAD events demo on the feature page by abhishekmishragithub · Pull Request #272 · smallest-inc/smallest-ai-documentation

abhishekmishragithub · 2026-06-23T09:51:49Z

Stacks on top of #271. Adds an interactive React component to the VAD events page that lets readers play sample audio and see the captured server message stream in sync.

What ships

| File | Purpose |
|---|---|
| `fern/components/vad-events-demo/VadEventsDemo.tsx` | React component: SVG waveform, native audio playback, and a scrolling event log. ~16 KB. |
| `fern/components/vad-events-demo/assets/{clean,multi-turn,no-tail}.mp3` | Three audio samples. ~210 KB total. |
| `fern/components/vad-events-demo/fixtures/{clean,multi-turn,no-tail}.json` | Captured live event streams plus precomputed amplitude arrays. ~20 KB total. |
| `scripts/spec-live-tests/prep_vad_demo_fixtures.py` | Reproducer. Downloads source audio, builds three variants with ffmpeg, runs each through the live Pulse STT WebSocket, and writes the fixtures. |
| `fern/products/waves/.../vad-events.mdx` + mirror | Adds the import and renders the component under a "Try it" section. |

Three fixtures, three behaviors

| Fixture | speech_started | speech_ended | Transcripts | Illustrates |
|---|---|---|---|---|
| clean | 1 | 1 | 7 | Normal single-utterance case. |
| multi-turn | 2 | 2 | 13 | Per-voiced-region behavior across two turns. |
| no-tail | 1 | 0 | 6 | The "speech_ended requires trailing silence" caveat from Notes. |

Every event payload in the fixtures comes directly from the production endpoint. The component does not connect to a WebSocket at runtime; everything is static.
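The counts in the table above can be checked mechanically against a fixture's event list. The following is a minimal sketch, assuming each captured message carries a string `type` field. The PR does not name the discriminator, so `type`, the `transcription` value, and the `countEvents` helper are illustrative and not taken from the component.

```typescript
// Hypothetical message shape: the field name `type` is an assumption.
interface CapturedMessage {
  type: string;
}

// Tally messages by their `type` value.
function countEvents(events: CapturedMessage[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const event of events) {
    counts[event.type] = (counts[event.type] ?? 0) + 1;
  }
  return counts;
}

// Stand-in for a "no-tail"-style stream (placeholder data, not the real fixture):
// speech starts, transcripts arrive, and no speech_ended follows.
const noTailLike: CapturedMessage[] = [
  { type: "speech_started" },
  { type: "transcription" },
  { type: "transcription" },
];

const counts = countEvents(noTailLike);
const endedCount = counts["speech_ended"] ?? 0; // 0 for this stand-in stream
```

A check like this could run in the prep script or in CI so that a regenerated fixture that no longer shows the intended behavior is caught before it ships.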

Behavior

- Sample picker: a row of chips at the top switches between fixtures. Each chip shows a title and a one-line subtitle.
- Waveform: rendered as 200 SVG bars from a precomputed peak-amplitude array. Vertical dashed markers sit at the timestamps of every `speech_started` and `speech_ended` event.
- Play button: drives a native `<audio>` element. A live playhead moves across the waveform.
- Event log: scrolls in sync with playback. The row whose timestamp matches the playhead is highlighted and auto-scrolled into view. Rows the playhead has already reached are shown at full opacity; later rows are dimmed.
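The waveform and marker geometry described above can be sketched as follows. This is an illustration under assumptions: the PR does not show how the prep script computes the amplitude array, so taking the maximum absolute sample per slice is one common choice rather than the confirmed method, and `peakAmplitudes` and `markerX` are hypothetical helper names.

```typescript
const BAR_COUNT = 200; // bar count stated in the PR description

// Reduce raw samples to `barCount` peaks: each bar is the max |sample| of its slice.
function peakAmplitudes(samples: number[], barCount: number = BAR_COUNT): number[] {
  const peaks: number[] = [];
  for (let i = 0; i < barCount; i++) {
    const start = Math.floor((i * samples.length) / barCount);
    // Guarantee at least one sample per bar when there are fewer samples than bars.
    const end = Math.max(start + 1, Math.floor(((i + 1) * samples.length) / barCount));
    let peak = 0;
    for (let j = start; j < end && j < samples.length; j++) {
      peak = Math.max(peak, Math.abs(samples[j]));
    }
    peaks.push(peak);
  }
  return peaks;
}

// Map an event timestamp (seconds) to an x coordinate inside the SVG viewBox,
// clamped so markers never fall outside the waveform.
function markerX(timestampSec: number, durationSec: number, viewBoxWidth: number): number {
  if (durationSec <= 0) return 0;
  const ratio = Math.min(1, Math.max(0, timestampSec / durationSec));
  return ratio * viewBoxWidth;
}

const bars = peakAmplitudes([0, -0.5, 0.25, 1], 2); // [0.5, 1]
const x = markerX(2, 8, 1000); // 250
```

Because marker positions are a fraction of the viewBox width, they stay aligned with the bars when the SVG is stretched with `preserveAspectRatio="none"`.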

Why this design

Two ideas, one demo:

- Timing: the waveform and markers make the acoustic-boundary concept tactile. The reader hears the sample audio and sees where the events fire on the wave.
- Wire shape: the event log shows the actual JSON each customer would receive, with the same discriminator field they'd switch on in their code.
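To make the "wire shape" point concrete, here is a sketch of how a client might switch on the discriminator. Only the event names `speech_started` and `speech_ended` come from this PR; the field names `type`, `timestamp`, and `transcript`, and the `transcription` value, are assumptions for illustration and should be checked against the payloads documented on the page.

```typescript
// Hypothetical wire shapes: field names are assumptions, not the documented schema.
type ServerMessage =
  | { type: "speech_started"; timestamp: number }
  | { type: "speech_ended"; timestamp: number }
  | { type: "transcription"; transcript: string };

// The switch is exhaustive, so the compiler flags any message kind left unhandled.
function describe(message: ServerMessage): string {
  switch (message.type) {
    case "speech_started":
      return `speech started at ${message.timestamp}s`;
    case "speech_ended":
      return `speech ended at ${message.timestamp}s`;
    case "transcription":
      return `transcript: ${message.transcript}`;
  }
}

// A raw message as it might arrive over the socket (placeholder payload).
const parsed = JSON.parse('{"type":"speech_started","timestamp":0.032}') as ServerMessage;
const summary = describe(parsed);
```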

The Mermaid sequence diagram below the demo is the formal contract; the interactive demo is the hands-on layer above it.

Verification

- `fern check`: 0 errors.
- Live fixture regeneration verified: all three variants captured with expected event counts. Replayable via `SMALLEST_API_KEY=... python3 scripts/spec-live-tests/prep_vad_demo_fixtures.py`.
- v4-mirror diff: empty.
- llms.txt regenerated; in sync.

Test plan

- `fern check` green.
- Render check on Vercel preview: confirm the component renders, the play button plays audio, markers display at the right positions, and the event log highlights in sync.
- Confirm the preview works on both desktop and mobile (the waveform SVG uses `preserveAspectRatio="none"` so it should scale; the fixture chips wrap on narrow screens).

Adds a React component that plays one of three pre-recorded audio samples and renders the captured server message stream in sync with playback.

Three fixtures show the three behaviors documented on the page:

- clean : 1 speech_started + 1 speech_ended (normal case)
- multi-turn : 2 of each (per-voiced-region behavior)
- no-tail : speech_started fires; speech_ended does not (caveat)

Each fixture is captured live from the production WebSocket; the JSON files are the actual recorded message stream. No client-side WebSocket connection at runtime; everything ships as static assets.

## Files

- fern/components/vad-events-demo/VadEventsDemo.tsx
- fern/components/vad-events-demo/assets/{clean,multi-turn,no-tail}.mp3
- fern/components/vad-events-demo/fixtures/{clean,multi-turn,no-tail}.json
- scripts/spec-live-tests/prep_vad_demo_fixtures.py
  Regenerates audio + fixtures by running each variant through the live Pulse STT endpoint. Requires ffmpeg + SMALLEST_API_KEY.
- fern/products/waves/.../vad-events.mdx (+ versions mirror)
  Imports the component, renders it under a "Try it" section above the existing Mermaid sequence diagram.

## Sizes

| File | Size |
|---|---|
| VadEventsDemo.tsx | 16 KB |
| clean.mp3 | 60 KB |
| multi-turn.mp3 | 108 KB |
| no-tail.mp3 | 48 KB |
| 3 x fixture.json | ~20 KB |
| total | ~252 KB |

## Verification

- fern check: 0 errors.
- Live regeneration of fixtures verified: ran prep script against prod, all three variants captured with the expected event counts.

…t root

Component was failing to render with 'Could not resolve ./fixtures/*.json' and './assets/*.mp3'.

Root cause: Fern's MDX bundler only resolves .tsx/.ts/.js/.mdx imports from custom React components. Relative JSON and binary asset imports are not supported.

Changes:

- Convert fixtures from JSON to typed TS modules (clean.ts etc.) that export a Fixture object. Component imports those instead.
- Add a shared types.ts so the fixture and component agree on shape.
- Move MP3s to fern/docs/assets/vad-events/ (Fern's static asset root). Each fixture's `audio` field now points at /assets/vad-events/<name>.mp3, which is the served URL on the rendered site.
- Update prep script accordingly so future regens emit the TS format + write MP3s to the new asset location.
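A typed fixture module of the kind this commit describes might look like the sketch below. Only the `Fixture` name, the `audio` field, and the `/assets/vad-events/<name>.mp3` path pattern come from the commit message; every other field name and all values are placeholders, and the two parts would live in separate files (`types.ts` and `clean.ts`) with `export` on the declarations.

```typescript
// types.ts (sketch): shared shape so fixture and component agree.
// Field names other than `audio` are assumptions.
interface FixtureEvent {
  t: number; // position on the playback timeline, in seconds
  payload: Record<string, unknown>; // the recorded server message
}

interface Fixture {
  title: string;
  audio: string; // URL the player loads
  amplitudes: number[]; // precomputed waveform peaks
  events: FixtureEvent[];
}

// clean.ts (sketch): placeholder values, not the captured data.
const clean: Fixture = {
  title: "clean",
  audio: "/assets/vad-events/clean.mp3",
  amplitudes: [0.1, 0.6, 0.3],
  events: [
    { t: 0.0, payload: { type: "speech_started" } },
    { t: 1.0, payload: { type: "speech_ended" } },
  ],
};
```

Compared with importing raw JSON, a typed module lets the compiler reject a regenerated fixture whose shape has drifted from what the component expects.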

…of page

Two fixes from review feedback on the rendered preview.

1. Play/pause did nothing because the audio src pointed at /assets/vad-events/<name>.mp3, which 404s on docs.smallest.ai. Fern does not serve fern/docs/assets/ at runtime; that path is for MDX compile-time references only (e.g. <img src="../../docs/assets/...">). Custom React components running at runtime have no equivalent. Fix: embed each MP3 as a base64 data URL directly in the fixture TS module. Self-contained, no external CDN, no CSP issue.
2. Move the demo from the top of the page to the bottom after Notes, so readers see the param + payload + behavior caveats before the interactive walkthrough.

Fixture sizes after embedding:

- clean.ts: 80 KB
- multi-turn.ts: 152 KB
- no-tail.ts: 64 KB
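The fixture sizes listed in this commit are roughly what base64 overhead predicts: every 3 bytes of input become 4 output characters, so a 60 KB MP3 encodes to about 80 KB. The sketch below works through that arithmetic and shows the data URL form. The helper names are hypothetical, and the real encoding presumably happens in the prep script rather than in the browser.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Wrap an already-encoded base64 payload in a data URL.
// "audio/mpeg" is the registered MIME type for MP3.
function toDataUrl(base64: string, mimeType: string = "audio/mpeg"): string {
  return `data:${mimeType};base64,${base64}`;
}

const KB = 1024;
const cleanMp3Bytes = 60 * KB; // clean.mp3 size from the earlier size table
const embeddedKb = base64Length(cleanMp3Bytes) / KB; // 80
```

The same ratio gives 64 KB for the 48 KB no-tail sample and 144 KB for the 108 KB multi-turn sample; the gap to the listed 152 KB is plausibly the event list and rounding in the reported sizes.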

…ix sequence ordering

Three improvements after reviewing the rendered demo.

1. **Maverick voice via Lightning v3.1 Pro** for the audio samples. More natural than the cookbook test audio and showcases real product output. Dialogue is conversational with fillers ('um', 'so', 'yeah') so the samples represent real call audio, not test reads.
2. **Real word timestamps** on transcripts. The Pulse WS does not put a top-level timestamp on transcription messages, only on speech_started / speech_ended. Previous fixtures synthesized transcript times by spreading them evenly across the audio duration. Now the prep script enables word_timestamps=true and uses words[-1].end for each transcript.
3. **Sort by timestamp** so the demo timeline is monotonic. With real word timestamps, partials emitted before silence had earlier times than the speech_ended that follows them in wire order; the highlight picker walked off the end of the array. Sorting the events by their assigned timeline timestamp puts them in playback order. Component picker also fixed defensively: scans the full array instead of stopping at the first out-of-order timestamp.

Sequences captured live from prod:

- clean : speech_started @0.032s, speech_ended @5.36s, 6 transcripts
- multi-turn : 2 speech_started + 2 speech_ended pairs, 6 transcripts
- no-tail : speech_started @0.384s, no speech_ended (caveat), 3 transcripts

Fixture sizes: 80 KB / 84 KB / 28 KB (base64-embedded MP3 + envelope + events).
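The timestamp assignment, sort, and defensive picker described in this commit can be sketched as below. The field names (`type`, `timestamp`, `words`, `end`) and all helper names are assumptions, apart from `words[].end`, which mirrors the `words[-1].end` expression quoted above. Dropping messages that have neither a timestamp nor words is a choice made for this sketch, not confirmed behavior.

```typescript
interface Word { word: string; start: number; end: number; }
interface WireMessage {
  type: string;
  timestamp?: number; // assumed present on speech_started / speech_ended
  words?: Word[]; // assumed present on transcripts when word timestamps are on
}
interface TimelineEvent { t: number; message: WireMessage; }

// Timeline position: top-level timestamp if present, else the end of the last word.
function timelineTime(message: WireMessage): number | undefined {
  if (typeof message.timestamp === "number") return message.timestamp;
  const words = message.words;
  if (words && words.length > 0) return words[words.length - 1].end;
  return undefined;
}

// Convert wire order to playback order. Array.prototype.sort is stable in
// ES2019+ engines, so events with equal times keep their wire order.
function toTimeline(messages: WireMessage[]): TimelineEvent[] {
  const events: TimelineEvent[] = [];
  for (const message of messages) {
    const t = timelineTime(message);
    if (t !== undefined) events.push({ t, message });
  }
  return events.sort((a, b) => a.t - b.t);
}

// Index of the latest event at or before the playhead. Scans the whole array,
// so an out-of-order entry cannot end the search early. Returns -1 if none.
function activeIndex(events: TimelineEvent[], playheadSec: number): number {
  let best = -1;
  for (let i = 0; i < events.length; i++) {
    if (events[i].t <= playheadSec && (best === -1 || events[i].t >= events[best].t)) {
      best = i;
    }
  }
  return best;
}

// Placeholder stream in which wire order and timeline order disagree.
const wire: WireMessage[] = [
  { type: "speech_started", timestamp: 0.032 },
  { type: "speech_ended", timestamp: 5.36 },
  { type: "transcription", words: [{ word: "yeah", start: 4.5, end: 4.8 }] },
];
const timeline = toTimeline(wire);
const highlighted = activeIndex(timeline, 5.0); // 1, the transcript ending at 4.8s
```

Sorting makes the log read in playback order, while the full scan keeps the picker correct even if an unsorted array reaches it.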

abhishekmishragithub force-pushed the feat/vad-events-interactive-demo branch from bb3f3de to 38b854f Compare June 23, 2026 10:10

abhishekmishragithub added 3 commits June 23, 2026 16:39

abhishekmishragithub wants to merge 4 commits into `feat/vad-events-pulse-stt` from `feat/vad-events-interactive-demo`
