Live demo: https://ysdede.github.io/tdt-webgpu-demo/
This project is a React + Vite web app for automatic speech recognition with Nemo Conformer TDT models using Transformers.js v4.
- Run Parakeet-style TDT ASR models in the browser, for example `ysdede/parakeet-tdt-0.6b-v2-onnx-tfjs4`.
- Choose encoder backend and dtype settings (WebGPU or WASM).
- Switch between HF pipeline mode and direct `model.transcribe()` mode with mode-specific controls.
- Compare browser audio prep paths, including a deterministic custom JS path with linear parity resampling and optional higher-quality SRC modes.
- Transcribe sample or uploaded audio.
- Inspect transcript, timestamps, confidence data, raw JSON output, and separated run/audio/model metrics.
- Run a Node.js CLI test script for quick non-UI checks.
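The linear resampling mentioned in the audio prep feature above can be sketched as a plain JS function. This is an illustrative simplification, not the app's actual code; `resampleLinear` is a hypothetical name:

```javascript
// Minimal sketch of deterministic linear-interpolation resampling,
// similar in spirit to a custom JS audio prep path.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const ratio = fromRate / toRate;
  const outLength = Math.floor(input.length / ratio);
  const output = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    // Linear blend between the two nearest source samples.
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}

// Downsample a 48 kHz ramp to 16 kHz: every 3rd sample survives exactly.
const ramp = Float32Array.from({ length: 12 }, (_, i) => i);
console.log(Array.from(resampleLinear(ramp, 48000, 16000))); // [0, 3, 6, 9]
```

Because the output depends only on the input samples and rates, the same buffer always yields the same result, which is what makes such a path deterministic across browsers.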
Prerequisites:

- Node.js 18 or newer.
- A modern browser. WebGPU (Chrome or Edge) is recommended for best encoder performance.
- Optional: a local checkout of `transformers.js` as a sibling folder at `../transformers.js` for local development.
Install and run:

```bash
git clone <this-repo-url>
cd transformers-v4-parakeet-demo
npm install
```

This uses `@huggingface/transformers@next` from npm. Start the dev server:

```bash
npm run dev
```

Then open the URL shown by Vite (typically http://localhost:5173).
Use this when you want to test local transformers.js changes without publishing a package.
- Keep both repositories as siblings:

  ```text
  .../transformers.js/
  .../transformers-v4-parakeet-demo/
  ```

- Build transformers from the `transformers.js` root:

  ```bash
  cd path/to/transformers.js
  pnpm --filter @huggingface/transformers run build
  ```

- Start the demo in local mode:

  ```bash
  cd path/to/transformers-v4-parakeet-demo
  npm run dev:local
  ```

`dev:local` sets `TRANSFORMERS_LOCAL=true` and aliases `@huggingface/transformers` to `../transformers.js/packages/transformers/dist/transformers.web.js`.
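The aliasing that `dev:local` performs can be sketched in a Vite config like this. The shape below is illustrative, not the repository's exact `vite.config.js`:

```javascript
// vite.config.js — illustrative sketch of TRANSFORMERS_LOCAL aliasing,
// not the repository's actual config.
import { defineConfig } from 'vite';
import { fileURLToPath } from 'node:url';

const useLocal = process.env.TRANSFORMERS_LOCAL === 'true';
const localBuild = fileURLToPath(
  new URL(
    '../transformers.js/packages/transformers/dist/transformers.web.js',
    import.meta.url
  )
);

export default defineConfig({
  resolve: {
    // When TRANSFORMERS_LOCAL=true, resolve the package to the sibling build
    // instead of the copy installed in node_modules.
    alias: useLocal ? { '@huggingface/transformers': localBuild } : {},
  },
});
```

Aliasing at the bundler level means application code keeps importing `@huggingface/transformers` unchanged in both modes.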
Production build and preview:

```bash
npm run build
npm run preview
```

Deployment is handled by .github/workflows/deploy-pages.yml.
The workflow:
- Checks out this demo repository.
- Checks out `transformers.js` into a sibling directory.
- Builds `@huggingface/transformers` from source with `pnpm`.
- Builds this demo with `TRANSFORMERS_LOCAL=true`.
- Publishes `dist/` to GitHub Pages.
Repository settings:
- Enable Pages and select GitHub Actions as the source.
- Optional repository variable `TRANSFORMERS_REPO` (default `ysdede/transformers.js`).
- Optional repository variable `TRANSFORMERS_REPO_REF` (default `v4-nemo-conformer-tdt-main-r3`).
- Optional secret `TRANSFORMERS_REPO_TOKEN` if `transformers.js` is private.
Notes:
- GitHub Actions can only build commits that are pushed to GitHub.
- `workflow_dispatch` supports a `transformers_ref` input for one-off branch/tag/SHA overrides.
- The checked-in workflow defaults to your fork branch `v4-nemo-conformer-tdt-main-r3`.
- The Vite base path is set automatically for both project pages and user pages.
Sync is handled by .github/workflows/sync-hf-space.yml.
The workflow:
- Exports an HF-safe copy of the app to `hf_export/`.
- Removes GitHub/local-only files and COI service worker wiring.
- Writes HF-specific `README.md`, `vite.config.js`, and `package.json`.
- Pushes the result to https://huggingface.co/spaces/ysdede/tdt-webgpu-demo.
Repository settings:
- Add secret `HF_TOKEN` with write access to `ysdede/tdt-webgpu-demo`.
Run a quick transcription from the terminal:
```bash
npm run test:node -- --model ysdede/parakeet-tdt-0.6b-v2-onnx-tfjs4 --audio <path-to-wav-file> --encoder-device webgpu
```

By default, this script loads the local transformers build from `../transformers.js/packages/transformers/dist/transformers.node.mjs`.
Use `--npm` to use the installed npm package instead.
Node CLI input must be WAV (.wav).
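Since the CLI accepts only WAV input, a minimal validity check on the file header might look like the following. This is an illustrative sketch assuming a canonical 44-byte RIFF/WAVE header; `readWavInfo` is a hypothetical helper, not the script's actual code:

```javascript
// Minimal sketch: verify a buffer starts with a RIFF/WAVE header and read
// basic format info. Assumes the canonical 44-byte header layout where the
// fmt chunk immediately follows the RIFF header.
function readWavInfo(buf) {
  if (buf.length < 44) throw new Error('Too short to be a WAV file');
  const riff = buf.toString('ascii', 0, 4);
  const wave = buf.toString('ascii', 8, 12);
  if (riff !== 'RIFF' || wave !== 'WAVE') throw new Error('Not a WAV file');
  return {
    numChannels: buf.readUInt16LE(22),
    sampleRate: buf.readUInt32LE(24),
    bitsPerSample: buf.readUInt16LE(34),
  };
}
```

Real-world WAV files can carry extra chunks before `fmt `, so a robust parser would walk the chunk list instead of relying on fixed offsets.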
| Option | Description |
|---|---|
| `--model <id-or-path>` | Model ID or local model path |
| `--audio <wav-path>` | WAV file path |
| `--encoder-device <webgpu\|cpu>` | Encoder device (cpu is safer for Node) |
| `--encoder-dtype`, `--decoder-dtype` | Examples: fp16, int8, fp32 |
| `--timestamps` | Request word-level timestamps |
| `--loop <n>` | Repeat transcription n times |
| `--npm` | Use @huggingface/transformers from node_modules |
| `--local-module <path>` | Path to a local transformers node build |
Sample audio file used by the UI: public/assets/Harvard-L2-1.ogg.
- Model configuration supports load mode, model ID, device, dtype, and WASM thread tuning.
- Transcription options include explicit inference mode selection, direct Nemo API flags, pipeline timestamp settings, and audio prep controls.
- Metrics are split by mode: pipeline shows wall-clock run timing plus audio prep, while direct mode shows audio prep plus direct model internals.
- The transcribe workspace is organized into three columns with transcript and API contract visible at the same time.
- Settings and theme preferences are persisted in `localStorage`.
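Persisting preferences in `localStorage` can be sketched as below. The key name and helper functions are hypothetical, not the app's actual code; the functions take the store as a parameter so the sketch also runs outside the browser:

```javascript
// Illustrative sketch of settings persistence against a localStorage-like
// store (an object exposing getItem/setItem).
const PREFS_KEY = 'tdt-demo-prefs'; // hypothetical key name

function savePrefs(store, prefs) {
  store.setItem(PREFS_KEY, JSON.stringify(prefs));
}

function loadPrefs(store, defaults) {
  const raw = store.getItem(PREFS_KEY);
  if (raw == null) return { ...defaults };
  try {
    // Merge over defaults so settings added in later versions still
    // receive their default values.
    return { ...defaults, ...JSON.parse(raw) };
  } catch {
    return { ...defaults }; // corrupted entry: fall back to defaults
  }
}
```

In the browser, `window.localStorage` would be passed as `store`; merging over defaults keeps saved settings forward-compatible as options are added.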
See the repository license.