TNT 🧨

🌐 appautomaton.github.io/tnt-asr — the project landing page.

Terminal voice-to-text. Tap Space, speak, tap Space — your words land in the transcript and on the clipboard.

Qwen3-ASR-1.7B runs in-process on the Apple GPU via mlx-speech as an 8-bit (int8) quantized checkpoint (~2.5 GB resident). The model loads once, stays in memory, and transcribes a short take in a fraction of a second. Everything is local: no cloud, no runtime network calls. The microphone is captured natively through AVFoundation by a small Swift helper process, so a misbehaving audio stack cannot trap the mic: TNT kills the helper and macOS releases the device.
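
The helper-process isolation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not TNT's actual code: `HelperProcess`, `stop`, and `demo_helper` are hypothetical names, and a sleeping Python child stands in for the Swift helper that holds the microphone.

```python
import subprocess
import sys
from typing import List, Optional


class HelperProcess:
    """Owns a child process that holds a resource (a stand-in for the mic)."""

    def __init__(self, argv: List[str]) -> None:
        self._argv = argv
        self._proc: Optional[subprocess.Popen] = None

    def start(self) -> None:
        self._proc = subprocess.Popen(self._argv)

    def stop(self, grace_seconds: float = 0.5) -> int:
        """Ask the helper to exit, then force-kill it if it does not comply."""
        proc = self._proc
        if proc is None:
            raise RuntimeError("helper was never started")
        proc.terminate()  # polite request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # forced termination (SIGKILL on POSIX)
            proc.wait()
        self._proc = None
        return proc.returncode


def demo_helper() -> int:
    # Hypothetical stand-in for the Swift helper: a child that would
    # otherwise keep running (and keep its resource) for a minute.
    helper = HelperProcess([sys.executable, "-c", "import time; time.sleep(60)"])
    helper.start()
    return helper.stop()
```

Because the resource lives in the child, the parent never has to unwind a stuck audio call itself: once the child exits, the operating system reclaims whatever it held.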

Note

Using Termux on Android? Use the preserved legacy/android-termux-qwen0.6b branch instead of master. It is a legacy proot setup and may need device-specific fixes, so validate it on your own device and adapt it as needed with your own tools or an agentic AI workflow.

git fetch origin
git switch --track origin/legacy/android-termux-qwen0.6b

Features

In-process GPU inference — pure MLX, no PyTorch
8-bit quantized — int8 weights (~2.5 GB), about half the memory of BF16 with a faster decode
Resident model — loads once in the background at startup; every take is warm
Native mic capture — AVFoundation via an isolated Swift helper process; the mic can always be reclaimed
English, Chinese, and mixed speech — language auto-detected, or forced via env var
Live braille oscilloscope — real audio levels while you record
Clipboard-first — new transcriptions auto-copy; click any past entry to copy it again
Responsive TUI — side-rail layout on wide terminals, stacked on narrow ones

Setup

Important

Requires an Apple Silicon Mac (M1 or later), Python 3.13+, uv, and the Xcode command line tools (xcode-select --install) — the mic capture helper is compiled from Swift on first launch and cached.

git clone https://github.com/appautomaton/tnt-asr.git
cd tnt-asr
uv sync
./bootstrap-mlx-asr.sh        # downloads + links the int8 checkpoint (~2.5 GB, cached by Hugging Face)
uv run tnt

Or install from PyPI (automaton-tnt):

uv tool install automaton-tnt
TNT_MLX_MODEL=/path/to/qwen3-asr-1.7b-int8-mlx tnt

(Instead of setting TNT_MLX_MODEL, you can symlink the checkpoint at ~/.local/share/tnt/qwen3-asr-mlx.)

Model checkpoint

TNT expects a converted Qwen3-ASR-1.7B MLX checkpoint. A ready-to-use int8 build (~2.5 GB) is published at appautomaton/qwen3-asr-1.7b-int8-mlx. The bootstrap script takes three forms:

./bootstrap-mlx-asr.sh                       # download the int8 build from Hugging Face, then link it
./bootstrap-mlx-asr.sh <hf-repo-id>          # download a specific Hugging Face repo
./bootstrap-mlx-asr.sh /path/to/checkpoint   # link a checkpoint you already have (no download)

Downloads use huggingface_hub (already installed via mlx-speech) and land in the shared Hugging Face cache (~/.cache/huggingface); the script symlinks bin/qwen3-asr-mlx to the cached snapshot. It is idempotent — if the model is already cached, or you pass a local path, nothing is re-downloaded, so you never keep two copies of the 2.5 GB weights. BF16 and mxfp8 builds work too — mlx-speech reads the quantization from the checkpoint's config.json, so switching is just a relink. Alternatively, convert the upstream Qwen/Qwen3-ASR-1.7B weights yourself with mlx-speech's scripts/convert/qwen3_asr.py.
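
The idempotent relink step can be illustrated with a short Python sketch. It is not the bootstrap script itself (which is a shell script); `link_checkpoint` and `demo_relink` are hypothetical names, and temporary directories stand in for the Hugging Face cache snapshots.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def link_checkpoint(snapshot: Path, link: Path) -> bool:
    """Point `link` at `snapshot`. Return True if the link changed."""
    snapshot = snapshot.resolve()
    if not snapshot.is_dir():
        raise FileNotFoundError(f"no checkpoint directory at {snapshot}")
    if link.is_symlink():
        if Path(os.readlink(link)) == snapshot:
            return False  # already linked: nothing to do
        link.unlink()  # link points at another build: replace it
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(snapshot, target_is_directory=True)
    return True


def demo_relink() -> List[bool]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        int8 = root / "cache" / "int8"  # stand-in for a cached snapshot
        bf16 = root / "cache" / "bf16"  # stand-in for a second build
        int8.mkdir(parents=True)
        bf16.mkdir(parents=True)
        link = root / "bin" / "qwen3-asr-mlx"
        return [
            link_checkpoint(int8, link),  # first run creates the link
            link_checkpoint(int8, link),  # second run is a no-op
            link_checkpoint(bf16, link),  # switching builds is a relink
        ]
```

Only the symlink changes between runs; the weights stay where they were downloaded, which is why no second copy is created.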

Configuration

Environment variable	Default	Description
`TNT_MLX_MODEL`	`bin/qwen3-asr-mlx`, else `~/.local/share/tnt/qwen3-asr-mlx`	Path to the converted MLX checkpoint
`TNT_MLX_LANGUAGE`	`auto`	`Chinese`, `English`, or `auto`. Use `Chinese` to keep mixed Chinese/English speech from being translated to English
`TNT_INPUT_DEVICE`	system default	Microphone, by index or name
`TNT_CAPTURE_BACKEND`	`auto`	macOS always uses native AVFoundation (needs the Xcode command line tools: `xcode-select --install`); other platforms use PortAudio. `portaudio` is rejected on macOS
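
One way to read the table above is as a small resolution routine. The sketch below is an assumption-laden illustration, not TNT's implementation: the function names are hypothetical, the fallback is assumed to pick the first default path that exists, and case-insensitive language matching is an added convenience that the table does not promise.

```python
from pathlib import Path
from typing import Mapping, Optional, Union

# Accepted TNT_MLX_LANGUAGE values, keyed by lowercase form (assumption:
# matching is case-insensitive; the table only lists the canonical spellings).
LANGUAGES = {"auto": "auto", "chinese": "Chinese", "english": "English"}


def resolve_model_path(
    env: Mapping[str, str], project_root: Path, home: Path
) -> Optional[Path]:
    """TNT_MLX_MODEL wins; otherwise try the two documented defaults in order."""
    explicit = env.get("TNT_MLX_MODEL")
    if explicit:
        return Path(explicit).expanduser()
    candidates = (
        project_root / "bin" / "qwen3-asr-mlx",
        home / ".local" / "share" / "tnt" / "qwen3-asr-mlx",
    )
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None  # no checkpoint found; the caller decides how to report it


def resolve_language(env: Mapping[str, str]) -> str:
    raw = env.get("TNT_MLX_LANGUAGE", "auto").strip().lower()
    if raw not in LANGUAGES:
        raise ValueError(f"unsupported TNT_MLX_LANGUAGE: {raw!r}")
    return LANGUAGES[raw]


def resolve_input_device(env: Mapping[str, str]) -> Union[int, str, None]:
    """Return an index, a device name, or None for the system default."""
    raw = env.get("TNT_INPUT_DEVICE", "").strip()
    if not raw:
        return None
    return int(raw) if raw.isdigit() else raw
```

Passing the environment in as a mapping (for example `os.environ`) keeps the functions easy to test with plain dictionaries.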

Keybindings

Key	Action
`Space`	Start / stop recording, or hold to record until release; cancels during transcription
`c`	Copy the last transcript entry
mouse click	Copy the clicked transcript entry
`x`	Clear the transcript
`q`	Quit
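
The Space key behaviour amounts to a three-state machine. The following is a simplified sketch, assuming tap-to-toggle only (hold-to-record is left out) and using hypothetical names throughout (`State`, `TakeMachine`, `deliver`); it is not taken from app.py. A take counter is one possible way to make sure a cancelled transcription's late result is dropped.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()


class TakeMachine:
    def __init__(self) -> None:
        self.state = State.IDLE
        self.take_id = 0  # bumped on every new take and on every cancel
        self.transcript: List[str] = []

    def press_space(self) -> State:
        if self.state is State.IDLE:
            self.take_id += 1
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.TRANSCRIBING
        else:  # TRANSCRIBING: Space cancels
            self.take_id += 1  # invalidates the in-flight result
            self.state = State.IDLE
        return self.state

    def deliver(self, take_id: int, text: str) -> bool:
        """Accept a finished transcription only if it belongs to the live take."""
        if self.state is not State.TRANSCRIBING or take_id != self.take_id:
            return False  # stale result from a cancelled take
        self.transcript.append(text)
        self.state = State.IDLE
        return True

    def press_x(self) -> None:
        self.transcript.clear()
```

A worker would capture `take_id` when transcription starts and pass it back to `deliver`, so a result that arrives after a cancel is silently ignored.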

Project structure

src/tnt/
├── app.py             # Textual TUI, state machine, keybindings
├── audio.py           # Recorder protocol, backend selection, PortAudio (non-macOS)
├── avf_audio.py       # Native AVFoundation capture via helper process (macOS)
├── mic_helper.swift   # AVFoundation helper source, compiled on demand
├── async_threads.py   # Daemon-thread helpers for blocking work
├── transcriber.py     # In-process MLX Qwen3-ASR transcription
└── widgets/
    ├── transcript.py  # Scrollable transcript log
    └── status.py      # Braille oscilloscope + state rail
bin/
└── qwen3-asr-mlx      # Symlink to converted MLX checkpoint (gitignored)

Tip

The inference path expects 16 kHz mono PCM WAV; the recorder produces exactly that. Cancelling a transcription abandons its result — the in-process generation cannot be killed mid-flight and quietly finishes in the background.
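
If you feed audio from another source, the 16 kHz mono requirement can be checked with Python's standard `wave` module. This is a minimal sketch with hypothetical helper names (`check_wav`, `make_wav`); it does not check bit depth, since the text above does not state one.

```python
import io
import wave
from typing import List


def check_wav(data: bytes) -> List[str]:
    """Return a list of problems; an empty list means 16 kHz mono WAV."""
    problems: List[str] = []
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            if wav.getframerate() != 16000:
                problems.append(
                    f"sample rate is {wav.getframerate()} Hz, expected 16000"
                )
            if wav.getnchannels() != 1:
                problems.append(f"{wav.getnchannels()} channels, expected mono")
    except (wave.Error, EOFError):
        # The wave module rejects data that is not a readable PCM WAV file.
        problems.append("not a readable PCM WAV file")
    return problems


def make_wav(rate: int, channels: int, seconds: float = 0.1) -> bytes:
    """Build a short silent 16-bit WAV in memory, for trying out check_wav."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples, chosen only for this demo
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds) * channels)
    return buf.getvalue()
```

For example, `check_wav(make_wav(16000, 1))` returns an empty list, while a 44.1 kHz stereo file yields one message per mismatch.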

Related projects

mlx-speech — our MLX-native speech runtime that powers TNT (PyPI)
qwen3-asr-1.7b-int8-mlx — our int8 MLX checkpoint that TNT runs (converted from Qwen3-ASR-1.7B)

More from appautomaton

🌐 appautomaton.github.io — our site
🤗 huggingface.co/appautomaton — our models and checkpoints on Hugging Face
🐙 github.com/appautomaton — our open-source projects

License

MIT. See LICENSE.
