feat: add Studio batch transcription workflow by tomerar · Pull Request #1183 · cjpais/Handy

tomerar · 2026-03-28T18:35:16Z

Summary

This PR introduces Studio, a new file-based transcription workflow for Handy.

Handy’s core experience today is optimized for live dictation: press a shortcut, speak, and have the text appear in the active app. That workflow remains fully intact.

Studio adds a second major workflow for a different but very common use case: bringing an existing audio file into Handy, transcribing it locally, reviewing progress, and exporting clean transcript files.

In practice, this expands Handy from a live speech-to-text tool into both a fast offline dictation tool and a practical offline transcription workspace for existing recordings.

This fits Handy’s current direction well:

offline
private
local-first
simple to use
extensible enough for community-driven growth

It also matches the kinds of adjacent workflows users commonly ask for around the project: not just live microphone dictation, but also reliable local transcription for audio files they already have.

Scope

This PR focuses on introducing Studio as a complete first version of file-based transcription inside Handy.

It does not redesign or replace Handy’s existing live dictation flow. Instead, it adds a complementary workflow for local audio-file transcription, job management, and transcript export.

Why This Feature Exists

Before this PR, Handy was excellent for:

quick dictation
shortcut-based speech capture
pasting transcribed text directly into another app

But there was no first-class workflow for:

transcribing an audio file that already exists
working with longer recordings
exporting transcript files to disk
revisiting and retrying previous transcription jobs
comparing repeated runs of the same file
managing file-based transcription history inside Handy itself

That gap matters for real users.

Common examples include:

podcasts
meetings
lectures
interviews
voice notes
songs and lyric captures
spoken screen recordings
longer offline recordings that are not part of a live dictation session

Studio is designed to cover exactly those cases while preserving Handy’s existing simplicity.

What Studio Adds

Studio is a dedicated transcription workspace inside Handy for existing audio files.

A user can now:

Open the Studio section from the sidebar
Choose or drag-and-drop an audio file
Review file details before starting
Select an output folder
Select one or more output formats
Start transcription
Watch live progress and transcript preview
Cancel or retry when needed
Re-open recent jobs
Open exported transcript files from completed jobs

Supported output formats:

TXT
SRT
VTT

Supported input formats currently include:

MP3
WAV
M4A
FLAC
OGG

Product Flow

1. File preparation

When a file is selected, Studio creates a persisted job and inspects the media before transcription begins.

The user immediately gets:

file name
duration
size
format/container details
estimated processing time
a setup screen before committing to the run

2. Job setup

Before starting, the user chooses:

where output files will be saved
which export formats to generate

This keeps Studio task-oriented and predictable rather than turning it into a one-click fire-and-forget flow.

3. Audio normalization and chunking

The backend normalizes the source audio into a Whisper-friendly WAV and splits it into overlapping chunks.

This helps Studio handle:

longer recordings
different codecs and containers
more stable progress tracking
safer retry and recovery behavior
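
The overlapping-chunk idea can be sketched as follows. This is an illustration only: the `ChunkSpan` type, the `plan_chunks` name, and the use of sample indices are assumptions, and the actual chunk length and overlap used in `studio.rs` are not stated in this description.

```rust
/// Half-open sample range `[start, end)` covered by one chunk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ChunkSpan {
    start: usize,
    end: usize,
}

/// Splits `total` samples into chunks of at most `chunk_len` samples.
/// Each chunk after the first re-reads the last `overlap` samples of the
/// previous chunk so that words on a boundary are heard in full at least once.
fn plan_chunks(total: usize, chunk_len: usize, overlap: usize) -> Vec<ChunkSpan> {
    assert!(
        chunk_len > 0 && overlap < chunk_len,
        "overlap must be smaller than chunk_len"
    );
    let step = chunk_len - overlap;
    let mut spans = Vec::new();
    let mut start = 0;
    while start < total {
        let end = (start + chunk_len).min(total);
        spans.push(ChunkSpan { start, end });
        if end == total {
            break; // the final chunk reached the end of the audio
        }
        start += step;
    }
    spans
}
```

With `plan_chunks(10, 4, 1)` the plan is `[0, 4)`, `[3, 7)`, `[6, 10)`. Because the plan is a pure function of the inputs, it can be persisted per job and recomputed identically on retry.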

4. Live progress and preview

While the job is running, Studio shows:

current processing stage
chunk progress
transcript preview as chunks complete
clear cancelled/error states when something goes wrong

5. Export

When the job completes, Studio writes one or more transcript outputs to disk.

Completed jobs show:

output files
final status
transcript preview
quick access to the output folder

If the same file is exported more than once, Studio versions filenames safely instead of failing on collisions.
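
One way such versioning could work is sketched here. The `name (2).ext` scheme and the `versioned_path` helper are assumptions for illustration; the PR description does not state the exact naming pattern Studio uses.

```rust
use std::path::{Path, PathBuf};

/// Returns `dir/stem.ext` if that name is free, otherwise `dir/stem (2).ext`,
/// `dir/stem (3).ext`, and so on. The `exists` check is injected so the naming
/// rule can be exercised without touching the disk.
fn versioned_path(
    dir: &Path,
    stem: &str,
    ext: &str,
    exists: impl Fn(&Path) -> bool,
) -> PathBuf {
    let first = dir.join(format!("{}.{}", stem, ext));
    if !exists(&first) {
        return first;
    }
    let mut version = 2u32;
    loop {
        let candidate = dir.join(format!("{} ({}).{}", stem, version, ext));
        if !exists(&candidate) {
            return candidate;
        }
        version += 1;
    }
}
```

In real use the check would be `|p| p.exists()`. Checking and then writing leaves a small race window; opening the file with `OpenOptions::new().write(true).create_new(true)` fails if the file already exists, which may be a safer way to claim the name.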

6. Recent jobs

Studio includes a real recent-jobs workflow, not just a passive history list.

Users can:

revisit jobs
load prepared jobs again
inspect completed jobs
retry failed/cancelled jobs
filter by status
delete one or many jobs
keep history without being forced to remove older work

This matters because file-based transcription is inherently a multi-run workflow.
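
The filter and retry rules described in this section can be modeled roughly as below. The status names and the `JobSummary` shape are assumptions made for the sketch (the real types live in `src/lib/types/studio.ts` and the backend manager), and the actual filtering is done in the frontend store rather than in Rust.

```rust
/// Hypothetical job states; the PR may use different names or more states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Prepared,
    Running,
    Completed,
    Failed,
    Cancelled,
}

impl JobStatus {
    /// Only jobs that ended without a result are offered a retry.
    fn is_retryable(self) -> bool {
        matches!(self, JobStatus::Failed | JobStatus::Cancelled)
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct JobSummary {
    id: u32,
    file_name: String,
    status: JobStatus,
}

/// Returns the jobs matching `wanted`; `None` means "show everything".
/// Filtering only changes the view, so job history is never deleted by it.
fn filter_jobs(jobs: &[JobSummary], wanted: Option<JobStatus>) -> Vec<&JobSummary> {
    jobs.iter()
        .filter(|job| wanted.map_or(true, |status| job.status == status))
        .collect()
}
```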

Screenshots

Studio home

Main Studio landing screen
New file-based transcription workflow inside Handy
Dedicated entry point for batch and offline transcription jobs

Prepared job

File selected and prepared before transcription starts
Output folder and export formats can be configured before running
Studio shows file details and estimated processing time up front

Running transcription

Studio surfaces both preparation stages and active transcription stages
Progress remains visible throughout decoding, chunking, and transcription
The transcript preview begins filling in as chunks complete

Completed job

Finished jobs show the final transcript preview and completion state
Generated output files are listed directly in the Studio UI
Users can immediately open the output folder after transcription finishes

Recent jobs

Studio keeps a browsable history of recent transcription jobs
Users can revisit prepared and completed work directly from the same screen
This makes repeated file-based workflows practical instead of one-time only

Recent job filtering

Recent jobs can be filtered by status without leaving the Studio workflow
This helps users focus on ready or completed work without deleting job history
Filtering stays lightweight and integrated into the existing recent-jobs panel

Cancelled and retryable jobs

Cancelled jobs remain visible in Studio instead of disappearing from context
Users can retry interrupted work directly from the job view or recent-jobs list
This makes Studio more resilient for longer workflows and unexpected interruptions

How It Is Implemented

Backend

Studio adds a dedicated backend manager responsible for the file-transcription lifecycle.

Main backend additions:

src-tauri/src/managers/studio.rs
src-tauri/src/media/decode.rs
src-tauri/src/exporters/srt.rs
src-tauri/src/exporters/txt.rs
src-tauri/src/exporters/vtt.rs
src-tauri/src/commands/studio.rs

Core backend responsibilities:

SQLite persistence for jobs, chunks, and exports
media probing and decode/normalize pipeline
chunk planning and tracking
per-job progress events
transcript preview events
output export generation
retry / cancel / recovery behavior
recent-jobs state from persisted data

Frontend

Studio adds a dedicated React/Zustand UI flow.

Main frontend additions:

src/components/studio/StudioHome.tsx
src/components/studio/StudioDropzone.tsx
src/components/studio/StudioSetupCard.tsx
src/components/studio/StudioJobView.tsx
src/components/studio/StudioRecentList.tsx
src/stores/studioStore.ts
src/lib/studioApi.ts
src/lib/types/studio.ts

Core frontend responsibilities:

preparing files
starting jobs
reacting to backend events
rendering progress and preview
browsing recent jobs
filtering recent jobs by status
loading previous jobs back into the UI
retry / delete / clear-all flows
formatting time, size, and estimate data consistently

The implementation follows Handy’s existing architecture rather than introducing a separate subsystem with unrelated conventions.

Reliability and Hardening Included

This branch includes a full stabilization pass beyond the initial feature implementation.

Notable improvements included in this PR:

prevent duplicate concurrent starts for the same Studio job
validate output folders and export formats before starting jobs
improve retry behavior so retries are safer and more deterministic
avoid duplicate output generation after interrupted export flows
track worker lifecycle more carefully
improve shutdown behavior on app exit
handle output filename collisions by versioning instead of failing
improve transcript preview behavior during chunked transcription
improve recent-jobs selection and browsing flow
improve recovery behavior for interrupted jobs
improve drag-and-drop fallback behavior when the path is unavailable
align file-picking with backend-supported extensions
fix Playwright web-server startup on Windows so verification remains reliable

This matters because Studio is a long-running, stateful workflow with persistence, recovery, exports, retries, and background processing. The feature needed a full validation and hardening cycle to be ready for merge.
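
As an illustration of the first item in the list, a duplicate-start guard can be as small as a locked set of active job ids. This is a sketch under assumptions: `ActiveJobs`, `try_start`, and `finish` are hypothetical names, and the actual guard in `studio.rs` may be structured differently.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Tracks which job ids currently have a worker attached.
#[derive(Default)]
struct ActiveJobs {
    running: Mutex<HashSet<u64>>,
}

impl ActiveJobs {
    /// Returns `true` if the caller won the right to start `job_id`.
    /// `HashSet::insert` returns `false` when the id is already present,
    /// so the check and the claim happen under a single lock acquisition.
    fn try_start(&self, job_id: u64) -> bool {
        self.running.lock().unwrap().insert(job_id)
    }

    /// Releases the claim when the worker finishes, fails, or is cancelled.
    fn finish(&self, job_id: u64) {
        self.running.lock().unwrap().remove(&job_id);
    }
}
```

A second `try_start(7)` while job 7 is still running returns `false`, so a double click or a repeated command invocation cannot spawn two workers for the same job.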

UX and Product Polish Included

Studio is designed to feel like a native part of Handy rather than an experimental side panel.

UX work in this PR includes:

dedicated sidebar entry
dedicated Studio landing page
drag-and-drop plus file chooser
setup screen before starting a job
live progress view
transcript preview view
recent-jobs browser
recent-jobs filtering
bulk clear flow with confirmation
load-from-recent feedback
ability to choose a new file even when another job already exists
clearer retry/cancel behavior
open-output-folder actions
better handling for repeated runs of the same source file

The goal was not only to make the feature functional, but also to make it practical for repeated everyday use.

Localization

Studio is fully wired into Handy’s translation system.

This PR updates all locales with Studio-related copy, including:

page labels
setup copy
status copy
recent-jobs copy
drag-and-drop messaging
retry / cancel / warning copy
output and preview labels

This keeps Studio aligned with Handy’s multilingual direction instead of introducing a large untranslated feature area.

Why This Does Not Regress Existing Handy Behavior

This feature is additive.

It does not replace or redesign Handy’s existing real-time dictation workflow.

The current live flow still behaves as expected:

start/stop by shortcut
microphone capture
local transcription
paste into the active app

Studio is separate in the right places:

separate manager
separate commands
separate events
separate UI/store flow
separate persistence tables

At the same time, it is integrated in the right places:

model/settings integration
app shell/navigation
existing project architecture
existing translation system
existing build/test flow

So this PR expands Handy’s scope without destabilizing what already works.

Community / Project Fit

This PR fits Handy’s public product identity very well.

Handy is described in the project README as:

free
open source
private
offline
simple
extensible

Studio is a direct extension of those values.

It brings file transcription into the same local-first model instead of pushing users toward a separate cloud workflow.

It also fits the broader direction of the project:

Handy is positioned as something people can build on
this expands the app into a highly requested adjacent workflow
the feature stays aligned with privacy, offline processing, and product simplicity

Studio is not feature creep for its own sake. It is a natural next workflow for users who already trust Handy for offline transcription and want that same experience for existing recordings.

Validation and Verification

This branch went through a full validation cycle, including implementation, hardening, UX cleanup, localization cleanup, and verification.

Frontend verification

bun run lint
bun run build
bun run check:translations
bun run test:playwright

Backend verification

cargo fmt --manifest-path src-tauri/Cargo.toml -- --check
cargo check --manifest-path src-tauri/Cargo.toml --features windows-whisper-vulkan
cargo test --manifest-path src-tauri/Cargo.toml --features windows-whisper-vulkan --lib

Result:

84 passed, 0 failed

This PR was not left at the “feature works on my machine” stage. It went through a full pass of bug fixing, flow validation, test verification, and product-level cleanup before being prepared for review.

Reviewer Notes

Suggested review order:

src-tauri/src/managers/studio.rs
src-tauri/src/media/decode.rs
src/stores/studioStore.ts
src/components/studio/*
src-tauri/src/commands/studio.rs
src/lib/studioApi.ts
src/lib/types/studio.ts
i18n additions

Key review areas:

job lifecycle correctness
retry / cancel / recovery behavior
output/export behavior
event wiring between backend and frontend
recent-jobs workflow
isolation from existing dictation flows

Final Takeaway

This PR adds a major new capability to Handy:

not just “transcribe speech while I’m talking,” but also “take an existing audio file, transcribe it locally, manage the job, and export clean results.”

That makes Studio a real feature-level expansion of Handy, not just a small enhancement.

It stays aligned with the app’s core values, keeps existing flows intact, and makes Handy substantially more useful for real-world offline transcription work.

…dio decoding

Tighten the Studio workflow across backend and frontend by fixing preparation-stage cancellation, restart recovery, chunk handling, and user-facing job behavior.

- make Studio cancellation work safely during audio preparation
- improve preparation progress reporting for long files
- reduce chunk size and add overlap trimming to improve responsiveness and boundary quality
- fix sequential SRT numbering when empty chunks are skipped
- recover stale running/paused jobs after restart
- validate retry behavior when the source file is missing
- surface Studio action errors in the UI
- replace browser confirm with Tauri dialog for stop confirmation
- clean up Studio labels, status text, and i18n wiring
- add regression tests for SRT export, chunking, overlap trimming, and retry/source recovery behavior
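
The SRT numbering fix mentioned in this commit comes down to counting emitted cues rather than input chunks. A minimal sketch, assuming a hypothetical `Segment` type with millisecond times; this is not the code from `exporters/srt.rs`:

```rust
/// One transcribed chunk with its position in the source audio.
struct Segment {
    start_ms: u64,
    end_ms: u64,
    text: String,
}

/// Formats milliseconds as an SRT timestamp, `HH:MM:SS,mmm`.
fn srt_time(ms: u64) -> String {
    format!(
        "{:02}:{:02}:{:02},{:03}",
        ms / 3_600_000,
        (ms / 60_000) % 60,
        (ms / 1_000) % 60,
        ms % 1_000
    )
}

/// Renders cues numbered 1, 2, 3, ... with no gaps. The counter advances only
/// when a cue is actually written, so skipped empty chunks do not leave holes.
fn render_srt(segments: &[Segment]) -> String {
    let mut out = String::new();
    let mut index = 0u32;
    for seg in segments {
        let text = seg.text.trim();
        if text.is_empty() {
            continue; // silent or empty chunk: emit nothing, keep numbering dense
        }
        index += 1;
        out.push_str(&format!(
            "{}\n{} --> {}\n{}\n\n",
            index,
            srt_time(seg.start_ms),
            srt_time(seg.end_ms),
            text
        ));
    }
    out
}
```

Using the chunk's position in the input as the cue number would produce `1, 3` when the middle chunk is empty; the separate counter produces `1, 2`.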

…tion

- add Studio translations across supported locales
- remove the unused Studio status bar from the home view
- emit cleaned chunk text in transcript previews to reduce duplicate lines
- extend overlap trimming coverage for longer repeated chunk prefixes
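
Overlap trimming of the kind this commit refers to can be approximated at the word level: drop from the new chunk the longest leading run of words that repeats the end of the previous chunk. The sketch below is one possible approach, not the PR's algorithm; `trim_overlap` and `max_words` are hypothetical names.

```rust
/// Drops from `next` the longest run of leading words that repeats the
/// trailing words of `previous`. Words are compared case-insensitively with
/// surrounding punctuation ignored; at most `max_words` words are considered.
/// Whitespace in the returned text is collapsed to single spaces.
fn trim_overlap(previous: &str, next: &str, max_words: usize) -> String {
    fn norm(word: &str) -> String {
        word.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase()
    }

    let prev: Vec<String> = previous.split_whitespace().map(norm).collect();
    let next_words: Vec<&str> = next.split_whitespace().collect();
    let limit = max_words.min(prev.len()).min(next_words.len());

    // Try the longest candidate first so a repeated phrase is removed whole.
    for len in (1..=limit).rev() {
        let tail = &prev[prev.len() - len..];
        let head = next_words[..len].iter().map(|w| norm(w));
        if tail.iter().cloned().eq(head) {
            return next_words[len..].join(" ");
        }
    }
    next_words.join(" ")
}
```

For example, with a previous chunk ending in "tomorrow morning" and a next chunk "Tomorrow morning, we start at nine", the result is "we start at nine". A word-level match like this can also remove a legitimate repetition, so the cap on `max_words` is a trade-off rather than a guarantee.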

- improve recent job browsing, filtering, and selection feedback
- allow loading new audio without depending on pending state
- preserve active job context while preparing another file
- auto-version export filenames on output collisions
- complete Studio i18n keys across locales
- tighten Studio store initialization and view state handling

…tion

- prevent duplicate concurrent starts for the same studio job
- validate output folder and export formats before starting jobs
- make studio export retries safer and avoid duplicate output generation
- add graceful worker shutdown and timeout-aware dictation waiting
- unify studio time/size formatting and improve runtime status messaging
- add fallback handling for dropzone paths that cannot be read
- complete new studio translations across locales
- add focused tests for studio validation helpers and VTT export
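
Pre-start validation of the output folder and export formats might look like the following. The error variants and helper names are assumptions for illustration; the validation helpers added by this commit are not shown in the PR text.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExportFormat {
    Txt,
    Srt,
    Vtt,
}

/// Hypothetical setup errors reported before any work starts.
#[derive(Debug, PartialEq, Eq)]
enum SetupError {
    NoFormats,
    UnknownFormat(String),
    OutputDirMissing,
}

/// Parses a format name case-insensitively.
fn parse_format(raw: &str) -> Result<ExportFormat, SetupError> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "txt" => Ok(ExportFormat::Txt),
        "srt" => Ok(ExportFormat::Srt),
        "vtt" => Ok(ExportFormat::Vtt),
        other => Err(SetupError::UnknownFormat(other.to_string())),
    }
}

/// Rejects a job setup early instead of failing after a long transcription.
/// Duplicate format names are collapsed so each file is written once.
fn validate_setup(output_dir: &Path, formats: &[&str]) -> Result<Vec<ExportFormat>, SetupError> {
    if formats.is_empty() {
        return Err(SetupError::NoFormats);
    }
    let mut parsed = Vec::new();
    for raw in formats {
        let format = parse_format(raw)?;
        if !parsed.contains(&format) {
            parsed.push(format);
        }
    }
    if !output_dir.is_dir() {
        return Err(SetupError::OutputDirMissing);
    }
    Ok(parsed)
}
```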

- harden studio job lifecycle, retry behavior, and export validation
- improve localized studio status, formatting, and dropzone fallback UX
- align file picking with supported backend extensions across studio views
- add and verify studio-focused backend tests and stable Playwright setup
- polish recent jobs, output handling, and overall pre-MR readiness

cjpais · 2026-03-28T23:19:50Z

New features are not being accepted at the moment. This was written in the PR template and Contributing.md.

You cannot just drop a huge slop PR on me. If you want a feature, we can discuss it in Discord, DM, or GitHub Discussions.

Generally it's well thought out, but this is impossible for me to review. If you want to make huge changes, we need to work together, not just have something spawn out of the blue.

tomerar added 11 commits March 27, 2026 01:10

feat(studio): add Studio file transcription workflow with built-in au…

c9d6cbf

…dio decoding

chore(bindings): add Studio command and type bindings

8bdd42e

Merge remote-tracking branch 'upstream/main' into feat/studio-integra…

c6b07bc

…tion

Merge remote-tracking branch 'upstream/main' into feat/studio-integra…

74ba0fc

…tion

fix(ci): restore cargo compatibility and format studio files

e4b5c4b

fix(ci): update transcription mock for studio api

5efc315

cjpais closed this Mar 28, 2026

tomerar wants to merge 11 commits into cjpais:main from tomerar:feat/studio-integration
