PhillipTwenk/TTS-STT-Module-CCArman

Overview

This project implements an offline Text-to-Speech (TTS) and Speech-to-Text (STT) module designed for industrial public address and alert systems. The module operates completely offline, making it suitable for environments with limited or no internet connectivity, such as warehouses, production facilities, and remote sites.

The solution targets Raspberry Pi / Orange Pi single-board computers (ARM architecture). It is cross-platform and also runs on Windows x86_64 for development and testing.

Features

Speech Synthesis (TTS)

Russian and English language support

Natural-sounding speech using Piper TTS models

Adjustable speech rate (length_scale parameter)

Real-time playback via PortAudio
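
As a rough illustration of the length_scale parameter: in Piper's VITS-based voices it multiplies the predicted phoneme durations, so values above 1.0 slow speech down and values below 1.0 speed it up. The sketch below converts a user-facing speed factor into a length_scale value. The function names, the speed-factor convention, and the clamping range are assumptions made for this example and are not taken from the project's code.

```python
def speed_to_length_scale(speed: float,
                          min_scale: float = 0.5,
                          max_scale: float = 2.0) -> float:
    """Map a speed factor (1.0 = normal, 2.0 = twice as fast) to a length_scale.

    length_scale stretches phoneme durations, so it is the inverse of speed.
    The clamping bounds are hypothetical, chosen only to avoid extreme values.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return max(min_scale, min(max_scale, 1.0 / speed))


def estimated_duration(base_seconds: float, length_scale: float) -> float:
    """Approximate utterance duration after scaling (a linear estimate)."""
    return base_seconds * length_scale
```

For example, a speed factor of 2.0 maps to a length_scale of 0.5, which would roughly halve the duration of a 4-second announcement.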

Speech Recognition (STT)

Russian and English language recognition using NeMo CTC models

Runtime language switching (no restart required)

Voice command detection via text post-processing (KWS emulation)

Support for hotwords with Transducer models
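
Voice command detection via text post-processing can be sketched as a lookup over the recognized transcript rather than a dedicated keyword-spotting model. The example below is a minimal sketch of that idea, not the project's implementation: the command names (listen, stop, repeat) come from this README, while the trigger phrases, including the Russian ones, and the function names are assumptions.

```python
import re
from typing import List, Optional

# Hypothetical phrase table: command names are from the README,
# the trigger words (English and Russian) are assumed for illustration.
COMMANDS = {
    "listen": ("listen", "слушай"),
    "stop": ("stop", "стоп"),
    "repeat": ("repeat", "повтори"),
}


def normalize(text: str) -> List[str]:
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def detect_command(transcript: str) -> Optional[str]:
    """Return the command for the first trigger word found, else None."""
    for token in normalize(transcript):
        for command, triggers in COMMANDS.items():
            if token in triggers:
                return command
    return None
```

Matching whole tokens instead of substrings keeps a word such as "stopping" from triggering the stop command. A real deployment would likely also need to tolerate recognition errors, which this sketch does not attempt.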

Audio Processing

Silero VAD – Voice Activity Detection for energy-efficient operation

GTCRN – Lightweight noise suppression (23.7K parameters)

PortAudio – Cross-platform audio capture and playback (16 kHz mono PCM)
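
To make the 16 kHz mono PCM format concrete, the sketch below decodes raw 16-bit samples into floats and cuts them into fixed-size frames, which is the kind of input a frame-based VAD or denoiser typically consumes. It uses only the Python standard library. The little-endian 16-bit sample layout and the 512-sample frame size are assumptions for this example; check the actual capture settings and model requirements before relying on them.

```python
import struct
from typing import Iterator, List

SAMPLE_RATE = 16_000   # Hz, as stated in the README
FRAME_SAMPLES = 512    # hypothetical frame size (32 ms at 16 kHz)


def pcm16_to_float(pcm: bytes) -> List[float]:
    """Decode little-endian signed 16-bit mono PCM into floats in [-1.0, 1.0)."""
    count = len(pcm) // 2  # ignore a trailing odd byte, if any
    ints = struct.unpack(f"<{count}h", pcm[:count * 2])
    return [s / 32768.0 for s in ints]


def frames(samples: List[float],
           size: int = FRAME_SAMPLES) -> Iterator[List[float]]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]
```

One second of audio at this rate is 16,000 samples, which yields 31 full frames of 512 samples with the remainder discarded.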

User Interface

Command-line interface (CLI) with colored prompts

Voice commands (listen, stop, repeat)

Console commands for status, language switching, and result retrieval
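
The console commands could be handled by a small dispatcher that mutates shared state, which is also one way to switch languages at runtime without a restart. The following is a sketch under stated assumptions: the command spellings (status, lang, result), the ModuleState fields, and the default language are hypothetical and may differ from the actual CLI.

```python
from dataclasses import dataclass, field
from typing import List

SUPPORTED_LANGUAGES = ("ru", "en")  # Russian and English, per the README


@dataclass
class ModuleState:
    """Minimal state a console front end might track (hypothetical)."""
    language: str = "ru"      # assumed default
    listening: bool = False
    results: List[str] = field(default_factory=list)


def handle_command(state: ModuleState, line: str) -> str:
    """Dispatch one console line and return the reply to print."""
    parts = line.strip().lower().split()
    if not parts:
        return ""
    name, args = parts[0], parts[1:]
    if name == "status":
        mode = "listening" if state.listening else "idle"
        return f"language={state.language} mode={mode}"
    if name == "lang":
        if len(args) != 1 or args[0] not in SUPPORTED_LANGUAGES:
            return "usage: lang <ru|en>"
        state.language = args[0]  # switched in place, no restart
        return f"language set to {state.language}"
    if name == "result":
        return state.results[-1] if state.results else "no result yet"
    return f"unknown command: {name}"
```

Returning the reply as a string instead of printing it keeps the dispatcher easy to test and leaves colored output to the caller.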

Start

The executable file is located at source\build\start.exe (Windows x86 build).

Team FGLπ, Peter the Great Case Championship 2026

About

This repository contains the FGLπ team's implementation of a TTS/ASR system for a single-board computer, developed for the Peter the Great Case Championship.
