A Python toolkit for performing various audio processing tasks, such as:
- Removing silent audio files from a folder.
- Trimming silence from a single large audio file.
- Slicing audio files into smaller segments.
- Trimming audio to the first 30 seconds.
- Batch Remove Silent Files:
  - Scan a folder of audio files and move files below a specified volume threshold to a subfolder.
  - Script: batch_remove_silent_files.py
  - Usage:

        python scripts/batch_remove_silent_files.py /path/to/input/folder --threshold 0.02
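
The scan-and-move behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the code in batch_remove_silent_files.py: it assumes 16-bit PCM WAV input, assumes the threshold is compared against an RMS value normalised to 0..1, and uses a hypothetical subfolder name `silent`. The function names `wav_rms` and `move_silent_files` are made up for this example.

```python
import array
import math
import shutil
import sys
import wave
from pathlib import Path


def wav_rms(path):
    """Return the RMS of a 16-bit PCM WAV file, normalised to 0..1."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: only 16-bit PCM is handled in this sketch")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def move_silent_files(folder, threshold=0.02, subfolder="silent"):
    """Move every .wav file whose RMS is below `threshold` into folder/subfolder."""
    folder = Path(folder)
    target = folder / subfolder  # hypothetical destination name
    moved = []
    for path in sorted(folder.glob("*.wav")):
        if wav_rms(path) < threshold:
            target.mkdir(exist_ok=True)
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

Calling `move_silent_files("/path/to/input/folder", threshold=0.02)` returns the names of the files that were moved.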
- Trim Silence From Large Audio:
  - Remove silent segments from a single large audio file, generating a continuous output.
  - Script: remove_silence_from_audio.py
  - Usage:

        python scripts/remove_silence_from_audio.py input.wav output.wav --silence_thresh -40 --min_silence_len 300 --padding 200
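
One way to picture what the three options control is the pure-Python sketch below. It is an assumption-laden illustration, not the script's implementation: it treats `silence_thresh` as a level in dBFS, `min_silence_len` and `padding` as milliseconds, works on a plain list of float samples in -1..1, and measures level in fixed windows whose size (`window_ms`) is a hypothetical parameter introduced here.

```python
import math


def remove_silence(samples, sample_rate, silence_thresh=-40.0,
                   min_silence_len=300, padding=200, window_ms=10):
    """Drop silent stretches from `samples` and return the remaining audio."""
    win = max(1, sample_rate * window_ms // 1000)
    min_windows = max(1, min_silence_len // window_ms)
    pad = sample_rate * padding // 1000

    # 1. Mark each window as silent when its RMS level is below the threshold.
    silent = []
    for start in range(0, len(samples), win):
        chunk = samples[start:start + win]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        level_db = 20 * math.log10(rms) if rms > 0 else -math.inf
        silent.append(level_db < silence_thresh)

    # 2. Silent runs shorter than min_silence_len are kept as sound.
    i = 0
    while i < len(silent):
        if not silent[i]:
            i += 1
            continue
        j = i
        while j < len(silent) and silent[j]:
            j += 1
        if j - i < min_windows:
            silent[i:j] = [False] * (j - i)
        i = j

    # 3. Pad each non-silent window, merge overlapping ranges, concatenate.
    ranges = []
    for k, is_silent in enumerate(silent):
        if is_silent:
            continue
        lo = max(0, k * win - pad)
        hi = min(len(samples), (k + 1) * win + pad)
        if ranges and lo <= ranges[-1][1]:
            ranges[-1][1] = max(ranges[-1][1], hi)
        else:
            ranges.append([lo, hi])
    out = []
    for lo, hi in ranges:
        out.extend(samples[lo:hi])
    return out
```

With the values from the usage line, a pause must last at least 300 ms at below -40 dBFS to be cut, and 200 ms of it is kept on each side so that the remaining audio does not start or end abruptly.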
- Slice Audio Into Segments:
  - Slice a large audio file into smaller segments based on specified time intervals or silence thresholds.
  - Script: slice_audio.py
  - Usage:

        python scripts/slice_audio.py /path/to/audio/file --slice_duration_ms 2000 --fade_duration_ms 50
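
A fixed-interval slice with fades can be sketched as follows. This is a rough illustration only: it assumes the fade is a linear fade-in and fade-out applied to every segment, works on a plain list of float samples, and does not cover the silence-based slicing mode. The function name `slice_with_fades` is hypothetical.

```python
def slice_with_fades(samples, sample_rate, slice_duration_ms=2000,
                     fade_duration_ms=50):
    """Split `samples` into fixed-length segments with linear fades at both ends."""
    slice_len = sample_rate * slice_duration_ms // 1000
    fade_len = sample_rate * fade_duration_ms // 1000
    if slice_len <= 0:
        raise ValueError("slice_duration_ms is too short for this sample rate")

    segments = []
    for start in range(0, len(samples), slice_len):
        seg = list(samples[start:start + slice_len])
        # Never let the two fades overlap in a short final segment.
        n = min(fade_len, len(seg) // 2)
        for i in range(n):
            gain = i / n
            seg[i] *= gain       # fade in
            seg[-1 - i] *= gain  # fade out
        segments.append(seg)
    return segments
```

The short fades avoid audible clicks at the cut points; the last segment is simply shorter when the file length is not a multiple of the slice duration.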
- Trim Audio to First $n$ Seconds:
  - Extract the first $n$ seconds of an audio file and save it as a new file.
  - Script: trim_audio.py
  - Usage:

        python scripts/trim_audio.py input.wav output.wav
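
For WAV input, extracting the first $n$ seconds needs nothing beyond the standard `wave` module. The sketch below is an illustration under that assumption and is not taken from trim_audio.py; the function name `trim_wav` is hypothetical and the default of 30 seconds follows the feature list at the top.

```python
import wave


def trim_wav(input_path, output_path, seconds=30):
    """Copy the first `seconds` seconds of a WAV file to `output_path`.

    Returns the number of frames written. Files shorter than `seconds`
    are copied in full.
    """
    with wave.open(str(input_path), "rb") as src:
        params = src.getparams()
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)
    with wave.open(str(output_path), "wb") as dst:
        dst.setparams(params)  # the frame count in the header is fixed up on write
        dst.writeframes(frames)
    return n_frames
```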
- Clone the repository:

      git clone https://github.com/yourusername/audio-toolkit.git
      cd audio-toolkit
- Create a virtual environment:

      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies:

      pip install -r requirements.txt
RMS (Root Mean Square):
- The most common way to measure loudness is to calculate the RMS of the audio signal: the square root of the average power (the mean of the squared amplitudes) of the waveform over its duration. RMS gives a single value representing the "overall" loudness of the file.
The formula for calculating RMS is:
$$\text{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x[i]^2}$$
Where:
- $x[i]$ is the audio amplitude at sample $i$,
- $N$ is the total number of samples.
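
In code, RMS is the square root of the mean of the squared samples. The helper below is a minimal sketch on a plain list of amplitudes; the name `rms` and the choice to return 0.0 for empty input are assumptions made for this example.

```python
import math


def rms(samples):
    """Root mean square of a sequence of sample amplitudes."""
    if not samples:
        return 0.0  # assumption: treat an empty signal as silent
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```

A signal that alternates between +0.5 and -0.5 has an RMS of 0.5, since every squared sample equals 0.25.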
Limitation: RMS averages the loudness over the entire file. If a 2-second file is silent for 1.9 seconds and loud only in the last 0.1 seconds, the loud portion raises the RMS, so the single value may not reflect that most of the file is perceived as silent.
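
As a worked example of this limitation, suppose the loud 0.1 seconds of a 2-second file has an RMS level of $a$ on its own and the other 1.9 seconds are exactly zero. Only $0.1 / 2 = 5\%$ of the samples contribute, so:

$$\text{RMS}_{\text{file}} = \sqrt{\frac{0.1}{2}\, a^2} = a\sqrt{0.05} \approx 0.224\, a$$

The values of $a$ here are hypothetical, and the comparison assumes the 0.02 threshold from the usage example is applied to this whole-file RMS. With $a = 0.5$ the file scores about 0.112 and is kept although 95% of it is silent; with $a = 0.08$ it scores about 0.018 and is treated as silent although it contains an audible burst.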