The speechRecognitionTranscriber module uses the Google Speech API from Python to perform speech recognition and convert speech to text. It accepts both video and audio files. It uses ffmpeg to convert the input to .wav so that the Google Speech API can recognize it, and it splits the audio into fragments based on silence.
Documentation is available in docs.
speechRecognitionTranscriber requires a video or audio file as input.
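The pipeline described above (convert with ffmpeg, split on silence, recognize each fragment) can be sketched roughly as follows. This is a minimal illustration and not the module's actual code: the function names, the mono 16 kHz conversion settings, the silence thresholds, the fragment file names and the language code are all assumptions chosen for the example. It relies on pydub's split_on_silence and SpeechRecognition's recognize_google, and it needs ffmpeg on the PATH.

```python
import os
import subprocess


def build_ffmpeg_command(source_path, wav_path="convertedFile.wav"):
    """Return the ffmpeg argument list that converts an input file to WAV.

    Mono / 16 kHz are assumed values, not settings taken from the module.
    """
    return ["ffmpeg", "-y", "-i", source_path, "-vn",
            "-ac", "1", "-ar", "16000", wav_path]


def convert_to_wav(source_path, wav_path="convertedFile.wav"):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(source_path, wav_path), check=True)
    return wav_path


def split_into_fragments(wav_path, out_dir="fragments"):
    """Split the WAV on silence and export each fragment to out_dir."""
    # Imported here so the rest of the sketch loads without pydub installed.
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    audio = AudioSegment.from_wav(wav_path)
    chunks = split_on_silence(
        audio,
        min_silence_len=500,             # assumed: 0.5 s of quiet ends a fragment
        silence_thresh=audio.dBFS - 16,  # assumed: 16 dB below average loudness
        keep_silence=250,                # keep some padding around the speech
    )
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, chunk in enumerate(chunks):
        path = os.path.join(out_dir, "fragment{:04d}.wav".format(index))
        chunk.export(path, format="wav")
        paths.append(path)
    return paths


def transcribe_fragments(paths, language="en-US"):
    """Send each fragment to the Google Web Speech API and join the results."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    for path in paths:
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source)
        try:
            pieces.append(recognizer.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            # Nothing intelligible in this fragment; skip it.
            continue
    return " ".join(pieces)
```

With these helpers, `transcribe_fragments(split_into_fragments(convert_to_wav("yourfile.extension")))` would return the transcript as a single string. A network or quota problem surfaces as `speech_recognition.RequestError`, which the sketch deliberately leaves unhandled.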
To run the program:
- Execute programs/speechRecognitionTranscriber.py to start the program.
    python speechRecognitionTranscriber.py
- Enter your file path.
    yourfile.extension
NOTE:
- Transcribed text is saved in transcribedText.txt.
- Transcribed text is saved in transcribedText.pdf.
- Audio fragments are saved in /fragments.
- Converted source is saved as convertedFile.wav.
Temporary files such as convertedFile.wav and /fragments are deleted when the program ends.
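As a rough illustration of the output and cleanup behaviour listed above, the sketch below writes a transcript to transcribedText.txt and transcribedText.pdf and then removes the temporary files. The function names are hypothetical, and the font and cell sizes are arbitrary. The PDF step assumes the classic fpdf (PyFPDF) interface, whose built-in fonts are limited to Latin-1 text, so other characters are replaced before writing.

```python
import os
import shutil


def save_text(text, path="transcribedText.txt"):
    """Write the transcript as a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def save_pdf(text, path="transcribedText.pdf"):
    """Write the transcript as a one-column PDF using fpdf."""
    # Imported here so the other helpers work without fpdf installed.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", size=12)
    # Replace characters the built-in font cannot encode (assumed limitation).
    safe_text = text.encode("latin-1", "replace").decode("latin-1")
    pdf.multi_cell(0, 10, safe_text)  # width 0 = extend to the right margin
    pdf.output(path)
    return path


def remove_temporary_files(wav_path="convertedFile.wav", fragments_dir="fragments"):
    """Delete the converted WAV and the fragments directory if they exist."""
    if os.path.isfile(wav_path):
        os.remove(wav_path)
    shutil.rmtree(fragments_dir, ignore_errors=True)
```

Calling `remove_temporary_files()` from a `try ... finally` block around the transcription would keep the cleanup from being skipped when recognition fails partway through.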
speechRecognitionTranscriber requires:
- Install pip
- Install SpeechRecognition:
    pip install SpeechRecognition
- Install fpdf:
    pip install fpdf
- Install pydub:
    pip install pydub
- Install ffmpeg:
    Linux:
    sudo apt-get install ffmpeg
    Microsoft Windows: download the binaries and add their location to the PATH system variable.
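Before running the program, it can help to confirm that everything in the list above is actually installed. The following check is a small sketch and not part of the module: `find_missing` is a hypothetical helper, and the import names (speech_recognition, fpdf, pydub) are the ones these packages normally expose.

```python
import importlib.util
import shutil

REQUIRED_MODULES = ["speech_recognition", "fpdf", "pydub"]
REQUIRED_PROGRAMS = ["ffmpeg"]


def find_missing(modules=REQUIRED_MODULES, programs=REQUIRED_PROGRAMS):
    """Return the names of Python modules and executables that cannot be found."""
    missing = [name for name in modules
               if importlib.util.find_spec(name) is None]
    # shutil.which searches the PATH, as the shell would.
    missing += [name for name in programs if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All requirements found.")
```

On Windows, a missing ffmpeg entry in this report usually means the folder containing the ffmpeg binaries has not been added to PATH.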
Tested on: Windows 10, Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, Lubuntu 18.04 and Raspbian.