Speakr Logo

Speakr

Self-hosted, intelligent note-taking for meetings and recordings

AGPL v3 Docker Build

Speakr is an intelligent, self-hosted web application that transforms your audio recordings into organized, searchable, and insightful notes. By running on your own server, it ensures your sensitive conversations and data remain completely private.

Designed for a wide range of uses, Speakr is trusted by professionals for meeting minutes, by therapists for session notes, by students for lecture capture, and even for transcribing D&D sessions. It automatically transcribes audio with speaker identification, generates concise summaries, and provides an AI chat interface to interact with your content.

Speakr Main Interface

What's New?

Latest Release (Version 0.4.2)

  • Large File Chunking Support: Automatically splits large audio files to work with transcription services that have file size limits (e.g., OpenAI's 25MB limit). Review release notes for instructions or keep reading below.
  • Optimized File Processing: Improved efficiency by minimizing file conversions and using compressed formats.
  • Enhanced Security: Strengthened CSRF protection and fixed session timeout issues.
  • Improved Recording Reliability: Addressed several bugs related to in-browser recording.
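The chunking feature above can be pictured with a little arithmetic: given a file size and the configured chunk size (CHUNK_SIZE_MB, shown later with a default of 20), the number of pieces is just a ceiling division. This is an illustrative sketch only; Speakr's actual splitting logic (which must also respect audio frame boundaries) may differ.

```python
import math

def chunk_count(file_size_bytes: int, chunk_size_mb: int = 20) -> int:
    """Number of chunks needed so each piece stays under the chunk size.

    Illustrative only -- real audio splitting also has to cut at valid
    frame boundaries, so actual chunk sizes will vary slightly.
    """
    chunk_bytes = chunk_size_mb * 1024 * 1024
    return max(1, math.ceil(file_size_bytes / chunk_bytes))

# A 100 MB recording with the default 20 MB chunk size:
print(chunk_count(100 * 1024 * 1024))  # → 5
```

Keeping chunks at 20 MB leaves headroom under limits like OpenAI's 25 MB cap, so container overhead or re-encoding variance does not push a chunk over the line.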
Previous Version History

Version 0.4.1

  • Secure Sharing System: Share transcriptions via public links with customizable permissions.
  • Enhanced Recording & Note-Taking: Redesigned recording interface with a real-time notepad.
  • Advanced Speaker Diarization: AI-powered speaker detection and saved speaker profiles.
  • "Black Hole" Directory: Automatic, hands-free processing of audio files dropped into a monitored directory.
  • Transcript Editing: Manually edit and correct transcriptions.
  • Clickable Timestamps: Navigate audio by clicking timestamps in the transcript.
  • Streaming Chat Responses: More interactive and responsive AI chat.

Screenshots

  • Main interface (multilingual)
  • Transcription & chat: Transcription and Chat · Integrated Chat
  • Light & dark: Light Mode · Dark Mode
  • Transcription views: Simple View · Bubble View
  • Speaker identification: AI-assisted · Manual & Auto · Saved Suggestions
  • Recordings & notes: Recording Options · Mic/System Audio · Mic + System Audio

Core Features

  • Self-Hosted and Private: Keep full control over your data by hosting Speakr on your own server.
  • Advanced Transcription & Diarization: Get accurate transcripts with optional AI-powered speaker identification (diarization) to know who said what.
  • AI-Powered Insights: Automatically generate titles and summaries for your recordings. Use the integrated chat to ask questions and pull insights directly from the transcript.
  • Install as a PWA App: Install on your phone for quick and easy recordings and note capture.
  • Versatile Recording & Upload: Upload existing audio files or record directly in the browser or PWA app. Capture audio from your microphone, your system's audio (e.g., for an online meeting), or both simultaneously.
  • Automated Processing: Designate a "black hole" directory for drag-and-drop batch processing of audio files.
  • Secure Sharing: Create shareable links for your transcripts with granular controls, allowing you to include or exclude summaries and notes.
  • Customizable AI: Configure the specific AI models, API endpoints (compatible with OpenAI, OpenRouter, local models), and custom prompts for summarization and chat.
  • Multi-User Support: Includes a complete user management system with an admin dashboard.

Getting Started

The recommended setup method uses Docker, which is simple and fast.

Easy Setup: Docker Compose (Recommended)

You only need Docker installed for this method; you do not need to clone the repository.

  1. Create docker-compose.yml: Create a file named docker-compose.yml and add the following content:

    services:
      app:
        image: learnedmachine/speakr:latest
        container_name: speakr
        restart: unless-stopped
        ports:
          - "8899:8899"
        env_file:
          - .env
        volumes:
          - ./uploads:/data/uploads
          - ./instance:/data/instance
  2. Create a Configuration (.env) File: Create a file named .env in the same directory. Your configuration will depend on whether you need speaker identification (diarization).

    • Option A: Standard Whisper API (No Speaker Diarization). This is the simplest method and works with any OpenAI Whisper-compatible API (such as OpenAI, OpenRouter, or local models).

      # --- Text Generation Model (uses /chat/completions endpoint) ---
      TEXT_MODEL_BASE_URL=https://openrouter.ai/api/v1
      TEXT_MODEL_API_KEY=your_openrouter_api_key
      TEXT_MODEL_NAME=openai/gpt-4o-mini
      
      # --- Transcription Service (uses /audio/transcriptions endpoint) ---
      TRANSCRIPTION_BASE_URL=https://api.openai.com/v1
      TRANSCRIPTION_API_KEY=your_openai_api_key
      WHISPER_MODEL=whisper-1
      
      # --- Large File Chunking (for endpoints with file size limits) ---
      ENABLE_CHUNKING=true
      CHUNK_SIZE_MB=20
      
      # --- Application Settings ---
      ALLOW_REGISTRATION=false
      ADMIN_USERNAME=admin
      ADMIN_EMAIL=admin@example.com
      ADMIN_PASSWORD=changeme
      
      # --- Docker Settings ---
      SQLALCHEMY_DATABASE_URI=sqlite:////data/instance/transcriptions.db
      UPLOAD_FOLDER=/data/uploads
    • Option B: ASR Webservice (With Speaker Diarization). This method enables speaker identification but requires running a separate ASR webservice container. See the Advanced Configuration section below for details on setting up the ASR service.

      # --- Text Generation Model (uses /chat/completions endpoint) ---
      TEXT_MODEL_BASE_URL=https://openrouter.ai/api/v1
      TEXT_MODEL_API_KEY=your_openrouter_api_key
      TEXT_MODEL_NAME=openai/gpt-4o-mini
      
      # --- Transcription Service (uses /asr endpoint) ---
      USE_ASR_ENDPOINT=true
      ASR_BASE_URL=http://your_asr_host:9000  # URL of your running ASR webservice
      ASR_DIARIZE=true
      ASR_MIN_SPEAKERS=1
      ASR_MAX_SPEAKERS=5
      
      # --- Application Settings ---
      ALLOW_REGISTRATION=false
      ADMIN_USERNAME=admin
      ADMIN_EMAIL=admin@example.com
      ADMIN_PASSWORD=changeme
      
      # --- Docker Settings ---
      SQLALCHEMY_DATABASE_URI=sqlite:////data/instance/transcriptions.db
      UPLOAD_FOLDER=/data/uploads
  3. Start the Application: After editing your .env file with your API keys and settings, run the following command:

    docker compose up -d

    Access the application at http://localhost:8899. The admin user will be created on the first run.
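Before bringing the stack up, it can save a restart cycle to sanity-check the .env file. The sketch below is not part of Speakr; it is a minimal stdlib-only helper that parses simple KEY=value lines and reports which keys are missing. The required-key set matches Option A above (standard Whisper API); adjust it for Option B.

```python
# Minimal sketch: sanity-check a .env file before `docker compose up`.
# The REQUIRED_KEYS set mirrors Option A above; not part of Speakr itself.

REQUIRED_KEYS = {
    "TEXT_MODEL_BASE_URL", "TEXT_MODEL_API_KEY", "TEXT_MODEL_NAME",
    "TRANSCRIPTION_BASE_URL", "TRANSCRIPTION_API_KEY", "WHISPER_MODEL",
}

def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(text: str) -> set:
    """Return the required keys that the .env text does not define."""
    return REQUIRED_KEYS - parse_env(text).keys()

sample = "TEXT_MODEL_BASE_URL=https://openrouter.ai/api/v1\n# comment\n"
print(sorted(missing_keys(sample)))  # everything except the base URL
```

This only checks for presence, not validity; the application itself is the final arbiter of whether the values work.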

Advanced Setup: Build from Source

Follow these steps if you want to modify the code or build the Docker image yourself.

  1. Clone the Repository:

    git clone https://github.com/murtaza-nasir/speakr.git
    cd speakr
  2. Create Configuration Files: Copy the example files. Use env.whisper.example for the standard API method or env.asr.example for the ASR webservice method.

    cp docker-compose.example.yml docker-compose.yml
    cp env.whisper.example .env # Or cp env.asr.example .env

    Edit the .env file with your custom settings and API keys.

  3. Build and Start:

    docker compose up -d --build

Usage Guide

  1. Login: Access the application (e.g., http://localhost:8899) and log in. The admin account is created from the .env variables on the first launch.
  2. Set Preferences (Recommended): Navigate to your Account page to set your default language, customize the AI summarization prompt, and add professional context to improve chat results.
  3. Add a Recording:
    • Upload: Drag and drop an audio file onto the dashboard or use the New Recording page.
    • Record: Use the in-browser recorder. You can record your mic, system audio, or both. Note: To capture system audio (e.g., from a meeting), you must share a browser tab or your entire screen and ensure the "Share audio" checkbox is enabled.
    • Automated: If enabled, simply drop files into the monitored "black hole" directory.
  4. Interact with Your Transcript:
    • From the gallery, click a recording to view its details.
    • Read the transcript, listen to the audio, and review the AI-generated summary.
    • Edit metadata like titles and participants.
    • Use the Chat panel to ask questions about the content.
  5. Identify Speakers (Diarization):
    • If you used the ASR method with diarization enabled, click the Identify Speakers button.
    • In the modal, assign names to the detected speakers (e.g., SPEAKER 00, SPEAKER 01). You can use the Auto Identify feature to let the AI suggest names based on the conversation.
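Conceptually, assigning names replaces the generic diarization labels (SPEAKER 00, SPEAKER 01, ...) with the names you provide. The sketch below illustrates that idea on a plain-text transcript; Speakr's real implementation works on structured transcript data inside the modal, and the rename_speakers helper here is hypothetical.

```python
import re

def rename_speakers(transcript: str, names: dict) -> str:
    """Replace diarization labels like 'SPEAKER 00' with assigned names.

    Hypothetical helper illustrating the Identify Speakers step;
    labels without an assigned name are left unchanged.
    """
    def sub(match):
        return names.get(match.group(0), match.group(0))
    return re.sub(r"SPEAKER \d{2}", sub, transcript)

raw = "SPEAKER 00: Shall we start?\nSPEAKER 01: Yes."
print(rename_speakers(raw, {"SPEAKER 00": "Alice", "SPEAKER 01": "Bob"}))
# → Alice: Shall we start?
#   Bob: Yes.
```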

Advanced Configuration & Technical Details

For detailed deployment instructions and information about the various APIs used, see the Deployment Guide.

The recommended method is to use the pre-built Docker image, which is fast and simple. This is explained above.

Automated File Processing

Speakr includes a powerful "black hole" directory monitoring feature that automatically processes audio files without manual uploads. This is perfect for batch processing scenarios where you want to drop files into a directory and have them automatically transcribed.

How It Works

  1. File Monitoring: Speakr monitors a designated directory for new audio files
  2. Automatic Detection: When new audio files are detected, they are automatically queued for processing
  3. File Stability Check: Files are checked for stability (not being written to) before processing
  4. Automatic Processing: Files are moved to the uploads directory and processed using your configured transcription settings
  5. Database Integration: Processed recordings appear in your gallery with the title "Auto-processed - [filename]"
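The "file stability check" in step 3 can be sketched as follows: a file still being copied into the black-hole directory keeps growing, so the monitor waits for several consecutive size readings to match before queueing it. This is an illustrative stdlib-only sketch, assuming a size-polling approach; Speakr's actual monitor may use different timings or mechanisms.

```python
import os
import time

def wait_until_stable(path: str, interval: float = 1.0, checks: int = 3) -> None:
    """Block until the file's size stops changing.

    Polls the file size every `interval` seconds and returns once
    `checks` consecutive readings are identical. Illustrative only.
    """
    stable = 0
    last_size = -1
    while stable < checks:
        size = os.path.getsize(path)
        if size == last_size:
            stable += 1
        else:
            stable = 0
            last_size = size
        time.sleep(interval)
```

A size-based poll is simple and portable, but it cannot distinguish a finished copy from a paused one, which is why a few consecutive identical readings (rather than one) are required.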

For detailed instructions on setting this up, see the Deployment Guide.


License

This project is dual-licensed:

  1. GNU Affero General Public License v3.0 (AGPLv3) License: AGPL v3

    Speakr is offered under the AGPLv3 as its open-source license. You are free to use, modify, and distribute this software under the terms of the AGPLv3. A key condition of the AGPLv3 is that if you run a modified version on a network server and provide access to it for others, you must also make the source code of your modified version available to those users under the AGPLv3.

    • You must create a file named LICENSE (or COPYING) in the root of your repository and paste the full text of the GNU AGPLv3 license into it.
    • Read the full license text carefully to understand your rights and obligations.
  2. Commercial License

    For users or organizations who cannot or do not wish to comply with the terms of the AGPLv3 (for example, if you want to integrate Speakr into a proprietary commercial product or service without being obligated to share your modifications under AGPLv3), a separate commercial license is available.

    Please contact the Speakr maintainers for details on obtaining a commercial license.

You must choose one of these licenses under which to use, modify, or distribute this software. If you are using or distributing the software without a commercial license agreement, you must adhere to the terms of the AGPLv3.

Roadmap

Speakr is in active development. Planned features include a faster way to switch transcription languages on the fly.

Contributing

Feedback, bug reports, and feature suggestions are highly encouraged! Please open an issue on the GitHub repository to share your thoughts.

Note on Code Contributions: Should the project begin formally accepting external code contributions, a Contributor License Agreement (CLA) will be required.
