thorwebdev/telegram-gemini-voice-bot

Telegram Gemini Voice Bot 🤖🎙️

A Telegram bot powered by Gemini 3.1 Flash Lite for text & voice reasoning, with optional Gemini TTS voice responses. Deploy on Cloud Run with scale-to-zero.

Features

  • 💬 Text messages → AI-powered text responses
  • 🎤 Voice notes → Audio understanding + text responses
  • 🔊 Voice replies → Optional TTS responses via Gemini
  • 🔀 Modes → Switch between Agent, Transcribe, and Translate
  • Fast → Flash Lite model for low-latency responses
  • 🚀 Cloud Run ready → Webhook mode with auto-scaling

Setup

1. Create a Telegram Bot

  1. Open Telegram and message @BotFather
  2. Send /newbot and follow the prompts to pick a name and username
  3. Copy the bot token (e.g., 123456:ABC-DEF...) — this is your TELEGRAM_BOT_TOKEN

Note: BotFather is only for creating the bot and getting the token. You do not set the webhook URL in BotFather — the bot code handles that automatically via the Telegram API.

2. Get a Google AI API Key

  1. Go to Google AI Studio
  2. Create an API key — this is your GOOGLE_API_KEY

3. Install Dependencies

pip install -r requirements.txt

You also need ffmpeg installed for audio conversion:

  • macOS: brew install ffmpeg
  • Linux (Debian/Ubuntu): sudo apt-get install ffmpeg
  • Docker: Already included in the Dockerfile
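As a rough illustration of the audio conversion step, the sketch below shells out to ffmpeg from Python. Telegram voice notes arrive as OGG/Opus files. The target format (mono 16 kHz, container chosen by the output extension) and the function names build_ffmpeg_command and convert_audio are assumptions made for this example, not necessarily what bot.py does.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts src to mono audio at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file, e.g. an OGG/Opus voice note
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        str(dst),                 # output container is inferred from the extension
    ]


def convert_audio(src: Path, dst: Path) -> Path:
    """Run ffmpeg and raise if it is missing or the conversion fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Calling convert_audio(Path("note.ogg"), Path("note.wav")) would write note.wav next to the input, provided ffmpeg is installed.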

Local Development

Polling Mode (simplest)

No public URL needed — the bot polls Telegram for updates:

cp .env.example .env
# Edit .env: set TELEGRAM_BOT_TOKEN and GOOGLE_API_KEY
# Leave WEBHOOK_URL empty

python bot.py

Webhook Mode (with ngrok)

ngrok http 8080
# Copy the https URL (e.g., https://abc123.ngrok.io)

# Set WEBHOOK_URL=https://abc123.ngrok.io in .env
python bot.py

Deploy to Cloud Run

Step 1: Configure gcloud and Enable APIs

gcloud init --skip-diagnostics

gcloud services enable secretmanager.googleapis.com
gcloud services enable cloudbuild.googleapis.com

Step 2: Store Secrets

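# Match only the assignment line and keep everything after the first "="
echo -n "$(grep '^TELEGRAM_BOT_TOKEN=' .env | cut -d '=' -f2-)" | \
  gcloud secrets create TELEGRAM_BOT_TOKEN --data-file=-
echo -n "$(grep '^GOOGLE_API_KEY=' .env | cut -d '=' -f2-)" | \
  gcloud secrets create GOOGLE_API_KEY --data-file=-
# Telegram only allows A-Z, a-z, 0-9, _ and - in the webhook secret, so use hex rather than base64
echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create TELEGRAM_SECRET_TOKEN --data-file=-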
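To show what TELEGRAM_SECRET_TOKEN is for: when a webhook is registered with a secret, Telegram sends it back in the X-Telegram-Bot-Api-Secret-Token header of every update, and the secret may only contain A-Z, a-z, 0-9, _ and - (1 to 256 characters). The following is a minimal sketch of generating and checking such a token; the helper names are hypothetical and the real bot may rely on its framework for this check.

```python
import hmac
import re
import secrets
from typing import Mapping

# Character set and length limits Telegram documents for the webhook secret.
_ALLOWED = re.compile(r"[A-Za-z0-9_-]{1,256}")


def generate_secret_token(nbytes: int = 32) -> str:
    """Return a random token made only of URL-safe characters."""
    return secrets.token_urlsafe(nbytes)


def is_valid_secret_token(token: str) -> bool:
    """Check the token against Telegram's allowed characters and length."""
    return _ALLOWED.fullmatch(token) is not None


def is_authentic_update(headers: Mapping[str, str], expected: str) -> bool:
    """Compare the secret header with the expected value in constant time."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    received = lowered.get("x-telegram-bot-api-secret-token", "")
    return hmac.compare_digest(received.encode(), expected.encode())
```

A webhook handler would typically answer with HTTP 403 when is_authentic_update returns False, so that requests not originating from Telegram are dropped.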

Step 3: Grant IAM Permissions

Cloud Run source deploys use the default Compute Engine service account, which needs additional roles:

PROJECT_ID=$(gcloud config get-value project)
PROJECT_NUMBER=$(gcloud projects describe "$PROJECT_ID" \
  --format='value(projectNumber)')
MEMBER="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com"

# Cloud Build + Storage (for building the container)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="$MEMBER" \
  --role="roles/cloudbuild.builds.builder"

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="$MEMBER" \
  --role="roles/storage.objectViewer"

# Secret Manager (for accessing secrets at runtime)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="$MEMBER" \
  --role="roles/secretmanager.secretAccessor"

Step 4: Deploy

gcloud run deploy telegram-gemini-bot \
  --source . \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets="TELEGRAM_BOT_TOKEN=TELEGRAM_BOT_TOKEN:latest,GOOGLE_API_KEY=GOOGLE_API_KEY:latest,TELEGRAM_SECRET_TOKEN=TELEGRAM_SECRET_TOKEN:latest" \
  --no-cpu-throttling

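The bot detects Cloud Run via the K_SERVICE environment variable and starts listening on port 8080 even when WEBHOOK_URL is not set. Cloud Run only marks a revision as healthy once the container listens on its port, so this lets the first deploy succeed before the service URL is known.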
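The startup decision described here, together with the polling mode from Local Development, could look roughly like the sketch below. The function name select_run_mode and the three mode labels are made up for illustration; only the WEBHOOK_URL and K_SERVICE variables come from this README.

```python
from typing import Mapping


def select_run_mode(env: Mapping[str, str]) -> str:
    """Pick a run mode from environment variables.

    Returns one of:
      "webhook"     - WEBHOOK_URL is set, so register it and serve HTTP
      "listen-only" - on Cloud Run without a URL yet, serve HTTP anyway
      "polling"     - local development, poll Telegram for updates
    """
    if env.get("WEBHOOK_URL", "").strip():
        return "webhook"
    if env.get("K_SERVICE"):  # set automatically by Cloud Run
        return "listen-only"
    return "polling"
```

In a real entry point this would be called as select_run_mode(os.environ); taking the mapping as a parameter keeps the function easy to test.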

The deploy output will show the service URL, e.g.:

Service URL: https://telegram-gemini-bot-abc123-uc.a.run.app

Step 5: Set the Webhook URL

Update the service with the URL from Step 4. This rolls out a new revision, and the bot registers the webhook on startup so Telegram knows where to send messages:

gcloud run services update telegram-gemini-bot \
  --region us-central1 \
  --update-env-vars="WEBHOOK_URL=https://telegram-gemini-bot-abc123-uc.a.run.app"

Troubleshooting

Error | Fix
Build failed... default service account is missing required IAM permissions | Grant roles/cloudbuild.builds.builder and roles/storage.objectViewer (Step 3)
Permission denied on secret | Grant roles/secretmanager.secretAccessor (Step 3)
API not enabled | Run gcloud services enable <api> or say Y when prompted
Voice replies are slow or delayed | Use --no-cpu-throttling to keep CPU active after the initial response

How Webhooks Work

  • The bot calls the Telegram setWebhook API automatically on startup
  • Telegram then sends all updates as POST requests to {WEBHOOK_URL}/webhook
  • If you redeploy with a new URL, the bot re-registers and overwrites the old webhook
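The registration step in the list above can be sketched with the standard library alone. The setWebhook method and its url and secret_token parameters are part of the Telegram Bot API; the function name and the way the /webhook path is appended are assumptions based on this README rather than the actual code in bot.py.

```python
import json
import urllib.request
from typing import Optional


def build_set_webhook_request(
    bot_token: str, webhook_url: str, secret_token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a setWebhook call for the Telegram Bot API."""
    # Telegram will POST updates to {WEBHOOK_URL}/webhook.
    payload = {"url": webhook_url.rstrip("/") + "/webhook"}
    if secret_token:
        # Echoed back by Telegram in the X-Telegram-Bot-Api-Secret-Token header.
        payload["secret_token"] = secret_token
    return urllib.request.Request(
        f"https://api.telegram.org/bot{bot_token}/setWebhook",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would perform the registration. Calling it again with a different URL replaces the previous webhook, which matches the redeploy behaviour described above.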

Manual webhook management (usually not needed):

# Check current webhook
curl "https://api.telegram.org/bot<TOKEN>/getWebhookInfo"

# Set webhook manually
curl "https://api.telegram.org/bot<TOKEN>/setWebhook?url=https://your-url.run.app/webhook"

# Remove webhook (switch to polling)
curl "https://api.telegram.org/bot<TOKEN>/deleteWebhook"

Environment Variables

Variable | Required | Description
TELEGRAM_BOT_TOKEN | Yes | Bot token from @BotFather
GOOGLE_API_KEY | Yes | Google AI API key
TELEGRAM_SECRET_TOKEN | No | Secret for secure webhooks (recommended)
WEBHOOK_URL | No | Public URL for webhooks (omit for polling mode)
PORT | No | HTTP port (default: 8080)
VOICE_ENABLED | No | Default voice response state (default: true)
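One way to read these variables into a typed configuration is sketched below. It assumes that only TELEGRAM_BOT_TOKEN and GOOGLE_API_KEY are mandatory, as the setup steps suggest, and that VOICE_ENABLED accepts common truthy spellings; the BotConfig class and load_config function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BotConfig:
    telegram_bot_token: str
    google_api_key: str
    telegram_secret_token: Optional[str]
    webhook_url: Optional[str]
    port: int
    voice_enabled: bool


def load_config(env: Mapping[str, str]) -> BotConfig:
    """Validate required variables and apply the documented defaults."""
    missing = [n for n in ("TELEGRAM_BOT_TOKEN", "GOOGLE_API_KEY") if not env.get(n)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    # Assumption: these spellings count as "true"; anything else is false.
    voice = env.get("VOICE_ENABLED", "true").strip().lower() in {"1", "true", "yes", "on"}
    return BotConfig(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        google_api_key=env["GOOGLE_API_KEY"],
        telegram_secret_token=env.get("TELEGRAM_SECRET_TOKEN") or None,
        webhook_url=env.get("WEBHOOK_URL") or None,
        port=int(env.get("PORT", "8080")),
        voice_enabled=voice,
    )
```

Failing fast on missing credentials surfaces a misconfigured deploy in the startup logs instead of on the first incoming message.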

Bot Commands

Command | Description
/start | Welcome message & capabilities
/mode | Switch mode: inline keyboard with Agent, Transcribe, Translate
/voice on | Enable voice responses (Gemini TTS)
/voice off | Disable voice responses
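To make the command syntax concrete, here is a small sketch of how an incoming message could be split into a command and its argument. The helpers parse_command and apply_voice_command are illustrative only; the bot most likely delegates this to its Telegram library.

```python
from typing import Optional, Tuple


def parse_command(text: str) -> Optional[Tuple[str, str]]:
    """Split "/voice on" into ("voice", "on"); return None for non-commands."""
    text = text.strip()
    if not text.startswith("/"):
        return None
    head, _, arg = text.partition(" ")
    # Drop the leading "/" and an optional "@BotName" suffix used in group chats.
    name = head[1:].split("@", 1)[0].lower()
    return name, arg.strip().lower()


def apply_voice_command(arg: str, current: bool) -> bool:
    """Return the new voice setting; unknown arguments leave it unchanged."""
    if arg == "on":
        return True
    if arg == "off":
        return False
    return current
```

For example, parse_command("/voice off") yields ("voice", "off"), and feeding "off" to apply_voice_command turns voice replies off regardless of the previous state.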

Architecture

User ↔ Telegram API ↔ Cloud Run (bot.py)
                           ├─→ Gemini 3.1 Flash Lite (reasoning)
                           └─→ Gemini 3.1 Flash TTS (voice synthesis)
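The two-branch flow in the diagram can be expressed as a small function with the model calls injected as plain callables. This is only a sketch of the control flow: reason and synthesize stand in for the Gemini reasoning and TTS calls, and their signatures are assumptions, not the real client API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Reply:
    text: str
    audio: Optional[bytes] = None  # set only when voice replies are enabled


def handle_message(
    user_input: Union[str, bytes],
    reason: Callable[[Union[str, bytes]], str],
    synthesize: Callable[[str], bytes],
    voice_enabled: bool,
) -> Reply:
    """Run reasoning first, then optionally synthesize the answer as speech."""
    text = reason(user_input)  # text message or voice-note bytes in, text out
    audio = synthesize(text) if voice_enabled else None
    return Reply(text=text, audio=audio)
```

Keeping the two model calls behind callables means the flow can be exercised with stubs, without network access or API keys.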
