A Python-based personal voice assistant powered by Google Gemini AI. Jarvis listens for a wake word, understands natural language commands, and performs real-world actions on your machine — all through voice.
- 🎤 Wake word detection — Say "Jarvis" to activate
- 🌐 Browser control — Open Google, YouTube, LinkedIn, new tabs, close tabs
- 🔍 Voice search — Search the web hands-free
- 🎵 Music playback — Play songs from your personal music library
- 📰 Live news — Fetches and reads top headlines via NewsAPI
- ⏯️ Playback control — Pause and resume media
- 🤖 AI fallback — Powered by Google Gemini for anything not covered by built-in commands
- 🖥️ Desktop UI — Clean Tkinter interface with live conversation transcript and status indicator
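The wake-word behaviour in the list above can be illustrated with a small text-level check. This is only a sketch: it assumes the speech recogniser has already returned a transcript string, and the function name `extract_command` is hypothetical, not taken from the project's code.

```python
import re

WAKE_WORD = "jarvis"  # wake word named in the feature list


def extract_command(transcript, wake_word=WAKE_WORD):
    """Return the words spoken after the wake word, or None if it was not heard.

    Hypothetical helper: operates on an already-transcribed string only.
    """
    # Lowercase and split into words, dropping punctuation such as commas.
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if wake_word not in words:
        return None
    index = words.index(wake_word)
    return " ".join(words[index + 1:])
```

A transcript containing only the wake word yields an empty string, which a caller could treat as "activated, waiting for the next command".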
| Technology | Purpose |
|---|---|
| Python | Core language |
| speech_recognition | Voice input & transcription |
| pyttsx3 | Text-to-speech output |
| webbrowser | Web browser controller |
| pyautogui | Browser automation (tabs, search) |
| tkinter | Desktop UI |
| Google Gemini API | AI responses for unknown commands |
| NewsAPI | Fetching live news headlines |
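As an illustration of the NewsAPI row in the table above, here is a minimal sketch of fetching headline titles with only the standard library. The function names, the `country="us"` default, and the timeout are assumptions made for this example; the project's own code may call the service differently.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NEWS_ENDPOINT = "https://newsapi.org/v2/top-headlines"


def build_news_url(api_key, country="us", page_size=10):
    """Build the request URL (the country default is an assumption)."""
    query = urlencode({"country": country, "pageSize": page_size, "apiKey": api_key})
    return f"{NEWS_ENDPOINT}?{query}"


def extract_titles(payload, limit=10):
    """Take the first `limit` articles and keep the non-empty titles."""
    articles = payload.get("articles", [])
    return [a["title"] for a in articles[:limit] if a.get("title")]


def fetch_headlines(api_key, country="us"):
    """Download and parse the headlines; needs network access and a valid key."""
    with urlopen(build_news_url(api_key, country), timeout=10) as response:
        return extract_titles(json.load(response))
```

Keeping URL building and parsing separate from the network call makes the logic easy to check without an API key.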
- Python 3+
- A working microphone
- Google Gemini API key (available from Google AI Studio)
- NewsAPI key (available from newsapi.org)
git clone https://github.com/aadityasingh9601/Jarvis.git
cd Jarvis
# Create a virtual env
python -m venv venv
# To activate the virtual env (Windows path shown; on macOS/Linux use: source venv/bin/activate)
venv/Scripts/activate
# To exit the virtual env
deactivate
pip install -r requirements.txt
Duplicate .env.example to .env and add your API keys:
cp .env.example .env
Open musicLibrary.py and add your songs in this format:
music = {
"song name 1": "https://youtube-link1-here",
"song name 2": "https://youtube-link2-here",
}
python main.py

| Command | Action |
|---|---|
| `Jarvis` | Wake word: activates Jarvis |
| `open google` | Opens Google in browser |
| `open youtube` | Opens YouTube in browser |
| `open linkedin` | Opens LinkedIn in browser |
| `open the browser` | Opens a new browser window |
| `open new tab` | Opens a new tab (Ctrl+T) |
| `close the tab` | Closes current tab (Ctrl+W) |
| `search <query>` | Types and searches your query |
| `scroll the page` | Scrolls current page |
| `play <song>` | Plays a song from music library |
| `pause` | Pauses media |
| `resume` | Resumes media |
| `fetch news` | Reads top 10 news headlines |
| `good bye` | Shuts down Jarvis |
| anything else | Handled by Gemini AI |
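The command table above amounts to a lookup with an AI fallback. The sketch below shows one way such routing could work for a subset of the commands. It is not the project's actual `main.py`: the names `SITES`, `resolve_command`, and `handle` are hypothetical, the `search` branch builds a Google search URL instead of typing with pyautogui, and the `ask_ai` default is a stand-in for the Gemini call.

```python
import webbrowser
from urllib.parse import quote_plus

# Sites covered by the "open ..." commands in the table.
SITES = {
    "google": "https://www.google.com",
    "youtube": "https://www.youtube.com",
    "linkedin": "https://www.linkedin.com",
}

# Placeholder entry in the musicLibrary.py format.
MUSIC = {"song name 1": "https://youtube-link1-here"}


def resolve_command(command, music=MUSIC):
    """Return ("open", url) for a known command, else ("ai", original text)."""
    text = command.lower().strip()
    if text.startswith("open "):
        site = text[len("open "):]
        if site in SITES:
            return ("open", SITES[site])
    elif text.startswith("play "):
        song = text[len("play "):]
        if song in music:
            return ("open", music[song])
    elif text.startswith("search "):
        query = quote_plus(text[len("search "):])
        return ("open", "https://www.google.com/search?q=" + query)
    # Anything not matched above goes to the AI fallback.
    return ("ai", command)


def handle(command, open_url=webbrowser.open,
           ask_ai=lambda prompt: f"(AI reply to: {prompt})"):
    """Act on a command; open_url and ask_ai are injectable for testing."""
    kind, value = resolve_command(command)
    if kind == "open":
        open_url(value)
        return f"Opening {value}"
    return ask_ai(value)
```

For example, `handle("open youtube")` would open YouTube in the default browser, while an unmatched phrase is passed to `ask_ai` unchanged.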
Jarvis/
├── main.py # Core logic — commands, speech, Jarvis loop with visual interface
├── llmHandler.py # Google Gemini AI integration
├── musicLibrary.py # Personal music library (song → URL map)
├── cursorPosition.py # Utility to find screen coordinates
├── .env.example # Template for API keys (copy to .env)
├── requirements.txt
├── .gitignore
└── README.md
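The `.env` file created from `.env.example` holds the API keys. As a rough sketch of how such a file can be read using only the standard library (the project may well use a package such as python-dotenv instead), consider the following. The function names and the variable names `GEMINI_API_KEY` and `NEWS_API_KEY` used in the note below are assumptions for illustration, not confirmed by the repository.

```python
import os


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path=".env"):
    """Read the file and export its keys without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

After `load_env()`, a key could be read with `os.environ.get("GEMINI_API_KEY")`, assuming that is the name used in your `.env`.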
- GitHub: https://github.com/aadityasingh9601
- LinkedIn: https://www.linkedin.com/in/aadityasingh999
- X: https://x.com/AadityaSingh771
- Portfolio: https://aadityasingh.dev
MIT
