5 changes: 5 additions & 0 deletions .env.example
@@ -0,0 +1,5 @@
GEMINI_API_KEY=your_gemini_api_key
PEXELS_API_KEY=your_pexels_api_key

# Optional: override the Gemini model used by modules/brain.py
GEMINI_MODEL=gemini-2.0-flash
57 changes: 21 additions & 36 deletions README.md
@@ -3,14 +3,14 @@
![Views](https://komarev.com/ghpvc/?username=SaarD00-AI-Youtube-Shorts-Generator&style=for-the-badge&color=blue)


**AutoShorts AI** is a fully automated Python pipeline that creates viral-style "Faceless" YouTube Shorts and TikToks from a single topic. It handles the entire production chain: researching, scriptwriting, voiceover generation, stock footage sourcing, and advanced video editing with transitions and avatar injection.
**AutoShorts AI** is a Python pipeline that creates viral-style "Faceless" YouTube Shorts and TikToks from a topic. It handles the production chain: AI topic/script generation, voiceover generation, stock footage sourcing, and FFmpeg editing with transitions and avatar injection.

---

## ✨ Key Features

- **🧠 Intelligent Scriptwriting:** Uses **Google Gemini 2.0 Flash** to write engaging, "Edutainment" style scripts (Vox/Kurzgesagt style) with strict storytelling structures (Hook → Context → Mechanism → Twist).
- **🗣️ Human-Like Voiceovers:** Integrated with **Suno Bark** (via Google Colab/Ngrok) for high-quality, expressive AI narration. Includes "Influencer Mode" for dynamic intonation.
- **🗣️ Voiceovers:** Generates narration with `edge-tts`.
- **🎞️ Dual-Visual System:** Automatically searches and downloads **two distinct stock videos** per scene from **Pexels**, creating a dynamic "A/B Split" visual style to maximize viewer retention.
- **✂️ Advanced FFmpeg Editing:**
- **Smart Trimming:** Syncs video perfectly to audio duration.
@@ -38,12 +38,11 @@ Automated-YT-Shorts-AI/
├── modules/ # Core Logic Modules
│ ├── brain.py # AI Scriptwriter (Gemini)
│ ├── audio.py # Voice Generator (Bark Client)
│ ├── audio.py # Voice generator (edge-tts)
│ ├── asset_manager.py # Pexels Downloader (Dual-Visual logic)
│ └── composer.py # FFmpeg Video Editor (Stitching & Transitions)
├── main.py # Entry point (Orchestrator)
├── test_audio.py # Diagnostic tool for Bark connection
└── requirements.txt # Python dependencies

```
@@ -62,7 +61,7 @@ Automated-YT-Shorts-AI/

- **Google Gemini API Key** (Free tier available).
- **Pexels API Key** (Free).
- **Ngrok Auth Token** (If running Bark on Colab).
- No Ngrok token is required for the default voiceover path. The current pipeline uses `edge-tts`.

---

@@ -83,49 +82,35 @@ pip install -r requirements.txt

```

_(If `requirements.txt` is missing, install manually: `pip install google-generativeai requests ffmpeg-python mutagen colorama`)_

### 3. Environment Setup

Create the required folders and add your avatar:

1. Create folder: `assets/avatar`
2. Place your avatar video inside and name it: `Professional_Girl_Animation_Video_Generation.mp4`
2. Place your avatar video inside and name it: `avatars.mp4`

### 4. Configure API Keys

You can set them in your environment variables or hardcode them (temporarily) in the modules:

- `modules/brain.py` → `genai.configure(api_key="YOUR_GEMINI_KEY")`
- `modules/asset_manager.py` → `self.api_key = "YOUR_PEXELS_KEY"`
- `modules/audio.py` → Update `raw_url` with your active Ngrok/Colab link.

---

## 🎮 How to Run

### Step 1: Start the Audio Server (Bark)
Copy `.env.example` to `.env` and fill in your key:

Since Bark requires a GPU, we run it on Google Colab.
```bash
cp .env.example .env
```

1. Open the **Colab Notebook** provided for this project.
2. Paste your Ngrok Token.
3. Run the cell.
4. Copy the `https://xxxx.ngrok-free.app` URL.
5. Paste this URL into `modules/audio.py` inside the `AudioEngine` class.
Required:

### Step 2: Test Connection (Optional)
- `GEMINI_API_KEY` for script generation
- `PEXELS_API_KEY` for stock video search/download

Run the test script to ensure your local machine can talk to the Cloud GPU.
Optional:

```bash
python test_audio.py
- `GEMINI_MODEL` to override the default `gemini-2.0-flash` model

```
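The configuration contract above (two required keys, one optional model override) can be sketched as a single fail-fast loader. This is a minimal sketch, not code from the repository: `Settings` and `load_settings` are hypothetical names, and it assumes `load_dotenv()` has already copied `.env` into the process environment.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    """Hypothetical container for the three variables in .env.example."""
    gemini_api_key: str
    pexels_api_key: str
    gemini_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the pipeline settings, raising early if a required key is missing."""
    env = os.environ if env is None else env
    required = ("GEMINI_API_KEY", "PEXELS_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"{', '.join(missing)} not set. Create a .env file or set the "
            "environment variable before running."
        )
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        pexels_api_key=env["PEXELS_API_KEY"],
        # An unset or empty GEMINI_MODEL falls back to the documented default.
        gemini_model=env.get("GEMINI_MODEL") or "gemini-2.0-flash",
    )
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.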
---

_If you see `✅ SUCCESS`, you are ready._
## 🎮 How to Run

### Step 3: Generate Video
### Generate Video

Run the main script:

@@ -150,8 +135,8 @@ python main.py
### `audio.py` (The Voice)

- **Input:** Text script.
- **Logic:** Sends text to the Colab server. Includes a "Confidence" setting (`text_temp=0.7`) to make the voice sound like an influencer.
- **Post-Processing:** Uses FFmpeg to trim silence and boost volume (2x).
- **Logic:** Generates MP3 voice clips with `edge-tts`.
- **Post-Processing:** Reads durations with `mutagen` so scenes can be synced to audio length.
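The `edge-tts` plus `mutagen` flow described above might look roughly like the sketch below. It is an illustration under assumptions, not the code in `modules/audio.py`: `synthesize_clip` and `scene_windows` are hypothetical helpers, the voice name is just one of the voices `edge-tts` ships, and the third-party imports are deferred so the timing helper runs without them.

```python
import asyncio
from itertools import accumulate
from typing import List, Tuple


def synthesize_clip(text: str, out_path: str, voice: str = "en-US-AriaNeural") -> float:
    """Write `text` to an MP3 with edge-tts and return its length in seconds."""
    # Imported lazily so scene_windows() below works without these packages.
    import edge_tts
    from mutagen.mp3 import MP3

    # Communicate.save() is a coroutine, so drive it with asyncio.run().
    asyncio.run(edge_tts.Communicate(text, voice).save(out_path))
    return MP3(out_path).info.length


def scene_windows(durations: List[float]) -> List[Tuple[float, float]]:
    """Turn per-scene audio lengths into (start, end) windows on the timeline."""
    ends = list(accumulate(durations))
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))
```

With windows in hand, each scene's footage can be trimmed to `end - start` seconds so video and narration stay aligned.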

### `asset_manager.py` (The Librarian)

@@ -177,11 +162,11 @@ python main.py

**Q: "Avatar file missing" error.**

- **Fix:** Altough not needed, Ensure your folder structure is exactly `assets/avatar/avatar.mp4`.
- **Fix:** Ensure your folder structure is exactly `assets/avatar/avatars.mp4`.

**Q: The audio is silent or fails.**

- **Fix:** Your Ngrok tunnel likely expired. Restart the Colab cell and update the URL in `audio.py`.
- **Fix:** Check your internet connection and that `edge-tts` is installed from `requirements.txt`.

**Q: FFmpeg error "Exec format error" or "not found".**

7 changes: 5 additions & 2 deletions modules/asset_manager.py
@@ -1,11 +1,14 @@
import os
import requests
import random
from dotenv import load_dotenv

class AssetManager:
    def __init__(self):
        # Your original API Key
        self.api_key = "hZBjjYowDAauyvn9rioK5qYMHFdCq11rKnmWo4OQlXhZspsVuo2DkpCP"
        load_dotenv()
        self.api_key = os.getenv("PEXELS_API_KEY")
        if not self.api_key:
            raise RuntimeError("PEXELS_API_KEY is not set. Create a .env file or set the environment variable before running.")
        self.base_url = "https://api.pexels.com/videos/search"
        self.headers = {
            "Authorization": self.api_key
15 changes: 11 additions & 4 deletions modules/brain.py
@@ -3,8 +3,13 @@
from google import genai
from dotenv import load_dotenv

# Load API Key
client = genai.Client(api_key="api-key")
load_dotenv()

def _get_client():
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set. Create a .env file or set the environment variable before running.")
    return genai.Client(api_key=api_key)

class ContentBrain:
    def get_trending_topic(self):
@@ -13,7 +18,8 @@ def get_trending_topic(self):
        For now, we ask Gemini to pick a viral niche topic.
        """
        prompts = "Give me 1 specific, viral, and engaging topic for a Short Documentary. It should be a 'Engaging Did you know' fact or a 'Fun/intriguing Engaging News'. return ONLY the topic name."
        response = client.models.generate_content(model='gemini-3-flash-preview', contents=prompts)
        client = _get_client()
        response = client.models.generate_content(model=os.getenv('GEMINI_MODEL', 'gemini-2.0-flash'), contents=prompts)
        topic = response.text.strip()
        print(f"🎯 Selected Topic: {topic}")
        return topic
@@ -99,7 +105,8 @@ def generate_script(self, topic):
# """


response = client.models.generate_content(model='gemini-3-flash-preview', contents=prompt)
client = _get_client()
response = client.models.generate_content(model=os.getenv('GEMINI_MODEL', 'gemini-2.0-flash'), contents=prompt)

# Clean the response to ensure it's valid JSON (sometimes AI adds markdown)
clean_text = response.text.replace('```json', '').replace('```', '').strip()
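The cleanup step above strips every triple-backtick sequence, which would also alter a script whose JSON string values happen to contain backticks. A slightly stricter variant is sketched here as one possible alternative, not as the PR's code: `parse_model_json` is a hypothetical helper that removes only an outer Markdown fence before parsing.

```python
import json
import re

# Matches one outer fence (with an optional language tag) around the payload.
_FENCE = re.compile(r"^```[a-zA-Z]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def parse_model_json(raw: str):
    """Parse model output as JSON, unwrapping a surrounding Markdown fence if present."""
    text = raw.strip()
    match = _FENCE.match(text)
    if match:
        text = match.group(1)
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed output.
    return json.loads(text)
```

Callers would still need to handle the `ValueError` raised when the model returns something that is not JSON at all.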
6 changes: 6 additions & 0 deletions requirements.txt
@@ -0,0 +1,6 @@
edge-tts>=7.0.0
google-genai>=1.0.0
python-dotenv>=1.0.1
requests>=2.31.0
ffmpeg-python>=0.2.0
mutagen>=1.47.0