This project leverages Generative AI (GenAI) to perform two tasks:
- Style Blog Post: creates a detailed blog post in English from a YouTube video (title, description, and transcript) that may be in Spanish, Portuguese, or English.
- Style Slack Post: creates a Slack post in English from a YouTube video (title, description, and transcript) that may be in Spanish, Portuguese, or English.
This serves as a base to expedite content generation, and it is suggested to add a more personal/human tone to the results.
The application is fully containerized with Docker for easy setup and deployment.
Before starting, ensure you have the following installed:
- Docker and Docker Compose.
- A Google Cloud service account with access to the YouTube Data API.
Make sure the following files are available in your project directory:
- `.env`: contains your API key.
- `client_secrets.json`: configured for your Google Cloud service account.
First, you will need to use `docker-compose-ollamadockerized.yml` instead of this repo's `docker-compose.yml`.
To ensure the correct models are available in your Docker container, follow these steps:
- Start the Docker container with Ollama running:

  ```bash
  docker-compose up
  ```

- Download the required models. You can pull multiple models, such as deepseek-r1 and llama3.1, using the following commands:
```bash
docker exec -it ollama ollama pull deepseek-r1
docker exec -it ollama ollama pull llama3.1
```
This will download the specified models into the Ollama container.
Once your Docker container is running, you can verify the models that are installed in the Ollama container with the following command:
```bash
docker exec -it ollama ollama list
```

This will list all the models available in your Ollama container, and you should see something similar to:
```text
NAME                  ID       SIZE    MODIFIED
deepseek-r1:latest    0a8cX    4.7 GB  4 minutes ago
llama3.1:latest       46e0X    4.9 GB  2 hours ago
```
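Besides the CLI, Ollama exposes a REST API (on port 11434 by default). As a minimal sketch, assuming the default port mapping, you could also verify the installed models programmatically by querying `/api/tags` and reading the `models` array:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama API address


def list_installed_models(tags_json: str) -> list[str]:
    """Extract model names from an Ollama /api/tags JSON response."""
    data = json.loads(tags_json)
    return [m["name"] for m in data.get("models", [])]


def fetch_tags() -> str:
    """Query the running Ollama instance for its installed models."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        return resp.read().decode()


# Abridged example of the response shape returned by /api/tags:
sample = '{"models": [{"name": "deepseek-r1:latest"}, {"name": "llama3.1:latest"}]}'
print(list_installed_models(sample))  # ['deepseek-r1:latest', 'llama3.1:latest']
```

With the container running, `list_installed_models(fetch_tags())` should report the same models as `ollama list`.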
In the root directory of the project, create a `.env` file with the following variables:

```bash
YOUTUBE_API_KEY=<Your_YouTube_Data_API_Key>
OLLAMA_MODEL=<model_for_ollama>

# Optional: only needed if you want to use OpenAI
OPENAI_MODEL=<model_for_openai>
OPENAI_API_KEY=<Your_OpenAI_API_Key>
```

- `YOUTUBE_API_KEY`: your actual YouTube Data API key from the Google Cloud Console.
- `OLLAMA_MODEL`: the LLM you want to use with Ollama.
- `OPENAI_MODEL` (optional): the LLM you want to use with OpenAI.
- `OPENAI_API_KEY` (optional): your actual API key from OpenAI.

Alternatively, rename the provided `.env-template` file to `.env` and fill in the values.
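If you need to read these variables outside Docker (for example, in a quick helper script), the parsing can be sketched as below. The project itself may rely on a library such as python-dotenv, so treat this minimal parser as illustrative only:

```python
def load_dotenv_minimal(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file.

    A minimal stand-in sketch; it ignores quoting and `export` syntax
    that full dotenv libraries handle.
    """
    env: dict[str, str] = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env


# Usage (assuming a .env file exists in the current directory):
# settings = load_dotenv_minimal()
# model = settings.get("OLLAMA_MODEL", "llama3.1")
```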
Ensure you have the client_secrets.json file in the root directory. If you don’t have it, follow these steps to create one:
- Log in to your Google Cloud Console.
- Navigate to APIs & Services > Credentials.
- Create a Service Account and download the credentials file.
- Save the file as `client_secrets.json` in the root of your project.
An example client_secrets.json file format is shown below:
```json
{
  "web": {
    "client_id": "",
    "project_id": "YOUR_PROJECT_ID",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_secret": "YOUR_CLIENT_SECRET"
  }
}
```

If you want to customize the prompt used by the /blog_post and /slack_post endpoints, you can update the prompt.yaml file located in the root directory of the project. This file allows you to define the structure and content of the prompt(s).
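As a purely hypothetical illustration (the key names below are assumptions, not the actual schema — check the prompt.yaml shipped in this repo before editing), such a file might look like:

```yaml
# Hypothetical sketch only -- the real prompt.yaml in this repo
# defines the actual key names.
blog_post:
  system: You are a technical writer who turns video transcripts into blog posts.
  template: |
    Write a detailed blog post in English based on this video.
    Title: {title}
    Description: {description}
    Transcript: {transcript}
slack_post:
  system: You are a community manager who writes short, engaging Slack posts.
```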
If you are working outside Docker, install the Python dependencies with `poetry install`.

Run the following commands to build and start the application:

```bash
# Build the Docker image
docker-compose build

# Start the application
docker-compose up
```

The application will be accessible at http://localhost:5000.
Endpoint: /blog_post/<video_id_youtube>
Method: GET
Description: Creates a blog post based on the video’s title, description, and transcript.
Endpoint: /slack_post/<video_id_youtube>
Method: GET
Description: Creates a Slack post based on the video’s title, description, and transcript.
Optional Parameter:
- use_openai: if set to `true`, the post will be generated using OpenAI. If not provided, the default is `false`, and the post will be generated using the local model.
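The endpoints above can also be called from Python. A minimal sketch using only the standard library (the `build_endpoint_url` helper and the `VIDEO_ID` placeholder are illustrative, not part of the project):

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # default address of the Flask app


def build_endpoint_url(kind: str, video_id: str, use_openai: bool = False) -> str:
    """Build the URL for the /blog_post or /slack_post endpoint."""
    url = f"{BASE_URL}/{kind}/{urllib.parse.quote(video_id)}"
    if use_openai:
        url += "?" + urllib.parse.urlencode({"use_openai": "true"})
    return url


def fetch_post(kind: str, video_id: str, use_openai: bool = False) -> str:
    """GET the generated post from the running application."""
    with urllib.request.urlopen(build_endpoint_url(kind, video_id, use_openai)) as resp:
        return resp.read().decode()


print(build_endpoint_url("blog_post", "VIDEO_ID", use_openai=True))
# http://localhost:5000/blog_post/VIDEO_ID?use_openai=true
```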
Example:
```bash
curl "http://localhost:5000/blog_post/VIDEO_ID_YOUTUBE?use_openai=true"
```

In the examples-blog-posts-generated folder, you will find two main directories: Ollama and OpenAI. Each of these directories contains subfolders with examples of blog posts generated by the application. Each example is stored in its own folder and includes the following files:
- `{youtube_video_id}.json`: contains the response of the /blog_post/ endpoint, including metadata about the video, its transcript, and the generated blog post.
- `blog_post.md`: contains the blog post generated from the video in a readable Markdown format.
These examples demonstrate the capabilities of the application in generating blog posts from YouTube videos using different models.
```text
examples-blog-posts-generated/
├── Ollama/
│   └── <youtube_video_id>/
│       ├── <youtube_video_id>.json
│       └── blog_post.md
└── OpenAI/
    └── <youtube_video_id>/
        ├── <youtube_video_id>.json
        └── blog_post.md
```
When using OpenAI via API, the /blog_post endpoint executes in under 10 seconds, delivering consistent and accurate results.
However, when using DeepSeek-R1 or Llama 3.1 models with Ollama, performance can vary depending on the setup. Initially, I experienced very slow response times (over 5 minutes for transcripts of around 1 hour), but this was due to running the Ollama Docker image, which did not utilize my Mac M3’s GPU.
- Execution Time: local models such as DeepSeek-R1 and Llama 3.1 require more computational resources, and performance depends on how they are run. When running Ollama natively on macOS, it leverages Apple’s Metal API, significantly improving speed. When using the Dockerized Ollama image, however, performance is much slower because Docker cannot access the Mac’s GPU.
- Recommended Setup: if you are on a Mac with an M-series chip, it is best to run Ollama natively and have the Flask app communicate with it via API. If you still want to run Ollama inside Docker, use the `docker-compose-ollamadockerized.yml` file available in this repo.
Models process text in units called "tokens." Longer video transcripts increase the token count, which can impact processing time. Performance varies depending on the model and hardware—local models may experience slower processing or incomplete results if they lack sufficient computational resources.
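A common rule of thumb for English text is roughly four characters per token, which gives a quick back-of-the-envelope estimate of a transcript's size. The sketch below uses that heuristic; the 4-characters-per-token ratio is an approximation, not any model's actual tokenizer:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: ~4 characters per token for English text.

    This is a heuristic, not a real tokenizer; actual counts vary by model.
    """
    return max(1, round(len(text) / chars_per_token))


# A one-hour talk at ~150 words per minute is roughly 9,000 words,
# or on the order of 50,000 characters of transcript:
transcript_chars = 50_000
print(estimate_tokens("x" * transcript_chars))  # 12500
```

An hour-long transcript therefore lands in the tens of thousands of tokens, which is why long videos stress local models far more than five-minute clips.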
For video transcripts of about 5 minutes, all models perform well.
For transcripts up to 1 hour, OpenAI (via API) and models run with Ollama (when executed natively on macOS) both provide reliable results in a reasonable time, with OpenAI performing slightly better in my subjective evaluation. However, I did not conduct formal benchmarking. For longer videos or when using local models inside Docker, performance may vary, and further testing or adjustments may be necessary.