🚀 GPT-2 Inference Server (Google Colab)

In this project, I serve the Hugging Face GPT-2 model for inference with FastAPI, tested on Google Colab.

Run this notebook to:
- Load a GPT-2 model using the Hugging Face `transformers` library
- Expose a `/generate` endpoint for single-prompt inference
- Expose a `/batch_generate` endpoint for multi-prompt batch inference
- Test the FastAPI server over a public ngrok URL
- Build your own prototypes or demos on top of it
- Experiment with quantization, batching, and latency
🧪 Client-End Testing (Google Colab)

Use this notebook to:
- Send test requests to your public FastAPI endpoint
- Measure tokens generated and latency
- Make both single-prompt and batch inference calls
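A client helper along these lines can time each request and give a rough token count. The ngrok URL is a placeholder, and the response field names match the assumptions in the server sketch, so adjust both to your actual deployment:

```python
# Hypothetical client-side helpers for latency and token measurement.
import time

import requests

API_URL = "https://<your-ngrok-subdomain>.ngrok.io"  # placeholder, replace


def timed_post(url: str, payload: dict, post=requests.post):
    """POST a JSON payload and return (response_json, latency_seconds)."""
    start = time.perf_counter()
    resp = post(url, json=payload)
    latency = time.perf_counter() - start
    resp.raise_for_status()
    return resp.json(), latency


def count_new_tokens(prompt: str, generated: str) -> int:
    # Rough whitespace-based count of text beyond the prompt; for exact
    # numbers the server could return counts from its own tokenizer.
    return len(generated[len(prompt):].split())


# Example usage (requires the server to be running):
# data, latency = timed_post(f"{API_URL}/generate", {"prompt": "Hello"})
# print(latency, count_new_tokens("Hello", data["generated_text"]))
```

Injecting the `post` callable keeps the timing logic testable without a live server.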
Built with:
- ⚡ FastAPI – High-performance Python web framework for building APIs
- 🤗 Hugging Face Transformers – State-of-the-art NLP models
- 🧠 Google Colab – Free cloud notebooks with GPU support
- 🌐 ngrok – Public URLs for localhost APIs