SCaLE 23x: A Practical Guide to Training a Small Language Model: Tokenizers, Training, and Real-World Pitfalls
Welcome to the landing page for the session A Practical Guide to Training a Small Language Model: Tokenizers, Training, and Real-World Pitfalls at SCaLE 23x.
This repo intends to:
- Provide an introduction to building a Small Language Model (SLM) from scratch
- Provide a guide to fine-tuning and quantization
- Serve as an introduction to other language models
The SLM training and fine-tuning projects require a GPU (H100-class or better). There is simply no getting around that.
If you don't have access to this kind of hardware, you can still download the pre-built models and run inference.

Prerequisites:
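Before starting the training demos, it is worth confirming that a suitable GPU is actually visible to the system. A minimal sketch using `nvidia-smi` (assumes the NVIDIA driver is installed; the exact memory threshold is illustrative, not a figure from this session):

```shell
# Check whether an NVIDIA GPU is visible and report its name and memory.
# If no driver is found, fall back to the inference-only path.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "No NVIDIA driver found; download the pre-built models and run inference instead."
fi
```

On a machine with an H100 this prints the GPU name and its total memory; on a laptop without an NVIDIA driver it prints the fallback message.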
- A Linux or macOS developer laptop
- Windows users should use a VM or cloud instance
- Python 3.12 or higher installed
- (Recommended) A miniconda or venv virtual environment
- Basic familiarity with shell operations
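The recommended isolated environment from the list above can be set up in a few commands. A minimal sketch using `venv` (conda works similarly; the environment name `slm-env` is a hypothetical choice, not one mandated by the demos):

```shell
# Create an isolated Python environment for the demos.
# Assumes python3 (3.12+) is on PATH.
python3 -m venv slm-env          # slm-env is an illustrative name
source slm-env/bin/activate      # activate it for this shell session
python --version                 # confirm this reports 3.12 or higher
pip install --upgrade pip        # keep pip current before installing deps
```

Each demo's own folder lists the packages to `pip install` once the environment is active.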
There are three separate demo projects.
The instructions and purpose of each demo are contained within its respective folder.