5,300 practical techniques to cut your LLM token usage by up to 90% — without sacrificing output quality.
LLMs bill by the token. Bloated prompts, verbose outputs, and redundant context quietly drain your budget. This repo collects 5,300 actionable, real-world skills — 300 core principles plus 5,000 context-specific applications across languages, data formats, providers, frameworks, RAG, agents, security, and ops.
ℹ️ Honest note: there aren't 5,300 fundamentally distinct techniques. The 300 core principles are the foundation; the other 5,000 apply those principles to concrete, named contexts (specific languages, platforms, tools, providers, data types, industries, and tasks) so each entry is directly actionable rather than abstract.
If these skills save you money, consider supporting the work. Every donation keeps this list growing and free for everyone. 🙏
| File | Skills | Topics |
|---|---|---|
| Core Principles | #1–300 | The 15 foundational categories |
| Programming Languages & Multilingual Efficiency | #301–684 | Per-language code tips + multilingual tokenizer efficiency |
| Data, Formats, Databases & Preprocessing | #685–1063 | JSON/CSV/XML/PDF…, databases, preprocessing, chunking |
| Providers, Model Selection, Parameters & Tuning | #1064–1355 | OpenAI/Anthropic/Gemini…, API params, fine-tuning, serving |
| Tasks, Roles, Use Cases & Analytics | #1356–2008 | Task types, job roles, industry use cases, analytics |
| Apps & Frameworks | #2009–2368 | Slack/Gmail/Jira…, LangChain/LlamaIndex… |
| RAG, Retrieval, Embeddings & Multimodal | #2369–2784 | Vector DBs, RAG docs, embeddings, multimodal |
| Prompting, Architecture, Caching & Ops | #2785–3300 | Prompt parts, pipelines, caching, monitoring, anti-patterns |
| More Languages & Cloud/AI Platforms | #3301–3760 | Niche languages + cloud/AI platforms |
| More Apps, Industries & Job Functions | #3761–4408 | More SaaS apps, industry verticals, job functions |
| LLM Features, Prompting Patterns & Structured Calls | #4409–4694 | Function calling, structured output, prompting patterns, batching |
| Formats, Security, Media & Observability | #4695–5160 | Specialized formats, security/compliance, media, monitoring tools |
| Advanced Ops, FinOps & Governance | #5161–5300 | Token tooling, prompt governance, multi-agent, FinOps, edge |
| Total | 5,300 | |
- Skim the Core Principles — they cover ~90% of the savings.
- Jump to the file matching your stack (language, platform, provider, framework, RAG, etc.).
- Apply a technique, measure token counts before and after, and keep only the changes that preserve output quality.
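The before/after measurement in the last step can be sketched roughly as follows. This is a minimal illustration, not part of the repo: `approx_token_count` and `token_savings` are hypothetical helper names, and the regex-based counter is only a crude stand-in for a real tokenizer. Real counts depend on your provider's tokenizer, so pass that in as `count` when you have one.

```python
import re
from typing import Callable, Dict


def approx_token_count(text: str) -> int:
    """Rough approximation: one token per word or punctuation mark.

    Assumption: this is NOT how any real LLM tokenizer works; it only
    gives a relative signal for comparing two prompts.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def token_savings(
    before: str,
    after: str,
    count: Callable[[str], int] = approx_token_count,
) -> Dict[str, float]:
    """Compare token counts of the original and the trimmed prompt."""
    n_before = count(before)
    n_after = count(after)
    saved = n_before - n_after
    # Guard against division by zero for an empty original prompt.
    percent = 100.0 * saved / n_before if n_before else 0.0
    return {
        "before": n_before,
        "after": n_after,
        "saved": saved,
        "percent_saved": round(percent, 1),
    }


verbose = (
    "Please could you kindly provide me with a short summary "
    "of the following text, thank you very much:"
)
concise = "Summarize:"

report = token_savings(verbose, concise)
print(report)
```

A lower token count is only half of the check: run both prompts against the same inputs and compare the outputs before keeping the shorter one.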
Found a token-saving trick that isn't here? Open a PR or issue. Keep entries concise: a bold title and a one-line explanation.
Released under the MIT License — free to use, share, and adapt.