AI/ML Engineer · CTO at ClarioScope AI · From Bangladesh
Train small language models (SLMs) from scratch · Fine-tune larger ones with QLoRA · Ship production AI products
- 🧠 Training small language models from scratch — the ORCH series (350M–3B) for Next.js code generation, and MedLLM-10M for medical applications
- 🔧 Fine-tuning larger base models with QLoRA — ORCH-7B is a 4-bit fine-tune of DeepSeek Coder 6.7B
- 🎯 Building benchmark-grade specialist SLMs — clarioscope-intent-deberta-v1 matches frontier LLMs within 4 pp of accuracy at 22× lower latency and $0/inference (dev.to writeup)
- 🏥 Leading engineering at ClarioScope AI — HIPAA-compliant healthcare practice growth platform
- 🚀 Shipping production AI products — BeautyCrew AI, VETR Proposal, CommonRoom AI
- 📚 Open-source everything — all model weights, configs, and tokenizers are public on Hugging Face
A three-model intake intelligence pipeline for healthcare practices. Each model is small, specialized, and benchmarked head-to-head against frontier APIs. Suite-level writeup: Three small models for healthcare intake — and what shipping all three taught me.
| Model | Task | Size | Headline result | Speed vs frontier | Cost / 1K | Links |
|---|---|---|---|---|---|---|
| clarioscope-intent-deberta-v1 | 7-class intent classification | 184M | 91.16% accuracy (within 4 pp of Claude Haiku) | 22× faster | $0 | 🤗 · 📝 |
| clarioscope-phi-deberta-v1 | 18-category HIPAA PHI span detection | 125M | Macro F1 0.63 (triples frontier on LOC, ties on NAME/DATE/PHONE/IP/AGE) | 45× faster | $0 | 🤗 · 📝 |
| clarioscope-insurance-v1 | 12-field insurance / billing extraction | 125M | Macro F1 0.79 (ties GPT-4o on SUBSCRIBER_NAME, within 5–13 pp on the four highest-volume fields) | 26× faster | $0 | 🤗 · 📝 |
Total cost to build all three: ~$16 in OpenAI + RunPod + benchmark API spend. Total infrastructure: Hugging Face (free) + RunPod spot pods (a few cents per run).
The recurring pattern across all three: small specialized models don't replace frontier APIs — they're stage one of a hybrid pipeline that does the bulk-volume work cheaply, then defers a small fraction of hard cases to a frontier API. All three model cards include honest per-entity / per-class breakdowns showing where the small model wins and loses.
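The hybrid pattern described above can be sketched as confidence-based routing. This is a minimal illustration under stated assumptions, not the ClarioScope implementation: the `route` function, the `Routed` type, the 0.85 threshold, the toy models, and the `booking` / `billing` labels are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Routed:
    label: str
    confidence: float  # the small model's top score, kept for auditing
    source: str  # "small" or "frontier"


def route(
    text: str,
    small_model: Callable[[str], Sequence[tuple[str, float]]],
    frontier_model: Callable[[str], str],
    threshold: float = 0.85,  # hypothetical cutoff; tune it on held-out data
) -> Routed:
    """Keep the small model's answer when it is confident, else defer."""
    label, confidence = max(small_model(text), key=lambda pair: pair[1])
    if confidence >= threshold:
        return Routed(label, confidence, "small")
    # Hard case: pay for one frontier call instead of trusting a weak guess.
    return Routed(frontier_model(text), confidence, "frontier")


# Stand-in models for illustration only; neither is a real classifier.
def toy_small_model(text: str) -> list[tuple[str, float]]:
    if "appointment" in text.lower():
        return [("booking", 0.97), ("billing", 0.03)]
    return [("booking", 0.55), ("billing", 0.45)]


def toy_frontier_model(text: str) -> str:
    return "billing"


if __name__ == "__main__":
    for message in ["I need an appointment on Friday", "Question about my statement"]:
        print(route(message, toy_small_model, toy_frontier_model))
```

In a setup like this, the threshold controls the trade-off: raising it sends more traffic to the frontier API (higher cost, fewer small-model errors), while lowering it keeps more volume on the free local model.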
All published openly on 🤗 Hugging Face. Configs and tokenizers included.
- A 3 billion parameter decoder-only transformer trained from scratch for full-stack Next.js code generation.
- QLoRA fine-tune of DeepSeek Coder 6.7B Instruct, specialized for autonomous Next.js generation.
- Compact code-gen model trained from scratch on consumer hardware (RTX 3060 12GB). Benchmark (ORCH-ProjectBench): 76.6 overall · 95.3 code parse · 93.3 format
- GPT-2 style language model trained from scratch on medical literature.
Also published: ORCH Next.js 350M v2 (287M, from scratch with 16k vocab).
| Product | What it does |
|---|---|
| ClarioScope AI | HIPAA-compliant healthcare practice growth platform (CTO) |
| BeautyCrew AI | Booking management for the beauty industry — prevents missed appointments |
| VETR Proposal | AI-assisted federal contracting co-pilot for small business teams |
| CommonRoom AI | Collaborative digital workspace — 15 group-coordination tools, no install |
| ORCH Studio | Generate complete Next.js apps from natural language (powered by ORCH-7B) |
- orch-ai — Hugging Face org for the ORCH code-generation model family
- clarioscope-ai — ClarioScope AI's Hugging Face org
- Configs, tokenizers, and training details are public on every model card
📫 Get in touch: raihan@clarioscope.ai · Portfolio




