AI/ML Engineer · CTO at ClarioScope AI · From Bangladesh
Train small language models (SLMs) from scratch · Fine-tune larger ones with QLoRA · Ship production AI products
- Training small language models from scratch: the ORCH series (350M–3B) for Next.js code generation, and MedLLM-10M for medical applications
- Fine-tuning larger base models with QLoRA: ORCH-7B is a 4-bit fine-tune of DeepSeek Coder 6.7B
- Building benchmark-grade specialist SLMs: clarioscope-intent-deberta-v1 matches frontier LLMs within 4 pp of accuracy at 22× lower latency and $0/inference (dev.to writeup)
- Leading engineering at ClarioScope AI, a HIPAA-compliant healthcare practice growth platform
- Shipping production AI products: BeautyCrew AI, VETR Proposal, CommonRoom AI
- Open-source everything: all model weights, configs, and tokenizers are public on Hugging Face
A three-model intake intelligence pipeline for healthcare practices. Each model is small, specialized, and benchmarked head-to-head against frontier APIs. Suite-level writeup: Three small models for healthcare intake – and what shipping all three taught me.
| Model | Task | Size | Headline result | Speed vs frontier | Cost / 1K | Links |
|---|---|---|---|---|---|---|
| clarioscope-intent-deberta-v1 | 7-class intent classification | 184M | 91.16% accuracy (within 4 pp of Claude Haiku) | 22× faster | $0 | 🤗 |
| clarioscope-phi-deberta-v1 | 18-category HIPAA PHI span detection | 125M | Macro F1 0.63 (triples frontier on LOC, ties on NAME/DATE/PHONE/IP/AGE) | 45× faster | $0 | 🤗 |
| clarioscope-insurance-v1 | 12-field insurance / billing extraction | 125M | Macro F1 0.79 (ties GPT-4o on SUBSCRIBER_NAME, within 5–13 pp on the four highest-volume fields) | 26× faster | $0 | 🤗 |
Total cost to build all three: ~$16 in OpenAI + RunPod + benchmark API spend. Total infrastructure: Hugging Face (free) + RunPod spot pods (a few cents per run).
The recurring pattern across all three: small specialized models don't replace frontier APIs. They are stage one of a hybrid pipeline that handles the bulk-volume work cheaply, then defers a small fraction of hard cases to a frontier API. All three model cards include honest per-entity / per-class breakdowns showing where the small model wins and loses.
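The hybrid pattern can be sketched as a confidence-threshold router. This is a minimal illustration, not the ClarioScope implementation: the `route` function, the `Routed` record, the stand-in model callables, the example labels, and the 0.85 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Routed:
    text: str
    label: str
    source: str  # "small" or "frontier"


def route(
    texts: List[str],
    small_model: Callable[[str], Tuple[str, float]],
    frontier_api: Callable[[str], str],
    threshold: float = 0.85,  # assumed value; tune on a held-out set
) -> List[Routed]:
    """Accept confident small-model predictions; defer the rest."""
    results = []
    for text in texts:
        label, confidence = small_model(text)
        if confidence >= threshold:
            # Stage one: the cheap local model handles the bulk of traffic.
            results.append(Routed(text, label, "small"))
        else:
            # Stage two: only low-confidence inputs reach the paid API.
            results.append(Routed(text, frontier_api(text), "frontier"))
    return results


# Hypothetical stand-ins so the sketch runs without a model or network access.
def fake_small_model(text: str) -> Tuple[str, float]:
    if "appointment" in text.lower():
        return ("booking", 0.97)
    return ("other", 0.40)


def fake_frontier_api(text: str) -> str:
    return "billing_question"


routed = route(
    ["I need an appointment on Friday", "Why was my claim denied twice?"],
    fake_small_model,
    fake_frontier_api,
)
deferred_fraction = sum(r.source == "frontier" for r in routed) / len(routed)
```

In a real deployment the stand-ins would be replaced by the classifier's top-class probability and an actual API client, and `deferred_fraction` is the number to watch, since it determines how much frontier spend remains.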
All published openly on 🤗 Hugging Face. Configs and tokenizers included.
- A 3 billion parameter decoder-only transformer trained from scratch for full-stack Next.js code generation.
- A QLoRA fine-tune of DeepSeek Coder 6.7B Instruct, specialized for autonomous Next.js generation.
- A compact code-gen model trained from scratch on consumer hardware (RTX 3060 12GB). Benchmark (ORCH-ProjectBench): 76.6 overall · 95.3 code parse · 93.3 format.
- A GPT-2 style language model trained from scratch on medical literature.
Also published: ORCH Next.js 350M v2 (287M, from scratch with 16k vocab).
| Product | What it does |
|---|---|
| ClarioScope AI | HIPAA-compliant healthcare practice growth platform (CTO) |
| BeautyCrew AI | Booking management for the beauty industry that prevents missed appointments |
| VETR Proposal | AI-assisted federal contracting co-pilot for small business teams |
| CommonRoom AI | Collaborative digital workspace with 15 group-coordination tools, no install |
| ORCH Studio | Generate complete Next.js apps from natural language (powered by ORCH-7B) |
- orch-ai: Hugging Face org for the ORCH code-generation model family
- clarioscope-ai: ClarioScope AI's Hugging Face org
- Configs, tokenizers, and training details are public on every model card
📫 Get in touch: raihan@clarioscope.ai · Portfolio
