System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Documentation | Playground | Blog | Publications | Hugging Face
In the LLM era, the number of models is exploding. Models differ in capability, scale, cost, and privacy boundaries, and choosing and connecting the right ones to build semantic AI infrastructure is a systems problem.
vLLM Semantic Router is a signal-driven intelligent router for that problem. It helps teams build model systems that are more efficient, safer, and more adaptive across cloud, data center, and edge environments.
It delivers three core values:
- Token economics: reduce wasted tokens, increase effective output, and maximize the value of every token.
- LLM safety: detect jailbreaks, sensitive leakage, and hallucinations so agents remain controllable, trustworthy, and auditable.
- Fullmesh intelligence: build personal AI at the edge and intelligent MaaS in the cloud by coordinating local, private, and frontier models across cost, privacy, and capability boundaries.
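The routing idea behind these values can be sketched in a few lines: score each candidate model against a request's signals (task complexity, privacy sensitivity, budget) and pick the cheapest model that satisfies all constraints. This is a minimal, hypothetical illustration, not the project's actual API; all names, fields, and weights here are assumptions.

```python
# Hypothetical sketch of signal-driven routing across cost, privacy, and
# capability boundaries. All model names and thresholds are illustrative.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    capability: float    # 0..1, rough quality tier
    cost_per_1k: float   # USD per 1k tokens
    local: bool          # runs on-prem or at the edge


@dataclass
class Request:
    complexity: float    # 0..1, estimated task difficulty
    private: bool        # contains sensitive data
    budget_per_1k: float


def route(req: Request, models: list[Model]) -> Model:
    """Return the cheapest model that satisfies the request's signals."""
    eligible = [
        m for m in models
        if m.capability >= req.complexity        # capable enough for the task
        and m.cost_per_1k <= req.budget_per_1k   # within the token budget
        and (m.local or not req.private)         # private data stays local
    ]
    if not eligible:
        raise ValueError("no model satisfies the routing constraints")
    return min(eligible, key=lambda m: m.cost_per_1k)


models = [
    Model("edge-small", capability=0.4, cost_per_1k=0.0, local=True),
    Model("dc-medium", capability=0.7, cost_per_1k=0.3, local=True),
    Model("cloud-frontier", capability=0.95, cost_per_1k=2.0, local=False),
]

# A private, simple request stays on the free edge model.
print(route(Request(complexity=0.3, private=True, budget_per_1k=1.0), models).name)
```

A real router would derive these signals from classifiers and guardrail models rather than hand-set fields, but the selection step has this same shape: filter by hard constraints, then optimize token economics among the survivors.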
To install:

```bash
curl -fsSL https://vllm-semantic-router.com/install.sh | bash
```

For platform notes, detailed setup options, and troubleshooting, see the Installation Guide.
- [2026/03/24] Vision Paper Released: The Workload-Router-Pool Architecture for LLM Inference Optimization
- [2026/03/10] v0.2 Released: vLLM Semantic Router v0.2 Athena Release
- [2026/02/27] White Paper Released: Signal Driven Decision Routing for Mixture-of-Modality Models
- [2026/01/05] Iris v0.1 Released: vLLM Semantic Router v0.1 Iris: The First Major Release
- [2025/12/16] Collaboration: AMD × vLLM Semantic Router: Building the System Intelligence Together
- [2025/11/19] New Blog: Signal-Decision Driven Architecture: Reshaping Semantic Routing at Scale
- [2025/11/03] Paper Published: Category-Aware Semantic Caching for Heterogeneous LLM Workloads
- [2025/10/12] Paper Accepted: When to Reason: Semantic Router for vLLM
Earlier announcements
- [2025/12/15] New Blog: Token-Level Truth: Real-Time Hallucination Detection for Production LLMs
- [2025/10/27] New Blog: Scaling Semantic Routing with Extensible LoRA
- [2025/10/08] Collaboration: vLLM Semantic Router with vLLM Production Stack Team.
- [2025/09/01] Released the project: vLLM Semantic Router: Next Phase in LLM inference.
More announcements are available on the Blog and Publications pages.
For questions, feedback, or to contribute, please join the #semantic-router channel in vLLM Slack.
We host bi-weekly community meetings to sync with contributors across different time zones:
- First Tuesday of the month: 9:00-10:00 AM EST (accommodates US EST, EU, and Asia Pacific contributors)
- Third Tuesday of the month: 1:00-2:00 PM EST (accommodates US EST and California contributors)
- Meeting recordings: YouTube
If you want to contribute, start with CONTRIBUTING.md.
For repository-native development workflow and validation commands, use AGENTS.md as the entrypoint and docs/agent/README.md as the canonical index.
If you find Semantic Router helpful in your research or projects, please consider citing it:
@misc{semanticrouter2025,
  title={vLLM Semantic Router},
  author={vLLM Semantic Router Team},
  year={2025},
  howpublished={\url{https://github.com/vllm-project/semantic-router}},
}
We are grateful to our sponsors who support us:
AMD provides us with GPU resources and ROCm™ software for training and researching frontier router models, enhancing E2E testing, and building the online models playground.
