Hearth discovers metrics by probing each backend's `/metrics` endpoint. Today there are three adapters:

| Adapter | Prefix | What it returns |
|---|---|---|
| `_scrape_vllm` | `vllm:*` | tps, TTFT, TPOT, KV%, p50/95/99 (e2e), running, waiting |
| `_scrape_llamacpp` | `llamacpp:*` | tps, TPOT, running, waiting |
| `_scrape_sglang` | `sglang:*` | tps, TTFT, TPOT, KV%, p50/95/99, running, waiting (untested against live SGLang; please report) |

All three live in `server/api/main.py` and share a normalized output shape (the `live` dict), so the frontend renders them identically.

To add an adapter for another backend:

- Add a `_LLAMACPP_SCALARS`-style set of metric names you care about.
- Write `_scrape_<kind>(base) -> dict` that fetches `{base}/metrics` and returns the normalized fields.
- In `_discover()`, add a probe step: if the backend exposes your prefix, append to `<kind>_bases` on the model.
- In `models_list()`, add a branch that computes the `live` dict from your scrape and sets `metrics: "<kind>"`.
- In the `data.js` GB10 derivation, generalize the `metricsSource` check (it already accepts vllm and llamacpp; just add yours).

v0.2.0 will formalize this as a Python entry-point plugin so you can ship adapters as separate packages without forking Hearth.

Adapters not yet implemented:

- Ollama `ollama:*` adapter (Ollama doesn't ship Prometheus metrics natively yet; an adapter could wrap its `/api/ps`)
- TensorRT-LLM Triton adapter
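
For anyone picking up the Ollama item, a possible starting point is sketched below. It assumes `/api/ps` returns a JSON object with a `models` list whose entries carry `name`, `size`, and `size_vram`; the function names and output keys are hypothetical and do not match Hearth's normalized `live` fields. Since that endpoint reports loaded models rather than throughput or latency, values such as tps or TTFT would need another source.

```python
import json
import urllib.request


def summarize_ps(payload):
    """Reduce an /api/ps-style response to a loaded-model count and memory totals."""
    models = payload.get("models") if isinstance(payload, dict) else None
    models = models or []
    return {
        "loaded": len(models),
        "names": [m.get("name", "") for m in models],
        "size_bytes": sum(int(m.get("size", 0)) for m in models),
        "vram_bytes": sum(int(m.get("size_vram", 0)) for m in models),
    }


def _scrape_ollama(base, timeout=2.0):
    """Fetch {base}/api/ps and summarize it, or return {} on failure."""
    try:
        with urllib.request.urlopen(f"{base}/api/ps", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):  # network/HTTP errors or invalid JSON
        return {}
    return summarize_ps(payload)
```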