Releases: defilantech/infercost

v0.1.1

23 Mar 06:51

Patch release: publishes Docker images to GHCR, updates the Homebrew formula, and adds OCI labels that link the images to the repository.

v0.1.0

22 Mar 23:37

InferCost v0.1.0

The first release of InferCost: Kubernetes-native cost intelligence for on-premises AI inference.

Features

  • CostProfile CRD: Declare GPU hardware economics (purchase price, amortization, electricity rate, PUE)
  • UsageReport CRD: Auto-populated cost reports with per-model/namespace breakdown
  • Cost engine: Scrapes DCGM for GPU power draw and llama.cpp/vLLM for token counts
  • 12 Prometheus metrics: Cost-per-token, hourly cost, cloud comparison, GPU power, savings
  • Cloud comparison: Verified pricing for 9 models across OpenAI, Anthropic, and Google
  • CLI: infercost status and infercost compare with monthly projections
  • REST API: /api/v1/costs/current, /api/v1/models, /api/v1/compare, /api/v1/status
  • Grafana dashboard: 13 panels shipped as JSON
  • Helm chart: Single-command installation with configurable DCGM endpoint and API server
  • Honest results: Shows when cloud is cheaper than on-prem
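To make the inputs listed above concrete, here is a minimal sketch of how purchase price, amortization, electricity rate, PUE, GPU power draw, and token counts could combine into an hourly cost and a cost per million tokens. This is not InferCost's actual cost engine: the `GpuEconomics` type, the function names, and the straight-line amortization model are all assumptions made for illustration.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class GpuEconomics:
    """Hypothetical stand-in for the kind of data a CostProfile declares."""
    purchase_price_usd: float
    amortization_years: float
    electricity_usd_per_kwh: float
    pue: float  # power usage effectiveness, >= 1.0


def hourly_cost_usd(hw: GpuEconomics, avg_power_watts: float) -> float:
    """Hourly cost = amortized hardware cost + facility-adjusted energy cost.

    Assumes straight-line amortization over the declared period.
    """
    amortization = hw.purchase_price_usd / (hw.amortization_years * HOURS_PER_YEAR)
    # Convert watts to kW, scale by PUE to include cooling and other overhead.
    energy = (avg_power_watts / 1000.0) * hw.pue * hw.electricity_usd_per_kwh
    return amortization + energy


def cost_per_million_tokens(
    hw: GpuEconomics, avg_power_watts: float, tokens_per_hour: float
) -> float:
    """Divide the hourly cost by throughput and scale to one million tokens."""
    if tokens_per_hour <= 0:
        raise ValueError("tokens_per_hour must be positive")
    return hourly_cost_usd(hw, avg_power_watts) / tokens_per_hour * 1_000_000
```

The result can be compared directly against a cloud provider's price per million tokens, which is the kind of comparison where cloud may turn out cheaper at low utilization, since the amortization term is paid whether or not tokens are being generated.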

Install

Homebrew (macOS):

brew install defilantech/tap/infercost

Helm:

helm install infercost infercost/infercost \
  --set dcgm.endpoint=http://nvidia-dcgm-exporter:9400/metrics

Binary: Download from the assets below.