Releases: defilantech/infercost
Releases · defilantech/infercost
v0.1.1
v0.1.0
InferCost v0.1.0
The first release of InferCost: Kubernetes-native cost intelligence for on-premises AI inference.
Features
- CostProfile CRD: Declare GPU hardware economics (purchase price, amortization, electricity rate, PUE)
- UsageReport CRD: Auto-populated cost reports with per-model/namespace breakdown
- Cost engine: Scrapes DCGM for GPU power draw and llama.cpp/vLLM for token counts
- 12 Prometheus metrics: Cost-per-token, hourly cost, cloud comparison, GPU power, savings
- Cloud comparison: Verified pricing for 9 models across OpenAI, Anthropic, and Google
- CLI:
infercost statusandinfercost comparewith monthly projections - REST API: /api/v1/costs/current, /api/v1/models, /api/v1/compare, /api/v1/status
- Grafana dashboard: 13 panels shipped as JSON
- Helm chart: Single-command installation with configurable DCGM endpoint and API server
- Honest results: Shows when cloud is cheaper than on-prem
Install
Homebrew (macOS):
brew install defilantech/tap/infercostHelm:
helm install infercost infercost/infercost \
--set dcgm.endpoint=http://nvidia-dcgm-exporter:9400/metricsBinary: Download from the assets below.