8 lines (6 loc) · 424 Bytes

Scaling Guide

Switch database_url to PostgreSQL for multi-worker deployments.
Increase queue.max_parallel_jobs for higher throughput.
Move artifact storage to a high-capacity volume or object-store-backed plugin.
Add provider configurations for local GPU-serving endpoints such as vLLM or Ollama.
Replace the default vector store with a plugin-backed FAISS, Chroma, or Qdrant adapter as the dataset grows.