Is your feature request related to a problem? Please describe.
I keep hitting the same wall from different angles while looking at Haystack for a multi-tenant setup (one process, many tenants). The most immediate symptom is secrets: EnvVarSecret.resolve_value() reads os.environ directly (haystack/utils/auth.py:201), so two tenants can't safely share a process. You either mutate the environment per request (race conditions) or run a process per tenant (operationally painful). @tstadel calls this out in #11366.
But the same gap shows up four more times once a pipeline is in production:
Cost. There's no per-run ledger, so you can't tell a tenant what their last query cost. grep -rni 'cost|pricing' haystack/ returns nothing. #10889 (Simple local token tracking for Pipelines) is the same shape of problem.
Cancellation. No per-run cancel token. A runaway agent loop has to hit max_steps before it stops.
Session memory. Agents need per-conversation memory; today people end up monkey-patching dicts onto component instances.
Tracing. Spans aren't tied to a request-level ID, so logs from one user request are hard to correlate across components and tool calls.
Five separate concerns, one missing primitive.
Describe the solution you'd like
Land #11366's option 2 (PipelineRunContext), but widen the dataclass so all five concerns share one object instead of arriving as five separate ContextVars over the next year:
Internally it's a ContextVar set around the run loop. Components opt in by calling haystack.runtime.current_context(). When no context is set, everything falls back to current behavior (os.environ for secrets, no cost tracking, etc.), so existing pipelines don't change.
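To make the mechanism concrete, here is a minimal stdlib-only sketch of the idea. Only `PipelineRunContext`, `current_context()`, `run_id`, `tenant_id`, and `cost_ledger` are names taken from this issue; the other fields, the `run_context` helper, and `resolve_secret` are hypothetical names chosen for illustration, and none of this is existing Haystack API.

```python
from __future__ import annotations

import os
import uuid
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Any, Iterator, Optional


@dataclass
class PipelineRunContext:
    """Hypothetical shape. Only run_id, tenant_id and cost_ledger are named in the issue."""

    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    tenant_id: Optional[str] = None
    secrets: dict[str, str] = field(default_factory=dict)        # assumed field
    cost_ledger: list[dict[str, Any]] = field(default_factory=list)
    cancelled: bool = False                                       # assumed field
    session: dict[str, Any] = field(default_factory=dict)        # assumed field


# Default of None means "no context set", which triggers the fallback path.
_current: ContextVar[Optional[PipelineRunContext]] = ContextVar(
    "pipeline_run_context", default=None
)


def current_context() -> Optional[PipelineRunContext]:
    """Return the context of the run in progress, or None outside a run."""
    return _current.get()


@contextmanager
def run_context(ctx: PipelineRunContext) -> Iterator[PipelineRunContext]:
    """Set the context around a run and restore the previous value afterwards."""
    token = _current.set(ctx)
    try:
        yield ctx
    finally:
        _current.reset(token)


def resolve_secret(name: str) -> Optional[str]:
    """Prefer the per-run secret; fall back to os.environ when no context is set."""
    ctx = current_context()
    if ctx is not None and name in ctx.secrets:
        return ctx.secrets[name]
    return os.environ.get(name)
```

With this shape, a host would wrap each request in `with run_context(PipelineRunContext(tenant_id=..., secrets=...)):` and a component that never calls `current_context()` would behave exactly as it does today.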
What this unlocks: safe multi-tenant hosting, first-party cost reporting (no MLflow bolt-on just to know what a query cost), cancelable agent loops, structured session memory, and trace IDs tied to run_id.
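As an illustration of the cancellation and cost pieces, the following sketch shows a loop that checks a per-run cancel flag before each step and appends to a per-run ledger. `RunState`, `CostEntry`, `agent_loop`, and the token counts and prices are all made up for this example; a real agent would record whatever usage its generator reports.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CostEntry:
    component: str
    prompt_tokens: int
    completion_tokens: int
    usd: float


@dataclass
class RunState:
    """Hypothetical subset of the run context: a cancel flag and a cost ledger."""

    cancel: threading.Event = field(default_factory=threading.Event)
    cost_ledger: List[CostEntry] = field(default_factory=list)

    def total_usd(self) -> float:
        return sum(entry.usd for entry in self.cost_ledger)


def agent_loop(state: RunState, step: Callable[[int], CostEntry], max_steps: int = 10) -> int:
    """Run up to max_steps steps, stopping early once the run is cancelled.

    Returns the number of steps that actually ran.
    """
    for i in range(max_steps):
        if state.cancel.is_set():
            return i
        state.cost_ledger.append(step(i))
    return max_steps


state = RunState()


def fake_step(i: int) -> CostEntry:
    # Stand-in for an LLM call. On the third step, simulate a user pressing "stop".
    if i == 2:
        state.cancel.set()
    return CostEntry("llm", prompt_tokens=100, completion_tokens=20, usd=0.25)


steps_run = agent_loop(state, fake_step)
```

Here the loop stops after three steps instead of running to `max_steps`, and `state.total_usd()` gives the cost of exactly the work that was done.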
Describe alternatives you've considered
Mutating os.environ per request introduces race conditions under any concurrency.
One process per tenant doesn't compose well with Hayhooks and gets expensive at any non-trivial tenant count.
Wrapping every generator to track cost pushes the same problem onto every user of the library.
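The race in the first alternative is easy to reproduce without threads. The sketch below runs two simulated requests on one event loop, once with per-request `os.environ` mutation and once with a `ContextVar`. The variable name `DEMO_API_KEY` and the handler functions are invented for the demonstration.

```python
import asyncio
import os
from contextvars import ContextVar

tenant_key: ContextVar[str] = ContextVar("tenant_key")


async def handle_with_environ(key: str) -> str:
    os.environ["DEMO_API_KEY"] = key    # per-request mutation of process-global state
    await asyncio.sleep(0)              # yield to the event loop, as any I/O call would
    return os.environ["DEMO_API_KEY"]   # may now hold another request's key


async def handle_with_contextvar(key: str) -> str:
    tenant_key.set(key)                 # visible only inside this task's context copy
    await asyncio.sleep(0)
    return tenant_key.get()


async def main() -> tuple:
    env_results = await asyncio.gather(
        handle_with_environ("key-a"), handle_with_environ("key-b")
    )
    ctx_results = await asyncio.gather(
        handle_with_contextvar("key-a"), handle_with_contextvar("key-b")
    )
    return env_results, ctx_results


env_results, ctx_results = asyncio.run(main())
```

With the environment approach both requests come back with `key-b`, because the second request overwrote the value while the first was suspended. With the `ContextVar` each request sees its own key, since `asyncio` gives every task its own copy of the context.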
Additional context
This closes #11366 (becomes the wider version of option 2) and addresses #10889 (cost tracking falls out of the cost_ledger field as a side-effect). It also gives Hayhooks a clean place to read tenant_id if per-tenant rate limits or audit logs ever come into scope.
I'm aware this overlaps with #11366. I'm opening a separate issue because the scope is materially broader, but I'm happy to consolidate the discussion into #11366 if you'd prefer.