Unify request-scoped state in a single PipelineRunContext (extends #11366) #11580

@Aarkin7

Is your feature request related to a problem? Please describe.
I keep hitting the same wall from different angles while looking at Haystack for a multi-tenant setup (one process, many tenants). The most immediate symptom is secrets: EnvVarSecret.resolve_value() reads os.environ directly (haystack/utils/auth.py:201), so two tenants can't safely share a process. You either mutate the environment per request (race conditions) or run a process per tenant (operationally painful). @tstadel calls this out in #11366.

But the same gap shows up four more times once a pipeline is in production:

  • Cost. There's no per-run ledger, so you can't tell a tenant what their last query cost. grep -rni 'cost|pricing' haystack/ returns nothing. #10889 (Simple local token tracking for Pipelines) is the same shape of problem.
  • Cancellation. No per-run cancel token. A runaway agent loop has to hit max_steps before it stops.
  • Session memory. Agents need per-conversation memory; today people end up monkey-patching dicts onto component instances.
  • Tracing. Spans aren't tied to a request-level ID, so logs from one user request are hard to correlate across components and tool calls.

Five separate concerns, one missing primitive.

Describe the solution you'd like
Land #11366's option 2 (PipelineRunContext), but widen the dataclass so all five concerns share one object instead of arriving as five separate ContextVars over the next year:

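from __future__ import annotations  # keeps the not-yet-defined types below legal in annotations

from collections.abc import MutableMapping
from dataclasses import dataclass, field
from typing import Any

# SecretProvider, CostLedger and CancellationToken are proposed types, to be defined as part of this change.
@dataclass
class PipelineRunContext:
    run_id: str                              # also the trace correlation id
    tenant_id: str | None = None
    secrets: SecretProvider | None = None    # supersedes os.environ
    cost_ledger: CostLedger | None = None    # generators append token usage
    cancel: CancellationToken | None = None
    session: MutableMapping[str, Any] = field(default_factory=dict)  # per-conversation memory

# Proposed call shape: the context is passed once per run.
pipeline.run(data, context=PipelineRunContext(run_id=..., tenant_id="acme"))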
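The dataclass references three types without defining them. To make the discussion concrete, here is a minimal sketch of what they could look like. Every name, method, and signature below (`get`, `record`, `total_tokens`, `cancel`, `cancelled`, `DictSecretProvider`, `agent_loop`) is an assumption for illustration, not an existing Haystack API and not a firm proposal for the final shape.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass, field
from typing import Protocol


class SecretProvider(Protocol):
    """Hypothetical interface: resolve a named secret for one tenant."""

    def get(self, name: str) -> str | None: ...


@dataclass
class DictSecretProvider:
    """Hypothetical in-memory provider, handy for tests."""

    values: dict[str, str]

    def get(self, name: str) -> str | None:
        return self.values.get(name)


@dataclass
class CostLedger:
    """Hypothetical append-only record of token usage for a single run."""

    entries: list[tuple[str, int, int]] = field(default_factory=list)

    def record(self, component: str, prompt_tokens: int, completion_tokens: int) -> None:
        self.entries.append((component, prompt_tokens, completion_tokens))

    def total_tokens(self) -> int:
        return sum(p + c for _, p, c in self.entries)


class CancellationToken:
    """Hypothetical cooperative cancel flag backed by threading.Event."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()


def agent_loop(token: CancellationToken, ledger: CostLedger, max_steps: int = 10) -> int:
    """Toy loop that checks the token each step instead of running to max_steps."""
    steps = 0
    for step in range(max_steps):
        if token.cancelled:
            break
        ledger.record("generator", prompt_tokens=100, completion_tokens=20)
        steps += 1
        if step == 2:  # simulate an external cancel arriving after three steps
            token.cancel()
    return steps
```

In this sketch cancellation is cooperative: the loop only stops at the next check, so a component that blocks inside a long LLM call would still need its own timeout.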

Internally it's a ContextVar set around the run loop. Components opt in by calling haystack.runtime.current_context(). When no context is set, everything falls back to current behavior (os.environ for secrets, no cost tracking, etc.), so existing pipelines don't change.
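A rough sketch of that plumbing, using only the standard library. The `run_context` context manager and `resolve_secret` helper are hypothetical names, and `secrets` is simplified to a plain mapping instead of a `SecretProvider`; only `current_context()` comes from the proposal itself.

```python
from __future__ import annotations

import os
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Iterator, Mapping


@dataclass
class PipelineRunContext:
    run_id: str
    tenant_id: str | None = None
    secrets: Mapping[str, str] | None = None  # simplified stand-in for SecretProvider


# One ContextVar for the whole run; the default of None means "no context set".
_current: ContextVar[PipelineRunContext | None] = ContextVar(
    "pipeline_run_context", default=None
)


def current_context() -> PipelineRunContext | None:
    """What a component would call to opt in."""
    return _current.get()


@contextmanager
def run_context(ctx: PipelineRunContext) -> Iterator[PipelineRunContext]:
    """Hypothetical wrapper the run loop could use: set on entry, restore on exit."""
    token = _current.set(ctx)
    try:
        yield ctx
    finally:
        _current.reset(token)


def resolve_secret(name: str) -> str | None:
    """Hypothetical helper: prefer the run's secrets, else fall back to os.environ."""
    ctx = current_context()
    if ctx is not None and ctx.secrets is not None:
        # Assumption: a tenant-scoped lookup never falls through to the process env.
        return ctx.secrets.get(name)
    return os.environ.get(name)
```

Because `ContextVar` values are isolated per thread and per asyncio task, two concurrent runs in one process would each see their own context without touching `os.environ`.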

What this unlocks: safe multi-tenant hosting, first-party cost reporting (no MLflow bolt-on just to know what a query cost), cancelable agent loops, structured session memory, and trace IDs tied to run_id.
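For the tracing point, a small stdlib-only illustration of how a run-scoped ID could be stamped onto every log line. This is a sketch under assumptions: it uses Python's `logging` module rather than Haystack's tracing layer, and `RunIdFilter`, `make_logger`, and `demo` are hypothetical names.

```python
import io
import logging
from contextvars import ContextVar

# Stand-in for PipelineRunContext.run_id; "-" marks log lines emitted outside a run.
_run_id: ContextVar[str] = ContextVar("run_id", default="-")


class RunIdFilter(logging.Filter):
    """Attach the current run_id to each record so the formatter can print it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.run_id = _run_id.get()
        return True


def make_logger(stream: io.StringIO) -> logging.Logger:
    logger = logging.getLogger("run_context_demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[run=%(run_id)s] %(message)s"))
    handler.addFilter(RunIdFilter())
    logger.addHandler(handler)
    return logger


def demo() -> str:
    stream = io.StringIO()
    logger = make_logger(stream)
    token = _run_id.set("run-123")  # what the run loop would do on entry
    try:
        logger.info("retriever finished")
        logger.info("tool call started")
    finally:
        _run_id.reset(token)
    logger.info("outside any run")
    return stream.getvalue()
```

Both in-run lines carry `run=run-123`, which is the correlation that is hard to get today.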

Describe alternatives you've considered

Additional context
This would close #11366 (as the wider version of option 2) and address #10889 (cost tracking falls out of the cost_ledger field as a side effect). It also gives Hayhooks a clean place to read tenant_id if per-tenant rate limits or audit logs ever come into scope.

I'm aware this overlaps with #11366. I'm opening a separate issue because the scope is materially broader, but I'm happy to consolidate the discussion into #11366 if you'd prefer.

Labels: P3 (Low priority, leave it in the backlog)