Unify request-scoped state in a single PipelineRunContext (extends #11366)

**Is your feature request related to a problem? Please describe.**
I keep hitting the same wall from different angles while looking at Haystack for a multi-tenant setup (one process, many tenants). The most immediate symptom is secrets: `EnvVarSecret.resolve_value()` reads `os.environ` directly (`haystack/utils/auth.py:201`), so two tenants can't safely share a process. You either mutate the environment per request (race conditions) or run a process per tenant (operationally painful). `@tstadel` calls this out in #11366.

But the same gap shows up four more times once a pipeline is in production:

- **Cost.** There's no per-run ledger, so you can't tell a tenant what their last query cost. `grep -rniE 'cost|pricing' haystack/` returns nothing. #10889 is the same shape of problem.
- **Cancellation.** No per-run cancel token. A runaway agent loop has to hit `max_steps` before it stops.
- **Session memory.** Agents need per-conversation memory; today people end up monkey-patching dicts onto component instances.
- **Tracing.** Spans aren't tied to a request-level ID, so logs from one user request are hard to correlate across components and tool calls.

Five separate concerns, one missing primitive.

**Describe the solution you'd like**
Land #11366's option 2 (`PipelineRunContext`), but widen the dataclass so all five concerns share one object instead of arriving as five separate `ContextVar`s over the next year:

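```python
from __future__ import annotations  # SecretProvider, CostLedger, CancellationToken are proposed types, not existing ones

from collections.abc import MutableMapping
from dataclasses import dataclass, field


@dataclass
class PipelineRunContext:
    run_id: str                              # also the trace correlation id
    tenant_id: str | None = None
    secrets: SecretProvider | None = None    # supersedes os.environ
    cost_ledger: CostLedger | None = None    # generators append token usage
    cancel: CancellationToken | None = None
    session: MutableMapping = field(default_factory=dict)

# Proposed call site: `context` would be a new keyword argument on `Pipeline.run`.
pipeline.run(data, context=PipelineRunContext(run_id=..., tenant_id="acme"))
```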

Internally it's a `ContextVar` set around the run loop. Components opt in by calling `haystack.runtime.current_context()`. When no context is set, everything falls back to current behavior (`os.environ` for secrets, no cost tracking, etc.), so existing pipelines don't change.
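
To make the mechanism concrete, here is a rough sketch of the plumbing I have in mind. It is not existing Haystack code: `RunContext`, `run_scope`, and `resolve_secret` are hypothetical names, and a plain mapping stands in for the proposed `SecretProvider`. One assumption worth debating: when a context carries a secrets mapping, the sketch never falls through to `os.environ`, so a tenant can't accidentally pick up a process-wide key.

```python
import os
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Iterator, Mapping, MutableMapping, Optional


@dataclass
class RunContext:
    """Hypothetical, trimmed-down stand-in for the proposed PipelineRunContext."""
    run_id: str
    tenant_id: Optional[str] = None
    secrets: Optional[Mapping[str, str]] = None  # assumption: a mapping plays the SecretProvider role
    session: MutableMapping = field(default_factory=dict)


# One ContextVar holds the whole request-scoped object.
_current: ContextVar[Optional[RunContext]] = ContextVar("pipeline_run_context", default=None)


def current_context() -> Optional[RunContext]:
    """Return the context of the run in progress, or None outside a run."""
    return _current.get()


@contextmanager
def run_scope(ctx: Optional[RunContext]) -> Iterator[None]:
    """Set the context around the run loop and always restore the previous one."""
    token = _current.set(ctx)
    try:
        yield
    finally:
        _current.reset(token)


def resolve_secret(name: str) -> Optional[str]:
    """Use the run's secrets when provided; otherwise keep today's os.environ lookup."""
    ctx = current_context()
    if ctx is not None and ctx.secrets is not None:
        return ctx.secrets.get(name)
    return os.environ.get(name)
```

With this shape, `Pipeline.run` would wrap its loop in `with run_scope(context):`, and a component that never calls `current_context()` behaves exactly as it does today.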

What this unlocks: safe multi-tenant hosting, first-party cost reporting (no MLflow bolt-on just to know what a query cost), cancelable agent loops, structured session memory, and trace IDs tied to `run_id`.
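
For the cost and cancellation pieces, a minimal sketch of what the two helper types could look like. Everything here is an assumption for discussion: the `CostLedger` and `CancellationToken` shapes, the `record` signature, and the toy `agent_loop` are illustrative and not part of Haystack.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UsageEntry:
    component: str
    prompt_tokens: int
    completion_tokens: int


@dataclass
class CostLedger:
    """Hypothetical per-run ledger that generators append token usage to."""
    entries: List[UsageEntry] = field(default_factory=list)

    def record(self, component: str, prompt_tokens: int, completion_tokens: int) -> None:
        self.entries.append(UsageEntry(component, prompt_tokens, completion_tokens))

    def total_tokens(self) -> int:
        return sum(e.prompt_tokens + e.completion_tokens for e in self.entries)


class CancellationToken:
    """Hypothetical cancel flag; threading.Event makes it safe to set from another thread."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()


def agent_loop(step: Callable[[int], None], cancel: CancellationToken, max_steps: int = 10) -> int:
    """Toy agent loop: checks the token before every step instead of waiting for max_steps."""
    done = 0
    for i in range(max_steps):
        if cancel.cancelled:
            break
        step(i)
        done += 1
    return done
```

A caller would hand both objects to the run through the context, call `cancel()` from a request handler when the client disconnects, and read `total_tokens()` (or a priced equivalent) once the run returns.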

**Describe alternatives you've considered**
- #11366's option 1 (ContextVar-only, secrets-only) works, but it locks us into one ContextVar per future request-scoped concern. We'll file the same shape of issue four more times.
- #11366's option 3 (constructor-time context) breaks the per-request semantic: the context has to be able to change between calls on the same `Pipeline` instance.
- Mutating `os.environ` per request introduces race conditions under any concurrency.
- One process per tenant doesn't compose well with Hayhooks and gets expensive at any non-trivial tenant count.
- Wrapping every generator to track cost pushes the same problem onto every user of the library.
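
To illustrate the `os.environ` race from the list above, here is a small self-contained demo, independent of Haystack. Two concurrent "requests" each set a key and then yield once; the variable name `DEMO_API_KEY` and both handler functions are made up for the example.

```python
import asyncio
import os
from contextvars import ContextVar

tenant_key: ContextVar[str] = ContextVar("tenant_key", default="")


async def handle_with_environ(key: str) -> str:
    os.environ["DEMO_API_KEY"] = key   # process-wide mutation, visible to every task
    await asyncio.sleep(0)             # yield so the other request gets to run
    return os.environ["DEMO_API_KEY"]


async def handle_with_contextvar(key: str) -> str:
    tenant_key.set(key)                # each asyncio task runs in its own context copy
    await asyncio.sleep(0)
    return tenant_key.get()


async def main():
    env = await asyncio.gather(handle_with_environ("acme"), handle_with_environ("globex"))
    ctx = await asyncio.gather(handle_with_contextvar("acme"), handle_with_contextvar("globex"))
    return list(env), list(ctx)


env_result, ctx_result = asyncio.run(main())
print(env_result)  # ['globex', 'globex'] -> the first request saw the second tenant's key
print(ctx_result)  # ['acme', 'globex']   -> each request kept its own value
```

The same isolation is what a `ContextVar`-backed `PipelineRunContext` would rely on. Code that hops to worker threads would need the context copied explicitly (for example with `contextvars.copy_context()`), which is one more argument for centralizing this in the run loop.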

**Additional context**
This closes #11366 (becomes the wider version of option 2) and addresses #10889 (cost tracking falls out of the `cost_ledger` field as a side-effect). It also gives Hayhooks a clean place to read `tenant_id` if per-tenant rate limits or audit logs ever come into scope.

I'm aware this overlaps with #11366. I'm opening a separate issue because the scope is materially broader, but I'm happy to consolidate the discussion into #11366 if you'd prefer.
