Unit checks: data-independent branch coverage (Tier 1 enumeration + Tier 2 path explorer)

## Context

Follow-up to GEP-10 (units & dimensionality). The environment-level unit check
`fail_if_environment_units_are_inconsistent` dry-runs each `@policy_function` /
`@param_function` body with pint scalars (`Quantity(1.0, unit)`) and compares the
inferred unit against the declaration. See `src/ttsim/interface_dag_elements/unit_checks.py`
(`_verify_one_body`).

Today the dry-run explores only **two paths**: all boolean inputs `True`, and all boolean
inputs `False` (`unit_checks.py:671`). Because units are *data-independent* (data only
selects which branch runs, never the units), the right goal is to verify the unit contract
on **every reachable syntactic path** — independent of any input data. This is strictly
stronger than, and preferred over, asserting units against the test-suite's input data
(which couples coverage to whichever inputs happen to exist).

## Gaps in the current two-run

- **(a) Multiple independent booleans / multiple guarded returns.** The two-run covers only
  (all-True, all-False); combinations in between are unchecked.
- **(b) Numeric-driven `if`/ternary.** The representative `1.0` picks one arm; the other arm
  is never executed, so its unit contract is unverified.

## Proposed work

### Tier 1 — generalize the two-run to 2^N boolean enumeration
Enumerate boolean-input combinations (`itertools.product`) instead of just (all-True,
all-False). Closes gap (a). ~10 lines, cheap, cacheable per policy date, no false positives.

### Tier 2 — path-exploring re-execution (concolic-lite)
Force every reachable path, including numeric-driven branches. Wrap representative values in
a proxy whose `__bool__` / `__gt__` / `__lt__` / `__eq__` return an **oracle-controlled**
bool, while the wrapped pint `Quantity` still carries its unit for the arithmetic. A
controller records the decision sequence and, across repeated runs, flips the last undriven
decision (DFS over the path tree) until all feasible paths are explored. Runs = number of
distinct paths, not 2^N-of-everything. (Essentially CrossHair without the SMT solver — we
don't need feasibility pruning, since we *want* to explore arms real data could never reach;
the unit contract must hold there too.) Closes gap (b).

Composes with the existing conservative rule: the `if exempt: return 0.0 else: return <eur>`
pattern stays benign because a dimensionless inference is already treated as
fallback-don't-flag (`unit_checks.py:680-683`); forcing both arms just adds the eur-arm check.

## Caveats / non-goals

- **Vectorized-op wall is unchanged.** Bodies calling `piecewise_polynomial`, `join`,
  lookup tables, or raw `xnp` ops throw under pint and fall back to the declaration
  (`unit_checks.py:676-679`). Neither tier can execute those. Only a fully static AST
  checker could cross that wall, and only by hand-encoding unit rules per op — out of scope
  (large build + permanent maintenance).
- **Path explosion** is theoretically exponential. Policy bodies are shallow in practice,
  but Tier 2 must cap path count and `log()` any truncation rather than silently stopping.
- The literal-zero arm is unit-polymorphic by design; keep it as fallback, not a flag.

## Suggested approach

Land Tier 1 first (nearly free, closes the most-cared-about gap). Treat Tier 2 as the real
target for true per-line, input-independent coverage.

Refs: GEP-10 (gettsim docs/geps/gep-10.md), parked design notes from the GEP-10 review session.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unit checks: data-independent branch coverage (Tier 1 enumeration + Tier 2 path explorer) #134

Context

Gaps in the current two-run

Proposed work

Tier 1 — generalize the two-run to 2^N boolean enumeration

Tier 2 — path-exploring re-execution (concolic-lite)

Caveats / non-goals

Suggested approach

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Unit checks: data-independent branch coverage (Tier 1 enumeration + Tier 2 path explorer) #134

Description

Context

Gaps in the current two-run

Proposed work

Tier 1 — generalize the two-run to 2^N boolean enumeration

Tier 2 — path-exploring re-execution (concolic-lite)

Caveats / non-goals

Suggested approach

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions