Problem
Pandas is hardcoded throughout SubspaceDiscrete. The class stores exp_rep and comp_rep as pd.DataFrame, constructs them using pandas operations, and returns them as pandas objects. All downstream consumers (recommenders, constraints, Campaign) assume pandas.
Polars support was bolted on via the BAYBE_DEACTIVATE_POLARS environment variable, but it is partial:
Polars is used for some internal construction steps, then collected back to pandas.
Row ordering differs between the pandas and Polars code paths, so the same search space can yield different results depending on which backend is active.
There is no abstraction over the tabular type at the API boundary — callers must work with pandas regardless of their preferred backend.
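The row-ordering point can be illustrated without either library. The following is a minimal, backend-free sketch: the parameter names and both construction paths are hypothetical and do not reproduce BayBE's actual pandas or Polars code. It only shows how two builders can agree on the set of rows while disagreeing on their positions, which is enough to break anything that addresses rows by index.

```python
from itertools import product

# Hypothetical parameters, for illustration only (not taken from BayBE).
params = {"solvent": ["water", "ethanol"], "temperature": [20, 40, 60]}
names = list(params)

# Path A: the first parameter varies slowest (itertools.product order).
path_a = [dict(zip(names, combo)) for combo in product(*params.values())]

# Path B: the first parameter varies fastest, as a different engine might emit.
path_b = [
    dict(zip(names, reversed(combo)))
    for combo in product(*reversed(list(params.values())))
]


def canonical(rows):
    """Sort rows by parameter values so that order no longer depends on the builder."""
    return sorted(rows, key=lambda row: tuple(row[name] for name in names))


same_rows = canonical(path_a) == canonical(path_b)  # identical content
same_order = path_a == path_b                       # but different positions

print(same_rows, same_order)
print(path_a[1], path_b[1])  # positional index 1 points at different rows
```

Sorting into a canonical order is only one possible remedy and is shown here as an assumption, not as the fix the project has chosen.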
Why it matters
Users who prefer Polars or PyArrow are forced to convert at every interaction with BayBE's search space API.
Row ordering inconsistencies between backends create subtle bugs that are difficult to diagnose.
True lazy evaluation is impossible with pandas as the internal representation. Pandas executes eagerly; wrapping it with a lazy abstraction does not make it lazy. This blocks the performance gains needed for large spaces (see sub-issue Eager search space materialization #796).
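The claim that wrapping an eager representation does not make it lazy can be sketched with the standard library alone. The parameter grid and function names below are hypothetical; the list stands in for any eagerly built table, and the example makes no statement about how BayBE or pandas are implemented internally.

```python
from itertools import islice, product

# Hypothetical parameter grid, for illustration only: 50 * 50 * 50 rows.
params = {
    "temperature": range(50),
    "pressure": range(50),
    "catalyst": range(50),
}


def eager_then_wrap(params):
    """Build every row first, then hand out an iterator over the finished list."""
    rows = list(product(*params.values()))  # full cost is paid on this line
    return iter(rows), len(rows)


def lazy(params):
    """Return an iterator that produces rows only when the consumer asks."""
    return product(*params.values())


# Both consumers want only the first five rows.
wrapped, n_built = eager_then_wrap(params)
first_eager = list(islice(wrapped, 5))
first_lazy = list(islice(lazy(params), 5))

print(n_built)                    # rows built by the eager path before any were read
print(first_eager == first_lazy)  # same answer, very different amount of work
```

The iterator returned by `eager_then_wrap` looks lazy to the caller, but all rows already exist in memory by the time it is returned. That is the situation the paragraph above describes.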
Related