-
Notifications
You must be signed in to change notification settings - Fork 79
Enable Coefficients for DiscreteSumConstraint and from_simplex
#786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
557c092
6c403d4
fd46541
e22196a
89deeab
f2f87d9
a4af9af
43b9c99
fa78764
1c9d4a7
583ba2b
d96d428
df93b36
4f43355
0163e3c
d2c7aa3
2ce4a29
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,6 +6,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |
|
|
||
| ## [Unreleased] | ||
| ### Added | ||
| - `coefficients` attribute for `DiscreteSumConstraint`, enabling weighted sums. Follows | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Needs rebase and corresponding changes in CHANGELOG after release |
||
| the same pattern as `ContinuousLinearConstraint.coefficients` | ||
| - `simplex_coefficients` keyword argument to `SubspaceDiscrete.from_simplex` for | ||
| weighted simplex sum constraints | ||
| - Support for Python 3.14 | ||
| - Support for pandas 3 | ||
| - `Settings` class for unified and streamlined settings management | ||
|
|
@@ -40,6 +44,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |
| ### Breaking Changes | ||
| - `parameter_cartesian_prod_pandas` and `parameter_cartesian_prod_polars` moved | ||
| from `baybe.searchspace.discrete` to `baybe.searchspace.utils` | ||
| - All optional arguments of `SubspaceDiscrete.from_simplex` after `simplex_parameters` | ||
|
AVHopp marked this conversation as resolved.
|
||
| are now keyword-only | ||
| - `ContinuousLinearConstraint.to_botorch` now returns a collection of constraint tuples | ||
| instead of a single tuple (needed for interpoint constraints) | ||
| - `Kernel.to_gpytorch` now takes a `SearchSpace` instead of explicit `ard_num_dims`, | ||
|
|
@@ -61,6 +67,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |
| equality check | ||
|
|
||
| ### Changed | ||
| - `DiscreteSumConstraint`, `ContinuousLinearConstraint`, and | ||
| `SubspaceDiscrete.from_simplex` now forbid 0 as coefficients | ||
| - "User Guide" section has been split into "Components" and "Concepts" | ||
| - Default transfer learning kernel changed from `IndexKernel` to `PositiveIndexKernel`, | ||
| enforcing positive task correlations | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -3,15 +3,16 @@ | |||||||||||||||
| from __future__ import annotations | ||||||||||||||||
|
|
||||||||||||||||
| import gc | ||||||||||||||||
| from collections.abc import Callable | ||||||||||||||||
| from collections.abc import Callable, Sequence | ||||||||||||||||
| from functools import reduce | ||||||||||||||||
| from typing import TYPE_CHECKING, Any, ClassVar, cast | ||||||||||||||||
|
|
||||||||||||||||
| import cattrs | ||||||||||||||||
| import numpy as np | ||||||||||||||||
| import numpy.typing as npt | ||||||||||||||||
| import pandas as pd | ||||||||||||||||
| from attrs import define, field | ||||||||||||||||
| from attrs.validators import in_, min_len | ||||||||||||||||
| from attrs.validators import deep_iterable, in_, min_len | ||||||||||||||||
| from typing_extensions import override | ||||||||||||||||
|
|
||||||||||||||||
| from baybe.constraints.base import CardinalityConstraint, DiscreteConstraint | ||||||||||||||||
|
|
@@ -26,6 +27,7 @@ | |||||||||||||||
| block_serialization_hook, | ||||||||||||||||
| converter, | ||||||||||||||||
| ) | ||||||||||||||||
| from baybe.utils.validation import finite_float | ||||||||||||||||
|
|
||||||||||||||||
| if TYPE_CHECKING: | ||||||||||||||||
| import polars as pl | ||||||||||||||||
|
|
@@ -77,7 +79,11 @@ def get_invalid_polars(self) -> pl.Expr: | |||||||||||||||
|
|
||||||||||||||||
| @define | ||||||||||||||||
| class DiscreteSumConstraint(DiscreteConstraint): | ||||||||||||||||
| """Class for modelling sum constraints.""" | ||||||||||||||||
| """Class for modelling sum constraints. | ||||||||||||||||
|
|
||||||||||||||||
| The constraint evaluates whether the (optionally weighted) sum of the specified | ||||||||||||||||
| parameters satisfies the given threshold condition. | ||||||||||||||||
| """ | ||||||||||||||||
|
|
||||||||||||||||
| # IMPROVE: refactor `SumConstraint` and `ProdConstraint` to avoid code copying | ||||||||||||||||
|
|
||||||||||||||||
|
|
@@ -94,9 +100,45 @@ class DiscreteSumConstraint(DiscreteConstraint): | |||||||||||||||
| condition: ThresholdCondition = field() | ||||||||||||||||
| """The condition modeled by this constraint.""" | ||||||||||||||||
|
|
||||||||||||||||
| coefficients: tuple[float, ...] = field( | ||||||||||||||||
| converter=lambda x: cattrs.structure(x, tuple[float, ...]), | ||||||||||||||||
| validator=deep_iterable(member_validator=finite_float), | ||||||||||||||||
| ) | ||||||||||||||||
| """The coefficients for the weighted sum, one per entry in ``parameters``. | ||||||||||||||||
|
|
||||||||||||||||
| Defaults to all-ones, i.e. an unweighted sum.""" | ||||||||||||||||
|
|
||||||||||||||||
| @coefficients.default | ||||||||||||||||
| def _default_coefficients(self) -> tuple[float, ...]: | ||||||||||||||||
| """Return equal weight coefficients as default.""" | ||||||||||||||||
| return (1.0,) * len(self.parameters) | ||||||||||||||||
|
|
||||||||||||||||
| @coefficients.validator | ||||||||||||||||
| def _validate_coefficients( # noqa: DOC101, DOC103 | ||||||||||||||||
| self, _: Any, coefficients: Sequence[float] | ||||||||||||||||
| ) -> None: | ||||||||||||||||
| """Validate the coefficients. | ||||||||||||||||
|
|
||||||||||||||||
| Raises: | ||||||||||||||||
| ValueError: If the number of coefficients does not match the number of | ||||||||||||||||
| parameters. | ||||||||||||||||
| """ | ||||||||||||||||
| if len(self.parameters) != len(coefficients): | ||||||||||||||||
| raise ValueError( | ||||||||||||||||
| "The given 'coefficients' list must have one floating point entry for " | ||||||||||||||||
| "each entry in 'parameters'." | ||||||||||||||||
| ) | ||||||||||||||||
| if any(c == 0.0 for c in coefficients): | ||||||||||||||||
| raise ValueError("All entries in 'coefficients' must be non-zero.") | ||||||||||||||||
|
|
||||||||||||||||
| @override | ||||||||||||||||
| def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index: | ||||||||||||||||
| evaluate_df = df[self.parameters].sum(axis=1) | ||||||||||||||||
| evaluate_df = pd.Series( | ||||||||||||||||
| sum( | ||||||||||||||||
| df[p].to_numpy() * c for p, c in zip(self.parameters, self.coefficients) | ||||||||||||||||
| ), | ||||||||||||||||
| index=df.index, | ||||||||||||||||
| ) | ||||||||||||||||
|
Comment on lines
+136
to
+141
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So just to be sure I understand this right: you are saying that accessing all columns simultaneously could give a non-contiguous array and that the
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. afaik
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So I understand this is only speculative? It could indeed be the case, but just as likely could it be the opposite. Numpy, for example, is where efficient with creating views based on slicing, with no copying involved. And since pandas uses numpy under the hood, there are chances that no copying is involved here, in which case the
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you're overestimating what even numpy can do, if you index an array with eg indices 0,3,7 it will always do a copy because that cannot be represented into numpys basepointer/shape/stride model or in other words: it is not a slice. So pandas just inherits from that. Cant spot any speculation here |
||||||||||||||||
| mask_bad = ~self.condition.evaluate(evaluate_df) | ||||||||||||||||
|
|
||||||||||||||||
| return df.index[mask_bad] | ||||||||||||||||
|
|
@@ -105,7 +147,8 @@ def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index: | |||||||||||||||
| def get_invalid_polars(self) -> pl.Expr: | ||||||||||||||||
| from baybe._optional.polars import polars as pl | ||||||||||||||||
|
|
||||||||||||||||
| return self.condition.to_polars(pl.sum_horizontal(self.parameters)).not_() | ||||||||||||||||
| weighted = [pl.col(p) * c for p, c in zip(self.parameters, self.coefficients)] | ||||||||||||||||
| return self.condition.to_polars(pl.sum_horizontal(weighted)).not_() | ||||||||||||||||
|
|
||||||||||||||||
|
|
||||||||||||||||
| @define | ||||||||||||||||
|
|
||||||||||||||||

Uh oh!
There was an error while loading. Please reload this page.