Add Hyperparameter Optimization (HPO) examples using Ray Tune and HyperNOs #2070

MaxGhi8 wants to merge 5 commits into lululxvi:master
Conversation
Pull request overview
Adds new hyperparameter-optimization (HPO) example scripts intended to demonstrate integrating DeepXDE problems with HyperNOs + Ray Tune, and updates the top-level README to reference these capabilities.
Changes:
- Update README.md formatting and add HPO mentions/links.
- Add Ray Tune + HyperNOs HPO example scripts for operator learning (Poisson, Advection 1D/2D).
- Add Ray Tune + HyperNOs HPO example scripts for PINN forward/inverse diffusion.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `README.md` | Reformats the algorithm list; adds an HPO feature bullet and a new "Demos" link. |
| `examples/operator/poisson_1d_hpo.py` | New HPO example for Poisson operator learning via `DeepONetCartesianProd`. |
| `examples/operator/advection_hpo.py` | New HPO example for 1D advection PI-DeepONet-style operator learning. |
| `examples/operator/advection_2d_hpo.py` | New HPO example for 2D (mapped) advection PI-DeepONet-style operator learning. |
| `examples/pinn_forward/diffusion_1d_hpo.py` | New HPO example script for the 1D diffusion forward problem. |
| `examples/pinn_inverse/diffusion_1d_inverse_hpo.py` | New HPO example script for the 1D diffusion inverse problem. |
```markdown
- [Demos of forward problems](https://deepxde.readthedocs.io/en/latest/demos/pinn_forward.html)
- [Demos of inverse problems](https://deepxde.readthedocs.io/en/latest/demos/pinn_inverse.html)
- [Demos of operator learning](https://deepxde.readthedocs.io/en/latest/demos/operator.html)
- [Demos of hyperparameter optimization](examples/README.md)
```
The new README link points to examples/README.md, but there is no examples/README.md file in the repository. This will render as a broken link on GitHub; either add that file or link to an existing docs page/path that contains the HPO examples.
Suggested change:
```diff
-- [Demos of hyperparameter optimization](examples/README.md)
+- [Demos of hyperparameter optimization](https://deepxde.readthedocs.io/en/latest/demos/hpo.html)
```
```markdown
- 4 **function spaces**: power series, Chebyshev polynomial, Gaussian random field (1D/2D).
- **data-parallel training** on multiple GPUs.
- different **optimizers**: Adam, L-BFGS, etc.
- **hyperparameter optimization** using [HyperNOs](https://github.com/MaxGhi8/HyperNOs) and [Ray Tune](https://docs.ray.io/en/latest/tune/index.html).
```
The README claims DeepXDE supports hyperparameter optimization using HyperNOs + Ray Tune, but these are not project dependencies (and the PR description links a different HyperNOs org). Consider rephrasing this as an example integration and mentioning that hypernos/ray[tune] are optional extra installs, and align the HyperNOs URL with the one intended for this PR.
Suggested change:
```diff
-- **hyperparameter optimization** using [HyperNOs](https://github.com/MaxGhi8/HyperNOs) and [Ray Tune](https://docs.ray.io/en/latest/tune/index.html).
+- example **hyperparameter optimization** integrations (via optional extra packages) using [HyperNOs](https://pypi.org/project/hypernos/) and [Ray Tune](https://docs.ray.io/en/latest/tune/index.html) (install `hypernos` and `ray[tune]` separately).
```
```python
def solve_advection_1d():
    # PDE: u_t + u_x = 0
    def pde_fn(x, y):
        dy_x = dde.grad.jacobian(y, x, j=0)
        dy_t = dde.grad.jacobian(y, x, j=1)
        return dy_t + dy_x
```
PDEOperatorCartesianProd operator-learning setups pass the sampled branch function values into the PDE callback (see the referenced advection_aligned_pideeponet.py, which defines pde(x, y, v)). Here pde_fn is defined as (x, y) only, which will raise a TypeError when DeepXDE calls it with the extra v argument. Update the signature to accept v (even if unused).
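A minimal sketch of the corrected callback, matching the three-argument signature used in the referenced base example (`v` carries the sampled branch-function values and is unused for this PDE):

```python
def pde_fn(x, y, v):
    # v holds the sampled branch-function values; unused for u_t + u_x = 0
    dy_x = dde.grad.jacobian(y, x, j=0)
    dy_t = dde.grad.jacobian(y, x, j=1)
    return dy_t + dy_x
```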
```python
def solve_advection_2d():
    # PDE: u_y + u_x = 0 (where y is time)
    def pde_fn(x, y):
        dy_x = dde.grad.jacobian(y, x, j=0)
        dy_y = dde.grad.jacobian(y, x, j=1)
        return dy_y + dy_x
```
Same issue as the 1D version: the referenced base example defines the PDE as pde(x, y, v) for operator learning. Here pde_fn is (x, y) only, so DeepXDE is likely to call it with an extra branch-function argument and fail at runtime. Adjust pde_fn to accept the additional v parameter.
```python
# For HPO with HyperNOs, we use a dummy branch of ones
# and dummy targets of ones to avoid division by zero in relative loss.
# The target shape must match (batch, num_points).
X_branch_train = np.ones((1, 10))  # 1 sample, 10 eval points
X_branch_test = np.ones((1, 10))

y_train = np.ones((1, num_train))
y_test = np.ones((1, num_test))
```
The current HPO objective is trained against an all-ones target (y_train/y_test). That makes the Ray Tune search largely uninformative (most configs can fit a constant field), and it no longer reflects a diffusion PINN example as described. Consider generating targets from the known analytic solution in examples/pinn_forward/diffusion_1d.py (or using a physics-residual-based objective) so the validation loss actually measures PDE solution quality.
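For instance, a hedged sketch using the exact solution u(x, t) = e^(-t) sin(πx) from examples/pinn_forward/diffusion_1d.py; the arrays `X_trunk_train`/`X_trunk_test` of (x, t) points are assumed names, not necessarily what the script exposes:

```python
import numpy as np

def exact_solution(xt):
    # Exact solution of the diffusion_1d example: u(x, t) = exp(-t) * sin(pi * x)
    return np.exp(-xt[:, 1]) * np.sin(np.pi * xt[:, 0])

# Replace the all-ones dummies with analytic targets at the trunk points,
# keeping the (batch, num_points) shape expected by the loaders.
y_train = exact_solution(X_trunk_train)[None, :]
y_test = exact_solution(X_trunk_test)[None, :]
```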
```python
# Observation points as trunk inputs
X_trunk_train = np.vstack((np.linspace(-1, 1, num=50), np.full((50), 1))).T
X_trunk_test = X_trunk_train  # Simplified for HPO

num_pts = X_trunk_train.shape[0]

# Dummy branch and non-zero targets to avoid division by zero
X_branch_train = np.ones((1, 10))
X_branch_test = np.ones((1, 10))

y_train = np.ones((1, num_pts))
y_test = np.ones((1, num_pts))

class PDEData:
    def __init__(self, X_branch, X_trunk, y, batch_size):
        self.train_loader = DataLoader(
            PINNInverseCartesianDataset(X_branch, X_trunk, y),
            batch_size=batch_size,
            shuffle=True,
            collate_fn=deeponet_collate_fn,
        )
        self.val_loader = DataLoader(
            PINNInverseCartesianDataset(X_branch, X_trunk, y),
            batch_size=batch_size,
            shuffle=False,
            collate_fn=deeponet_collate_fn,
        )
        self.test_loader = self.val_loader

return PDEData(X_branch_train, X_trunk_train, y_train, config.get("batch_size", 1))
```
X_trunk_test, X_branch_test, and y_test are defined but unused, and val_loader is constructed from the same data as train_loader. If the intention is to do architecture selection, consider providing a distinct validation loader (even if small) to prevent tuning on training loss only.
Suggested change (replacing the block above):
```python
# Observation points as trunk inputs (train)
X_trunk_train = np.vstack((np.linspace(-1, 1, num=50), np.full((50), 1))).T
# Separate observation points for validation/test to avoid tuning on training loss only
X_trunk_test = np.vstack((np.linspace(-1, 1, num=20), np.full((20), 1))).T
num_pts_train = X_trunk_train.shape[0]
num_pts_test = X_trunk_test.shape[0]

# Dummy branch and non-zero targets to avoid division by zero
X_branch_train = np.ones((1, 10))
X_branch_test = np.ones((1, 10))

y_train = np.ones((1, num_pts_train))
y_test = np.ones((1, num_pts_test))

class PDEData:
    def __init__(
        self,
        X_branch_train,
        X_trunk_train,
        y_train,
        X_branch_val,
        X_trunk_val,
        y_val,
        batch_size,
    ):
        self.train_loader = DataLoader(
            PINNInverseCartesianDataset(X_branch_train, X_trunk_train, y_train),
            batch_size=batch_size,
            shuffle=True,
            collate_fn=deeponet_collate_fn,
        )
        self.val_loader = DataLoader(
            PINNInverseCartesianDataset(X_branch_val, X_trunk_val, y_val),
            batch_size=batch_size,
            shuffle=False,
            collate_fn=deeponet_collate_fn,
        )
        self.test_loader = DataLoader(
            PINNInverseCartesianDataset(X_branch_val, X_trunk_val, y_val),
            batch_size=batch_size,
            shuffle=False,
            collate_fn=deeponet_collate_fn,
        )

return PDEData(
    X_branch_train,
    X_trunk_train,
    y_train,
    X_branch_test,
    X_trunk_test,
    y_test,
    config.get("batch_size", 1),
)
```
```python
# Generate training and test data from DeepXDE operator
X_train, y_train, aux_train = pde_op.train_next_batch(config["training_samples"])
X_test, y_test, aux_test = pde_op.test()

# In PDEOperatorCartesianProd, y_train is often None (physics-informed).
# For this HPO demo, we'll use a dummy target if y is None,
# or you could solve the PDE to get ground truth.
if y_train is None:
    y_train = np.ones((X_train[0].shape[0], X_train[1].shape[0], 1))
if y_test is None:
    y_test = np.ones((X_test[0].shape[0], X_test[1].shape[0], 1))
```
For operator-learning via PDEOperatorCartesianProd, y_train is None because DeepXDE expects to train via the physics-informed residual. Replacing it with a constant dummy target means the HPO run is no longer optimizing for the Poisson operator (it’s just fitting a constant). Either compute true solution targets for the sampled forcings (or implement a residual-based loss inside the training loop) or make it explicit in the script/docs that this is only an integration template and not a meaningful Poisson HPO benchmark.
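One possibility, assuming the operator maps a forcing f to the solution of -u'' = f with zero Dirichlet boundary conditions on a uniform grid (and that the branch grid coincides with the trunk evaluation points — both assumptions, not facts from this PR), is to compute reference targets with a small finite-difference solve; a sketch:

```python
import numpy as np

def solve_poisson_fd(f_vals, h):
    # Solve -u'' = f on a uniform grid with spacing h and u = 0 at both ends,
    # via the standard second-order central-difference discretization.
    n = len(f_vals)
    A = 2.0 * np.eye(n - 2) - np.eye(n - 2, k=1) - np.eye(n - 2, k=-1)
    u_inner = np.linalg.solve(A, h**2 * f_vals[1:-1])
    return np.concatenate(([0.0], u_inner, [0.0]))

# Hypothetical usage: each row of X_train[0] is a sampled forcing on the grid.
# h = 1.0 / (X_train[0].shape[1] - 1)
# y_train = np.stack([solve_poisson_fd(f, h) for f in X_train[0]])
```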
```python
    [dim_trunk] + [config["network_width"]] * config["trunk_depth"] + [p]
)

model = dde.nn.DeepONetCartesianProd(
    layer_sizes_branch,
    layer_sizes_trunk,
    "tanh",
    "Glorot normal",
)

# Optional: Feature transform for periodicity as in the base example
def periodic(x):
    xt, tt = x[:, :1], x[:, 1:]
    xt = xt * 2 * np.pi
    return torch.cat(
        [torch.cos(xt), torch.sin(xt), torch.cos(2 * xt), torch.sin(2 * xt), tt], 1
    )

# Note: Applying feature transform might change trunk input dim to 5.
# If we apply it, we need to adjust layer_sizes_trunk[0].
layer_sizes_trunk_transformed = (
    [5] + [config["network_width"]] * config["trunk_depth"] + [p]
)
model = dde.nn.DeepONetCartesianProd(
    layer_sizes_branch,
    layer_sizes_trunk_transformed,
    "tanh",
    "Glorot normal",
)
model.apply_feature_transform(periodic)
```
model_builder instantiates a DeepONetCartesianProd twice; the first model is immediately overwritten when applying the feature transform. This extra instantiation is unnecessary and can confuse readers. Consider constructing the transformed trunk layer sizes up-front and creating the model only once before calling apply_feature_transform.
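A sketch of the single-construction version, reusing the names from the hunk above:

```python
# Build the trunk layer sizes once, accounting for the 5-dimensional
# periodic feature transform, and construct the model a single time.
layer_sizes_trunk = [5] + [config["network_width"]] * config["trunk_depth"] + [p]
model = dde.nn.DeepONetCartesianProd(
    layer_sizes_branch,
    layer_sizes_trunk,
    "tanh",
    "Glorot normal",
)
model.apply_feature_transform(periodic)
```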
```python
# In PI-DeepONet, we often use dummy targets if we are pure physics-informed,
# or initial condition values. In this HPO demo, we use ones to avoid division by zero.
if y_train is None:
    y_train = np.ones((X_train[0].shape[0], X_train[1].shape[0], 1))
if y_test is None:
    y_test = np.ones((X_test[0].shape[0], X_test[1].shape[0], 1))

if y_train.shape[-1] == 1:
    y_train = y_train.squeeze(-1)
if y_test.shape[-1] == 1:
    y_test = y_test.squeeze(-1)
```
When y_train/y_test are None (physics-informed operator setup), this example replaces them with constant ones. That means the HPO run is not optimizing the PI-DeepONet physics objective at all (it becomes a trivial supervised constant-fitting task). Either (1) implement a residual-based loss using the PDE definition, or (2) compute non-trivial targets from an analytic/numerical solution, or (3) clearly label this as a Ray/HyperNOs integration template rather than an advection PI-DeepONet HPO example.
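For option (2), a hedged sketch assuming the 1D advection problem is posed on a periodic domain [0, 1] (so the exact solution is u(x, t) = u0(x − t)); the function and array names here are hypothetical, not taken from the PR:

```python
import numpy as np

def advection_targets(u0_samples, x_grid, trunk_xt):
    # Exact solution of u_t + u_x = 0 with periodic BCs: u(x, t) = u0(x - t).
    # u0_samples: (n_funcs, n_grid) initial conditions sampled on x_grid;
    # trunk_xt: (n_points, 2) evaluation points (x, t).
    shifted = (trunk_xt[:, 0] - trunk_xt[:, 1]) % 1.0  # wrap around [0, 1)
    return np.stack(
        [np.interp(shifted, x_grid, u0, period=1.0) for u0 in u0_samples]
    )
```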
```python
# Handle potentially missing targets in pure physics-informed setup.
# In this HPO demo, we use ones to avoid division by zero.
if y_train is None:
    y_train = np.ones((X_train[0].shape[0], X_train[1].shape[0], 1))
if y_test is None:
    y_test = np.ones((X_test[0].shape[0], X_test[1].shape[0], 1))

if y_train.shape[-1] == 1:
    y_train = y_train.squeeze(-1)
if y_test.shape[-1] == 1:
    y_test = y_test.squeeze(-1)
```
Same concern as the 1D advection HPO script: substituting y_train/y_test with constant ones turns this into a trivial supervised objective and does not reflect PI-DeepONet physics-informed training. Consider using a residual-based loss (preferred for PI-DeepONet) or generating meaningful solution targets, or explicitly documenting that this is only an integration/template example.
echen5503 left a comment:
Please provide docs for at least one of the examples. Follow the format as in #2059.
Additionally, the examples themselves seem dubious: not every hyperparameter needs to be tuned at once, and each example is mostly boilerplate. Please make each example more unique and show different aspects of hyperparameter tuning.
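For instance, one way to differentiate the scripts (a hypothetical sketch, not code from this PR) is to tune only a few hyperparameters per example and showcase a different Ray Tune feature in each, such as an ASHA early-stopping scheduler; `trainable` is a placeholder for a training function that reports `val_loss`:

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler

# A small, focused search space: continuous learning rate on a log scale,
# plus a categorical network width.
search_space = {
    "lr": tune.loguniform(1e-4, 1e-2),
    "network_width": tune.choice([32, 64, 128]),
}

# ASHA stops unpromising trials early instead of training every config fully.
scheduler = ASHAScheduler(metric="val_loss", mode="min", grace_period=5)

tuner = tune.Tuner(
    trainable,  # placeholder training function reporting "val_loss"
    param_space=search_space,
    tune_config=tune.TuneConfig(scheduler=scheduler, num_samples=20),
)
results = tuner.fit()
```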
```markdown
DeepXDE is a library for scientific machine learning and physics-informed learning. DeepXDE includes the following algorithms:

- physics-informed neural network (PINN)
    - solving different problems
```
Don't randomly change whitespace; keep changes minimal.
Force-pushed: d3b1642 → e822d5b (Compare)
Thank you for the feedback. I have refactored the PR as requested:

Great, this looks much better now. Can you provide screenshots of the documentation and outputs of the code so that we can be sure it runs correctly?

Running with
```python
@@ -0,0 +1,223 @@
"""
```
Suggested change:
```diff
-"""
+"""Backend Supported: pytorch
```
```python
@@ -0,0 +1,223 @@
"""
Backend supported: pytorch
```
Suggested change:
```diff
-Backend supported: pytorch
```
```python
@@ -0,0 +1,195 @@
"""
```
Same here; remove the newline after the opening `"""`.
```markdown
- [Demos of forward problems](https://deepxde.readthedocs.io/en/latest/demos/pinn_forward.html)
- [Demos of inverse problems](https://deepxde.readthedocs.io/en/latest/demos/pinn_inverse.html)
- [Demos of operator learning](https://deepxde.readthedocs.io/en/latest/demos/operator.html)
- [Demos of hyperparameter optimization](https://deepxde.readthedocs.io/en/latest/demos/operator/advection_2d_hpo.html)
```
This may not be a very extensible approach: if someone adds more hyperparameter-optimization examples, each one would have to be linked individually.
Hyperparameter optimization doesn't fit neatly inside operator, inverse, or forward; maybe you could create a hyperparameter folder, which would also be extensible for people wanting to add examples of learning-rate annealing, Optuna, etc.
```rst
@@ -0,0 +1,205 @@
2D advection: Comprehensive HPO with HyperNOs
=============================================
```
Can you put the approximate runtime here? Hyperparam optimization often takes a long time, so the user should be assured of the code's runtime.
```bash
python examples/operator/advection_2d_hpo.py
```

The script will launch the Ray Tune dashboard, where you can monitor the progress of each trial in real time. Once finished, it will print the best configuration and the corresponding relative loss.
Let the user see the final code here; it is standard for all the other documentation examples.
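For reference, a minimal sketch of how the final summary could be produced with the Tuner API (assuming the script uses `ray.tune.Tuner` and reports a `val_loss` metric; neither is confirmed by the PR):

```python
# Assuming `tuner` is the ray.tune.Tuner configured by the script:
results = tuner.fit()

# Extract and print the best trial's configuration and validation loss.
best = results.get_best_result(metric="val_loss", mode="min")
print("Best config:", best.config)
print("Best relative loss:", best.metrics["val_loss"])
```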
Hello, I have just pushed the latest updates. I followed your feedback and implemented all the suggested changes. Specifically, I have reorganized the files by moving the examples and documentation into dedicated `hyperparameter` folders.
echen5503 left a comment:
Looking much better. Just a few small changes before I think it's good.
```rst
- `Diffusion reaction equation with aligned points using ZCS <https://github.com/lululxvi/deepxde/tree/master/examples/operator/diff_rec_aligned_zcs_pideeponet.py>`_
- `Stokes flow with aligned points using ZCS <https://github.com/lululxvi/deepxde/tree/master/examples/operator/stokes_aligned_zcs_pideeponet.py>`_
```
What's the point of this?
```rst
.. note::

   This code takes about 5 minutes to run.
```
Mention what computer you ran this on.
```rst
:maxdepth: 1

hyperparameter/advection_2d_hpo
```
I've implemented the requested changes.

Ok. Looks good now.






This PR introduces a comprehensive set of examples for hyperparameter optimization (HPO) in DeepXDE, leveraging Ray Tune and the HyperNOs (https://github.com/lu-group/HyperNOs) framework.
The examples cover the core functionalities of the library:
These examples provide a unified, "supervised operator style" template for HPO that is easily extensible to other physics-informed learning tasks.