
Optimization Cycle I: Loop merge & transient refine #465

Draft
FlorianDeconinck wants to merge 100 commits into NOAA-GFDL:develop from FlorianDeconinck:opt_cycle_I/loop_merge

FlorianDeconinck (Collaborator) commented May 14, 2026

Description

Readying the following Schedule Tree transforms for mainline:

  • CartesianMerge
  • InlineVertical2DWrite
  • CartesianRefineTransients

QOL / Tooling:

  • TreeOptimizationStatistics records the before/after counts of maps, fors and transients
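
To make the before/after bookkeeping concrete, here is a minimal sketch of how such counts could be collected. It is not the actual `TreeOptimizationStatistics` implementation: `Node`, `Counts`, `collect` and `report` are hypothetical names, and the toy tree stands in for a real schedule tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a schedule tree node."""

    kind: str  # "map", "for", or anything else (e.g. "tasklet")
    children: list["Node"] = field(default_factory=list)


@dataclass(frozen=True)
class Counts:
    """Hypothetical snapshot of the three tracked quantities."""

    maps: int
    fors: int
    transients: int


def collect(root: Node, transients: set[str]) -> Counts:
    """Walk the tree iteratively and count map and for nodes."""
    maps = fors = 0
    stack = [root]
    while stack:
        node = stack.pop()
        maps += node.kind == "map"
        fors += node.kind == "for"
        stack.extend(node.children)
    return Counts(maps, fors, len(transients))


def report(before: Counts, after: Counts) -> str:
    """Format the before/after counts on one line."""
    return (
        f"maps {before.maps} -> {after.maps}, "
        f"fors {before.fors} -> {after.fors}, "
        f"transients {before.transients} -> {after.transients}"
    )


# Two sibling maps before the (assumed) merge, one map afterwards.
before_tree = Node(
    "root",
    [Node("map", [Node("tasklet")]), Node("map", [Node("tasklet")])],
)
after_tree = Node("root", [Node("map", [Node("tasklet"), Node("tasklet")])])

before = collect(before_tree, {"tmp_a", "tmp_b"})
after = collect(after_tree, {"tmp_a"})
print(report(before, after))
```

Taking one snapshot before the passes run and one after keeps the statistics independent of which transforms are enabled.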

🐞 Regressions / bugs worked around

  • Locals are now non-transient on GPU because of bugs that surface during tree optimization
  • CartesianRefineTransients is not applied on GPU, for the same reason

⚠️ This PR includes an update to temporary branches of gt4py/dace to consolidate all changes needed for the June presentation

How has this been tested?

New tests were added where needed

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. add new modules to docs/docstrings/)
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • New check tests, if applicable, are included

FlorianDeconinck requested review from romanc and twicki May 14, 2026 16:06
FlorianDeconinck and others added 26 commits May 14, 2026 15:28
Move pipeline defaults inside the Pipeline itself and have orchestration call default
Mockup of passes required for merging to behave
Use symbols in the replacement directory. Update DaCe to a version
that doesn't re-initialize the symbols. And fix the test failure in
python 3.13.
This has been replaced with the `InlineOffgridConditionals` pass
- Locals are no longer transient on GPU
- RefineTransients is deactivated
romanc and others added 30 commits June 19, 2026 15:02
This commit brings an optimization config, which allows tweaking
optimization parameters per NDSLRuntime and/or `orchestrate()` call.
 - Remove "Config" from nested name
 - Add `GPU` & apply to orchestration
 - Some docs
 - Display configuration for building ranks
…havior of allocation when `fortran_aligned` that does not pad dimensions to be the same length

Added a `--pad_non_interface_dimensions` option to the unit tests so the flag can be passed during testing
Let's keep the naming of this consistent with the pass name, i.e.
`KernelizeMaps` / `KernelizeMap`.
On GPU, we already used the OptimizationConfig to turn on/off certain
pipeline passes. This commit extends that work to the `CPUPipeline` and
adds OptimizationConfig flags where missing.
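
As an illustration of how configuration flags can switch pipeline passes on and off, here is a minimal sketch. The class `OptimizationConfigSketch`, its field names, the `gpu()` constructor and the pass ordering are all assumptions made for this example; the real `OptimizationConfig` may look different. The only detail taken from this PR is that transient refinement is deactivated on GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizationConfigSketch:
    """Hypothetical flags; the real OptimizationConfig fields may differ."""

    cartesian_merge: bool = True
    inline_vertical_2d_write: bool = True
    refine_transients: bool = True

    @classmethod
    def gpu(cls) -> "OptimizationConfigSketch":
        # Per the PR description, transient refinement is not applied on GPU.
        return cls(refine_transients=False)


def enabled_passes(config: OptimizationConfigSketch) -> list[str]:
    """Return pass names in a fixed (assumed) order, skipping disabled ones."""
    candidates = [
        ("CartesianMerge", config.cartesian_merge),
        ("InlineVertical2DWrite", config.inline_vertical_2d_write),
        ("CartesianRefineTransients", config.refine_transients),
    ]
    return [name for name, enabled in candidates if enabled]


cpu_passes = enabled_passes(OptimizationConfigSketch())
gpu_passes = enabled_passes(OptimizationConfigSketch.gpu())
print(cpu_passes)
print(gpu_passes)
```

A frozen dataclass keeps a configuration immutable once it has been handed to a pipeline, so separate calls cannot change each other's settings.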
Because we have a "dace/" directory in `ndsl/dsl`, the previous import
could be resolved as a local import. If that happened (depending on
import order), then DaCe's `Config` object would not be found there.
Resolved by importing `Config` from `dace.config`, which resolves
unambiguously.
The default merging order for `CartesianMerge` is to follow the loop
order of the given backend. This commit adds support for a custom merge
order override.
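
A rough sketch of the default-plus-override behaviour described in this commit message follows. `resolve_merge_order` is a hypothetical helper, not part of `CartesianMerge`, and the axis names and the example loop order are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def resolve_merge_order(
    backend_loop_order: Sequence[str],
    override: Optional[Sequence[str]] = None,
) -> tuple[str, ...]:
    """Pick the merge order: the backend's loop order unless overridden."""
    if override is None:
        return tuple(backend_loop_order)
    # An override is only accepted if it is a permutation of the same axes.
    if sorted(override) != sorted(backend_loop_order):
        raise ValueError(
            f"override {tuple(override)} must be a permutation of "
            f"{tuple(backend_loop_order)}"
        )
    return tuple(override)


# Assumed loop order for some backend; real backends may differ.
default_order = resolve_merge_order(("K", "J", "I"))
custom_order = resolve_merge_order(("K", "J", "I"), override=("I", "J", "K"))
print(default_order, custom_order)
```

Rejecting an override that is not a permutation of the backend's axes catches a typo in the axis names early instead of silently skipping an axis.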
Revert "fixup: use normalized indices in debug message"

This reverts commit b24e1fc.

Revert "fix: account for map start in axis normalization"

This reverts commit de9763d.

2 participants