Skip to content

GPU parallel prefix scan for clear-sky adding method — 6/6 tests pass #298

Description

@consigcody94

Summary

The non-scattering (clear-sky) adding method in ADA_Module.f90 (lines 228-246) can be parallelized using a parallel prefix scan with affine tuples, reducing sequential depth from O(N_layers) to O(log N_layers).

The radiance recurrence Rad_UP[k-1] = source_up[k] + trans[k] * Rad_UP[k] is an affine map, and affine map composition is associative — enabling a Hillis-Steele parallel scan on GPU.

Benchmark (RTX 3060, CUDA 13.2)

Configuration Profiles Layers Angles GPU Time Max Rel Error Status
IR sounder 10,000 60 6 0.18 ms 4.1e-07 PASS
IR sounder 50,000 60 6 0.87 ms 4.3e-07 PASS
High-res 10,000 100 6 0.32 ms 4.3e-07 PASS
MW sounder 100,000 60 2 0.59 ms 3.5e-07 PASS
Multi-stream 10,000 60 16 0.44 ms 4.4e-07 PASS
Deep atm 1,000 200 6 0.06 ms 3.1e-07 PASS

Zero NaN, zero Inf across all tests.

Scope & Limitations

What works: The scalar (non-scattering) adding path where layer transmittance and reflectance are diagonal. Each discrete ordinate angle is processed independently.

What is NOT tested: The full scattering path (lines 198-226) which involves nZ×nZ matrix operations with matinv at every layer. The matrix Redheffer star product is theoretically still associative, but the parallel scan with matrix inverses inside the operator is more complex and may have numerical issues. This needs further research.

Hardware limitation: Tested on RTX 3060 12GB only. Production CRTM workloads on WCOSS2 would need A100/H100 validation.

Mathematical Basis

Same technique as applied to rte-rrtmgp (earth-system-radiation/rte-rrtmgp#393):

  • Affine tuple operator: (a₂,b₂) ∘ (a₁,b₁) = (a₂·a₁, a₂·b₁+b₂) (Martin & Cundy, ICLR 2018)
  • Associativity proven by Grant & Hunt (1969) for RT layer operators

Code

CUDA kernel: https://github.com/consigcody94/parallel-prefix-rt/blob/master/benchmarks/cuda/crtm_scalar_adding.cu

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions