Add default support for 3.2km (ne1024pg2) #80

@jtruesdal

What is the feature/what would you like to discuss?

This issue tracks changes needed to support 3.2km resolution runs.

Using the E3SM theta-l documentation, together with the physics and hardware configuration settings from previous 3.7km runs, we set the initial ne1024pg2 dycore defaults for running a spun-up configuration:
se_dt_remap_factor = 2
se_dt_tracer_factor = 6
se_hypervis_subcycle = 1
se_hypervis_subcycle_q = 6
se_hypervis_subcycle_tom = 1
se_nu_top = 1e4
se_tstep = 8.33333333333333

A physics timestep of 100s was recommended for this resolution, so a coupling frequency of 864 couplings per day (86400s / 100s) was added as the default for this resolution.
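As a quick sanity check on the numbers above, the sketch below recomputes the derived timesteps. It assumes the usual theta-l convention that `se_dt_remap_factor` and `se_dt_tracer_factor` each multiply `se_tstep`; that relationship is an assumption here, not something stated in this issue, and the variable names `remap_dt`, `tracer_dt`, and `couplings_per_day` are illustrative only.

```python
from fractions import Fraction

SECONDS_PER_DAY = 86400

# Values taken from the defaults listed above.
se_tstep = Fraction(25, 3)   # 8.33333333333333 s, written exactly as 25/3
se_dt_remap_factor = 2
se_dt_tracer_factor = 6
physics_dt = 100             # recommended physics timestep in seconds

# Assumption: each factor multiplies se_tstep to give that process's timestep.
remap_dt = se_tstep * se_dt_remap_factor
tracer_dt = se_tstep * se_dt_tracer_factor

# One coupling per physics step over a full day.
couplings_per_day = SECONDS_PER_DAY // physics_dt
tracer_steps_per_physics = Fraction(physics_dt) / tracer_dt

print(f"remap dt  = {float(remap_dt):.4f} s")
print(f"tracer dt = {float(tracer_dt):.4f} s")
print(f"tracer steps per physics step = {tracer_steps_per_physics}")
print(f"couplings per day = {couplings_per_day}")
```

Under that assumption the tracer timestep divides the 100s physics timestep evenly, which is the property worth rechecking whenever `se_tstep` or the factors change.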

Adam gave some initial hardware configurations that worked for the MPASA 3km runs, which ran on 38400 PEs using 80 MPI tasks per node (480 nodes). Our relevant defaults are:

./xmlchange CAM_TARGET=theta-l_kokkos
./xmlchange --append CAM_CONFIG_OPTS="-analytic_ic"
./xmlchange STOP_OPTION=nhours
./xmlchange STOP_N=2
./xmlchange NTASKS=-480
./xmlchange PIO_BLOCKSIZE=800000000
./xmlchange PIO_NETCDF_FORMAT=64bit_data
./xmlchange MAX_MPITASKS_PER_NODE=80
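The relationship between `NTASKS=-480`, `MAX_MPITASKS_PER_NODE=80`, and the 38400 PEs mentioned above can be sketched as follows. This assumes the CIME convention that a negative `NTASKS` value is read as a node count; the helper names `nodes_for` and `resolve_ntasks` are hypothetical and not part of CIME.

```python
def nodes_for(total_tasks: int, tasks_per_node: int) -> int:
    """Number of nodes needed to hold total_tasks, rounded up."""
    return -(-total_tasks // tasks_per_node)


def resolve_ntasks(ntasks: int, tasks_per_node: int) -> int:
    """Assumed convention: a negative NTASKS is a node count, not a task count."""
    if ntasks < 0:
        return -ntasks * tasks_per_node
    return ntasks


MAX_MPITASKS_PER_NODE = 80

print(nodes_for(38400, MAX_MPITASKS_PER_NODE))      # nodes needed for 38400 PEs
print(resolve_ntasks(-480, MAX_MPITASKS_PER_NODE))  # tasks implied by NTASKS=-480
```

With 80 tasks per node, 38400 PEs and 480 nodes describe the same layout, so the defaults above are consistent with the configuration Adam suggested.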

New boundary topography and mesh files were created:
bnd_topo = '/glade/p/cesmdata/inputdata/atm/cam/topo/se/ne1024pg2_gmted2010_modis_bedmachine_nc3000_Laplace0007_noleak_20260210.nc'
/glade/campaign/cgd/amp/jet/ne1024pg2_ESMFmesh_cdf5_c250812.nc

New ne1024pg2 CTSM surface boundary data was created using the CTSM tool set:
/glade/p/cesmdata/inputdata/lnd/clm2/surfdata_esmf/ctsm5.4.0/surfdata_ne1024np4.pg2_hist_2000_78pfts_c260212.nc

Brian Dobbins provided a temporary fix for slow CTSM initialization, and a new CTSM tag was added via gitmodules. Adam also let us know that the land model was unable to interpolate some of the lower-resolution urban files, so I created a few additional CTSM boundary data files that have been added as defaults at this resolution:

/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/urbandata/CTSM52_tbuildmax_OlesonFeddema_2020_ne1024pg2_ESMFmesh_c240503.nc
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/urbandata/CTSM52_tbuildmax_OlesonFeddema_2020_ne1024pg2_simyr1849-2106_c240503.nc

An initial 32-level F2000dev run with the above settings ran for about 7 model hours before blowing up with a level inversion at the sponge boundary. Before putting more effort into the 32-level model, we started a 58-level configuration using the finidat created from the successful portion of the 32-level run. The 58-level configuration ran successfully for a few time steps. Currently we have FHISTC_LTso compsets configured and being tested.

Following the E3SM recommendation to halve the timestep and set the remap factor to 1, we are beginning spin-up procedures for ne1024pg2 using an analytic US standard temperature/pressure start.

======================
There were issues along the way when the decomposition was changed to use more cores and provide better performance. The first issue was a limitation of E3SM: every PE assigned to a component must receive at least one element of the mesh to work on. The 1x1 ice sst file was the first to fail this test when the run was requested on 512 nodes. We can get around this limitation by using fewer cores for the ice component, but when I tried to do that the model was terminated without much debug information to go on.

I've tried several different decompositions for the different components. All failed with what could have been memory issues, but the termination signal was not very specific. I also tried reducing pio_blocksize, on the theory that the MPI_Allgather calls were overwhelming the system bandwidth with many large blocks being sent at once. I was unable to get any of these configurations running either.
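The "at least one element per PE" rule can be checked before submitting a job. The sketch below is illustrative only: it assumes the 1x1 ice/SST file is a regular 360 x 180 lat-lon grid and that the ne1024pg2 mesh has 2 x 2 physics cells per spectral element. The task counts in `layout` are hypothetical and are not the decompositions that were actually tried.

```python
def se_element_count(ne: int) -> int:
    """Spectral elements on a cubed sphere: 6 faces of ne x ne elements."""
    return 6 * ne * ne


def pg_column_count(ne: int, pg: int) -> int:
    """Physics-grid cells: pg x pg cells per spectral element."""
    return se_element_count(ne) * pg * pg


def oversubscribed(layout: dict, mesh_sizes: dict) -> list:
    """Return the components whose task count exceeds their mesh element count."""
    return [comp for comp, ntasks in layout.items() if ntasks > mesh_sizes[comp]]


mesh_sizes = {
    "ATM": pg_column_count(1024, 2),  # ne1024pg2
    "ICE": 360 * 180,                 # assumed regular 1x1 degree grid
}

# Hypothetical task counts, chosen only to illustrate the rule.
layout = {"ATM": 70000, "ICE": 70000}
print(oversubscribed(layout, mesh_sizes))

# Capping the ice component at its element count satisfies the rule.
layout["ICE"] = min(layout["ICE"], mesh_sizes["ICE"])
print(oversubscribed(layout, mesh_sizes))
```

A check like this only covers the element-count rule; it says nothing about the memory or bandwidth problems described above.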

Is there anyone in particular you want to be part of this conversation?

No response

Will this change (regression test) answers?

Yes

Will you be implementing this enhancement yourself?

Yes

Labels: enhancement (New feature or request)
