# Directions to run:
# 1. Update <output>, <www>, <environment_commands_secondary> below.
# 2. Run with `zppy -c examples/post.v3.LR.amip.0101.cfg`.
# Direction to create stand-alone test data for zppy-interfaces:
# 3. Once the jobs finish, `cd <output>/post/scripts`.
# 4. Run `grep -n "Running a zi-pcmdi command" pcmdi_diags*.o*` to find the pcmdi_diags commands.
# 5. Then, you can run those lines stand-alone.
[default]
input = /pscratch/sd/z/zhan391/e3smv4_project/ne256pg2_ne256pg2.F20TR-SCREAMv1.July-1.spanc800.2xauto.acc150.n0032.test2.1
output = /pscratch/sd/z/zhan391/e3smv4_project/ne256pg2_ne256pg2.F20TR-SCREAMv1.July-1.spanc800.2xauto.acc150.n0032.test2.1
case = ne256pg2_ne256pg2.F20TR-SCREAMv1.July-1.spanc800.2xauto.acc150.n0032.test2.1
www = /global/cfs/cdirs/e3sm/www/zhan391/eamxx-pcmdi
partition = "debug"
account = "e3sm"
#account = "priority"
campaign = "water_cycle"
debug = False
environment_commands = "source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh"
[climo]
active = True
walltime = "2:00:00"
years = "1995:2004:10",
# Another example of `years`:
# years = "1985:2014:30", "1985:2014:15"
[[ atm_monthly_180x360_aave ]]
# The following e3sm_diags sets require it:
# "lat_lon", "zonal_mean_xy", "zonal_mean_2d", "polar", "cosp_histogram", "meridional_mean_2d", "annual_cycle_zonal_mean", "zonal_mean_2d_stratosphere", "aerosol_aeronet", "aerosol_budget"
input_component = "eamxx"
#cmip_plevdata = "/lcrc/group/e3sm/diagnostics/e3sm_to_cmip_data/grids/vrt_remap_plev19.nc"
case = "1ma_ne30pg2"
input_files = "AVERAGE.nmonths_x1"
frequency = "monthly"
#input_subdir = "archive/atm/hist"
input_subdir = "run/"
mapping_file = /global/cfs/cdirs/e3sm/diagnostics/maps/map_ne30pg2_to_cmip6_180x360_aave.20200201.nc
[[ atm_monthly_diurnal_8xdaily_180x360_aave ]]
# The following e3sm_diags sets require it:
# "diurnal_cycle"
input_component = "eamxx"
#cmip_plevdata = "/lcrc/group/e3sm/diagnostics/e3sm_to_cmip_data/grids/vrt_remap_plev19.nc"
case = "3ha_ne30pg2"
input_files = "AVERAGE.nhours_x3"
#input_subdir = "archive/atm/hist"
input_subdir = "run/"
frequency = "diurnal_8xdaily"
mapping_file = /global/cfs/cdirs/e3sm/diagnostics/maps/map_ne30pg2_to_cmip6_180x360_aave.20200201.nc
vars = "precip_liq_surf_mass_flux,precip_ice_surf_mass_flux"
[[ land_monthly_climo ]]
active = True
# This subtask is a dependency for the e3sm_diags task's lnd_monthly_mvm_lnd subtask.
# The following e3sm_diags sets require it:
# "lat_lon_land",
input_component = "elm"
# Note: if `case` is not specified, the default will be used.
frequency = "monthly"
input_files = "elm.h0"
#input_subdir = archive/lnd/hist
input_subdir = "run/"
vars = "" # Setting this as "" will tell zppy to use ALL variables
[[ land_monthly_180x360_traave ]]
active = True
input_component = "elm"
# Note: if `case` is not specified, the default will be used.
frequency = "monthly"
input_files = "elm.h0"
#input_subdir = "archive/lnd/hist"
input_subdir = "run/"
mapping_file = "/global/cfs/cdirs/e3sm/diagnostics/maps/map_ne256pg2_to_cmip6_180x360_traave.20250301.nc"
vars = ""
[ts]
active = True
walltime = "01:00:00"
years = "1995:2004:5"
ts_num_years=5
[[ atm_daily_180x360_aave ]]
active = True
# This subtask is a dependency for the e3sm_diags task's atm_monthly_180x360 subtask.
# The following e3sm_diags sets require it:
# "tropical_subseasonal", "precip_pdf"
input_component = "eamxx"
case = "1da_ne30pg2"
input_files = "AVERAGE.ndays_x1"
frequency = "daily"
#input_subdir = "archive/atm/hist"
input_subdir = "run/"
mapping_file = /global/cfs/cdirs/e3sm/diagnostics/maps/map_ne30pg2_to_cmip6_180x360_aave.20200201.nc
# Needed for Wheeler Kiladis
vars = "LW_flux_up_at_model_top,precip_liq_surf_mass_flux,precip_ice_surf_mass_flux,U_at_850hPa"
[[ atm_monthly_glb ]]
active = True
# This subtask is a dependency for the global_time_series task.
input_component = "eam"
#input_subdir = "archive/atm/hist"
input_subdir = "run/"
case = "1ma_ne30pg2"
input_files = "AVERAGE.nmonths_x1"
frequency = "monthly"
mapping_file = "glb"
vars="ps,surf_radiative_T,SeaLevelPressure,IceWaterPath,qv_2m,precip_liq_surf_mass_flux,precip_ice_surf_mass_flux"
#vars="omega_at_500hPa,omega_at_700hPa,omega_at_850hPa,T_mid_at_700hPa,T_2m,surface_upward_latent_heat_flux,surf_sens_flux,z_mid_at_700hPa,wind_speed_10m,surf_evap,U_at_10m_above_surface,V_at_10m_above_surface,LW_clrsky_flux_dn_at_model_bot,LW_clrsky_flux_up_at_model_top,LW_flux_dn_at_model_bot,LW_flux_up_at_model_bot,LW_flux_up_at_model_top,SW_clrsky_flux_dn_at_model_bot,SW_clrsky_flux_dn_at_model_top,SW_clrsky_flux_up_at_model_bot,SW_clrsky_flux_up_at_model_top,SW_flux_dn_at_model_bot,SW_flux_dn_at_model_top,SW_flux_up_at_model_bot,SW_flux_up_at_model_top,ShortwaveCloudForcing,LongwaveCloudForcing,isccp_cldtot"
[[ land_monthly ]]
active = True
# This subtask is a dependency for the e3sm_to_cmip task's land_monthly subtask.
input_component = "elm"
frequency = "monthly"
input_files = "elm.h0"
#input_subdir = "archive/lnd/hist"
input_subdir = "run/"
mapping_file = "/global/cfs/cdirs/e3sm/diagnostics/maps/map_ne256pg2_to_cmip6_180x360_traave.20250301.nc"
# Variables:
#vars = "FSH,RH2M,LAISHA,LAISUN,QINTR,QOVER,QRUNOFF,QSOIL,QVEGE,QVEGT,SOILICE,SOILLIQ,SOILWATER_10CM,TSA,TSOI,H2OSNO,TOTLITC,CWDC,SOIL1C,SOIL2C,SOIL3C,SOIL4C,WOOD_HARVESTC,TOTVEGC,NBP,GPP,AR,HR"
vars = "SOILWATER_10CM"
extra_vars = "landfrac"
[[ lnd_monthly_glb ]]
active = True
# This subtask is a dependency for the global_time_series task.
input_component = "elm"
frequency = "monthly"
input_files = "elm.h0"
#input_subdir = "archive/lnd/hist"
input_subdir = "run/"
mapping_file = "glb"
job_nbr = 50 # This reduces parallel processes in ncclimo time-series splitting for memory management.
#vars = "" # This will tell zppy to use all available variables.
vars = "FSH,RH2M,LAISHA,LAISUN,QINTR,QOVER,QRUNOFF,QSOIL,QVEGE,QVEGT,SOILWATER_10CM,TSA,H2OSNO,TOTLITC,CWDC,SOIL1C,SOIL2C,SOIL3C,SOIL4C,WOOD_HARVESTC,TOTVEGC,NBP,GPP,AR,HR"
[[ land_monthly_energy ]]
active = True
input_component = "elm"
frequency = "monthly"
input_files = "elm.h0"
#input_subdir = "archive/lnd/hist"
input_subdir = "run/"
mapping_file = ""
vars = "EFLX_LH_TOT,FIRA,FLDS,FSA,FSDS,FSRND,FSRVD,FSDSND,FSDSVD,FSH,TSA"
[[ rof_monthly ]]
active = True
# The following e3sm_diags sets require it:
# "streamflow"
input_component = "mosart"
frequency = "monthly"
input_files = "mosart.h0"
#input_subdir = "archive/lnd/hist"
input_subdir = "run/"
mapping_file = ""
# Variables:
vars = "RIVER_DISCHARGE_OVER_LAND_LIQ"
extra_vars = 'areatotal2'
[e3sm_to_cmip]
active = True
frequency = "monthly"
ts_grid = "180x360_aave"
ts_num_years=5
walltime = "00:10:00"
years = "1995:2004:5"
environment_commands = "source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh; conda activate zi-pcmdi-diags"
[[ atm_2d_monthly_180x360_aave ]]
input_component = "eamxx"
#cmip_plevdata = "/lcrc/group/e3sm/diagnostics/e3sm_to_cmip_data/grids/vrt_remap_plev19.nc"
case = "1ma_ne30pg2"
input_files = "AVERAGE.nmonths_x1"
ts_subsection = "atm_2d_monthly_180x360_aave"
vars="ps,surf_radiative_T,SeaLevelPressure,IceWaterPath,qv_2m,precip_liq_surf_mass_flux,precip_ice_surf_mass_flux,omega_at_500hPa,omega_at_700hPa,omega_at_850hPa,T_mid_at_700hPa,T_2m,surface_upward_latent_heat_flux,surf_sens_flux,z_mid_at_700hPa,wind_speed_10m,surf_evap,U_at_10m_above_surface,V_at_10m_above_surface,LW_clrsky_flux_dn_at_model_bot,LW_clrsky_flux_up_at_model_top,LW_flux_dn_at_model_bot,LW_flux_up_at_model_bot,LW_flux_up_at_model_top,SW_clrsky_flux_dn_at_model_bot,SW_clrsky_flux_dn_at_model_top,SW_clrsky_flux_up_at_model_bot,SW_clrsky_flux_up_at_model_top,SW_flux_dn_at_model_bot,SW_flux_dn_at_model_top,SW_flux_up_at_model_bot,SW_flux_up_at_model_top,ShortwaveCloudForcing,LongwaveCloudForcing,isccp_cldtot"
cmip_vars = "pr,cltisccp,evspsbl,hfls,hfss,huss,ps,psl,rlds,rldscs,rlus,rlut,rlutcs,rsds,rsdscs,rsdt,rsus,rsuscs,rtmt,uas,vas,sfcWind,tas,ts"
[[ atm_3d_monthly_180x360_aave ]]
input_component = "eamxx"
#cmip_plevdata = "/lcrc/group/e3sm/diagnostics/e3sm_to_cmip_data/grids/vrt_remap_plev19.nc"
case = "1ma_ne30pg2"
input_files = "AVERAGE.nmonths_x1"
interp_vars = "U,V,T_mid,z_mid,omega,RelativeHumidity,p_mid,qv"
ts_subsection = "atm_3d_monthly_180x360_aave"
vars="U,V,T_mid,z_mid,omega,RelativeHumidity,p_mid,qv"
cmip_vars = "ta,ua,va,zg"
[[ land_monthly ]]
active = True
# This subtask is a dependency for the ilamb task.
# This subtask depends on the ts task's land_monthly subtask.
# Notice this subtask name matches a subtask in the `ts` task.
# If it did not, then the `ts_land_subsection` parameter would be required here to tell zppy which subtask to use.
input_component = "elm"
ts_grid = "180x360_traave"
ts_land_subsection = "land_monthly"
frequency = "monthly"
input_files = "elm.h0"
cmip_vars = "mrsos"
[e3sm_diags]
active = True
multiprocessing = True
num_workers = 8
ref_final_yr = 1995
ref_start_yr = 2004
ts_num_years = 5
walltime = "4:00:00"
years = "1995:2004:10",
environment_commands = "source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh"
[[ atm_monthly_180x360_aave ]]
# `e3sm_diags` is largely driven by which e3sm_diags sets are requested:
sets="lat_lon","zonal_mean_xy","zonal_mean_2d","polar","cosp_histogram","meridional_mean_2d","annual_cycle_zonal_mean","enso_diags","qbo","diurnal_cycle","zonal_mean_2d_stratosphere","aerosol_aeronet","mp_partition","tropical_subseasonal","precip_pdf","tc_analysis","streamflow",
climo_diurnal_frequency = "diurnal_8xdaily"
climo_diurnal_subsection = "atm_monthly_diurnal_8xdaily_180x360_aave"
ts_daily_subsection = "atm_daily_180x360_aave"
grid = '180x360_aave'
short_name = 'e3sm.amip.EAMXX.test2_1'
[[ lnd_monthly_mvm_lnd ]]
# Depends on the climo task's land_monthly_climo subtask.
sets = "lat_lon_land",
climo_subsection = "land_monthly_climo"
# Other parameters:
diff_title = "Difference"
grid = 'native'
# The reference_data_path should point to pre-computed climatology files from an ncclimo/zppy run
reference_data_path = "/pscratch/sd/z/zhan391/e3smv4_project/20250906.wcycl1850.ne120pg2_r025_RRSwISC6to18E3r5.test6.1.chrysalis/post/lnd/native/clim"
ref_name = "20250906.wcycl1850.ne120pg2_r025_RRSwISC6to18E3r5.test6.1.chrysalis"
ref_final_yr = 96
ref_start_yr = 105
ref_years = "96-105",
run_type = "model_vs_model"
short_name = "e3sm.amip.EAMXX.test2_1"
short_ref_name = "v3.HR.piControl-test6.1"
swap_test_ref = False
tag = "model_vs_model"
What happened?
I encountered several related issues in the zppy [ts] and [e3sm_diags] workflow, especially for EAMxx model output.
One specific issue appears in the zppy/zppy/templates/ts.bash script logic for mapping_file = glb:
This logic removes U because it is a 3D variable and does not work with rgn_avg. However, this command line is fundamentally problematic because:
(1) The workflow may still include other 3D variables, such as V, that are also incompatible with rgn_avg. As a result, only U is filtered, while other incompatible variables may still be passed into the global regional-average step and cause failures.
(2) When processing EAMxx model output, a variable list such as "surf_evap,U_at_10m_above_surface," can be unintentionally transformed into "surf_evap_at_10m_above_surface", which triggers the following errors in my test:
This suggests a broader issue with how the zppy/e3sm_diags workflow handles variables in the global-average section. A more robust treatment should be implemented here.
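To make the two points above concrete, here is a minimal Python sketch. It is not the code in ts.bash (that snippet is not reproduced in this report). `naive_remove_u` is a hypothetical stand-in that assumes the current filter deletes every ",U" substring, which reproduces the symptom in (2). `filter_exact` and `EXCLUDED_3D` are also hypothetical names; the excluded set is an assumption taken from the `interp_vars` list in my cfg, not a confirmed list of rgn_avg-incompatible variables.

```python
from typing import Iterable


def naive_remove_u(var_list: str) -> str:
    """Delete every ',U' substring (ASSUMED behaviour of the current filter)."""
    return var_list.replace(",U", "")


def filter_exact(var_list: str, excluded: Iterable[str]) -> str:
    """Drop only variables whose complete name appears in `excluded`."""
    excluded_set = set(excluded)
    # Split on commas and ignore empty entries left by a trailing comma.
    names = [v.strip() for v in var_list.split(",") if v.strip()]
    return ",".join(v for v in names if v not in excluded_set)


# ASSUMPTION: 3D variables taken from interp_vars in the cfg of this report.
EXCLUDED_3D = ("U", "V", "T_mid", "z_mid", "omega", "RelativeHumidity", "p_mid", "qv")

eamxx_vars = "surf_evap,U_at_10m_above_surface,V_at_10m_above_surface"

# Substring removal corrupts the 2D variable name and leaves V untouched.
print(naive_remove_u(eamxx_vars + ",U,V"))

# Whole-name matching removes U and V but keeps the *_at_10m_above_surface fields.
print(filter_exact(eamxx_vars + ",U,V", EXCLUDED_3D))
```

Matching on complete variable names rather than substrings addresses both points at once: any number of 3D variables can be excluded, and names that merely start with "U" are left intact.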
What machine were you running on?
I am running on Perlmutter, but the bug should not depend on the specific machine.
Environment
I am using the zppy master branch.
What command did you run?
cd /pscratch/sd/z/zhan391/e3smv4_project/ne256pg2_ne256pg2.F20TR-SCREAMv1.July-1.spanc800.2xauto.acc150.n0032.test2.1
zppy -c post.eamxx.diag.cfg
Copy your cfg file
What jobs are failing?
What stack trace are you encountering?