An executable built with Intel18 + OpenMP (single-thread test) crashes or simply hangs on MOM6 test cases on all three machines: theta (KNL), lscsky50 (Skylake) and theia.
Here's the crash output for the global_ALE_z test case on theta and lscsky50:
EKEmin= 1.000000000000000E+016 ResMin= 236869.453598697
src= 1332071.81173317 ldamping= 8.991153093102879E-082
gamma-b= 0.832273068599009 gamma-t= 0.901219478800562
drag_visc= 2.083867476924661E-004 Ubg2= 0.000000000000000E+000
Something has gone very wrong
[NID 02598] 2018-04-20 14:33:31 Apid 4349292: initiated application termination
or for another test (benchmark):
_pmiu_daemon(SIGCHLD): [NID 00471] [c2-0c1s5n3] [Fri Apr 20 16:31:03 2018] PE RANK 19 exit signal Bus error
[NID 00471] 2018-04-20 16:31:03 Apid 4349434: initiated application termination
[NID 00471] 2018-04-20 16:31:04 Apid 4349434: Error detected during page fault processing. Process terminated via bus error.
on the Skylake box (lscsky50):
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 182461 RUNNING AT lscsky50-d
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Intel(R) MPI Library troubleshooting guide:
https://software.intel.com/node/561764
===================================================================================
No such issues with Intel17.
No such issues with the non-OpenMP executable built with Intel18.