[AMBER] Segmentation Fault when running constant pH Replica Exchange on GPUs

From: Hofer, Florian <Florian.Hofer.uibk.ac.at>
Date: Fri, 6 Dec 2019 15:52:01 +0000

Dear AMBER developers,
We are trying to run replica exchange constant pH MD (cpH-REMD) on our GPU cluster. We are using the latest AMBER version with the latest patches, freshly compiled with CUDA 9.2 and openmpi-3.1.4.

However, the run immediately fails with a segmentation fault and the program terminates.
The exact same system setup runs fine as single-replica (non-REMD) cpH-MD on both GPUs and CPUs, and cpH-REMD also works fine when run on a CPU cluster.
We also tested whether normal MD simulations running across multiple nodes produce a similar error, but they run fine.

The MDOUT files from AMBER do not provide any information about the run, as they stop right after the general setup output. The log file with the standard output contains the information about the segmentation faults. Here is the backtrace (the whole log file is attached to this mail):

==== backtrace ====
 4 0x00000035b5232660 killpg() ??:0
 5 0x00000000005a073f gpu_gb_ene_() ??:0
 6 0x00000000004ca5b8 __gb_force_mod_MOD_gb_cph_ene() ??:0
 7 0x000000000053cf41 __constantph_mod_MOD_cnstph_explicitmd() ??:0
 8 0x000000000049d11f __runmd_mod_MOD_runmd() ??:0
 9 0x00000000004ec59f MAIN__() pmemd.F90:0
10 0x00000000004ed39d main() ??:0
11 0x00000035b521ed1d __libc_start_main() ??:0
12 0x00000000004088a9 _start() ??:0
===================
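
To make the call chain in the backtrace easier to read, the frames can be decoded with a short script. The following is only a minimal Python sketch, assuming the Fortran symbols follow gfortran's `__<module>_MOD_<procedure>` naming for module procedures. The helper names `decode_symbol` and `parse_backtrace` are made up for illustration and are not part of AMBER.

```python
import re

# Frames copied from the backtrace above (frames 4 to 9).
BACKTRACE = """\
 4 0x00000035b5232660 killpg() ??:0
 5 0x00000000005a073f gpu_gb_ene_() ??:0
 6 0x00000000004ca5b8 __gb_force_mod_MOD_gb_cph_ene() ??:0
 7 0x000000000053cf41 __constantph_mod_MOD_cnstph_explicitmd() ??:0
 8 0x000000000049d11f __runmd_mod_MOD_runmd() ??:0
 9 0x00000000004ec59f MAIN__() pmemd.F90:0
"""

# One frame: index, address, symbol followed by "()", source location.
FRAME = re.compile(r"^\s*(\d+)\s+(0x[0-9a-fA-F]+)\s+(\S+?)\(\)\s+(\S+)$")

# Assumed gfortran convention for module procedures: __<module>_MOD_<procedure>
MODPROC = re.compile(r"^__(\w+?)_MOD_(\w+)$")


def decode_symbol(symbol):
    """Return 'module::procedure' for a mangled module procedure, else the symbol unchanged."""
    match = MODPROC.match(symbol)
    if match:
        return f"{match.group(1)}::{match.group(2)}"
    return symbol


def parse_backtrace(text):
    """Return a list of (frame_number, decoded_symbol) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append((int(match.group(1)), decode_symbol(match.group(3))))
    return frames


if __name__ == "__main__":
    for number, name in parse_backtrace(BACKTRACE):
        print(number, name)
```

Read this way, the innermost application frame is gpu_gb_ene_, which is reached from gb_cph_ene (module gb_force_mod), called in turn from cnstph_explicitmd (module constantph_mod) inside runmd.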

Can anyone help us understand what might be causing this?

Thank you very much in advance,
best regards

Florian Hofer MSc.
Institute of General, Inorganic and Theoretical Chemistry
University of Innsbruck
Austria


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Fri Dec 06 2019 - 08:00:01 PST