Re: [AMBER] Segmentation fault

From: David Case <david.case.rutgers.edu>
Date: Fri, 21 Apr 2017 10:49:16 -0400

On Fri, Apr 21, 2017, sylvester kisembo wrote:

> I have been trying to get some runs going on the supercomputer
> GPUs. Specs below (I get a similar outcome on CPUs):

OK: if you are getting segfaults on the CPU, then use the CPU for debugging:
let's remove any dependence on GPUs.

Second: run short serial CPU runs: let's remove any dependence on MPI.

Third: where is the error occurring (minimization or dynamics)? Can you create
a short test run on a serial CPU that illustrates the error?
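
As a rough sketch of such a short serial test (not a tested recipe): the
script below writes a 100-step dynamics input and runs the serial CPU engine
on it. The file names prmtop, inpcrd and short_test.* are placeholders, and
the &cntrl settings are assumptions that should be replaced by the ones from
the failing job. To test minimization instead, use imin=1 with a small
maxcyc.

```shell
#!/bin/sh
# Sketch of a short serial CPU test.
# Assumptions: prmtop/inpcrd are your own topology and coordinate files;
# the &cntrl values below are illustrative, not taken from the failing job.

# Write a minimal 100-step MD input file.
cat > short_test.in <<'EOF'
short serial CPU test
 &cntrl
  imin=0, nstlim=100, dt=0.001,
  ntpr=10, cut=8.0,
 /
EOF

# Run the serial engine only if it is installed and the inputs exist.
if command -v pmemd >/dev/null 2>&1 && [ -f prmtop ] && [ -f inpcrd ]; then
    pmemd -O -i short_test.in -p prmtop -c inpcrd \
          -o short_test.out -r short_test.rst
else
    echo "pmemd or input files not found; wrote short_test.in only"
fi
```

If this short run also segfaults, the input and output files from it make a
compact test case to post to the list.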

> /opt/amber/bin/pmemd.cuda.MPI: error while loading shared
> libraries: libcurand.so.8.0: cannot open shared object file: No such
> file or directory

The sort of error above suggests that your LD_LIBRARY_PATH is not set
correctly. It also shows that you are trying to run pmemd.cuda.MPI.
After trying the CPU ideas above (and if they work), *first* do a short
run (one that you know works on a CPU) on a single GPU (i.e., use pmemd.cuda).
It's possible that you are seeing an environment problem that is very simple
to fix, but you need to narrow down the problem first.
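
One way to check for this kind of environment problem is to ask the dynamic
loader which shared libraries it cannot resolve. The sketch below does that
with ldd. The binary path is taken from the error message above; the CUDA
library directory (CUDA_HOME, falling back to /usr/local/cuda) is an
assumption and may differ on your cluster, where a module system often sets
it for you.

```shell
#!/bin/sh
# Print any shared libraries the loader cannot find for a given binary.
# Returns 0 when nothing is reported missing, 1 otherwise.
missing_libs() {
    if ldd "$1" 2>/dev/null | grep 'not found'; then
        return 1
    fi
    return 0
}

# Assumed locations; adjust to your installation.
BIN=/opt/amber/bin/pmemd.cuda
CUDA_LIB=${CUDA_HOME:-/usr/local/cuda}/lib64

if [ -x "$BIN" ]; then
    missing_libs "$BIN" ||
        echo "try: export LD_LIBRARY_PATH=$CUDA_LIB:\$LD_LIBRARY_PATH"
fi
```

A line such as "libcurand.so.8.0 => not found" in the output would point at
the same problem as the error quoted above.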

[Aside, to you and others: few problems benefit much from running
on multiple GPUs. Be sure you can use pmemd.cuda itself first, and
understand the tradeoffs between running one simulation on 2 or more GPUs vs
running several simultaneous simulations, each on a single GPU. (Replica
exchange calculations are an exception here.)]
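
For the second option in the aside, a minimal sketch of starting several
independent runs, each pinned to its own GPU through the CUDA_VISIBLE_DEVICES
environment variable. The directory names run0 and run1 and the file names
md.in, prmtop and inpcrd are hypothetical. When pmemd.cuda or a run directory
is missing, the script only prints the command it would have used.

```shell
#!/bin/sh
# Start one independent pmemd.cuda job on one GPU.
# $1 = GPU index, $2 = run directory (hypothetical layout: one dir per run).
launch() {
    gpu=$1
    dir=$2
    cmd="pmemd.cuda -O -i md.in -p prmtop -c inpcrd -o md.out -r md.rst"
    if command -v pmemd.cuda >/dev/null 2>&1 && [ -d "$dir" ]; then
        # Each job sees only its assigned GPU.
        ( cd "$dir" && CUDA_VISIBLE_DEVICES=$gpu $cmd ) &
    else
        echo "CUDA_VISIBLE_DEVICES=$gpu $cmd  (in $dir)"
    fi
}

launch 0 run0
launch 1 run1
wait   # block until all background jobs finish
```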

....dac


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Apr 21 2017 - 08:00:02 PDT