Re: [AMBER] variety of errors in membrane/umbrella sampling simulations from Stephan Schott on 2018-04-28 (Amber Archive Apr 2018)

From: Stephan Schott <schottve.hhu.de>
Date: Sun, 29 Apr 2018 02:13:42 +0200

Hi Joseph,
Given that you are doing umbrella sampling, I assume that you have
restraints in your system. Are you restarting in between your simulations,
and do the crashes coincide more or less with those restart points?
Could you check whether, at the step where the simulations crash, your
"restraint" crosses the periodic boundary? I have seen many cases myself
where this just destroys my system, because the restraint doesn't necessarily
consider the shortest distance. I can also only recommend following the
trajectory closely in the steps just before the crash.
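
To illustrate what I mean by the restraint not taking the shortest distance,
here is a minimal Python sketch. It is not how pmemd evaluates restraints
internally; it only assumes a rectangular box, and the function names and the
example numbers are made up for illustration. It compares the raw z-separation
of two points with the minimum-image separation, so you can flag frames where
the two disagree.

```python
def naive_dz(z_a, z_b):
    """Raw z-separation, ignoring periodicity (hypothetical helper)."""
    return z_a - z_b


def minimum_image_dz(z_a, z_b, box_z):
    """Shortest z-separation under periodic boundaries.

    Assumes a rectangular (orthorhombic) box of length box_z along z.
    """
    d = z_a - z_b
    # Shift by the whole number of box lengths that brings d closest to zero.
    return d - box_z * round(d / box_z)


def restraint_sees_long_way(z_a, z_b, box_z, tol=1e-6):
    """True if the raw separation differs from the minimum-image one,
    i.e. the pair is split across the periodic boundary."""
    return abs(naive_dz(z_a, z_b) - minimum_image_dz(z_a, z_b, box_z)) > tol


# Made-up numbers: an 80 A box along z, molecule near the top edge,
# reference group near the bottom edge.
box_z = 80.0
print(naive_dz(78.0, 2.0))                        # separation a naive restraint would use
print(minimum_image_dz(78.0, 2.0, box_z))         # shortest periodic separation
print(restraint_sees_long_way(78.0, 2.0, box_z))  # pair is split across the boundary
```

If you feed in the z-coordinates of your restrained groups and the box length
for the frames just before a crash, a frame flagged by the last function is
one where the restraint may be pulling the "long way" around the box.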
Cheers,

2018-04-28 20:51 GMT+02:00 Baker, Joseph <bakerj.tcnj.edu>:

> Hi all,
>
> We are trying to run some umbrella simulations with a small molecule
> restrained in the z-direction in a number of windows with the molecule
> moving through a POPE membrane (lipid14) using Amber16 pmemd.cuda. We are
> encountering a number of errors in some windows (not all) that include the
> following:
>
> (1) Reason: cudaMemcpy GpuBuffer::Download failed an illegal memory access
> was encountered
>
> (2) ERROR: Calculation halted. Periodic box dimensions have changed too
> much from their initial values.
>
> (3) ERROR: max pairlist cutoff must be less than unit cell max sphere
> radius!
>
> (4) And occasionally NaN showing up for various energy terms in the output
> log file, in which case the system keeps running, but when we view it in
> several windows the system has completely "exploded".
>
> The strange thing (to me) is that each window has already been run for 50
> ns with no problems on the GPU (suggesting they are equilibrated), and when
> looking at the systems it does not appear there are any large fluctuations
> of box size at the point that failures are occurring. Also, windows that
> fail do not look very different compared to windows that continue to run
> okay in the second 50 ns (aside from the ones that "explode" with NaN
> errors).
>
> Our collaborator at another site has seen the same errors when running our
> system, and has also seen the same errors for their own system of a
> different small molecule moving through the POPE membrane. In their case,
> they ran their first 50 ns of each window on the CPU (pmemd.MPI no
> failures), and then when they switched to GPUs they started to see the
> failures in the second 50 ns.
> I should also add that at our site we have spot-checked one of the failing
> windows by continuing it on the CPU instead of the GPU for the 2nd 50 ns,
> and that works fine as well. So it appears that problems arise in only some
> windows and only when trying to run the second 50 ns of these simulations
> on a GPU device.
>
> We have tried a number of solutions (running shorter simulations to restart
> more frequently to attempt to fix the periodic box type errors, turning off
> the umbrella restraints to see if that was the problem, etc.), but have not
> been able to resolve these issues, and are at a bit of a loss for what
> might be going on in our case.
>
> Any advice, suggestions for tests, etc. would be greatly appreciated to
> track down what might be going on when trying to extend these systems on
> the GPU! Thanks!
>
> Kind regards,
> Joe
>
> ------
> Joseph Baker, PhD
> Assistant Professor
> Department of Chemistry
> C101 Science Complex
> The College of New Jersey
> Ewing, NJ 08628
> Phone: (609) 771-3173
> Web: http://bakerj.pages.tcnj.edu/
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>

-- 
Stephan Schott Verdugo
Biochemist
Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany

Received on Sat Apr 28 2018 - 17:30:02 PDT