Re: [AMBER] gpu error

From: Debarati DasGupta <debarati_dasgupta.hotmail.com>
Date: Sun, 25 Jul 2021 00:41:35 +0000

Hello Dr. Case,

Yes, my heating steps had blown up: there were NaN energies in several steps.
I changed the thermostat from ntt=1 to ntt=3 and set gamma_ln=3, but things still blew up.
I then rebuilt the system from scratch, this time using "source leaprc.protein.ff14SB" instead of ff19SB, and that was the magic!
My runs are now proceeding without crashing.
Maybe the problem has to do with ff19SB.
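
For reference, the thermostat change described above (ntt=3 with gamma_ln=3) could be written into a short test input roughly as follows. This is only a sketch: the helper name render_cntrl and every value other than ntt and gamma_ln (step count, time step, temperatures, ntpr=1) are illustrative assumptions, not the settings actually used in these runs.

```python
def render_cntrl(title, params):
    """Render a title line plus a &cntrl namelist from a dict of settings."""
    lines = [title, " &cntrl"]
    for key, value in params.items():
        lines.append(f"   {key} = {value},")
    lines.append(" /")
    return "\n".join(lines) + "\n"


# Hypothetical short heating test; only ntt and gamma_ln come from the message.
heating_test = {
    "imin": 0,        # molecular dynamics, not minimization
    "nstlim": 500,    # assumed: keep the test run short
    "dt": 0.001,      # assumed: 1 fs time step
    "ntpr": 1,        # assumed: print energies every step
    "ntt": 3,         # Langevin thermostat
    "gamma_ln": 3.0,  # collision frequency in 1/ps
    "tempi": 100.0,   # assumed starting temperature
    "temp0": 300.0,   # assumed target temperature
}

mdin_text = render_cntrl("Short heating test (illustrative values)", heating_test)
print(mdin_text)
```

The rendered text can be saved to a file and passed to pmemd with -i. The force field choice is made separately in the tleap input, so it does not appear here.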

Thanks, Dr. Case.
Debarati


From: David A Case<mailto:dacase.chem.rutgers.edu>
Sent: 24 July 2021 07:38
To: AMBER Mailing List<mailto:amber.ambermd.org>
Subject: Re: [AMBER] gpu error

On Fri, Jul 23, 2021, Debarati DasGupta wrote:

>Error: an illegal memory access was encountered launching kernel kClearForces
>
> Unit 9 Error on OPEN: EOH_heating1.rst
>STOP PMEMD Terminated Abnormally!

The above is a bit unclear: are you trying to run several pmemd jobs in one big
script? That is, is the "Error on OPEN" message coming from a later step,
and just reflecting the fact that the earlier error prevented a file from
being written?

>
>Has anyone seen any errors like this using GPU for temperature ramping steps?

"illegal memory access" doesn't really say much. Have you examined the
mdout file from the job where you got this message? Did the error happen on
the first step, or further on into the job? Running short simulations with
ntpr=1 is often a good way to find problems. Also, try running the
simulation on a CPU: one often gets better error messages from the CPU
version than from the GPU.
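
One way to check where a run went wrong is to scan the mdout text for the first energy record that contains NaN or an overflowed field. The sketch below assumes the usual mdout layout, in which each record starts with a line containing "NSTEP = <n>", bad values are printed as "NaN" and overflowed fields as a run of asterisks. The function name first_bad_step is made up for illustration.

```python
import re

# "NSTEP =   123" marks the start of an energy record (assumed mdout layout).
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
# A value printed as NaN, or as asterisks when it overflows its field.
BAD_VALUE_RE = re.compile(r"=\s*(?:NaN\b|\*{3,})", re.IGNORECASE)


def first_bad_step(mdout_text):
    """Return (step, line) for the first NaN/overflowed value, or None."""
    current_step = None
    for line in mdout_text.splitlines():
        step_match = NSTEP_RE.search(line)
        if step_match:
            current_step = int(step_match.group(1))
        if BAD_VALUE_RE.search(line):
            return current_step, line.strip()
    return None


# Small made-up excerpt in the assumed format, for demonstration only.
sample = (
    " NSTEP =        1   TIME(PS) =       0.001  TEMP(K) =   100.00  PRESS =     0.0\n"
    " Etot   =      -100.0000  EKtot   =        10.0000  EPtot      =      -110.0000\n"
    " NSTEP =        2   TIME(PS) =       0.002  TEMP(K) =      NaN  PRESS =     0.0\n"
    " Etot   =            NaN  EKtot   =            NaN  EPtot      =      -110.0000\n"
)
result = first_bad_step(sample)
print(result)
```

With ntpr=1 every step is printed, so the reported step shows whether the failure is on the first step or further into the job. For a real run, pass the contents of the mdout file to first_bad_step in place of the sample string.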

....dac


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Sat Jul 24 2021 - 18:00:02 PDT