Re: [AMBER] NaN error in .rst files

From: Marek Maly <marek.maly.ujep.cz>
Date: Thu, 27 Jan 2011 11:21:48 +0100

Hi Ross, Peker and All,

#1
Unfortunately, the computers with the GTX 470 cards are in a regular PC
lab that is also (or mainly :)) used for teaching; they are not machines
fully devoted to scientific calculations. So I have only limited
possibilities here, including for playing with the runlevel/X environment
settings. Except during exam periods and vacations, I can run calculations
there only in the evenings, at night, and on weekends. That is still OK,
because these cards are about 5x faster than the Intel nodes on our cluster
(each node = 2 x quad-core Intel Xeon 5365 (3.00 GHz) and 16 GB of shared
RAM), at least for the case I compared (explicit solvent, about 80k atoms).

But anyway, I can try some tests of the runlevel/X environment settings
right now on, let's say, one chosen "experimental" PC. However, these
settings might be connected only with random errors like this one:

> -----------------------------------
> Error: unspecified launch failure launching kernel kClearForces
> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> STOP PMEMD Terminated Abnormally!
> -----------------------------------

not with the NaN error, which is reproducible (at least in my case). I
obtained it after just 20 000 steps (maybe it appears even sooner) on two
tested machines with a GTX 470 (I can verify this on other PCs in the lab
as well), and it seems to be connected somehow with particular random seed
values.

It is true that on our Tesla system I disabled the X environment, and I
don't remember any errors there like those from the GTX 470 PCs.


#2
I will also test all of the thermostat/barostat settings on the GTX 470
PCs, using the input files which I posted, and will report the results
here.
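
To keep track of which runs are still to be done, the five settings from
Ross's message (quoted below) can be written down as a small table. This is
only a bookkeeping sketch: the NTT/NTB values and their meaning come from
that message, while the helper names are hypothetical and not part of
Amber.

```python
# The five NTT/NTB combinations listed in Ross's message, in his order.
COMBINATIONS = [
    {"ntt": 3, "ntb": 2},  # 1) problem already known to exist
    {"ntt": 3, "ntb": 1},  # 2) rules out the barostat if it still fails
    {"ntt": 1, "ntb": 2},  # 3) points at the barostat if it fails
    {"ntt": 1, "ntb": 1},  # 4) both theories wrong if it fails
    {"ntt": 0, "ntb": 1},  # 5) "a whole world of pain" if it fails
]


def ensemble(ntt, ntb):
    """Name the ensemble for the combinations in the list above only."""
    if ntb == 2:
        return "NPT"
    return "NVE" if ntt == 0 else "NVT"


def uses_random_stream(ntt):
    """Per Ross's message, NTT=3 uses the random number stream, NTT=1 does not."""
    return ntt == 3


for number, combo in enumerate(COMBINATIONS, start=1):
    print(
        f"{number}) NTT={combo['ntt']}, NTB={combo['ntb']}: "
        f"{ensemble(**combo)}, random stream: {uses_random_stream(combo['ntt'])}"
    )
```

Each run would then only differ in these two settings, so a crash or a NaN
can be attributed to the thermostat or to the barostat.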


#3
I would be grateful if anyone with a GTX 470/GTX 480 system could try a
short simulation (it takes just a few minutes to reach 20 000 steps on
these cards) using my input files posted here:

http://physics.ujep.cz/~mmaly/Amber/

and let us know whether or not they obtained the previously described NaNs
starting from step 20 000.
Please also include the available information about your system (mainly
the compiler types/versions used to compile Amber11, the CUDA driver
version, the OS, etc.).
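
For anyone who wants to check a run quickly, here is a minimal sketch that
looks for the first step at which NaN shows up in the energy output. It
assumes that the output file prints records starting with a line like
" NSTEP =    20000   TIME(PS) = ..." and that bad values are printed as
"NaN"; the function name is hypothetical and the script is not part of
Amber.

```python
import re

# Step counter at the start of an energy record (assumed output layout).
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
# A standalone NaN token, in any letter case.
NAN_RE = re.compile(r"\bnan\b", re.IGNORECASE)


def first_nan_step(lines):
    """Return the first NSTEP whose record contains NaN, or None if none does."""
    current_step = None
    for line in lines:
        match = NSTEP_RE.search(line)
        if match:
            current_step = int(match.group(1))
        # Only report NaNs that appear inside a record with a known step.
        if current_step is not None and NAN_RE.search(line):
            return current_step
    return None
```

It accepts any iterable of lines, so an open file object can be passed
directly, e.g. first_nan_step(open("md.out")), where "md.out" stands for
whatever output file name was used for the run.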

BTW Peker, how did the short test on your GTX 480 system finish? Did you
obtain NaNs or not?

Best wishes,

    Marek








On Thu, 27 Jan 2011 07:59:31 +0100, Ross Walker <ross.rosswalker.co.uk>
wrote:

> BTW, to all those following this thread. I would like to know if anybody
> has seen this NaN issue when running anything other than NTT=3. My hunch
> is that somehow the pseudo random number generator is running out of
> valid random numbers and starts handing out garbage. Weight can be added
> to this theory if people are running NTP simulations with NTT=1 which
> will not use the random number stream.
>
> The alternative theory is that it is something related to the barostat.
> In which case I would expect this problem to only show up with NTP
> simulations and for NVT and NVE to work fine. So essentially I would like
> to know if this problem is seen with any of the following combinations of
> settings:
>
> 1) NTT=3, NTB=2 - This one we already know the problem exists.
>
> 2) NTT=3, NTB=1 - This is NVT and will rule out the barostat if the
> problem still exists.
>
> 3) NTT=1, NTB=2 - This is NPT but NOT using the random number stream. If
> this crashes it means the problem is likely in the barostat rather than
> the random number stream.
>
> 4) NTT=1, NTB=1 - This is NVT not using the random number stream. If this
> crashes then both my theories are wrong.
>
> 5) NTT=0, NTB=1 - This is NVE, if this crashes then we are in a whole
> world of pain...
>
> Thank you.
>
> All the best
> Ross
>
> /\
> \/
> |\oss Walker
>
> ---------------------------------------------------------
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Adjunct Assistant Professor |
> | Dept. of Chemistry and Biochemistry |
> | University of California San Diego |
> | NVIDIA Fellow |
> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> ---------------------------------------------------------
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may
> not
> be read every day, and should not be used for urgent or sensitive issues.
>
>
>
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Jan 27 2011 - 03:00:03 PST