Re: [AMBER] Fwd: cudaMemcpy GpuBuffer::Download failed from Scott Le Grand on 2013-03-25 (Amber Archive Mar 2013)

From: Scott Le Grand <varelse2005.gmail.com>
Date: Mon, 25 Mar 2013 10:32:07 -0700

Have you ever thought of giving those hydroxyl hydrogens minuscule vdW radii?
That's how I solved that problem for my Ph.D. thesis, where I would
regularly fall into that trap if I didn't...
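
As a rough illustration of why a tiny but nonzero vdW radius helps, here is a
minimal sketch of a single pair energy (Coulomb plus 12-6 Lennard-Jones in the
Rmin/epsilon form). The charges, Rmin and epsilon values below are hypothetical
and chosen only for illustration; they are not taken from any Amber force
field, and the Coulomb constant is approximate.

```python
# Approximate Coulomb conversion factor, kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.06


def pair_energy(r, q_i, q_j, rmin, eps):
    """Coulomb + 12-6 Lennard-Jones energy (kcal/mol) at separation r (Angstrom)."""
    coulomb = COULOMB_KCAL * q_i * q_j / r
    # With rmin == 0 the pair has no vdW term at all, so nothing repels it.
    ratio6 = (rmin / r) ** 6 if rmin > 0.0 else 0.0
    lennard_jones = eps * (ratio6 * ratio6 - 2.0 * ratio6)
    return coulomb + lennard_jones


# Hypothetical hydroxyl hydrogen (+0.4 e) approaching a negative atom (-0.8 e).
for r in (2.0, 1.0, 0.5):
    bare = pair_energy(r, 0.4, -0.8, rmin=0.0, eps=0.0)
    walled = pair_energy(r, 0.4, -0.8, rmin=1.5, eps=0.03)
    print(f"r={r:3.1f} A  zero vdW: {bare:9.1f}  small vdW: {walled:9.1f}")
```

Without the vdW term the energy just keeps dropping as the atoms approach, so
nothing stops the collapse; with even a small term the r^-12 wall takes over at
short range.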

On Mon, Mar 25, 2013 at 10:22 AM, Ross Walker <ross.rosswalker.co.uk> wrote:

> Hi Alessandro,
>
>
> This typically means something is wrong with your structure. What ends up
> happening is that you get a NaN in the force array, which then causes the
> code to crash when it tries to download things. It is not easy to trap it
> within the GPU code itself due to the tens of thousands of threads that
> are running. I would take a careful look at your simulation results: does
> anything look out of place, are any energies or temperatures unreasonably
> high, does the structure look OK? One place this seems to happen is
> with hydrogen atoms colliding with other atoms: hydroxyl hydrogens have
> zero vdW radii and occasionally come too close to other atoms. This was
> never really a problem in the past, since it was rare and people didn't run
> for long, but now that people are routinely running microsecond+
> simulations it is starting to bite more often.
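
One way to do the first of those checks is to scan md.out for temperatures
that are NaN, overflowed, or implausibly high. The sketch below assumes the
usual energy-record layout ("NSTEP = ... TIME(PS) = ... TEMP(K) = ..."); the
function name and the 400 K threshold are hypothetical choices, so adjust them
to your own output and system.

```python
import math
import re

# Assumed md.out energy line: " NSTEP = 1000  TIME(PS) = 2.000  TEMP(K) = 300.12 ..."
FIELD = re.compile(r"(NSTEP|TEMP\(K\))\s*=\s*(\S+)")


def suspicious_steps(lines, max_temp=400.0):
    """Return (nstep, temp_text) pairs whose temperature looks wrong."""
    flagged = []
    for line in lines:
        fields = dict(FIELD.findall(line))
        if "TEMP(K)" not in fields:
            continue  # not an energy-record header line
        text = fields["TEMP(K)"]
        try:
            temp = float(text)
        except ValueError:
            temp = math.nan  # e.g. a field overflow printed as asterisks
        # Flag NaN, infinities, unparsable fields and anything above the threshold.
        if not math.isfinite(temp) or temp > max_temp:
            flagged.append((fields.get("NSTEP", "?"), text))
    return flagged
```

It can be run over the output file from the original command, for example
with open("md.out") as fh: print(suspicious_steps(fh)). The first flagged step
shows roughly where the run started to go wrong.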
>
> Does it always crash in the same place if you start with the same random
> seed and run the exact same simulation?
>
> Have you ever seen it crash if you run on the CPU?
>
> If you restart from the previous restart file does it crash very quickly?
>
> It's going to take a little digging to figure out what is going wrong
> unfortunately.
>
> All the best
> Ross
>
> On 3/25/13 5:06 AM, "Alessandro Orro" <alessandro.orro.itb.cnr.it> wrote:
>
> >Dear all,
> >
> >I'm trying to run an MD simulation with pmemd.cuda using the command line:
> >
> >time pmemd.cuda -O -i md.in -o md.out -p com.wat.leap.prm7 -c npt.rst7
> >-ref npt.rst7 -x md.trj -inf md.info -r md.rst7;
> >
> >This is the md.in file:
> >
> >production dynamics
> > &cntrl
> > imin=0, irest=1, ntx=5,
> > nstlim=25000000, dt=0.002,
> > ntc=2, ntf=2,
> > cut=10.0, ntb=2, ntp=1, taup=2.0,
> > ntpr=1000, ntwx=1000, ntwr=50000,
> > ntt=3, gamma_ln=2.0,
> > temp0=300.0,
> >/
> >
> >After about 1000 min the run crashes with the error:
> >
> >cudaMemcpy GpuBuffer::Download failed unspecified launch failure
> >
> >I also tried with ig=-1 and ntf=1, as suggested by someone in this mailing
> >list, but the error is the same.
> >
> >I believe I am using the most recent version, because md.out contains:
> >
> >|--------------------- INFORMATION ----------------------
> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >| Version 12.2
> >|
> >| 01/10/2013
> >|
> >| Implementation by:
> >| Ross C. Walker (SDSC)
> >| Scott Le Grand (nVIDIA)
> >| Duncan Poole (nVIDIA)
> >|
> >| CAUTION: The CUDA code is currently experimental.
> >| You use it at your own risk. Be sure to
> >| check ALL results carefully.
> >|
> >| Precision model in use:
> >| [SPFP] - Mixed Single/Double/Fixed Point Precision.
> >| (Default)
> >|
> >|--------------------------------------------------------
> >
> >Using another protein-ligand complex, the simulation finished correctly.
> >Any suggestions?
> >
> >Thank you in advance,
> >
> >Alessandro
> >_______________________________________________
> >AMBER mailing list
> >AMBER.ambermd.org
> >http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Mar 25 2013 - 11:00:03 PDT