[AMBER] R: Error: unspecified launch failure launching kernel kReduceForces

From: Giovanni Pavan <giovanni.pavan.supsi.ch>
Date: Wed, 11 Apr 2012 17:39:22 +0200

Dear Ross,
thank you for your kind reply and your support.
I have not replied to you until now because we are running several tests and
trials to make sure we cannot pin down the problem ourselves. There is no
sense in bothering the AMBER community if the problem lies in the structure
input files (pdb, connections, etc.).
Just one thing:

Indeed, I am not running the very latest version "2.3" but "2.2". What
changed between the two versions?

Dr. Giovanni M. Pavan

Laboratory of Applied Mathematics and Physics (LaMFI)
University of Applied Sciences of Southern Switzerland (SUPSI)
Centro Galleria 2, Manno 6928, Switzerland.
e-mail: giovanni.pavan.supsi.ch
web: http://www.dti.supsi.ch/~pavan/
skype: giovanni_pavan
phone: +41 58 666 65 60

-----Original Message-----
From: Ross Walker [mailto:ross.rosswalker.co.uk]
Sent: Monday, 2 April 2012 18:37
To: giovanni.pavan.supsi.ch; 'AMBER Mailing List'
Subject: RE: [AMBER] Error: unspecified launch failure launching kernel

Hi Giovanni,

> My system has inside a gadolinium DOTA molecule - the Gd atom has 8
> coordination bonds parametrized according to literature.

I don't think we ever tested the GPU code with 8 bonds to a single atom
since things like Gadolinium fall way outside the remit of what AMBER would
normally be used to simulate. Can you post your input files please (a
private message to me is fine) along with the exact command lines you use to
reproduce the problem and I will see if I can confirm it and then figure out
what is going wrong.

That said I am confused by some of what you are reporting:

> The problem is that the system passes successfully the minimization
> step with reasonable energy and everything seems to be ok.

Did you check the structure manually after minimization? Does everything
look good? Does the energy trend during minimization look ok? Also, did you
try minimizing with the CPU code?

> Error: unspecified launch failure launching kernel kReduceForces
> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> At line 109 of file inpcrd_dat.f90 (unit = 9, file =
> 'heat1_G3_DOTA.rst')
> Fortran runtime error: End of file
> STOP PMEMD Terminated Abnormally!

This is VERY strange. The unspecified launch error for kReduceForces seems
reasonable. As in if there are infinite forces or atoms sitting on top of
each other or other strange structural anomalies then this is where it will
crash. What I don't understand is how you got the second error out. Did this
all come out of the same run? - The end of file for the inpcrd file suggests
one of two things. Either you set irest / ntx wrong such that the code
expects velocities to be in the inpcrd file and they aren't, or you gave the
code an inpcrd file lacking box information (i.e. you minimized without
periodic boundaries) but then you requested a periodic simulation.
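
One way to look for atoms sitting on top of each other before blaming the GPU code is a brute-force close-contact scan of the minimized coordinates. The following is only a minimal sketch: the function name close_contacts and the 0.5 Angstrom cutoff are illustrative assumptions, periodic images are ignored, and the all-pairs loop is only practical for the solute or a small subset, not for a full solvated system.

```python
import itertools
import math


def close_contacts(coords, cutoff=0.5):
    """Return (i, j, distance) for every atom pair closer than `cutoff`.

    coords : list of (x, y, z) tuples in Angstrom (hypothetical input,
             e.g. the solute atoms read from the minimized restart file).
    cutoff : illustrative threshold, not an AMBER default.
    Periodic images are NOT considered.
    """
    hits = []
    for (i, a), (j, b) in itertools.combinations(enumerate(coords), 2):
        d = math.dist(a, b)  # Euclidean distance, Python 3.8+
        if d < cutoff:
            hits.append((i, j, d))
    return hits


if __name__ == "__main__":
    sample = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
    for i, j, d in close_contacts(sample):
        print(f"atoms {i} and {j} are only {d:.3f} A apart")
```

Any pair reported here would be a candidate for the kind of structural anomaly that produces huge forces on step 0.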

It is also possible that your inpcrd (restart) file is corrupt for some
reason. Either way I can't see how the code would error here saying it can't
read the restart file but still produce you energy output for step zero. So
where did the energy output for step0 that you show below come from?
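
Whether a formatted restart file is truncated, or lacks velocities or box information, can usually be guessed from its line count. The sketch below assumes the classic ASCII inpcrd/restart layout (title line, a line starting with the atom count, then six numbers per line for coordinates, optionally the same again for velocities, optionally one box line). The function name classify_restart is hypothetical, and NetCDF restarts are not handled.

```python
import math


def classify_restart(text):
    """Guess what a formatted AMBER restart file contains from its line count.

    Assumes the ASCII layout: title, 'natom [time]', then 6 values per line.
    Raises ValueError if the file looks truncated or has an unexpected length.
    """
    lines = text.splitlines()
    while lines and not lines[-1].strip():  # drop trailing blank lines only
        lines.pop()
    if len(lines) < 2:
        raise ValueError("file too short to hold a title and atom count")

    natom = int(lines[1].split()[0])
    coord_lines = math.ceil(3 * natom / 6)
    extra = len(lines) - 2 - coord_lines

    if extra < 0:
        raise ValueError("coordinate section is truncated")
    if extra == 0:
        found = (False, False)
    elif extra == 1:
        # Ambiguous for natom <= 2, where one line could also be velocities.
        found = (False, True)
    elif extra == coord_lines:
        found = (True, False)
    elif extra == coord_lines + 1:
        found = (True, True)
    else:
        raise ValueError(f"unexpected number of trailing lines: {extra}")

    return {"natom": natom, "velocities": found[0], "box": found[1]}
```

A minimization restart would normally be expected to report no velocities, which matches irest=0 with ntx=1; asking for irest=1 with ntx=5 on such a file is one way to hit an end-of-file error.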

> Another strange thing is that if the same MD step is run with pmemd
> instead of pmemd.cuda, no error is reported.
> Curiously, however the energies reported are quite different!

Did you also run the minimization with CPU? Try this please:

1) Minimize using the CPU code. Check the output carefully.

2) Run with nstlim=10, ntpr=1 and ntwx=1, using the CPU minimization
restart file, for both pmemd and pmemd.cuda and compare.
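
The two steps above could be scripted roughly as follows. This is a sketch only: the file names (md10.in, min_cpu.rst and so on) are placeholders, and the &cntrl namelist shows just the settings named in this message plus irest=0 and ntx=1 for starting from a minimization restart. The cutoff, thermostat and periodic-box settings are deliberately left out and should be copied from your own heating input.

```python
def short_md_input(nstlim=10):
    """Return an mdin fragment for a very short MD run printing every step."""
    return "\n".join([
        "short MD for CPU/GPU comparison",
        " &cntrl",
        "  imin=0, irest=0, ntx=1,",
        f"  nstlim={nstlim}, ntpr=1, ntwx=1,",
        " /",
        "",
    ])


def comparison_commands(prmtop, min_rst, mdin="md10.in"):
    """Build identical command lines for pmemd and pmemd.cuda.

    Both runs read the same mdin, prmtop and CPU-minimized restart file,
    so any difference at step 0 comes from the engine, not the input.
    """
    cmds = {}
    for exe, tag in (("pmemd", "cpu"), ("pmemd.cuda", "gpu")):
        cmds[exe] = [
            exe, "-O",
            "-i", mdin,
            "-o", f"md10_{tag}.out",
            "-p", prmtop,
            "-c", min_rst,
            "-r", f"md10_{tag}.rst",
            "-x", f"md10_{tag}.mdcrd",
        ]
    return cmds


if __name__ == "__main__":
    print(short_md_input())
    for cmd in comparison_commands("system.prmtop", "min_cpu.rst").values():
        print(" ".join(cmd))
```

The script only prints the input and the command lines; running them (for example with subprocess.run) is left to the user so nothing is launched by accident.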

> In fact PMEMD reports "usual/safe" negative values, while pmemd.cuda
> reports a "worse" energetic situation.

The origin is arbitrary in MD simulations so the sign is not very indicative
BUT big differences on step 0 for otherwise identical input does suggest a

> NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 0.00 PRESS = 0.0
> Etot = 1108141.2453 EKtot = 0.0000 EPtot =
> BOND = 9126.0102 ANGLE = 3780.4845 DIHED = 2107.0903
> 1-4 NB = 1401.2256 1-4 EEL = -42904.2608 VDWAALS =
> EELEC = -638879.7492 EHBOND = 0.0000 RESTRAINT = 0.0000

> NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 0.00 PRESS = 0.0
> Etot = -585383.3566 EKtot = 0.0000 EPtot =
> BOND = 9126.0102 ANGLE = 3780.4845 DIHED = 2107.0903
> 1-4 NB = 1401.2256 1-4 EEL = -42904.2608 VDWAALS = 72165.7215
> EELEC = -631059.6280 EHBOND = 0.0000 RESTRAINT = 0.0000
> Ewald error estimate: 0.2749E-03

So interestingly here your bond, angle, dihedral, 1-4 NB and 1-4 EEL terms
are all identical, and the elec term differs by a comparatively small amount.
The bulk of the difference is coming from the VDWAALS term which is radically
different in each case. This should make it easier to
track down what is going on - it may be possible that your Gadolinium has
very 'unorthodox' VDW parameters - can you post them as well please.
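
A term-by-term comparison like this is easy to automate. The sketch below parses the "NAME = value" pairs of an mdout energy block and reports the terms that differ between two runs. The function names are hypothetical, and the two sample blocks are copied from the output quoted above, including the values that are cut off there, which are simply skipped.

```python
import re

# A term name (may contain digits, spaces, '-', parentheses) followed by '='
# and a plain decimal number. Terms with a missing value are not matched.
PAIR = re.compile(r"([A-Za-z0-9][A-Za-z0-9()\- ]*?)\s*=\s*(-?\d+(?:\.\d+)?)(?!\S)")

RUN_A = """\
NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 0.00 PRESS = 0.0
Etot = 1108141.2453 EKtot = 0.0000 EPtot =
BOND = 9126.0102 ANGLE = 3780.4845 DIHED = 2107.0903
1-4 NB = 1401.2256 1-4 EEL = -42904.2608 VDWAALS =
EELEC = -638879.7492 EHBOND = 0.0000 RESTRAINT = 0.0000
"""

RUN_B = """\
NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 0.00 PRESS = 0.0
Etot = -585383.3566 EKtot = 0.0000 EPtot =
BOND = 9126.0102 ANGLE = 3780.4845 DIHED = 2107.0903
1-4 NB = 1401.2256 1-4 EEL = -42904.2608 VDWAALS = 72165.7215
EELEC = -631059.6280 EHBOND = 0.0000 RESTRAINT = 0.0000
"""


def parse_energy_block(text):
    """Return {term name: value} for every complete 'NAME = value' pair."""
    terms = {}
    for line in text.splitlines():
        for key, value in PAIR.findall(line):
            terms[key.strip()] = float(value)
    return terms


def differing_terms(a, b, tol=1e-3):
    """Return {term: a - b} for terms present in both runs that differ."""
    return {k: a[k] - b[k] for k in a.keys() & b.keys() if abs(a[k] - b[k]) > tol}


if __name__ == "__main__":
    diffs = differing_terms(parse_energy_block(RUN_A), parse_energy_block(RUN_B))
    for name, delta in sorted(diffs.items()):
        print(f"{name:10s} differs by {delta:.4f}")
```

Only terms present in both blocks can be compared, so a value that is cut off in the quoted output (VDWAALS in the first block, EPtot in both) has to be checked in the full mdout files.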

> Am I missing something?

Can you confirm that you are running with the very latest patched version of
the AMBER 11 code?
The mdout file should contain:

|--------------------- INFORMATION ----------------------
| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
|                         Version 2.3

Note the version '2.3'.
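
If there are many output files to check, the version line can be pulled out with a few lines of Python. This is a sketch that assumes the banner contains a line of the form "| Version 2.3" as quoted above; the function name gpu_code_version is hypothetical, and an mdout from the CPU code would simply return None.

```python
import re


def gpu_code_version(mdout_text):
    """Return the GPU code version string from an mdout banner, or None.

    Looks for a line consisting of '|', the word 'Version' and a dotted
    number, as in the banner quoted in this message.
    """
    match = re.search(r"^\|\s*Version\s+(\d+(?:\.\d+)+)\s*$",
                      mdout_text, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    banner = (
        "|--------------------- INFORMATION ----------------------\n"
        "| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.\n"
        "|                         Version 2.3\n"
    )
    print(gpu_code_version(banner))
```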

All the best

Ross Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

AMBER mailing list
Received on Wed Apr 11 2012 - 09:00:03 PDT