Re: [AMBER] gpu_download_partial_forces: download failed unspecified launch failure

From: Jason Swails <>
Date: Tue, 19 Nov 2013 11:18:46 -0500

On Tue, 2013-11-19 at 21:17 +0530, Ashutosh Shandilya wrote:
> I think extra point issue is resolved

This is not helpful. Does your system still have extra points? ParmEd
can tell you this: -p prmtop_file << EOF

Look for NUMEXTRA -- is it 0? How (exactly) did you resolve the extra
points issue?
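
If ParmEd is not at hand, the same number can be read straight from the
topology file. The sketch below is only an illustration, not part of any
Amber tool: the function names are made up, and it assumes the modern
prmtop layout in which the %FLAG POINTERS section holds 8-character
integer fields with NUMEXTRA as the 31st value.

```python
def read_pointers(prmtop_text):
    """Return the integers stored in the %FLAG POINTERS section."""
    values = []
    in_pointers = False
    for line in prmtop_text.splitlines():
        if line.startswith("%FLAG"):
            if in_pointers:
                break  # reached the next section, POINTERS is complete
            in_pointers = line.split()[1] == "POINTERS"
        elif in_pointers and not line.startswith("%"):
            data = line.rstrip()
            # Assumed format 10I8: fixed-width fields of 8 characters
            values.extend(int(data[i:i + 8]) for i in range(0, len(data), 8))
    return values


def num_extra_points(prmtop_text):
    """NUMEXTRA is assumed to be the 31st pointer (index 30)."""
    return read_pointers(prmtop_text)[30]


# Hypothetical usage, with "prmtop_file" standing in for your topology:
#     with open("prmtop_file") as fh:
#         print("NUMEXTRA =", num_extra_points(fh.read()))
```

A result of 0 means the topology carries no extra points.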

> but this
> (gpu_dowload_partial_forces:download_failed unspecified launch failure) is
> still there.

This is also not a helpful error message. Error messages on the GPU are
often not helpful---it's one of the prices we pay for good performance.
When you get a problem like this on the GPU, here are the steps (keep
going until the problem goes away):

1) If you are running pmemd.cuda.MPI, try pmemd.cuda (with no MPI). Does
the problem go away?

2) Try running on the CPUs. Is there still a problem? Look for a
helpful error message either on stderr or in the mdout file. Does this
error message help?

3) If you ran step 2 in parallel on CPUs and it still doesn't work with
no helpful error message, try running on CPUs in serial.
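
The steps above amount to walking down a ladder of engines with the same
input files. A rough sketch of that ladder follows. The helper name, the
file names (mdin, prmtop, inpcrd), and the mpirun launcher are
assumptions for illustration; substitute whatever your cluster uses.

```python
def fallback_commands(start="pmemd.cuda.MPI", nproc=2,
                      mdin="mdin", prmtop="prmtop", inpcrd="inpcrd"):
    """Build the commands to try, in order, beginning at `start`."""
    # GPU parallel -> GPU serial -> CPU parallel -> CPU serial
    ladder = ["pmemd.cuda.MPI", "pmemd.cuda", "pmemd.MPI", "pmemd"]
    commands = []
    for engine in ladder[ladder.index(start):]:
        # Only the .MPI builds need a launcher (mpirun is an assumption)
        launcher = ["mpirun", "-np", str(nproc)] if engine.endswith(".MPI") else []
        # A separate mdout per engine keeps each run's messages for comparison
        args = ["-O", "-i", mdin, "-p", prmtop, "-c", inpcrd,
                "-o", engine + ".mdout"]
        commands.append(launcher + [engine] + args)
    return commands


# Print the ladder; run each line by hand and stop when the failure goes away
for cmd in fallback_commands():
    print(" ".join(cmd))
```

Knowing which rung first runs cleanly, plus any message in that rung's
mdout or on stderr, tells us where to start looking.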

Without a workflow like this, we have no way of knowing where the
problem is (is it in your system setup? is it an undocumented/unknown
incompatibility with the GPU code that isn't caught? is it a bad
starting structure? is it a bug in the GPU code? is it a bug in the GPU
_and_ CPU code?) We need to know where to start looking.

If the GPU code works in serial but not in parallel, you may be better
off just running in serial. The MPI version does not scale very well at
the moment, and it is known to be less stable than the serial version.
The parallel implementation has been redesigned, so the next version of
Amber will probably fare better.

Good luck,

Jason M. Swails
Rutgers University
Postdoctoral Researcher
AMBER mailing list
Received on Tue Nov 19 2013 - 08:30:04 PST