Re: [AMBER] NaN error on traj and output with AMBER CUDA - strange reproducable error

From: Ross Walker <ross.rosswalker.co.uk>
Date: Sat, 22 Jan 2011 12:24:33 -0800

Hi Marek,

> One of the errors, which seems to appear randomly and is not
> reproducible, is this one:
>
>
> Error: unspecified launch failure launching kernel kClearForces
> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> STOP PMEMD Terminated Abnormally!
>
> This kind of error might be solved/minimised with the Peker approach
> or with improved cooling (liquid cooling ...), I guess.

I've seen this problem as well on one of my machines but never on another
one, the difference between them being the motherboard. I have pretty much
come to the conclusion that if your motherboard was manufactured before the
GTX4XX series cards came out then you may well see some strange
incompatibilities. A BIOS update may help. To give you an example of what I
have seen: if I have 2 GTX480s in one of the machines (SuperMicro X7DWA-N
motherboard) then one of the cards often gives weird kernel launch failures,
but only on certain sizes of simulation. The other works fine. If I swap the
cards around then the other one fails, suggesting it is PCI-E slot dependent
and not an issue with the card itself. Having just one card in either slot
works fine. 2 x C2070s also work fine. So what the problem is is anyone's
guess. Insufficient power? Weird motherboard incompatibility? Driver issue?
Who knows... On a newer motherboard I have not seen the issue though, so I am
inclined to think it is an incompatibility issue with the motherboard.
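
For anyone repeating the swap test, the reasoning above can be written down
as a small check. This is only a sketch: the function name failure_tracks,
the card labels and the slot labels are all hypothetical, and it assumes
each run has been summarised as a simple failed / did-not-fail flag, which
glosses over the fact that the failures are intermittent and depend on
simulation size.

```python
from collections import defaultdict


def failure_tracks(observations):
    """Decide whether failures follow the slot or the card.

    observations: iterable of (card, slot, failed) tuples, one per run,
    where failed is True if that card in that slot hit a launch failure.
    Returns "slot", "card", or "inconclusive".
    """
    by_slot = defaultdict(set)
    by_card = defaultdict(set)
    for card, slot, failed in observations:
        by_slot[slot].add(failed)
        by_card[card].add(failed)

    def explains(groups):
        # A factor explains the failures if every value of it always gave
        # the same outcome, and some values failed while others did not.
        if any(len(outcomes) != 1 for outcomes in groups.values()):
            return False
        seen = {next(iter(outcomes)) for outcomes in groups.values()}
        return seen == {True, False}

    slot_explains = explains(by_slot)
    card_explains = explains(by_card)
    if slot_explains and not card_explains:
        return "slot"
    if card_explains and not slot_explains:
        return "card"
    return "inconclusive"


# Hypothetical labels: cards "A" and "B", slots 1 and 2.
before_swap = [("A", 1, False), ("B", 2, True)]
after_swap = [("B", 1, False), ("A", 2, True)]
print(failure_tracks(before_swap + after_swap))
```

With the two runs above, the failure stays with slot 2 whichever card is in
it, so the result is "slot". A single run on its own cannot separate the two
explanations and comes back as "inconclusive".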

I think one may just need to accept that if you want to use GTX4XX / 5XX
cards then things will always be slightly temperamental. If you are happy
living with this then all is fine.

I'd be interested to hear from people who have NOT seen this problem though,
and what motherboards / BIOS versions they are using.
 
All the best
Ross

/\
\/
|\oss Walker

---------------------------------------------------------
| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
---------------------------------------------------------

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.





_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Jan 22 2011 - 12:30:02 PST