Hi Robert,
This does indeed sound like faulty GPUs - do they both do it?
First download the latest version of the NVIDIA driver and install it -
v331.49 from here: http://www.nvidia.com/object/unix.html
See if that helps at all. Then try downloading my test suite:
https://dl.dropboxusercontent.com/u/708185/GPU_Validation_Test_2cards.tar.gz
Run it; it should take about 24 hours (you might have to tweak the
run script a little for your setup, paths etc). At the end check the log
files - all 10 runs should give the same final energy and both GPUs should
also match. If they don't (or you see crashes during the run) then
your GPUs are faulty.
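The log comparison can be scripted once the final energies have been
pulled out of the logs. What follows is only a minimal sketch, not part of
the test suite: the function name, the input layout (a GPU label mapped to
the final-energy strings of its runs) and the expected_runs default are
assumptions made for illustration. The energies are compared as printed
strings because the runs are expected to agree exactly, and a run that
crashed shows up as a missing entry.

```python
def find_mismatches(final_energies, expected_runs=10):
    """Return a list of problems; an empty list means every run agrees.

    final_energies is a hypothetical layout: it maps a GPU label to the
    final-energy strings of its runs, copied exactly as the logs print them.
    """
    problems = []
    all_values = set()
    for gpu, values in sorted(final_energies.items()):
        # A crashed run leaves no final energy, so the count falls short.
        if len(values) != expected_runs:
            problems.append(
                f"{gpu}: expected {expected_runs} runs, found {len(values)}"
            )
        distinct = set(values)
        all_values |= distinct
        # Every run on one GPU should print the same final energy.
        if len(distinct) > 1:
            problems.append(f"{gpu}: runs disagree: {sorted(distinct)}")
    # Only compare across GPUs when each GPU is consistent with itself.
    if not problems and len(all_values) > 1:
        problems.append(f"GPUs disagree with each other: {sorted(all_values)}")
    return problems
```

An empty result means all runs on both cards agree; anything else lists
which card or comparison failed.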
All the best
Ross
On 2/24/14, 3:01 AM, "Deák Robert" <kokumetto.gmail.com> wrote:
>Dear Amber users,
>
>Recently we bought 2 GTX 670 DC mini (
>http://www.asus.com/Graphics_Cards/GTX670DCMOC2GD5/) but with both of them
>I experienced the same error message after a random run time.
>
>The message is:
>*cudaMemcpy GpuBuffer::Download failed unspecified launch failure*
>
>With exactly the same input files and input settings there are no error
>messages using a GTX TITAN or a TESLA card. I have tried the GTX 670 cards
>in the other machine, and also a TITAN card in this server, and the error
>follows the GTX 670 cards, independent of the server.
>
>My question is: does this type of error message mean a hardware failure?
>
>These are my input parameters:
> &cntrl
> imin = 0, irest = 0, ntx = 1,
> ntb = 2, pres0 = 1.0, ntp = 1,
> taup = 2.0,
> cut = 11.0, ntr = 0,
> ntc = 2, ntf = 2,
> tempi = 300.0, temp0 = 300.0,
> ntt = 3, gamma_ln = 0.1,
> nstlim = 100000000, dt = 0.002,
> ntpr = 1000, ntwx = 1000, ntwr = 1000,
> ig=1, nscm=1000
> /
>
>Thanks,
>Robert
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Feb 24 2014 - 08:30:05 PST