Hi Parker,
I would have said that you had a failing GPU, but the fact that you see it on multiple GPU types is troubling. That said, I've been seeing several issues (all very hard to pin down) with several versions of the 346 and 340 driver trees, so you might want to try switching to a different driver tree.
First, I'd stick with CUDA 6.5. CUDA 7 breaks device selection when process-exclusive mode is enabled, so I'd punt on that until CUDA 7.5 at least (which 'supposedly' should fix it).
I'd try 346.89 from here: http://www.nvidia.com/download/driverResults.aspx/88814/en-us
and see if that fixes it. If not, the next step is to see whether this might be a bug in the AMBER code, for which I'll need copies of your input files.
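To confirm which driver tree and compute mode each GPU is actually running, before and after switching drivers, something along these lines may help. This is only a rough sketch: it assumes nvidia-smi is on the PATH and supports the name, driver_version and compute_mode query fields, and the helper names (parse_gpu_report, driver_branch, query_gpus) are made up for illustration, not part of AMBER or any NVIDIA tool.

```python
import subprocess

# Assumed query; field names follow nvidia-smi's --query-gpu option.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,compute_mode",
    "--format=csv,noheader",
]


def parse_gpu_report(text):
    """Turn 'name, driver, mode' CSV lines into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        # Split from the right so a comma inside a GPU name would not break parsing.
        fields = [f.strip() for f in line.rsplit(",", 2)]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, driver, mode = fields
        gpus.append({"name": name, "driver": driver, "compute_mode": mode})
    return gpus


def driver_branch(version):
    """Return the driver tree, e.g. '346' for '346.89'."""
    return version.split(".")[0]


def query_gpus():
    """Run nvidia-smi and parse its output (hypothetical helper)."""
    out = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_report(out)


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            print("{name}: driver {driver} (tree {tree}), compute mode {mode}".format(
                name=gpu["name"],
                driver=gpu["driver"],
                tree=driver_branch(gpu["driver"]),
                mode=gpu["compute_mode"],
            ))
    except (OSError, subprocess.CalledProcessError) as exc:
        print("Could not query nvidia-smi:", exc)
```

Running it on both the workstation and a cluster node gives a quick record of the driver tree and whether process-exclusive mode is in effect on each.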
All the best
Ross
> On Aug 22, 2015, at 1:58 PM, Parker de Waal <Parker.deWaal.vai.org> wrote:
>
> Hi AMBER developers,
>
> Just recently I’ve started to encounter the following error on both my local machine (GTX780) and our in-house HPC cluster (K80s) while using pmemd.cuda (with all patches applied) on both CUDA 6.5 (Driver Version: 340.76) and CUDA 7 (Driver Version: 346.59):
>
> Error: an illegal memory access was encountered launching kernel kClearForces
> cudaFree GpuBuffer::Deallocate failed an illegal memory access was encountered
>
> The crash has so far only happened during production simulations (a sample outfile can be found here: https://gist.github.com/anonymous/ba6cf66147b7a68dc167)
>
> From the metrics I’ve looked at, the system is at equilibrium. Additionally, in the past I’ve been able to launch this same simulation protocol for times of ~1 us without issue.
>
> Has anyone, besides myself last year, encountered an issue such as this?
>
> If any additional information is required, please let me know!
>
> Best,
> Parker
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Aug 24 2015 - 09:00:03 PDT