Re: [AMBER] pmemd.cuda error: launch timeout..

From: Sasha Buzko <obuzko.ucla.edu>
Date: Tue, 08 Jun 2010 13:51:42 -0700

Thanks, Scott.
Will do.

Sasha


Scott Le Grand wrote:
> Second, when this happens again, try restarting from the last restart file.
>
> This is important because if the rerun gets past the point where it ostensibly should crash, you probably have a cooling/power problem or a flaky GPU. If it does not, then it's definitely a bug, so please email me the restart file as a quick and easy repro.
>
>
>
> -----Original Message-----
> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On Behalf Of Scott Le Grand
> Sent: Tuesday, June 08, 2010 11:24
> To: AMBER Mailing List
> Subject: RE: [AMBER] pmemd.cuda error: launch timeout..
>
> Could you try a run with ntpr=1? Let me know if anything bizarre happens right before this...
>
>
> -----Original Message-----
> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On Behalf Of Sasha Buzko
> Sent: Tuesday, June 08, 2010 10:55
> To: AMBER Mailing List
> Subject: Re: [AMBER] pmemd.cuda error: launch timeout..
>
> Yes, it is. I use it now for an extended simulation. The error seems to
> occur almost randomly, sometimes at the beginning, sometimes after 10 ns.
>
> Scott Le Grand wrote:
>
>> Well that's not good...
>>
>> This is the same input file and run you sent me previously?
>>
>>
>> -----Original Message-----
>> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On Behalf Of Sasha Buzko
>> Sent: Tuesday, June 08, 2010 10:18
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] pmemd.cuda error: launch timeout..
>>
>> Actually, it did happen on the C1060 as well. The latest error just
>> happened to come while testing on a GTX480.
>>
>>
>> Scott Le Grand wrote:
>>
>>
>>> This is not happening on your C1060 chips, is it?
>>>
>>>
>>> -----Original Message-----
>>> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On Behalf Of Sasha Buzko
>>> Sent: Tuesday, June 08, 2010 09:54
>>> To: AMBER Mailing List
>>> Subject: [AMBER] pmemd.cuda error: launch timeout..
>>>
>>> Hi all,
>>> I'm testing pmemd.cuda on a GTX480 with a moderately sized system in
>>> explicit solvent (~60k atoms). Every once in a while, a run is
>>> interrupted by this error message:
>>> "Error: the launch timed out and was terminated launching kernel
>>> kPMEGetGridWeights". No other error messages are generated.
>>>
>>> The same system and input files are used by the cpu version with no
>>> issues. The process doesn't seem to be running out of memory, and no
>>> hardware issue appears to be involved.
>>> Below is the deviceQuery output.
>>>
>>> Thanks for any suggestions
>>>
>>> Sasha
>>>
>>>
>>> [sasha.redwood release]$ ./deviceQuery
>>> ./deviceQuery Starting...
>>>
>>> CUDA Device Query (Runtime API) version (CUDART static linking)
>>>
>>> There is 1 device supporting CUDA
>>>
>>> Device 0: "GeForce GTX 280"
>>> CUDA Driver Version: 3.0
>>> CUDA Runtime Version: 3.0
>>> CUDA Capability Major revision number: 1
>>> CUDA Capability Minor revision number: 3
>>> Total amount of global memory: 1073020928 bytes
>>> Number of multiprocessors: 30
>>> Number of cores: 240
>>> Total amount of constant memory: 65536 bytes
>>> Total amount of shared memory per block: 16384 bytes
>>> Total number of registers available per block: 16384
>>> Warp size: 32
>>> Maximum number of threads per block: 512
>>> Maximum sizes of each dimension of a block: 512 x 512 x 64
>>> Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
>>> Maximum memory pitch: 2147483647 bytes
>>> Texture alignment: 256 bytes
>>> Clock rate: 1.30 GHz
>>> Concurrent copy and execution: Yes
>>> Run time limit on kernels: Yes
>>> Integrated: No
>>> Support host page-locked memory mapping: Yes
>>> Compute mode: Default (multiple host threads can use this device simultaneously)
>>>
>>> deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4243455, CUDA
>>> Runtime Version = 3.0, NumDevs = 1, Device = GeForce GTX 280
>>>
>>>
>>> PASSED
>>>
>>> Press <Enter> to Quit...
>>> -----------------------------------------------------------
>>>
>>>
>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jun 08 2010 - 14:00:03 PDT
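
The two diagnostic steps suggested in this thread (rerun with ntpr=1, then restart from the last restart file and see whether the run gets past the earlier crash point) could be sketched roughly as below. This is only an illustration, not part of AMBER: the function names set_ntpr and classify_rerun are hypothetical, the helper assumes a conventional mdin file with a &cntrl namelist, and the step numbers are assumed to be counted on the same absolute scale in both runs.

```python
import re


def set_ntpr(mdin_text: str, value: int = 1) -> str:
    """Return mdin text with ntpr set to `value` (hypothetical helper).

    Assumes the input contains a &cntrl namelist. If ntpr is already
    present its value is replaced; otherwise a new line is added right
    after the line that opens &cntrl.
    """
    existing = re.compile(r"(\bntpr\s*=\s*)\d+", re.IGNORECASE)
    if existing.search(mdin_text):
        return existing.sub(lambda m: f"{m.group(1)}{value}", mdin_text, count=1)

    new_text, n = re.subn(
        r"(&cntrl[^\n]*\n)",
        lambda m: f"{m.group(1)}  ntpr = {value},\n",
        mdin_text,
        count=1,
        flags=re.IGNORECASE,
    )
    if n == 0:
        raise ValueError("no &cntrl namelist found in mdin text")
    return new_text


def classify_rerun(crash_step: int, rerun_last_step: int, rerun_crashed: bool) -> str:
    """Apply the rule of thumb from the thread to a rerun from the last restart.

    crash_step:      step at which the original run failed
    rerun_last_step: last step the rerun reached
    rerun_crashed:   whether the rerun also ended with an error
    """
    # Getting past the original crash point suggests the failure is not
    # deterministic, which points at hardware rather than the code.
    if not rerun_crashed or rerun_last_step > crash_step:
        return "suspect hardware (cooling, power, or a flaky GPU)"
    return "likely reproducible bug; send the restart file"


if __name__ == "__main__":
    mdin = "MD run\n &cntrl\n  imin=0, ntpr=500, ntwx=500,\n /\n"
    print(set_ntpr(mdin))
    print(classify_rerun(crash_step=1000, rerun_last_step=1000, rerun_crashed=True))
```

Printing energies every step makes the output file grow quickly, so ntpr=1 is probably best kept to a short diagnostic run started close to the point of failure.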