Re: [AMBER] Utilization of 3 GPUs suddenly drop to 0% and won't work any more

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 31 Jul 2015 09:00:41 -0700

Hi Asai,

That's because in the AMBER 11 era the GPUs were approximately 8 times slower than they are now, and AMBER 14 is approximately twice as fast as AMBER 11 was, so computation speed has improved by about 16x. During that time, however, PCI-E only went from Gen 2 x16 to Gen 3 x16. Communication between GPUs thus got about 2x faster while the rate of computation increased by 16x, which makes communication the bottleneck. This is why one can no longer run across 4 GPUs.
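
As a rough illustration of that arithmetic, here is a minimal sketch that uses only the approximate factors quoted above. The variable names are made up for this example, and the numbers are ballpark estimates rather than measurements.

```python
# Back-of-the-envelope estimate; all factors are the approximate
# figures from the paragraph above, not benchmark results.
gpu_hardware_speedup = 8.0   # GPUs today vs. the AMBER 11 era
software_speedup = 2.0       # AMBER 14 vs. AMBER 11
pcie_bandwidth_gain = 2.0    # PCI-E Gen 2 x16 -> Gen 3 x16

# Total compute speedup is the product of hardware and software gains.
compute_speedup = gpu_hardware_speedup * software_speedup

# How much more expensive communication has become relative to compute.
relative_comm_cost = compute_speedup / pcie_bandwidth_gain

print(f"compute speedup: about {compute_speedup:.0f}x")
print(f"communication cost relative to compute: about {relative_comm_cost:.0f}x higher")
```

In other words, under these assumptions the time spent moving data between GPUs weighs roughly 8 times more heavily against the computation than it did with AMBER 11.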

All the best
Ross

> On Jul 30, 2015, at 11:36 PM, 浅井 賢 <suguruasai.gmail.com> wrote:
>
> Dear Ross,
>
> Thanks for the reply, Ross. Hmm... I could use 4 GPUs in Amber 11 (probably
> without peer to peer); that's why I thought it was strange.
> But anyway, I will run 2 x 2 GPU simulations from now on.
> Thanks a lot for the quick reply :-)
>
> Best,
> Asai
>
> On 07/29/2015 11:31 PM, Ross Walker wrote:
>> Hi Asai,
>>
>> This behavior is expected. Please read http://ambermd.org/gpus/ for details, in particular this section: http://ambermd.org/gpus/#Running
>>
>> Specifically, if you have not purchased hardware designed to support 4 GPUs in parallel via 'peer to peer' then you will most likely be limited to a maximum of 2 GPUs per simulation.
>>
>> I would suggest running either 4 independent simulations, one on each GPU, or two 2-GPU simulations, one on each pair of P2P-capable GPUs. Likely the pairs will be 0+1 and 2+3, but you can use the check_p2p code on the above website to check this for sure.
>>
>> All the best
>> Ross
>>
>>> On Jul 28, 2015, at 11:35 PM, 浅井 賢 <suguruasai.gmail.com> wrote:
>>>
>>> Dear Amber user,
>>>
>>>
>>> Hi, I sometimes run into a strange problem when using 4 GPUs with pmemd.cuda.MPI, and I wonder if someone can help me.
>>> I am not really sure of the minimum requirements for reproducing the situation, but it probably occurs when I use pmemd.cuda.MPI to run a simulation on two or more GPUs.
>>> The symptom is a sudden drop in utilization, which you can see in the screenshot below.
>>>
>>> http://gyazo.com/07564256c2b9a9f02277cc5d6170ba15
>>>
>>> The simulation seems slow but otherwise appears to run OK. Still, I am worried about a physical problem with the GPUs, and it is also very annoying.
>>> I have no idea what is going on, and I don't know what terms to search for on the internet.
>>> So does anybody have any idea?
>>>
>>>
>>> Thank you.
>>>
>>>
>>> Asai
>>>
>>>
>>> ATTACHMENTS:
>>>
>>> md.out - mdout file of `pmemd.cuda.MPI`
>>> nvidia-smi.txt - `$ nvidia-smi > nvidia-smi.txt`
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber


Received on Fri Jul 31 2015 - 09:30:02 PDT