Re: [AMBER] GPUs parallel problem

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 5 May 2017 08:05:58 -0400

Hi Meng,

Unless you are running a large GB calculation, you cannot use all 4 GPUs for a single calculation. But why would you want to? Just run four simulations at once, one on each GPU, started from different initial conditions / random seeds etc. There is little to be gained from running just a single MD calculation anyway: you need multiple independent runs to get converged results, so it makes sense to use the 4 GPUs to run 4 calculations.
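[Editor's note: the "one independent run per GPU" pattern above can be sketched as a small shell loop. This is a hedged illustration, not from the original email: the input/output file names (md0.in, prmtop, inpcrd, ...) are placeholders, and the DRY_RUN guard is added so the loop can be previewed on a machine without AMBER installed. Different random seeds would normally be set via the ig variable in each run's &cntrl input namelist.]

```shell
# Launch one independent pmemd.cuda simulation per GPU (sketch; file
# names are illustrative). With DRY_RUN=1 (the default here) the loop
# only prints the commands it would run.
DRY_RUN=${DRY_RUN:-1}
count=0
for gpu in 0 1 2 3; do
  # CUDA_VISIBLE_DEVICES restricts each run to a single GPU, which the
  # process then sees as device 0. Per-run input files (md0.in .. md3.in)
  # would carry different ig seeds for independent trajectories.
  cmd="pmemd.cuda -O -i md${gpu}.in -p prmtop -c inpcrd -o md${gpu}.out -r md${gpu}.rst -x md${gpu}.nc"
  if [ "$DRY_RUN" = "1" ]; then
    echo "CUDA_VISIBLE_DEVICES=$gpu $cmd &"
  else
    CUDA_VISIBLE_DEVICES=$gpu $cmd &
  fi
  count=$((count + 1))
done
# Block until all background runs finish (no-op in dry-run mode).
[ "$DRY_RUN" = "1" ] || wait
```

Because each process sees only one device, no peer-to-peer communication is involved and all four runs proceed at full single-GPU speed.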

All the best
Ross

> On May 5, 2017, at 05:36, Meng Wu <wumeng.shanghaitech.edu.cn> wrote:
>
> Hi Andreas,
>
> Thanks for your answer, and also thanks Alican and Ross.
>
> I have read the information at that link, and I ran the check_p2p program on our system. It is true that only GPUs 0 and 1 can talk to each other, and GPUs 2 and 3 can talk to each other. But if I want to use all four GPUs in parallel, is there any way to solve this problem, or do I need to rebuild the platform?
>
> Any suggestions would be greatly appreciated. Thank you in advance!
>
>
> All the best,
> Meng Wu
>
>
> ------------------------------
> Message-ID: <3e2f5284-6365-46d6-5aae-229e852db120.cup.uni-muenchen.de>
>
> Have a look at this: http://ambermd.org/gpus/
>
> "In other words on a 4 GPU machine you can run a total of two by two GPU
> jobs, one on GPUs 0 and 1 and one on GPUs 2 and 3. Running a calculation
> across more than 2 GPUs will result in peer to peer being switched off
> which will likely mean the calculation will run slower than if it had
> been run on a single GPU. To see which GPUs in your system can
> communicate via peer to peer you can run the 'gpuP2PCheck' program you
> built above."
>
>
> On 05/04/2017 10:26 AM, Meng Wu wrote:
> > Dear All,
> > I have run into a problem with GPU parallelization these days. There are 4 GPUs per node in our lab. When I use two of them ("export CUDA_VISIBLE_DEVICES=0,1/2,3 mpirun -np 2 pmemd.cuda.MPI -O ..."), the speed is normal; but when I use all four ("export CUDA_VISIBLE_DEVICES=0,1,2,3 mpirun -np 4 pmemd.cuda.MPI -O ..."), the speed drops dramatically. I don't know what the problem is, or how to deal with it if I want to use 4 GPUs in parallel to get higher speed.
> >
> > Any suggestions would be greatly appreciated. Thank you in advance!
> >
> > All the best,
> > Wu Meng
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
>
> --
> M.Sc. Andreas Tosstorff
> Lehrstuhl für Pharmazeutische Technologie und Biopharmazie
> Department Pharmazie
> LMU München
> Butenandtstr. 5-13 (Haus B)
> 81377 München
> Germany
> Tel.: +49 89 2180 77059
>
>
>
> ------------------------------

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri May 05 2017 - 05:30:03 PDT