Re: [AMBER] Query about parallel GPU multijob

From: Kshatresh Dutta Dubey <kshatresh.gmail.com>
Date: Mon, 23 Jun 2014 23:36:45 +0300

Hi Prof Ross,

   I did the above, and the following is the output:
CUDA_VISIBLE_DEVICES is unset.
CUDA-capable device count: 4
   GPU0 " Tesla K40m"
   GPU1 " Tesla K40m"
   GPU2 " Tesla K40m"
   GPU3 " Tesla K40m"

Two way peer access between:
   GPU0 and GPU1: YES
   GPU0 and GPU2: NO
   GPU0 and GPU3: NO
   GPU1 and GPU2: NO
   GPU1 and GPU3: NO
   GPU2 and GPU3: YES

Does this mean I can simply submit the job with nohup
$AMBERHOME....../pmemd.cuda.MPI and it will automatically use the other free
node (since one parallel job is already running)?
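
For example, I assume I would first check which GPUs are busy with nvidia-smi
on each node and then, on the node whose GPUs are free, run something like
this (the input and output file names below are just placeholders for my own
files):

nvidia-smi
unset CUDA_VISIBLE_DEVICES
nohup mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i md.in -o md.out -p prmtop -c inpcrd -r restrt -x mdcrd &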

Thanks and regards
Kshatresh





On Mon, Jun 23, 2014 at 11:25 PM, Kshatresh Dutta Dubey <kshatresh.gmail.com> wrote:

> Thank you Dr. Ross, I am using Amber 14. I have one more query:
> since I have already submitted one parallel job on 2 GPUs and it is
> running fine, I want to utilize the other node for a parallel run. Is there
> any way to tell whether the running job is using node 1 or node 2?
>
> Thank you once again.
>
> Best Regards
> Kshatresh
>
>
> On Mon, Jun 23, 2014 at 11:04 PM, Ross Walker <ross.rosswalker.co.uk>
> wrote:
>
>> Hi Kshatresh,
>>
>> Are you using AMBER 12 or AMBER 14?
>>
>> If it is AMBER 12 you have little or no hope of seeing much speedup on
>> multiple GPUs with K40s. I'd stick to running 4 x 1 GPU.
>>
>> If it is AMBER 14 then you should first check if your GPUs in each node
>> are connected to the same processor and can communicate by peer to peer. I
>> will update the website instructions shortly to explain this but in the
>> meantime you can download the following:
>>
>> https://dl.dropboxusercontent.com/u/708185/check_p2p.tar.bz2
>>
>> untar it, then cd to the directory and run make. Then run ./gpuP2PCheck.
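>>
>> For example (a sketch; this assumes the tarball extracts to a directory
>> named check_p2p):
>>
>> tar xjf check_p2p.tar.bz2
>> cd check_p2p
>> make
>> ./gpuP2PCheck
>>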
>> It should give you something like:
>>
>> CUDA_VISIBLE_DEVICES is unset.
>> CUDA-capable device count: 2
>> GPU0 "Tesla K40"
>> GPU1 "Tesla K40"
>>
>> Two way peer access between:
>> GPU0 and GPU1: YES
>>
>> You need it to say YES here. If it says NO, you will need to reorganize
>> which PCI-E slots your GPUs are in so that they are on the same CPU socket;
>> otherwise you will be stuck running single-GPU runs.
>>
>> If it says YES then you are good to go. Just login to the first node and
>> do:
>>
>> unset CUDA_VISIBLE_DEVICES
>> nohup mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i ... &
>>
>> Logout and repeat the same on the other node. You want the two MPI
>> processes to run on the same node. The GPUs will automagically be
>> selected.
>>
>> If you are using a queuing system you'll need to check the manual for your
>> specific queuing system but typically this would be something like:
>>
>> #PBS nodes=1,tasks_per_node=2
>>
>> This would make sure each of your two jobs gets allocated to its own
>> node. There is no point trying to span nodes these days; InfiniBand just
>> isn't fast enough to keep up with modern GPUs and AMBER's superdooper GPU
>> breaking lightning speed execution mode(TM).
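>>
>> For example, with Torque/PBS a minimal submission script might look
>> something like the following (exact resource syntax varies by scheduler
>> and site, and the input/output file names are placeholders):
>>
>> #!/bin/bash
>> #PBS -l nodes=1:ppn=2
>> cd $PBS_O_WORKDIR
>> unset CUDA_VISIBLE_DEVICES
>> mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i md.in -o md.out -p prmtop -c inpcrd -r restrt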
>>
>> Hope that helps.
>>
>> All the best
>> Ross
>>
>>
>>
>> On 6/23/14, 12:43 PM, "Kshatresh Dutta Dubey" <kshatresh.gmail.com> wrote:
>>
>> >Dear Users,
>> >
>> > I have a 2-node x 2-GPU Tesla K40 machine (each node has 2 GPUs). I
>> >want to run 2 parallel jobs (one on the 2 GPUs of each node). I followed
>> >http://ambermd.org/gpus/ but am still unable to understand how to submit
>> >the jobs. The link describes running either a single job on four GPUs or
>> >4 jobs, one on each GPU, but there is no information about 2 parallel jobs
>> >on 2 nodes. The following is the output of deviceQuery:
>> >Device 0: "Tesla K40m"
>> >Device 1: "Tesla K40m"
>> >Device 2: "Tesla K40m"
>> >Device 3: "Tesla K40m"
>> >
>> > I will be thankful for all suggestions.
>> >
>> >Regards
>> >Kshatresh
>> >_______________________________________________
>> >AMBER mailing list
>> >AMBER.ambermd.org
>> >http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
>
> --
> With best regards
>
> ************************************************************************************************
> Dr. Kshatresh Dutta Dubey
> Post Doctoral Researcher,
> c/o Prof Sason Shaik,
> Hebrew University of Jerusalem, Israel
> Jerusalem, Israel
>
>
>


-- 
With best regards
************************************************************************************************
Dr. Kshatresh Dutta Dubey
Post Doctoral Researcher,
c/o Prof Sason Shaik,
Hebrew University of Jerusalem, Israel
Jerusalem, Israel
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jun 23 2014 - 14:00:02 PDT