Re: [AMBER] GPU peer support disabled

From: Hirdesh Kumar <hirdesh.iitd.gmail.com>
Date: Thu, 11 Aug 2016 23:11:38 +0200

Thanks Ross,

I will discuss your suggestion with my administrator.

Thanks,
Hirdesh


On Thu, Aug 11, 2016 at 11:01 PM, Ross Walker <ross.rosswalker.co.uk> wrote:

> Hi Hirdesh,
>
> This means that the two GPUs are on different PCI-E domains. Most likely
> this is a dual-CPU system in which one GPU sits on the PCI-E lanes connected
> to one CPU and the other GPU sits on the lanes connected to the other CPU.
> Unfortunately, Intel's QuickPath Interconnect (QPI) does not permit
> peer-to-peer copies between the two PCI-E domains. To get peer-to-peer
> working, you would have to physically move one of the GPUs to a PCI-E slot
> connected to the same CPU as the other GPU. The only other alternative is
> to run two independent single-GPU runs.
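
A minimal sketch of the "two independent single-GPU runs" alternative, assuming pmemd.cuda is on the PATH. Each process gets its own CUDA_VISIBLE_DEVICES value so it sees only one device. The helper names build_jobs and launch and all input/output file names (md.in, system.prmtop, run0.rst7, ...) are hypothetical placeholders, not taken from the thread.

```python
import os
import subprocess


def build_jobs(gpu_ids, engine="pmemd.cuda"):
    """Return one (environment, command) pair per GPU.

    Each environment exposes exactly one device to its process, so the
    runs never need peer-to-peer communication.
    """
    jobs = []
    for gpu in gpu_ids:
        env = dict(os.environ)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu)
        # Hypothetical file names; replace with your own inputs/outputs.
        cmd = [
            engine, "-O",
            "-i", "md.in",
            "-p", "system.prmtop",
            "-c", f"run{gpu}.rst7",
            "-o", f"run{gpu}.out",
            "-r", f"run{gpu}_end.rst7",
            "-x", f"run{gpu}.nc",
        ]
        jobs.append((env, cmd))
    return jobs


def launch(jobs):
    """Start all jobs concurrently and return their exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for env, cmd in jobs]
    return [proc.wait() for proc in procs]
```

Calling launch(build_jobs([0, 1])) would start one simulation per GPU. The two runs are unrelated trajectories, so this improves total throughput rather than the speed of a single simulation.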
>
> The following writeup gives a little more detail and some diagrams that
> explain it:
>
> http://exxactcorp.com/blog/exploring-the-complexities-of-pcie-connectivity-and-peer-to-peer-communication/
>
> All the best
> Ross
>
>
> > On Aug 11, 2016, at 1:53 PM, Hirdesh Kumar <hirdesh.iitd.gmail.com> wrote:
> >
> > Hi,
> >
> > I am trying to use two GPUs in parallel, but the speed is similar to that
> > of a single GPU. When I checked the output file (pasted below), I observed
> > that when I use a single GPU, it says "Peer to Peer support: ENABLED"
> >
> > However, when I use both GPUs, the output is
> > "Peer to Peer support: DISABLED"
> >
> >
> > The output of both runs is given below:
> >
> > *1) Two GPUs in parallel*
> >
> > |------------------- GPU DEVICE INFO --------------------
> > |
> > | Task ID: 0
> > | CUDA_VISIBLE_DEVICES: 0,1
> > | CUDA Capable Devices Detected: 2
> > | CUDA Device ID in use: 0
> > | CUDA Device Name: Tesla K20m
> > | CUDA Device Global Mem Size: 4799 MB
> > | CUDA Device Num Multiprocessors: 13
> > | CUDA Device Core Freq: 0.71 GHz
> > |
> > |
> > | Task ID: 1
> > | CUDA_VISIBLE_DEVICES: 0,1
> > | CUDA Capable Devices Detected: 2
> > | CUDA Device ID in use: 1
> > | CUDA Device Name: Tesla K20m
> > | CUDA Device Global Mem Size: 4799 MB
> > | CUDA Device Num Multiprocessors: 13
> > | CUDA Device Core Freq: 0.71 GHz
> > |
> > |--------------------------------------------------------
> >
> > |---------------- GPU PEER TO PEER INFO -----------------
> > |
> > | Peer to Peer support: DISABLED
> > |
> > | (Selected GPUs cannot communicate over P2P)
> > |
> > |--------------------------------------------------------
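
As a small illustration, the peer-to-peer status can be read out of an output file by searching for the "Peer to Peer support:" line shown above. This is only a sketch: the function name peer_to_peer_status is hypothetical, and it assumes the line format is exactly as pasted in this message.

```python
import re

# Matches the status line as it appears in the pasted output.
P2P_LINE = re.compile(r"Peer to Peer support:\s*(ENABLED|DISABLED)")


def peer_to_peer_status(mdout_text):
    """Return 'ENABLED' or 'DISABLED', or None if the line is absent."""
    match = P2P_LINE.search(mdout_text)
    return match.group(1) if match else None
```

Passing the contents of an output file to peer_to_peer_status gives a quick check before committing to a long two-GPU run.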
> >
> >
> > *2) Single GPU usage:*
> >
> > |------------------- GPU DEVICE INFO --------------------
> > |
> > | Task ID: 0
> > | CUDA_VISIBLE_DEVICES: 0
> > | CUDA Capable Devices Detected: 1
> > | CUDA Device ID in use: 0
> > | CUDA Device Name: Tesla K20m
> > | CUDA Device Global Mem Size: 4799 MB
> > | CUDA Device Num Multiprocessors: 13
> > | CUDA Device Core Freq: 0.71 GHz
> > |
> > |--------------------------------------------------------
> >
> > |---------------- GPU PEER TO PEER INFO -----------------
> > |
> > | Peer to Peer support: ENABLED
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Aug 11 2016 - 14:30:04 PDT