Re: [AMBER] cuda versions advice

From: Scott Le Grand <varelse2005.gmail.com>
Date: Tue, 2 Sep 2014 13:26:32 -0700

The first rule of PMEMD Club is CUDA 5.0.

The second rule of PMEMD Club is CUDA 5.0.

Were you thinking of building with 5.5, 6.0, or 6.5? OK, sure. We'll
throw in a 5-10% unfixable perf regression (compiler bug introduced in 5.5)
for no additional charge.
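
If you want to check which toolkit a build host would pick up, something
like the following rough sketch might help. It assumes that `nvcc --version`
prints a line containing "release X.Y" and that nvcc is on the PATH. The
function names (parse_nvcc_release, advise, check_installed) are made up
for illustration and are not part of Amber or CUDA.

```python
import re
import subprocess

# Assumption taken from this thread: 5.0 is the preferred toolkit, and
# 5.5 and later carry the compiler performance regression.
PREFERRED = (5, 0)
REGRESSION_FROM = (5, 5)


def parse_nvcc_release(text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def advise(version):
    """Return a one-line note about a CUDA toolkit version tuple."""
    label = "CUDA %d.%d" % version
    if version == PREFERRED:
        return label + ": the toolkit recommended in this thread."
    if version >= REGRESSION_FROM:
        return label + ": expect the 5-10% perf regression mentioned above."
    return label + ": not discussed in this thread."


def check_installed():
    """Run nvcc (hypothetical helper) and return advice, or None if unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return None  # nvcc is not installed or not on the PATH
    version = parse_nvcc_release(result.stdout)
    return None if version is None else advise(version)
```

Calling check_installed() on the build host returns the note for whatever
nvcc is found first on the PATH, or None if nvcc cannot be run.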

Scott



On Tue, Sep 2, 2014 at 12:50 PM, Scott Brozell <sbrozell.rci.rutgers.edu>
wrote:

> Hi,
>
> Are there particular reasons to build Amber14 with a specific CUDA version
> among 5.0, 5.5, and 6.0 (aside from newer is better)?
>
> When is Amber support for CUDA 6.5 expected?
>
>
> The target machine has
> +------------------------------------------------------+
> | NVIDIA-SMI 331.62     Driver Version: 331.62         |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
> |===============================+======================+======================|
> |   0  Tesla M2070         On   | 0000:11:00.0     Off |                    0 |
> | N/A   N/A    P0    N/A /  N/A |    672MiB /  5375MiB |     83%    E. Thread |
> +-------------------------------+----------------------+----------------------+
> |   1  Tesla M2070         On   | 0000:14:00.0     Off |                    0 |
> | N/A   N/A    P0    N/A /  N/A |    621MiB /  5375MiB |     87%    E. Thread |
> +-------------------------------+----------------------+----------------------+
>
> CUDA Device Query (Runtime API) version (CUDART static linking)
>
> Detected 2 CUDA Capable device(s)
>
> Device 0: "Tesla M2070"
> CUDA Driver Version / Runtime Version 6.0 / 5.0
> CUDA Capability Major/Minor version number: 2.0
> Total amount of global memory: 5375 MBytes (5636554752 bytes)
> (14) Multiprocessors x ( 32) CUDA Cores/MP: 448 CUDA Cores
> GPU Clock rate: 1147 MHz (1.15 GHz)
> Memory Clock rate: 1566 Mhz
> Memory Bus Width: 384-bit
> L2 Cache Size: 786432 bytes
> Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
> Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
> Total amount of constant memory: 65536 bytes
> Total amount of shared memory per block: 49152 bytes
> Total number of registers available per block: 32768
> Warp size: 32
> Maximum number of threads per multiprocessor: 1536
> Maximum number of threads per block: 1024
> Maximum sizes of each dimension of a block: 1024 x 1024 x 64
> Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
> Maximum memory pitch: 2147483647 bytes
> Texture alignment: 512 bytes
> Concurrent copy and kernel execution: Yes with 2 copy engine(s)
> Run time limit on kernels: No
> Integrated GPU sharing Host Memory: No
> Support host page-locked memory mapping: Yes
> Alignment requirement for Surfaces: Yes
> Device has ECC support: Enabled
> Device supports Unified Addressing (UVA): Yes
> Device PCI Bus ID / PCI location ID: 20 / 0
> Compute Mode:
> < Exclusive (only one host thread in one process is able to use ::cudaSetDevice() with this device) >
>
> thanks,
> scott
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
Received on Tue Sep 02 2014 - 13:30:02 PDT