Re: [AMBER] cudaMemcpyToSymbol: SetSim copy to Sim failed for K20x using JAC and Cellulose benchmarks from Ross Walker on 2013-01-09 (Amber Archive Jan 2013)

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 09 Jan 2013 14:23:00 -0800

Use NVCC v4.2. CUDA 5.0 is not currently supported by the release version
of the code (as of Jan 9th, 2013).
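
A minimal sketch (not part of AMBER) for checking which toolkit release the
nvcc on your PATH reports before rebuilding. It assumes `nvcc --version`
prints a line containing "release X.Y". The function names and the
SUPPORTED_RELEASE constant are hypothetical and used only for illustration.

```python
import re
import subprocess

# Toolkit release recommended in this thread (assumption: an exact match is wanted).
SUPPORTED_RELEASE = (4, 2)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_nvcc_release(nvcc="nvcc"):
    """Run `nvcc --version` and parse it; None if nvcc cannot be executed."""
    try:
        result = subprocess.run(
            [nvcc, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_nvcc_release()
    if release is None:
        print("nvcc not found or version not recognised")
    elif release == SUPPORTED_RELEASE:
        print("nvcc release %d.%d matches the recommended toolkit" % release)
    else:
        print("nvcc release %d.%d found; this thread recommends 4.2" % release)
```

Run it in the same shell environment used to configure and build the GPU
code, so that the nvcc it finds is the one the build will pick up.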

On 1/9/13 2:02 PM, "Mohammad Ashraf Bhuiyan" <akasheee.gmail.com> wrote:

>Hi,
>I tried to run the GPU benchmarks JAC_production_NPT and NVE and
>Cellulose_Production_NPT and NVE using CUDA 5 on an Nvidia K20x. The run
>hangs at some point, and after I cancel it with Ctrl+C, it reports:
>
>cudaMemcpyToSymbol: SetSim copy to Sim failed
>
>
>The last part of the "mdout" file says:
>
>--------------------------------------------------------------------------------
> 3. ATOMIC COORDINATES AND VELOCITIES
>--------------------------------------------------------------------------------
>
>
>
> begin time read from input coords = 6.000 ps
>
>
> Number of triangulated 3-point waters found: 7023
>
> Sum of charges from parm topology file = -11.00000006
> Assuming uniform neutralizing plasma
>
>..............................................................................
>Is this expected?
>
>The K20x GPU is active; I can run other CUDA programs on it.
>The deviceQuery output is:
>Device 0: "Tesla K20Xm"
> CUDA Driver Version / Runtime Version 5.0 / 5.0
> CUDA Capability Major/Minor version number: 3.5
> Total amount of global memory: 5760 MBytes (6039339008 bytes)
> (14) Multiprocessors x (192) CUDA Cores/MP: 2688 CUDA Cores
> GPU Clock rate: 732 MHz (0.73 GHz)
> Memory Clock rate: 2600 Mhz
> Memory Bus Width: 384-bit
> L2 Cache Size: 1572864 bytes
> Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
> Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
> Total amount of constant memory: 65536 bytes
> Total amount of shared memory per block: 49152 bytes
> Total number of registers available per block: 65536
> Warp size: 32
> Maximum number of threads per multiprocessor: 2048
> Maximum number of threads per block: 1024
> Maximum sizes of each dimension of a block: 1024 x 1024 x 64
> Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
> Maximum memory pitch: 2147483647 bytes
> Texture alignment: 512 bytes
> Concurrent copy and kernel execution: Yes with 2 copy engine(s)
> Run time limit on kernels: No
> Integrated GPU sharing Host Memory: No
> Support host page-locked memory mapping: Yes
> Alignment requirement for Surfaces: Yes
> Device has ECC support: Enabled
> Device supports Unified Addressing (UVA): Yes
> Device PCI Bus ID / PCI location ID: 132 / 0
> Compute Mode:
> < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
>
>deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = Tesla K20Xm
>
>
>Best Regards
>
>Ashraf
>
>--------------------------------------------------
>M Ashraf Bhuiyan, PhD
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Jan 09 2013 - 14:30:04 PST