[AMBER] cudaMemcpyToSymbol: SetSim copy to Sim failed for K20x using JAC and Cellulose benchmarks

From: Mohammad Ashraf Bhuiyan <akasheee.gmail.com>
Date: Wed, 9 Jan 2013 14:02:41 -0800

Hi,
I tried to run the GPU benchmarks JAC_production_NPT and NVE, and
Cellulose_Production_NPT and NVE, using CUDA 5 on an Nvidia K20x. Each run
hangs at some point, and after I cancel it with Ctrl+C, it reports:

cudaMemcpyToSymbol: SetSim copy to Sim failed


The last part of the "mdout" file says:

--------------------------------------------------------------------------------
   3. ATOMIC COORDINATES AND VELOCITIES
--------------------------------------------------------------------------------



 begin time read from input coords = 6.000 ps


 Number of triangulated 3-point waters found: 7023

     Sum of charges from parm topology file = -11.00000006
     Assuming uniform neutralizing plasma

..............................................................................
Is this expected?

The K20x GPU is active; I can run other CUDA programs on it.
The device query says:
Device 0: "Tesla K20Xm"
  CUDA Driver Version / Runtime Version 5.0 / 5.0
  CUDA Capability Major/Minor version number: 3.5
  Total amount of global memory: 5760 MBytes (6039339008 bytes)
  (14) Multiprocessors x (192) CUDA Cores/MP: 2688 CUDA Cores
  GPU Clock rate: 732 MHz (0.73 GHz)
  Memory Clock rate: 2600 Mhz
  Memory Bus Width: 384-bit
  L2 Cache Size: 1572864 bytes
  Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per multiprocessor: 2048
  Maximum number of threads per block: 1024
  Maximum sizes of each dimension of a block: 1024 x 1024 x 64
  Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 2 copy engine(s)
  Run time limit on kernels: No
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Enabled
  Device supports Unified Addressing (UVA): Yes
  Device PCI Bus ID / PCI location ID: 132 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = Tesla K20Xm
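
The failing call could also be exercised outside AMBER. Below is a minimal sketch of such a standalone check: it copies a small struct into a constant-memory symbol with cudaMemcpyToSymbol, reads it back through a kernel, and prints the CUDA error string if any step fails. The struct SimParams, the symbol d_sim, the kernel readSim, the helper check, and the values used are all hypothetical stand-ins for illustration; they are not AMBER's actual Sim structure or code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a simulation-parameter struct (not AMBER's type).
struct SimParams {
    int    atoms;
    double cutoff;
};

// Constant-memory symbol that the host copies into.
__constant__ SimParams d_sim;

// Copies the constant symbol to global memory so the host can verify it.
__global__ void readSim(SimParams *out) {
    *out = d_sim;
}

// Prints the CUDA error string and returns 1 on failure, 0 on success.
static int check(cudaError_t status, const char *what) {
    if (status != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
        return 1;
    }
    return 0;
}

int main() {
    SimParams h_sim = {1000, 8.0};   // arbitrary illustrative values
    SimParams h_out = {0, 0.0};
    SimParams *d_out = NULL;

    // The symbol itself is passed, not a string holding its name.
    if (check(cudaMemcpyToSymbol(d_sim, &h_sim, sizeof(h_sim)),
              "cudaMemcpyToSymbol")) return 1;
    if (check(cudaMalloc(&d_out, sizeof(SimParams)), "cudaMalloc")) return 1;

    readSim<<<1, 1>>>(d_out);
    if (check(cudaGetLastError(), "kernel launch")) return 1;
    if (check(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) return 1;

    if (check(cudaMemcpy(&h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost),
              "cudaMemcpy")) return 1;
    cudaFree(d_out);

    // Success only if the values read back match what was copied in.
    int ok = (h_out.atoms == h_sim.atoms) && (h_out.cutoff == h_sim.cutoff);
    printf("constant-memory round trip %s\n", ok ? "matched" : "did NOT match");
    return ok ? 0 : 1;
}
```

Assuming the file is saved as memcpy_symbol_test.cu (a hypothetical name), it could be built with `nvcc -arch=sm_35 memcpy_symbol_test.cu` to target the compute capability 3.5 reported above. Checking both the launch and the synchronize result matters because a kernel failure can otherwise surface later, at an unrelated API call.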


Best Regards

Ashraf

--------------------------------------------------
M Ashraf Bhuiyan, PhD
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jan 09 2013 - 14:30:03 PST