Re: [AMBER] cudaMemcpyToSymbol: SetSim copy to Sim failed for K20x using JAC and Cellulose benchmarks

From: Ross Walker <ross.rosswalker.co.uk>
Date: Thu, 25 Apr 2013 21:10:22 -0700

Hi Chinh,

AMBER 11 does not support CUDA 5.0, so yes, you will need CUDA 4.2. That
said, the AMBER 11 GPU code is no longer supported; it would be too
difficult to keep it updated in addition to the AMBER 12 code. I would
strongly suggest that, if you can, you upgrade to AMBER 12. You will find
that everything works much better.

Firstly, it supports CUDA 5.0. It is also faster, has more features and a
better precision model, and includes multiple fixes for corner cases that
caused issues when run on GPUs.
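
If it helps to confirm which toolkit a build will pick up, here is a minimal
Python sketch that reads the release number from `nvcc --version` output and
compares it against a ceiling. The function names and the
MAX_SUPPORTED_CUDA constant are made up for illustration; the 4.2 ceiling is
simply the AMBER 11 limit mentioned above, and the check only looks at the
upper bound, not at any minimum version.

```python
import re
import subprocess

# Assumption: newest CUDA release the installed GPU code was written for.
# 4.2 is the AMBER 11 limit stated in this thread; change it for other setups.
MAX_SUPPORTED_CUDA = (4, 2)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(release, max_supported=MAX_SUPPORTED_CUDA):
    """True if a release was found and it is no newer than max_supported."""
    return release is not None and release <= max_supported


if __name__ == "__main__":
    try:
        result = subprocess.run(["nvcc", "--version"],
                                capture_output=True, text=True)
        release = parse_nvcc_release(result.stdout)
    except FileNotFoundError:
        # nvcc is not on PATH, so there is nothing to check.
        release = None
    if release is None:
        print("Could not determine the nvcc release.")
    elif is_supported(release):
        print("nvcc release %d.%d is within the assumed limit." % release)
    else:
        print("nvcc release %d.%d is newer than the assumed limit." % release)
```

With the setup described in the quoted message (nvcc release 5.0), this
would report a release newer than the assumed limit.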

All the best
Ross



On 4/25/13 9:00 PM, "Chinh Su Tran To" <chinh.sutranto.gmail.com> wrote:

>Hi,
>
>I encountered the same problem, "cudaMemcpyToSymbol: SetSim copy to Sim
>failed", when I ran pmemd, and I found the solution from Dr. Walker quoted
>below.
>I have to admit that I am very new to CUDA and to installation in general,
>so please help me with some suggestions.
>My information:
>- Cuda-5.0, nvcc release 5.0, amber11, ubuntu 12.10
>- GPU GeForce GTX580
>
>Does that mean I need to reinstall nvcc v4.2? Do I have to start from
>scratch? If so, could you please tell me how to do that?
>
>It was a real nightmare to deal with the "conflicts" between gcc, g++, and
>gfortran while compiling pmemd.cuda in amber11!!
>
>Please help. Thank you.
>Chinh
>
>
>On Thu, Jan 10, 2013 at 6:23 AM, Ross Walker <ross.rosswalker.co.uk>
>wrote:
>
>> Use NVCC v4.2 - Cuda 5.0 is not currently supported by the release
>> version of the code as of Jan 9th 2013.
>>
>>
>>
>> On 1/9/13 2:02 PM, "Mohammad Ashraf Bhuiyan" <akasheee.gmail.com> wrote:
>>
>> >Hi,
>> >I tried to run the GPU benchmarks JAC_production_NPT and NVE and
>> >Cellulose_Production_NPT and NVE using CUDA 5 on an Nvidia K20x. But it
>> >got stuck at some point, and after I cancelled the run with Ctrl+C, it
>> >said:
>> >
>> >cudaMemcpyToSymbol: SetSim copy to Sim failed
>> >
>> >
>> >The last part of the "mdout" file says:
>> >
>> >--------------------------------------------------------------------------------
>> > 3. ATOMIC COORDINATES AND VELOCITIES
>> >--------------------------------------------------------------------------------
>> >
>> >
>> >
>> > begin time read from input coords = 6.000 ps
>> >
>> >
>> > Number of triangulated 3-point waters found: 7023
>> >
>> > Sum of charges from parm topology file = -11.00000006
>> > Assuming uniform neutralizing plasma
>> >
>> >..............................................................................
>> >Is this expected?
>> >
>> >The K20x GPU is active; I can run other CUDA programs on it.
>> >The device query says:
>> >Device 0: "Tesla K20Xm"
>> > CUDA Driver Version / Runtime Version 5.0 / 5.0
>> > CUDA Capability Major/Minor version number: 3.5
>> > Total amount of global memory: 5760 MBytes (6039339008 bytes)
>> > (14) Multiprocessors x (192) CUDA Cores/MP: 2688 CUDA Cores
>> > GPU Clock rate: 732 MHz (0.73 GHz)
>> > Memory Clock rate: 2600 Mhz
>> > Memory Bus Width: 384-bit
>> > L2 Cache Size: 1572864 bytes
>> > Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
>> > Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
>> > Total amount of constant memory: 65536 bytes
>> > Total amount of shared memory per block: 49152 bytes
>> > Total number of registers available per block: 65536
>> > Warp size: 32
>> > Maximum number of threads per multiprocessor: 2048
>> > Maximum number of threads per block: 1024
>> > Maximum sizes of each dimension of a block: 1024 x 1024 x 64
>> > Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
>> > Maximum memory pitch: 2147483647 bytes
>> > Texture alignment: 512 bytes
>> > Concurrent copy and kernel execution: Yes with 2 copy engine(s)
>> > Run time limit on kernels: No
>> > Integrated GPU sharing Host Memory: No
>> > Support host page-locked memory mapping: Yes
>> > Alignment requirement for Surfaces: Yes
>> > Device has ECC support: Enabled
>> > Device supports Unified Addressing (UVA): Yes
>> > Device PCI Bus ID / PCI location ID: 132 / 0
>> > Compute Mode:
>> > < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
>> >
>> >deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = Tesla K20Xm
>> >
>> >
>> >Best Regards
>> >
>> >Ashraf
>> >
>> >--------------------------------------------------
>> >M Ashraf Bhuiyan, PhD
>> >_______________________________________________
>> >AMBER mailing list
>> >AMBER.ambermd.org
>> >http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>



Received on Thu Apr 25 2013 - 21:30:04 PDT