Re: [AMBER] Cuda test hangs with Amber12 compiled with CUDA5

From: Jean-Christophe Ducom <jcducom.scripps.edu>
Date: Fri, 19 Oct 2012 11:18:46 -0700

Thank you for your quick reply.
I did search the archive before posting. However, I couldn't find
anything (wrong keywords, I guess). Now, of course, I see the message.
Sorry for the spam.
Thanks again
JC
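
For anyone finding this thread later: a minimal sketch for checking which CUDA runtime a pmemd.cuda binary is linked against, before running the test suite. It assumes the `ldd` output format shown in the quoted message below. The function name `cuda_runtime_version` and the sample strings are illustrative only and are not part of Amber or CUDA.

```python
import re

def cuda_runtime_version(ldd_output):
    """Return the libcudart version suffix found in ldd output, or None.

    Assumes lines of the form 'libcudart.so.<version> => <path> (<addr>)',
    as in the ldd listings quoted in this thread.
    """
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", ldd_output)
    return match.group(1) if match else None

# Sample lines copied from the two ldd listings in this thread.
sample_cuda5 = (
    "libcudart.so.5.0 => /usr/local/cuda/lib64/libcudart.so.5.0 "
    "(0x00007f52687d5000)"
)
sample_cuda4 = (
    "libcudart.so.4 => /usr/local/cuda/lib64/libcudart.so.4 "
    "(0x00007fb4914cb000)"
)

version_cuda5 = cuda_runtime_version(sample_cuda5)
version_cuda4 = cuda_runtime_version(sample_cuda4)
print(version_cuda5, version_cuda4)
```

In practice the input would be the captured text of `ldd pmemd.cuda`. Per the reply quoted below, CUDA 5 was not yet supported for Amber12 at the time, so a leading "5" here would be a reason to rebuild against CUDA 4.2.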


On 10/19/2012 11:12 AM, Ismail, Mohd F. wrote:
> I think just yesterday, Scott mentioned that CUDA 5 is not supported yet. Search the archive.
>
> *******************************
> Mohd Ismail
> Graduate Student
> Dept. of Chemistry/Biochemistry
> University of Oklahoma
> Norman 73019
>
> ________________________________________
> From: Jean-Christophe Ducom [jcducom.scripps.edu]
> Sent: Friday, October 19, 2012 12:40 PM
> To: amber.ambermd.org
> Subject: [AMBER] Cuda test hangs with Amber12 compiled with CUDA5
>
> Hi-
> We have updated our GPU nodes to CUDA 5 and some Amber12 tests are hanging.
> The system is running openSUSE 11.3 on a 16-core E5-2650 0 @ 2.00GHz with
> 86GB of memory. The GPU card is an NVIDIA Tesla M2090 (/usr/bin/nvidia-smi
> -pm 1 -c 3 --ecc-config=0)
>
> Amber12 (patched) compiled with CUDA5:
> -----------------------------------------------------------------
> > ldd pmemd.cuda
> linux-vdso.so.1 => (0x00007fff571ff000)
> libcurand.so.5.0 => /usr/local/cuda/lib64/libcurand.so.5.0
> (0x00007f526aa0e000)
> libcufft.so.5.0 => /usr/local/cuda/lib64/libcufft.so.5.0
> (0x00007f5268a31000)
> libcudart.so.5.0 => /usr/local/cuda/lib64/libcudart.so.5.0
> (0x00007f52687d5000)
> libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007f52684ef000)
> libm.so.6 => /lib64/libm.so.6 (0x00007f5268298000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f5268082000)
> libc.so.6 => /lib64/libc.so.6 (0x00007f5267d22000)
> libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f5267a18000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00007f5267814000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f52675f7000)
> librt.so.1 => /lib64/librt.so.1 (0x00007f52673ee000)
> /lib64/ld-linux-x86-64.so.2 (0x00007f526c933000)
>
> >make test.cuda
> [...]
> ---------------------------------------------
> Running Extended CUDA Explicit solvent tests.
> Precision Model = SPFP
> ---------------------------------------------
> cd 4096wat/ && ./Run.pure_wat SPFP
> /opt/applications/amber/12/gnu/include/netcdf.mod
>
> It hangs there forever.
>
> > cat mdout.pure_wat
>
> -------------------------------------------------------
> Amber 12 SANDER 2012
> -------------------------------------------------------
>
> However the GPU card seems to be working
> # nvidia-smi -a
>
> ==============NVSMI LOG==============
>
> Timestamp : Fri Oct 19 10:04:28 2012
> Driver Version : 304.54
>
> Attached GPUs : 1
> GPU 0000:42:00.0
> Product Name : Tesla M2090
> Display Mode : Disabled
> Persistence Mode : Disabled
> Driver Model
> Current : N/A
> Pending : N/A
> Serial Number : 0321412003326
> GPU UUID : GPU-8ba67388-783d-ca37-1d16-286dc5764189
> VBIOS Version : 70.10.46.00.01
> Inforom Version
> Image Version : N/A
> OEM Object : 1.1
> ECC Object : 2.0
> Power Management Object : 4.0
> GPU Operation Mode
> Current : N/A
> Pending : N/A
> PCI
> Bus : 0x42
> Device : 0x00
> Domain : 0x0000
> Device Id : 0x109110DE
> Bus Id : 0000:42:00.0
> Sub System Id : 0x088710DE
> GPU Link Info
> PCIe Generation
> Max : 2
> Current : 2
> Link Width
> Max : 16x
> Current : 16x
> Fan Speed : N/A
> Performance State : P0
> Clocks Throttle Reasons : N/A
> Memory Usage
> Total : 6143 MB
> Used : 115 MB
> Free : 6028 MB
> Compute Mode : Default
> Utilization
> Gpu : 99 %
> Memory : 0 %
> Ecc Mode
> Current : Disabled
> Pending : Disabled
> ECC Errors
> Volatile
> Single Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Double Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Aggregate
> Single Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Double Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Temperature
> Gpu : N/A
> Power Readings
> Power Management : Supported
> Power Draw : 95.12 W
> Power Limit : 225.00 W
> Default Power Limit : N/A
> Min Power Limit : N/A
> Max Power Limit : N/A
> Clocks
> Graphics : 650 MHz
> SM : 1301 MHz
> Memory : 1848 MHz
> Applications Clocks
> Graphics : N/A
> Memory : N/A
> Max Clocks
> Graphics : 650 MHz
> SM : 1301 MHz
> Memory : 1848 MHz
> Compute Processes
> Process ID : 6829
> Name : ../../../bin/pmemd.cuda_SPFP
> Used GPU Memory : 101 MB
>
>
> Amber12 (patched) compiled with CUDA4.2:
> --------------------------------------------------------------------
> > ldd pmemd.cuda
> linux-vdso.so.1 => (0x00007ffff31ff000)
> libcurand.so.4 => /usr/local/cuda/lib64/libcurand.so.4
> (0x00007fb49375e000)
> libcufft.so.4 => /usr/local/cuda/lib64/libcufft.so.4
> (0x00007fb491726000)
> libcudart.so.4 => /usr/local/cuda/lib64/libcudart.so.4
> (0x00007fb4914cb000)
> libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007fb4911e5000)
> libm.so.6 => /lib64/libm.so.6 (0x00007fb490f8e000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb490d78000)
> libc.so.6 => /lib64/libc.so.6 (0x00007fb490a18000)
> libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fb49070e000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00007fb49050a000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb4902ed000)
> librt.so.1 => /lib64/librt.so.1 (0x00007fb4900e4000)
> /lib64/ld-linux-x86-64.so.2 (0x00007fb495870000)
>
>
> >make test.cuda
> [...]
> ---------------------------------------------
> Running Extended CUDA Explicit solvent tests.
> Precision Model = SPFP
> ---------------------------------------------
> cd 4096wat/ && ./Run.pure_wat SPFP
> /opt/applications/amber/12/gnu/include/netcdf.mod
> diffing mdout.pure_wat.GPU_SPFP with mdout.pure_wat
> PASSED
> [...]
> All the tests passed successfully
>
> Any idea?
> Let me know if you need additional info.
> Best,
> JC
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>


Received on Fri Oct 19 2012 - 11:30:04 PDT