Re: [AMBER] CUDA running error

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 8 May 2012 09:57:24 -0700

Hi Albert,

Which versions of the Intel compilers, MPICH2, and nvcc are you using?
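
If you are not sure, something like the following should report the
versions in use (assuming the tools are on your PATH):

  icc --version
  ifort --version
  mpiexec --version
  nvcc --version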

It works fine for me with Intel 11.1.069 and MPICH2 1.4. Also, does the CPU
version work fine with your Intel and MPICH2 builds?

Please, please, please make sure you isolate errors to the GPU code before
reporting them, i.e. ALWAYS test the CPU code thoroughly first.
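
For example, a minimal CPU-only check, assuming you built pmemd.MPI with
the same compilers and MPI, would be to rerun your exact job with the CPU
binary (the output names here are just placeholders):

  mpiexec -np 2 $AMBERHOME/bin/pmemd.MPI -i md.in -p bm.prmtop -c npt.rst \
    -o md_cpu.out -r md_cpu.rst -x md_cpu.mdcrd

If that also crashes, the problem is in your compiler/MPI installation
rather than in the GPU code.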

I assume the serial GPU code works fine with Intel? Also note that you will
see little to no performance improvement from the Intel compilers over GNU
for the GPU code.
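
E.g., assuming a serial pmemd.cuda binary was built alongside the MPI
version, a quick single-GPU check with the same inputs would be (again,
output names are placeholders):

  $AMBERHOME/bin/pmemd.cuda -i md.in -p bm.prmtop -c npt.rst \
    -o md_gpu1.out -r md_gpu1.rst -x md_gpu1.mdcrd

If that runs but pmemd.cuda.MPI dies, the failure is isolated to the
parallel layer of the GPU build.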

All the best
Ross

> -----Original Message-----
> From: Albert [mailto:mailmd2011.gmail.com]
> Sent: Monday, May 07, 2012 10:19 PM
> To: AMBER Mailing List
> Subject: [AMBER] CUDA running error
>
> Hello,
>
> I compiled Amber 12 with MPICH2 + Intel ICC/IFORT, and I am trying to
> run the Amber 12 CUDA code with the command:
>
>
> mpiexec -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -i md.in -p bm.prmtop -c npt.rst -o md.out -r md.rst -x md.mdcrd &
>
>
> but it failed with the following output:
>
> --------md.out-----------
> -------------------------------------------------------
> Amber 12 SANDER 2012
> -------------------------------------------------------
>
> | PMEMD implementation of SANDER, Release 12
>
> | Run on 05/07/2012 at 22:26:15
>
> File Assignments:
> | MDIN: md.in
> | MDOUT: md.out
> | INPCRD: npt.rst
> | PARM: bm.prmtop
> | RESTRT: md.rst
> | REFC: refc
> | MDVEL: mdvel
> | MDEN: mden
> | MDCRD: md.mdcrd
> | MDINFO: mdinfo
> |LOGFILE: logfile
>
>
> Here is the input file:
>
> production dynamics
> &cntrl
> imin=0, irest=1, ntx=5,
> nstlim=10000000, dt=0.002,
> ntc=2, ntf=2,
> cut=10.0, ntb=2, ntp=1, taup=2.0,
> ntpr=5000, ntwx=5000, ntwr=50000,
> ntt=3, gamma_ln=2.0,
> temp0=300.0,
> /
>
>
> |--------------------- INFORMATION ----------------------
> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> | Version 12.0
> |
> | 03/19/2012
> |
> | Implementation by:
> | Ross C. Walker (SDSC)
> | Scott Le Grand (nVIDIA)
> | Duncan Poole (nVIDIA)
> |
> | CAUTION: The CUDA code is currently experimental.
> | You use it at your own risk. Be sure to
> | check ALL results carefully.
> |
> | Precision model in use:
> | [SPDP] - Hybrid Single/Double Precision (Default).
> |
> |--------------------------------------------------------
>
> |------------------- GPU DEVICE INFO --------------------
> |
> | Task ID: 0
> | CUDA Capable Devices Detected: 4
> | CUDA Device ID in use: 0
> | CUDA Device Name: GeForce GTX 590
> | CUDA Device Global Mem Size: 1535 MB
> | CUDA Device Num Multiprocessors: 0
> | CUDA Device Core Freq: 1.22 GHz
> |
> |
> | Task ID: 1
> | CUDA Capable Devices Detected: 4
> | CUDA Device ID in use: 1
> | CUDA Device Name: GeForce GTX 590
> | CUDA Device Global Mem Size: 1535 MB
> | CUDA Device Num Multiprocessors: 0
> | CUDA Device Core Freq: 1.22 GHz
> |
> |--------------------------------------------------------
>
>
> | Conditional Compilation Defines Used:
> | DIRFRC_COMTRANS
> | DIRFRC_EFS
> | DIRFRC_NOVEC
> | MPI
> | PUBFFT
> | FFTLOADBAL_2PROC
> | BINTRAJ
> | MKL
> | CUDA
>
> | Largest sphere to fit in unit cell has radius = 33.920
>
> | New format PARM file being parsed.
> | Version = 1.000 Date = 05/02/12 Time = 13:49:08
>
> | Note: 1-4 EEL scale factors are being read from the topology file.
>
> | Note: 1-4 VDW scale factors are being read from the topology file.
> | Duplicated 0 dihedrals
>
> | Duplicated 0 dihedrals
>
> --------------------------------------------------------------------------------
> 1. RESOURCE USE:
> --------------------------------------------------------------------------------
>
> getting new box info from bottom of inpcrd
>
> NATOM = 36356 NTYPES = 19 NBONH = 33899 MBONA = 2451
> NTHETH = 5199 MTHETA = 3321 NPHIH = 10329 MPHIA = 8468
> NHPARM = 0 NPARM = 0 NNB = 67990 NRES = 10898
> NBONA = 2451 NTHETA = 3321 NPHIA = 8468 NUMBND = 61
> NUMANG = 120 NPTRA = 71 NATYP = 45 NPHB = 1
> IFBOX = 1 NMXRS = 24 IFCAP = 0 NEXTRA = 0
> NCOPY = 0
>
> | Coordinate Index Table dimensions: 12 12 11
> | Direct force subcell size = 6.0700 6.1089 6.1673
>
> BOX TYPE: RECTILINEAR
>
> --------------------------------------------------------------------------------
> 2. CONTROL DATA FOR THE RUN
> --------------------------------------------------------------------------------
>
> default_name
>
> General flags:
> imin = 0, nmropt = 0
>
> Nature and format of input:
> ntx = 5, irest = 1, ntrx = 1
>
> Nature and format of output:
> ntxo = 1, ntpr = 5000, ntrx = 1, ntwr = 50000
> iwrap = 0, ntwx = 5000, ntwv = 0, ntwe = 0
> ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0
>
> Potential function:
> ntf = 2, ntb = 2, igb = 0, nsnb = 25
> ipol = 0, gbsa = 0, iesp = 0
> dielc = 1.00000, cut = 10.00000, intdiel = 1.00000
>
> Frozen or restrained atoms:
> ibelly = 0, ntr = 0
>
> Molecular dynamics:
> nstlim = 10000000, nscm = 1000, nrespa = 1
> t = 0.00000, dt = 0.00200, vlimit = -1.00000
>
> Langevin dynamics temperature regulation:
> ig = 71277
> temp0 = 300.00000, tempi = 0.00000, gamma_ln= 2.00000
>
> Pressure regulation:
> ntp = 1
> pres0 = 1.00000, comp = 44.60000, taup = 2.00000
>
> SHAKE:
> ntc = 2, jfastw = 0
> tol = 0.00001
>
> | Intermolecular bonds treatment:
> | no_intermolecular_bonds = 1
>
> | Energy averages sample interval:
> | ene_avg_sampling = 5000
>
> Ewald parameters:
> verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
> vdwmeth = 1, eedmeth = 1, netfrc = 1
> Box X = 72.839 Box Y = 73.307 Box Z = 67.840
> Alpha = 90.000 Beta = 90.000 Gamma = 90.000
> NFFT1 = 80 NFFT2 = 80 NFFT3 = 64
> Cutoff= 10.000 Tol =0.100E-04
> Ewald Coefficient = 0.27511
> Interpolation order = 4
>
> | PMEMD ewald parallel performance parameters:
> | block_fft = 0
> | fft_blk_y_divisor = 2
> | excl_recip = 0
> | excl_master = 0
> | atm_redist_freq = 320
>
> --------------------------------------------------------------------------------
> 3. ATOMIC COORDINATES AND VELOCITIES
> --------------------------------------------------------------------------------
>
> default_name
> begin time read from input coords = 1300.000 ps
>
>
> Number of triangulated 3-point waters found: 10538
>
> Sum of charges from parm topology file = -0.00000015
> Forcing neutrality...
>
> --------------logfile--------------------------------
> FFT slabs assigned to 1 tasks
> Maximum of 64 xy slabs per task
> Maximum of 80 zx slabs per task
> Count of FFT xy slabs assigned to each task:
> 0 64
> Count of FFT xz slabs assigned to each task:
> 0 80
>
>
> ----------- terminal log -----------
> Image              PC                Routine    Line     Source
> pmemd.cuda.MPI     000000000057E4BD  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     000000000057EB62  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     0000000000555DF5  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     000000000051D5F2  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004F901E  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004057FC  Unknown    Unknown  Unknown
> libc.so.6          00002B98685A8BFD  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004056F9  Unknown    Unknown  Unknown
> forrtl: severe (71): integer divide by zero
> Image              PC                Routine    Line     Source
> pmemd.cuda.MPI     000000000057E4BD  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     000000000057EB62  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     0000000000555DF5  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     000000000051D5F2  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004F901E  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004057FC  Unknown    Unknown  Unknown
> libc.so.6          00002AF6CA4B0BFD  Unknown    Unknown  Unknown
> pmemd.cuda.MPI     00000000004056F9  Unknown    Unknown  Unknown
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue May 08 2012 - 10:00:05 PDT