Re: [AMBER] cuda pmemd failed

From: Jason Swails <jason.swails.gmail.com>
Date: Thu, 18 Oct 2012 09:39:17 -0400

Please wait for a response rather than continuing to email the list.
Knowledgeable people will respond as they have time.

Also keep in mind that people around the world live in different time zones
and have different schedules, so it's not unusual to go a couple of days
without hearing a response.

In case you were getting weird bounce-back messages, your emails *are*
getting through.

All the best,
Jason

P.S. -- I think I recall reading that CUDA 5.0 does not work with Amber
yet. So try using 4.2 for the time being. But I'm not certain.
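P.P.S. -- A quick way to confirm which toolkit your build actually picked
up (a sketch; paths are site-specific):

  nvcc --version    # reports the CUDA toolkit release
  echo $CUDA_HOME   # the path Amber's configure script uses

If it reports 5.0, rebuild as before but with CUDA_HOME pointing at the
4.2 install. It is also worth ruling out the MPI layer by first running
the serial engine on a single GPU, something like (file names taken from
your directory listing):

  ./pmemd.cuda_SPDP -O -i md4ns.in \
      -p STAT3_SH2_1bg1_dimer_noPO4.prmtop \
      -c STAT3_SH2_1bg1_dimer_noPO4_md200.rst7 \
      -o serial_test.out -r serial_test.rst7 -x serial_test.nc

If that also stalls after "Forcing neutrality...", the CUDA build itself,
rather than MVAPICH2, is the likely culprit.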

On Thu, Oct 18, 2012 at 9:00 AM, Yubo Fan <pengpengtadie.gmail.com> wrote:

> Hello,
>
> The cluster on which I run pmemd.cuda.MPI has been upgraded to CUDA 5.0.
> Both Amber 11 and Amber 12 have been recompiled with the Intel and CUDA
> compilers, against OpenMPI or MVAPICH2-1.8.1 (both built with Intel).
> pmemd.MPI works well for my jobs, but both pmemd.cuda and pmemd.cuda.MPI
> have failed to produce any output since the upgrade. Some information is
> listed below.
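>
> For reference, the Amber 12 rebuild followed the standard procedure,
> roughly as below (a sketch: the CUDA_HOME path is illustrative, and the
> Amber 11 steps differ slightly):
>
> export CUDA_HOME=/usr/local/cuda-5.0   # wherever the 5.0 toolkit lives
> export MPI_HOME=/work/ddc3/yf4/program/mvapich2-1.8.1-intel
> export PATH=$MPI_HOME/bin:$PATH
> cd $AMBERHOME
> ./configure -cuda intel && make install        # builds pmemd.cuda
> ./configure -cuda -mpi intel && make install   # builds pmemd.cuda.MPI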
>
> List in my job directory:
> -rw-r--r-- 1 yf4 ddc3 1786 Oct 17 12:39 cuda2gpu.job
> -rw-r--r-- 1 yf4 ddc3 287 Oct 17 12:39 logfile
> -rw-r--r-- 1 yf4 ddc3 354 Jul 27 06:00 md4ns.in
> -rw-r--r-- 1 yf4 ddc3 1193 Oct 15 18:04 mdinfo
> -rwxr-xr-x 1 yf4 ddc3 5630016 Oct 17 12:37 pmemd.cuda_SPDP
> -rwxr-xr-x 1 yf4 ddc3 9033291 Oct 17 12:39 pmemd.cuda_SPDP.MPI
> -rwxr-xr-x 1 yf4 ddc3 4200370 Oct 17 12:37 pmemd.MPI
> -rwxr-xr-x 1 yf4 ddc3 11885653 Oct 17 12:37 sander.MPI
> -rw-r--r-- 1 yf4 ddc3 5640959 Oct 3 08:05 STAT3_SH2_1bg1_dimer_noPO4_md200.rst7
> -rw-r--r-- 1 yf4 ddc3 772 Oct 17 12:39 STAT3_SH2_1bg1_dimer_noPO4_md201.nc
> -rw-r--r-- 1 yf4 ddc3 8019 Oct 17 12:39 STAT3_SH2_1bg1_dimer_noPO4_md201.out
> -rw-r--r-- 1 yf4 ddc3 0 Oct 17 12:39 STAT3_SH2_1bg1_dimer_noPO4_md201.rst7
> -rw-r--r-- 1 yf4 ddc3 13662681 May 8 16:57 STAT3_SH2_1bg1_dimer_noPO4.prmtop
>
> The output file, STAT3_SH2_1bg1_dimer_noPO4_md201.out, contains:
> -------------------------------------------------------
> Amber 11 SANDER 2010
> -------------------------------------------------------
>
> | PMEMD implementation of SANDER, Release 11
>
> | Run on 10/17/2012 at 12:39:50
>
> [-O]verwriting output
>
> File Assignments:
> | MDIN: md4ns.in
> | MDOUT: STAT3_SH2_1bg1_dimer_noPO4_md201.out
> | INPCRD: STAT3_SH2_1bg1_dimer_noPO4_md200.rst7
> | PARM: STAT3_SH2_1bg1_dimer_noPO4.prmtop
> | RESTRT: STAT3_SH2_1bg1_dimer_noPO4_md201.rst7
> | REFC: refc
> | MDVEL: mdvel
> | MDEN: mden
> | MDCRD: STAT3_SH2_1bg1_dimer_noPO4_md201.nc
> | MDINFO: mdinfo
> |LOGFILE: logfile
>
>
> Here is the input file:
>
> STAT3 SH2 Dimer with unphosphorylated Tyr-705: 1-ns MD
> &cntrl
> imin = 0, irest = 1, ntx = 5,
> ig = -1,
> ntb = 2, pres0 = 1.0, ntp = 1,
> taup = 2.0,
> cut = 10, ntr = 0,
> ntc = 2, ntf = 2,
> tempi = 300.0, temp0 = 300.0,
> ntt = 3, gamma_ln = 1.0,
> nstlim = 2000000, dt = 0.002,
> ntpr = 500, ntwx = 500, ntwr = 100000,
> ioutfm = 1, iwrap = 1,
> /
>
>
> Note: ig = -1. Setting random seed based on wallclock time in microseconds
> and disabling the synchronization of random numbers between tasks
> to improve performance.
>
> |--------------------- INFORMATION ----------------------
> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> |                     Version 12.0
> |
> | 03/19/2012
> |
> | Implementation by:
> | Ross C. Walker (SDSC)
> | Scott Le Grand (nVIDIA)
> | Duncan Poole (nVIDIA)
> |
> | CAUTION: The CUDA code is currently experimental.
> | You use it at your own risk. Be sure to
> | check ALL results carefully.
> |
> | Precision model in use:
> | [SPDP] - Hybrid Single/Double Precision (Default).
> |
> |--------------------------------------------------------
>
> |------------------- GPU DEVICE INFO --------------------
> |
> | Task ID: 0
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 0
> | CUDA Device Name: Tesla M2050
> | CUDA Device Global Mem Size: 2687 MB
> | CUDA Device Num Multiprocessors: 14
> | CUDA Device Core Freq: 1.15 GHz
> |
> |
> | Task ID: 1
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 1
> | CUDA Device Name: Tesla M2050
> | CUDA Device Global Mem Size: 2687 MB
> | CUDA Device Num Multiprocessors: 14
> | CUDA Device Core Freq: 1.15 GHz
> |
> |--------------------------------------------------------
>
>
> | Conditional Compilation Defines Used:
> | DIRFRC_COMTRANS
> | DIRFRC_EFS
> | DIRFRC_NOVEC
> | MPI
> | PUBFFT
> | FFTLOADBAL_2PROC
> | BINTRAJ
> | CUDA
>
> | Largest sphere to fit in unit cell has radius = 40.866
>
> | New format PARM file being parsed.
> | Version = 1.000 Date = 05/03/12 Time = 10:32:59
>
> | Note: 1-4 EEL scale factors were NOT found in the topology file.
> | Using default value of 1.2.
>
> | Note: 1-4 VDW scale factors were NOT found in the topology file.
> | Using default value of 2.0.
> | Duplicated 0 dihedrals
>
> | Duplicated 0 dihedrals
>
>
> --------------------------------------------------------------------------------
> 1. RESOURCE USE:
>
> --------------------------------------------------------------------------------
>
> getting new box info from bottom of inpcrd
>
> NATOM = 77271 NTYPES = 16 NBONH = 73189 MBONA = 4198
> NTHETH = 8918 MTHETA = 5780 NPHIH = 17546 MPHIA = 13412
> NHPARM = 0 NPARM = 0 NNB = 136792 NRES = 23593
> NBONA = 4198 NTHETA = 5780 NPHIA = 13412 NUMBND = 44
> NUMANG = 92 NPTRA = 48 NATYP = 31 NPHB = 1
> IFBOX = 2 NMXRS = 24 IFCAP = 0 NEXTRA = 0
> NCOPY = 0
>
> | Coordinate Index Table dimensions: 14 14 14
> | Direct force subcell size = 7.1500 7.1500 7.1500
>
> BOX TYPE: TRUNCATED OCTAHEDRON
>
>
> --------------------------------------------------------------------------------
> 2. CONTROL DATA FOR THE RUN
>
> --------------------------------------------------------------------------------
>
>
>
> General flags:
> imin = 0, nmropt = 0
>
> Nature and format of input:
> ntx = 5, irest = 1, ntrx = 1
>
> Nature and format of output:
> ntxo = 1, ntpr = 500, ntrx = 1, ntwr = 100000
> iwrap = 1, ntwx = 500, ntwv = 0, ntwe = 0
> ioutfm = 1, ntwprt = 0, idecomp = 0, rbornstat= 0
>
> Potential function:
> ntf = 2, ntb = 2, igb = 0, nsnb = 25
> ipol = 0, gbsa = 0, iesp = 0
> dielc = 1.00000, cut = 10.00000, intdiel = 1.00000
>
> Frozen or restrained atoms:
> ibelly = 0, ntr = 0
>
> Molecular dynamics:
> nstlim = 2000000, nscm = 1000, nrespa = 1
> t = 0.00000, dt = 0.00200, vlimit = -1.00000
>
> Langevin dynamics temperature regulation:
> ig = 824075
> temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000
>
> Pressure regulation:
> ntp = 1
> pres0 = 1.00000, comp = 44.60000, taup = 2.00000
>
> SHAKE:
> ntc = 2, jfastw = 0
> tol = 0.00001
>
> | Intermolecular bonds treatment:
> | no_intermolecular_bonds = 1
>
> | Energy averages sample interval:
> | ene_avg_sampling = 500
>
> Ewald parameters:
> verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
> vdwmeth = 1, eedmeth = 1, netfrc = 1
> Box X = 100.101 Box Y = 100.101 Box Z = 100.101
> Alpha = 109.471 Beta = 109.471 Gamma = 109.471
> NFFT1 = 108 NFFT2 = 108 NFFT3 = 108
> Cutoff= 10.000 Tol =0.100E-04
> Ewald Coefficient = 0.27511
> Interpolation order = 4
>
> | PMEMD ewald parallel performance parameters:
> | block_fft = 0
> | fft_blk_y_divisor = 2
> | excl_recip = 0
> | excl_master = 0
> | atm_redist_freq = 320
>
>
> --------------------------------------------------------------------------------
> 3. ATOMIC COORDINATES AND VELOCITIES
>
> --------------------------------------------------------------------------------
>
>
> begin time read from input coords =800170.000 ps
>
>
> Number of triangulated 3-point waters found: 23091
>
> Sum of charges from parm topology file = -0.00001037
> Forcing neutrality...
>
> =======================================================
> There is no further output, and no frames are written to the nc
> trajectory, although the job is still shown as running by the queue
> system.
>
> The job file is listed below:
> #PBS -N Dimer_noPO4
> #PBS -l walltime=24:00:00
> #PBS -l nodes=1:ppn=12
> #PBS -o cuda2gpu.err
> #PBS -M yfan.tmhs.org
> #PBS -m abe
> #PBS -q graphics
> #PBS -V
>
> export AMBERHOME=/work/ddc3/yf4/program/amber12
> export MPI_HOME=/work/ddc3/yf4/program/mvapich2-1.8.1-intel
> export filename='STAT3_SH2_1bg1_dimer_noPO4'
>
> export PATH=$PATH:$MPI_HOME/bin:$AMBERHOME/bin
>
> cd $PBS_O_WORKDIR
>
> cp $AMBERHOME/bin/sander.MPI .
> cp $AMBERHOME/bin/pmemd.MPI .
> cp $AMBERHOME/bin/pmemd.cuda_SPDP .
> cp $AMBERHOME/bin/pmemd.cuda_SPDP.MPI .
>
> # Count down from 200 to the last segment whose restart file exists,
> # then run the next two segments on 2 GPUs.
> export n=200
> until [ -e ${filename}_md${n}.rst7 ]; do
>   let n="$n - 1"
> done
> let n="$n + 1"
> let nend="$n + 1"
>
> while [ $n -le $nend ]; do
>   let m="$n - 1"
>   mpiexec -np 2 ./pmemd.cuda_SPDP.MPI -O -i md4ns.in \
>     -p ${filename}.prmtop -c ${filename}_md${m}.rst7 \
>     -o ${filename}_md${n}.out -r ${filename}_md${n}.rst7 \
>     -x ${filename}_md${n}.nc
>   let n="$n + 1"
> done
> =======================================================
>
> Suggestions?
>
> Thanks,
> Yubo
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber