Re: [AMBER] using two GPUs

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 25 Apr 2012 14:44:31 -0400

How long were you trying to run? My suggestion is to run shorter
simulations, printing each step (for starters), and see if you can narrow
down the problem. In my experience, an infinite hang is impossible for even
the most knowledgeable people to debug without a reproducible case.
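For example, a cut-down debugging input might look something like this (a
sketch only, untested; it reuses the rest of your &cntrl as you posted it,
and the exact values of nstlim, ntpr, ntwx, etc. are just illustrative --
the point is to shorten the run and print every step):

 Debug run: short, print every step
  &cntrl
   imin=0, irest=1, ntx=5, ntxo=1, iwrap=1,
   ntt=2, tempi=300.0, temp0=300.0, tautp=2.0,
   ntp=2, ntb=2, taup=2.0,
   ntc=2, ntf=2,
   nstlim=1000, dt=0.001,
   ntpr=1, ntwx=1, ntwe=1, ntwr=500,
   ntr=0, cut=9.0,
  /

If the hang always happens at the same step, that is a much easier target
than one that moves around from run to run.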

HTH,
Jason

On Wed, Apr 25, 2012 at 2:41 PM, Vijay Manickam Achari <vjrajamany.yahoo.com
> wrote:

> Dear Jason,
>
> Thank you so much for your reply.
>
> This time I tried your suggestion and it worked, BUT the run just hangs
> after a few steps (100). Here is the output file that I got.
>
>
> ***********************************************************************************
>
>
> -------------------------------------------------------
> Amber 12 SANDER 2012
> -------------------------------------------------------
>
> | PMEMD implementation of SANDER, Release 12
>
> | Run on 04/26/2012 at 02:35:03
>
> [-O]verwriting output
>
> File Assignments:
> | MDIN: MD-betaMalto-THERMO.in
> | MDOUT: malto-THERMO-RT-MD00-run1000.out
> | INPCRD: betaMalto-THERMO-MD03-run0100.rst.1
> | PARM: malto-THERMO.top
> | RESTRT: malto-THERMO-RT-MD01-run0100.rst
> | REFC: refc
> | MDVEL: mdvel
> | MDEN: mden
> | MDCRD: malto-THERMO-RT-MD00-run1000.traj
> | MDINFO: mdinfo
> |LOGFILE: logfile
>
>
>
> Here is the input file:
>
> Dynamic Simulation with Constant Pressure
> &cntrl
> imin=0,
> irest=1, ntx=5,
> ntxo=1, iwrap=1, nscm=2000,
> ntt=2,
> tempi = 300.0, temp0=300.0, tautp=2.0,
> ntp=2, ntb=2, taup=2.0,
> ntc=2, ntf=2,
> nstlim=100000, dt=0.001,
> ntwe=100, ntwx=100, ntpr=100, ntwr=-50000,
> ntr=0,
> cut = 9
> /
>
>
>
>
>
>
> |--------------------- INFORMATION ----------------------
> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> | Version 12.0
> |
> | 03/19/2012
> |
> | Implementation by:
> | Ross C. Walker (SDSC)
> | Scott Le Grand (nVIDIA)
> | Duncan Poole (nVIDIA)
> |
> | CAUTION: The CUDA code is currently experimental.
> | You use it at your own risk. Be sure to
> | check ALL results carefully.
> |
> | Precision model in use:
> | [SPDP] - Hybrid Single/Double Precision (Default).
> |
> |--------------------------------------------------------
>
> |------------------- GPU DEVICE INFO --------------------
> |
> | Task ID: 0
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 0
> | CUDA Device Name: Tesla C2075
> | CUDA Device Global Mem Size: 6143 MB
> | CUDA Device Num Multiprocessors: 14
> | CUDA Device Core Freq: 1.15 GHz
> |
> |
> | Task ID: 1
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 1
> | CUDA Device Name: Tesla C2075
> | CUDA Device Global Mem Size: 6143 MB
> | CUDA Device Num Multiprocessors: 14
> | CUDA Device Core Freq: 1.15 GHz
> |
> |--------------------------------------------------------
>
>
> | Conditional Compilation Defines Used:
> | DIRFRC_COMTRANS
> | DIRFRC_EFS
> | DIRFRC_NOVEC
> | MPI
> | PUBFFT
> | FFTLOADBAL_2PROC
> | BINTRAJ
> | CUDA
>
> | Largest sphere to fit in unit cell has radius = 23.378
>
> | New format PARM file being parsed.
> | Version = 1.000 Date = 07/07/08 Time = 10:50:18
>
> | Note: 1-4 EEL scale factors were NOT found in the topology file.
> | Using default value of 1.2.
>
> | Note: 1-4 VDW scale factors were NOT found in the topology file.
> | Using default value of 2.0.
> | Duplicated 0 dihedrals
>
> | Duplicated 0 dihedrals
>
>
> --------------------------------------------------------------------------------
> 1. RESOURCE USE:
>
> --------------------------------------------------------------------------------
>
> getting new box info from bottom of inpcrd
>
> NATOM = 20736 NTYPES = 7 NBONH = 11776 MBONA = 9216
> NTHETH = 27648 MTHETA = 12032 NPHIH = 45312 MPHIA = 21248
> NHPARM = 0 NPARM = 0 NNB = 119552 NRES = 256
> NBONA = 9216 NTHETA = 12032 NPHIA = 21248 NUMBND = 7
> NUMANG = 14 NPTRA = 20 NATYP = 7 NPHB = 0
> IFBOX = 1 NMXRS = 81 IFCAP = 0 NEXTRA = 0
> NCOPY = 0
>
> | Coordinate Index Table dimensions: 13 8 9
> | Direct force subcell size = 5.9091 5.8446 5.8286
>
> BOX TYPE: RECTILINEAR
>
>
> --------------------------------------------------------------------------------
> 2. CONTROL DATA FOR THE RUN
>
> --------------------------------------------------------------------------------
>
>
>
>
> General flags:
> imin = 0, nmropt = 0
>
> Nature and format of input:
> ntx = 5, irest = 1, ntrx = 1
>
> Nature and format of output:
> ntxo = 1, ntpr = 100, ntrx = 1, ntwr = -50000
> iwrap = 1, ntwx = 100, ntwv = 0, ntwe = 100
> ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0
>
> Potential function:
> ntf = 2, ntb = 2, igb = 0, nsnb = 25
> ipol = 0, gbsa = 0, iesp = 0
> dielc = 1.00000, cut = 9.00000, intdiel = 1.00000
>
> Frozen or restrained atoms:
> ibelly = 0, ntr = 0
>
> Molecular dynamics:
> nstlim = 100000, nscm = 2000, nrespa = 1
> t = 0.00000, dt = 0.00100, vlimit = -1.00000
>
> Anderson (strong collision) temperature regulation:
> ig = 71277, vrand = 1000
> temp0 = 300.00000, tempi = 300.00000
>
> Pressure regulation:
> ntp = 2
> pres0 = 1.00000, comp = 44.60000, taup = 2.00000
>
> SHAKE:
> ntc = 2, jfastw = 0
> tol = 0.00001
>
> | Intermolecular bonds treatment:
> | no_intermolecular_bonds = 1
>
> | Energy averages sample interval:
> | ene_avg_sampling = 100
>
> Ewald parameters:
> verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
> vdwmeth = 1, eedmeth = 1, netfrc = 1
> Box X = 76.818 Box Y = 46.757 Box Z = 52.457
> Alpha = 90.000 Beta = 90.000 Gamma = 90.000
> NFFT1 = 80 NFFT2 = 48 NFFT3 = 56
> Cutoff= 9.000 Tol =0.100E-04
> Ewald Coefficient = 0.30768
> Interpolation order = 4
>
> | PMEMD ewald parallel performance parameters:
> | block_fft = 0
> | fft_blk_y_divisor = 2
> | excl_recip = 0
> | excl_master = 0
> | atm_redist_freq = 320
>
>
> --------------------------------------------------------------------------------
> 3. ATOMIC COORDINATES AND VELOCITIES
>
> --------------------------------------------------------------------------------
>
> trajectory generated by ptraj
>
> begin time read from input coords = 0.000 ps
>
>
> Number of triangulated 3-point waters found: 0
>
> Sum of charges from parm topology file = 0.00000000
> Forcing neutrality...
>
> | Dynamic Memory, Types Used:
> | Reals 1291450
> | Integers 3021916
>
> | Nonbonded Pairs Initial Allocation: 3136060
>
> | GPU memory information:
> | KB of GPU memory in use: 146374
> | KB of CPU memory in use: 30984
>
> | Running AMBER/MPI version on 2 nodes
>
>
>
> --------------------------------------------------------------------------------
> 4. RESULTS
>
> --------------------------------------------------------------------------------
>
> ---------------------------------------------------
> APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
> using 5000.0 points per unit in tabled values
> TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
> | CHECK switch(x): max rel err = 0.2738E-14 at 2.422500
> | CHECK d/dx switch(x): max rel err = 0.8314E-11 at 2.736960
> ---------------------------------------------------
> |---------------------------------------------------
> | APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
> | with 50.0 points per unit in tabled values
> | Relative Error Limit not exceeded for r .gt. 2.39
> | APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
> | with 50.0 points per unit in tabled values
> | Relative Error Limit not exceeded for r .gt. 2.84
> |---------------------------------------------------
>
> NSTEP = 100 TIME(PS) = 0.100 TEMP(K) = 147.84 PRESS = -2761.2
> Etot = 43456.9441 EKtot = 7407.5049 EPtot = 36049.4392
> BOND = 2118.3308 ANGLE = 7884.5560 DIHED = 3309.5527
> 1-4 NB = 3447.1834 1-4 EEL = 95835.1134 VDWAALS = -13101.7433
> EELEC = -63443.5538 EHBOND = 0.0000 RESTRAINT = 0.0000
> EKCMT = 110.3519 VIRIAL = 11263.9500 VOLUME = 187084.9335
> Density = 1.1602
>
> ------------------------------------------------------------------------------
>
>
> ***************************************************************************************
>
> The run just hangs and there is no progress at all.
> Does the command need any other input?
>
> Regards
>
>
> Vijay Manickam Achari
> (Phd Student c/o Prof Rauzah Hashim)
> Chemistry Department,
> University of Malaya,
> Malaysia
> vjramana.gmail.com
>
>
> ________________________________
> From: Jason Swails <jason.swails.gmail.com>
> To: AMBER Mailing List <amber.ambermd.org>
> Sent: Thursday, 26 April 2012, 0:05
> Subject: Re: [AMBER] using two GPUs
>
> Hello,
>
> On Wed, Apr 25, 2012 at 1:34 AM, Vijay Manickam Achari <
> vjrajamany.yahoo.com
> > wrote:
>
> > Thank you for the kind reply.
> > I have tried to figure out, based on your info and other sources as
> > well, how to get the two GPUs to work.
> >
> > For the machinefile:
> > I checked the /dev folder and saw a list of NVIDIA card names: nvidia0,
> > nvidia1, nvidia2, nvidia3, nvidia4. I understand these names should be
> > listed in the machinefile. I commented out nvidia0, nvidia1, and nvidia2
> > since I only wanted to use two GPUs.
> >
>
> The names in the hostfile (or machinefile) are host names (you can get
> yours via "hostname"), not device files. However, machinefiles are really
> only necessary if you plan on going off-node. What a machinefile tells the
> MPI implementation is *where* on the network each thread should be
> launched.
>
> If you want to run everything locally on the same machine, every MPI
> implementation that I've ever used allows you to just say:
>
> mpirun -np 2 pmemd.cuda.MPI -O -i mdin ...etc.
>
> If you need to use the hostfile or machinefile, look at the mpirun manpage
> to see how your particular MPI reads them.
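>
> With OpenMPI, for instance, a single-node hostfile can be as simple as the
> sketch below (the file name "hosts.txt" is just an example, and the
> "slots=" syntax is OpenMPI-specific):
>
> # hosts.txt -- one line per host, with the number of MPI ranks it may run
> localhost slots=2
>
> and then something like:
>
> mpirun -np 2 --hostfile hosts.txt pmemd.cuda.MPI -O -i mdin ...etc.
>
> MPICH/MVAPICH machinefiles use a similar but not identical format
> (typically "hostname:2"), which is why the manpage for your own mpirun is
> the thing to check.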
>
> HTH,
> Jason
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Apr 25 2012 - 12:00:05 PDT