I performed some tests last night.
All were NPT simulations. The test system was an AMD 1090T at 3.9 GHz with a GTX590 (Asus) and a 3GB GTX580 (Palit), running SUSE 11.3.
JAC:
GTX590, 1 GPU = 32.61 ns/day
GTX590, 2 GPUs = 42.19 ns/day
GTX580, 1 GPU = 40.73 ns/day
GTX580 plus GTX590 = 50.21 ns/day
Factor IX:
GTX590, 1 GPU = 9.54 ns/day
GTX590, 2 GPUs = 12.24 ns/day
GTX580, 1 GPU = 11.72 ns/day
GTX580 plus GTX590 = 14.69 ns/day
Cellulose:
GTX580 (3GB) = 2.67 ns/day
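To make the multi-GPU scaling easier to read, here is a rough Python sketch that expresses each JAC and Factor IX result relative to a single GPU of the GTX590. The ns/day values are copied from the list above; the dictionary layout and the `speedup` helper are only illustrative names, not part of AMBER.

```python
# ns/day figures copied from the benchmark list above.
results = {
    "JAC": {
        "GTX590 1 GPU": 32.61,
        "GTX590 2 GPU": 42.19,
        "GTX580 1 GPU": 40.73,
        "GTX580 + GTX590": 50.21,
    },
    "Factor IX": {
        "GTX590 1 GPU": 9.54,
        "GTX590 2 GPU": 12.24,
        "GTX580 1 GPU": 11.72,
        "GTX580 + GTX590": 14.69,
    },
}


def speedup(bench, config, baseline="GTX590 1 GPU"):
    """Throughput of `config` divided by the baseline throughput."""
    return results[bench][config] / results[bench][baseline]


for bench, runs in results.items():
    for config in runs:
        print(f"{bench:10s} {config:16s} {speedup(bench, config):.2f}x")
```

On these numbers, going from one to two GPUs on the GTX590 gives roughly a 1.3x gain for both JAC and Factor IX, so the second GPU is far from doubling throughput.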
Regards,
Filip
P.S. My 1.5GB memory issue is still not solved. I will reduce the water buffer from 12 Å to 10 Å. Not ideal, but it seems to be the only way for now. I hope it works.
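For reference, a small Python sketch of the arithmetic behind this memory puzzle. It converts the "KB of GPU memory in use" values from the output quoted further down in this thread into MB and compares them with the nominal 1.5GB per GPU on the GTX590. The 1536 MB figure is an assumption (1.5 x 1024), and the reported value may well not cover everything that lives on the card (display usage, CUDA context, temporary allocations), so this only shows nominal headroom, not a diagnosis.

```python
def kb_to_mb(kb):
    """Convert kilobytes to megabytes using 1 MB = 1024 KB."""
    return kb / 1024.0


# "KB of GPU memory in use" as printed in the quoted pmemd.cuda output.
reported_kb = {"NPT": 882413, "restrained NVT": 1006146}

# Assumption: 1.5 GB per GPU on the GTX590, taken as 1536 MB.
card_mb = 1536

for run, kb in reported_kb.items():
    used_mb = kb_to_mb(kb)
    print(f"{run}: {used_mb:.0f} MB reported, "
          f"{card_mb - used_mb:.0f} MB nominal headroom")
```

Both runs fit nominally, which is why the cudaMalloc failure is surprising and why something other than the reported allocation is probably using memory on that GPU.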
________________________________
From: Levi Pierce <levipierce.gmail.com>
To: AMBER Mailing List <amber.ambermd.org>
Sent: Saturday, August 20, 2011 10:20 AM
Subject: Re: [AMBER] Major GPU Update Released
Had a chance to sit down and test out the new patch. Wow! Very
impressive performance boost on a variety of systems I have been running
pmemd.cuda on. Great work!
On Fri, Aug 19, 2011 at 4:37 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
> Use a different GPU for display, I suspect.
> On Aug 19, 2011 4:09 PM, "filip fratev" <filipfratev.yahoo.com> wrote:
> > Hi Ross,
> > I compiled the new code and performed many tests, and the results are really impressive! I will post them later.
> >
> > However, I am in big trouble with my systems (116K atoms) and hope that you will be able to help me.
> > The problem is that with the new code I am not able to simulate these proteins (116K atoms) on the GTX590 (1.5GB per GPU) because of some memory issue/bug:
> > cudaMalloc GpuBuffer::Allocate failed out of memory
> >
> > With the older code I had no problems with the same input files and configuration. I tried both NPT and NVT, but the problem is the same.
> > Then I used the 3GB GTX580 and it works fine. From the output you can see that the requested memory is just 882MB:
> > For NPT:
> > | GPU memory information:
> > | KB of GPU memory in use: 882413
> > | KB of CPU memory in use: 104090
> >
> > and for restrained NVT:
> >
> > | GPU memory information:
> > | KB of GPU memory in use: 1006146
> > | KB of CPU memory in use: 99724
> > Thus I shouldn’t have any problem.
> >
> > What could be the issue and how can I solve it?
> >
> > Regards,
> > Filip
> >
> > Below is the output file (my NPT density.out) and heat.in:
> >
> > -------------------------------------------------------
> > Amber 11 SANDER 2010
> > -------------------------------------------------------
> >
> > | PMEMD implementation of SANDER, Release 11
> >
> > | Run on 08/20/2011 at 01:42:20
> >
> > [-O]verwriting output
> >
> > File Assignments:
> > | MDIN: densityF.in
> > | MDOUT: 0densitytest580Karti.out
> > | INPCRD: heattest.rst
> > | PARM: MyosinWT.prmtop
> > | RESTRT: density1test.rst
> > | REFC: heattest.rst
> > | MDVEL: mdvel
> > | MDEN: mden
> > | MDCRD: density1test.mdcrd
> > | MDINFO: mdinfo
> >
> >
> > Here is the input file:
> >
> > Ligand9 density
> >  &cntrl
> >   imin=0, irest=1, ntx=5,
> >   nstlim=5000, dt=0.002,
> >   ntc=2, ntf=2, ig=-1, iwrap=1,
> >   cut=8.0, ntb=2, ntp=1, taup=1.0,
> >   ntpr=5000, ntwx=5000, ntwr=10000,
> >   ntt=3, gamma_ln=2.0,
> >   temp0=300.0,
> >  /
> >
> > Note: ig = -1. Setting random seed based on wallclock time in microseconds.
> >
> > |--------------------- INFORMATION ----------------------
> > | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> > | Version 2.2
> > |
> > | 08/16/2011
> > |
> > |
> > | Implementation by:
> > | Ross C. Walker (SDSC)
> > | Scott Le Grand (nVIDIA)
> > | Duncan Poole (nVIDIA)
> > |
> > | CAUTION: The CUDA code is currently experimental.
> > | You use it at your own risk. Be sure to
> > | check ALL results carefully.
> > |
> > | Precision model in use:
> > | [SPDP] - Hybrid Single/Double Precision (Default).
> > |
> > |--------------------------------------------------------
> >
> > |------------------- GPU DEVICE INFO --------------------
> > |
> > | CUDA Capable Devices Detected: 1
> > | CUDA Device ID in use: 0
> > | CUDA Device Name: GeForce GTX 580
> > | CUDA Device Global Mem Size: 3071 MB
> > | CUDA Device Num Multiprocessors: 16
> > | CUDA Device Core Freq: 1.57 GHz
> > |
> > |--------------------------------------------------------
> >
> >
> > | Conditional Compilation Defines Used:
> > | DIRFRC_COMTRANS
> > | DIRFRC_EFS
> > | DIRFRC_NOVEC
> > | PUBFFT
> > | FFTLOADBAL_2PROC
> > | BINTRAJ
> > | CUDA
> >
> > | Largest sphere to fit in unit cell has radius = 48.492
> >
> > | New format PARM file being parsed.
> > | Version = 1.000 Date = 05/27/11 Time = 11:50:53
> >
> > | Note: 1-4 EEL scale factors were NOT found in the topology file.
> > | Using default value of 1.2.
> >
> > | Note: 1-4 VDW scale factors were NOT found in the topology file.
> > | Using default value of 2.0.
> > | Duplicated 0 dihedrals
> >
> > | Duplicated 0 dihedrals
> >
> >
> > --------------------------------------------------------------------------------
> > 1. RESOURCE USE:
> > --------------------------------------------------------------------------------
> >
> > getting new box info from bottom of inpcrd
> >
> > NATOM = 116271 NTYPES = 21 NBONH = 109977 MBONA = 6423
> > NTHETH = 14190 MTHETA = 8659 NPHIH = 27033 MPHIA = 21543
> > NHPARM = 0 NPARM = 0 NNB = 207403 NRES = 35368
> > NBONA = 6423 NTHETA = 8659 NPHIA = 21543 NUMBND = 59
> > NUMANG = 124 NPTRA = 64 NATYP = 40 NPHB = 1
> > IFBOX = 2 NMXRS = 43 IFCAP = 0 NEXTRA = 0
> > NCOPY = 0
> >
> > | Coordinate Index Table dimensions: 23 23 23
> > | Direct force subcell size = 5.1644 5.1644 5.1644
> >
> > BOX TYPE: TRUNCATED OCTAHEDRON
> >
> >
> > --------------------------------------------------------------------------------
> > 2. CONTROL DATA FOR THE RUN
> > --------------------------------------------------------------------------------
> >
> > General flags:
> > imin = 0, nmropt = 0
> >
> > Nature and format of input:
> > ntx = 5, irest = 1, ntrx = 1
> >
> > Nature and format of output:
> > ntxo = 1, ntpr = 5000, ntrx = 1, ntwr = 10000
> > iwrap = 1, ntwx = 5000, ntwv = 0, ntwe = 0
> > ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0
> >
> > Potential function:
> > ntf = 2, ntb = 2, igb = 0, nsnb = 25
> > ipol = 0, gbsa = 0, iesp = 0
> > dielc = 1.00000, cut = 8.00000, intdiel = 1.00000
> >
> > Frozen or restrained atoms:
> > ibelly = 0, ntr = 0
> >
> > Molecular dynamics:
> > nstlim = 5000, nscm = 1000, nrespa = 1
> > t = 0.00000, dt = 0.00200, vlimit = -1.00000
> >
> > Langevin dynamics temperature regulation:
> > ig = 974683
> > temp0 = 300.00000, tempi = 0.00000, gamma_ln= 2.00000
> >
> > Pressure regulation:
> > ntp = 1
> > pres0 = 1.00000, comp = 44.60000, taup = 1.00000
> >
> > SHAKE:
> > ntc = 2, jfastw = 0
> > tol = 0.00001
> >
> > | Intermolecular bonds treatment:
> > | no_intermolecular_bonds = 1
> >
> > | Energy averages sample interval:
> > | ene_avg_sampling = 5000
> >
> > Ewald parameters:
> > verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
> > vdwmeth = 1, eedmeth = 1, netfrc = 1
> > Box X = 118.781 Box Y = 118.781 Box Z = 118.781
> > Alpha = 109.471 Beta = 109.471 Gamma = 109.471
> > NFFT1 = 128 NFFT2 = 128 NFFT3 = 128
> > Cutoff= 8.000 Tol =0.100E-04
> > Ewald Coefficient = 0.34864
> > Interpolation order = 4
> >
> >
> > --------------------------------------------------------------------------------
> > 3. ATOMIC COORDINATES AND VELOCITIES
> > --------------------------------------------------------------------------------
> >
> > begin time read from input coords = 10.000 ps
> >
> >
> > Number of triangulated 3-point waters found: 34583
> >
> > Sum of charges from parm topology file = -0.00000040
> > Forcing neutrality...
> >
> > | Dynamic Memory, Types Used:
> > | Reals 3524690
> > | Integers 3800219
> >
> > | Nonbonded Pairs Initial Allocation: 19420163
> >
> > | GPU memory information:
> > | KB of GPU memory in use: 882413
> > | KB of CPU memory in use: 104090
> >
> >
> > --------------------------------------------------------------------------------
> > 4. RESULTS
> > --------------------------------------------------------------------------------
> >
> > ---------------------------------------------------
> > APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
> > using 5000.0 points per unit in tabled values
> > TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
> > | CHECK switch(x): max rel err = 0.2738E-14 at 2.422500
> > | CHECK d/dx switch(x): max rel err = 0.8332E-11 at 2.782960
> > ---------------------------------------------------
> > |---------------------------------------------------
> > | APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
> > | with 50.0 points per unit in tabled values
> > | Relative Error Limit not exceeded for r .gt. 2.47
> > | APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
> > | with 50.0 points per unit in tabled values
> > | Relative Error Limit not exceeded for r .gt. 2.89
> > |---------------------------------------------------
> > wrapping first mol.: 38.333512154956900 54.211771142609109 93.897534410964738
> > wrapping first mol.: 38.333512154956900 54.211771142609109 93.897534410964738
> >
> > NSTEP = 5000 TIME(PS) = 20.000 TEMP(K) = 300.01 PRESS = -23.7
> > Etot = -281399.8069 EKtot = 71193.9609 EPtot = -352593.7679
> > BOND = 2490.5718 ANGLE = 6429.0655 DIHED = 8582.5720
> > 1-4 NB = 2942.3115 1-4 EEL = 32655.1879 VDWAALS = 42104.9713
> > EELEC = -447798.4479 EHBOND = 0.0000 RESTRAINT = 0.0000
> > EKCMT = 30939.5460 VIRIAL = 31538.3575 VOLUME = 1170788.5879
> > Density = 1.0106
> > ------------------------------------------------------------------------------
> >
> >
> > A V E R A G E S O V E R 1 S T E P S
> >
> >
> > NSTEP = 5000 TIME(PS) = 20.000 TEMP(K) = 300.01 PRESS = -23.7
> > Etot = -281399.8069 EKtot = 71193.9609 EPtot = -352593.7679
> > BOND = 2490.5718 ANGLE = 6429.0655 DIHED = 8582.5720
> > 1-4 NB = 2942.3115 1-4 EEL = 32655.1879 VDWAALS = 42104.9713
> > EELEC = -447798.4479 EHBOND = 0.0000 RESTRAINT = 0.0000
> > EKCMT = 30939.5460 VIRIAL = 31538.3575 VOLUME = 1170788.5879
> > Density = 1.0106
> > ------------------------------------------------------------------------------
> >
> >
> > R M S F L U C T U A T I O N S
> >
> >
> > NSTEP = 5000 TIME(PS) = 20.000 TEMP(K) = 0.00 PRESS = 0.0
> > Etot = 0.0000 EKtot = 0.0000 EPtot = 0.0000
> > BOND = 0.0000 ANGLE = 0.0000 DIHED = 0.0000
> > 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = 0.0000
> > EELEC = 0.0000 EHBOND = 0.0000 RESTRAINT = 0.0000
> > ------------------------------------------------------------------------------
> >
> >
> > --------------------------------------------------------------------------------
> > 5. TIMINGS
> > --------------------------------------------------------------------------------
> >
> > | NonSetup CPU Time in Major Routines:
> > |
> > | Routine Sec %
> > | ------------------------------
> > | Nonbond 97.05 92.22
> > | Bond 0.00 0.00
> > | Angle 0.00 0.00
> > | Dihedral 0.00 0.00
> > | Shake 2.47 2.34
> > | RunMD 5.71 5.43
> > | Other 0.00 0.00
> > | ------------------------------
> > | Total 105.24
> >
> > | PME Nonbond Pairlist CPU Time:
> > |
> > | Routine Sec %
> > | ---------------------------------
> > | Set Up Cit 0.00 0.00
> > | Build List 0.00 0.00
> > | ---------------------------------
> > | Total 0.00 0.00
> >
> > | PME Direct Force CPU Time:
> > |
> > | Routine Sec %
> > | ---------------------------------
> > | NonBonded Calc 0.00 0.00
> > | Exclude Masked 0.00 0.00
> > | Other 0.00 0.00
> > | ---------------------------------
> > | Total 0.00 0.00
> >
> > | PME Reciprocal Force CPU Time:
> > |
> > | Routine Sec %
> > | ---------------------------------
> > | 1D bspline 0.00 0.00
> > | Grid Charges 0.00 0.00
> > | Scalar Sum 0.00 0.00
> > | Gradient Sum 0.00 0.00
> > | FFT 0.00 0.00
> > | ---------------------------------
> > | Total 0.00 0.00
> >
> > | Final Performance Info:
> > | -----------------------------------------------------
> > | Average timings for last 0 steps:
> > | Elapsed(s) = 0.00 Per Step(ms) = +Infinity
> > | ns/day = 0.00 seconds/ns = +Infinity
> > |
> > | Average timings for all steps:
> > | Elapsed(s) = 105.26 Per Step(ms) = 21.05
> > | ns/day = 8.21 seconds/ns = 10525.53
> > | -----------------------------------------------------
> >
> > | Setup CPU time: 0.90 seconds
> > | NonSetup CPU time: 105.24 seconds
> > | Total CPU time: 106.13 seconds 0.03 hours
> >
> > | Setup wall time: 1 seconds
> > | NonSetup wall time: 105 seconds
> > | Total wall time: 106 seconds 0.03 hours
> >
> >
> > heat Ligand9
> > &cntrl
> > irest=0, ntx=1,
> > nstlim=5000, dt=0.002,
> > ntc=2,ntf=2, iwrap=1,
> > cut=8.0, ntb=1, ig=-1,
> > ntpr=1000, ntwx=1000, ntwr=10000,
> > ntt=3, gamma_ln=2.0,
> > tempi=0.0, temp0=300.0,
> > ioutfm=1, ntr=1,
> > ntr=1,
> > /
> > Group input for restrained atoms
> > 2.0
> > RES 1 790
> > END
> > END
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
--
Levi C.T. Pierce, UCSD Graduate Student
McCammon Laboratory
http://mccammon.ucsd.edu/
w: 858-534-2916
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Aug 20 2011 - 07:30:02 PDT