Re: [AMBER] Major GPU Update Released

From: filip fratev <filipfratev.yahoo.com>
Date: Sat, 20 Aug 2011 07:16:55 -0700 (PDT)

Hi Scott and Ross,

> For NPT:
> | GPU memory information:
> | KB of GPU memory in use:    882413
> | KB of CPU memory in use:    104090
>
>
> | GPU memory information:
> | KB of GPU memory in use:   1006146
> | KB of CPU memory in use:     99724

So, just to be clear: the values above are from the older code, and the new code uses about 2x more?
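
As a rough sanity check on the numbers quoted above, here is a minimal Python sketch (not part of pmemd.cuda) that converts the reported figures to MB and compares them with the card. It assumes 1 MB = 1024 KB, takes the GTX590's 1.5GB per GPU as 1536 MB, and treats the 2x factor as a hypothetical that still needs confirming. The names kb_to_mb, fits and CARD_MB are made up for illustration.

```python
# Values copied from the two "KB of GPU memory in use" lines quoted above.
reported_kb = {"NPT": 882413, "restrained NVT": 1006146}

CARD_MB = 1536  # assumption: 1.5 GB per GPU on the GTX590, taken as 1536 MB


def kb_to_mb(kb):
    """Convert kilobytes to megabytes, assuming 1 MB = 1024 KB."""
    return kb / 1024.0


def fits(kb, factor=1.0, card_mb=CARD_MB):
    """Return True if `factor` times the reported usage fits on the card."""
    return kb_to_mb(kb) * factor <= card_mb


for name, kb in reported_kb.items():
    print(f"{name}: {kb_to_mb(kb):.1f} MB reported, "
          f"fits as reported: {fits(kb)}, "
          f"fits at 2x (hypothetical): {fits(kb, factor=2.0)}")
```

Under these assumptions both runs fit as reported, but neither would fit if the real footprint were twice the reported value. The sketch ignores anything else that may occupy GPU memory, such as a display attached to the same card.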


Indeed, the update is great! I have already written about that. I just need some more information and help.


Regards,
Filip





________________________________
From: filip fratev <filipfratev.yahoo.com>
To: AMBER Mailing List <amber.ambermd.org>
Sent: Saturday, August 20, 2011 5:02 PM
Subject: Re: [AMBER] Major GPU Update Released

I performed some tests last night.
All were NPT simulations. The test system was an AMD 1090T @ 3.9 GHz with a GTX590 (Asus) and a 3GB GTX580 (Palit), running SUSE 11.3.


JAC:
GTX590 1GPU = 32.61 ns/day
GTX590 2GPU = 42.19 ns/day
GTX580 1GPU = 40.73 ns/day
GTX580 plus GTX590 = 50.21 ns/day

Factor IX:
GTX590 1GPU = 9.54 ns/day
GTX590 2GPU = 12.24 ns/day
GTX580 1GPU = 11.72 ns/day
GTX580 plus GTX590 = 14.69 ns/day

Cellulose:
GTX580 (3GB) = 2.67 ns/day
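
To put the two-GPU GTX590 numbers above in perspective, the following Python sketch computes the speedup over a single GTX590 GPU and the corresponding parallel efficiency. The ns/day values are copied from the list above. The function names are hypothetical, and the efficiency definition (speedup divided by number of GPUs) is one common convention, not something defined in this thread.

```python
# ns/day figures copied from the benchmark list above.
NS_PER_DAY = {
    "JAC": {"GTX590 1GPU": 32.61, "GTX590 2GPU": 42.19},
    "Factor IX": {"GTX590 1GPU": 9.54, "GTX590 2GPU": 12.24},
}


def speedup(multi, single):
    """Throughput ratio of the multi-GPU run to the single-GPU run."""
    return multi / single


def efficiency(multi, single, n_gpus):
    """Parallel efficiency: speedup divided by the number of GPUs used."""
    return speedup(multi, single) / n_gpus


for system, runs in NS_PER_DAY.items():
    s = speedup(runs["GTX590 2GPU"], runs["GTX590 1GPU"])
    e = efficiency(runs["GTX590 2GPU"], runs["GTX590 1GPU"], n_gpus=2)
    print(f"{system}: speedup {s:.2f}x, efficiency {e:.0%}")
```

For both JAC and Factor IX, the second GPU of the GTX590 adds a little under 30% throughput in these runs.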



Regards,
Filip

P.S. My 1.5GB memory issue is still not solved. I will reduce the waters from 12 Å to 10 Å. Not ideal, but it seems to be the only way for now; I hope it works.


 



________________________________
From: Levi Pierce <levipierce.gmail.com>
To: AMBER Mailing List <amber.ambermd.org>
Sent: Saturday, August 20, 2011 10:20 AM
Subject: Re: [AMBER] Major GPU Update Released

Had a chance to sit down and test out the new patch.  Wow! Very
impressive performance boost on a variety of systems I have been running
pmemd.cuda on.  Great work!

On Fri, Aug 19, 2011 at 4:37 PM, Scott Le Grand <varelse2005.gmail.com> wrote:

> Use a different GPU for display, I suspect.
> On Aug 19, 2011 4:09 PM, "filip fratev" <filipfratev.yahoo.com> wrote:
> > Hi Ross,
> > I compiled the new code and performed many tests, and the results are
> > really impressive! I will post them later.
> >
> > However, I am in big trouble with my systems (116K atoms) and hope that
> > you will be able to help me.
> > The problem is that with the new code I am not able to simulate these
> > proteins (116K atoms) on the GTX590 (1.5GB per GPU) because of a memory
> > issue or bug:
> > cudaMalloc GpuBuffer::Allocate failed out of memory
> >
> > With the older code I had no problems with the same input files and
> > configuration. I tried both NPT and NVT and hit the same problem.
> > Then I used the 3GB GTX580 and it works fine. From the output you can see
> > that the requested memory is just 882MB:
> > For NPT:
> > | GPU memory information:
> > | KB of GPU memory in use:    882413
> > | KB of CPU memory in use:    104090
> >
> > and for restrained NVT:
> >
> > | GPU memory information:
> > | KB of GPU memory in use:   1006146
> > | KB of CPU memory in use:     99724
> > Thus I shouldn’t have any problem.
> >
> > What could be the issue, and how can I solve it?
> >
> > Regards,
> > Filip
> >
> > Below is the output file (my NPT density.out) and heat.in:
> >
> >           -------------------------------------------------------
> >           Amber 11 SANDER                              2010
> >           -------------------------------------------------------
> >
> > | PMEMD implementation of SANDER, Release 11
> >
> > | Run on 08/20/2011 at 01:42:20
> >
> >   [-O]verwriting output
> >
> > File Assignments:
> > |   MDIN: densityF.in
> > |  MDOUT: 0densitytest580Karti.out
> > | INPCRD: heattest.rst
> > |   PARM: MyosinWT.prmtop
> > | RESTRT: density1test.rst
> > |   REFC: heattest.rst
> > |  MDVEL: mdvel
> > |   MDEN: mden
> > |  MDCRD: density1test.mdcrd
> > | MDINFO: mdinfo
> >
> >
> >  Here is the input file:
> >
> > Ligand9 density
> >  &cntrl
> >   imin=0,irest=1, ntx=5,
> >   nstlim=5000,dt=0.002,
> >   ntc=2,ntf=2, ig=-1, iwrap=1,
> >   cut=8.0, ntb=2, ntp=1, taup=1.0,
> >   ntpr=5000, ntwx=5000, ntwr=10000,
> >   ntt=3, gamma_ln=2.0,
> >   temp0=300.0,
> >  /
> >
> >
> > Note: ig = -1. Setting random seed based on wallclock time in microseconds.
> >
> > |--------------------- INFORMATION ----------------------
> > | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> > |                      Version 2.2
> > |
> > |                      08/16/2011
> > |
> > |
> > | Implementation by:
> > |                    Ross C. Walker     (SDSC)
> > |                    Scott Le Grand     (nVIDIA)
> > |                    Duncan Poole       (nVIDIA)
> > |
> > | CAUTION: The CUDA code is currently experimental.
> > |          You use it at your own risk. Be sure to
> > |          check ALL results carefully.
> > |
> > | Precision model in use:
> > |      [SPDP] - Hybrid Single/Double Precision (Default).
> > |
> > |--------------------------------------------------------
> >
> > |------------------- GPU DEVICE INFO --------------------
> > |
> > |   CUDA Capable Devices Detected:      1
> > |           CUDA Device ID in use:      0
> > |                CUDA Device Name: GeForce GTX 580
> > |     CUDA Device Global Mem Size:   3071 MB
> > | CUDA Device Num Multiprocessors:     16
> > |           CUDA Device Core Freq:   1.57 GHz
> > |
> > |--------------------------------------------------------
> >
> >
> > | Conditional Compilation Defines Used:
> > | DIRFRC_COMTRANS
> > | DIRFRC_EFS
> > | DIRFRC_NOVEC
> > | PUBFFT
> > | FFTLOADBAL_2PROC
> > | BINTRAJ
> > | CUDA
> >
> > | Largest sphere to fit in unit cell has radius =    48.492
> >
> > | New format PARM file being parsed.
> > | Version =    1.000 Date = 05/27/11 Time = 11:50:53
> >
> > | Note: 1-4 EEL scale factors were NOT found in the topology file.
> > |       Using default value of 1.2.
> >
> > | Note: 1-4 VDW scale factors were NOT found in the topology file.
> > |       Using default value of 2.0.
> > | Duplicated    0 dihedrals
> >
> > | Duplicated    0 dihedrals
> >
> >
> > --------------------------------------------------------------------------------
> >    1.  RESOURCE   USE:
> > --------------------------------------------------------------------------------
> >
> >  getting new box info from bottom of inpcrd
> >
> >  NATOM  =  116271 NTYPES =      21 NBONH =  109977 MBONA  =    6423
> >  NTHETH =   14190 MTHETA =    8659 NPHIH =   27033 MPHIA  =   21543
> >  NHPARM =       0 NPARM  =       0 NNB   =  207403 NRES   =   35368
> >  NBONA  =    6423 NTHETA =    8659 NPHIA =   21543 NUMBND =      59
> >  NUMANG =     124 NPTRA  =      64 NATYP =      40 NPHB   =       1
> >  IFBOX  =       2 NMXRS  =      43 IFCAP =       0 NEXTRA =       0
> >  NCOPY  =       0
> >
> > | Coordinate Index Table dimensions:    23   23   23
> > | Direct force subcell size =     5.1644    5.1644 5.1644
> >
> >      BOX TYPE: TRUNCATED OCTAHEDRON
> >
> >
> > --------------------------------------------------------------------------------
> >    2.  CONTROL  DATA  FOR  THE  RUN
> > --------------------------------------------------------------------------------
> >
> > General flags:
> >      imin    =       0, nmropt  =       0
> >
> > Nature and format of input:
> >      ntx     =       5, irest   =       1, ntrx    =       1
> >
> > Nature and format of output:
> >      ntxo    =       1, ntpr    =    5000, ntrx    =       1, ntwr    =   10000
> >      iwrap   =       1, ntwx    =    5000, ntwv    =       0, ntwe    =       0
> >      ioutfm  =       0, ntwprt  =       0, idecomp =       0, rbornstat=      0
> >
> > Potential function:
> >      ntf     =       2, ntb     =       2, igb     =       0, nsnb    =      25
> >      ipol    =       0, gbsa    =       0, iesp    =       0
> >      dielc   =   1.00000, cut     =   8.00000, intdiel =   1.00000
> >
> > Frozen or restrained atoms:
> >      ibelly  =       0, ntr     =       0
> >
> > Molecular dynamics:
> >      nstlim  =      5000, nscm    =      1000, nrespa  =         1
> >      t       =   0.00000, dt      =   0.00200, vlimit  =  -1.00000
> >
> > Langevin dynamics temperature regulation:
> >      ig      =  974683
> >      temp0   = 300.00000, tempi   =   0.00000, gamma_ln=   2.00000
> >
> > Pressure regulation:
> >      ntp     =       1
> >      pres0   =   1.00000, comp    =  44.60000, taup    =   1.00000
> >
> > SHAKE:
> >      ntc     =       2, jfastw  =       0
> >      tol     =   0.00001
> >
> > | Intermolecular bonds treatment:
> > |     no_intermolecular_bonds =       1
> >
> > | Energy averages sample interval:
> > |     ene_avg_sampling =    5000
> >
> > Ewald parameters:
> >      verbose =       0, ew_type =       0, nbflag  =       1, use_pme =       1
> >      vdwmeth =       1, eedmeth =       1, netfrc  =       1
> >      Box X =  118.781   Box Y =  118.781   Box Z =  118.781
> >      Alpha =  109.471   Beta  =  109.471   Gamma =  109.471
> >      NFFT1 =  128       NFFT2 =  128       NFFT3 =  128
> >      Cutoff=    8.000   Tol   =0.100E-04
> >      Ewald Coefficient =  0.34864
> >      Interpolation order =    4
> >
> >
> > --------------------------------------------------------------------------------
> >    3.  ATOMIC COORDINATES AND VELOCITIES
> > --------------------------------------------------------------------------------
> >
> >  begin time read from input coords =    10.000 ps
> >
> >
> >  Number of triangulated 3-point waters found:    34583
> >
> >      Sum of charges from parm topology file =  -0.00000040
> >      Forcing neutrality...
> >
> > | Dynamic Memory, Types Used:
> > | Reals             3524690
> > | Integers          3800219
> >
> > | Nonbonded Pairs Initial Allocation:    19420163
> >
> > | GPU memory information:
> > | KB of GPU memory in use:    882413
> > | KB of CPU memory in use:    104090
> >
> >
> > --------------------------------------------------------------------------------
> >    4.  RESULTS
> > --------------------------------------------------------------------------------
> >
> >  ---------------------------------------------------
> >  APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
> >  using   5000.0 points per unit in tabled values
> >  TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
> > | CHECK switch(x): max rel err =   0.2738E-14   at   2.422500
> > | CHECK d/dx switch(x): max rel err =   0.8332E-11   at   2.782960
> >  ---------------------------------------------------
> > |---------------------------------------------------
> > | APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
> > |  with   50.0 points per unit in tabled values
> > | Relative Error Limit not exceeded for r .gt.   2.47
> > | APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
> > |  with   50.0 points per unit in tabled values
> > | Relative Error Limit not exceeded for r .gt.   2.89
> > |---------------------------------------------------
> >  wrapping first mol.:   38.333512154956900        54.211771142609109        93.897534410964738
> >  wrapping first mol.:   38.333512154956900        54.211771142609109        93.897534410964738
> >
> >  NSTEP =     5000   TIME(PS) =      20.000  TEMP(K) =   300.01  PRESS =   -23.7
> >  Etot   =   -281399.8069  EKtot   =     71193.9609  EPtot      =   -352593.7679
> >  BOND   =      2490.5718  ANGLE   =      6429.0655  DIHED      =      8582.5720
> >  1-4 NB =      2942.3115  1-4 EEL =     32655.1879  VDWAALS    =     42104.9713
> >  EELEC  =   -447798.4479  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >  EKCMT  =     30939.5460  VIRIAL  =     31538.3575  VOLUME     =   1170788.5879
> >                                                     Density    =         1.0106
> >  ------------------------------------------------------------------------------
> >
> >
> >       A V E R A G E S   O V E R       1 S T E P S
> >
> >
> >  NSTEP =     5000   TIME(PS) =      20.000  TEMP(K) =   300.01  PRESS =   -23.7
> >  Etot   =   -281399.8069  EKtot   =     71193.9609  EPtot      =   -352593.7679
> >  BOND   =      2490.5718  ANGLE   =      6429.0655  DIHED      =      8582.5720
> >  1-4 NB =      2942.3115  1-4 EEL =     32655.1879  VDWAALS    =     42104.9713
> >  EELEC  =   -447798.4479  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >  EKCMT  =     30939.5460  VIRIAL  =     31538.3575  VOLUME     =   1170788.5879
> >                                                     Density    =         1.0106
> >  ------------------------------------------------------------------------------
> >
> >
> >       R M S  F L U C T U A T I O N S
> >
> >
> >  NSTEP =     5000   TIME(PS) =      20.000  TEMP(K) =     0.00  PRESS =     0.0
> >  Etot   =         0.0000  EKtot   =         0.0000  EPtot      =         0.0000
> >  BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         0.0000
> >  1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =         0.0000
> >  EELEC  =         0.0000  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >  ------------------------------------------------------------------------------
> >
> >
> > --------------------------------------------------------------------------------
> >    5.  TIMINGS
> > --------------------------------------------------------------------------------
> >
> > |  NonSetup CPU Time in Major Routines:
> > |
> > |     Routine           Sec        %
> > |     ------------------------------
> > |     Nonbond          97.05   92.22
> > |     Bond              0.00    0.00
> > |     Angle             0.00    0.00
> > |     Dihedral          0.00    0.00
> > |     Shake             2.47    2.34
> > |     RunMD             5.71    5.43
> > |     Other             0.00    0.00
> > |     ------------------------------
> > |     Total           105.24
> >
> > |  PME Nonbond Pairlist CPU Time:
> > |
> > |     Routine              Sec        %
> > |     ---------------------------------
> > |     Set Up Cit           0.00    0.00
> > |     Build List           0.00    0.00
> > |     ---------------------------------
> > |     Total                0.00    0.00
> >
> > |  PME Direct Force CPU Time:
> > |
> > |     Routine              Sec        %
> > |     ---------------------------------
> > |     NonBonded Calc       0.00    0.00
> > |     Exclude Masked       0.00    0.00
> > |     Other                0.00    0.00
> > |     ---------------------------------
> > |     Total                0.00    0.00
> >
> > |  PME Reciprocal Force CPU Time:
> > |
> > |     Routine              Sec        %
> > |     ---------------------------------
> > |     1D bspline           0.00    0.00
> > |     Grid Charges         0.00    0.00
> > |     Scalar Sum           0.00    0.00
> > |     Gradient Sum         0.00    0.00
> > |     FFT                  0.00    0.00
> > |     ---------------------------------
> > |     Total                0.00    0.00
> >
> > |  Final Performance Info:
> > |     -----------------------------------------------------
> > |     Average timings for last       0 steps:
> > |         Elapsed(s) =       0.00 Per Step(ms) =  +Infinity
> > |             ns/day =       0.00   seconds/ns =  +Infinity
> > |
> > |     Average timings for all steps:
> > |         Elapsed(s) =     105.26 Per Step(ms) =      21.05
> > |             ns/day =       8.21   seconds/ns =   10525.53
> > |     -----------------------------------------------------
> >
> > |  Setup CPU time:            0.90 seconds
> > |  NonSetup CPU time:       105.24 seconds
> > |  Total CPU time:          106.13 seconds     0.03 hours
> >
> > |  Setup wall time:           1    seconds
> > |  NonSetup wall time:      105    seconds
> > |  Total wall time:         106    seconds     0.03 hours
> >
> >
> > heat Ligand9
> >  &cntrl
> >   irest=0, ntx=1,
> >   nstlim=5000, dt=0.002,
> >   ntc=2,ntf=2, iwrap=1,
> >   cut=8.0, ntb=1, ig=-1,
> >   ntpr=1000, ntwx=1000, ntwr=10000,
> >   ntt=3, gamma_ln=2.0,
> >   tempi=0.0, temp0=300.0,
> >   ioutfm=1, ntr=1,
> >   /
> > Group input for restrained atoms
> > 2.0
> > RES 1 790
> > END
> > END
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
> >
>



-- 
--
==
Levi C.T. Pierce,  UCSD Graduate Student
McCammon Laboratory
http://mccammon.ucsd.edu/
w: 858-534-2916
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Aug 20 2011 - 07:30:03 PDT