Re: [AMBER] Major GPU Update Released

From: Santosh Mogurampelly <santosh.physics.iisc.ernet.in>
Date: Sun, 21 Aug 2011 10:09:00 +0530 (IST)

Hi Filip,
My GPU details are:

|------------------- GPU DEVICE INFO --------------------
|
| CUDA Capable Devices Detected: 4
| CUDA Device ID in use: 3
| CUDA Device Name: Tesla S2050
| CUDA Device Global Mem Size: 3071 MB
| CUDA Device Num Multiprocessors: 14
| CUDA Device Core Freq: 1.15 GHz
|
|--------------------------------------------------------


Good day,
Santosh.


On Sat, 20 Aug 2011, filip fratev wrote:

> Hi Santosh,
> What kind of GPU do you use?
>
>
> Regards,
> Filip
>
>
>
>
> ________________________________
> From: Santosh Mogurampelly <santosh.physics.iisc.ernet.in>
> To: AMBER Mailing List <amber.ambermd.org>
> Sent: Saturday, August 20, 2011 12:39 PM
> Subject: Re: [AMBER] Major GPU Update Released
>
> For my system of 109K atoms, I previously got 3.75 ns/day, and with the
> new updates I now get 6.5 ns/day. Almost double! Great news for us. Thanks a
> lot, Prof. Ross and team.
>
> Santosh
>
>
>
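For reference, the ratio behind "almost double" can be checked with a short Python sketch. The function name is illustrative and is not part of AMBER; the two throughput figures are the ones quoted above.

```python
def speedup(old_ns_per_day: float, new_ns_per_day: float) -> float:
    """Return the throughput ratio new/old for two ns/day figures."""
    if old_ns_per_day <= 0:
        raise ValueError("old throughput must be positive")
    return new_ns_per_day / old_ns_per_day


# Figures quoted for the 109K-atom system before and after the update.
ratio = speedup(3.75, 6.5)
print(f"speedup: {ratio:.2f}x")
```

With these inputs the ratio comes out at roughly 1.7, so the gain is large but somewhat short of a full doubling.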
> On Sat, 20 Aug 2011, Levi Pierce wrote:
>
>> Had a chance to sit down and test out the new patch.  Wow! Very
>> impressive performance boost on a variety of systems I have been running
>> pmemd.cuda on.  Great work!
>>
>> On Fri, Aug 19, 2011 at 4:37 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
>>
>>> Use a different GPU for display, I suspect.
>>> On Aug 19, 2011 4:09 PM, "filip fratev" <filipfratev.yahoo.com> wrote:
>>>> Hi Ross,
>>>> I compiled the new code and performed many tests, and the results are
>>>> really impressive! I will post them later.
>>>>
>>>> However, I am in big trouble with my systems (116K atoms) and hope that
>>>> you will be able to help me.
>>>> The problem is that with the new code I am not able to simulate these
>>>> proteins (116K atoms) on a GTX590 (1.5GB per core) because of a memory
>>>> issue or bug:
>>>> cudaMalloc GpuBuffer::Allocate failed out of memory
>>>>
>>>> With the older code I had no problems with the same input files and
>>>> configuration. I tried both NPT and NVT, but the problem is the same.
>>>> When I use a GTX580 3GB instead, it works fine. From the output you can
>>>> see that the requested memory is just 882MB:
>>>> For NPT:
>>>> | GPU memory information:
>>>> | KB of GPU memory in use:    882413
>>>> | KB of CPU memory in use:    104090
>>>>
>>>> and for restrained NVT:
>>>>
>>>> | GPU memory information:
>>>> | KB of GPU memory in use:  1006146
>>>> | KB of CPU memory in use:    99724
>>>> Thus I shouldn’t have any problem.
>>>>
>>>> What could be the issue, and how can I solve it?
>>>>
>>>> Regards,
>>>> Filip
>>>>
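The memory arithmetic in the message above can be sketched in Python as follows. This is only an illustration under stated assumptions: that the "KB" in the log are 1024-byte units, and that "1.5GB per core" means 1536 MB per GPU. The helper names are hypothetical and not part of AMBER.

```python
import re

KB_PER_MB = 1024          # assumption: the log's "KB" are 1024-byte units
GTX590_MB_PER_GPU = 1536  # assumption: "1.5GB per core" taken as 1536 MB


def gpu_kb_in_use(mdout_text: str) -> int:
    """Extract the 'KB of GPU memory in use' figure from mdout text."""
    match = re.search(r"KB of GPU memory in use:\s*(\d+)", mdout_text)
    if match is None:
        raise ValueError("no GPU memory line found")
    return int(match.group(1))


def headroom_mb(kb_in_use: int, card_mb: float) -> float:
    """Nominal memory left on the card, in MB, after the reported usage."""
    return card_mb - kb_in_use / KB_PER_MB


# The two log excerpts quoted in the message.
npt_log = """| GPU memory information:
| KB of GPU memory in use:    882413
| KB of CPU memory in use:    104090
"""
nvt_log = """| GPU memory information:
| KB of GPU memory in use:  1006146
| KB of CPU memory in use:    99724
"""

for label, log in (("NPT", npt_log), ("restrained NVT", nvt_log)):
    kb = gpu_kb_in_use(log)
    print(f"{label}: {kb / KB_PER_MB:.0f} MB in use, "
          f"{headroom_mb(kb, GTX590_MB_PER_GPU):.0f} MB nominal headroom")
```

On paper both runs leave several hundred MB free on a 1.5GB GPU. The reported figure may not include memory held by other users of the card, such as a display, so positive nominal headroom does not by itself rule out a failed allocation.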
>>>> Below is the output file (my NPT density.out) and heat.in:
>>>>
>>>>           -------------------------------------------------------
>>>>           Amber 11 SANDER                              2010
>>>>           -------------------------------------------------------
>>>>
>>>> | PMEMD implementation of SANDER, Release 11
>>>>
>>>> | Run on 08/20/2011 at 01:42:20
>>>>
>>>>   [-O]verwriting output
>>>>
>>>> File Assignments:
>>>> |   MDIN: densityF.in
>>>> |  MDOUT: 0densitytest580Karti.out
>>>> | INPCRD: heattest.rst
>>>> |   PARM: MyosinWT.prmtop
>>>> | RESTRT: density1test.rst
>>>> |   REFC: heattest.rst
>>>> |  MDVEL: mdvel
>>>> |   MDEN: mden
>>>> |  MDCRD: density1test.mdcrd
>>>> | MDINFO: mdinfo
>>>>
>>>>
>>>>   Here is the input file:
>>>>
>>>> Ligand9 density
>>>>   &cntrl
>>>>   imin=0,irest=1, ntx=5,
>>>>   nstlim=5000,dt=0.002,
>>>>   ntc=2,ntf=2, ig=-1, iwrap=1,
>>>>   cut=8.0, ntb=2, ntp=1, taup=1.0,
>>>>   ntpr=5000, ntwx=5000, ntwr=10000,
>>>>   ntt=3, gamma_ln=2.0,
>>>>   temp0=300.0,
>>>>   /
>>>>
>>>>
>>>> Note: ig = -1. Setting random seed based on wallclock time in microseconds.
>>>>
>>>> |--------------------- INFORMATION ----------------------
>>>> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>>>> |                      Version 2.2
>>>> |
>>>> |                      08/16/2011
>>>> |
>>>> |
>>>> | Implementation by:
>>>> |                    Ross C. Walker    (SDSC)
>>>> |                    Scott Le Grand    (nVIDIA)
>>>> |                    Duncan Poole      (nVIDIA)
>>>> |
>>>> | CAUTION: The CUDA code is currently experimental.
>>>> |          You use it at your own risk. Be sure to
>>>> |          check ALL results carefully.
>>>> |
>>>> | Precision model in use:
>>>> |      [SPDP] - Hybrid Single/Double Precision (Default).
>>>> |
>>>> |--------------------------------------------------------
>>>>
>>>> |------------------- GPU DEVICE INFO --------------------
>>>> |
>>>> |  CUDA Capable Devices Detected:      1
>>>> |          CUDA Device ID in use:      0
>>>> |                CUDA Device Name: GeForce GTX 580
>>>> |    CUDA Device Global Mem Size:  3071 MB
>>>> | CUDA Device Num Multiprocessors:    16
>>>> |          CUDA Device Core Freq:  1.57 GHz
>>>> |
>>>> |--------------------------------------------------------
>>>>
>>>>
>>>> | Conditional Compilation Defines Used:
>>>> | DIRFRC_COMTRANS
>>>> | DIRFRC_EFS
>>>> | DIRFRC_NOVEC
>>>> | PUBFFT
>>>> | FFTLOADBAL_2PROC
>>>> | BINTRAJ
>>>> | CUDA
>>>>
>>>> | Largest sphere to fit in unit cell has radius =    48.492
>>>>
>>>> | New format PARM file being parsed.
>>>> | Version =    1.000 Date = 05/27/11 Time = 11:50:53
>>>>
>>>> | Note: 1-4 EEL scale factors were NOT found in the topology file.
>>>> |      Using default value of 1.2.
>>>>
>>>> | Note: 1-4 VDW scale factors were NOT found in the topology file.
>>>> |      Using default value of 2.0.
>>>> | Duplicated    0 dihedrals
>>>>
>>>> | Duplicated    0 dihedrals
>>>>
>>>>
>>>> --------------------------------------------------------------------------------
>>>>     1.  RESOURCE  USE:
>>>>
>>>> --------------------------------------------------------------------------------
>>>>
>>>>   getting new box info from bottom of inpcrd
>>>>
>>>>   NATOM  =  116271 NTYPES =      21 NBONH =  109977 MBONA  =    6423
>>>>   NTHETH =  14190 MTHETA =    8659 NPHIH =  27033 MPHIA  =  21543
>>>>   NHPARM =      0 NPARM  =      0 NNB  =  207403 NRES  =  35368
>>>>   NBONA  =    6423 NTHETA =    8659 NPHIA =  21543 NUMBND =      59
>>>>   NUMANG =    124 NPTRA  =      64 NATYP =      40 NPHB  =      1
>>>>   IFBOX  =      2 NMXRS  =      43 IFCAP =      0 NEXTRA =      0
>>>>   NCOPY  =      0
>>>>
>>>> | Coordinate Index Table dimensions:    23  23  23
>>>> | Direct force subcell size =    5.1644    5.1644 5.1644
>>>>
>>>>       BOX TYPE: TRUNCATED OCTAHEDRON
>>>>
>>>>
>>>> --------------------------------------------------------------------------------
>>>>     2.  CONTROL  DATA  FOR  THE  RUN
>>>>
>>>> --------------------------------------------------------------------------------
>>>>
>>>> General flags:
>>>>       imin    =      0, nmropt  =      0
>>>>
>>>> Nature and format of input:
>>>>       ntx    =      5, irest  =      1, ntrx    =      1
>>>>
>>>> Nature and format of output:
>>>>       ntxo    =      1, ntpr    =    5000, ntrx    =      1, ntwr    =   10000
>>>>       iwrap  =      1, ntwx    =    5000, ntwv    =      0, ntwe    =      0
>>>>       ioutfm  =      0, ntwprt  =      0, idecomp =      0, rbornstat=      0
>>>>
>>>> Potential function:
>>>>       ntf    =      2, ntb    =      2, igb    =      0, nsnb    =      25
>>>>       ipol    =      0, gbsa    =      0, iesp    =      0
>>>>       dielc  =  1.00000, cut    =  8.00000, intdiel =  1.00000
>>>>
>>>> Frozen or restrained atoms:
>>>>       ibelly  =      0, ntr    =      0
>>>>
>>>> Molecular dynamics:
>>>>       nstlim  =      5000, nscm    =      1000, nrespa  =        1
>>>>       t      =  0.00000, dt      =  0.00200, vlimit  =  -1.00000
>>>>
>>>> Langevin dynamics temperature regulation:
>>>>       ig      =  974683
>>>>       temp0  = 300.00000, tempi  =  0.00000, gamma_ln=  2.00000
>>>>
>>>> Pressure regulation:
>>>>       ntp    =      1
>>>>       pres0  =  1.00000, comp    =  44.60000, taup    =  1.00000
>>>>
>>>> SHAKE:
>>>>       ntc    =      2, jfastw  =      0
>>>>       tol    =  0.00001
>>>>
>>>> | Intermolecular bonds treatment:
>>>> |    no_intermolecular_bonds =      1
>>>>
>>>> | Energy averages sample interval:
>>>> |    ene_avg_sampling =    5000
>>>>
>>>> Ewald parameters:
>>>>       verbose =      0, ew_type =      0, nbflag  =      1, use_pme =      1
>>>>       vdwmeth =      1, eedmeth =      1, netfrc  =      1
>>>>       Box X =  118.781  Box Y =  118.781  Box Z =  118.781
>>>>       Alpha =  109.471  Beta  =  109.471  Gamma =  109.471
>>>>       NFFT1 =  128      NFFT2 =  128      NFFT3 =  128
>>>>       Cutoff=    8.000  Tol  =0.100E-04
>>>>       Ewald Coefficient =  0.34864
>>>>       Interpolation order =    4
>>>>
>>>>
>>>> --------------------------------------------------------------------------------
>>>>     3.  ATOMIC COORDINATES AND VELOCITIES
>>>>
>>>> --------------------------------------------------------------------------------
>>>>
>>>>   begin time read from input coords =    10.000 ps
>>>>
>>>>
>>>>   Number of triangulated 3-point waters found:    34583
>>>>
>>>>       Sum of charges from parm topology file =  -0.00000040
>>>>       Forcing neutrality...
>>>>
>>>> | Dynamic Memory, Types Used:
>>>> | Reals            3524690
>>>> | Integers          3800219
>>>>
>>>> | Nonbonded Pairs Initial Allocation:    19420163
>>>>
>>>> | GPU memory information:
>>>> | KB of GPU memory in use:    882413
>>>> | KB of CPU memory in use:    104090
>>>>
>>>>
>>>> --------------------------------------------------------------------------------
>>>>     4.  RESULTS
>>>>
>>>> --------------------------------------------------------------------------------
>>>>
>>>>   ---------------------------------------------------
>>>>   APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
>>>>   using  5000.0 points per unit in tabled values
>>>>   TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
>>>> | CHECK switch(x): max rel err =  0.2738E-14  at  2.422500
>>>> | CHECK d/dx switch(x): max rel err =  0.8332E-11  at  2.782960
>>>>   ---------------------------------------------------
>>>> |---------------------------------------------------
>>>> | APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
>>>> |  with  50.0 points per unit in tabled values
>>>> | Relative Error Limit not exceeded for r .gt.  2.47
>>>> | APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
>>>> |  with  50.0 points per unit in tabled values
>>>> | Relative Error Limit not exceeded for r .gt.  2.89
>>>> |---------------------------------------------------
>>>>   wrapping first mol.:  38.333512154956900  54.211771142609109  93.897534410964738
>>>>   wrapping first mol.:  38.333512154956900  54.211771142609109  93.897534410964738
>>>>
>>>>   NSTEP =    5000  TIME(PS) =      20.000  TEMP(K) =  300.01  PRESS =  -23.7
>>>>   Etot  =  -281399.8069  EKtot  =    71193.9609  EPtot      =  -352593.7679
>>>>   BOND  =      2490.5718  ANGLE  =      6429.0655  DIHED      =  8582.5720
>>>>   1-4 NB =      2942.3115  1-4 EEL =    32655.1879  VDWAALS    =  42104.9713
>>>>   EELEC  =  -447798.4479  EHBOND  =        0.0000  RESTRAINT  =  0.0000
>>>>   EKCMT  =    30939.5460  VIRIAL  =    31538.3575  VOLUME    =  1170788.5879
>>>>                                                     Density    =  1.0106
>>>>
>>>>   ------------------------------------------------------------------------------
>>>>
>>>>
>>>>       A V E R A G E S  O V E R      1 S T E P S
>>>>
>>>>
>>>>   NSTEP =    5000  TIME(PS) =      20.000  TEMP(K) =  300.01  PRESS =  -23.7
>>>>   Etot  =  -281399.8069  EKtot  =    71193.9609  EPtot      =  -352593.7679
>>>>   BOND  =      2490.5718  ANGLE  =      6429.0655  DIHED      =  8582.5720
>>>>   1-4 NB =      2942.3115  1-4 EEL =    32655.1879  VDWAALS    =  42104.9713
>>>>   EELEC  =  -447798.4479  EHBOND  =        0.0000  RESTRAINT  =  0.0000
>>>>   EKCMT  =    30939.5460  VIRIAL  =    31538.3575  VOLUME    =  1170788.5879
>>>>                                                     Density    =  1.0106
>>>>
>>>>   ------------------------------------------------------------------------------
>>>>
>>>>
>>>>       R M S  F L U C T U A T I O N S
>>>>
>>>>
>>>>   NSTEP =    5000  TIME(PS) =      20.000  TEMP(K) =    0.00  PRESS =    0.0
>>>>   Etot  =        0.0000  EKtot  =        0.0000  EPtot      =  0.0000
>>>>   BOND  =        0.0000  ANGLE  =        0.0000  DIHED      =  0.0000
>>>>   1-4 NB =        0.0000  1-4 EEL =        0.0000  VDWAALS    =  0.0000
>>>>   EELEC  =        0.0000  EHBOND  =        0.0000  RESTRAINT  =  0.0000
>>>>
>>>>   ------------------------------------------------------------------------------
>>>>
>>>>
>>>> --------------------------------------------------------------------------------
>>>>     5.  TIMINGS
>>>>
>>>> --------------------------------------------------------------------------------
>>>>
>>>> |  NonSetup CPU Time in Major Routines:
>>>> |
>>>> |    Routine          Sec        %
>>>> |    ------------------------------
>>>> |    Nonbond          97.05  92.22
>>>> |    Bond              0.00    0.00
>>>> |    Angle            0.00    0.00
>>>> |    Dihedral          0.00    0.00
>>>> |    Shake            2.47    2.34
>>>> |    RunMD            5.71    5.43
>>>> |    Other            0.00    0.00
>>>> |    ------------------------------
>>>> |    Total          105.24
>>>>
>>>> |  PME Nonbond Pairlist CPU Time:
>>>> |
>>>> |    Routine              Sec        %
>>>> |    ---------------------------------
>>>> |    Set Up Cit          0.00    0.00
>>>> |    Build List          0.00    0.00
>>>> |    ---------------------------------
>>>> |    Total                0.00    0.00
>>>>
>>>> |  PME Direct Force CPU Time:
>>>> |
>>>> |    Routine              Sec        %
>>>> |    ---------------------------------
>>>> |    NonBonded Calc      0.00    0.00
>>>> |    Exclude Masked      0.00    0.00
>>>> |    Other                0.00    0.00
>>>> |    ---------------------------------
>>>> |    Total                0.00    0.00
>>>>
>>>> |  PME Reciprocal Force CPU Time:
>>>> |
>>>> |    Routine              Sec        %
>>>> |    ---------------------------------
>>>> |    1D bspline          0.00    0.00
>>>> |    Grid Charges        0.00    0.00
>>>> |    Scalar Sum          0.00    0.00
>>>> |    Gradient Sum        0.00    0.00
>>>> |    FFT                  0.00    0.00
>>>> |    ---------------------------------
>>>> |    Total                0.00    0.00
>>>>
>>>> |  Final Performance Info:
>>>> |    -----------------------------------------------------
>>>> |    Average timings for last      0 steps:
>>>> |        Elapsed(s) =      0.00 Per Step(ms) =  +Infinity
>>>> |            ns/day =      0.00  seconds/ns =  +Infinity
>>>> |
>>>> |    Average timings for all steps:
>>>> |        Elapsed(s) =    105.26 Per Step(ms) =      21.05
>>>> |            ns/day =      8.21  seconds/ns =  10525.53
>>>> |    -----------------------------------------------------
>>>>
>>>> |  Setup CPU time:            0.90 seconds
>>>> |  NonSetup CPU time:      105.24 seconds
>>>> |  Total CPU time:          106.13 seconds    0.03 hours
>>>>
>>>> |  Setup wall time:          1    seconds
>>>> |  NonSetup wall time:      105    seconds
>>>> |  Total wall time:        106    seconds    0.03 hours
>>>>
>>>>
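The throughput figures in the timing block above follow from the elapsed time, the step count, and the time step. A minimal Python sketch of that arithmetic, with illustrative function names that are not part of AMBER:

```python
SECONDS_PER_DAY = 86400.0
PS_PER_NS = 1000.0


def ms_per_step(elapsed_s: float, nsteps: int) -> float:
    """Average wall-clock time per MD step, in milliseconds."""
    return elapsed_s / nsteps * 1000.0


def ns_per_day(elapsed_s: float, nsteps: int, dt_ps: float) -> float:
    """Simulated nanoseconds per wall-clock day."""
    simulated_ns = nsteps * dt_ps / PS_PER_NS
    return simulated_ns * SECONDS_PER_DAY / elapsed_s


# Values from the log: Elapsed(s) = 105.26, nstlim = 5000, dt = 0.002 ps.
print(f"per step: {ms_per_step(105.26, 5000):.2f} ms")
print(f"ns/day:   {ns_per_day(105.26, 5000, 0.002):.2f}")
```

Both results agree with the log (21.05 ms per step, 8.21 ns/day) to the printed precision; small differences in further digits would come from the log using an unrounded elapsed time.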
>>>> heat Ligand9
>>>>   &cntrl
>>>>   irest=0, ntx=1,
>>>>   nstlim=5000, dt=0.002,
>>>>   ntc=2,ntf=2, iwrap=1,
>>>>   cut=8.0, ntb=1, ig=-1,
>>>>   ntpr=1000, ntwx=1000, ntwr=10000,
>>>>   ntt=3, gamma_ln=2.0,
>>>>   tempi=0.0, temp0=300.0,
>>>>   ioutfm=1, ntr=1,
>>>>   ntr=1,
>>>>   /
>>>> Group input for restrained atoms
>>>> 2.0
>>>> RES 1 790
>>>> END
>>>> END
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>
>>>>
>>>
>>
>>
>>
>>
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Aug 20 2011 - 22:30:03 PDT