OK, sir.
I am not at all familiar with GPUs and CUDA.
I will apply the patch as you suggested.
Thank you very much, sir.
I will report back as soon as possible.
Thanking you
On Fri, Mar 15, 2013 at 9:57 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
> Dear Mary,
>
> OK, firstly, you have not followed the advice to update the software,
> so you are still running the original release version, bugs and all.
>
> To run a calculation on just 1 GPU you do not need mpirun at all. So you
> would do:
>
> $AMBERHOME/pmemd.cuda -O -i ..
>
> Note the 'pmemd.cuda' does NOT have the .MPI here.
>
> All the best
> Ross
>
>
>
>
> On 3/15/13 9:23 AM, "Mary Varughese" <maryvj1985.gmail.com> wrote:
>
> >Sir,
> >
> >
> >
> >The command that produced the .out file I sent is:
> >
> >mpirun -n 8 pmemd.cuda.MPI -O -i TbNrb_md9.in -p TbNrb.prmtop -c
> >TbNrb_md8.rst -o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
> >
> >
> >
> >*The system I am using has 24 CPUs and a single GPU with 512 cores.*
> >
> >*As I understand it now, with a single GPU there is no need for mpdboot and
> >no need to specify the number of cores to be used; depending on the %
> >utilization, two or more programs can be loaded, and it is all thread based.*
> >
> >
> >
> >So I tried this command, since .MPI works only with more than 1 GPU:
> >
> >
> >
> >mpirun -n 1 pmemd.cuda -O -i TbNrb_md9.in -p TbNrb.prmtop -c TbNrb_md8.rst
> >-o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
> >
> >
> >
> >The output stalled at the last step shown below for more than half an hour,
> >so I terminated the program.
> >
> >
> >
> >          -------------------------------------------------------
> >
> >          Amber 11 SANDER                              2010
> >
> >          -------------------------------------------------------
> >
> >| PMEMD implementation of SANDER, Release 11
> >
> >| Run on 03/15/2013 at 18:50:47
> >
> >  [-O]verwriting output
> >
> >File Assignments:
> >
> >|   MDIN: TbNrb_md9.in
> >|  MDOUT: TbNrb_md9.out
> >| INPCRD: TbNrb_md8.rst
> >|   PARM: TbNrb.prmtop
> >| RESTRT: TbNrb_md9.rst
> >|   REFC: refc
> >|  MDVEL: mdvel
> >|   MDEN: mden
> >|  MDCRD: TbNrb_md9.mdcrd
> >| MDINFO: mdinfo
> >
> >
> > Here is the input file:
> >
> >Tb-Ntr complex : 200ps MD (production run in NPT)
> > &cntrl
> >  imin   = 0,
> >  irest  = 1,
> >  ntx    = 5,
> >  ntb    = 2, ntp = 1, pres0 = 1.0,
> >  cut    = 10,
> >  ntr    = 0,
> >  ntc    = 2,
> >  ntf    = 2,
> >  tempi  = 300.0,
> >  temp0  = 300.0,
> >  ntt    = 3,
> >  gamma_ln = 1,
> >  nstlim = 1000, dt = 0.002,
> >  ntpr = 500, ntwx = 500, ntwr = 1000,
> > /
> >
> >
> >|--------------------- INFORMATION ----------------------
> >
> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >
> >|                     Version 12.0
> >
> >|                      03/19/2012
> >
> >| Implementation by:
> >
> >|                    Ross C. Walker     (SDSC)
> >
> >|                    Scott Le Grand     (nVIDIA)
> >
> >|                    Duncan Poole       (nVIDIA)
> >
> >| CAUTION: The CUDA code is currently experimental.
> >
> >|          You use it at your own risk. Be sure to
> >
> >|          check ALL results carefully.
> >
> >| Precision model in use:
> >
> >|      [SPDP] - Hybrid Single/Double Precision (Default).
> >
> >|--------------------------------------------------------
> >
> >|------------------- GPU DEVICE INFO --------------------
> >
> >|   CUDA Capable Devices Detected:      1
> >
> >|           CUDA Device ID in use:      0
> >
> >|                CUDA Device Name: Tesla M2090
> >
> >|     CUDA Device Global Mem Size:   5375 MB
> >
> >| CUDA Device Num Multiprocessors:     16
> >
> >|           CUDA Device Core Freq:   1.30 GHz
> >
> >|
> >
> >|--------------------------------------------------------
> >
> >| Conditional Compilation Defines Used:
> >
> >| DIRFRC_COMTRANS
> >
> >| DIRFRC_EFS
> >
> >| DIRFRC_NOVEC
> >
> >| PUBFFT
> >
> >| FFTLOADBAL_2PROC
> >
> >| BINTRAJ
> >
> >| CUDA
> >
> >| Largest sphere to fit in unit cell has radius =    47.238
> >
> >| New format PARM file being parsed.
> >
> >| Version =    1.000 Date = 10/04/12 Time = 11:18:48
> >
> >| Note: 1-4 EEL scale factors are being read from the topology file.
> >
> >| Note: 1-4 VDW scale factors are being read from the topology file.
> >
> >| Duplicated    0 dihedrals
> >
> >| Duplicated    0 dihedrals
> >
> >--------------------------------------------------------------------------------
> >
> >   1.  RESOURCE   USE:
> >
> >--------------------------------------------------------------------------------
> >
> > getting new box info from bottom of inpcrd
> >
> > NATOM  =  119092 NTYPES =      20 NBONH =  112233 MBONA  =    6978
> >
> > NTHETH =   14955 MTHETA =    9471 NPHIH =   29675 MPHIA  =   23498
> >
> > NHPARM =       0 NPARM  =       0 NNB   =  214731 NRES   =   36118
> >
> > NBONA  =    6978 NTHETA =    9471 NPHIA =   23498 NUMBND =      78
> >
> > NUMANG =     158 NPTRA  =      71 NATYP =      52 NPHB   =       1
> >
> > IFBOX  =       2 NMXRS  =      44 IFCAP =       0 NEXTRA =       0
> >
> > NCOPY  =       0
> >
> >| Coordinate Index Table dimensions:    18   18   18
> >
> >| Direct force subcell size =     6.4283    6.4283    6.4283
> >
> >     BOX TYPE: TRUNCATED OCTAHEDRON
> >
> >--------------------------------------------------------------------------------
> >
> >   2.  CONTROL  DATA  FOR  THE  RUN
> >
> >--------------------------------------------------------------------------------
> >
> >
> >General flags:
> >
> >     imin    =       0, nmropt  =       0
> >
> >Nature and format of input:
> >
> >     ntx     =       5, irest   =       1, ntrx    =       1
> >
> >Nature and format of output:
> >
> >     ntxo    =       1, ntpr    =     500, ntrx    =       1, ntwr    =    1000
> >
> >     iwrap   =       0, ntwx    =     500, ntwv    =       0, ntwe    =       0
> >
> >     ioutfm  =       0, ntwprt  =       0, idecomp =       0, rbornstat=      0
> >
> >Potential function:
> >
> >     ntf     =       2, ntb     =       2, igb     =       0, nsnb    =      25
> >
> >     ipol    =       0, gbsa    =       0, iesp    =       0
> >
> >     dielc   =   1.00000, cut     =  10.00000, intdiel =   1.00000
> >
> >Frozen or restrained atoms:
> >
> >     ibelly  =       0, ntr     =       0
> >
> >Molecular dynamics:
> >
> >     nstlim  =      1000, nscm    =      1000, nrespa  =         1
> >
> >     t       =   0.00000, dt      =   0.00200, vlimit  =  -1.00000
> >
> >Langevin dynamics temperature regulation:
> >
> >     ig      =   71277
> >
> >     temp0   = 300.00000, tempi   = 300.00000, gamma_ln=   1.00000
> >
> >Pressure regulation:
> >
> >     ntp     =       1
> >
> >     pres0   =   1.00000, comp    =  44.60000, taup    =   1.00000
> >
> >SHAKE:
> >
> >     ntc     =       2, jfastw  =       0
> >
> >     tol     =   0.00001
> >
> >| Intermolecular bonds treatment:
> >
> >|     no_intermolecular_bonds =       1
> >
> >| Energy averages sample interval:
> >
> >|     ene_avg_sampling =     500
> >
> > Ewald parameters:
> >
> >     verbose =       0, ew_type =       0, nbflag  =       1, use_pme =       1
> >
> >     vdwmeth =       1, eedmeth =       1, netfrc  =       1
> >
> >     Box X =  115.709   Box Y =  115.709   Box Z =  115.709
> >
> >     Alpha =  109.471   Beta  =  109.471   Gamma =  109.471
> >
> >     NFFT1 =  120       NFFT2 =  120       NFFT3 =  120
> >
> >     Cutoff=   10.000   Tol   =0.100E-04
> >
> >     Ewald Coefficient =  0.27511
> >
> >     Interpolation order =    4
> >
> >--------------------------------------------------------------------------------
> >
> >   3.  ATOMIC COORDINATES AND VELOCITIES
> >
> >--------------------------------------------------------------------------------
> >
> > begin time read from input coords =   400.000 ps
> >
> > Number of triangulated 3-point waters found:    35215
> >
> >     Sum of charges from parm topology file =  -0.00000042
> >
> >     Forcing neutrality...
> >
> >*Then I tried the command below. It again stalled at the last step; I waited
> >45 minutes and then terminated it.*
> >
> >
> >
> >pmemd.cuda -O -i TbNrb_md9.in -p TbNrb.prmtop -c TbNrb_md8.rst -o
> >TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
> >
> >
> >
> >          -------------------------------------------------------
> >
> >          Amber 11 SANDER                              2010
> >
> >          -------------------------------------------------------
> >
> >| PMEMD implementation of SANDER, Release 11
> >
> >| Run on 03/15/2013 at 19:14:46
> >
> >  [-O]verwriting output
> >
> >File Assignments:
> >
> >|   MDIN: TbNrb_md9.in
> >|  MDOUT: TbNrb_md9.out
> >| INPCRD: TbNrb_md8.rst
> >|   PARM: TbNrb.prmtop
> >| RESTRT: TbNrb_md9.rst
> >|   REFC: refc
> >|  MDVEL: mdvel
> >|   MDEN: mden
> >|  MDCRD: TbNrb_md9.mdcrd
> >| MDINFO: mdinfo
> >
> >
> > Here is the input file:
> >
> >Tb-Ntr complex : 200ps MD (production run in NPT)
> > &cntrl
> >  imin   = 0,
> >  irest  = 1,
> >  ntx    = 5,
> >  ntb    = 2, ntp = 1, pres0 = 1.0,
> >  cut    = 10,
> >  ntr    = 0,
> >  ntc    = 2,
> >  ntf    = 2,
> >  tempi  = 300.0,
> >  temp0  = 300.0,
> >  ntt    = 3,
> >  gamma_ln = 1,
> >  nstlim = 1000, dt = 0.002,
> >  ntpr = 500, ntwx = 500, ntwr = 1000,
> > /
> >
> >
> >|--------------------- INFORMATION ----------------------
> >
> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >
> >|                     Version 12.0
> >
> >|                      03/19/2012
> >
> >| Implementation by:
> >
> >|                    Ross C. Walker     (SDSC)
> >
> >|                    Scott Le Grand     (nVIDIA)
> >
> >|                    Duncan Poole       (nVIDIA)
> >
> >| CAUTION: The CUDA code is currently experimental.
> >
> >|          You use it at your own risk. Be sure to
> >
> >|          check ALL results carefully.
> >
> >| Precision model in use:
> >
> >|      [SPDP] - Hybrid Single/Double Precision (Default).
> >
> >|------------------- GPU DEVICE INFO --------------------
> >
> >|   CUDA Capable Devices Detected:      1
> >
> >|           CUDA Device ID in use:      0
> >
> >|                CUDA Device Name: Tesla M2090
> >
> >|     CUDA Device Global Mem Size:   5375 MB
> >
> >| CUDA Device Num Multiprocessors:     16
> >
> >|           CUDA Device Core Freq:   1.30 GHz
> >
> >|--------------------------------------------------------
> >
> >| Conditional Compilation Defines Used:
> >
> >| DIRFRC_COMTRANS
> >
> >| DIRFRC_EFS
> >
> >| DIRFRC_NOVEC
> >
> >| PUBFFT
> >
> >| FFTLOADBAL_2PROC
> >
> >| BINTRAJ
> >
> >| CUDA
> >
> >
> >
> >| Largest sphere to fit in unit cell has radius =    47.238
> >
> >| New format PARM file being parsed.
> >
> >| Version =    1.000 Date = 10/04/12 Time = 11:18:48
> >
> >| Note: 1-4 EEL scale factors are being read from the topology file.
> >
> >| Note: 1-4 VDW scale factors are being read from the topology file.
> >
> >| Duplicated    0 dihedrals
> >
> >| Duplicated    0 dihedrals
> >
> >--------------------------------------------------------------------------------
> >
> >   1.  RESOURCE   USE:
> >
> >--------------------------------------------------------------------------------
> >
> > getting new box info from bottom of inpcrd
> >
> > NATOM  =  119092 NTYPES =      20 NBONH =  112233 MBONA  =    6978
> >
> > NTHETH =   14955 MTHETA =    9471 NPHIH =   29675 MPHIA  =   23498
> >
> > NHPARM =       0 NPARM  =       0 NNB   =  214731 NRES   =   36118
> >
> > NBONA  =    6978 NTHETA =    9471 NPHIA =   23498 NUMBND =      78
> >
> > NUMANG =     158 NPTRA  =      71 NATYP =      52 NPHB   =       1
> >
> > IFBOX  =       2 NMXRS  =      44 IFCAP =       0 NEXTRA =       0
> >
> > NCOPY  =       0
> >
> >| Coordinate Index Table dimensions:    18   18   18
> >
> >| Direct force subcell size =     6.4283    6.4283    6.4283
> >
> >     BOX TYPE: TRUNCATED OCTAHEDRON
> >
> >--------------------------------------------------------------------------------
> >
> >   2.  CONTROL  DATA  FOR  THE  RUN
> >
> >--------------------------------------------------------------------------------
> >
> >General flags:
> >
> >     imin    =       0, nmropt  =       0
> >
> >Nature and format of input:
> >
> >     ntx     =       5, irest   =       1, ntrx    =       1
> >
> >Nature and format of output:
> >
> >     ntxo    =       1, ntpr    =     500, ntrx    =       1, ntwr    =    1000
> >
> >     iwrap   =       0, ntwx    =     500, ntwv    =       0, ntwe    =       0
> >
> >     ioutfm  =       0, ntwprt  =       0, idecomp =       0, rbornstat=      0
> >
> >Potential function:
> >
> >     ntf     =       2, ntb     =       2, igb     =       0, nsnb    =      25
> >
> >     ipol    =       0, gbsa    =       0, iesp    =       0
> >
> >     dielc   =   1.00000, cut     =  10.00000, intdiel =   1.00000
> >
> >Frozen or restrained atoms:
> >
> >     ibelly  =       0, ntr     =       0
> >
> >Molecular dynamics:
> >
> >     nstlim  =      1000, nscm    =      1000, nrespa  =         1
> >
> >     t       =   0.00000, dt      =   0.00200, vlimit  =  -1.00000
> >
> >Langevin dynamics temperature regulation:
> >
> >     ig      =   71277
> >
> >     temp0   = 300.00000, tempi   = 300.00000, gamma_ln=   1.00000
> >
> >Pressure regulation:
> >
> >     ntp     =       1
> >
> >     pres0   =   1.00000, comp    =  44.60000, taup    =   1.00000
> >
> >SHAKE:
> >
> >     ntc     =       2, jfastw  =       0
> >
> >     tol     =   0.00001
> >
> >| Intermolecular bonds treatment:
> >
> >|     no_intermolecular_bonds =       1
> >
> >| Energy averages sample interval:
> >
> >|     ene_avg_sampling =     500
> >
> >
> >
> >Ewald parameters:
> >
> >     verbose =       0, ew_type =       0, nbflag  =       1, use_pme =       1
> >
> >     vdwmeth =       1, eedmeth =       1, netfrc  =       1
> >
> >     Box X =  115.709   Box Y =  115.709   Box Z =  115.709
> >
> >     Alpha =  109.471   Beta  =  109.471   Gamma =  109.471
> >
> >     NFFT1 =  120       NFFT2 =  120       NFFT3 =  120
> >
> >     Cutoff=   10.000   Tol   =0.100E-04
> >
> >     Ewald Coefficient =  0.27511
> >
> >     Interpolation order =    4
> >
> >--------------------------------------------------------------------------------
> >
> >   3.  ATOMIC COORDINATES AND VELOCITIES
> >
> >--------------------------------------------------------------------------------
> >
> > begin time read from input coords =   400.000 ps
> >
> > Number of triangulated 3-point waters found:    35215
> >
> >     Sum of charges from parm topology file =  -0.00000042
> >
> >     Forcing neutrality...
> >
> >
> >
> >
> >
> >Am I doing anything wrong?
> >
> >Please tell me if my input has some problem.
> >
> >Someone told me that if there is only one GPU, only one core will work.
> >
> >Please tell me the correct syntax for using pmemd.
> >
> >
> >
> >Also, I don't understand what this really means:
> >
> >if (igb/=0 & cut<systemsize)
> >
> >*GPU accelerated implicit solvent GB simulations do not support a cutoff.*
> >
> >I am using the same input that I used for sander!
> >I will work on the patch.
> >
> >Thanking you
> >
> >
> >On Fri, Mar 15, 2013 at 8:57 PM, Ross Walker <ross.rosswalker.co.uk>
> >wrote:
> >
> >> Hi Mary,
> >>
> >> Please read the following page: http://ambermd.org/gpus/
> >>
> >> This has all the information you should need for running correctly on GPUs.
> >>
> >> All the best
> >> Ross
> >>
> >>
> >>
> >>
> >> On 3/14/13 8:20 PM, "Mary Varughese" <maryvj1985.gmail.com> wrote:
> >>
> >> >Sir,
> >> >
> >> >In fact, as I understand it, this is a single GPU with 24 cores.
> >> >The bugfixes have been applied.
> >> >But I will try the steps you suggested.
> >> >Also, this job runs without any problem on a CPU workstation.
> >> >I hope the input doesn't contain any variable that is incompatible with pmemd!
> >> >
> >> >Thanking you
> >> >
> >> >On Thu, Mar 14, 2013 at 9:16 PM, Ross Walker <ross.rosswalker.co.uk>
> >> >wrote:
> >> >
> >> >> Hi Mary,
> >> >>
> >> >> 8 GPUs is a lot to use; you probably won't get optimal scaling unless
> >> >> you have a very good interconnect and only 1 GPU per node. Some things
> >> >> to try / consider:
> >> >>
> >> >>
> >> >> >|--------------------- INFORMATION ----------------------
> >> >> >
> >> >> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >> >> >
> >> >> >| Version 12.0
> >> >> >
> >> >> >|
> >> >> >
> >> >> >| 03/19/2012
> >> >>
> >> >> You should update your copy of AMBER since there have been many
> >> >> tweaks and bug fixes. Do:
> >> >>
> >> >> cd $AMBERHOME
> >> >> ./patch_amber.py --update
> >> >>
> >> >> Run this until it stops saying there are updates (about 3 or 4
> >> >> times). Then:
> >> >>
> >> >> make clean
> >> >> ./configure gnu
> >> >> make
> >> >> ./configure -mpi gnu
> >> >> make
> >> >> ./configure -cuda gnu
> >> >> make
> >> >> ./configure -cuda -mpi gnu
> >> >> make
> >> >>
> >> >> >begin time read from input coords = 400.000 ps
> >> >> >Number of triangulated 3-point waters found: 35215
> >> >> >Sum of charges from parm topology file = -0.00000042
> >> >> >Forcing neutrality...
> >> >>
> >> >> This happens with the CPU code sometimes - often when the inpcrd /
> >> >> restart file does not contain box information when a periodic
> >> >> simulation is requested. Does it run ok with the CPU code? -
> >> >> Alternatively it may just be running so slow over 8 GPUs that it
> >> >> hasn't even got to 500 steps yet to print anything. Try it with just
> >> >> one GPU and see what happens.
> >> >>
> >> >>
> >> >> All the best
> >> >> Ross
> >> >>
> >> >> /\
> >> >> \/
> >> >> |\oss Walker
> >> >>
> >> >> ---------------------------------------------------------
> >> >> |             Assistant Research Professor              |
> >> >> |            San Diego Supercomputer Center             |
> >> >> |             Adjunct Assistant Professor               |
> >> >> |         Dept. of Chemistry and Biochemistry           |
> >> >> |          University of California San Diego           |
> >> >> |                     NVIDIA Fellow                     |
> >> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org  |
> >> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
> >> >> ---------------------------------------------------------
> >> >>
> >> >> Note: Electronic Mail is not secure, has no guarantee of delivery,
> >> >> may not be read every day, and should not be used for urgent or
> >> >> sensitive issues.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> AMBER mailing list
> >> >> AMBER.ambermd.org
> >> >> http://lists.ambermd.org/mailman/listinfo/amber
> >> >>
> >> >
> >> >
> >> >
> >> >--
> >> >Mary Varughese
> >> >Research Scholar
> >> >School of Pure and Applied Physics
> >> >Mahatma Gandhi University
> >> >India
> >
> >
> >
> >--
> >Mary Varughese
> >Research Scholar
> >School of Pure and Applied Physics
> >Mahatma Gandhi University
> >India
-- 
Mary Varughese
Research Scholar
School of Pure and Applied Physics
Mahatma Gandhi University
India
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 15 2013 - 10:30:03 PDT