Re: [AMBER] PMEMD in amber12

From: Mary Varughese <>
Date: Fri, 15 Mar 2013 21:53:53 +0530


The command that produced the .out file I sent was:

mpirun -n 8 pmemd.cuda.MPI -O -i -p TbNrb.prmtop -c
TbNrb_md8.rst -o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd

*The system I am using has 24 CPU cores and a single GPU with 512 cores.*

*As I understand it now, with a single GPU there is no need for mpdboot and
no need to specify the number of cores to use; depending on the percentage
utilization, two or more programs can be loaded on the GPU, and it is all
thread based.*
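
For reference, the 512 figure agrees with the device information that pmemd prints further down: the Tesla M2090 reports 16 multiprocessors. The sketch below only does that arithmetic. The value of 32 CUDA cores per multiprocessor is an assumption taken from NVIDIA's Fermi-generation specifications, not from the pmemd output, and the names are made up for illustration.

```python
# Minimal sketch: relate the "Num Multiprocessors" line in the pmemd.cuda
# output to the CUDA core count of the card.
# Assumption (not in the log): Fermi-class GPUs such as the Tesla M2090
# have 32 CUDA cores per multiprocessor.
CORES_PER_MULTIPROCESSOR_FERMI = 32


def cuda_cores(num_multiprocessors, cores_per_mp=CORES_PER_MULTIPROCESSOR_FERMI):
    """Return the total number of CUDA cores on the device."""
    return num_multiprocessors * cores_per_mp


if __name__ == "__main__":
    # 16 is the value reported under "GPU DEVICE INFO" in the output.
    print(cuda_cores(16))
```

These GPU cores are separate from the 24 CPU cores of the host; with pmemd.cuda.MPI the mpirun process count is meant to match the number of GPUs, not the number of cores of either kind.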

So I tried this command, since the .MPI version is only for more than one GPU:

mpirun -n 1 pmemd.cuda -O -i -p TbNrb.prmtop -c TbNrb_md8.rst
-o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd

The output stalled at the last step shown below for more than half an hour,
so I terminated the program.


          Amber 11 SANDER 2010


| PMEMD implementation of SANDER, Release 11

| Run on 03/15/2013 at 18:50:47

  [-O]verwriting output

File Assignments:


| MDOUT: TbNrb_md9.out

| INPCRD: TbNrb_md8.rst

| PARM: TbNrb.prmtop

| RESTRT: TbNrb_md9.rst

| REFC: refc

| MDVEL: mdvel

| MDEN: mden

| MDCRD: TbNrb_md9.mdcrd

| MDINFO: mdinfo

 Here is the input file:

Tb-Ntr complex : 200ps MD (production run in NPT)


  imin = 0,

  irest = 1,

  ntx = 5,

  ntb = 2, ntp = 1, pres0 = 1.0,

  cut = 10,

  ntr = 0,

  ntc = 2,

  ntf = 2,

  tempi = 300.0,

  temp0 = 300.0,

  ntt = 3,

  gamma_ln = 1,

  nstlim = 1000, dt = 0.002,

  ntpr = 500, ntwx = 500, ntwr = 1000,


|--------------------- INFORMATION ----------------------

| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.

| Version 12.0

| 03/19/2012

| Implementation by:

| Ross C. Walker (SDSC)

| Scott Le Grand (nVIDIA)

| Duncan Poole (nVIDIA)

| CAUTION: The CUDA code is currently experimental.

| You use it at your own risk. Be sure to

| check ALL results carefully.

| Precision model in use:

| [SPDP] - Hybrid Single/Double Precision (Default).


|------------------- GPU DEVICE INFO --------------------

| CUDA Capable Devices Detected: 1

| CUDA Device ID in use: 0

| CUDA Device Name: Tesla M2090

| CUDA Device Global Mem Size: 5375 MB

| CUDA Device Num Multiprocessors: 16

| CUDA Device Core Freq: 1.30 GHz



| Conditional Compilation Defines Used:








| Largest sphere to fit in unit cell has radius = 47.238

| New format PARM file being parsed.

| Version = 1.000 Date = 10/04/12 Time = 11:18:48

| Note: 1-4 EEL scale factors are being read from the topology file.

| Note: 1-4 VDW scale factors are being read from the topology file.

| Duplicated 0 dihedrals

| Duplicated 0 dihedrals




 getting new box info from bottom of inpcrd

 NATOM = 119092 NTYPES = 20 NBONH = 112233 MBONA = 6978

 NTHETH = 14955 MTHETA = 9471 NPHIH = 29675 MPHIA = 23498

 NHPARM = 0 NPARM = 0 NNB = 214731 NRES = 36118

 NBONA = 6978 NTHETA = 9471 NPHIA = 23498 NUMBND = 78

 NUMANG = 158 NPTRA = 71 NATYP = 52 NPHB = 1

 IFBOX = 2 NMXRS = 44 IFCAP = 0 NEXTRA = 0

 NCOPY = 0

| Coordinate Index Table dimensions: 18 18 18

| Direct force subcell size = 6.4283 6.4283 6.4283





General flags:

     imin = 0, nmropt = 0

Nature and format of input:

     ntx = 5, irest = 1, ntrx = 1

Nature and format of output:

     ntxo = 1, ntpr = 500, ntrx = 1, ntwr =

     iwrap = 0, ntwx = 500, ntwv = 0, ntwe =

     ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat=

Potential function:

     ntf = 2, ntb = 2, igb = 0, nsnb =

     ipol = 0, gbsa = 0, iesp = 0

     dielc = 1.00000, cut = 10.00000, intdiel = 1.00000

Frozen or restrained atoms:

     ibelly = 0, ntr = 0

Molecular dynamics:

     nstlim = 1000, nscm = 1000, nrespa = 1

     t = 0.00000, dt = 0.00200, vlimit = -1.00000

Langevin dynamics temperature regulation:

     ig = 71277

     temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000

Pressure regulation:

     ntp = 1

     pres0 = 1.00000, comp = 44.60000, taup = 1.00000


     ntc = 2, jfastw = 0

     tol = 0.00001

| Intermolecular bonds treatment:

| no_intermolecular_bonds = 1

| Energy averages sample interval:

| ene_avg_sampling = 500

 Ewald parameters:

     verbose = 0, ew_type = 0, nbflag = 1, use_pme =

     vdwmeth = 1, eedmeth = 1, netfrc = 1

     Box X = 115.709 Box Y = 115.709 Box Z = 115.709

     Alpha = 109.471 Beta = 109.471 Gamma = 109.471

     NFFT1 = 120 NFFT2 = 120 NFFT3 = 120

     Cutoff= 10.000 Tol =0.100E-04

     Ewald Coefficient = 0.27511

     Interpolation order = 4




 begin time read from input coords = 400.000 ps

 Number of triangulated 3-point waters found: 35215

     Sum of charges from parm topology file = -0.00000042

     Forcing neutrality...

Then I tried the following. It again stalled at the last step, and I waited
for 45 minutes:

pmemd.cuda -O -i -p TbNrb.prmtop -c TbNrb_md8.rst -o
TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd


          Amber 11 SANDER 2010


| PMEMD implementation of SANDER, Release 11

| Run on 03/15/2013 at 19:14:46

  [-O]verwriting output

File Assignments:


| MDOUT: TbNrb_md9.out

| INPCRD: TbNrb_md8.rst

| PARM: TbNrb.prmtop

| RESTRT: TbNrb_md9.rst

| REFC: refc

| MDVEL: mdvel

| MDEN: mden

| MDCRD: TbNrb_md9.mdcrd

| MDINFO: mdinfo

 Here is the input file:

Tb-Ntr complex : 200ps MD (production run in NPT)


  imin = 0,

  irest = 1,

  ntx = 5,

  ntb = 2, ntp = 1, pres0 = 1.0,

  cut = 10,

  ntr = 0,

  ntc = 2,

  ntf = 2,

  tempi = 300.0,

  temp0 = 300.0,

  ntt = 3,

  gamma_ln = 1,

  nstlim = 1000, dt = 0.002,

  ntpr = 500, ntwx = 500, ntwr = 1000,


|--------------------- INFORMATION ----------------------

| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.

| Version 12.0

| 03/19/2012

| Implementation by:

| Ross C. Walker (SDSC)

| Scott Le Grand (nVIDIA)

| Duncan Poole (nVIDIA)

| CAUTION: The CUDA code is currently experimental.

| You use it at your own risk. Be sure to

| check ALL results carefully.

| Precision model in use:

| [SPDP] - Hybrid Single/Double Precision (Default).

|------------------- GPU DEVICE INFO --------------------

| CUDA Capable Devices Detected: 1

| CUDA Device ID in use: 0

| CUDA Device Name: Tesla M2090

| CUDA Device Global Mem Size: 5375 MB

| CUDA Device Num Multiprocessors: 16

| CUDA Device Core Freq: 1.30 GHz


| Conditional Compilation Defines Used:








| Largest sphere to fit in unit cell has radius = 47.238

| New format PARM file being parsed.

| Version = 1.000 Date = 10/04/12 Time = 11:18:48

| Note: 1-4 EEL scale factors are being read from the topology file.

| Note: 1-4 VDW scale factors are being read from the topology file.

| Duplicated 0 dihedrals

| Duplicated 0 dihedrals




 getting new box info from bottom of inpcrd

 NATOM = 119092 NTYPES = 20 NBONH = 112233 MBONA = 6978

 NTHETH = 14955 MTHETA = 9471 NPHIH = 29675 MPHIA = 23498

 NHPARM = 0 NPARM = 0 NNB = 214731 NRES = 36118

 NBONA = 6978 NTHETA = 9471 NPHIA = 23498 NUMBND = 78

 NUMANG = 158 NPTRA = 71 NATYP = 52 NPHB = 1

 IFBOX = 2 NMXRS = 44 IFCAP = 0 NEXTRA = 0

 NCOPY = 0

| Coordinate Index Table dimensions: 18 18 18

| Direct force subcell size = 6.4283 6.4283 6.4283





General flags:

     imin = 0, nmropt = 0

Nature and format of input:

     ntx = 5, irest = 1, ntrx = 1

Nature and format of output:

     ntxo = 1, ntpr = 500, ntrx = 1, ntwr =

     iwrap = 0, ntwx = 500, ntwv = 0, ntwe =

     ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat=

Potential function:

     ntf = 2, ntb = 2, igb = 0, nsnb =

     ipol = 0, gbsa = 0, iesp = 0

     dielc = 1.00000, cut = 10.00000, intdiel = 1.00000

Frozen or restrained atoms:

     ibelly = 0, ntr = 0

Molecular dynamics:

     nstlim = 1000, nscm = 1000, nrespa = 1

     t = 0.00000, dt = 0.00200, vlimit = -1.00000

Langevin dynamics temperature regulation:

     ig = 71277

     temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000

Pressure regulation:

     ntp = 1

     pres0 = 1.00000, comp = 44.60000, taup = 1.00000


     ntc = 2, jfastw = 0

     tol = 0.00001

| Intermolecular bonds treatment:

| no_intermolecular_bonds = 1

| Energy averages sample interval:

| ene_avg_sampling = 500

Ewald parameters:

     verbose = 0, ew_type = 0, nbflag = 1, use_pme =

     vdwmeth = 1, eedmeth = 1, netfrc = 1

     Box X = 115.709 Box Y = 115.709 Box Z = 115.709

     Alpha = 109.471 Beta = 109.471 Gamma = 109.471

     NFFT1 = 120 NFFT2 = 120 NFFT3 = 120

     Cutoff= 10.000 Tol =0.100E-04

     Ewald Coefficient = 0.27511

     Interpolation order = 4




 begin time read from input coords = 400.000 ps

 Number of triangulated 3-point waters found: 35215

     Sum of charges from parm topology file = -0.00000042

     Forcing neutrality...

Am I doing anything wrong?

Please tell me if there is a problem with my input.
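
One thing that can be checked directly from the input above is how long the run is and when energy output is due. This is a minimal sketch, assuming dt is given in picoseconds (as in the AMBER manual) and that an energy record is written every ntpr steps; the function names are illustrative only.

```python
def simulated_time_ps(nstlim, dt_ps):
    """Total simulated time in ps for nstlim steps of dt_ps each."""
    return nstlim * dt_ps


def print_steps(nstlim, ntpr):
    """Steps at which an energy record is due in mdout."""
    return list(range(ntpr, nstlim + 1, ntpr))


if __name__ == "__main__":
    # Values taken from the mdin shown above.
    print(simulated_time_ps(1000, 0.002))  # total run length in ps
    print(print_steps(1000, 500))          # steps with an energy record
```

With nstlim = 1000 and dt = 0.002 the run covers 2 ps rather than the 200 ps named in the title line, and the first energy record is not due until step 500. That is consistent with Ross's remark that a very slow run may simply not have reached its first print.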

Someone told me that if there is only one GPU, only one core will be used.

Please tell me the correct syntax for running pmemd.
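
As far as the AMBER GPU documentation describes it, the serial GPU binary pmemd.cuda is started directly, while pmemd.cuda.MPI is started through mpirun with one MPI process per GPU. The sketch below only assembles the two command lines. The mdin file name is a placeholder because it is missing from the commands above, and build_pmemd_command is a made-up helper, not part of AMBER.

```python
import shlex


def build_pmemd_command(n_gpus, mdin, prmtop, inpcrd, out_prefix):
    """Assemble a pmemd GPU command line as a list of arguments.

    n_gpus == 1 -> serial GPU binary, no mpirun
    n_gpus  > 1 -> MPI GPU binary, one MPI process per GPU
    """
    if n_gpus < 1:
        raise ValueError("need at least one GPU")
    files = [
        "-O",                         # overwrite existing output files
        "-i", mdin,                   # control input (mdin)
        "-p", prmtop,                 # topology
        "-c", inpcrd,                 # starting coordinates / restart
        "-o", out_prefix + ".out",    # mdout
        "-r", out_prefix + ".rst",    # new restart file
        "-x", out_prefix + ".mdcrd",  # trajectory
    ]
    if n_gpus == 1:
        return ["pmemd.cuda"] + files
    return ["mpirun", "-n", str(n_gpus), "pmemd.cuda.MPI"] + files


if __name__ == "__main__":
    # "MDIN_PLACEHOLDER.in" is hypothetical; substitute the real mdin file.
    cmd = build_pmemd_command(1, "MDIN_PLACEHOLDER.in", "TbNrb.prmtop",
                              "TbNrb_md8.rst", "TbNrb_md9")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

On a machine with a single GPU this yields the plain pmemd.cuda form with no mpirun; the value passed to -n counts GPUs, not CPU cores.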

Also, I don't understand what this really means:

if (igb/=0 & cut<systemsize)

*GPU accelerated implicit solvent GB simulations do not support a cutoff.*
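
Read as Fortran-style pseudocode, "/=" means "not equal", so the condition says: if a generalized Born implicit-solvent model is selected (igb nonzero) and the cutoff is smaller than the system, the GPU code does not support that combination. The sketch below is one minimal reading of it; using the box edge as the "system size" is an assumption made here for illustration, and the function name is hypothetical.

```python
def gb_cutoff_unsupported(igb, cut, system_size):
    """True when the quoted condition (igb/=0 & cut<systemsize) holds."""
    return igb != 0 and cut < system_size


if __name__ == "__main__":
    # Values echoed in the output above: igb = 0, cut = 10.0, box edge 115.709.
    print(gb_cutoff_unsupported(0, 10.0, 115.709))
```

For this run the echoed settings give igb = 0 (explicit solvent with PME), so under this reading the restriction would not apply.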

I am using the same input that I used for sander!
I will work on the patch.

Thanking you

On Fri, Mar 15, 2013 at 8:57 PM, Ross Walker <> wrote:

> Hi Mary,
> Please read the following page:
> This has all the information you should need for running correctly on GPUs.
> All the best
> Ross
> On 3/14/13 8:20 PM, "Mary Varughese" <> wrote:
> >Sir,
> >
> >In fact, this is a single GPU with 24 cores, as I understand it.
> >The bug fixes have been applied.
> >But I will try the steps you suggested.
> >Also, this job runs without any problem on a CPU workstation.
> >I hope the input doesn't contain any variable that is incompatible with pmemd!
> >
> >Thanking you
> >
> >On Thu, Mar 14, 2013 at 9:16 PM, Ross Walker <>
> >wrote:
> >
> >> Hi Mary,
> >>
> >> 8 GPUs is a lot to use; you probably won't get optimal scaling unless you
> >> have a very good interconnect and only 1 GPU per node. Some things to try /
> >> consider:
> >>
> >>
> >> >|--------------------- INFORMATION ----------------------
> >> >
> >> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >> >
> >> >| Version 12.0
> >> >
> >> >|
> >> >
> >> >| 03/19/2012
> >>
> >> You should update your copy of AMBER since there have been many tweaks and
> >> bug fixes. Do:
> >>
> >> cd $AMBERHOME
> >> ./ --update
> >>
> >> Run this until it stops saying there are updates (about 3 or 4 times). Then
> >>
> >> make clean
> >> ./configure gnu
> >> make
> >> ./configure -mpi gnu
> >> make
> >> ./configure -cuda gnu
> >> make
> >> ./configure -cuda -mpi gnu
> >> make
> >>
> >> >begin time read from input coords = 400.000 ps
> >> >Number of triangulated 3-point waters found: 35215
> >> >Sum of charges from parm topology file = -0.00000042
> >> >Forcing neutrality...
> >>
> >> This happens with the CPU code sometimes - often when the inpcrd / restart
> >> file does not contain box information when a periodic simulation is
> >> requested. Does it run ok with the CPU code? - Alternatively it may just
> >> be running so slow over 8 GPUs that it hasn't even got to 500 steps yet to
> >> print anything. Try it with just one GPU and see what happens.
> >>
> >>
> >> All the best
> >> Ross
> >>
> >> /\
> >> \/
> >> |\oss Walker
> >>
> >> ---------------------------------------------------------
> >> | Assistant Research Professor |
> >> | San Diego Supercomputer Center |
> >> | Adjunct Assistant Professor |
> >> | Dept. of Chemistry and Biochemistry |
> >> | University of California San Diego |
> >> | NVIDIA Fellow |
> >> | | |
> >> | Tel: +1 858 822 0854 | EMail:- |
> >> ---------------------------------------------------------
> >>
> >> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> >> be read every day, and should not be used for urgent or sensitive issues.
> >>
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> AMBER mailing list
> >>
> >>
> >>
> >
> >
> >
> >--
> >Mary Varughese
> >Research Scholar
> >School of Pure and Applied Physics
> >Mahatma Gandhi University
> >India

Mary Varughese
Research Scholar
School of Pure and Applied Physics
Mahatma Gandhi University
Received on Fri Mar 15 2013 - 09:30:03 PDT