Re: [AMBER] PMEMD in amber12

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 15 Mar 2013 09:27:31 -0700

Dear Mary,

Ok, firstly, you have not followed the advice to update the software, so you
are still running the original release version, bugs and all.

For running calculations on just 1 GPU you do not need mpirun at all. So you
would do:

$AMBERHOME/bin/pmemd.cuda -O -i ..

Note the 'pmemd.cuda' does NOT have the .MPI here.
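
For example, using the file names from your run below, the full command would
look something like this (adjust the path if the binary is not already on your
PATH):

# single GPU run: note there is no mpirun and no .MPI suffix
pmemd.cuda -O -i TbNrb_md9.in -p TbNrb.prmtop -c TbNrb_md8.rst \
           -o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd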

All the best
Ross




On 3/15/13 9:23 AM, "Mary Varughese" <maryvj1985.gmail.com> wrote:

>Sir,
>
>
>
>The command that produced the .out file I have sent is:
>
>mpirun -n 8 pmemd.cuda.MPI -O -i TbNrb_md9.in -p TbNrb.prmtop -c
>TbNrb_md8.rst -o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
>
>
>
>The system I am using has 24 CPUs and a single GPU with 512 cores.
>
>As I understand it now, with a single GPU there is no need for mpdboot and no
>need to specify the number of cores to be used; depending on the % utilization
>you can load two or more programs, and it is all thread based.
>
>
>
>So I tried this command, since .MPI works only with more than 1 GPU:
>
>
>
>mpirun -n 1 pmemd.cuda -O -i TbNrb_md9.in -p TbNrb.prmtop -c TbNrb_md8.rst
>-o TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
>
>
>
>the run stalled at the last step for more than half an hour, so I
>terminated the program.
>
>
>
> -------------------------------------------------------
>
> Amber 11 SANDER 2010
>
> -------------------------------------------------------
>
>| PMEMD implementation of SANDER, Release 11
>
>| Run on 03/15/2013 at 18:50:47
>
> [-O]verwriting output
>
>File Assignments:
>
>| MDIN: TbNrb_md9.in
>| MDOUT: TbNrb_md9.out
>| INPCRD: TbNrb_md8.rst
>| PARM: TbNrb.prmtop
>| RESTRT: TbNrb_md9.rst
>| REFC: refc
>| MDVEL: mdvel
>| MDEN: mden
>| MDCRD: TbNrb_md9.mdcrd
>| MDINFO: mdinfo
>
>
> Here is the input file:
>
>Tb-Ntr complex : 200ps MD (production run in NPT)
> &cntrl
> imin = 0,
> irest = 1,
> ntx = 5,
> ntb = 2, ntp = 1, pres0 = 1.0,
> cut = 10,
> ntr = 0,
> ntc = 2,
> ntf = 2,
> tempi = 300.0,
> temp0 = 300.0,
> ntt = 3,
> gamma_ln = 1,
> nstlim = 1000, dt = 0.002,
> ntpr = 500, ntwx = 500, ntwr = 1000,
> /
>
>
>|--------------------- INFORMATION ----------------------
>
>| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>
>| Version 12.0
>
>| 03/19/2012
>
>| Implementation by:
>
>| Ross C. Walker (SDSC)
>
>| Scott Le Grand (nVIDIA)
>
>| Duncan Poole (nVIDIA)
>
>| CAUTION: The CUDA code is currently experimental.
>
>| You use it at your own risk. Be sure to
>
>| check ALL results carefully.
>
>| Precision model in use:
>
>| [SPDP] - Hybrid Single/Double Precision (Default).
>
>|--------------------------------------------------------
>
>|------------------- GPU DEVICE INFO --------------------
>
>| CUDA Capable Devices Detected: 1
>
>| CUDA Device ID in use: 0
>
>| CUDA Device Name: Tesla M2090
>
>| CUDA Device Global Mem Size: 5375 MB
>
>| CUDA Device Num Multiprocessors: 16
>
>| CUDA Device Core Freq: 1.30 GHz
>
>|
>
>|--------------------------------------------------------
>
>| Conditional Compilation Defines Used:
>
>| DIRFRC_COMTRANS
>
>| DIRFRC_EFS
>
>| DIRFRC_NOVEC
>
>| PUBFFT
>
>| FFTLOADBAL_2PROC
>
>| BINTRAJ
>
>| CUDA
>
>| Largest sphere to fit in unit cell has radius = 47.238
>
>| New format PARM file being parsed.
>
>| Version = 1.000 Date = 10/04/12 Time = 11:18:48
>
>| Note: 1-4 EEL scale factors are being read from the topology file.
>
>| Note: 1-4 VDW scale factors are being read from the topology file.
>
>| Duplicated 0 dihedrals
>
>| Duplicated 0 dihedrals
>
>--------------------------------------------------------------------------
>------
>
> 1. RESOURCE USE:
>
>--------------------------------------------------------------------------
>------
>
> getting new box info from bottom of inpcrd
>
> NATOM = 119092 NTYPES = 20 NBONH = 112233 MBONA = 6978
>
> NTHETH = 14955 MTHETA = 9471 NPHIH = 29675 MPHIA = 23498
>
> NHPARM = 0 NPARM = 0 NNB = 214731 NRES = 36118
>
> NBONA = 6978 NTHETA = 9471 NPHIA = 23498 NUMBND = 78
>
> NUMANG = 158 NPTRA = 71 NATYP = 52 NPHB = 1
>
> IFBOX = 2 NMXRS = 44 IFCAP = 0 NEXTRA = 0
>
> NCOPY = 0
>
>| Coordinate Index Table dimensions: 18 18 18
>
>| Direct force subcell size = 6.4283 6.4283 6.4283
>
> BOX TYPE: TRUNCATED OCTAHEDRON
>
>--------------------------------------------------------------------------
>------
>
> 2. CONTROL DATA FOR THE RUN
>
>--------------------------------------------------------------------------
>------
>
>
>General flags:
>
> imin = 0, nmropt = 0
>
>Nature and format of input:
>
> ntx = 5, irest = 1, ntrx = 1
>
>Nature and format of output:
>
> ntxo = 1, ntpr = 500, ntrx = 1, ntwr = 1000
>
> iwrap = 0, ntwx = 500, ntwv = 0, ntwe = 0
>
> ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0
>
>Potential function:
>
> ntf = 2, ntb = 2, igb = 0, nsnb = 25
>
> ipol = 0, gbsa = 0, iesp = 0
>
> dielc = 1.00000, cut = 10.00000, intdiel = 1.00000
>
>Frozen or restrained atoms:
>
> ibelly = 0, ntr = 0
>
>Molecular dynamics:
>
> nstlim = 1000, nscm = 1000, nrespa = 1
>
> t = 0.00000, dt = 0.00200, vlimit = -1.00000
>
>Langevin dynamics temperature regulation:
>
> ig = 71277
>
> temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000
>
>Pressure regulation:
>
> ntp = 1
>
> pres0 = 1.00000, comp = 44.60000, taup = 1.00000
>
>SHAKE:
>
> ntc = 2, jfastw = 0
>
> tol = 0.00001
>
>| Intermolecular bonds treatment:
>
>| no_intermolecular_bonds = 1
>
>| Energy averages sample interval:
>
>| ene_avg_sampling = 500
>
> Ewald parameters:
>
> verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
>
> vdwmeth = 1, eedmeth = 1, netfrc = 1
>
> Box X = 115.709 Box Y = 115.709 Box Z = 115.709
>
> Alpha = 109.471 Beta = 109.471 Gamma = 109.471
>
> NFFT1 = 120 NFFT2 = 120 NFFT3 = 120
>
> Cutoff= 10.000 Tol =0.100E-04
>
> Ewald Coefficient = 0.27511
>
> Interpolation order = 4
>
>--------------------------------------------------------------------------
>------
>
> 3. ATOMIC COORDINATES AND VELOCITIES
>
>--------------------------------------------------------------------------
>------
>
> begin time read from input coords = 400.000 ps
>
> Number of triangulated 3-point waters found: 35215
>
> Sum of charges from parm topology file = -0.00000042
>
> Forcing neutrality...
>
>Then I tried this; again it sat waiting at the last step, and after 45
>minutes I terminated it.
>
>
>
>pmemd.cuda -O -i TbNrb_md9.in -p TbNrb.prmtop -c TbNrb_md8.rst -o
>TbNrb_md9.out -r TbNrb_md9.rst -x TbNrb_md9.mdcrd
>
>
>
> -------------------------------------------------------
>
> Amber 11 SANDER 2010
>
> -------------------------------------------------------
>
>| PMEMD implementation of SANDER, Release 11
>
>| Run on 03/15/2013 at 19:14:46
>
> [-O]verwriting output
>
>File Assignments:
>
>| MDIN: TbNrb_md9.in
>| MDOUT: TbNrb_md9.out
>| INPCRD: TbNrb_md8.rst
>| PARM: TbNrb.prmtop
>| RESTRT: TbNrb_md9.rst
>| REFC: refc
>| MDVEL: mdvel
>| MDEN: mden
>| MDCRD: TbNrb_md9.mdcrd
>| MDINFO: mdinfo
>
>
> Here is the input file:
>
>Tb-Ntr complex : 200ps MD (production run in NPT)
> &cntrl
> imin = 0,
> irest = 1,
> ntx = 5,
> ntb = 2, ntp = 1, pres0 = 1.0,
> cut = 10,
> ntr = 0,
> ntc = 2,
> ntf = 2,
> tempi = 300.0,
> temp0 = 300.0,
> ntt = 3,
> gamma_ln = 1,
> nstlim = 1000, dt = 0.002,
> ntpr = 500, ntwx = 500, ntwr = 1000,
> /
>
>
>|--------------------- INFORMATION ----------------------
>
>| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>
>| Version 12.0
>
>| 03/19/2012
>
>| Implementation by:
>
>| Ross C. Walker (SDSC)
>
>| Scott Le Grand (nVIDIA)
>
>| Duncan Poole (nVIDIA)
>
>| CAUTION: The CUDA code is currently experimental.
>
>| You use it at your own risk. Be sure to
>
>| check ALL results carefully.
>
>| Precision model in use:
>
>| [SPDP] - Hybrid Single/Double Precision (Default).
>
>|------------------- GPU DEVICE INFO --------------------
>
>| CUDA Capable Devices Detected: 1
>
>| CUDA Device ID in use: 0
>
>| CUDA Device Name: Tesla M2090
>
>| CUDA Device Global Mem Size: 5375 MB
>
>| CUDA Device Num Multiprocessors: 16
>
>| CUDA Device Core Freq: 1.30 GHz
>
>|--------------------------------------------------------
>
>| Conditional Compilation Defines Used:
>
>| DIRFRC_COMTRANS
>
>| DIRFRC_EFS
>
>| DIRFRC_NOVEC
>
>| PUBFFT
>
>| FFTLOADBAL_2PROC
>
>| BINTRAJ
>
>| CUDA
>
>
>
>| Largest sphere to fit in unit cell has radius = 47.238
>
>| New format PARM file being parsed.
>
>| Version = 1.000 Date = 10/04/12 Time = 11:18:48
>
>| Note: 1-4 EEL scale factors are being read from the topology file.
>
>| Note: 1-4 VDW scale factors are being read from the topology file.
>
>| Duplicated 0 dihedrals
>
>| Duplicated 0 dihedrals
>
>--------------------------------------------------------------------------
>------
>
> 1. RESOURCE USE:
>
>--------------------------------------------------------------------------
>------
>
> getting new box info from bottom of inpcrd
>
> NATOM = 119092 NTYPES = 20 NBONH = 112233 MBONA = 6978
>
> NTHETH = 14955 MTHETA = 9471 NPHIH = 29675 MPHIA = 23498
>
> NHPARM = 0 NPARM = 0 NNB = 214731 NRES = 36118
>
> NBONA = 6978 NTHETA = 9471 NPHIA = 23498 NUMBND = 78
>
> NUMANG = 158 NPTRA = 71 NATYP = 52 NPHB = 1
>
> IFBOX = 2 NMXRS = 44 IFCAP = 0 NEXTRA = 0
>
> NCOPY = 0
>
>| Coordinate Index Table dimensions: 18 18 18
>
>| Direct force subcell size = 6.4283 6.4283 6.4283
>
> BOX TYPE: TRUNCATED OCTAHEDRON
>
>--------------------------------------------------------------------------
>------
>
> 2. CONTROL DATA FOR THE RUN
>
>--------------------------------------------------------------------------
>------
>
>General flags:
>
> imin = 0, nmropt = 0
>
>Nature and format of input:
>
> ntx = 5, irest = 1, ntrx = 1
>
>Nature and format of output:
>
> ntxo = 1, ntpr = 500, ntrx = 1, ntwr = 1000
>
> iwrap = 0, ntwx = 500, ntwv = 0, ntwe = 0
>
> ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0
>
>Potential function:
>
> ntf = 2, ntb = 2, igb = 0, nsnb = 25
>
> ipol = 0, gbsa = 0, iesp = 0
>
> dielc = 1.00000, cut = 10.00000, intdiel = 1.00000
>
>Frozen or restrained atoms:
>
> ibelly = 0, ntr = 0
>
>Molecular dynamics:
>
> nstlim = 1000, nscm = 1000, nrespa = 1
>
> t = 0.00000, dt = 0.00200, vlimit = -1.00000
>
>Langevin dynamics temperature regulation:
>
> ig = 71277
>
> temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000
>
>Pressure regulation:
>
> ntp = 1
>
> pres0 = 1.00000, comp = 44.60000, taup = 1.00000
>
>SHAKE:
>
> ntc = 2, jfastw = 0
>
> tol = 0.00001
>
>| Intermolecular bonds treatment:
>
>| no_intermolecular_bonds = 1
>
>| Energy averages sample interval:
>
>| ene_avg_sampling = 500
>
>
>
>Ewald parameters:
>
> verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
>
> vdwmeth = 1, eedmeth = 1, netfrc = 1
>
> Box X = 115.709 Box Y = 115.709 Box Z = 115.709
>
> Alpha = 109.471 Beta = 109.471 Gamma = 109.471
>
> NFFT1 = 120 NFFT2 = 120 NFFT3 = 120
>
> Cutoff= 10.000 Tol =0.100E-04
>
> Ewald Coefficient = 0.27511
>
> Interpolation order = 4
>
>--------------------------------------------------------------------------
>------
>
> 3. ATOMIC COORDINATES AND VELOCITIES
>
>--------------------------------------------------------------------------------
>
> begin time read from input coords = 400.000 ps
>
> Number of triangulated 3-point waters found: 35215
>
> Sum of charges from parm topology file = -0.00000042
>
> Forcing neutrality...
>
>
>
>
>
>Am I doing anything wrong?
>
>Please tell me if my input has some problem.
>
>Someone told me that if there is only one GPU, only one core will work.
>
>Please tell me the correct syntax for running pmemd.
>
>
>
>Also I don't understand what this really means:
>
>if (igb/=0 & cut<systemsize)
>
>*GPU accelerated implicit solvent GB simulations do not support a cutoff.*
>
>I am using the same input I used for sander!
>I will work on the patches.
>
>Thanking you
>
>
>On Fri, Mar 15, 2013 at 8:57 PM, Ross Walker <ross.rosswalker.co.uk>
>wrote:
>
>> Hi Mary,
>>
>> Please read the following page: http://ambermd.org/gpus/
>>
>> This has all the information you should need for running correctly on
>>GPUs.
>>
>> All the best
>> Ross
>>
>>
>>
>>
>> On 3/14/13 8:20 PM, "Mary Varughese" <maryvj1985.gmail.com> wrote:
>>
>> >Sir,
>> >
>> >In fact this is a single GPU with 24 cores, as I understand it.
>> >The bugfixes have been done.
>> >But I will try the steps you suggested.
>> >Also, this job runs without any problem on the CPU workstation.
>> >I hope the input doesn't contain any variable that is not compatible with pmemd!
>> >
>> >Thanking you
>> >
>> >On Thu, Mar 14, 2013 at 9:16 PM, Ross Walker <ross.rosswalker.co.uk>
>> >wrote:
>> >
>> >> Hi Mary,
>> >>
>> >> 8 GPUs is a lot to use; you probably won't get optimal scaling unless you
>> >> have a very good interconnect and only 1 GPU per node. Some things to try /
>> >> consider:
>> >>
>> >>
>> >> >|--------------------- INFORMATION ----------------------
>> >> >
>> >> >| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>> >> >
>> >> >| Version 12.0
>> >> >
>> >> >|
>> >> >
>> >> >| 03/19/2012
>> >>
>> >> You should update your copy of AMBER since there have been many tweaks and
>> >> bug fixes. Do:
>> >>
>> >> cd $AMBERHOME
>> >> ./patch_amber.py --update
>> >>
>> >> Run this until it stops saying there are updates (about 3 or 4 times). Then
>> >>
>> >> make clean
>> >> ./configure gnu
>> >> make
>> >> ./configure -mpi gnu
>> >> make
>> >> ./configure -cuda gnu
>> >> make
>> >> ./configure -cuda -mpi gnu
>> >> make
>> >>
>> >> >begin time read from input coords = 400.000 ps
>> >> >Number of triangulated 3-point waters found: 35215
>> >> >Sum of charges from parm topology file = -0.00000042
>> >> >Forcing neutrality...
>> >>
>> >> This happens with the CPU code sometimes - often when the inpcrd / restart
>> >> file does not contain box information when a periodic simulation is
>> >> requested. Does it run ok with the CPU code? - Alternatively it may just be
>> >> running so slowly over 8 GPUs that it hasn't even got to 500 steps yet to
>> >> print anything. Try it with just one GPU and see what happens.
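>> >>
>> >> One quick way to check for the box information (assuming an ASCII format
>> >> restart file) is to look at its last line, which for a periodic run should
>> >> hold the three box lengths followed by the three angles, e.g. something
>> >> like:
>> >>
>> >> # final line of a periodic ASCII restart: box lengths then box angles
>> >> tail -1 TbNrb_md8.rst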
>> >>
>> >>
>> >> All the best
>> >> Ross
>> >>
>> >> /\
>> >> \/
>> >> |\oss Walker
>> >>
>> >> ---------------------------------------------------------
>> >> | Assistant Research Professor |
>> >> | San Diego Supercomputer Center |
>> >> | Adjunct Assistant Professor |
>> >> | Dept. of Chemistry and Biochemistry |
>> >> | University of California San Diego |
>> >> | NVIDIA Fellow |
>> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
>> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>> >> ---------------------------------------------------------
>> >>
>> >> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>> >> be read every day, and should not be used for urgent or sensitive issues.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> AMBER mailing list
>> >> AMBER.ambermd.org
>> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >
>> >
>> >
>> >--
>> >Mary Varughese
>> >Research Scholar
>> >School of Pure and Applied Physics
>> >Mahatma Gandhi University
>> >India
>> >_______________________________________________
>> >AMBER mailing list
>> >AMBER.ambermd.org
>> >http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
>
>--
>Mary Varughese
>Research Scholar
>School of Pure and Applied Physics
>Mahatma Gandhi University
>India
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 15 2013 - 09:30:05 PDT