Re: [AMBER] change in density on multiple gpus

From: Neha Gandhi <n.gandhiau.gmail.com>
Date: Sat, 11 Nov 2017 11:26:51 +1000

Hi Ross,

I think GaMD has a similar issue on parallel GPUs, if I remember correctly
from my experience earlier this year.

Thank you,

Cheers,
Neha

On 10 November 2017 at 21:08, Ross Walker <ross.rosswalker.co.uk> wrote:

> Hi Neha,
>
> Thanks for testing. I'm glad to see the problem is isolated to a specific
> option. My suspicion would be ScaledMD rather than restraints but we'll
> take a look and see where the problem is.
>
> All the best
> Ross
>
> > On Nov 9, 2017, at 01:22, Neha Gandhi <n.gandhiau.gmail.com> wrote:
> >
> > Hi Ross,
> >
> > I tried a small test run with a peptide in a cubic box, with no restraints
> > and no SMD. Using parallel GPUs (M40), the average density is correct, so
> > either SMD or the restraints are the problem in parallel.
> >
> > NSTEP =     1000   TIME(PS) =    1002.000   TEMP(K) =   291.84   PRESS =     38.3
> > Etot   =   -123129.5273   EKtot   =    20192.0117   EPtot   =   -143321.5390
> > BOND   =        47.5188   ANGLE   =      184.6402   DIHED   =       177.1934
> > 1-4 NB =        67.0236   1-4 EEL =      680.7912   VDWAALS =     20459.2485
> > EELEC  =   -164937.9546   EHBOND  =        0.0000   RESTRAINT =       0.0000
> > EKCMT  =     10055.8297   VIRIAL  =     9768.3861   VOLUME  =    347191.7786
> >                                                     Density =         1.0015
> > --------------------------------------------------------------------------------
> >
> > wrapping first mol.: 25.57034 36.16192 62.63428
> >
> > NSTEP =     2000   TIME(PS) =    1004.000   TEMP(K) =   293.17   PRESS =     38.2
> > Etot   =   -122959.5146   EKtot   =    20284.1270   EPtot   =   -143243.6415
> > BOND   =        61.8348   ANGLE   =      170.9486   DIHED   =       174.6173
> > 1-4 NB =        60.0539   1-4 EEL =      673.0535   VDWAALS =     20535.7533
> > EELEC  =   -164919.9029   EHBOND  =        0.0000   RESTRAINT =       0.0000
> > EKCMT  =     10090.8180   VIRIAL  =     9803.9675   VOLUME  =    347496.5420
> >                                                     Density =         1.0007
> > --------------------------------------------------------------------------------
> >
> > Cheers then,
> > Neha
> >
> > On 9 November 2017 at 07:33, Ross Walker <ross.rosswalker.co.uk> wrote:
> >
> >> Hi Neha,
> >>
> >> The best approach on systems that require you to request an entire node
> >> is just to run two separate jobs, e.g.:
> >>
> >> cd job1
> >> $AMBERHOME/bin/pmemd.cuda -O -i mdin -o mdout ...... &
> >> cd ../job2
> >> $AMBERHOME/bin/pmemd.cuda -O -i mdin -o mdout ...... &
> >> wait
> >>
> >> The key here is the keyword wait, which stops the script from returning
> >> until both jobs have completed. If the jobs are the same system run with
> >> the same input, just with different random seeds to get improved sampling,
> >> they should take almost identical time, so the script load balances well.
> >> The other option is to write a loop in here, with the wait statement, that
> >> loops the 2 calculations over a series of steps - say lots of 50 ns
> >> production segments one after the other. That way, if one job finishes
> >> quickly, that GPU gets given more work to do while the other one is still
> >> running.
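> >>
> >> For example, here is a minimal sketch of such a looped script (the segment
> >> count, file names, the seg0.rst starting restart and the GPU IDs are
> >> illustrative assumptions, not taken from your run):
> >>
> >> run_segments() {
> >>   # Run a chain of short production segments in one directory on one GPU.
> >>   local dir=$1 gpu=$2
> >>   cd "$dir"
> >>   for seg in 1 2 3 4; do
> >>     CUDA_VISIBLE_DEVICES=$gpu $AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop \
> >>       -c seg$((seg-1)).rst -o seg${seg}.out -r seg${seg}.rst -x seg${seg}.nc
> >>   done
> >> }
> >>
> >> run_segments job1 0 &
> >> run_segments job2 1 &
> >> wait
> >>
> >> Because each background job works through its own series of segments, a GPU
> >> that finishes a segment early simply moves on to its next one rather than
> >> sitting idle.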
> >>
> >> With regard to the problem you are seeing when using both GPUs for a
> >> single calculation, though, this definitely looks like some kind of bug.
> >> It may be related to the scaledMD. Can you try this again, leaving out all
> >> the options after ig=-1 (no restraints and no scaled MD), and see if you
> >> get the same difference between single GPU and multi-GPU?
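> >>
> >> For reference, a minimal sketch of that stripped-down test input - it is
> >> simply the &cntrl namelist from your email below with the restraint and
> >> scaledMD options removed:
> >>
> >> &cntrl
> >>   imin=0, irest=1, ntx=5,
> >>   nstlim=5000000, dt=0.002,
> >>   ntc=2, ntf=2,
> >>   cut=12.0, ntb=2, ntp=1,
> >>   taup=1,
> >>   ntpr=5000, ntwx=5000, ntwr=5000,
> >>   ntt=3, gamma_ln=5.0,
> >>   temp0=310.0, iwrap=1, ig=-1,
> >>  /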
> >>
> >> All the best
> >> Ross
> >>
> >>> On Nov 8, 2017, at 4:24 PM, Neha Gandhi <n.gandhiau.gmail.com> wrote:
> >>>
> >>> Hi Ross,
> >>>
> >>> I understand that a single GPU performs better than parallel GPUs for
> >>> some systems. However, we are required to request 2 GPUs per job on most
> >>> supercomputing facilities in Australia, even if the job will only use a
> >>> single GPU.
> >>>
> >>> The specification of my job is below
> >>>
> >>> AMBER version 16.
> >>>
> >>> &cntrl
> >>>   imin=0, irest=1, ntx=5,
> >>>   nstlim=5000000, dt=0.002,
> >>>   ntc=2, ntf=2,
> >>>   cut=12.0, ntb=2, ntp=1,
> >>>   taup=1,
> >>>   ntpr=5000, ntwx=5000, ntwr=5000,
> >>>   ntt=3, gamma_ln=5.0,
> >>>   temp0=310.0, iwrap=1, ig=-1,
> >>>   ntr=1,
> >>>   restraintmask=':CNT',
> >>>   restraint_wt=2.0,
> >>>   scaledMD=1,
> >>>   scaledMD_lambda=0.70,
> >>>  /
> >>>
> >>>
> >>>
> >>> Jobscript on HPC
> >>>
> >>> #!/bin/bash -l
> >>> #PBS -N cnt2
> >>> #PBS -l walltime=24:00:00
> >>> #PBS -l select=1:ncpus=2:ngpus=2:mpiprocs=2:gputype=M40:mem=10gb
> >>> #PBS -j oe
> >>>
> >>> module purge
> >>> module load amber/16-iomkl-2016.09-gcc-4.9.3-2.25-ambertools-16-patchlevel-5-14-cuda
> >>>
> >>> cd $PBS_O_WORKDIR
> >>> mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i smd.in -o smd1.out \
> >>>   -p solvated.prmtop -c npt1.rst -r smd1.rst -x smd1.netcdf -ref npt1.rst
> >>>
> >>>
> >>> |------------------- GPU DEVICE INFO --------------------
> >>> |
> >>> | Task ID: 0
> >>> | CUDA_VISIBLE_DEVICES: 0,1
> >>> | CUDA Capable Devices Detected: 2
> >>> | CUDA Device ID in use: 0
> >>> | CUDA Device Name: Tesla M40 24GB
> >>> | CUDA Device Global Mem Size: 22971 MB
> >>> | CUDA Device Num Multiprocessors: 24
> >>> | CUDA Device Core Freq: 1.11 GHz
> >>> |
> >>> |
> >>> | Task ID: 1
> >>> | CUDA_VISIBLE_DEVICES: 0,1
> >>> | CUDA Capable Devices Detected: 2
> >>> | CUDA Device ID in use: 1
> >>> | CUDA Device Name: Tesla M40 24GB
> >>> | CUDA Device Global Mem Size: 22971 MB
> >>> | CUDA Device Num Multiprocessors: 24
> >>> | CUDA Device Core Freq: 1.11 GHz
> >>> |
> >>>
> >>> I can confirm that the same job on a single GPU shows a density of 0.99.
> >>>
> >>> NSTEP =     5000   TIME(PS) =   40660.000   TEMP(K) =   310.08   PRESS =     28.4
> >>> Etot   =   -267200.5515   EKtot   =   113029.5000   EPtot   =   -380230.0515
> >>> BOND   =      2973.9049   ANGLE   =     3436.4722   DIHED   =      5568.2840
> >>> 1-4 NB =      5125.8320   1-4 EEL =     9867.1070   VDWAALS =     76308.4188
> >>> EELEC  =   -646585.5951   EHBOND  =        0.0000   RESTRAINT =     119.7885
> >>> EAMBER (non-restraint)  =   -380349.8399
> >>> EKCMT  =     54095.5514   VIRIAL  =    52953.5751   VOLUME  =   1860086.9230
> >>>                                                     Density =         0.9909
> >>> --------------------------------------------------------------------------------
> >>>
> >>>
> >>> I also tried a parallel job on Pascal GPUs on a different HPC system, but
> >>> there are the same issues with the density on parallel GPUs.
> >>>
> >>> Thanks,
> >>> Neha
> >>>
> >>> On 9 November 2017 at 00:21, Ross Walker <rosscwalker.gmail.com> wrote:
> >>>
> >>>> Hi Neha,
> >>>>
> >>>> There should be no difference in the density - or any of the other
> >>>> properties - between using a single GPU and multiple GPUs. Can you
> >>>> confirm that if you restart the calculation on a single GPU from the
> >>>> restart file at the 40.65 ns stage you show below, the simulation
> >>>> continues at a density of approximately 0.99?
> >>>>
> >>>> The key is to isolate whether the problem comes just from using more
> >>>> than 1 GPU, and not from some other issue such as an incorrect setting
> >>>> in the mdin file, for example.
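> >>>>
> >>>> For example, a minimal sketch of such a single-GPU continuation (the
> >>>> input, prmtop and restart file names below are placeholders, not your
> >>>> actual file names):
> >>>>
> >>>> # Restart the same calculation on one GPU from the multi-GPU restart file.
> >>>> # The -ref file is included because the run uses ntr=1 positional restraints.
> >>>> $AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c restart_40.65ns.rst \
> >>>>   -ref refc -o mdout.singlegpu -r restrt.singlegpu -x mdcrd.singlegpu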
> >>>>
> >>>> It would also help if you could provide some more details about your
> >>>> setup: GPU model, AMBER version, etc.
> >>>>
> >>>> Note, as an aside, running a single simulation on multiple GPUs is not
> >>>> always faster, so you might want to check that you actually get a speed
> >>>> improvement from using more than 1 GPU at once. That is separate from
> >>>> the issue you are reporting, though, since even if it runs slower on
> >>>> multiple GPUs it shouldn't give incorrect answers.
> >>>>
> >>>> All the best
> >>>> Ross
> >>>>
> >>>>> On Nov 8, 2017, at 1:15 AM, Neha Gandhi <n.gandhiau.gmail.com> wrote:
> >>>>>
> >>>>> Dear List,
> >>>>>
> >>>>> I am running an NPT simulation using parallel GPUs. When using
> >>>>> pmemd.cuda on a single GPU, the density is 0.99.
> >>>>>
> >>>>>
> >>>>> A V E R A G E S O V E R 2000 S T E P S
> >>>>>
> >>>>>
> >>>>> NSTEP = 10000000   TIME(PS) =   40650.000   TEMP(K) =   309.99   PRESS =      1.8
> >>>>> Etot   =   -461754.3426   EKtot   =   120156.4717   EPtot   =   -581910.8142
> >>>>> BOND   =      2213.1206   ANGLE   =     2543.5550   DIHED   =      5043.7658
> >>>>> 1-4 NB =      5223.7909   1-4 EEL =     9977.1082   VDWAALS =     81530.0154
> >>>>> EELEC  =   -688536.8366   EHBOND  =        0.0000   RESTRAINT =      94.6664
> >>>>> EAMBER (non-restraint)  =   -582005.4806
> >>>>> EKCMT  =     57615.1448   VIRIAL  =    57540.2827   VOLUME  =   1978006.8374
> >>>>>                                                     Density =         0.9904
> >>>>> --------------------------------------------------------------------------------
> >>>>>
> >>>>>
> >>>>> When I use parallel GPUs (the jobs are still ongoing), the density is
> >>>>> 0.87. I was wondering if this is the expected behaviour when using
> >>>>> pmemd.cuda.MPI.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> NSTEP =    70000   TIME(PS) =   40790.000   TEMP(K) =   310.56   PRESS =     -7.3
> >>>>> Etot   =   -203302.7551   EKtot   =   113220.4141   EPtot   =   -316523.1692
> >>>>> BOND   =      2874.4111   ANGLE   =     3392.6226   DIHED   =      5522.4047
> >>>>> 1-4 NB =      5274.6312   1-4 EEL =     9963.2258   VDWAALS =     54760.1196
> >>>>> EELEC  =   -534081.2338   EHBOND  =        0.0000   RESTRAINT =     117.8627
> >>>>> EAMBER (non-restraint)  =   -316641.0319
> >>>>> EKCMT  =     53755.0242   VIRIAL  =    54084.9861   VOLUME  =   2107014.1166
> >>>>>                                                     Density =         0.8749
> >>>>> --------------------------------------------------------------------------------
> >>>>>
> >>>>>
> >>>>> NSTEP =    75000   TIME(PS) =   40800.000   TEMP(K) =   309.02   PRESS =     12.8
> >>>>> Etot   =   -204018.1082   EKtot   =   112657.9297   EPtot   =   -316676.0379
> >>>>> BOND   =      2955.5325   ANGLE   =     3323.3666   DIHED   =      5514.8454
> >>>>> 1-4 NB =      5328.7581   1-4 EEL =     9964.6209   VDWAALS =     55136.4175
> >>>>> EELEC  =   -534740.1213   EHBOND  =        0.0000   RESTRAINT =     122.2405
> >>>>> EAMBER (non-restraint)  =   -316798.2784
> >>>>> EKCMT  =     53530.0160   VIRIAL  =    52950.1399   VOLUME  =   2104363.9330
> >>>>>                                                     Density =         0.8760
> >>>>> --------------------------------------------------------------------------------
> >>>>>
> >>>>> I am happy to provide more information on the job script and input
> >>>>> files if required.
> >>>>>
> >>>>> Regards,
> >>>>> Neha



-- 
Regards,
Dr. Neha S. Gandhi,
Vice Chancellor's Research Fellow,
Queensland University of Technology,
2 George Street, Brisbane, QLD 4000
Australia
LinkedIn
Research Gate
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Nov 10 2017 - 17:30:02 PST