Re: [AMBER] change in density on multiple gpus

From: Neha Gandhi <n.gandhiau.gmail.com>
Date: Thu, 9 Nov 2017 07:24:48 +1000

Hi Ross,

I understand that a single GPU performs better than running in parallel for some
systems. However, we are required to request 2 GPUs per job on most supercomputing
facilities in Australia, even if the job will only use a single GPU.

The specification of my job is below:

AMBER version 16.

 &cntrl
  imin=0, irest=1, ntx=5,
  nstlim=5000000, dt=0.002,
  ntc=2, ntf=2,
  cut=12.0, ntb=2, ntp=1, taup=1,
  ntpr=5000, ntwx=5000, ntwr=5000,
  ntt=3, gamma_ln=5.0,
  temp0=310.0, iwrap=1, ig=-1,
  ntr=1, restraintmask=':CNT', restraint_wt=2.0,
  scaledMD=1,
  scaledMD_lambda=0.70,
 /



Job script on the HPC:

#!/bin/bash -l
#PBS -N cnt2
#PBS -l walltime=24:00:00
#PBS -l select=1:ncpus=2:ngpus=2:mpiprocs=2:gputype=M40:mem=10gb
#PBS -j oe

module purge
module load amber/16-iomkl-2016.09-gcc-4.9.3-2.25-ambertools-16-patchlevel-5-14-cuda

cd $PBS_O_WORKDIR
mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i smd.in -o smd1.out \
    -p solvated.prmtop -c npt1.rst -r smd1.rst -x smd1.netcdf -ref npt1.rst
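
As an aside, since the queue forces a 2-GPU request anyway, another option is to run
two independent single-GPU simulations in the same allocation instead of one
pmemd.cuda.MPI job. A minimal sketch, assuming pmemd.cuda honours CUDA_VISIBLE_DEVICES
as shown in the device info below; the smd_a/smd_b file names are placeholders, not my
actual inputs:

# Sketch: two independent single-GPU runs inside one 2-GPU allocation.
# Each pmemd.cuda process is restricted to one device via CUDA_VISIBLE_DEVICES.
cd $PBS_O_WORKDIR
CUDA_VISIBLE_DEVICES=0 $AMBERHOME/bin/pmemd.cuda -O -i smd_a.in -o smd_a.out \
    -p solvated.prmtop -c npt1.rst -r smd_a.rst -x smd_a.netcdf -ref npt1.rst &
CUDA_VISIBLE_DEVICES=1 $AMBERHOME/bin/pmemd.cuda -O -i smd_b.in -o smd_b.out \
    -p solvated.prmtop -c npt1.rst -r smd_b.rst -x smd_b.netcdf -ref npt1.rst &
wait    # keep the PBS job alive until both background runs finish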


|------------------- GPU DEVICE INFO --------------------
|
| Task ID: 0
| CUDA_VISIBLE_DEVICES: 0,1
| CUDA Capable Devices Detected: 2
| CUDA Device ID in use: 0
| CUDA Device Name: Tesla M40 24GB
| CUDA Device Global Mem Size: 22971 MB
| CUDA Device Num Multiprocessors: 24
| CUDA Device Core Freq: 1.11 GHz
|
|
| Task ID: 1
| CUDA_VISIBLE_DEVICES: 0,1
| CUDA Capable Devices Detected: 2
| CUDA Device ID in use: 1
| CUDA Device Name: Tesla M40 24GB
| CUDA Device Global Mem Size: 22971 MB
| CUDA Device Num Multiprocessors: 24
| CUDA Device Core Freq: 1.11 GHz
|

I can confirm that the same job on a single GPU shows a density of 0.99:

 NSTEP =     5000   TIME(PS) =   40660.000  TEMP(K) =   310.08  PRESS =    28.4
 Etot   =   -267200.5515  EKtot   =    113029.5000  EPtot      =   -380230.0515
 BOND   =      2973.9049  ANGLE   =      3436.4722  DIHED      =      5568.2840
 1-4 NB =      5125.8320  1-4 EEL =      9867.1070  VDWAALS    =     76308.4188
 EELEC  =   -646585.5951  EHBOND  =         0.0000  RESTRAINT  =       119.7885
 EAMBER (non-restraint)  =   -380349.8399
 EKCMT  =     54095.5514  VIRIAL  =     52953.5751  VOLUME     =   1860086.9230
                                                    Density    =         0.9909
 ------------------------------------------------------------------------------
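
For reference, the density trend over a whole run can be pulled out of the mdout
file with something like the line below (a rough sketch; smd1.out is the output
name from the job script above, and density_single_gpu.dat is just a placeholder):

# Extract the last field of every "Density = ..." record written at each ntpr step.
grep "Density" smd1.out | awk '{print $NF}' > density_single_gpu.dat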


I also tried a parallel job on a Pascal GPU on a different HPC, but there are
still issues with the density on parallel GPUs.

Thanks,
Neha

On 9 November 2017 at 00:21, Ross Walker <rosscwalker.gmail.com> wrote:

> Hi Neha,
>
> There should be no difference in the density - or any of the properties -
> using a single or multiple GPUs. Can you confirm that if you restart the
> calculation from the restart file at the 40.65ns stage you show below on a
> single GPU then the simulation continues at a density of approximately 0.99?
>
> The key being to isolate that the problem is just from using more than 1
> GPU and not from some other issue such as an incorrect setting in the mdin
> file for example.
>
> It would also help if you can provide some more details about your setup,
> GPU model, AMBER version etc.
>
> Note, as an aside running a single simulation on multiple GPUs is not
> always faster so you might want to check that you actually get a speed
> improvement from using more than 1 GPU at once. Although that's separate
> from the issue you are reporting since even if it runs slower on multiple
> GPUs it shouldn't give incorrect answers.
>
> All the best
> Ross
>
> > On Nov 8, 2017, at 1:15 AM, Neha Gandhi <n.gandhiau.gmail.com> wrote:
> >
> > Dear List,
> >
> > I am running an NPT simulation using parallel GPUs. When using pmemd.cuda
> > (single GPU), the density is 0.99.
> >
> >
> > A V E R A G E S O V E R 2000 S T E P S
> >
> >
> > NSTEP = 10000000   TIME(PS) =   40650.000  TEMP(K) =   309.99  PRESS =     1.8
> > Etot   =   -461754.3426  EKtot   =    120156.4717  EPtot      =   -581910.8142
> > BOND   =      2213.1206  ANGLE   =      2543.5550  DIHED      =      5043.7658
> > 1-4 NB =      5223.7909  1-4 EEL =      9977.1082  VDWAALS    =     81530.0154
> > EELEC  =   -688536.8366  EHBOND  =         0.0000  RESTRAINT  =        94.6664
> > EAMBER (non-restraint)  =   -582005.4806
> > EKCMT  =     57615.1448  VIRIAL  =     57540.2827  VOLUME     =   1978006.8374
> >                                                    Density    =         0.9904
> > ------------------------------------------------------------------------------
> >
> >
> > When I use parallel GPUs (the jobs are still ongoing), the density is 0.87.
> > I was wondering if this is the expected behaviour when using
> > pmemd.cuda.MPI.
> >
> >
> >
> >
> > NSTEP =    70000   TIME(PS) =   40790.000  TEMP(K) =   310.56  PRESS =    -7.3
> > Etot   =   -203302.7551  EKtot   =    113220.4141  EPtot      =   -316523.1692
> > BOND   =      2874.4111  ANGLE   =      3392.6226  DIHED      =      5522.4047
> > 1-4 NB =      5274.6312  1-4 EEL =      9963.2258  VDWAALS    =     54760.1196
> > EELEC  =   -534081.2338  EHBOND  =         0.0000  RESTRAINT  =       117.8627
> > EAMBER (non-restraint)  =   -316641.0319
> > EKCMT  =     53755.0242  VIRIAL  =     54084.9861  VOLUME     =   2107014.1166
> >                                                    Density    =         0.8749
> > ------------------------------------------------------------------------------
> >
> >
> > NSTEP =    75000   TIME(PS) =   40800.000  TEMP(K) =   309.02  PRESS =    12.8
> > Etot   =   -204018.1082  EKtot   =    112657.9297  EPtot      =   -316676.0379
> > BOND   =      2955.5325  ANGLE   =      3323.3666  DIHED      =      5514.8454
> > 1-4 NB =      5328.7581  1-4 EEL =      9964.6209  VDWAALS    =     55136.4175
> > EELEC  =   -534740.1213  EHBOND  =         0.0000  RESTRAINT  =       122.2405
> > EAMBER (non-restraint)  =   -316798.2784
> > EKCMT  =     53530.0160  VIRIAL  =     52950.1399  VOLUME     =   2104363.9330
> >                                                    Density    =         0.8760
> > ------------------------------------------------------------------------------
> >
> > I am happy to provide more information on job script and input files if
> > required.
> >
> > Regards,
> > Neha
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>



-- 
Regards,
Dr. Neha S. Gandhi,
Vice Chancellor's Research Fellow,
Queensland University of Technology,
2 George Street, Brisbane, QLD 4000
Australia
LinkedIn
Research Gate
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Nov 08 2017 - 13:30:03 PST