Re: [AMBER] pmemd.cuda.MPI on Comet- MPI dying from Ross Walker on 2015-10-22 (Amber Archive Oct 2015)

From: Ross Walker <ross.rosswalker.co.uk>
Date: Thu, 22 Oct 2015 21:37:32 -0700

Hi Kenneth,

A few things to try.

1) Right after the modules are loaded add: nvidia-smi -pm 1 (this will force loading of the nvidia driver)

2) Are you certain that the CUDA_VISIBLE_DEVICES you are specifying in the mpi command is getting propagated? What do your two mdout files report for the value of CUDA_VISIBLE_DEVICES? If this isn't getting propagated that would explain it.

3) Ditch all this fancy mpi crap: MV2_USE_CUDA=1 MV2_USE_GPUDIRECT_GDRCOPY=0 MV2_CPU_MAPPING=0:2

I have no idea what these options are doing and they are probably just breaking things. I'd also ditch all the fancy module loads:

> module load amber/14

> module load intel/2015.2.164
> module load cuda/6.5
> module load mvapich2_gdr

Load plain old vanilla GCC and Gfortran and vanilla mpich3 or mpich2 - and compile your own copy of amber 14 with:

./configure -cuda gnu
make
make clean
./configure -cuda -mpi gnu
make
make clean

And you'll probably find all your problems go away. Amber GPU was written deliberately not to need ANY fancy compilers or mpi libraries, or fancy interconnects or fancy GPU direct options etc. They all just make things fragile. I don't know why the SDSC folk compiled the GPU code for the modules listed above in the first place.
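
Going back to point 2, here is a rough sketch of how one might check what each run actually saw. It assumes the mdout files contain a line mentioning CUDA_VISIBLE_DEVICES in their GPU information section. The function name report_visible_devices is made up for illustration, and the file names in the usage comment are the ones from the script quoted further down.

```shell
#!/bin/bash
# Hypothetical helper (not part of Amber): print the CUDA_VISIBLE_DEVICES
# line(s) recorded in each mdout file, or a note if a file has none.
report_visible_devices() {
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "$f: cannot read file"
        elif ! grep -H "CUDA_VISIBLE_DEVICES" "$f"; then
            # grep printed nothing, so say so explicitly
            echo "$f: no CUDA_VISIBLE_DEVICES line found"
        fi
    done
}

# Example usage (file names assumed from the quoted job script):
# report_visible_devices testB1.out testB2.out
```

If both files report the same device list, or report that the variable is not set, then the setting on the mpirun_rsh command line is not reaching the processes.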

Ultimately you want it nice and simple. I don't have a login on Comet so I can't give you the exact options but something like

module load gnu/4.4.7
module load mpich3
module load cuda/6.5

Copy in your own amber 14 and AmberTools15 tar files. Untar them in ~/ and

export AMBERHOME=~/amber14
cd $AMBERHOME
./update_amber --update
./configure -cuda gnu
make -j8 install
make clean
./configure -cuda -mpi gnu
make -j8 install
make clean
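
Before writing the runscript it may be worth confirming that both builds actually installed something. The following is only a sketch: check_amber_binaries is a made-up helper name, and it assumes the serial and MPI GPU executables end up as bin/pmemd.cuda and bin/pmemd.cuda.MPI under AMBERHOME.

```shell
#!/bin/bash
# Hypothetical helper (not part of Amber): report whether the expected GPU
# executables exist under the given AMBERHOME. Returns 1 if any is missing.
check_amber_binaries() {
    local home="$1" missing=0 exe
    for exe in pmemd.cuda pmemd.cuda.MPI; do
        if [ -x "$home/bin/$exe" ]; then
            echo "found   $home/bin/$exe"
        else
            echo "missing $home/bin/$exe"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
# check_amber_binaries "$AMBERHOME"
```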

Then have your runscript be real simple like:

#!/bin/bash
#SBATCH --job-name="testB1"
#SBATCH --output="comet.%j.%N.out"
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --no-requeue
#SBATCH --gres=gpu:4
#SBATCH --export=ALL
#SBATCH -t 00:10:00
#SBATCH -A TG-TRA130030
#SBATCH --mail-type=begin
#SBATCH --mail-type=end

module load gnu/4.4.7
module load mpich3
module load cuda/6.5

hostname
nvidia-smi   # add -pm 1 if you can; it may need root, in which case leave it out

export AMBERHOME=~/amber14

cd job1
export CUDA_VISIBLE_DEVICES=0,1
mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i ... >& job1.log &

cd ../job2
export CUDA_VISIBLE_DEVICES=2,3
mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i ... >& job2.log &

wait
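
Since the question is which of the two jobs dies, one possible variation on the bare wait above is to record each background PID and wait on them one at a time, so the exit status of each run shows up in the Slurm output. This is a sketch only; wait_report is a made-up helper name, and the mpirun lines in the usage comment are placeholders for the commands in the script.

```shell
#!/bin/bash
# Hypothetical helper: wait for one background job of this shell and print
# its exit status. Must be called directly (not inside $(...)), because a
# subshell cannot wait on its parent's children.
wait_report() {
    local name="$1" pid="$2" rc=0
    wait "$pid" || rc=$?
    echo "$name exit status: $rc"
    return "$rc"
}

# Example usage in a runscript (placeholders, not complete commands):
#   mpirun -np 2 ... >& job1.log &
#   pid1=$!
#   mpirun -np 2 ... >& job2.log &
#   pid2=$!
#   wait_report job1 "$pid1"
#   wait_report job2 "$pid2"
```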

Hope that helps.

All the best
Ross

> On Oct 22, 2015, at 20:39, Kenneth Huang <kennethneltharion.gmail.com> wrote:
>
> Hi Ross,
>
> Right, I suppose functionally identical would be a better description.
>
>> First thing first is to run jobs with 'IDENTICAL' input on both sets of
>> GPUs. If you see it fail on one set but not the other then it means it is a
>> machine configuration issue / bios / etc and I can escalate it to SDSC
>> support.
>>
>
> That's what happens when it fails- I haven't been able to see it fail on
> both jobs yet, which might be an issue of running more tests. Whatever is
> the first job in the script seems to be the one to fail when the error pops
> up.
>
> With identical inputs for both GPUs using 06B, the first job in the script
> failed while the second was able to run. But resubmitting the exact same
> job, both jobs ran without any issues. Doing the same thing for 06A didn't
> see any issues on two tries, even though that job was the one originally
> failing in the past (possibly because it was first).
>
> If I try with different inputs, ie swapping 06B on GPUs 0,1 and 06A on 2,3,
> then it's the one on top (06B) that fails. Likewise, the bottom one
> that previously kept hitting the MPI error (06A) no longer has any issues.
>
> I actually opened a ticket with SDSC support through the XSEDE help desk
> earlier this week about this and some bizarre performance drops on one of
> the GPU nodes, but we couldn't figure out if this problem was a bug or a
> resource issue, so I figured to check and see.
>
>
>> If it fails on both (or runs fine on both) then it says it is something
>> with your job and we can attempt to find if there is a bug in the GPU code
>> or something weird about your input. To do this though I need input that
>> fails on any combination of 2 GPUs.
>
>
> The part I can't get my head around is that the error doesn't seem
> to be consistent. The 06B job mentioned above used the below script and
> failed with the first job, but worked fine on both when I resubmitted it without
> changing anything.
>
> #!/bin/bash
> #SBATCH --job-name="testB1"
> #SBATCH --output="comet.%j.%N.out"
> #SBATCH --partition=gpu
> #SBATCH --nodes=1
> #SBATCH --ntasks-per-node=24
> #SBATCH --no-requeue
> #SBATCH --gres=gpu:4
> #SBATCH --export=ALL
> #SBATCH -t 00:10:000
> #SBATCH -A TG-TRA130030
> #SBATCH --mail-type=begin
> #SBATCH --mail-type=end
>
> module load amber/14
> module load intel/2015.2.164
> module load cuda/6.5
> module load mvapich2_gdr
>
> export SLURM_NODEFILE=`generate_pbs_nodefile`
> mpirun_rsh -hostfile $SLURM_NODEFILE -np 2 MV2_USE_CUDA=1
> MV2_USE_GPUDIRECT_GDRCOPY=0 MV2_CPU_MAPPING=0:2 CUDA_VISIBLE_DEVICES=0,1
> /share/apps/gpu/amber/pmemd.cuda.MPI -O -i 06B_prod.in -o testB1.out -p
> B.prmtop -c 06B_preprod2.rst -r testB1.rst -x testB1.nc -inf testB1.mdinfo
> -l testB1.log &
>
> mpirun_rsh -hostfile $SLURM_NODEFILE -np 2 MV2_USE_CUDA=1
> MV2_USE_GPUDIRECT_GDRCOPY=0 MV2_CPU_MAPPING=1:3 CUDA_VISIBLE_DEVICES=2,3
> /share/apps/gpu/amber/pmemd.cuda.MPI -O -i 06B_prod.in -o testB2.out -p
> B.prmtop -c 06B_preprod2.rst -r testB2.rst -x testB2.nc -inf testB2.mdinfo
> -l testB2.log &
>
> wait
>
>
>
> Best,
>
> Kenneth
>
> On Thu, Oct 22, 2015 at 12:16 PM, Ross Walker <rosscwalker.gmail.com> wrote:
>
>> Hi Kenneth,
>>
>>> Yes, both the inputs and systems themselves are almost identical- 06B has a
>>> ligand that 06A doesn't have, so the only difference in the inputs is the
>>> nmr restraint file that they refer to.
>>>
>>
>> So they are not the same. There is no such thing as 'almost' identical.
>> Same as there is no such thing as 'almost' unique. The terms identical and
>> unique are absolute adjectives. They can be true or false but nothing in
>> between. The same is true of the word 'perfect' - although I note that even
>> the US constitution gets this wrong with the phrase "..in order to form a
>> more perfect union..."
>>
>> First thing first is to run jobs with 'IDENTICAL' input on both sets of
>> GPUs. If you see it fail on one set but not the other then it means it is a
>> machine configuration issue / bios / etc and I can escalate it to SDSC
>> support.
>>
>> If it fails on both (or runs fine on both) then it says it is something
>> with your job and we can attempt to find if there is a bug in the GPU code
>> or something weird about your input. To do this though I need input that
>> fails on any combination of 2 GPUs.
>>
>> All the best
>> Ross
>>
>> /\
>> \/
>> |\oss Walker
>>
>> ---------------------------------------------------------
>> | Associate Research Professor |
>> | San Diego Supercomputer Center |
>> | Adjunct Associate Professor |
>> | Dept. of Chemistry and Biochemistry |
>> | University of California San Diego |
>> | NVIDIA Fellow |
>> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
>> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>> ---------------------------------------------------------
>>
>> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>> be read every day, and should not be used for urgent or sensitive issues.
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
>
> --
> Ask yourselves, all of you, what power would hell have if those imprisoned
> here could not dream of heaven?

Received on Thu Oct 22 2015 - 22:00:04 PDT