Re: [AMBER] Amber 20 - Support Multi GPU

From: Adriano José Ferruzzi <adrianof.unicamp.br>
Date: Thu, 23 Jul 2020 11:11:26 -0300

Dear David,

We have also followed the instructions presented on this page (a typical launch is sketched after the list below):
https://ambermd.org/GPUHowTo.php
  Installation and Testing
  Multiple GPU (pmemd.cuda.MPI)
  Running GPU-Accelerated Simulations
  Multi GPU
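
For reference, a typical multi-GPU launch in the spirit of that page looks
like this (the file names here are generic placeholders, not our actual
inputs):

  export CUDA_VISIBLE_DEVICES=0,1
  mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O \
      -i mdin -o mdout -p prmtop -c inpcrd -r restrt -x mdcrd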

We compiled Amber following these instructions (our build steps are sketched below):
https://ambermd.org/doc12/Amber20.pdf
  Chapter 2. Installation
  2.1. Basic installation guide
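
Roughly, our build followed these steps (the cmake option names are from
memory and may need adjusting for your site):

  cd amber20_src/build
  # in run_cmake, enable -DCUDA=TRUE and -DMPI=TRUE to get pmemd.cuda.MPI
  ./run_cmake
  make install
  source /home/allan/programas/amber20/amber.sh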


I ran the tests and pasted the logs below. I set CUDA_VISIBLE_DEVICES=0,1
and exported DO_PARALLEL="mpirun -np 2".
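
Concretely, the environment was set like this before invoking the tests
(AMBERHOME points at our install, /home/allan/programas/amber20):

  export CUDA_VISIBLE_DEVICES=0,1
  export DO_PARALLEL="mpirun -np 2"
  cd $AMBERHOME && make test.cuda.parallel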


make test.cuda.parallel
(cd test && make test.cuda.parallel)
make[1]: Entering directory '/home/allan/programas/amber20/test'
./test_amber_clean.sh
make[2]: Entering directory '/home/allan/programas/amber20/test/cuda/remd'
make[3]: Entering directory '/home/allan/programas/amber20/test/cuda/remd'
make[3]: Leaving directory '/home/allan/programas/amber20/test/cuda/remd'
make[2]: Leaving directory '/home/allan/programas/amber20/test/cuda/remd'
./test_amber_cuda_parallel.sh

Tests being run with DO_PARALLEL="mpirun -np 2".

Using default PREC_MODEL = DPFP
make[2]: Entering directory '/home/allan/programas/amber20/test'
cd cuda && make -k test.pmemd.cuda.MPI PREC_MODEL=DPFP
make[3]: Entering directory '/home/allan/programas/amber20/test/cuda'
---------------------------------------------
Running Extended CUDA Implicit solvent tests.
      Precision Model = DPFP
---------------------------------------------
cd trpcage/ && ./Run_md_trpcage DPFP yes
diffing trpcage_md.out.GPU_DPFP with trpcage_md.out
PASSED
==============================================================
cd myoglobin/ && ./Run_md_myoglobin DPFP yes
diffing myoglobin_md.out.GPU_DPFP with myoglobin_md.out
PASSED
==============================================================
cd myoglobin/ && ./Run_md_myoglobin_igb7 DPFP yes
diffing myoglobin_md_igb7.out.GPU_DPFP with myoglobin_md_igb7.out
PASSED
==============================================================
cd myoglobin/ && ./Run_md_myoglobin_igb8 DPFP yes
diffing myoglobin_md_igb8.out.GPU_DPFP with myoglobin_md_igb8.out
PASSED
==============================================================
cd myoglobin/ && ./Run_md_myoglobin_igb8_gbsa DPFP yes
diffing myoglobin_md_igb8_gbsa.out.GPU_DPFP with myoglobin_md_igb8_gbsa.out
PASSED
==============================================================
cd myoglobin/ && ./Run_md_myoglobin_igb8_gbsa3 DPFP yes
diffing myoglobin_md_igb8_gbsa3.out.GPU_DPFP with myoglobin_md_igb8_gbsa3.out
PASSED
==============================================================
cd gbsa_xfin/ && ./Run.gbsa3 DPFP yes
diffing mdout.gbsa3_out.GPU_DPFP with mdout.gbsa3_out
PASSED
==============================================================
cd cnstph/implicit/ && ./Run.cnstph DPFP yes
diffing mdout.GPU_DPFP with mdout
PASSED
==============================================================
diffing cpout.GPU_DPFP with cpout
PASSED
==============================================================
cd cnste/implicit/ && ./Run.cnste DPFP yes
diffing mdout.GPU_DPFP with mdout
PASSED
==============================================================
diffing ceout.GPU_DPFP with ceout
PASSED
==============================================================
cd cnstphe/implicit/ && ./Run.cnstphe DPFP yes
diffing mdout.GPU_DPFP with mdout
PASSED
==============================================================
diffing cpout.GPU_DPFP with cpout
PASSED
==============================================================
diffing ceout.GPU_DPFP with ceout
PASSED
==============================================================
cd chamber/dhfr/ && ./Run.dhfr_charmm.md DPFP yes
diffing mdout.dhfr_charmm_md.GPU_DPFP with mdout.dhfr_charmm_md
PASSED
==============================================================
cd chamber/dhfr_cmap/ && ./Run.dhfr_charmm.md DPFP yes
diffing mdout.dhfr_charmm_md.GPU_DPFP with mdout.dhfr_charmm_md
PASSED
==============================================================
cd nucleosome/ && ./Run_md.1 DPFP yes

The test run hangs at this point and makes no further progress.
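
If it helps to isolate the failure, the hanging test can presumably be
re-run on its own like this (a sketch based on the log line above):

  cd $AMBERHOME/test/cuda/nucleosome
  export DO_PARALLEL="mpirun -np 2"
  ./Run_md.1 DPFP yes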


If we start a real system in Amber, it stays frozen at 0% GPU utilization
until it is canceled, even though the processes are visible in nvidia-smi
(the launch command is sketched after the table):

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     20718      C   pmemd.cuda_SPFP.MPI                          721MiB |
|    1     20719      C   pmemd.cuda_SPFP.MPI                          909MiB |
+-----------------------------------------------------------------------------+
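
For completeness, the production job is launched along these lines (the
file names are placeholders; the binary and rank count match the snapshot
above):

  export CUDA_VISIBLE_DEVICES=0,1
  mpirun -np 2 $AMBERHOME/bin/pmemd.cuda_SPFP.MPI -O \
      -i md.in -o md.out -p system.prmtop -c system.inpcrd \
      -r md.rst -x md.nc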


Thank you for your attention.
Cheers.

On Wed, Jul 22, 2020 at 21:37, David A Case <david.case.rutgers.edu> wrote:

> On Wed, Jul 22, 2020, Adriano José Ferruzzi wrote:
> >
> >We are now trying to run pmemd.cuda_SPFP.MPI using multi GPU to take
> >advantage of the 2 Tesla K40 GPUs we have per node, but Amber starts
> >to run and gets 0% on both GPUs. Here, we followed the instructions
> >presented at ambermd.org/gpus16/index.htm
>
> As that page notes, it is based on Amber16, and may be out of date.
> You didn't say what parts of that page you followed.
>
> Have you tried running "make test.cuda.parallel"? That will help see
> if the problem is a generic one, or might be based on something wrong
> with the script you are running.
>
> How did you set CUDA_VISIBLE_DEVICES? I think it should be something like
> "0,1", but I'm not a multi-GPU expert -- just trying to collect information
> for others to consider.
>
> ...thanks...dac
>


-- 
Adriano Ferruzzi
Systems Administrator
Institute of Chemistry - Unicamp
CCES - Center for Computing in Engineering & Sciences
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber