Re: [AMBER] using two GPUs

From: Vijay Manickam Achari <vjrajamany.yahoo.com>
Date: Wed, 25 Apr 2012 06:34:52 +0100 (BST)

Thank you for the kind reply.
Based on your information and other sources, I have tried to get the two GPUs working.

For the machinefile:
I checked the /dev folder and saw these NVIDIA device names: nvidia0, nvidia1, nvidia2, nvidia3, nvidia4. My understanding is that these names should be listed in the machinefile. I commented out nvidia0, nvidia1 and nvidia2 since I only want to use two GPUs.

machineFilevj: 
nvidia3
nvidia4

For the hostFile:
I ran the hostname command and got "gpucc". I then created a file called hostFile that contains the name gpucc twice, since I am using two GPUs.

hostFilevj:
gpucc
gpucc

Then I created a script file containing the following:


********************************************************************
#/bin/csh
setenv $AMBERHOME /usr/local/apps/amber12
setenv CUDA_VISIBLE_DEVICES "3,4"

set  system="malto-THERMO-RT"
set  input="MD-betaMalto-THERMO.in"
set  top="malto-THERMO.top"
set  initialCoor="betaMalto-THERMO-MD03-run0100.rst.1"
set  MPIrun="mpirun ./machineFilevj ./hostFilevj -np 2 pmemd.cuda.MPI"

$MPIrun -O -i $input -p $top -c $initialCoor -o $system-MD00-run1000.out -x $system-MD00-run1000.traj -r $system-MD01-run0100.rst  < /dev/null
********************************************************************************************************************************

When I execute this script I get the following error message:


***************************************************************************************************************
[vijay.gpucc benchMark-malto-Thermo-in-2GPU-amber12]$ ./gpu-md-malHL-RT-1ns.sh 
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[proxy:0:0.gpucc] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file ./machineFilevj (Permission denied)
[vijay.gpucc benchMark-malto-Thermo-in-2GPU-amber12]$ ll

*****************************************************************************************************************

How should I go about this?
I would appreciate any help.

Side note: I am using 4 units of NVIDIA Tesla C2075 and one Quadro for display.
Regards
Vj
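
The execvp error above suggests that mpirun treated ./machineFilevj as the program to launch, because no option flag precedes it on the command line; the file is not executable, hence "Permission denied". The following is a minimal sketch, not a verified fix, of a launch line that passes the host list through the -machinefile option used elsewhere in this thread. It is written in POSIX sh; in csh, `export VAR=value` becomes `setenv VAR value`. File names and paths are copied from the script above, and the device list "3,4" is kept from that script without being checked against this machine.

```shell
#!/bin/sh
# Sketch only: paths and file names come from the script in this message.
export AMBERHOME=/usr/local/apps/amber12
# Value kept from the original script; it must hold CUDA device indices,
# which are not guaranteed to match the /dev/nvidiaN numbering.
export CUDA_VISIBLE_DEVICES="3,4"

system="malto-THERMO-RT"
input="MD-betaMalto-THERMO.in"
top="malto-THERMO.top"
initialCoor="betaMalto-THERMO-MD03-run0100.rst.1"

# The machine file lists host names (here hostFilevj, with gpucc twice),
# not /dev/nvidiaN entries. Passing it with -machinefile keeps mpirun
# from trying to execute it.
cmd="mpirun -machinefile ./hostFilevj -np 2 pmemd.cuda.MPI -O -i $input -p $top -c $initialCoor -o $system-MD00-run1000.out -x $system-MD00-run1000.traj -r $system-MD01-run0100.rst"

# Dry run: print the command for inspection; replace 'echo' with 'eval' to launch.
echo "$cmd"
```

Only one host list is needed in this sketch; the separate file of device names is dropped, since GPU selection is handled by CUDA_VISIBLE_DEVICES.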






Yes, you need to create the machine file.  There is probably an alternative approach that you can use to specify the host(s) on the mpirun command line as well.
The machine file is used by MPI to specify which machines the individual MPI ranks will run on.
As I said, to run Amber on multiple GPUs requires the use of MPI, even if you are only running on one machine.

You will probably need to learn how to use MPI; it's not that difficult.

I did a Google search for "mpi machinefile" and got many useful hits, both on how to create a machine file and general MPI tutorials.

Your machine may also have a man page for the mpirun command, which should help.

And as Jason said, MPICH2 1.4 does not need the mpd command, so ignore that.

In Linux, you can get the hostname of the machine you are working on using the command:

hostname

For example, if you ran hostname on your machine, let's say you got back:

myhostxxx

Then take the result of that command, edit the file ~/mpd.hosts, and put that name in the file twice (for using 2 GPUs):


myhostxxx
myhostxxx
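
The same file can be generated rather than typed by hand. This is a small sketch assuming a POSIX shell and the ~/mpd.hosts file name used in this thread; the fallback to `uname -n` is an addition for systems where the hostname command is unavailable.

```shell
#!/bin/sh
# Sketch: write the local host name twice, one line per MPI rank (one rank per GPU).
machinefile="$HOME/mpd.hosts"                  # file name used in this thread
host=$(hostname 2>/dev/null || uname -n)       # fall back to uname -n if needed
printf '%s\n%s\n' "$host" "$host" > "$machinefile"

# Show the result so it can be checked before running mpirun.
cat "$machinefile"
```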

Then try running your amber job with the mpirun command:

mpirun -machinefile ~/mpd.hosts -np 2 ./pmemd.cuda.MPI -O -o mdout -x mdcrd -r restrt -inf mdinfo

You will need to modify the above line to specify the input and output files you will use.

-----Original Message-----
From: Vijay Manickam Achari [mailto:vjrajamany.yahoo.com]
Sent: Tuesday, April 24, 2012 12:17 AM
To: AMBER Mailing List
Subject: Re: [AMBER] using two GPUs

Thank you for the kind reply.

I do not understand -machinefile and ~/mpd.hosts well. I don't use any parallel job scheduler. In this situation, how or from where can I get the machinefile and mpd.hosts information? Do I need to create one? Sorry, I am not familiar with these issues.

I would appreciate it if you could help me here. I tried searching the net for information on this matter, but to no avail.

Regards.
  
Vijay Manickam Achari
(Phd Student c/o Prof Rauzah Hashim)
Chemistry Department,
University of Malaya,
Malaysia
vjramana.gmail.com


________________________________
From: Robert Crovella <RCrovella.nvidia.com>
To: AMBER Mailing List <amber.ambermd.org>
Sent: Saturday, 21 April 2012, 20:50
Subject: Re: [AMBER] using two GPUs

For MPICH2 you will need to start the MPI daemon first:

mpd &
mpirun -machinefile ~/mpd.hosts -np 2 ./pmemd.cuda.MPI -O -o mdout -x mdcrd -r restrt -inf mdinfo

Something like that should work.

Your machinefile (mpd.hosts) would just have your hostname listed twice, like this:

hostname
hostname

The above assumes that your MPI is installed properly and that you have built pmemd.cuda.MPI according to the AMBER build instructions.
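
Both assumptions can be checked quickly from the shell before launching a job. The sketch below uses a hypothetical helper, `check_prog`, to report whether each program is found on PATH; the hint about $AMBERHOME/bin assumes a standard Amber installation layout.

```shell
#!/bin/sh
# Sketch: report whether a required program can be found on PATH.
# check_prog is a hypothetical helper name, not part of MPI or Amber.
check_prog() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1 (check PATH; Amber binaries are assumed to be in \$AMBERHOME/bin)"
    fi
}

check_prog mpirun
check_prog pmemd.cuda.MPI
```

A "missing" line for pmemd.cuda.MPI usually means either that PATH does not include the Amber binary directory or that the MPI build of the GPU code has not been compiled yet.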

-----Original Message-----
From: Vijay Manickam Achari [mailto:vjrajamany.yahoo.com]
Sent: Friday, April 20, 2012 10:15 PM
To: AMBER Mailing List
Subject: Re: [AMBER] using two GPUs

Thanks for the info.
By the way, I am using mpich2-1.4.

Could anyone help with this?

Regards
 

 
Vijay Manickam Achari
(Phd Student c/o Prof Rauzah Hashim)
Chemistry Department,
University of Malaya,
Malaysia
vjramana.gmail.com


________________________________
From: Robert Crovella <RCrovella.nvidia.com>
To: Amber mailing List <amber.ambermd.org>
Sent: Saturday, 21 April 2012, 10:07
Subject: Re: [AMBER] using two GPUs

To use multiple GPUs, you must use the MPI build of GPU AMBER, i.e. pmemd.cuda.MPI.

For version 1.6 of MVAPICH2, a 2-GPU invocation might look something like this:
mpirun_rsh -ssh -hostfile ~/.amber.hosts.2 -np 2 pmemd.cuda.MPI -O -i mdin -c inpcrd -l logmpi -o ~/results/amber-2-gpus

You'll need to set up your hostfile properly for MPI.

-----Original Message-----
From: Vijay Manickam Achari [mailto:vjrajamany.yahoo.com]
Sent: Friday, April 20, 2012 7:40 PM
To: Amber mailing List
Subject: [AMBER] using two GPUs

I am trying to use more than one GPU to run my job. For instance, I would like to use two GPUs with IDs 2 and 3.

I used the environment setup given in the Amber12 manual (page 247), as below:

setenv $AMBERHOME /usr/local/apps/amber12
#setenv CUDA_VISIBLE_DEVICES 2,3
export CUDA_VISIBLE_DEVICES=2,3

But when I use the top command, I can see only one pmemd.cuda process running, not two.
Is there any way to make both GPUs run the same job?

Thanks


Vijay Manickam Achari
(Phd Student c/o Prof Rauzah Hashim)
Chemistry Department,
University of Malaya,
Malaysia
vjramana.gmail.com
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 24 2012 - 23:00:04 PDT