Re: AMBER: Error while compiling Amber

From: Junmei Wang <junmwang.gmail.com>
Date: Fri, 15 Feb 2008 11:16:47 -0600

Hi, Lili,
Here is the story. At the beginning I tried to use qsub, as suggested by
the administrator, to submit jobs. However, I found I couldn't gain any
benefit from the parallel calculations: the wall time of a 6-CPU parallel
job was even longer than that of a single-CPU job. I could see six
processes running on three machines (each with two CPUs), so it was very weird.

So we decided to submit jobs from the command line. When I start mpd, I
first get a list of machines that have no computational load (I wrote a
program to do that; it reads in the complete list of nodes and the output
of 'qstat', and writes out mpd.hosts), then I start mpd using the command
'mpd -n 20' if 20 nodes will be used. A typical mpd.hosts looks like this:
compute-0-0:2
compute-0-1:2
......
(each node has two CPUs)
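
In case it is useful, the core of that helper can be sketched in a few
lines of shell. This is only a sketch: 'all.nodes' is a hypothetical file
with one node name per line, and the qstat field numbers assume SGE's
default output, so adjust them for your scheduler:

#!/bin/sh
# nodes that qstat reports as running jobs; the queue instance looks
# like all.q@compute-0-0.local, so strip the queue name and the domain
qstat | awk 'NR>2 {print $8}' | cut -d@ -f2 | cut -d. -f1 | sort -u > busy.nodes
# every node not in busy.nodes goes into mpd.hosts, two cpus each
grep -v -x -f busy.nodes all.nodes | sed 's/$/:2/' > mpd.hosts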

To submit jobs, I first generate a machine file in the same format as
mpd.hosts. I wrote a program to do so: it first reads in a batch file to
figure out how many CPUs are to be used, then runs 'mpdtrace' to find the
available nodes and 'mpdlistjobs' to find out which nodes are currently
running jobs, and finally writes out a machine file listing the nodes that
have no computational load for this job.
Here is a typical command for running sander.MPI:

nice +19 mpirun -machinefile ./machine -np 4 sander.MPI -ng 2 -groupfile sanderjob.group < /dev/null
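
For reference, a group file for '-ng 2' just lists one sander command
line per group; the file names below are only placeholders:

-O -i md1.in -o md1.out -p prmtop -c md1.rst -r md1.restrt
-O -i md2.in -o md2.out -p prmtop -c md2.rst -r md2.restrt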

This procedure works fine for me; all I need to do is run the program to
generate a fresh machine file each time I submit a job.
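
The heart of that program amounts to something like the following shell
sketch (the 'host = ...' layout of mpdlistjobs output is an assumption
here; check the exact format on your installation):

#!/bin/sh
# hosts currently in the mpd ring, one per line
mpdtrace > ring.nodes
# hosts that mpdlistjobs reports as already running processes
mpdlistjobs | awk '$1 == "host" {print $3}' | sort -u > busy.nodes
# the idle ring members become this job's machine file, two cpus each
grep -v -x -f busy.nodes ring.nodes | sed 's/$/:2/' > machine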

Best

Junmei

On Thu, Feb 14, 2008 at 4:24 PM, Lili Peng <lpeng.ucsd.edu> wrote:

> Hi Junmei,
>
> Thanks for your input. I think I'm having the same problem with
> submitting parallel jobs. Could you go into more detail about how you
> specify a machine file for each job? I tried to look for this in the
> MPICH2 manual, but didn't find anything. Any advice from your end would
> be greatly appreciated.
>
> Bests,
> Lili
>
>
>
> On 14/02/2008, Junmei Wang <junmwang.gmail.com> wrote:
> >
> > > I am also running AMBER9 on a rocks cluster. I used the Intel 10
> > > compilers, but neither Intel MPI nor OpenMPI worked for me. Finally, I
> > > used MPICH2 to run parallel jobs. As was pointed out, for MPICH2 one
> > > needs to run the mpd daemon first. I found the MPICH2 installation
> > > manual really helpful (
> > > http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-doc-install.pdf).
> >
> >
> > > I only had one problem when I submitted multiple parallel jobs: all
> > > jobs were running on the first several nodes, although many nodes in
> > > mpd.hosts had no load at all. I solved this problem by using many
> > > machine files. It is a little bit silly to specify a machine file for
> > > each job, but it works. I also wrote a simple program to prepare the
> > > machine files.
> >
> > Best
> >
> > Junmei
> >
> >
> >
> > On Thu, Feb 14, 2008 at 11:41 AM, Lili Peng <lpeng.ucsd.edu> wrote:
> >
> > > Hi Jack,
> > >
> > > For the serial or parallel? I tried it with the serial, and ran into
> > > a segmentation fault during the AMBER compile. (You'd think newer
> > > versions of software would work better.) I did not try the parallel
> > > compile.
> > >
> > > Lili
> > >
> > >
> > > On 14/02/2008, Jack Lei <leiming72.gmail.com> wrote:
> > > >
> > > > Lili:
> > > >
> > > > Did you try Intel10 compiler?
> > > >
> > > > Best,
> > > > Jack.
> > > >
> > > > On Thu, Feb 14, 2008 at 5:09 PM, Lili Peng <lpeng.ucsd.edu> wrote:
> > > >
> > > > > Hi Ross, Chen:
> > > > >
> > > > > I'm running Linux v2.6.9 on a 210-node ROCKS cluster.
> > > > >
> > > > > That aside, I was able to successfully compile a parallel
> > > > > version of AMBER 9 (finally!). I didn't have to invoke mpd, though,
> > > > > at least not for running "make test.parallel". In fact, the real
> > > > > problem had to do with the version of the Intel compiler I was
> > > > > using. Initially I was using v9.0.033, and that was resulting in
> > > > > segmentation faults during the "make parallel" for TIP5P. Then I
> > > > > downloaded v9.1.039, and the parallel compile went smoothly.
> > > > >
> > > > > Interesting how 9.1.039 results in a segmentation fault for Divcon
> > > > > in serial compiles (http://archive.ambermd.org/200610/0273.html),
> > > > > but works fine for parallel compiles. On the other hand, 9.0.033
> > > > > results in a segmentation fault for TIP5P in parallel compiles but
> > > > > works fine for serial compiles. So picky..
> > > > >
> > > > > Lili
> > > > >
> > > > >
> > > > > On 09/02/2008, chen <chen.hhmi.umbc.edu> wrote:
> > > > > >
> > > > > > Looks like you don't have the right python package. Check
> > > > > > whether you have python, what version, and where. If you have a
> > > > > > python version higher than 2.4, then just make a link: "ln -s
> > > > > > /[the python installed] /usr/bin/python2.4". For a version lower
> > > > > > than 2.4, try that too, although I am not sure it's going to
> > > > > > work. You didn't mention (or maybe I missed) your OS and
> > > > > > version; that could help us to help you better.
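> > > > > >
> > > > > > For example (the paths below are only illustrative; use whatever
> > > > > > "which python" reports on your system):
> > > > > >
> > > > > > $ which python
> > > > > > /usr/bin/python
> > > > > > $ python -V
> > > > > > Python 2.4.3
> > > > > > $ ln -s /usr/bin/python /usr/bin/python2.4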
> > > > > >
> > > > > > Chen
> > > > > >
> > > > > > > Hi Ross,
> > > > > > >
> > > > > > > Thanks for your suggestion. I did look into the README of MPICH2
> > > > > > > (http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-doc-README.txt)
> > > > > > > and tried to set up the daemon for MPD:
> > > > > > >
> > > > > > > $ export PATH="/nas/lpeng/src/mpich2-1.0.6p1/bin"
> > > > > > > $ which mpd
> > > > > > > /nas/lpeng/src/mpich2-1.0.6p1/bin/mpd
> > > > > > > $ which mpiexec
> > > > > > > /nas/lpeng/src/mpich2-1.0.6p1/bin/mpiexec
> > > > > > > $ which mpirun
> > > > > > > /nas/lpeng/src/mpich2-1.0.6p1/bin/mpirun
> > > > > > >
> > > > > > > Everything looks fine until I try to ring up the MPD:
> > > > > > > $ mpd &
> > > > > > > [1] 13151
> > > > > > > /usr/bin/env: python2.4: No such file or directory
> > > > > > > [1]+ Exit 127 mpd
> > > > > > >
> > > > > > > What happened? Why is it trying to access python2.4 through
> > > > > > > /usr/bin/env, when python2.4 doesn't even exist in my bin?
> > > > > > >
> > > > > > > Any leads on your behalf would be appreciated.
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Lili
> > > > > > >
> > > > > > > On 09/02/2008, *Ross Walker* <ross.rosswalker.co.uk> wrote:
> > > > > > >
> > > > > > > Hi Lili,
> > > > > > >
> > > > > > > You should check the manual for mpich2 - this version of mpi
> > > > > > > requires a daemon (in this case called MPD) to be running on
> > > > > > > all of the nodes on which you want the code to run. This
> > > > > > > typically means setting up a machine file or equivalent that
> > > > > > > is read by mpd.
> > > > > > >
> > > > > > > If you just want to run this on the local machine to which you
> > > > > > > are logged in, then you can try:
> > > > > > >
> > > > > > > mpd &
> > > > > > > export DO_PARALLEL='mpirun -np 4'
> > > > > > > make test.parallel
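> > > > > > >
> > > > > > > To bring the daemons up on a set of hosts instead, the usual
> > > > > > > MPICH2 sequence is roughly the following (a sketch only;
> > > > > > > mpd.hosts is a file listing one node per line - see the
> > > > > > > installation guide for details):
> > > > > > >
> > > > > > > mpdboot -n 4 -f mpd.hosts
> > > > > > > mpdtrace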
> > > > > > >
> > > > > > > All the best
> > > > > > > Ross
> > > > > > >
> > > > > > > /\
> > > > > > > \/
> > > > > > > |\oss Walker
> > > > > > >
> > > > > > > | Assistant Research Professor |
> > > > > > > | San Diego Supercomputer Center |
> > > > > > > | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> > > > > > > | http://www.rosswalker.co.uk | PGP Key available on request |
> > > > > > >
> > > > > > > Note: Electronic Mail is not secure, has no guarantee of
> > > > > > > delivery, may not be read every day, and should not be used
> > > > > > > for urgent or sensitive issues.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > ------------------------------------------------------------------------
> > > > > > > *From:* owner-amber.scripps.edu [mailto:owner-amber.scripps.edu]
> > > > > > > *On Behalf Of* Lili Peng
> > > > > > > *Sent:* Friday, February 08, 2008 17:52
> > > > > > > *To:* amber.scripps.edu
> > > > > > > *Subject:* Re: AMBER: Error while compiling Amber
> > > > > > >
> > > > > > > Hi Dr. Case,
> > > > > > >
> > > > > > > Thank you for your help. I added the parser.c file, and the
> > > > > > > serial compilation of AMBER worked.
> > > > > > >
> > > > > > > ... Though I have another problem now. It's about compiling
> > > > > > > Sander in parallel. Although I've searched the mailing list
> > > > > > > archives extensively, none of what's been discussed addresses
> > > > > > > the issue. It has to do with the make install step, and I'm
> > > > > > > trying to do a test run of sander. I have already compiled my
> > > > > > > MPI library (the MPICH2 version). Here's my input (in bold) and
> > > > > > > the errors:
> > > > > > >
> > > > > > > *$ export AMBERHOME="/nas/lpeng/src/amber9"*
> > > > > > > *$ export MPI_HOME="/nas/lpeng/opt"*
> > > > > > > *$ export CC=icc*
> > > > > > > *$ export FC=ifort*
> > > > > > > *$ ./configure --prefix=/nas/lpeng/opt CC=$CC FC=$FC*
> > > > > > > *$ make*
> > > > > > > *$ make install*
> > > > > > > *$ ./configure -mpich2 -p4 -static ifort_ia32*
> > > > > > > *$ make parallel*
> > > > > > >
> > > > > > > ... and Amber 9 was compiled successfully. Following the
> > > > > > > compiling instructions, I continued on..
> > > > > > >
> > > > > > > *$ export DO_PARALLEL="/nas/lpeng/src/mpich2-1.0.6p1/bin/mpirun -np 4"*
> > > > > > > *$ make test.sander*
> > > > > > >
> > > > > > > Then I get the error:
> > > > > > >
> > > > > > > make: *** No rule to make target 'test.sander'. Stop.
> > > > > > >
> > > > > > > I also tried:
> > > > > > >
> > > > > > > *$ make test.parallel*
> > > > > > >
> > > > > > > ..and receive in response:
> > > > > > >
> > > > > > > export TESTsander=/nas/lpeng/src/amber9/exe/sander.MPI;
> > > > > > > make test.sander.BASIC
> > > > > > > make[1]: Entering directory `/nas/lpeng/src/amber9/test'
> > > > > > > cd dmp; ./Run.dmp
> > > > > > > This test not set up for parallel
> > > > > > > cannot run in parallel with #residues < #pes
> > > > > > > cd adenine; ./Run.adenine
> > > > > > > This test not set up for parallel
> > > > > > > cannot run in parallel with #residues < #pes
> > > > > > >
> > > > > > ==============================================================
> > > > > > > cd cytosine; ./Run.cytosine
> > > > > > > mpiexec_granite.ucsd.edu: cannot connect to local mpd
> > > > > > > (/tmp/mpd2.console_lpeng); possible causes:
> > > > > > > 1. no mpd is running on this host
> > > > > > > 2. an mpd is running but was started without a "console" (-n option)
> > > > > > > In case 1, you can start an mpd on this host with:
> > > > > > > mpd &
> > > > > > > and you will be able to run jobs just on this host.
> > > > > > > For more details on starting mpds on a set of hosts, see
> > > > > > > the MPICH2 Installation Guide.
> > > > > > > ./Run.cytosine: Program error
> > > > > > > make[1]: *** [test.sander.BASIC] Error 1
> > > > > > > make[1]: Leaving directory `/nas/lpeng/src/amber9/test'
> > > > > > > make: *** [test.sander.BASIC.MPI] Error 2
> > > > > > >
> > > > > > > Do you have any leads on what is causing this to occur? Any
> > > > > > > input from your end would be greatly appreciated.
> > > > > > >
> > > > > > > Sincerely,
> > > > > > > Lili
> > > > > > >
> > > > > > >
> > > > > > > On 07/02/2008, *David A. Case* <case.scripps.edu> wrote:
> > > > > > >
> > > > > > > On Thu, Feb 07, 2008, Lili Peng wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > *$ ./configure -p4 ifort_x86_64*
> > > > > > >
> > > > > > > What kind of machine are you compiling on? On the one hand,
> > > > > > > you say "-p4" (which is for Pentium IV), yet you also say
> > > > > > > x86_64, which is a very different type of architecture.
> > > > > > >
> > > > > > > Typing "uname -m" should give you the information you need.
> > > > > > > Then choose the correct options.
> > > > > > >
> > > > > > > >
> > > > > > > > *$ ./configure -p4 ifort_ia32 *
> > > > > > > > yacc parser.y
> > > > > > > > make[2]: execvp: yacc: Permission denied
> > > > > > >
> > > > > > > It looks like you got a lot further this time, so my guess is
> > > > > > > that you are on an ia32 (aka i386, aka i686) system. You should
> > > > > > > see that sander and lots of other programs have been compiled.
> > > > > > > For the above problem:
> > > > > > >
> > > > > > > First, try "which yacc" to find out if it exists and, if so,
> > > > > > > why you are getting permission denied.
> > > > > > >
> > > > > > > Second, try "which bison": if it looks like you have
> > > > > > > permission to run bison, add the line
> > > > > > >
> > > > > > > YACC = bison -y
> > > > > > >
> > > > > > > to your config.h file (in $AMBERHOME/src).
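> > > > > > >
> > > > > > > A quick way to apply this is to append the line (a later
> > > > > > > assignment in config.h overrides an earlier one):
> > > > > > >
> > > > > > > echo 'YACC = bison -y' >> $AMBERHOME/src/config.h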
> > > > > > >
> > > > > > > Or, if neither works, put the attached parser.c file in
> > > > > > > $AMBERHOME/src/leap/src/leap, and try again.
> > > > > > >
> > > > > > > ....dac

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sun Feb 17 2008 - 06:07:37 PST