Re: [AMBER] problem with "make test.parallel" for AMBER12

From: Jason Swails <jason.swails.gmail.com>
Date: Mon, 16 Apr 2012 15:05:15 -0400

On Mon, Apr 16, 2012 at 10:15 AM, Marc van der Kamp <
marcvanderkamp.gmail.com> wrote:

> Hi,
>
> I have a problem running the tests for the parallel executables. (My
> apologies if this is not an AMBER issue per se)
>
> I have:
> compiled & tested AMBER12 in serial (with success)
> downloaded and compiled openmpi-1.5.4 (using "./configure_openmpi -gnu" in
> $AMBERHOME/AmberTools/src/)
> compiled AMBER12 in parallel (with success, it seems)
>
> Now, I'd like to test AMBER12 in parallel.
> First, I tried simply doing:
>
> cd $AMBERHOME
> make test
>
> The first series of tests that run are in AmberTools/test/nab and
> these tests are fine.
> Then, when trying to run tests in AmberTools/test/mmpbsa_py I get a bunch
> of errors like this:
>
> make[3]: Entering directory
> `/export/users/chmwvdk/amber12/AmberTools/test/mmpbsa_py'
> cd EstRAL_Files && ./Run.makeparms
> This is not a parallel test.
> cd 01_Generalized_Born && ./Run.GB
> [curie.chm.bris.ac.uk:23918] [[32424,1],0] ORTE_ERROR_LOG: Data unpack
> would read past end of buffer in file util/nidmap.c at line 371
> [curie.chm.bris.ac.uk:23918] [[32424,1],0] ORTE_ERROR_LOG: Data unpack
> would read past end of buffer in file base/ess_base_nidmap.c at line 62
> [curie.chm.bris.ac.uk:23918] [[32424,1],0] ORTE_ERROR_LOG: Data unpack
> would read past end of buffer in file ess_env_module.c at line 173
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_base_build_nidmap failed
> --> Returned value Data unpack would read past end of buffer (-26)
> instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> ...
> ...
>
> This is unfortunate, but not really of great importance to me, as the
> parallel executables of the AMBER12 programs (not AmberTools12) are what I'm
> really after.
> However, when I get to this stage, the process simply 'hangs' after:
> make[2]: Entering directory `/export/users/chmwvdk/amber12/test'
> export TESTsander=/users/chmwvdk/amber12/bin/sander.MPI; make -k
> test.sander.BASIC
> make[3]: Entering directory `/export/users/chmwvdk/amber12/test'
> cd cytosine && ./Run.cytosine
>
>
> It turns out that the cytosine test has run fine, writing up to 'section 4'
> in the output file (values in cytosine/cytosine.out and
> cytosine/cytosine.out.save are identical), but nothing happens after that.
> In other words, the process seems to hang on getting the TIMINGS
> information (section 5) in the output file.
>
> I checked that my environment was OK (as far as I can tell), i.e.
> $PATH contains $AMBERHOME/bin
> $LD_LIBRARY_PATH contains $AMBERHOME/lib
> $MPI_HOME is set to $AMBERHOME
> 'which mpirun' gives $AMBERHOME/bin/mpirun
> 'which mpicc' gives $AMBERHOME/bin/mpicc
>
> This installation is on a cluster, and I'm not sure if it is set up to run
> in parallel on the headnode, so I also made a PBS submission script (as
> suggested in http://archive.ambermd.org/200701/0112.html ) and submitted
> that.
> This is the submission script:
>

Ah, sometimes clusters can be trickier.


> #!/bin/bash
> #
> #PBS -l walltime=5:0:0,nodes=1:ppn=4
> #PBS -q veryshort
> #PBS -N parallel_test
> #PBS -j oe
>
> export DO_PARALLEL="mpirun -np 4 "
>

Try setting DO_PARALLEL to something that uses the PBS_NODEFILE. Maybe
something like this:

export DO_PARALLEL="mpirun -hostfile $PBS_NODEFILE"

(the -hostfile flag may differ depending on your MPI implementation -- see
the mpirun/mpiexec man pages).
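
To make that a bit more concrete, here is a minimal sketch of how the job script could build DO_PARALLEL from the node file. It assumes Open MPI's mpirun (which accepts -np and -hostfile) and a PBS node file with one line per allocated core. The function name build_do_parallel is only illustrative; it is not part of AMBER or PBS. Outside a PBS job it falls back to the original "mpirun -np 4".

```shell
# Hypothetical helper: print an mpirun command line for the given node file.
# Assumes Open MPI; other MPI implementations may name the flag differently.
build_do_parallel() {
    local nodefile="$1"
    if [ -n "$nodefile" ] && [ -r "$nodefile" ]; then
        # PBS writes one line per allocated core, so the line count
        # is the number of MPI processes to start.
        local nprocs
        nprocs=$(wc -l < "$nodefile" | tr -d ' ')
        echo "mpirun -np $nprocs -hostfile $nodefile"
    else
        # Not running under PBS (or node file unreadable): keep the
        # setting from the original submission script.
        echo "mpirun -np 4"
    fi
}

# PBS_NODEFILE is set by PBS inside a job; default to empty elsewhere.
export DO_PARALLEL="$(build_do_parallel "${PBS_NODEFILE:-}")"
echo "DO_PARALLEL=$DO_PARALLEL"
```

Echoing the value into the job's output file makes it easy to confirm afterwards which command line the tests actually used.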

> As an aside, I notice that the 'header' of output files from AMBER12 still
> reads:
>
> -------------------------------------------------------
> Amber 11 SANDER 2010
> -------------------------------------------------------
>
> That should probably be updated...
>

It has been, in bugfix.2 for Amber 12. You can run:

$AMBERHOME/patch_amber.py --update

to download and apply all patches.

HTH,
Jason

--
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Apr 16 2012 - 12:30:03 PDT