Re: [AMBER] openmp on 64core/AMD vs MPI over IB on multiple nodes

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 25 Sep 2013 16:19:07 -0400

On Wed, Sep 25, 2013 at 3:42 PM, harry mangalam <harry.mangalam.uci.edu> wrote:

> Is there a general agreement on which is the better approach?
>

With Amber you have no choice. There is very little OpenMP in Amber 11, and
the main simulation engines use MPI exclusively (in all versions). Amber
still runs on shared-memory machines, but unless the MPI implementation
is _very_ clever and optimizes aggressively, each MPI process will have its
own memory space.

> I have a user who wants to run 128core amber11 jobs on our very busy cluster
> but his queue wait for 128 cores is going to be extremely long and he'd
> probably get better perf on a single 64c (AMD Bulldozer) node that doesn't
> have to communicate off-board.
>
> We do have Mellanox&Voltaire QDR interconnects between 64c nodes, but still
> the Q time and comm time are going to be significant. Any general real-life
> observations that would push either approach?
>

If there are 64c nodes, I would be stunned if the user gets any speedup
going to 2 nodes unless they plan on leaving most of the cores idle.
Since Amber employs MPI and not a hybrid parallel approach, each MPI process
has to have access to some of the InfiniBand bandwidth (again, unless the
MPI implementation optimizes this out). I would suggest locking 'normal'
jobs to a single node (unless doing REMD-related simulations).
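
To put a rough number on the bandwidth argument, here is a back-of-the-envelope
sketch. It assumes one 4x QDR adapter per node (an assumption; the actual
hardware layout was not stated) and that every rank talks off-node at the same
time. It ignores latency and protocol overhead, so treat the result as an
optimistic upper bound, not a benchmark. The function name is made up for
illustration.

```python
# Rough per-rank share of InfiniBand bandwidth on a fully packed node.
# Assumptions (not from the original post): one 4x QDR HCA per node,
# all ranks communicating off-node simultaneously, no protocol overhead.

QDR_4X_SIGNAL_GBIT = 40.0      # 4x QDR signalling rate in Gbit/s
ENCODING_EFFICIENCY = 8 / 10   # QDR uses 8b/10b line encoding


def per_rank_gbit(ranks_per_node, hcas_per_node=1):
    """Return the usable Gbit/s each MPI rank gets if all share the link(s)."""
    usable = QDR_4X_SIGNAL_GBIT * ENCODING_EFFICIENCY * hcas_per_node
    return usable / ranks_per_node


if __name__ == "__main__":
    for ranks in (64, 16, 8):
        share = per_rank_gbit(ranks)
        print(f"{ranks:3d} ranks/node: about {share:.2f} Gbit/s per rank")
```

With all 64 cores in use, each rank is left with a small slice of the link,
which is why a second node is unlikely to help unless most cores sit idle.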


> Also, is there a suggested set of compilers and MPI versions that are
> appropriate for amber11? gcc 4.4.7 (painfully) compiles the serial version,
> but the openMPI version of the fortran compiler fails immediately on the
> parallel version of sander:
>
>
> root.hpc-s:/data/apps/amber/amber11/src
> 1457 $ make parallel
> Starting installation of Amber11 (parallel) at Wed Sep 25 12:38:38 PDT 2013.
> cd sander && make parallel
> make[1]: Entering directory `/data/apps/amber/amber11/src/sander'
> cpp -traditional constants.f > _constants.f
> mpif90 -c -O3 -mtune=generic -o constants.o _constants.f
> constants.f:24.1:
>
> module constants
> 1
> Error: Non-numeric character in statement label at (1)
>

The Fortran free-format flag is not being passed. This suggests to me that
AT15_Amber11.py was not run (and therefore that the updates were not
applied). Make sure that the Amber11 and AmberTools1.5 installations are
fully patched (a painful process, I know).
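
As a quick way to confirm that diagnosis from a build log, the following
sketch checks whether a logged compile command carries a free-form flag. The
flag list covers only gfortran (-ffree-form) and Intel ifort (-free, -FR);
other compilers use different spellings. The helper name is hypothetical, and
the "patched" command is only an illustration of what a fixed line might look
like, not output from a real Amber build.

```python
import shlex

# Free-form source flags for gfortran and Intel ifort only (not exhaustive).
FREE_FORM_FLAGS = {"-ffree-form", "-free", "-FR"}


def has_free_form_flag(command):
    """Return True if the compile command passes a known free-form flag."""
    return any(token in FREE_FORM_FLAGS for token in shlex.split(command))


# Compile line copied from the failing build log above.
logged = "mpif90 -c -O3 -mtune=generic -o constants.o _constants.f"

# Illustrative only: the same line with gfortran's free-form flag added.
patched = "mpif90 -c -O3 -mtune=generic -ffree-form -o constants.o _constants.f"

if __name__ == "__main__":
    print("logged command has free-form flag:", has_free_form_flag(logged))
    print("patched command has free-form flag:", has_free_form_flag(patched))
```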

Alternatively, and I strongly suggest this approach, install AmberTools 13
and Amber 12 instead (it is MUCH easier). UCI has a site license, as Dave
mentioned earlier, through Ray Luo's research group.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Sep 25 2013 - 13:30:03 PDT