Re: [AMBER] MPI problem on TACC Stampede

From: Jason Swails <>
Date: Tue, 05 Nov 2013 08:14:13 -0500

On Tue, 2013-11-05 at 13:04 +0000, Sorensen, Jesper wrote:
> Hi Jason and others,
> After leaving this issue for a while, an MPI pro at TACC Stampede got
> back to me and said it is related to a bug in mvapich2 that will be
> fixed in the next release of mvapich2. In the meantime, he offered a
> temporary fix (see the details below). I tested this on 128 cores (8
> nodes) and it works now.
> I am sending this just in case others see a similar issue.
> > This is a known issue for the MVAPICH2 team when some third-party
> > libraries interact with their internal memory (ptmalloc) library.
> > They received similar reports earlier for MPI programs integrated
> > with Perl and some other external libraries. The interaction causes
> > memory functions to appear before the MVAPICH2 library in the
> > dynamic shared library ordering, which leads to the ptmalloc
> > initialization failure. MVAPICH2 2.0a has a fix for this issue, but
> > it is not yet available on Stampede.
> > For the time being, can you please try the run-time parameter
> > MV2_ON_DEMAND_THRESHOLD=<your job size>? With this parameter, your
> > application should continue without the registration cache feature,
> > at the cost of some performance degradation.

Cool. On the bright side, performance degradation is highly unlikely.
Outside of a few MPI_Barrier calls, the program does not carry out any
communication between threads (since none is necessary).
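
For anyone else who hits this, the workaround amounts to exporting the
variable before launching the MPI job. A minimal sketch for the 128-core
case Jesper describes (ibrun is TACC's launcher; the executable and input
file names below are only placeholders, not taken from his job):

    # Work around the mvapich2 ptmalloc issue by disabling the
    # registration cache; set the value to your total MPI task count.
    export MV2_ON_DEMAND_THRESHOLD=128
    ibrun pmemd.MPI -O -i mdin -p prmtop -c inpcrd -o mdout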

Thanks for the info,

Jason M. Swails
Rutgers University
Postdoctoral Researcher
AMBER mailing list
Received on Tue Nov 05 2013 - 05:30:03 PST