On Tue, 2013-11-05 at 13:04 +0000, Sorensen, Jesper wrote:
> Hi Jason and others,
>
>
> After leaving this issue alone for a while, an MPI expert at TACC
> Stampede got back to me and said it is related to a bug in MVAPICH2
> that will be fixed in the next release. In the meantime he offered a
> temporary workaround (see the details below). I tested it on 128 cores
> (8 nodes), and it works now.
> I am sending this in case others run into a similar issue.
>
> > This is a known issue for the MVAPICH2 team: it occurs when certain
> > third-party libraries interact with MVAPICH2's internal memory
> > (ptmalloc) library. They have received similar reports before from
> > MPI programs integrated with Perl and other external libraries. The
> > interaction causes the libc.so memory functions to appear before the
> > MVAPICH2 library (libmpich.so) in the dynamic shared-library
> > ordering, which leads to the ptmalloc initialization failure.
> > MVAPICH2 2.0a has a fix for this issue, but it is not yet available
> > on Stampede.
> > For the time being, please try the run-time parameter
> > MV2_ON_DEMAND_THRESHOLD=<your job size>. With this parameter, your
> > application should continue without the registration cache feature,
> > at the cost of some performance degradation.
Cool. On the bright side, performance degradation is highly unlikely:
outside of a few MPI_Barrier calls, MMPBSA.py.MPI does not perform any
communication between threads (since none is necessary).
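
For anyone who needs to apply the workaround, the parameter is just an
environment variable, so in a job script it would look roughly like the
lines below. This is only a sketch: the ibrun launch line and input
names are illustrative, the value should match your own MPI task count,
and it assumes your launcher passes exported environment variables
through to the MPI ranks.

  # set before launching; 128 matches the 128-core run mentioned above
  export MV2_ON_DEMAND_THRESHOLD=128
  # then launch as usual, e.g.
  ibrun MMPBSA.py.MPI -i mmpbsa.in ...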
Thanks for the info,
Jason
--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Nov 05 2013 - 05:30:03 PST