Re: [AMBER] error on running sander.MPI

From: Ross Walker <ross.rosswalker.co.uk>
Date: Sun, 12 Dec 2010 10:22:19 -0800

Hi George,

Ok, I have figured this out. The crucial error is:

> MPIR_Localcopy(349)...........: memcpy arguments alias each other,
> dst=0x1099e26f8 src=0x1099e26f8 len=54936
> APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

and comes from line 544 of parallel.f in $AMBERHOME/src/sander/:

#ifdef NO_RED_SCAT_INPLACE
     ! It seems some mpi implementations won't do an mpi_reduce_scatter
     ! in place. Turn this on if you get an error when you don't have a
     ! power of two cpus.
     call mpi_reduce_scatter(f, tmp(iparpt3(mytaskid)+1), &
           rcvcnt3, MPI_DOUBLE_PRECISION, mpi_sum, &
           commsander, ierr)
     f(iparpt3(mytaskid)+1:iparpt3(mytaskid+1)) = &
           tmp(iparpt3(mytaskid)+1:iparpt3(mytaskid+1))
#else
     call mpi_reduce_scatter(f, f(iparpt3(mytaskid)+1), &
           rcvcnt3, MPI_DOUBLE_PRECISION, mpi_sum, &
           commsander, ierr)
#endif

Technically, the MPI standard does not allow the send and receive buffers to
point to the same (or overlapping) memory. For collectives like
mpi_reduce_scatter this was rarely enforced, and just about every MPI
implementation worked fine anyway. Recently, however, various MPI
installations, mpich in particular, have started doing more rigorous error
checking: they now test whether the send and receive buffers overlap and, if
they do, quit with an error.
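
If you want to check whether your MPI build enforces this, a minimal
standalone test along these lines (my own sketch, not Amber code; compile
with mpif90 and run under mpirun) should trip the same check on a strict
mpich:

program alias_test
   implicit none
   include 'mpif.h'
   integer, parameter :: n = 4            ! elements per rank
   integer :: ierr, rank, nproc
   integer, allocatable :: counts(:)
   double precision, allocatable :: f(:)

   call mpi_init(ierr)
   call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
   call mpi_comm_size(MPI_COMM_WORLD, nproc, ierr)
   allocate(counts(nproc), f(n*nproc))
   counts = n
   f = 1.0d0
   ! The send buffer (f) and the receive buffer (f(rank*n+1)) overlap,
   ! just as in sander's default code path; a strict MPI build aborts
   ! here with the "memcpy arguments alias each other" error.
   call mpi_reduce_scatter(f, f(rank*n+1), counts, &
         MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'no aliasing check tripped; f(1:n) =', f(1:n)
   call mpi_finalize(ierr)
end program alias_test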

A simple fix is to edit $AMBERHOME/src/config.h, add -DNO_RED_SCAT_INPLACE
next to every definition of -DMPI in that file, then do a make clean and
recompile. You should then be good to go.
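
For example (the variable name below is illustrative; whichever lines in
your config.h carry -DMPI are the ones to change), a line that currently
reads something like

   FPPFLAGS= -P -DMPI

becomes

   FPPFLAGS= -P -DMPI -DNO_RED_SCAT_INPLACE

and then, from $AMBERHOME/src:

   make clean
   make parallel

(or however you originally built the MPI executables).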

I will issue a bugfix to the code to make the NO_RED_SCAT_INPLACE version
the default, since it looks like this problem will occur more and more
frequently with new MPI versions.

Let me know how you get on.

All the best
Ross

> -----Original Message-----
> From: George Tzotzos [mailto:gtzotzos.me.com]
> Sent: Sunday, December 12, 2010 10:15 AM
> To: AMBER Mailing List
> Subject: Re: [AMBER] error on running sander.MPI
>
> Thank you Ross
>
> pmemd.MPI -np 12 runs fine. Thanks, Jason, for the hint.
>
> I am running Amber 11. I installed the program only two days ago, and it is
> running with the latest bugfixes.
>
> Best regards
>
> George
>
>
> On Dec 12, 2010, at 7:09 PM, Ross Walker wrote:
>
> > Hi George,
> >
> > Firstly, it is NOT advisable to run sander.MPI with a number of
> > processors that is not a power of 2. Sander uses a different way of
> > collating the forces when the cpu count is not a power of 2, and thus
> > the efficiency is often a LOT lower. You may find that using 8 cores
> > is actually faster than 12. What I suspect is happening is that there
> > is a bug in the code used when the number of MPI processes is not a
> > power of 2. PMEMD is unaffected by this because it uses point-to-point
> > communications.
> >
> > The non-power-of-2 code in sander is probably very rarely tested, so
> > it is possible there is a bug in there.
> >
> > Can you let me know the exact version of AMBER you are running, along
> > with what bugfixes have been applied?
> >
> > All the best
> > Ross
> >
> >> -----Original Message-----
> >> From: George Tzotzos [mailto:gtzotzos.me.com]
> >> Sent: Sunday, December 12, 2010 8:10 AM
> >> To: AMBER Mailing List
> >> Subject: Re: [AMBER] error on running sander.MPI
> >>
> >> Apologies. Fingers are sometimes quicker than the mind.
> >>
> >> The installed compiler is gcc44.
> >>
> >> Just in case this helps in the diagnostics.
> >>
> >> Setting -np to 8 works.
> >>
> >> pmemd.MPI with -np 12 also works.
> >>
> >> Cheers
> >>
> >> George
> >>
> >>
> >> On Dec 12, 2010, at 3:46 PM, George Tzotzos wrote:
> >>
> >>> Many thanks for the prompt reply.
> >>>
> >>> I'm using OSX 10.6.5 (2 x 2.93 GHz 6-Core Intel Xeon)
> >>>
> >>> The MPI is mpich2 and the compiler gcc installed from MacPorts
> >>>
> >>> The output file is attached.
> >>>
> >>> Many thanks
> >>>
> >>> George <heat.out>
> >>>
> >>>
> >>>
> >>>
> >>> On Dec 12, 2010, at 3:32 PM, case wrote:
> >>>
> >>>> On Sun, Dec 12, 2010, George Tzotzos wrote:
> >>>>>
> >>>>> In the previous message mpirun -np was set to 12.
> >>>>> Setting -np to 4 does not produce the same error. sander.MPI works.
> >>>>
> >>>> Can you tell us what version of MPI you are using, and which
> >>>> compilers and OS? Also, what was in the output file? (That is, did
> >>>> the failure occur right away?)
> >>>>
> >>>> Second, could you try the simulation without the &rst namelist?
> >>>> That facility doesn't get used with minimization, and is not often
> >>>> used, so there might be a bug lurking related to that.
> >>>>
> >>>> ...thanks...dac


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Dec 12 2010 - 10:30:07 PST