Re: [AMBER] mpi problem

From: Jason Swails <jason.swails.gmail.com>
Date: Tue, 15 May 2012 07:42:02 -0700

Run it the same way you ran sander.MPI on several nodes (using mpirun).
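
As a rough sketch of what that looks like, the script below reuses the process count and machinefile from the sander.MPI command in the quoted message. The file names test_mpi.f90 and ./test_mpi are assumptions for illustration. The hostname step is an optional extra check that processes actually start on every node.

```shell
#!/bin/sh
# Assumed names (not from the thread): test_mpi.f90 and ./test_mpi.
# NP and MACHINEFILE are the values used for sander.MPI in the quoted message.
SRC=test_mpi.f90
EXE=./test_mpi
NP=8
MACHINEFILE=/etc/mpich/machines.LINUX

# 1) Launch a non-MPI command to confirm processes start on every node.
CHECK_CMD="mpirun -np $NP -machinefile $MACHINEFILE hostname"
# 2) Launch the MPI test program across the same nodes.
RUN_CMD="mpirun -np $NP -machinefile $MACHINEFILE $EXE"

if command -v mpif90 >/dev/null 2>&1 && [ -f "$SRC" ] && mpif90 "$SRC" -o "$EXE"; then
    $CHECK_CMD
    $RUN_CMD
    echo "test_mpi exit status: $?"
else
    # Compiler or source file missing (or the build failed): just show the commands.
    echo "would run: $CHECK_CMD"
    echo "would run: $RUN_CMD"
fi
```

If the hostname step already fails, or only ever prints one machine name, that would point to the MPI launcher or network setup rather than to Amber.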

On Tue, May 15, 2012 at 3:29 AM, Syed Tarique Moin <tarisyed.yahoo.com> wrote:

> No, the program given below ran smoothly on a single node, but I do not know
> how to check it across different nodes.
>
> Regards
>
> Tarique
>
> >________________________________
> > From: Jason Swails <jason.swails.gmail.com>
> >To: AMBER Mailing List <amber.ambermd.org>
> >Sent: Tuesday, May 15, 2012 12:53 PM
> >Subject: Re: [AMBER] mpi problem
> >
> >Is this problem unique to Amber, or do you get this kind of issue with any
> >MPI program?
> >
> >My suggestion is to write a quick MPI program that does some basic
> >collective communication and see if it works across different nodes. An
> >example Fortran program is:
> >
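> >program test_mpi
> > implicit none
> > include 'mpif.h'
> > integer holder, ierr
> > ! Start MPI; ierr receives the error status of each call
> > call mpi_init(ierr)
> > holder = 1
> > ! Broadcast one integer from rank 0 to all ranks (a collective call)
> > call mpi_bcast(holder, 1, mpi_integer, 0, mpi_comm_world, ierr)
> > ! The Fortran binding of mpi_finalize also requires the ierr argument
> > call mpi_finalize(ierr)
> >end program test_mpi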
> >
> >You can compile it with "mpif90 program_name.f90". Do you still get the
> >same error with this program?
> >
> >On Mon, May 14, 2012 at 11:04 PM, Syed Tarique Moin <tarisyed.yahoo.com> wrote:
> >
> >> Hello,
> >>
> >> I have successfully compiled mpich2 and amber12 with the Intel compiler.
> >> When I run the job with sander.MPI on a single multicore node, it runs
> >> without errors. But the same job run on multiple nodes gives the
> >> following errors with both mpirun and mpiexec.
> >>
> >> Kindly guide me.
> >>
> >> Thanks and Regards
> >>
> >>
> >> ------------------------------
> >>
> >> mpiexec -np 8 -machinefile /etc/mpich/machines.LINUX
> >> $AMBERHOME/bin/sander.MPI -O -i sim_cmplx_1000.in -o test.out -p a.prmtop
> >> -c sim_cmplx_1000_36.rst -r test.rst -x test.mdcrd -e test.mden &
> >> [1] 4374
> >> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> >> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> >> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
> >> MPIR_Barrier_impl(306)......:
> >> MPIR_Bcast_impl(1321).......:
> >> MPIR_Bcast_intra(1155)......:
> >> MPIR_Bcast_binomial(213)....: Failure during collective
> >> MPIR_Barrier_impl(292)......:
> >> MPIR_Barrier_or_coll_fn(121):
> >> MPIR_Barrier_intra(83)......:
> >> dequeue_and_set_error(596)..: Communication error with rank 0
> >>
> >> [1]+ Exit 1 mpiexec -np 8 -machinefile
> >> /etc/mpich/machines.LINUX $AMBERHOME/bin/sander.MPI -O -i
> >> sim_cmplx_1000.in -o test.out -p a.prmtop
> >>
> >> -----------------------------------------------
> >> -------------------------------------------
> >> mpirun -np 8 -machinefile
> >> /etc/mpich/machines.LINUX $AMBERHOME/bin/sander.MPI -O -i
> >> sim_cmplx_1000.in -o test.out -p a.prmtop -c sim_cmplx_1000_36.rst -r
> >> test.rst -x test.mdcrd -e test.mden &
> >>
> >>
> >>
> >> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> >> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> >> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
> >> MPIR_Barrier_impl(306)......:
> >> MPIR_Bcast_impl(1321).......:
> >> MPIR_Bcast_intra(1155)......:
> >> MPIR_Bcast_binomial(213)....: Failure during collective
> >> MPIR_Barrier_impl(292)......:
> >> MPIR_Barrier_or_coll_fn(121):
> >> MPIR_Barrier_intra(83)......:
> >> dequeue_and_set_error(596)..: Communication error with rank 0
> >>
> >> -------------------------------------------------------------
> >>
> >>
> >>
> >> Syed Tarique Moin
> >> Ph.D. Research Fellow,
> >> International Center for Chemical and Biological Sciences,
> >> University of Karachi
> >> _______________________________________________
> >> AMBER mailing list
> >> AMBER.ambermd.org
> >> http://lists.ambermd.org/mailman/listinfo/amber
> >>
> >
> >
> >
> >--
> >Jason M. Swails
> >Quantum Theory Project,
> >University of Florida
> >Ph.D. Candidate
> >352-392-4032
> >
> >
> >
>



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
Received on Tue May 15 2012 - 08:00:05 PDT