Re: [AMBER] mpi problem

From: Syed Tarique Moin <tarisyed.yahoo.com>
Date: Fri, 18 May 2012 00:30:15 -0700 (PDT)

Thanks, the problem is solved now.

Regards

 
Tarique



>>________________________________
>> From: Jason Swails <jason.swails.gmail.com>
>>To: Syed Tarique Moin <tarisyed.yahoo.com>; AMBER Mailing List <amber.ambermd.org>
>>Sent: Tuesday, May 15, 2012 7:42 PM
>>Subject: Re: [AMBER] mpi problem
>>
>>Run it the same way you ran sander.MPI on several nodes (using mpirun).
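>>
>>For example, assuming the test program below is saved as test_mpi.f90 and
>>compiled to an executable named test_mpi (both names are just placeholders),
>>you can launch it across the nodes with the same machinefile you used for
>>sander.MPI:
>>
>>mpif90 test_mpi.f90 -o test_mpi
>>mpirun -np 8 -machinefile /etc/mpich/machines.LINUX ./test_mpi
>>
>>If this small program fails with the same MPI_Barrier error, the problem is
>>in the MPI/cluster setup rather than in Amber.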
>>
>>On Tue, May 15, 2012 at 3:29 AM, Syed Tarique Moin <tarisyed.yahoo.com> wrote:
>>
>>> No, the program given below ran smoothly on a single node, but I do not
>>> know how to check across different nodes.
>>>
>>> Regards
>>>
>>>
>>> Tarique
>>>
>>>
>>>
>>>
>>> >________________________________
>>> > From: Jason Swails <jason.swails.gmail.com>
>>> >To: AMBER Mailing List <amber.ambermd.org>
>>> >Sent: Tuesday, May 15, 2012 12:53 PM
>>> >Subject: Re: [AMBER] mpi problem
>>> >
>>> >Is this problem unique to Amber, or do you get this kind of issue with any
>>> >MPI program?
>>> >
>>> >My suggestion is to write a quick MPI program that does some basic
>>> >collective communication and see if it works across different nodes.  An
>>> >example Fortran program is:
>>> >
>>> >program test_mpi
>>> >   implicit none
>>> >   include 'mpif.h'
>>> >   integer holder, ierr
>>> >   ! initialize MPI
>>> >   call mpi_init(ierr)
>>> >   ! broadcast one integer from rank 0 to every other rank
>>> >   holder = 1
>>> >   call mpi_bcast(holder, 1, mpi_integer, 0, mpi_comm_world, ierr)
>>> >   call mpi_finalize(ierr)
>>> >end program test_mpi
>>> >
>>> >You can compile it with "mpif90 program_name.f90". Do you still get the
>>> >same error with this program?
>>> >
>>> >On Mon, May 14, 2012 at 11:04 PM, Syed Tarique Moin <tarisyed.yahoo.com>
>>> >wrote:
>>> >
>>> >> Hello,
>>> >>
>>> >> I have compiled mpich2 and amber12 with the Intel compiler successfully.
>>> >> When I run the job with sander.MPI on a single node with multiple cores,
>>> >> it runs without errors. But the same job run across multiple nodes gives
>>> >> the following errors with both mpirun and mpiexec.
>>> >>
>>> >> Kindly guide me.
>>> >>
>>> >> Thanks and Regards
>>> >>
>>> >>
>>> >> ------------------------------
>>> >>
>>> >> mpiexec -np 8 -machinefile /etc/mpich/machines.LINUX
>>> >> $AMBERHOME/bin/sander.MPI -O -i sim_cmplx_1000.in -o test.out -p a.prmtop
>>> >> -c sim_cmplx_1000_36.rst -r test.rst -x test.mdcrd -e test.mden &
>>> >> [1] 4374
>>> >> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>> >> Fatal error in PMPI_Barrier: Other MPI error, error stack:
>>> >> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
>>> >> MPIR_Barrier_impl(306)......:
>>> >> MPIR_Bcast_impl(1321).......:
>>> >> MPIR_Bcast_intra(1155)......:
>>> >> MPIR_Bcast_binomial(213)....: Failure during collective
>>> >> MPIR_Barrier_impl(292)......:
>>> >> MPIR_Barrier_or_coll_fn(121):
>>> >> MPIR_Barrier_intra(83)......:
>>> >> dequeue_and_set_error(596)..: Communication error with rank 0
>>> >>
>>> >> [1]+  Exit 1                  mpiexec -np 8 -machinefile
>>> >> /etc/mpich/machines.LINUX $AMBERHOME/bin/sander.MPI -O -i
>>> >> sim_cmplx_1000.in -o test.out -p a.prmtop
>>> >>
>>> >>  -----------------------------------------------
>>> >> -------------------------------------------
>>> >> mpirun -np 8 -machinefile
>>> >> /etc/mpich/machines.LINUX $AMBERHOME/bin/sander.MPI -O -i
>>> >> sim_cmplx_1000.in -o test.out -p a.prmtop -c sim_cmplx_1000_36.rst -r
>>> >> test.rst -x test.mdcrd -e test.mden &
>>> >>
>>> >>
>>> >>
>>> >> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>> >> Fatal error in PMPI_Barrier: Other MPI error, error stack:
>>> >> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
>>> >> MPIR_Barrier_impl(306)......:
>>> >> MPIR_Bcast_impl(1321).......:
>>> >> MPIR_Bcast_intra(1155)......:
>>> >> MPIR_Bcast_binomial(213)....: Failure during collective
>>> >> MPIR_Barrier_impl(292)......:
>>> >> MPIR_Barrier_or_coll_fn(121):
>>> >> MPIR_Barrier_intra(83)......:
>>> >> dequeue_and_set_error(596)..: Communication error with rank 0
>>> >>
>>> >> -------------------------------------------------------------
>>> >>
>>> >>
>>> >>
>>> >> Syed Tarique Moin
>>> >> Ph.D. Research Fellow,
>>> >> International Center for Chemical and Biological Sciences,
>>> >> University of Karachi
>>> >> _______________________________________________
>>> >> AMBER mailing list
>>> >> AMBER.ambermd.org
>>> >> http://lists.ambermd.org/mailman/listinfo/amber
>>> >>
>>> >
>>> >
>>> >
>>> >--
>>> >Jason M. Swails
>>> >Quantum Theory Project,
>>> >University of Florida
>>> >Ph.D. Candidate
>>> >352-392-4032
>>> >_______________________________________________
>>> >AMBER mailing list
>>> >AMBER.ambermd.org
>>> >http://lists.ambermd.org/mailman/listinfo/amber
>>> >
>>> >
>>> >
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>
>>
>>
>>
>>--
>>Jason M. Swails
>>Quantum Theory Project,
>>University of Florida
>>Ph.D. Candidate
>>352-392-4032
>>_______________________________________________
>>AMBER mailing list
>>AMBER.ambermd.org
>>http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber
>
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri May 18 2012 - 01:00:03 PDT