Re: Linux cluster Amber7 sander mpi error -- Null communicator, IOT Trap

From: Jim Newhouse <james.newhouse_at_mhpcc.hpc.mil>
Date: Tue 07 Jan 2003 09:07:32 -1000

From Jim Newhouse at Maui High Performance Computing Center:

An observation. I'm installing Amber 7 on our Linux cluster (Intel
933MHz Pentium II 2-way SMP, Red Hat Linux, Myrinet). The lines in the
MACHINE file that set the optimization levels contain a flag that causes
the "undefined reference" errors. The -Msecond_underscore flag makes the
compiler append a second underscore to external names that already
contain an underscore (mpi_init becomes mpi_init__), so the loader
cannot match them to the mpich subroutines in the libraries, which on
our system don't have the second underscore. Try removing it.

I hope this helps,
Jim
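
To make the naming mismatch concrete, here is a minimal Python sketch of the usual Fortran 77 name-mangling convention: one trailing underscore always, plus a second one when the second-underscore option is on and the name already contains an underscore. The helper names (mangle, unresolved) and the stand-in symbol list are hypothetical, and the "library" here is assumed to have been built with single-underscore names as described above. It is an illustration of the convention, not the behavior of any particular linker.

```python
def mangle(name: str, second_underscore: bool) -> str:
    """Return the external symbol a Fortran 77 compiler would emit for `name`.

    Assumed convention: lowercase the name and append one underscore;
    with the second-underscore option enabled, append another one if the
    name itself already contains an underscore.
    """
    symbol = name.lower() + "_"
    if second_underscore and "_" in name:
        symbol += "_"
    return symbol


def unresolved(calls, library_symbols, second_underscore):
    """List the mangled names the loader would fail to find in the library."""
    wanted = {mangle(c, second_underscore) for c in calls}
    return sorted(wanted - set(library_symbols))


# MPI routines named in the link errors quoted in this thread.
calls = ["MPI_INIT", "MPI_COMM_RANK", "MPI_COMM_SIZE", "MPI_BCAST"]

# Hypothetical stand-in for a libmpich.a built WITHOUT the second underscore.
library = [mangle(c, second_underscore=False) for c in calls]

# Compiled with the second underscore: every MPI call is unresolved.
print(unresolved(calls, library, second_underscore=True))
# -> ['mpi_bcast__', 'mpi_comm_rank__', 'mpi_comm_size__', 'mpi_init__']

# Compiled without it: everything resolves.
print(unresolved(calls, library, second_underscore=False))
# -> []
```

On a real system the same comparison can be made by listing the library's symbols with nm and checking whether the MPI entry points end in one underscore or two.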

Chris Switzer wrote:

>Amber 7 sander newly compiled on a linux cluster gives the following error:
>
>1 - MPI_COMM_RANK : Null communicator
>[1] Aborting program !
>[1] Aborting program!
>Process aborting...
>IOT Trap
>0 - MPI_COMM_RANK : Null communicator
>[0] Aborting program !
>[0] Aborting program!
>Process aborting...
>IOT Trap
>-------------------
>
>System:
> Linux cluster 2.4.9-31smp
> Mpich-1.2.1..7b
>-------------------
>
>Makefile used to create the Amber7 sander giving above error:
> An altered version of "Machine.g77_mpich". Amber7 would only finish compiling when the "g77" references in Machine.g77_mpich were changed to "mpif77" per a reflector e-mail.
>-------------------
>
>It is noteworthy that when Amber7 is compiled non-parallel using Machine.g77 without any alterations, sander runs fine.
>-------------------
>-------------------
>
>Some additional notes...
>Compiling behavior with other machine files:
>
>Compiling with the unaltered Machine.g77_mpich gives errors of the following type:
> g77 -c -g _nxtsec_.f
> ../Compile LOAD -o new2oldparm new2oldparm.o nxtsec.o
> g77 -O6 -o new2oldparm new2oldparm.o nxtsec.o -lm -L/usr/local/mpich-1.2.4..8/lib -lmpich
> /usr/local/mpich-1.2.4..8/lib/libmpich.a(gmpi_regcache.o): In function `gmpi_regcache_init':
> gmpi_regcache.o(.text+0x1e): undefined reference to `gm_hash_hash_ptr'
> gmpi_regcache.o(.text+0x23): undefined reference to `gm_hash_compare_ptrs'
> etc....
> make[1]: *** [new2oldparm] Error 1
> make[1]: Leaving directory `/home/switzer/amber7/src/lib'
> make: *** [install] Error 2
>
> I also have pgf77. Compiling with Machine.pgf77_mpi gives errors of the following sort:
> SYSLIB=`../sysdir lib` ; ../Compile LOAD -o sander \
> sander.o ....etc..... ../blas/blas.a ../lib/nxtsec.o $SYSLIB;
> pgf77 -o sander sander.o cshf.o ....etc..... decomp.o ../lapack/lapack.a ../blas/blas.a ../lib/nxtsec.o
> /home/switzer/amber7/src/Machines/standard/sys.a -lm
> sander.o: In function `trajene':
> _sander_.f:1084: undefined reference to `mpi_init__'
> _sander_.f:1085: undefined reference to `mpi_comm_rank__'
> _sander_.f:1086: undefined reference to `mpi_comm_size__'
> _sander_.f:1282: undefined reference to `mpi_bcast__'
> ....etc....
> new_time.o(.text+0x2f46): undefined reference to `mpi_send__'
> new_time.o(.text+0x32dd): undefined reference to `mpi_recv__'
> make[1]: *** [sander] Error 1
> make[1]: Leaving directory `/home/switzer/amber7/src/sander'
> make: *** [install] Error 2
>-------------------
>
>Any help much appreciated.
>
>Sincerely,
>
>Chris Switzer
>Chemistry Dept
>UC Riverside
>Riverside, CA
>92521
>
Received on Tue Jan 07 2003 - 11:07:32 PST