Re: AMBER: parallel Sander (amber8): run-time error under Irix

From: Viktor Hornak <hornak.csb.sunysb.edu>
Date: Thu, 03 Jun 2004 15:01:40 -0400

Dear Karol,

We have compiled and run amber8 (all sander tests pass) on our SGI
without a problem: Irix 6.5.23m, MIPSpro 7.3.1.3m, SGI MPI 4.3 (MPT 1.8).
We don't have the MIPSpro 7.4.x compilers (due to licensing issues), so I
cannot confirm that sander runs with those.
Did you try compiling serial (non-MPI) sander? Does it work? That would
at least separate possible MPI problems from everything else (I am not
sure the stack trace points unambiguously at an MPI problem, and serial
sander is easy enough to try)...
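
If you want to poke at the MPI side directly, a tiny stand-alone program
that just calls the Fortran MPI_BARRIER (the routine at the top of your
traceback) should tell you whether MPT itself behaves with your
compilers. The sketch below is only an illustration; the compile and run
commands are assumptions for SGI MPT on Irix, so adjust them to your
installation:

      program barrier_test
c     Minimal check of the Fortran MPI barrier path, the same call
c     (pmpi_barrier_ -> MPI_SGI_barrier) that appears in the traceback.
      implicit none
      include 'mpif.h'
      integer ierr, rank
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      write(*,*) 'rank', rank, 'passed the barrier'
      call MPI_FINALIZE(ierr)
      end

Something like "f90 -n32 barrier_test.f -lmpi" followed by
"mpirun -np 2 ./a.out" should build and run it (again, those flags are
assumptions for your setup). If even this small test dies inside
MPI_SGI_barrier, the trouble is in the MPT/MIPSpro combination rather
than in amber8.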

Cheers,
-Viktor

Karol Miaskiewicz wrote:

>AMBER8
>Irix 6.5.23
>MIPSpro 7.4.2 compilers
>SGI MPI 4.3 (MPT 1.8)
>
>Sander compiles fine (we had problems compiling with the 7.4 compilers,
>but the upgrade to 7.4.2 fixed them). However, the program aborts at
>runtime when trying to run the tests that come with the program.
>
>Below is the traceback for the parallel version of Sander:
>
>
>MPI: Program ../../exe/sander, Rank 0, Process 1177431 received signal
>SIGSEGV(11)
>
>
>MPI: --------stack traceback-------
>PC: 0x5ddb100 MPI_SGI_stacktraceback in /usr/lib32/libmpi.so
>PC: 0x5ddb544 first_arriver_handler in /usr/lib32/libmpi.so
>PC: 0x5ddb7d8 slave_sig_handler in /usr/lib32/libmpi.so
>PC: 0xfaee79c _sigtramp in /usr/lib32/libc.so.1
>PC: 0x5dfb420 MPI_SGI_comm_coll_opt in /usr/lib32/libmpi.so
>PC: 0x5dfab20 MPI_SGI_barrier in /usr/lib32/libmpi.so
>PC: 0x5dfae80 PMPI_Barrier in /usr/lib32/libmpi.so
>PC: 0x5e32dd4 pmpi_barrier_ in /usr/lib32/libmpi.so
>PC: 0x10047e48 multisander in ../../exe/sander
>PC: 0xad69d74 main in /usr/lib32/libftn.so
>
>
>MPI: dbx version 7.3.3 (78517_Dec16 MR) Dec 16 2001 07:45:22
>MPI: Process 1177431 (sander_parallel) stopped at [__waitsys:24
>+0x8,0xfa53338]
>MPI: Source (of
>/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s) not
>available for Process 1177431
>MPI: > 0 __waitsys(0x0, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1, 0x0,
>0x0) ["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s":24,
>0xfa53338]
>MPI: 1 _system(0x7ffece70, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1,
>0x0, 0x0)
>["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/stdio/system.c":116,
>fa5f868]
>MPI: 2 MPI_SGI_stacktraceback(0x0, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4,
>0x1, 0x0, 0x0)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":242,
>0x5ddb268]
>MPI: 3 first_arriver_handler(0xb, 0x71756974, 0x7ffecda0, 0x3,
>0x10e4f4, 0x1, 0x0, 0x0)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":445,
>0x5ddb544]
>MPI: 4 slave_sig_handler(0xb, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1,
>0x0, 0x0)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":542,
>0x5ddb7e0]
>MPI: 5 _sigtramp(0x0, 0x126497, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0)
>["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/signal/sigtramp.s":71,
>0xfaee79c]
>MPI: 6 MPI_SGI_comm_coll_opt(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0,
>0x0, 0x5db062c)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/collutil.c":99,
>0x5dfb428]
>MPI: 7 MPI_SGI_barrier(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
>0x5db062c)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/barrier.c":53,
>0x5dfab20]
>MPI: 8 PMPI_Barrier(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
>0x5db062c)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/barrier.c":231,
>0x5dfae80]
>MPI: 9 pmpi_barrier_(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
>0x5db062c)
>["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/sgi77/barrier77.c":24,
>0x5e32dd4]
>MPI: 10 multisander(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
>0x5db062c) ["/usr/local/fbscapp/amber8/src/sander/_sander.f":374,
>0x10047e48]
>MPI: 11 main(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0, 0x5db062c)
>["/j7/mtibuild/v742/workarea/v7.4.2m/libF77/main.c":101, 0xad69d74]
>MPI: 12 __start()
>["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":177,
>0x10009ce8]
>
>MPI: -----stack traceback ends-----
>MPI: Program ../../exe/sander, Rank 0, Process 1177431: Dumping core on
>signal SIGSEGV(11) into directory /usr/local/fbscapp/amber8/test/cytosine
>MPI: Program ../../exe/sander, Rank 1, Process 1203283: Core dump on
>signal SIGSEGV(11) suppressed.
>MPI: MPI_COMM_WORLD rank 1 has terminated without calling MPI_Finalize()
>MPI: aborting job
>MPI: Received signal 9
>
> ./Run.cytosine: Program error
>
>
>This is an n32 executable. We also tried a 64-bit build, but the result
>is similar, i.e. a run-time abort.
>

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Thu Jun 03 2004 - 20:53:01 PDT