Re: AMBER: parallel Sander (amber8): run-time error under Irix

From: Karol Miaskiewicz <miaskiew.ncifcrf.gov>
Date: Thu, 3 Jun 2004 11:29:06 -0400

On Thu, 3 Jun 2004, Carlos Simmerling wrote:

> did you try changing the stack settings in your
> shell? we had similar problems on SGI that were
> resolved by unlimiting stack, etc.
> Scott Brozell posted something on this earlier:
> limit datasize unlimited
> limit stacksize unlimited
> limit memoryuse unlimited
>
> I'm not sure that this will resolve your problem
> but it may be worth trying.


We cannot unlimit the stacksize. In our configuration, it is capped via
kernel configuration at 2 GB (still, 2 GB is quite a sizable stack, so
that should not be a problem):

cputime unlimited
filesize unlimited
datasize unlimited
stacksize 2097152 kbytes
coredumpsize unlimited
memoryuse unlimited
descriptors 2500
vmemoryuse unlimited
threads 1024
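
As a quick check on the 2 GB figure, here is a minimal sketch for a
Bourne-type shell. It is an assumption that sh is available alongside csh:
the listing above comes from the csh `limit` builtin, and `ulimit` is its
sh counterpart. The helper name `kb_to_gb` is made up for illustration,
and the sketch assumes `ulimit -s` reports kbytes, as the listing above does.

```shell
#!/bin/sh
# Hypothetical helper: convert a limit given in kbytes to whole GB.
kb_to_gb() {
    echo $(( $1 / 1024 / 1024 ))
}

# Report the soft and hard stack limits of the current shell.
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "soft stacksize: $soft"
echo "hard stacksize: $hard"

# Only convert numeric values; "unlimited" has no size to convert.
case "$soft" in
    unlimited) ;;
    *) echo "soft stacksize in GB (rounded down): $(kb_to_gb "$soft")" ;;
esac
```

With the value from the listing, `kb_to_gb 2097152` gives 2, which matches
the 2 GB cap mentioned above.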


        Karol


>
> ----- Original Message -----
> From: "Karol Miaskiewicz" <miaskiew.ncifcrf.gov>
> To: <amber.scripps.edu>
> Sent: Thursday, June 03, 2004 10:42 AM
> Subject: AMBER: parallel Sander (amber8): run-time error under Irix
>
>
> >
> > AMBER8
> > Irix 6.5.23
> > MIPSpro 7.4.2 compilers
> > SGI MPI 4.3 (MPT 1.8)
> >
> > Sander compiles fine (although we had problems compiling with 7.4
> > compilers, but the upgrade to 7.4.2 fixed them). However, program aborts
> > at runtime when trying to run tests that come with the program.
> >
> > Below is the traceback for parallel version of Sander:
> >
> >
> > MPI: Program ../../exe/sander, Rank 0, Process 1177431 received signal
> > SIGSEGV(11)
> >
> >
> > MPI: --------stack traceback-------
> > PC: 0x5ddb100 MPI_SGI_stacktraceback in /usr/lib32/libmpi.so
> > PC: 0x5ddb544 first_arriver_handler in /usr/lib32/libmpi.so
> > PC: 0x5ddb7d8 slave_sig_handler in /usr/lib32/libmpi.so
> > PC: 0xfaee79c _sigtramp in /usr/lib32/libc.so.1
> > PC: 0x5dfb420 MPI_SGI_comm_coll_opt in /usr/lib32/libmpi.so
> > PC: 0x5dfab20 MPI_SGI_barrier in /usr/lib32/libmpi.so
> > PC: 0x5dfae80 PMPI_Barrier in /usr/lib32/libmpi.so
> > PC: 0x5e32dd4 pmpi_barrier_ in /usr/lib32/libmpi.so
> > PC: 0x10047e48 multisander in ../../exe/sander
> > PC: 0xad69d74 main in /usr/lib32/libftn.so
> >
> >
> > MPI: dbx version 7.3.3 (78517_Dec16 MR) Dec 16 2001 07:45:22
> > MPI: Process 1177431 (sander_parallel) stopped at [__waitsys:24
> > +0x8,0xfa53338]
> > MPI: Source (of
> > /xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s) not
> > available for Process 1177431
> > MPI: > 0 __waitsys(0x0, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1, 0x0,
> > 0x0) ["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s":24,
> > 0xfa53338]
> > MPI: 1 _system(0x7ffece70, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1,
> > 0x0, 0x0)
> > ["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/stdio/system.c":116,
> > fa5f868]
> > MPI: 2 MPI_SGI_stacktraceback(0x0, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4,
> > 0x1, 0x0, 0x0)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":242,
> > 0x5ddb268]
> > MPI: 3 first_arriver_handler(0xb, 0x71756974, 0x7ffecda0, 0x3,
> > 0x10e4f4, 0x1, 0x0, 0x0)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":445,
> > 0x5ddb544]
> > MPI: 4 slave_sig_handler(0xb, 0x126497, 0x7ffecda0, 0x3, 0x10e4f4, 0x1,
> > 0x0, 0x0)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":542,
> > 0x5ddb7e0]
> > MPI: 5 _sigtramp(0x0, 0x126497, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0)
> > ["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/signal/sigtramp.s":71,
> > 0xfaee79c]
> > MPI: 6 MPI_SGI_comm_coll_opt(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0,
> > 0x0, 0x5db062c)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/collutil.c":99,
> > 0x5dfb428]
> > MPI: 7 MPI_SGI_barrier(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
> > 0x5db062c)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/barrier.c":53,
> > 0x5dfab20]
> > MPI: 8 PMPI_Barrier(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
> > 0x5db062c)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/coll/barrier.c":231,
> > 0x5dfae80
> > MPI: More (n if no)?]
> > MPI: 9 pmpi_barrier_(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
> > 0x5db062c)
> > ["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/sgi77/barrier77.c":24,
> > 0x5e32dd4]
> > MPI: 10 multisander(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0,
> > 0x5db062c) ["/usr/local/fbscapp/amber8/src/sander/_sander.f":374,
> > 0x10047e48]
> > MPI: 11 main(0x0, 0x43bd700, 0x0, 0x7ffedb70, 0x0, 0x0, 0x0, 0x5db062c)
> > ["/j7/mtibuild/v742/workarea/v7.4.2m/libF77/main.c":101, 0xad69d74]
> > MPI: 12 __start()
> > ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":177,
> > 0x10009ce8]
> >
> > MPI: -----stack traceback ends-----
> > MPI: Program ../../exe/sander, Rank 0, Process 1177431: Dumping core on
> > signal SIGSEGV(11) into directory /usr/local/fbscapp/amber8/test/cytosine
> > MPI: Program ../../exe/sander, Rank 1, Process 1203283: Core dump on
> > signal SIGSEGV(11) suppressed.
> > MPI: MPI_COMM_WORLD rank 1 has terminated without calling MPI_Finalize()
> > MPI: aborting job
> > MPI: Received signal 9
> >
> > ./Run.cytosine: Program error
> >
> >
> > This is an n32 executable. We tried 64-bit, but the result is similar,
> > i.e. run-time abort.
> >
> >
> > --
> > Karol Miaskiewicz
> > Advanced Biomedical Computing Center
> > NCI-Frederick, PO Box B, Frederick, MD 21702
> > miaskiew.ncifcrf.gov, phone 301-8465664, fax 301-8465762
> >
> > -----------------------------------------------------------------------
> > The AMBER Mail Reflector
> > To post, send mail to amber.scripps.edu
> > To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
> >
>
Received on Thu Jun 03 2004 - 16:53:01 PDT