Hi Amber users,
This concerns Amber 9.
The serial version of sander works on both i386 and ia64 machines
running CentOS Linux.
The parallel version of sander (sander.MPI) crashes in the middle of
the run on ia64.
On i386 it runs to the end, but then hangs for several minutes before it
crashes with an MPI write error.
When I use the MPICH library, the job runs for about 18 minutes before
it crashes with this error:
steinar.compute-0-0 % time mpirun -np 4 --all-local
/home/steinar/src/amber9/amber9/exe/sander.MPI -i heat.in
-p VSA_DNA_min_wat.prmtop -c VSA_DNA_min_all.rst -r VSA_DNA_heat.rst -x
VSA_DNA_heat.mdcrd -o heat.out.1 -ref VSA_DNA_min_all.rst
p0_31185: p4_error: interrupt SIGx: 15
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
forrtl: error (69): process interrupted (SIGINT)
p0_31185: (1090.372580) net_send: could not write to fd=4, errno = 32
real 18m10.854s
user 16m38.686s
sys 0m32.270s
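For reference, the numeric codes in the MPICH output above can be looked up with a few lines of Python. This is only a sketch for reading the log, assuming the standard Linux numbering; the helper name decode_codes is made up for illustration and is not part of Amber or MPICH.

```python
import errno
import os
import signal


def decode_codes(errno_value, signal_numbers):
    """Return symbolic names for an errno value and a list of signal numbers.

    Hypothetical helper for reading the log; not part of Amber or MPICH.
    """
    err_name = errno.errorcode.get(errno_value, "unknown")
    err_text = os.strerror(errno_value)
    sig_names = [signal.Signals(n).name for n in signal_numbers]
    return err_name, err_text, sig_names


if __name__ == "__main__":
    # Values copied from the log: "errno = 32", "SIGx: 15", "Killed by signal 2."
    name, text, sigs = decode_codes(32, [15, 2])
    print(name, "-", text)
    print(sigs)
```

On Linux, errno 32 is EPIPE (broken pipe), meaning the write went to a connection whose other end had already closed. So the net_send message is probably a consequence of another process having stopped first, rather than the original cause of the crash.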
When I try another MPI library, namely ScaMPI, I get a similar
error:
steinar.compute-0-0 % time mpimon
/home/steinar/src/amber9/amber9/exe/sander.MPI -i heat.in -p
VSA_DNA_min_wat.prmtop -c VSA_DNA_min_all.rst -r VSA_DNA_heat.rst -x
VSA_DNA_heat.mdcrd -o heat.out.1 -ref VSA_DNA_min_all.rst -- compute-0-0 4
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
--- mpimon --- Aborting run after process-1 terminated abnormally
Childprocess 2874 exited with exitcode 1 ---
real 57m51.757s
user 0m0.002s
sys 0m0.005s
When using pmemd, it crashes in the middle of the run on both
i386 and ia64.
I am attaching the output and input files.
Thanks in advance for the reply.
Regards
Rafi
- application/octet-stream attachment: heat.in
Received on Sun Oct 22 2006 - 06:07:25 PDT