Re: AMBER: job crashes

From: JunJun Liu <ljjlp03.gmail.com>
Date: Tue, 26 Sep 2006 23:39:52 -0400

Hi Xiaowei,

Search for "semop lock failed" and you will see that it is related to the
shared memory used by MPI. Try running "cleanipcs" on each compute node to
clean up the leftover IPC resources.
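
If "cleanipcs" is not installed on your nodes, roughly the same cleanup can
be sketched with the standard Linux "ipcs" and "ipcrm" tools. The Python
below is only a minimal sketch under some assumptions: it assumes the usual
"ipcs -s" table layout (key, semid, owner, ...), the function names are
made up for illustration, and it only prints the "ipcrm -s" commands
instead of running them. Leftover shared memory segments could be handled
the same way with "ipcs -m" and "ipcrm -m".

```python
import getpass
import subprocess


def stale_semaphore_ids(ipcs_output, owner):
    """Return the semaphore IDs in `ipcs -s` output that belong to `owner`."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows are assumed to look like: key semid owner perms nsems
        if len(fields) >= 3 and fields[0].startswith("0x") and fields[1].isdigit():
            if fields[2] == owner:
                ids.append(int(fields[1]))
    return ids


def removal_commands(sem_ids):
    """Build one `ipcrm -s <id>` argument list per semaphore ID."""
    return [["ipcrm", "-s", str(sem_id)] for sem_id in sem_ids]


def dry_run():
    """Print the cleanup commands for the current user without executing them."""
    listing = subprocess.run(
        ["ipcs", "-s"], capture_output=True, text=True, check=True
    ).stdout
    for cmd in removal_commands(stale_semaphore_ids(listing, getpass.getuser())):
        print(" ".join(cmd))
```

Call dry_run() on each compute node while none of your jobs are running
there, check the printed commands, and only then run them, since removing
the semaphores of a live MPI job will kill it.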

Good luck!

Liu

On Tue, 26 Sep 2006 23:09:04 -0400, Xiaowei (David) Li <xl3a.virginia.edu>
wrote:

> Dear all:
> I have been getting frequent MD job crashes (almost every job I submit)
> in my recent simulation work. The crashes always happen near the end of
> the simulation (for example, around 950 ps into a 1 ns simulation). All
> of the error messages contain "semop lock failed", as shown below.
> Job is running on node(s):
> ------------------------
> compute-2-5 compute-2-6 compute-2-7 compute-2-9
> ------------------------
> p4_error: latest msg from perror: Invalid argument
> p0_9469: p4_error: OOPS: semop lock failed: -1
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> p3_28418: (207846.623829) net_send: could not write to fd=5, errno = 32
> forrtl: error (69): process interrupted (SIGINT)
> p0_9469: (207848.918605) net_send: could not write to fd=4, errno = 32:
>
> I was running the parallel simulation with MPI on a Linux cluster with
> AMD Opteron 244 processors. The input file is:
> &cntrl
> imin = 0,
> irest = 1,
> ntx = 5,
> ntb = 2,
> pres0 = 1.0,
> ntp = 1,
> tautp = 5,
> taup = 5,
> cut = 10,
> ntr = 0,
> ntc = 2,
> ntf = 2,
> tempi = 300.0,
> temp0 = 300.0,
> ntt = 1,
> nstlim =500000,
> dt = 0.002,
> ntpr = 100,
> ntwx = 100,
> ntwr = 1000,
> nscm=1,
> &end
> Any help or suggestion will be deeply appreciated. Thanks.
>
> Best,
> Xiaowei Li
> University of Virginia
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber.scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu



-- 
JunJun Liu
College of Chemistry
Central China Normal University
WuHan   430079
P.R. China
Received on Wed Sep 27 2006 - 06:07:24 PDT