Hi David,
Thank you for the reply. As you said, a vlimit problem by itself would not stop a run, so there must be some other reason. I have copied the relevant part of the *.log file below; could you help me diagnose it?
==================================
vlimit exceeded for step ******; vmax = 29.9206
vlimit exceeded for step ******; vmax = 21.0694
vlimit exceeded for step ******; vmax = 26.8572
vlimit exceeded for step ******; vmax = 46.3269
vlimit exceeded for step ******; vmax = 27.0013
vlimit exceeded for step ******; vmax = 45.7748
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 36 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
orterun has exited due to process rank 36 with PID 3004 on
node n-2-12 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by orterun (as reported here).
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libintlc.so.5 00002AD753ADC1F9 Unknown Unknown Unknown
libintlc.so.5 00002AD753ADAB70 Unknown Unknown Unknown
libifcoremt.so.5 00002AD75291D5EF Unknown Unknown Unknown
libifcoremt.so.5 00002AD7528819B9 Unknown Unknown Unknown
libifcoremt.so.5 00002AD75289350E Unknown Unknown Unknown
libpthread.so.0 0000003F8740EB10 Unknown Unknown Unknown
libpthread.so.0 0000003F8740B725 Unknown Unknown Unknown
libmlx4-rdmav2.so 00002AAAAAABDECC Unknown Unknown Unknown
mca_btl_openib.so 00002AD7569EFCAF Unknown Unknown Unknown
libopen-pal.so.0 00002AD75238A394 Unknown Unknown Unknown
libmpi.so.0 00002AD751E7E1B1 Unknown Unknown Unknown
libmpi.so.0 00002AD751EAF492 Unknown Unknown Unknown
libmpi_f77.so.0 00002AD751C39839 Unknown Unknown Unknown
pmemd.MPI 00000000005AAFA1 Unknown Unknown Unknown
pmemd.MPI 00000000005A8D8A Unknown Unknown Unknown
pmemd.MPI 00000000005A6E9F Unknown Unknown Unknown
pmemd.MPI 0000000000573E28 Unknown Unknown Unknown
pmemd.MPI 0000000000561DFD Unknown Unknown Unknown
pmemd.MPI 00000000004E98C2 Unknown Unknown Unknown
pmemd.MPI 000000000041E2D6 Unknown Unknown Unknown
libc.so.6 0000003F86C1D994 Unknown Unknown Unknown
pmemd.MPI 000000000041E1D9 Unknown Unknown Unknown
Apr 30 00:41:22 2014 6470 4 7.06 checkPAMRESActionTab: action 31 RES_KILL_TASKS 9 to host <n-11-13> timed out after 60 seconds
Apr 30 00:47:25 2014 6470 3 7.06 PAM: waitForPJLExit: Timed out while waiting for PJL to exit. Sending SIGKILL
TID HOST_NAME COMMAND_LINE STATUS TERMINATION_TIME
===== ========== ================ ======================= ===================
00000 n-3-3 pmemd.MPI -p ./G Killed by PAM (SIGKILL) 04/30/2014 00:40:22
00001 n-11-12 pmemd.MPI -p ./G Killed by PAM (SIGKILL) 04/30/2014 00:40:22
00002 n-11-12 pmemd.MPI -p ./G Killed by PAM (SIGKILL) 04/30/2014 00:40:25
00003 n-11-12 pmemd.MPI -p ./G Signaled (SIGPIPE) 04/30/2014 00:40:25
... ...
00062 n-4-8 pmemd.MPI -p ./G Killed by PAM (SIGKILL) 04/30/2014 00:40:25
00063 n-2-16 pmemd.MPI -p ./G Exit (1) 04/30/2014 00:40:25
====================
So, is the problem caused by the MPI job or by vlimit? I tried a smaller time step, but that did not cure it. Thank you.
Best,
gw
-----Original Message-----
From: David A Case [mailto:case.biomaps.rutgers.edu]
Sent: Tuesday, April 29, 2014 8:55 PM
To: AMBER Mailing List
Subject: Re: [AMBER] Extract velocity from the restart file
On Tue, Apr 29, 2014, Yin, Guowei wrote:
>
> During my MD runs, I ran into the message "vlimit exceeded for step
> ******; vmax = 20.4201". Once this appeared, the run stopped. It
> usually happened a few ns to a dozen ns after the simulation
> started.
Yikes...looks like we need to change the format of the warning message, now that the numbers of steps are larger than in the past. But a vlimit problem itself does not stop a run...something else bad must have happened.
>
> I may use a smaller step length to circumvent it. But to better assess
> the cause of this "vlimit exceeded" message, I hope to extract the
> velocities from the restart file to see which atoms in which residues
> have the highest velocities. Does anyone know how to extract the
> velocities from the restart file and assign them to each atom?
> Are there any commands for this?
Depends on the sort of restart file you have: if you used the default ntxo=1, it's easy to find the velocities with a text editor; see the file formats link on the Amber web site for details on what is in the file. But knowing which atoms are involved may not be very helpful: vlimit problems are often not very closely related to particular atoms.
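For the formatted (ntxo=1) case, the same lookup can be scripted rather than done by eye. What follows is only a minimal sketch, not an Amber tool: the function names parse_velocities and fastest_atoms are made up for illustration. It assumes the usual formatted layout (a title line, a line starting with the atom count, then coordinates and velocities written six 12-character fields per line) and that the file really contains a velocity block. Check the file formats page before relying on it. Atoms are ranked by their largest single velocity component, on the assumption that this is what the vmax value in the warning refers to.

```python
import math


def parse_velocities(text):
    """Return a list of (vx, vy, vz) tuples from a formatted Amber restart.

    Assumed layout: line 1 title, line 2 starts with NATOM, then
    ceil(3*NATOM/6) coordinate lines followed by the same number of
    velocity lines, each holding up to six 12-character fields.
    Values are left in Amber's internal units (Angstrom per 1/20.455 ps).
    """
    lines = text.splitlines()
    natom = int(lines[1].split()[0])
    per_block = math.ceil(3 * natom / 6)  # lines per coordinate/velocity block
    vel_lines = lines[2 + per_block: 2 + 2 * per_block]

    flat = []
    for line in vel_lines:
        line = line.rstrip()
        # Slice by fixed width: adjacent fields may have no space between them.
        for i in range(0, len(line), 12):
            flat.append(float(line[i:i + 12]))

    if len(flat) != 3 * natom:
        # Caveat: for very small systems a box line could pass this check.
        raise ValueError("no complete velocity block found in restart file")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def fastest_atoms(velocities, n=10):
    """Return the n atoms with the largest single velocity component.

    Each entry is (atom_number, largest_abs_component, speed), with
    atom_number counted from 1 as in the topology file.
    """
    ranked = []
    for idx, (vx, vy, vz) in enumerate(velocities, start=1):
        largest = max(abs(vx), abs(vy), abs(vz))
        speed = math.sqrt(vx * vx + vy * vy + vz * vz)
        ranked.append((idx, largest, speed))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

To use it, read the restart file into a string, call parse_velocities on it, and pass the result to fastest_atoms. Mapping the atom numbers back to residues needs the topology file and is not covered here.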
> In general, are there any suggestions for solving the "vlimit" problem?
No general rules, since it depends on the size of the system, whether or not you are using a barostat or a thermostat (and with what parameters), the step size, how well equilibrated your system is, whether you have restraints or other artificial components, etc. Generally, longer values of tautp and taup help, and (as you noted) shorter time steps help. But you may have to do some experimentation.
...dac
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Apr 30 2014 - 08:30:02 PDT