Re: [AMBER] job terminate abnormally

From: Jason Swails <jason.swails.gmail.com>
Date: Tue, 8 May 2012 10:37:51 -0400

Never mind, I see the problem. Re-run your simulation specifying the "-O"
command-line argument. The problem is with one of the bug fixes (it has a
bug, heh). We will get this fixed, but in the meantime you can use
pmemd.cuda -O to circumvent it.

HTH,
Jason

Note for the devs:

bugfix.1 (and its 1st fix, bugfix.3) addressed the flaky buffer-flushing of
modern Linux kernels by manually closing and reopening the restart file
every time we decided to write it. However, the reopening was done via:

+!We actually open the restart file here and close it afterwards rather than
+!using the original method of just rewinding in order to force newer 'broken?'
+!linuxes to flush the write buffer.
+
+ if (ntxo .le. 0) then
+ call amopen(restrt, restrt_name, owrite, 'U', 'W')
+ else
+ call amopen(restrt, restrt_name, owrite, 'F', 'W')
+ end if

Obviously, the 2nd time we write a restart file during the simulation, the
file already exists. If we didn't give "-O", the open fails here, every
time. This obviously needs to be fixed (because otherwise running without
-O is broken). The appropriate behavior would be to open it with "owrite"
the first time, then with an "I want to overwrite this file" value every
time after that.
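
To make the failure mode concrete, here is a small Python analogue of the
open logic described above. It is only a sketch and not AMBER code: the
function names and the "fixed" switch are made up for illustration, and
Python's "x" open mode stands in for a Fortran OPEN that refuses to touch
an existing file. With ntwr=100000 and no "-O", the 2nd write (step 200000)
is the one that fails, which would be consistent with the md.out quoted
below stopping right after step 198000.

```python
import os
import tempfile


def open_restart(path, overwrite):
    """Open the restart file; refuse to clobber it unless overwrite is set."""
    # Mode "x" raises FileExistsError if the file exists (stand-in for a
    # "new file only" open); mode "w" replaces whatever is there.
    return open(path, "w" if overwrite else "x")


def first_failed_write(path, nstlim, ntwr, overwrite, fixed):
    """Return the step at which the restart write fails, or None."""
    written_before = False
    for step in range(ntwr, nstlim + 1, ntwr):
        # Sketched fix (hypothetical): honor the user's choice only on the
        # first write, then always replace the file this run created.
        allow = overwrite or (fixed and written_before)
        try:
            with open_restart(path, allow) as fh:
                fh.write(f"restart at step {step}\n")
        except FileExistsError:
            return step
        written_before = True
    return None


with tempfile.TemporaryDirectory() as tmp:
    rst = os.path.join(tmp, "md.rst")
    # Current behavior without -O: the 2nd write trips over our own file.
    print(first_failed_write(rst, 300000, 100000, overwrite=False, fixed=False))
    os.remove(rst)
    # With the sketched fix, later writes replace the file.
    print(first_failed_write(rst, 300000, 100000, overwrite=False, fixed=True))
```

The first call prints 200000 and the second prints None. Note that with
the sketched fix, an md.rst left over from an earlier run would still
stop the first write when "-O" is not given, which is the protection the
flag is meant to provide.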

On Tue, May 8, 2012 at 10:30 AM, Jason Swails <jason.swails.gmail.com> wrote:

> What is your command-line syntax? (i.e., how did you run pmemd.cuda?)
>
> On Tue, May 8, 2012 at 10:27 AM, Marc van der Kamp <
> marcvanderkamp.gmail.com> wrote:
>
>> Hi,
>>
>> Well, the error message points to a probable cause; it says:
>>
>> Unit 16 Error on OPEN: md.rst
>>
>> So it appears that pmemd.cuda can't open the md.rst file for writing, and
>> the job crashes when writing md.rst is attempted for the first time
>> (ntwr=100000, so after 100000 steps). Is this correct? If so, you should
>> probably check the write permissions etc.
>>
>> If you want to test this efficiently, use something like
>> nstlim=10, ntpr=1, ntwr=2
>> in your input.
>>
>> --Marc
>>
>> On 8 May 2012 15:16, Albert <mailmd2011.gmail.com> wrote:
>>
>> > hello:
>> > I am using the following md.in for MD simulations with PMEMD.CUDA, but
>> > it terminates:
>> >
>> >
>> > production dynamics
>> > &cntrl
>> > imin=0, irest=1, ntx=5,
>> > nstlim=20000000, dt=0.002,
>> > ntc=2, ntf=2,
>> > cut=10.0, ntb=2, ntp=1, taup=2.0,
>> > ntpr=2000, ntwx=2000, ntwr=100000,
>> > ntt=3, gamma_ln=2.0,
>> > temp0=300.0,
>> > /
>> >
>> >
>> >
>> > I tried many times; it always terminates after a few minutes of running.
>> > I am wondering what happened. Is there anything wrong with my md.in file?
>> >
>> > thank you very much.
>> >
>> > Albert
>> >
>> >
>> >
>> >
>> > ----------------------------md.out--------------------------------------------------
>> >
>> >
>> > NSTEP = 196000 TIME(PS) = 1692.000 TEMP(K) = 301.37 PRESS = 190.5
>> > Etot = -93344.7044 EKtot = 22508.2617 EPtot = -115852.9661
>> > BOND = 905.2599 ANGLE = 2415.4628 DIHED = 3347.2978
>> > 1-4 NB = 1092.6237 1-4 EEL = 11987.6085 VDWAALS = 12588.5567
>> > EELEC = -148189.7755 EHBOND = 0.0000 RESTRAINT = 0.0000
>> > EKCMT = 9541.3962 VIRIAL = 8051.4272 VOLUME = 362241.7256
>> > Density = 1.0342
>> >
>> >
>> > ------------------------------------------------------------------------------
>> >
>> >
>> > NSTEP = 198000 TIME(PS) = 1696.000 TEMP(K) = 297.58 PRESS = -166.2
>> > Etot = -93770.8213 EKtot = 22225.8613 EPtot = -115996.6826
>> > BOND = 847.2596 ANGLE = 2482.9980 DIHED = 3334.3182
>> > 1-4 NB = 1069.4747 1-4 EEL = 11918.1088 VDWAALS = 12286.2835
>> > EELEC = -147935.1254 EHBOND = 0.0000 RESTRAINT = 0.0000
>> > EKCMT = 9447.8886 VIRIAL = 10748.3041 VOLUME = 362310.2508
>> > Density = 1.0340
>> >
>> >
>> > ------------------------------------------------------------------------------
>> > Unit 16 Error on OPEN: md.rst
>> >
>> >
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>> >
>>
>
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
>



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
Received on Tue May 08 2012 - 08:00:03 PDT