Re: [AMBER] How to read mdinfo file without crashing the simulation?

From: Bill Ross <ross.cgl.ucsf.edu>
Date: Sun, 24 Apr 2016 15:46:55 -0700

One way to hack around it would be to have the code output a new file
each time: mdinfo.1, mdinfo.2 or what have you (numbered by step). Then
you'd never collide, though you'd have to delete a bunch of small files,
and worst case might exceed the number of files a directory can hold.
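
A minimal sketch of that numbered-file scheme, in Python purely for illustration. pmemd writes mdinfo from Fortran, so the real change would have to go in there. The `mdinfo.<step>` naming, the helper names, and the `keep` parameter are assumptions of this sketch and not anything Amber provides. Pruning old files right after each write is one way to avoid piling up small files.

```python
import os
import re
import tempfile

SNAPSHOT = re.compile(r"^mdinfo\.(\d+)$")  # hypothetical naming: mdinfo.<step>


def list_snapshots(directory):
    """Return (step, path) pairs for every mdinfo.<step> file, oldest first."""
    found = []
    for name in os.listdir(directory):
        match = SNAPSHOT.match(name)
        if match:
            found.append((int(match.group(1)), os.path.join(directory, name)))
    return sorted(found)  # numeric sort, so step 10000 follows step 9000


def write_snapshot(directory, step, text, keep=3):
    """Write a new numbered file, then delete all but the newest `keep`."""
    path = os.path.join(directory, f"mdinfo.{step}")
    with open(path, "w") as handle:
        handle.write(text)
    snapshots = list_snapshots(directory)
    # Everything before the last `keep` entries is stale and can go.
    for _, old_path in snapshots[:max(len(snapshots) - keep, 0)]:
        os.remove(old_path)
    return path


def read_latest(directory):
    """Return the text of the newest snapshot, or None if there is none."""
    snapshots = list_snapshots(directory)
    if not snapshots:
        return None
    _, newest = snapshots[-1]
    with open(newest) as handle:
        return handle.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for step in (9000, 10000, 11000, 12000):
            write_snapshot(workdir, step, f"NSTEP = {step}\n", keep=2)
        print([step for step, _ in list_snapshots(workdir)])
        print(read_latest(workdir))
```

The writer never reopens a file it has already finished, so a reader looking at an older snapshot cannot get in its way. A cautious reader could take the second-newest file rather than the newest, in case the newest is still being written.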

Bill

On 4/24/16 12:16 PM, Karolina Markowska wrote:
> I can do "cat mdinfo" and in most cases everything runs just fine.
> But if I run "cat mdinfo" at the exact moment when the contents of mdinfo
> change, I get the resource unavailable error and the simulation crashes.
> Also if I use "tail mdinfo" and pmemd changes the mdinfo file, everything
> crashes. And this is probably related to the queueing system because when
> I run the simulation without submitting it to the queue - everything is
> OK.
> If I don't look into the mdinfo file during the whole simulation, everything
> runs OK.
>
> 2016-04-23 20:55 GMT+02:00 Bill Ross <ross.cgl.ucsf.edu>:
>
>> contents of mdinfo
>>
>> On 4/23/16 11:53 AM, Bill Ross wrote:
>>> So with pmemd, you can run for an arbitrary amount of time, and then the
>>> moment you 'cat mdinfo' the simulation crashes on resource unavailable,
>>> and the contents of pmemd are different each time, and if you don't look
>>> at the file, the job runs to completion?
>>>
>>> Bill
>>>
>>> On 4/22/16 2:24 AM, Karolina Markowska wrote:
>>>> I'm using Ubuntu 14.04, and the file system is ext4. We're using quota.
>>>>
>>>> I don't have any problem with the "ls -l mdinfo". I'm the owner of this
>>>> file, I (theoretically) can read it or change it. The file is present with
>>>> a non-zero length and I can read it using "cat mdinfo" command.
>>>> -rw-r----- 1 karolinam user 1257 kwi 21 14:57 mdinfo
>>>>
>>>> It looks OK, I guess:
>>>>
>>>> NSTEP = 950000 TIME(PS) = 52980.000 TEMP(K) = 298.41 PRESS = 0.0
>>>> Etot = -140475.4219 EKtot = 36982.3438 EPtot = -177457.7657
>>>> BOND = 1208.0342 ANGLE = 3085.0042 DIHED = 5565.0947
>>>> 1-4 NB = 1290.0424 1-4 EEL = 12081.5006 VDWAALS = 21022.0260
>>>> EELEC = -221794.1217 EHBOND = 0.0000 RESTRAINT = 0.0000
>>>> EAMD_BOOST = 84.6540
>>>>
>>>> ------------------------------------------------------------------------------
>>>> | Current Timing Info
>>>> | -------------------
>>>> | Total steps : 25000000 | Completed : 950000 | Remaining : 24050000
>>>> |
>>>> | Average timings for last 20000 steps:
>>>> | Elapsed(s) = 93.93 Per Step(ms) = 4.70
>>>> | ns/day = 36.79 seconds/ns = 2348.27
>>>> |
>>>> | Average timings for all steps:
>>>> | Elapsed(s) = 4461.87 Per Step(ms) = 4.70
>>>> | ns/day = 36.79 seconds/ns = 2348.36
>>>> |
>>>> |
>>>> | Estimated time remaining: 31.4 hours.
>>>>
>>>> ------------------------------------------------------------------------------
>>>> This issue does not depend on the type of MD simulation I run - it happens
>>>> during classical MD and aMD.
>>>> I reran the same job using the CPU, typed "tail -f mdinfo" and nothing
>>>> happened. The simulation is running.
>>>> Could it be a problem with pmemd.cuda?
>>>> I've run a simulation on a cluster without PBS (on CPU and GPU) and...
>>>> everything worked. I don't get it.
>>>>
>>>> Best regards.
>>>>
>>>>
>>>> 2016-04-21 14:14 GMT+02:00 David A Case <david.case.rutgers.edu>:
>>>>
>>>>> On Thu, Apr 21, 2016, Karolina Markowska wrote:
>>>>>> I have a strange problem. I'm running different aMD simulation tests and I
>>>>>> want to compare the timings. I know I can find that kind of information in
>>>>>> the mdinfo file, but here comes the problem: several times when I opened
>>>>>> the mdinfo file (using just "cat mdinfo"), the simulation crashed and I
>>>>>> got an error:
>>>>>> At line 810 of file runfiles.F90 (unit = 7, file = 'mdinfo')
>>>>>> Fortran runtime error: Resource temporarily unavailable
>>>>> I don't remember any reports like this, and the amber developers (including
>>>>> me) do this all the time.
>>>>>
>>>>> What is your operating system; do you know what sort of file system is being
>>>>> used on the drive where the mdinfo file is?
>>>>>
>>>>> Do you run into problems with commands like "ls -l mdinfo"? Is a mdinfo file
>>>>> present (with non-zero length) when you execute the "cat mdinfo" command?
>>>>> Can you try to narrow down the problem? Does it depend on using aMD vs.
>>>>> regular MD? Using GPUs vs CPUs? Submitted to a queuing system vs.
>>>>> running interactively?
>>>>>
>>>>> ....thx...dac
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> AMBER mailing list
>>>>> AMBER.ambermd.org
>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>


Received on Sun Apr 24 2016 - 16:00:04 PDT