Re: [AMBER] How to read mdinfo file without crashing the simulation?

From: Karolina Markowska <markowska.kar.gmail.com>
Date: Mon, 25 Apr 2016 14:00:08 +0200

OK, this sounds like something that might work.
Could you give me some advice on how to run a job with this mdinfo trick?

Best regards.



2016-04-25 0:46 GMT+02:00 Bill Ross <ross.cgl.ucsf.edu>:

> One way to hack around it would be to have the code output a new file
> each time: mdinfo.1, mdinfo.2 or what have you (numbered by step). Then
> you'd never collide, though you'd have to delete a bunch of small files,
> and worst case might exceed the number of files a directory can hold.
>
> Bill
>
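[Editorial sketch of the numbered-file idea above. This is not pmemd code: pmemd writes mdinfo from Fortran (runfiles.F90), so applying the idea for real would presumably mean changing the source. The Python below only illustrates the pattern. The function names (write_snapshot, latest_snapshot, prune_snapshots) are hypothetical, and the write-to-temp-then-rename step is an addition that Bill did not mention, included so a reader never opens a half-written file. The pruning helper addresses the "bunch of small files" concern.]

```python
import os
import re
import tempfile

SNAPSHOT_RE = re.compile(r"mdinfo\.(\d+)")  # matches mdinfo.<step> only


def write_snapshot(directory, step, text):
    """Write one status snapshot to its own file, e.g. mdinfo.950000."""
    final_path = os.path.join(directory, f"mdinfo.{step}")
    # Assumption (not in the original suggestion): write under a temporary
    # name, then rename, so the numbered file only ever appears complete.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".mdinfo.tmp.")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, final_path)
    return final_path


def snapshot_steps(directory):
    """Return the sorted step numbers of all mdinfo.<step> files."""
    matches = (SNAPSHOT_RE.fullmatch(name) for name in os.listdir(directory))
    return sorted(int(m.group(1)) for m in matches if m)


def latest_snapshot(directory):
    """Return the path of the highest-numbered snapshot, or None."""
    steps = snapshot_steps(directory)
    if not steps:
        return None
    return os.path.join(directory, f"mdinfo.{steps[-1]}")


def prune_snapshots(directory, keep=5):
    """Delete all but the `keep` highest-numbered snapshots."""
    steps = snapshot_steps(directory)
    for step in steps[:max(len(steps) - keep, 0)]:
        os.remove(os.path.join(directory, f"mdinfo.{step}"))
```

[With this layout a monitoring script would read latest_snapshot(".") and never touch the file currently being written; whether that actually avoids the crash reported in this thread is untested.]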
> On 4/24/16 12:16 PM, Karolina Markowska wrote:
> > I can do "cat mdinfo" and in most cases everything runs just fine.
> > But if I run "cat mdinfo" at the exact moment when the contents of mdinfo
> > change, I get the resource unavailable error and the simulation crashes.
> > Also, if I use "tail mdinfo" while pmemd is changing the mdinfo file,
> > everything crashes. This is probably related to the queueing system, because
> > when I run the simulation without submitting it to the queue, everything is
> > OK.
> > If I don't look at the mdinfo file during the whole simulation, everything
> > runs OK.
> >
> > 2016-04-23 20:55 GMT+02:00 Bill Ross <ross.cgl.ucsf.edu>:
> >
> >> contents of mdinfo
> >>
> >> On 4/23/16 11:53 AM, Bill Ross wrote:
> >>> So with pmemd, you can run for an arbitrary amount of time, and then the
> >>> moment you 'cat mdinfo' the simulation crashes on resource unavailable,
> >>> and the contents of pmemd are different each time, and if you don't look
> >>> at the file, the job runs to completion?
> >>>
> >>> Bill
> >>>
> >>> On 4/22/16 2:24 AM, Karolina Markowska wrote:
> >>>> I'm using Ubuntu 14.04, and the file system is ext4. We're using quota.
> >>>>
> >>>> I don't have any problem with the "ls -l mdinfo". I'm the owner of this
> >>>> file, I (theoretically) can read it or change it. The file is present with
> >>>> a non-zero length and I can read it using "cat mdinfo" command.
> >>>> -rw-r----- 1 karolinam user 1257 kwi 21 14:57 mdinfo
> >>>>
> >>>> It looks OK, I guess:
> >>>>
> >>>> NSTEP = 950000 TIME(PS) = 52980.000 TEMP(K) = 298.41 PRESS = 0.0
> >>>> Etot = -140475.4219 EKtot = 36982.3438 EPtot = -177457.7657
> >>>> BOND = 1208.0342 ANGLE = 3085.0042 DIHED = 5565.0947
> >>>> 1-4 NB = 1290.0424 1-4 EEL = 12081.5006 VDWAALS = 21022.0260
> >>>> EELEC = -221794.1217 EHBOND = 0.0000 RESTRAINT = 0.0000
> >>>> EAMD_BOOST = 84.6540
> >>>> ------------------------------------------------------------------------------
> >>>> | Current Timing Info
> >>>> | -------------------
> >>>> | Total steps : 25000000 | Completed : 950000 | Remaining : 24050000
> >>>> |
> >>>> | Average timings for last 20000 steps:
> >>>> | Elapsed(s) = 93.93 Per Step(ms) = 4.70
> >>>> | ns/day = 36.79 seconds/ns = 2348.27
> >>>> |
> >>>> | Average timings for all steps:
> >>>> | Elapsed(s) = 4461.87 Per Step(ms) = 4.70
> >>>> | ns/day = 36.79 seconds/ns = 2348.36
> >>>> |
> >>>> |
> >>>> | Estimated time remaining: 31.4 hours.
> >>>> ------------------------------------------------------------------------------
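[Editorial sketch: since the goal stated in this thread is to compare timings, the snippet below shows one way to pull NSTEP and the ns/day figures out of mdinfo-style text like the dump above. The function name parse_mdinfo_timings and the result keys are hypothetical, and the regular expressions assume the layout shown in this message; other Amber versions may format the file differently.]

```python
import re


def parse_mdinfo_timings(text):
    """Extract NSTEP and the ns/day figures from mdinfo-style text."""
    result = {}
    step = re.search(r"NSTEP\s*=\s*(\d+)", text)
    if step:
        result["nstep"] = int(step.group(1))
    # Each timing section is a header line ("Average timings for ...")
    # followed, a line or two later, by an "ns/day = <value>" entry.
    sections = re.findall(
        r"Average timings for (last\s+\d+|all) steps:.*?ns/day\s*=\s*([\d.]+)",
        text,
        flags=re.DOTALL,
    )
    for label, value in sections:
        key = "ns_per_day_all" if label == "all" else "ns_per_day_recent"
        result[key] = float(value)
    return result
```

[Running this on a copy of the file taken after the job has finished sidesteps the question of reading mdinfo while pmemd is writing it.]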
> >>>> This issue does not depend on the type of MD simulation I run - it happens
> >>>> during classical MD and aMD.
> >>>> I reran the same job on the CPU, typed "tail -f mdinfo" and nothing
> >>>> happened. The simulation is running.
> >>>> Could it be a problem with pmemd.cuda?
> >>>> I've run a simulation on a cluster without PBS (on CPU and GPU) and...
> >>>> everything worked. I don't get it.
> >>>>
> >>>> Best regards.
> >>>>
> >>>>
> >>>> 2016-04-21 14:14 GMT+02:00 David A Case <david.case.rutgers.edu>:
> >>>>
> >>>>> On Thu, Apr 21, 2016, Karolina Markowska wrote:
> >>>>>> I have a strange problem. I'm running different aMD simulation tests
> >>>>>> and I want to compare the timings. I know I can find that kind of
> >>>>>> information in the mdinfo file, but here comes the problem: several
> >>>>>> times when I opened the mdinfo file (using just "cat mdinfo"), the
> >>>>>> simulation crashed and I got an error:
> >>>>>> At line 810 of file runfiles.F90 (unit = 7, file = 'mdinfo')
> >>>>>> Fortran runtime error: Resource temporarily unavailable
> >>>>> I don't remember any reports like this, and the amber developers
> >>>>> (including me) do this all the time.
> >>>>>
> >>>>> What is your operating system; do you know what sort of file system is
> >>>>> being used on the drive where the mdinfo file is?
> >>>>>
> >>>>> Do you run into problems with commands like "ls -l mdinfo"? Is a mdinfo
> >>>>> file present (with non-zero length) when you execute the "cat mdinfo"
> >>>>> command?
> >>>>> Can you try to narrow down the problem? Does it depend on using aMD vs.
> >>>>> regular MD? Using GPUs vs CPUs? Submitted to a queuing system vs.
> >>>>> running interactively?
> >>>>>
> >>>>> ....thx...dac
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> AMBER mailing list
> >>>>> AMBER.ambermd.org
> >>>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>>>
Received on Mon Apr 25 2016 - 05:30:02 PDT