Re: [AMBER] how to keep the previous restart file when saving the current one

From: Thomas Evangelidis <tevang3.gmail.com>
Date: Thu, 24 Jan 2013 15:11:23 +0200

Hi Jason,

I use PBS/Torque with dependencies, but I suspect the problem is that I use
"depend=afterok". Since I set "walltime=24:00:00" in my PBS file, if AMBER
completes nstlim steps in less than 24 hours, it kills all the jobs in the
queue. See the example stderr below:

Killing processes of user lspro220u2 on the batch nodes
Node: g16-ib
Done
---------------------------------------------------------
Resources Requested:

11229

---------------------------------------------------------
Resources Used:

cput=19:37:47,mem=173040kb,vmem=67609700kb,walltime=19:38:18


I guess the solution is to use "depend=afterany" instead. I'll let you
know how it goes.
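
For reference, Torque releases an "afterok" job only if the previous job
exits without errors, whereas "afterany" releases it once the previous job
terminates, whatever its exit status. Below is a minimal Python sketch of a
chained submission. The script names and the submit_chain helper are
hypothetical; only qsub and its "-W depend=..." option come from Torque itself.

```python
import subprocess


def qsub(args):
    """Run qsub with the given arguments and return the job ID it prints."""
    result = subprocess.run(
        ["qsub", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def submit_chain(scripts, dependency="afterany", submit=qsub):
    """Submit scripts in order so that each job waits for the previous one.

    `dependency` is the Torque dependency type (e.g. "afterany", "afterok").
    `submit` is injectable so the chaining logic can be tried without a queue.
    """
    job_ids = []
    for script in scripts:
        args = []
        if job_ids:
            # Make this job depend on the job submitted just before it.
            args += ["-W", f"depend={dependency}:{job_ids[-1]}"]
        args.append(script)
        job_ids.append(submit(args))
    return job_ids


if __name__ == "__main__":
    # Hypothetical script names; replace with your own PBS files.
    print(submit_chain(["job1.pbs", "job2.pbs", "job3.pbs"]))
```

With "afterany", job2 would start even if job1 was killed at the walltime
limit, so each job script should itself check that a usable restart file
exists before launching AMBER.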

On another note, since you raised that issue, writing restart files less
frequently than every ntwx steps will result in redundant frames when the
next job (job2) resumes. E.g., if ntwx=1000, ntwr=10000 and the job is killed
for some reason at step 639400, then the next job will resume from step
630000, which means that traj1.crd will have 9 frames that overlap
with traj2.crd. That will lead to confusion when concatenating the
trajectories for analysis, especially in my case, where I run aMD and
have to parse the amd.log files as well. Is there an elegant way to discard
the overlapping frames from all the traj*.crd files?
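
The arithmetic in this example can be sketched in Python as follows. This is
only an illustration under stated assumptions: the job starts at step 0, ntwr
is positive and a multiple of ntwx, and the helper name frames_to_keep is
hypothetical. The printed line uses cpptraj's "trajin <file> <start> <stop>"
form to read only the non-overlapping frames of traj1.crd.

```python
def frames_to_keep(killed_step, ntwx, ntwr):
    """Return (frames to keep, overlapping frames) for an interrupted run.

    Assumes the run started at step 0, a frame is written every ntwx steps,
    a restart is written every ntwr steps, and ntwr is a multiple of ntwx.
    """
    # Last step at which a restart file was written before the job died.
    restart_step = (killed_step // ntwr) * ntwr
    # Frames actually present in the trajectory of the killed job.
    frames_written = killed_step // ntwx
    # Frames written up to the restart point; later ones get re-simulated.
    frames_kept = restart_step // ntwx
    return frames_kept, frames_written - frames_kept


keep, overlap = frames_to_keep(killed_step=639400, ntwx=1000, ntwr=10000)
print(keep, overlap)                  # 630 9
print(f"trajin traj1.crd 1 {keep}")   # trajin traj1.crd 1 630
```

If a log file has exactly one record per saved frame (an assumption that
should be checked for amd.log), the same count tells you how many records to
keep from the interrupted job.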

Thomas




On 24 January 2013 01:26, Jason Swails <jason.swails.gmail.com> wrote:

> Aron already mentioned the negative values of ntwr, which is good advice
> and will do what you want probably better than the 2-restart approach.
>
> Other unsolicited advice based on my experience -- set nstlim to something
> that will actually finish within the amount of time. Most queuing systems
> will allow you to submit a stack of jobs that will execute one right after
> the other (via dependencies). If you are using PBS/torque, see this page
> as a quick intro to taking full(er) advantage of its capabilities:
> http://jswails.wikidot.com/using-pbs
>
> Also, I would suggest writing restart files infrequently -- maybe only 5
> times the entire simulation. Restart files are by far the most expensive
> things to write out, since they're written (typically) in ASCII at higher
> precision, and with velocities as well as coordinates. This makes them well
> over 2x larger than a single frame of a mdcrd file. By setting ntwr =
> -nstlim / 5, you will get at most 5 restarts and will lose at most 1/5 of
> the total amount of data you were expecting.
>
> HTH,
> Jason
>
> P.S., if you use SGE instead of Torque/PBS, you can still set up
> dependencies, but the usage is slightly different. Manual pages help here!
>
> On Wed, Jan 23, 2013 at 5:17 PM, Thomas Evangelidis <tevang3.gmail.com>
> wrote:
>
> > Dear AMBER users,
> >
> > Is it possible to save 2 restart (.rst) files instead of one every ntwr
> > steps, namely the current one (e.g. trajectory.rst) and the previous one
> > (e.g. trajectory.old.rst)? For some reason the queueing system on the
> > supercomputer I am using cannot resume the simulation from the previous
> > one if the previous has finished in less than 24 hours. For instance AMBER
> > runs 6 ns/day for my system so in the configuration file I put
> > "nstlim=5000000, dt=0.002". This raises the problem that the queueing
> > system may kill the job while AMBER writes the restart file, hence the
> > next job cannot start. That results in a loss of 24 GPU hours.
> >
> >
> > Thomas
> >
> >
> > --
> >
> > ======================================================================
> >
> > Thomas Evangelidis
> >
> > PhD student
> > University of Athens
> > Faculty of Pharmacy
> > Department of Pharmaceutical Chemistry
> > Panepistimioupoli-Zografou
> > 157 71 Athens
> > GREECE
> >
> > email: tevang.pharm.uoa.gr
> >
> > tevang3.gmail.com
> >
> >
> > website: https://sites.google.com/site/thomasevangelidishomepage/
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
>
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
>



-- 
======================================================================
Thomas Evangelidis
PhD student
University of Athens
Faculty of Pharmacy
Department of Pharmaceutical Chemistry
Panepistimioupoli-Zografou
157 71 Athens
GREECE
email: tevang.pharm.uoa.gr
          tevang3.gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/
Received on Thu Jan 24 2013 - 05:30:02 PST