Re: [AMBER] Amber12 issue with SMD protocol

From: Niel Henriksen <niel.henriksen.utah.edu>
Date: Thu, 26 Jul 2012 16:14:24 +0000

On monster machines like Kraken, it is advantageous to
bundle several hundred jobs together. Because of hardware
constraints, performance can be variable, so some jobs
finish earlier than others and you waste CPU cycles. Across
several thousand processors this gets expensive. One
approach is to let all the jobs run up to the wall-clock
limit, but if the buffers aren't flushed cleanly you end up
with corrupted restart files, which is annoying.
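(For comparison, reliably forcing data out to disk generally takes an
explicit flush plus an fsync, not just a close. A minimal Python
sketch; the filename "md.rst" and the restart_data variable are
placeholders:)

    import os

    restart_data = "..."          # placeholder for the restart coordinates

    with open("md.rst", "w") as f:
        f.write(restart_data)
        f.flush()                 # push the user-space buffer to the OS
        os.fsync(f.fileno())      # ask the OS to commit it to disk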

A better approach is to implement a time limit rather than
a step-count limit, as sketched below. I've done it but
haven't put it in git yet.
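
The logic is roughly this (a minimal Python sketch of the idea; the
Amber engines themselves are Fortran, and NSTLIM, take_md_step and
write_restart here are hypothetical stand-ins for the real input
variable and routines):

    import time

    NSTLIM = 500000               # usual maximum-step limit
    WALL_LIMIT = 24 * 3600        # queue wall-clock limit, in seconds
    SAFETY_MARGIN = 300           # stop early to leave time for output

    def take_md_step():           # hypothetical: one MD integration step
        pass

    def write_restart(step):      # hypothetical: write the restart file
        pass

    start = time.time()
    for step in range(NSTLIM):
        take_md_step()
        if time.time() - start > WALL_LIMIT - SAFETY_MARGIN:
            break                 # out of time: leave the loop cleanly

    write_restart(step)           # file is written and closed normally,
                                  # so its buffers actually reach disk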

--Niel
________________________________________
From: David Case [dacase.rci.rutgers.edu]
Sent: Thursday, July 26, 2012 10:02 AM
To: AMBER Mailing List
Subject: Re: [AMBER] Amber12 issue with SMD protocol

On Jul 26, 2012, at 9:14 AM, Jan-Philip Gehrcke <jgehrcke.googlemail.com> wrote:

>
> Actually "the only sure-fire way of forcing a buffer flush is to close
> the open file unit" is imprecise.

My question is this: is this a real problem? I very rarely have jobs that fail to finish or fail to get their results written to disk. I generally limit any single run to one or two days, to limit the loss if something prevents results from being dumped to disk. And on the rare occasions when something bad does happen, I almost always restart from the last fully completed run, and never try to rescue a partial set of results.

So, what are other people's experiences? Are there places where this is a bigger problem than I see?

...dac


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Thu Jul 26 2012 - 09:30:04 PDT