Re: [AMBER] Stopping sander in production phase. Why?

From: Jason Swails <jason.swails.gmail.com>
Date: Fri, 23 Oct 2015 12:01:26 -0400

On Fri, Oct 23, 2015 at 11:48 AM, Atila Petrosian <atila.petrosian.gmail.com> wrote:

> Dear Jason,
>
> > It could be that the job was simply killed.
> > It could be a problem during cleanup of sander.
>
> How to avoid this job killing


That depends on why the job was killed, which, again, we can't know without
the error message. But this is not an Amber problem -- sander can't
prevent the OS from killing it. Maybe sander used too much memory... then
you need to ask for more memory. Maybe it ran too long... then you need to
ask for more time or run a shorter simulation. Maybe you used more CPUs
than you requested on the same node... then you need to stop doing that.
Maybe...
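
If no error message survives, the exit status of the process is often the
next best clue as to whether the OS or scheduler killed the job. The sketch
below is only an illustration, not part of Amber: describe_returncode is a
hypothetical helper, and it assumes a POSIX system where a process killed by
signal N shows up either as -N (Python's subprocess convention) or as
128 + N (the usual shell convention).

```python
import signal


def describe_returncode(returncode):
    """Turn a process return code into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # Python's subprocess module reports death-by-signal as -signum.
        signum = -returncode
    elif returncode > 128:
        # Shells usually report death-by-signal as 128 + signum.
        signum = returncode - 128
    else:
        return "exited with error code %d" % returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform.
        return "exited with error code %d" % returncode
    if returncode < 0:
        return "killed by signal " + name
    return "likely killed by signal " + name
```

For example, a status of 137 reported by a batch script maps to SIGKILL,
which is consistent with (but does not prove) an out-of-memory or
walltime kill.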



> or cleanup process?
>

Then it's a bug, and one of the developers needs to be able to reproduce
the bug in order to fix it and release an update. It's probably also a
compiler-dependent bug (since I haven't seen this reported before), in
which case you might be able to work around it by using a different
compiler.

Do the Amber tests run and finish correctly? This is *always* a good
question to ask when having problems like these.

> > Did you ask for more steps than this?
>
> Each of production phase is 5 ns (2500000 steps).
>

Then the simulation most likely died in the cleanup stage. We'd need to
see the error message (and then someone would likely need to be able to
reproduce the failure) in order to get this fixed.



> My main question in my last case in which
>
> out file: NSTEP = 867500
>
> info file: NSTEP = 2500000
>
> is that can I be sure that MD simulation was ended?
>

Well, it certainly ran all 2500000 steps -- whether the simulation died
before or *after* writing the final restart file is something only you will
know. If the restart file was written, and visualizing your system does
not reveal any problems, you are likely fine. If it was a double-free or
some other kind of cleanup-related error, then that's almost certainly a
sign that the simulation was fine.
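
To compare the step counts in the out and info files exactly, rather than
by eye, the last reported NSTEP can be pulled from each file. This is a
minimal sketch, assuming the files contain lines of the form
"NSTEP = <integer>" as quoted above; last_nstep and last_nstep_in_file are
hypothetical helper names, not Amber tools.

```python
import re

# Matches "NSTEP = 867500" with any amount of whitespace around "=".
NSTEP_PATTERN = re.compile(r"NSTEP\s*=\s*(\d+)")


def last_nstep(text):
    """Return the last NSTEP value found in text, or None if there is none."""
    matches = NSTEP_PATTERN.findall(text)
    return int(matches[-1]) if matches else None


def last_nstep_in_file(path):
    """Scan a file line by line and return its last NSTEP value (or None)."""
    last = None
    with open(path, errors="replace") as handle:
        for line in handle:
            match = NSTEP_PATTERN.search(line)
            if match:
                last = int(match.group(1))
    return last
```

If the value from the info file equals the number of steps requested
(2500000 here) while the out file stops earlier, that suggests the out file
was not fully flushed rather than that the dynamics stopped early.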

> The size of mdcrd file seems low. Lower than the size of a mdcrd file with
> 2500000 steps.
>

"Seems" is only mildly useful (to you -- not really useful at all to us).
Try to determine the "is". IS it small? DOES it have too few frames?
cpptraj can tell you this. Computers deal with exacts, which means
debugging requires dealing with exacts as well.
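
Once the actual frame count is known, it can be checked against the number
the input file should have produced. The sketch below assumes the
trajectory is written every ntwx steps (the mdin output-frequency setting);
the value 5000 in the usage note is purely hypothetical, since the real
input file was not posted, and the function names are illustrative only.

```python
def expected_frames(nstlim, ntwx):
    """Number of trajectory frames expected from nstlim steps, one per ntwx steps."""
    if ntwx <= 0:
        # ntwx = 0 means no trajectory frames are written at all.
        return 0
    return nstlim // ntwx


def missing_frames(nstlim, ntwx, frames_found):
    """How many frames short of the expected count the trajectory is."""
    return expected_frames(nstlim, ntwx) - frames_found
```

With 2500000 steps and a hypothetical ntwx of 5000, expected_frames gives
500; a trajectory holding fewer frames than that was truncated, and the
difference says by how much.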

But there's really nothing more helpful I can say without more information.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Oct 23 2015 - 09:30:04 PDT