Re: [AMBER] Production phase for MMPBSA

From: Andy Watkins <andy.watkins2.gmail.com>
Date: Wed, 9 Mar 2016 02:10:24 -0800

Oh, I absolutely agree--so perhaps I'll modify my question a bit. Though
I've only done a fairly cursory literature review, I've seen little
discussion of how to estimate the correlation time of a simulation in
the sense it's used in this context. (For the fluctuation of a specific
quantity of particular interest, like a bond length, sure.)

I suppose I was pricing in some degree of that uncertainty and asking the
question at the margins, i.e. given that:

1. even an estimate of your correlation time has some error bars on it
2. frames separated by slightly less than the true correlation time are not
remotely as correlated as frames separated by one tenth of the correlation
time

at what point does throwing out data become deleterious? (I could imagine a
heuristic being "once frames are separated by 105% of the correlation time
as calculated by method A" or some such.)
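As a rough check on point 2, if one assumes a single-exponential
autocorrelation C(dt) = exp(-dt/tau) (my simplifying assumption, not
something established in this thread), the numbers bear it out:

```python
import math

tau = 1.0  # correlation time in arbitrary units (hypothetical value)

# Remaining correlation between two frames, under the assumed
# single-exponential model C(dt) = exp(-dt / tau)
c_tenth = math.exp(-0.10 * tau / tau)  # frames 0.1 * tau apart
c_past  = math.exp(-1.05 * tau / tau)  # frames 1.05 * tau apart

print(round(c_tenth, 3), round(c_past, 3))  # -> 0.905 0.35
```

So under that model, frames a tenth of tau apart still share about 90%
of their correlation, while frames just past tau share about 35% -- far
from independent, but a different regime entirely.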

For full disclosure, I do almost no MD simulation in my day-to-day, so I
may lack some amount of expertise that you'd otherwise take for granted.

On Wed, Mar 9, 2016 at 1:49 AM, Hannes Loeffler <Hannes.Loeffler.stfc.ac.uk>
wrote:

> On Wed, 9 Mar 2016 01:06:31 -0800
> Andy Watkins <andy.watkins2.gmail.com> wrote:
>
> > > The more time between adjacent snapshots, the less correlated the
> > > results will be (and therefore, the more statistically significant
> > > they will be).
> >
> > So there are two possible choices that might strengthen the
> > statistics of a given MM/PBSA calculation, right? One can include
> > more snapshots in total, thus sampling more total states of the
> > protein, and one can perform more simulation so that one's snapshots
> > may be better spaced out, to diminish inter-snapshot correlation.
> > What's the conventional wisdom to balance these two competing aims?
> > That is, suppose you're already doing as much simulation in all as
> > your computational resources make possible. Provided that
> > constraint--be it 10 ns, 100 ns, or 1 us--how do you optimize
> > snapshot number vs. correlation?
>
> I think that's the wrong way of looking at the problem. If you want
> to get meaningful statistics, and I would say that is what you should
> really aim for, then you need to accept that you can only use
> uncorrelated snapshots. In other words, you can only increase the
> number of snapshots by increasing the simulation time (with MD), and
> you should actually estimate the correlation time to understand how
> much of your data needs to be discarded. Combine this with multiple
> runs to generate _independent_ trajectories.
>
>
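Hannes's suggestion above -- estimate the correlation time, then keep
only (nearly) uncorrelated frames -- can be sketched in a few lines of
Python. Everything here (the function names, the truncate-at-first-
negative heuristic, the AR(1) test series) is my own illustration, not
anything from AMBER; in practice a tool such as pymbar's timeseries
module does this estimation more robustly.

```python
import numpy as np

def correlation_time(x):
    """Estimate the correlation time (in sampling intervals) of a 1-D
    series, e.g. per-frame MM/PBSA energies, by summing the normalized
    autocorrelation function up to its first negative value (a common
    truncation heuristic)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # acf[0] is 1 by construction: zero-lag term divided by n * var
    acf = np.correlate(x, x, mode="full")[n - 1:] / (x.var() * n)
    tau = 0.0
    for t in range(1, n):
        if acf[t] < 0.0:
            break
        tau += acf[t] * (1.0 - t / n)  # triangle-weighted estimator
    return tau

def subsample(x, tau):
    """Keep every g-th frame, where g = 1 + 2*tau is the statistical
    inefficiency, so retained frames are roughly independent."""
    stride = max(1, int(np.ceil(1.0 + 2.0 * tau)))
    return x[::stride]

# Hypothetical test series: an AR(1) process with known correlation
# (true tau = 0.9 / (1 - 0.9) = 9 sampling intervals)
rng = np.random.default_rng(0)
x = np.empty(5000)
x[0] = 0.0
for i in range(1, len(x)):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()

tau = correlation_time(x)
kept = subsample(x, tau)
```

The statistical inefficiency g = 1 + 2*tau then gives the stride at
which retained frames are roughly independent, which also speaks to the
"105% of the correlation time" heuristic: sampling much finer than g
mostly re-counts correlated data rather than adding information.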
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Mar 09 2016 - 02:30:03 PST