Re: [AMBER] What is the typical stimulation time? from Carlos Simmerling on 2012-07-06 (Amber Archive Jul 2012)

From: Carlos Simmerling <carlos.simmerling.gmail.com>
Date: Fri, 6 Jul 2012 14:21:55 -0400

What I meant was fairly simple. Imagine that you want to look at the
histogram of some distance distribution. During a short run, the structure
won't change much, so the histograms of the first and second halves are quite
similar. When you extend the run and more of conformational space is
sampled, the histograms may no longer match, until much later, when
everything has been sampled many times and the same distribution is obtained
in the first and second halves. My point was that the agreement in the
first case does not reliably indicate that a longer run would keep giving
the same results, whereas in the last case it does.
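
To make the half-versus-half comparison concrete, here is a rough sketch in
Python with NumPy. It also includes the cumulative average Ross suggested
further down the thread; once the per-frame values have been written out, that
running average is a single cheap pass over the data. Everything named here is
an assumption made for illustration: the function names, the overlap measure
(sum of bin-wise minima of the two normalized histograms), and the toy
two-state "distance" trace, which is synthetic and not from any real
simulation.

```python
import numpy as np


def half_histogram_overlap(values, bins=20):
    """Overlap in [0, 1] between the normalized histograms of the two halves."""
    values = np.asarray(values, dtype=float)
    half = len(values) // 2
    if half == 0:
        raise ValueError("need at least two values")
    first, second = values[:half], values[half:2 * half]
    # Shared bin edges so the two histograms are directly comparable.
    edges = np.histogram_bin_edges(values, bins=bins)
    p, _ = np.histogram(first, bins=edges)
    q, _ = np.histogram(second, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    # 1.0 means identical histograms, 0.0 means no shared populated bins.
    return float(np.minimum(p, q).sum())


def cumulative_mean(values):
    """Mean over frames 1..k for every k (frame 1; 1,2; 1,2,3; ...)."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, len(values) + 1)


def toy_distance_trace(n_frames, dwell=3000, seed=0):
    """Hypothetical two-state distance (in Angstrom) that flips every `dwell` frames."""
    rng = np.random.default_rng(seed)
    state = (np.arange(n_frames) // dwell) % 2
    return 5.0 + 3.0 * state + rng.normal(0.0, 0.3, size=n_frames)


if __name__ == "__main__":
    trace = toy_distance_trace(60000)
    for label, n in [("short", 2000), ("medium", 6000), ("long", 60000)]:
        overlap = half_histogram_overlap(trace[:n])
        print(f"{label:6s} run ({n:5d} frames): overlap = {overlap:.2f}")

    # Running average versus cumulative time, assuming frame 1 = 10 ps.
    running = cumulative_mean(trace)
    time_ps = 10 * np.arange(1, len(trace) + 1)
    print(f"running mean after {time_ps[-1]} ps: {running[-1]:.2f}")
```

On this toy data the overlap is high for the short run (only one state has
been visited), close to zero for the medium run (the first half sits in one
state, the second half in the other), and high again for the long run (both
halves contain many transitions). Only the last agreement says anything about
convergence.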

On Fri, Jul 6, 2012 at 2:18 PM, George Tzotzos <gtzotzos.me.com> wrote:

> Apologies for jumping into this discussion. I've been following it and found
> it extremely useful.
>
> However, Carlos's last message is a bit ambiguous. He states "in short runs
> the first and second half match, then later they don't, then much later
> they do again".
>
> What does "later" mean if one is dealing with "two halves"? For, say, a
> 10ns simulation, I would understand this, if one were to split it in 2x5ns
> and looked at a particular property, say deltaG, then one extended the
> simulation for an additional 10ns, did the same, etc.
>
> Or alternatively, for a 40ns simulation, one split it in 4x10ns parts and
> looked at the halves of each part.
>
> Concerning Ross's second suggestion, "Write a simple script that starts with
> the data for frame 1 and calculates the binding energy, then repeats this
> for frames 1,2 combined, then 1,2,3, then 1 to 4 etc.":
>
> This seems to me very heavy on computer resources. Am I right?
>
> A clarification would be most welcome.
>
> George
>
> On Jul 6, 2012, at 7:48 PM, Carlos Simmerling wrote:
>
> > to add to Ross's good advice, you always need to worry that sometimes your
> > first and second half match just because you are looking at times so short
> > that it hasn't even done much. In other words, in short runs the first and
> > second half match, then later they don't, then much later they do again. To
> > avoid the "premature convergence", you should probably also look at 2
> > independent simulations. What you mean by "independent" again depends
> > entirely on what you're trying to study, which you haven't told us about
> > yet so we can't really give specific advice.
> >
> >
> > On Fri, Jul 6, 2012 at 12:23 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
> >
> >> Hi Catherine,
> >>
> >> To add some more practical advice to this discussion: in a slight
> >> rephrasing of what Thomas says, I would state that your simulations need
> >> to be long enough that the property you are attempting to measure is not
> >> dependent on the length of the simulation. This is another way of saying
> >> that your simulation is converged. I would start by going back over what
> >> you learnt as an undergrad for statistical error analysis. Most people
> >> learn this in terms of experiment, but the same principles apply to MD
> >> computer simulation, and it can be helpful to think of the simulations in
> >> terms of experiment.
> >>
> >> A simple test, and one I recommend, is to calculate the property you are
> >> interested in based on the data you have, then double the run length and
> >> see if the property changes and, if so, by how much. This will at least
> >> give you some idea of the sampling error due to convergence and will give
> >> you ammunition to defend the length of your simulations. Another way to do
> >> this is to effectively throw away half your data. Take just the first half
> >> of your data and calculate the properties you are interested in, say a
> >> binding energy. Then repeat that with just the second half of the data,
> >> then repeat it with all the data and compare the results. You can also
> >> take this one step further and script it such that you show convergence of
> >> the property being measured as a function of your dataset size. This is
> >> pretty easy to do and something I think everyone publishing such things
> >> should do. However, I suspect that in most cases things are horribly
> >> unconverged and so showing such a plot would 'detract' from the result a
> >> little too much. It is nice to do though just to prove to yourself if the
> >> data is converged or not. Take an MMPBSA calculation for example. Say you
> >> have 10,000 frames. Write a simple script that starts with the data for
> >> frame 1 and calculates the binding energy, then repeats this for frames
> >> 1,2 combined, then 1,2,3, then 1 to 4 etc. If you convert the frames into
> >> time, as in frame 1 = 10ps, frame 2 = 20ps etc., you can then easily plot
> >> the binding energy as a function of cumulative simulation time. This
> >> function 'SHOULD' converge since each point includes all the previous
> >> points. You'll be shocked though how long it actually takes to converge.
> >> However, armed with such data showing convergence I think you can easily
> >> convince a skeptical reviewer that your simulation lengths (and number of
> >> simulations) were sufficient.
> >>
> >> Good luck,
> >>
> >> All the best
> >> Ross
> >>
> >>> -----Original Message-----
> >>> From: steinbrt.rci.rutgers.edu [mailto:steinbrt.rci.rutgers.edu]
> >>> Sent: Friday, July 06, 2012 2:51 AM
> >>> To: AMBER Mailing List
> >>> Subject: Re: [AMBER] What is the typical stimulation time?
> >>>
> >>> Hi,
> >>>
> >>> well, there have been many responses so far, some of them serious...
> >>>
> >>> Still, to give my two cents: You should rephrase the question into a
> >>> statement. When you write up your manuscript, think about adding that
> >>> you are confident that the results presented from a simulation of
> >>> length X are sufficiently converged, because...
> >>>
> >>> ...and then it depends on what you want to say, e.g. because multiple
> >>> transitions along a reaction coordinate that you study have been
> >>> observed, because the correlation time of whatever property you look at
> >>> is much smaller than X, because you get good agreement with experiment,
> >>> etc. The last one may not be such a good justification, but is seen in
> >>> papers often enough.
> >>>
> >>> Kind Regards,
> >>>
> >>> Thomas
> >>>
> >>>
> >>> On Thu, July 5, 2012 10:25 pm, Dr. Vitaly V. G. Chaban wrote:
> >>>> and whether the simulation is equilibrium dynamics, but we go flooding...
> >>>>
> >>>> Good journals do not like to accept applicable simulation studies
> >>>> based on less than 10 ns trajectories, this is a purely practical
> >>>> advice/observation.
> >>>>
> >>>> Vitaly
> >>>>
> >>>>
> >>>> On Thu, Jul 5, 2012 at 10:06 PM, Ganesh Kamath <gkamath9173.gmail.com> wrote:
> >>>>> Depends on what you are trying to simulate ......
> >>>>>
> >>>>> On Jul 5, 2012 8:56 PM, "Dr. Vitaly V. G. Chaban" <vvchaban.gmail.com> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> Dear Sir/Madam,
> >>>>>>> What is the typical simulation time to get reasonable data for
> >>>>>>> publication use?
> >>>>>>> Best regards,
> >>>>>>> Catherine
> >>>>>>
> >>>>>> Catherine -
> >>>>>>
> >>>>>> Not any femtosecond more after ergodicity is achieved.
> >>>>>>
> >>>>>>
> >>>>>> Dr. Vitaly V. Chaban, 430 Hutchison Hall
> >>>>>> Dept. Chemistry, University of Rochester
> >>>>>> 120 Trustee Road, Rochester, NY 14627-0216
> >>>>>> THE UNITED STATES OF AMERICA
> >>>>>>
> >>>>
> >>>> _______________________________________________
> >>>> AMBER mailing list
> >>>> AMBER.ambermd.org
> >>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>>
> >>>
> >>>
> >>> Dr. Thomas Steinbrecher
> >>> formerly at the
> >>> BioMaps Institute
> >>> Rutgers University
> >>> 610 Taylor Rd.
> >>> Piscataway, NJ 08854
> >>>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Jul 06 2012 - 11:30:03 PDT