Re: [AMBER] What is the typical stimulation time? from George Tzotzos on 2012-07-07 (Amber Archive Jul 2012)

From: George Tzotzos <gtzotzos.me.com>
Date: Sat, 07 Jul 2012 13:37:39 +0200

First of all, thanks again for the contributions. I have one last question.

To check for convergence, does it pay to run a relatively long simulation, say 60ns, or is it better to run 6 independent simulations of 10ns each?

The reason I'm asking is because I have found that in protein-ligand complexes the starting conformation of the ligand may have an influence on measuring a particular property (deltaG in my case). I've started MD simulations from different docking poses and have found that the ligand may be unable to overcome low minima in the time frame of my simulations.

Thanks in advance for any comments.

Good weekend to all

George

On Jul 6, 2012, at 9:04 PM, Ilyas Yildirim wrote:

> George -
>
> There is no simple way to check convergence of a system. It all depends on what
> the system is and what type of analysis you are looking for. I will try to give
> my experiences, which are generally systems with DNA/RNA molecules.
>
> If the system is small, let's say a mononucleoside, which has 2 important
> motions in it (chi rotation and sugar puckering), then doing a histogram
> analysis will show you for sure the convergence of the system (how often a
> state is being occupied). As an example, running 5-10 independent simulations
> and comparing the histograms will tell you if the system has sampled the phase
> space enough. This is a quick calculation compared to bigger systems. Also,
> knowing the experimental background will help on how long MD simulation you
> will need. Again, in nucleosides, conformational change of sugar puckering and
> chi rotation are fast; so in a 50 ns MD, you will see multiple transformation.
> That is what you want to see at the end in a dynamics system; to see all
> conformational transformations (if there are any of course).
>
> If the system is big, however, then things get challenging. Let's say you have
> a protein-DNA complex; the question is "What am I looking for in this system?".
> Is it the binding constant of DNA-protein complex, stability of the system,
> dynamics or a particular motion of the system, etc. Assuming you have a
> starting structure of the system or a model (from XRAY or NMR), the first thing
> is to check if the system is stable. This is probably the easiest one to check;
> as Ross said, you can compare the RMS deviation of the system by time to see if
> it stays in a particular conformation. If it is the binding free energy, then
> you can use MMPBSA to get relative binding constants (for different versions of
> the system). This is actually not too hard, but you need to write your own
> script. I am not sure if the current mmpbsa scripts will directly let you
> calculate it. Maybe a script combined with the outputs of MMPBSA.py will do
> this. I've done similar calculations with AMBER9 so it is possible. It will
> give you the average binding free energy with respect to time, which is a good
> way to decide if the system/calculation is converged.
>
> If the initial system has some weak bindings, it is also possible to analyze
> the dynamics of those regions. For example, Watson-Crick basepairs are
> generally too stable, and it is hard to see unstacking in an MD simulation. You
> will see, however, terminal basepairs fraying (which is due to the missing base
> stacking that makes those basepairs weaker than the ones in the stem). If the
> system you want to analyze has loop regions, for example, (such as AA, UU, CC,
> pairings) then you can run MD simulation to analyze the dynamics in these
> regions, which is possible to do with the current technology. At that point,
> the best thing is to run multiple independent MD runs and compare the
> populations of each particular conformations. If you see these populations
> match up, this will tell you something about the convergence.
>
> I think the hardest part is to simulate a big system and look for big motional
> changes that are shown to happen in experiments. For example, base flipping is
> known to happen, but can this be seen in an MD simulation? There are papers
> written on base flipping and there are different techniques that can be used to
> understand this motion, and base flipping is not that a big motion compared to
> conformational changes seen in proteins. In order to see these type of
> conformational changes, good amount of MD time has to be run.
>
> Good luck,
>
> Ilyas,
>
>> However, Carlo's last message is a bit ambiguous. He states "in short runs
>> the first and second half match, then later they don't, then much later they
>> do again".
>>
>> What does "later" mean if one is dealing with "two halves"? For, say, a 10ns
>> simulation, I would understand this, if one were to split it in 2x5ns and
>> looked at a
>> particular property, say deltaG, then one extended the simulation for an
>> additional 10ns, did the same, etc.
>>
>> Or alternatively, for a 40ns simulation, one split it in 4x10ns parts and
>> looked at the halves of each part.
>>
>> Concerning Ross' 2nd suggestion "Write a simple script that starts with the
>> data for frame 1 and calculates the binding energy, then repeats this for
>> frames 1,2 combined, then 1,2,3, then 1 to 4
>> etc."
>>
>> This seems to me very heavy on computer resources. Am I right?
>>
>> A clarification would be most welcome
>>
>> George
>>
>> On Jul 6, 2012, at 7:48 PM, Carlos Simmerling wrote:
>>
>>> to add to Ross's good advice, you always need to worry that sometimes your
>>> first and second half match just because you are looking at times so short
>>> that it hasn't even done much. In other words, in short runs the first and
>>> second half match, then later they don't, then much later they do again. To
>>> avoid the "premature convergence", you should probably also look at 2
>>> independent simulations. What you mean by "independent" again depends
>>> entirely on what you're trying to study, which you haven't told us about
>>> yet so we can't really give specific advice.
>>>
>>>
>>> On Fri, Jul 6, 2012 at 12:23 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>>>
>>>> Hi Catherine,
>>>>
>>>> To add some more practical advice to this discussion. In a slight
>>>> rephrasing
>>>> of what Thomas says I would state that your simulations need to be long
>>>> enough that the property you are attempting to measure is not dependent on
>>>> the length of the simulation. This is another way of saying that your
>>>> simulation is converged. I would start by going back over what you learnt
>>>> as
>>>> an undergrad for statistical error analysis. Most people learn this in
>>>> terms
>>>> of experiment but the same principles apply to MD computer simulation and
>>>> it
>>>> can be helpful to think of the simulations in terms of experiment.
>>>>
>>>> A simple test and one I recommend is to calculate the property you are
>>>> interested in based on the data you have, then double the run length and
>>>> see
>>>> if the property changes and if so by how much. This will at least give you
>>>> some idea of the sampling error due to convergence and will give you
>>>> ammunition to defend the length of your simulations. Another way to do this
>>>> is to effectively throw away half your data. Take just the first half of
>>>> your data and calculate the properties you are interested in, say a binding
>>>> energy. Then repeat that with just the second half of the data, then repeat
>>>> it with all the data and compare the results. You can also take this one
>>>> step further and script it such that you show convergence of the property
>>>> being measured as a function of your dataset size. This is pretty easy to
>>>> do
>>>> and something I think everyone publishing such things should do. However, I
>>>> suspect that in most cases things are horribly unconverged and so showing
>>>> such a plot would 'detract' from the result a little too much. It is nice
>>>> to
>>>> do though just to prove to yourself if the data is converged or not. Take a
>>>> MMPBSA calculation for example. Say you have 10,000 frames. Write a simple
>>>> script that starts with the data for frame 1 and calculates the binding
>>>> energy, then repeats this for frames 1,2 combined, then 1,2,3, then 1 to 4
>>>> etc. If you convert the frames into time, as in frame 1 = 10ps, frame 2 =
>>>> 20ps etc you can then easily plot the binding energy as a function of
>>>> cumulative simulation time. This function 'SHOULD' converge since each
>>>> point
>>>> includes all the previous points. You'll be shocked though how long it
>>>> actually takes to converge. However, armed with such data showing
>>>> convergence I think you can easily convince a skeptical reviewer that your
>>>> simulation lengths (and number of simulations) were sufficient.
>>>>
>>>> Good luck,
>>>>
>>>> All the best
>>>> Ross
>>>>
>>>>> -----Original Message-----
>>>>> From: steinbrt.rci.rutgers.edu [mailto:steinbrt.rci.rutgers.edu]
>>>>> Sent: Friday, July 06, 2012 2:51 AM
>>>>> To: AMBER Mailing List
>>>>> Subject: Re: [AMBER] What is the typical stimulation time?
>>>>>
>>>>> Hi,
>>>>>
>>>>> well, there have been many responses so far, some of them serious...
>>>>>
>>>>> Still, to give my two cents: You should rephrase the question into a
>>>>> statement. When you write up your manuscript, think about adding that
>>>>> you
>>>>> are confident that the results presented from a simulation of length X
>>>>> are
>>>>> sufficiently converged, because...
>>>>>
>>>>> ...and then it depends on what you want to say, e.g. because multiple
>>>>> transitions along a reaction coordinate that you study have been
>>>>> observed,
>>>>> because the correlation time of whatever property you look at is much
>>>>> smaller than X, because you get good agreement to experiment, etc. The
>>>>> last one may not be such a good justification, but is seen in papers
>>>>> often
>>>>> enough.
>>>>>
>>>>> Kind Regards,,
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Thu, July 5, 2012 10:25 pm, Dr. Vitaly V. G. Chaban wrote:
>>>>>> and whether the simulation is equilibrium dynamics, but we go
>>>>> flooding...
>>>>>>
>>>>>> Good journals do not like to accept applicable simulation studies
>>>>>> based on less than 10 ns trajectories, this is a purely practical
>>>>>> advice/observation.
>>>>>>
>>>>>> Vitaly
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 5, 2012 at 10:06 PM, Ganesh Kamath
>>>>> <gkamath9173.gmail.com>
>>>>>> wrote:
>>>>>>> Depends on what you are trying to simulate ......
>>>>>>>
>>>>>>> On Jul 5, 2012 8:56 PM, "Dr. Vitaly V. G. Chaban"
>>>>> <vvchaban.gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dear Sir/Madam,
>>>>>>>>> What is the typical simulation time to get a reasonable and data
>>>>> for
>>>>>>>>> publications use?
>>>>>>>>> Best regards,
>>>>>>>>> Catherine
>>>>>>>>
>>>>>>>> Catherine -
>>>>>>>>
>>>>>>>> Not any femtosecond more after ergodicity is achieved.
>>>>>>>>
>>>>>>>>
>>>>>>>> Dr. Vitaly V. Chaban, 430 Hutchison Hall
>>>>>>>> Dept. Chemistry, University of Rochester
>>>>>>>> 120 Trustee Road, Rochester, NY 14627-0216
>>>>>>>> THE UNITED STATES OF AMERICA
>>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> AMBER mailing list
>>>>>> AMBER.ambermd.org
>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>>
>>>>>
>>>>>
>>>>> Dr. Thomas Steinbrecher
>>>>> formerly at the
>>>>> BioMaps Institute
>>>>> Rutgers University
>>>>> 610 Taylor Rd.
>>>>> Piscataway, NJ 08854
>>>>>
>>>>> _______________________________________________
>>>>> AMBER mailing list
>>>>> AMBER.ambermd.org
>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>
>>>>
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Jul 07 2012 - 05:00:02 PDT