Re: [AMBER] [cpptraj] Kullback Leibler Divergence cutoff choice from Eiros Zamora, Juan on 2015-08-05 (Amber Archive Aug 2015)

From: Eiros Zamora, Juan <j.eiros-zamora14.imperial.ac.uk>
Date: Wed, 5 Aug 2015 16:23:44 +0000

Dear Adrian,

Thanks for your fast answer. You are absolutely correct, I am just doing several independent (750 ns) classical MD runs.
Even without a true final distribution to compare against, it’s still a suitable way to estimate how well the sampling of two runs has converged, right? At least on the timescale and in the region of phase space that I have access to.
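
To make the comparison concrete, here is a minimal Python sketch of a KLD between the histogrammed PC projections of two runs. It is only an illustration, not what cpptraj does internally: the function names, the toy data, the bin count and the pseudocount used for empty bins are all my own assumptions.

```python
import math

def histogram(values, lo, hi, nbins):
    """Count values into nbins equal-width bins spanning [lo, hi]."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), nbins - 1)
        counts[idx] += 1
    return counts

def kld(p_counts, q_counts, pseudocount=1e-10):
    """KLD(P || Q) = sum_i P_i * ln(P_i / Q_i) over shared bins.

    Bins with P_i == 0 contribute nothing. Q_i is floored at a small
    pseudocount (an assumption of this sketch) to avoid division by zero,
    so the value is sensitive to that choice when Q has empty bins.
    """
    p_total = sum(p_counts)
    q_total = sum(q_counts)
    total = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = pc / p_total
        q = qc / q_total
        if p > 0.0:
            total += p * math.log(p / max(q, pseudocount))
    return total

# Hypothetical projections of two runs onto the same principal component.
run_a = [0.1 * i for i in range(100)]   # broad sampling
run_b = [0.05 * i for i in range(100)]  # narrower sampling

# Both histograms must share the same bin edges.
lo = min(min(run_a), min(run_b))
hi = max(max(run_a), max(run_b))
counts_a = histogram(run_a, lo, hi, nbins=10)
counts_b = histogram(run_b, lo, hi, nbins=10)

kld_ab = kld(counts_a, counts_b)  # run A measured against run B
kld_ba = kld(counts_b, counts_a)  # run B measured against run A
print(kld_ab, kld_ba)
```

The two numbers differ because the KLD is not symmetric, and a run compared with itself gives exactly zero.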

I did the PCA by fitting the coordinates to the average structure and then diagonalizing the backbone coordinate covariance matrix.
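
In case it helps to see the steps spelled out, this is roughly what that procedure amounts to, as a NumPy sketch rather than the actual cpptraj input. The function name and the random toy coordinates are hypothetical, and the fitting to the average structure is assumed to have been done already.

```python
import numpy as np

def pca_from_fitted_coords(coords):
    """PCA of Cartesian coordinates.

    coords: array of shape (n_frames, n_atoms, 3), assumed to be already
    fitted (superposed) onto the average structure.
    Returns eigenvalues (descending), eigenvectors (as columns) and the
    projection of every frame onto every eigenvector.
    """
    n_frames = coords.shape[0]
    x = coords.reshape(n_frames, -1)          # flatten to (n_frames, 3 * n_atoms)
    x_centered = x - x.mean(axis=0)           # remove the average structure
    cov = x_centered.T @ x_centered / (n_frames - 1)  # coordinate covariance matrix
    evals, evecs = np.linalg.eigh(cov)        # symmetric matrix, ascending eigenvalues
    order = np.argsort(evals)[::-1]           # reorder so the largest variance comes first
    evals = evals[order]
    evecs = evecs[:, order]
    projections = x_centered @ evecs          # per-frame PC projections
    return evals, evecs, projections

# Toy stand-in for backbone coordinates: 200 frames, 5 atoms.
rng = np.random.default_rng(0)
toy = rng.normal(size=(200, 5, 3))
toy[:, 0, 0] *= 5.0  # exaggerate one coordinate so it dominates the first PC

evals, evecs, proj = pca_from_fitted_coords(toy)
print(evals[:3])
```

The columns of `proj` are what would then be histogrammed for the KLD.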

All the best,

Juan
> On 5/8/2015, at 18:09, Adrian Roitberg <roitberg.ufl.edu> wrote:
>
> Hi Juan
>
> If you look at the paper you mention, they use KLD in a very different
> way than what you are trying to do. In their case, they KNOW the
> correct, final distribution, obtained through immense amounts of
> sampling. Then they ask how long it takes, using a different
> sampling technique, to get a KLD < 0.02 VERSUS the correct, converged
> distribution. To remind you, KLD = 0 means the distributions are identical.
>
> Now, in your case, you are comparing different MD runs, NONE of which is
> likely to be fully converged, correct? This means you expect your KLD to be
> higher than 0.02. Now, KLDs are comparisons between TWO distributions.
> For some properties, such as radius of gyration for instance, I expect
> that your KLD would be close to zero when comparing two dynamics, each
> of 200 ns. PCAs are much trickier and take longer to converge.
>
> When you compute PCAs, there are many details that matter, such as how
> you overlapped your structures, etc.
>
> Adrian
>
>
> On 8/5/15 12:01 PM, Eiros Zamora, Juan wrote:
>> Hi everyone,
>>
>> I’ve replicated the KLD analysis of this paper http://pubs.acs.org/doi/abs/10.1021/jp4125099 on my system and I have a couple of questions.
>>
>> There, a convergence cutoff of KLD < 0.02 is chosen because the slope of the KLD vs. time plot no longer changes once it’s below this number. For my system, this appears to be happening as well, but for a KLD < 2.5.
>>
>> 1) Are the KLD values expected to be higher the more complex a system is? (i.e. in the paper the analysis is done on a tetranucleotide and I’m doing it on a 419-residue protein). I understand that this is a measure of the difference between two probability distribution functions, so it shouldn’t really matter how complex the system is once you do the PC projection on the trajectory and histogram it, but I was wondering if I’m missing something that could explain it. Also, am I just wrong to assume that this is converged by choosing a higher cutoff? I’m only picking this 2.5 value because the plot appears stable for the last 200 ns, but if I were to pick 0.02 it would not be that easy to say so.
>>
>> 2) Is there a reason not to do all the pairwise KLD comparisons between the independent runs? As in, if you have 10 runs you should be doing 90 KLDs, because the KLD is not symmetric. But I don’t know if that would make much sense in MD, or if it would give any extra info at all. I’d like to have the opinion of the authors on this, because it looks to me like a tedious analysis with cpptraj that maybe isn’t really adding any insight.
>>
>> Thanks for your time and/or any comments on this matter,
>>
>> Juan
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>
> --
> Dr. Adrian E. Roitberg
> Professor.
> Department of Chemistry
> University of Florida
> roitberg.ufl.edu
> 352-392-6972

Received on Wed Aug 05 2015 - 09:30:04 PDT