On 11/09/2022 22:19, He, Amy via AMBER wrote:
> Dear Amber community,
> I have a question about time series analysis. I’m doing an autocorrelation analysis of a time series I obtained from MD trajectories. Following the definition of the autocorrelation function (ACF), I was able to calculate the ACF at different lags. However, I was a bit confused about how to calculate the characteristic time of the ACF (also referred to as the “autocorrelation time”).
> My question is:
> What is the most widely accepted definition of autocorrelation time, for discrete time series data such as MD trajectories?
> What I have tried:
> (1)
> I found the definition of the autocorrelation time in a paper:
> John D. Chodera, William C. Swope, Jed W. Pitera, Chaok Seok, and Ken A. Dill Journal of Chemical Theory and Computation 2007 3 (1), 26-41 DOI: 10.1021/ct0502864
> Per equation 19, the autocorrelation time is calculated as the sum of (1-t/N)*Ct, where Ct is the ACF at lag t.
> I tried that on my data and some mock data, and I found the result always converged to 0.5, regardless of the input data … I think others found the same issue with possibly the same equation, as discussed in this post<https://mattermodeling.stackexchange.com/questions/7061/what-is-autocorrelation-time> on StackExchange.
> (2)
> I also viewed the answer to that StackExchange post, which suggested the autocorrelation time as 1+2*sum(Ct). That did not work for me because my Ct at longer lags (larger t) is very noisy... Summing up all Ct can either return a large positive or negative value, and I don’t think the result is a good estimator for autocorrelation time.
> (3)
> If we assume the decay of ACF is exponential, we can approximate the decay rate and the characteristic time by curve fitting. The fitting can be very poor in some cases, and again it’s due to the noise at longer lags as well as the rapid decay at shorter lags, which suggests my data appear to be uncorrelated already at shorter lags.
> My thought:
> I think it makes more sense to estimate the autocorrelation time with only the first few Ct, assuming that Ct at longer lags swings above and below 0 but eventually cancels out. Although that sounds like a very sleazy way to estimate the autocorrelation time :’)
> Has anyone calculated the autocorrelation time for any kind of MD data, and how did you do that? Any comments & suggestions would be greatly appreciated.
> Many Thanks,
> Amy
Hi there, first of all, your final thought is quite correct. Typically, the ACF is fairly noisy at large lags. Thus, what one normally does is disregard all the points after the first time it
drops below a certain, arbitrarily chosen, threshold. If you computed the ACF on enough data, this leaves you with a sufficiently large interval to estimate the autocorrelation time. As to how
to do this, it is pretty much as you wrote. Assume the ACF decays exponentially, like e^{-t/\tau}. However, rather than fitting it, which can yield poor results, you can estimate \tau by integrating it.
In fact, the integral from 0 to infinity of e^{-t/\tau} is exactly \tau. This is why this quantity is often called the integrated autocorrelation time. The only thing you must remember is to operate on
the normalized ACF, which simply means dividing it by its value at lag 0.
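Written out, for reference: the integral above, and what it becomes when replaced by a sum over samples spaced \Delta t apart. The symbol \Delta t (the sampling interval of the series) is introduced here for illustration and is not in the original message, and the sum assumes an exactly exponential ACF.

```latex
\int_0^{\infty} e^{-t/\tau}\,dt = \tau,
\qquad
\Delta t \sum_{k=0}^{\infty} e^{-k\,\Delta t/\tau}
  = \frac{\Delta t}{1 - e^{-\Delta t/\tau}}
  \approx \tau + \frac{\Delta t}{2} \quad (\tau \gg \Delta t).
```

So a plain sum that includes the lag-0 term overshoots \tau by about half a sampling interval, which is negligible when the ACF decays over many frames.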
So, recapitulating:
1) Normalize the ACF (divide all its points by its value at 0);
2) Throw away everything after the first time when the normalized ACF is smaller than some threshold;
3) Integrate (sum) what is left.
The result is the characteristic time.
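
The three steps above can be sketched in Python with NumPy. This is only a minimal sketch under stated assumptions, not a reference implementation: the function names, the default threshold of 0.05 and the `dt` argument (time between frames) are choices made here for illustration, and the ACF estimator is the simple direct one.

```python
import numpy as np

def normalized_acf(x):
    """Autocorrelation of a 1-D series at lags 0..n-1, normalized so acf[0] == 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Sum of products at each non-negative lag.
    acov = np.correlate(x, x, mode="full")[n - 1:]
    # Divide by the number of pairs contributing to each lag.
    acov = acov / np.arange(n, 0, -1)
    return acov / acov[0]

def integrated_time(acf, threshold=0.05, dt=1.0):
    """Steps 1-3: normalize, cut at the first point below threshold, sum."""
    acf = np.asarray(acf, dtype=float)
    acf = acf / acf[0]                      # step 1: normalize by the lag-0 value
    below = np.nonzero(acf < threshold)[0]  # indices where the ACF is under the threshold
    cut = below[0] if below.size else len(acf)
    return dt * acf[:cut].sum()             # steps 2 and 3: truncate, then sum

# Example on mock data: an AR(1) series whose ACF decays roughly as exp(-t/20).
rng = np.random.default_rng(0)
phi = np.exp(-1.0 / 20.0)
series = np.zeros(5000)
for i in range(1, len(series)):
    series[i] = phi * series[i - 1] + rng.normal()

tau_estimate = integrated_time(normalized_acf(series), threshold=0.05, dt=1.0)
print(tau_estimate)
```

The result depends on the threshold, so it is worth checking that the estimate is stable over a few reasonable values. The result is in the same units as `dt`.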
Cheers,
Charo
--
Dr. Charo I. del Genio
Senior Lecturer in Statistical Physics
Applied Mathematics Research Centre (AMRC)
Design Hub
Coventry University Technology Park
Coventry CV1 5FB
UK
https://charodelgenio.weebly.com
Received on Sun Sep 11 2022 - 23:00:03 PDT