
From: Tom Kurtzman <simpleliquid.gmail.com>

Date: Fri, 11 Nov 2016 13:26:13 -0500

Sergey, Steve and I had a brief discussion about this, and there doesn't
seem to be anything concerning to us in this behavior. The behavior is
consistent with what we'd expect with sparse sampling and with how the code
handles sparse sampling.

The value of \rho \ln \rho, which is used to calculate the translational
entropy, is zero when \rho = 0. In the code, when the value of \rho is
below some threshold, we simply don't calculate it, since the \ln \rho term
approaches negative infinity and the computer can't handle it, and the
overall contribution, even if we did calculate it, would be negligible.
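To make that thresholding concrete, here is a minimal Python sketch. This is not the actual GIST/cpptraj source; the function name and cutoff value are illustrative assumptions.

```python
import math

# Illustrative cutoff, not the value used in the actual code.
RHO_THRESHOLD = 1e-6

def rho_ln_rho(rho):
    """Return rho*ln(rho), treating densities below the threshold as 0.

    As rho -> 0, ln(rho) -> -infinity, but rho*ln(rho) -> 0, so skipping
    tiny densities avoids the numerical blow-up at negligible cost.
    """
    if rho < RHO_THRESHOLD:
        return 0.0
    return rho * math.log(rho)
```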

The Nearest Neighbor algorithm uses an approximation of the local
density around a particle (a water oxygen here) that is 1 over the volume
of the sphere whose radius is the nearest-neighbor distance. For
computational efficiency, to find the nearest neighbor of a water oxygen we
only search the voxel the water oxygen is in and all neighboring voxels
(27 total). With the default voxel size of 0.5 angstroms per side, the
probability of finding a particle in a voxel is about 1 in 10.
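The density approximation itself is simple enough to write down; a hedged sketch (the function name is mine, not from the code):

```python
import math

def nn_density(nn_distance):
    """Nearest-neighbor local density estimate.

    Approximates the local density around a particle as 1 over the volume
    of the sphere whose radius is the nearest-neighbor distance, as
    described above.
    """
    sphere_volume = (4.0 / 3.0) * math.pi * nn_distance ** 3
    return 1.0 / sphere_volume
```

Note the estimate diverges as the nearest-neighbor distance shrinks, and is simply unavailable when no neighbor is found in the 27 searched voxels.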

What this means is that at very sparse sampling (1 frame, for example) the
NN-estimated (or should I say mis-estimated) density in every voxel is zero
and the estimated entropy would be zero. If you only sample two frames
with independent configurations, about 90% of the voxels would still have
an estimated local density of zero, and hence we still expect entropy
estimates much higher than the actual value. Even with 10 frames of
sampling, you are still quite likely not to find any neighbors. This is
all an artifact of extremely sparse sampling and of code that is designed for
efficiency at higher sampling. Once there is sufficient sampling for the
NN algorithm to work, the convergence of the method is really quite
outstanding. From your figure I'd certainly not use fewer frames than the
minimum of that curve (100 frames?) for translational density.
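A back-of-envelope check of the sparse-sampling argument, assuming each independent frame has roughly a 1-in-10 chance of placing a particle in a given voxel (the figure quoted above):

```python
def p_voxel_still_empty(n_frames, p_occupied=0.1):
    """Probability a voxel has seen no particle after n independent frames.

    Assumes each frame independently occupies the voxel with probability
    p_occupied (~0.1 for the default 0.5-angstrom voxels, per the text).
    """
    return (1.0 - p_occupied) ** n_frames
```

With these assumptions, roughly 81% of voxels are still empty after 2 frames and about 35% after 10, while after 100 frames essentially none are, which matches the observation that the artifact vanishes once sampling is sufficient.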

Tom

On Fri, Nov 11, 2016 at 11:43 AM, Steven Ramsey <vpsramsey.gmail.com> wrote:

> Hi Sergey,
>
> I think the initial drop in entropy you're seeing in your analysis is
> indeed an artifact due to low sampling. At very low frame counts (under
> 1000 or so) the nearest neighbor algorithm used to compute entropies will
> provide strange results due to there being a very low number of waters (and
> therefore neighbor distances) to consider.
>
> We recently evaluated GIST convergence rates in the cpptraj software
> release study (doi: 10.1002/jcc.24417) and found that entropies converge
> within 30000 frames (sampled every ps). This may be system specific, but is
> a reasonably good guess for most studies.
>
> Hope this helps, best of luck!
>
> --Steve
>
> On Fri, Nov 11, 2016 at 9:28 AM, Sergey Samsonov <
> sergeys.biotec.tu-dresden.de> wrote:
>
> > Dear AMBERs,
> >
> > I'm calibrating some GIST calculations. In particular, I'm checking how
> > the number of frames (equidistantly distributed through the equilibrated
> > simulation) taken for GIST calculations affects the values of the GIST
> > energy components. The reason for doing this is to find an optimal length
> > of my simulations (and a number of frames to analyze with GIST) for the
> > system I study, so that the values I obtain are converged. I found a
> > common feature independently of the regions, the sizes of the box used
> > for GIST, and the lengths of the simulations within the ranges I'm
> > working in: E(sw), E(ww) and TS(orientational) converge very similarly.
> > The values go down monotonically with the increase in the number of
> > frames taken into account for the calculations and converge after several
> > thousand frames (the exact number depends on the system and simulation
> > type). This result is something one can expect. However, TS(translational)
> > behaves essentially differently: its value drops when increasing the
> > number of analyzed frames to ~200 and then it goes up again (one of the
> > examples is attached for 10000 frames from a 100 ns long simulation).
> > What could be the reason for such a non-monotonic behaviour? Is the
> > decrease observed simply an artifact of the calculations when the number
> > of frames is too low?
> >
> > Thank you very much and cheers,
> >
> > Sergey
> >
> > --
> > Sergey A. Samsonov
> > Postdoctoral researcher
> > Structural Bioinformatics
> > Biotechnology Center
> > Tatzberg 47-51
> > 01307 Dresden, Germany
> >
> > Tel: (+49) 351 463 400 83
> > Fax: (+49) 351 463 402 87
> > E-mail: sergey.samsonov.biotec.tu-dresden.de
> > Webpage: www.biotec.tu-dresden.de
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber


--
************************************************
Tom Kurtzman, Ph.D.
Assistant Professor
Department of Chemistry
Lehman College, CUNY
250 Bedford Park Blvd. West
Bronx, New York 10468
718-960-8832
http://www.lehman.edu/faculty/tkurtzman/
************************************************
