Re: [AMBER] KDE analysis, (P) greater than one

From: Jason Swails
Date: Wed, 5 Sep 2018 23:49:34 -0400

Hi Kevin,

As Adrian pointed out, these KDEs are plotting a probability density
function (PDF). Note that probability densities are *not* probabilities.
The only constraint on a PDF is that it integrates to 1 -- individual
points along the PDF can take just about any (non-negative) value you want,
including values greater than 1.

As an extreme example, the Dirac delta function -- a PDF that is non-zero
at only a single point -- has *infinite* probability density at the only
point where it is non-zero.
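
To make this concrete, here is a minimal sketch (plain NumPy, nothing
cpptraj-specific) that evaluates a normal PDF for progressively smaller
standard deviations. The helper name gaussian_pdf, the grid, and the sigma
values are all just choices for illustration. The peak density grows well
past 1 as the distribution narrows toward a delta function, while the
numerical integral stays at approximately 1.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Fine, evenly spaced grid so a simple Riemann sum approximates the integral.
x = np.linspace(-5.0, 5.0, 200001)
dx = x[1] - x[0]

results = {}
for sigma in (1.0, 0.1, 0.01):  # illustrative widths, narrowing toward a delta function
    pdf = gaussian_pdf(x, sigma=sigma)
    area = float(np.sum(pdf) * dx)  # numerical integral over the grid
    results[sigma] = (float(pdf.max()), area)
    print(f"sigma={sigma:<5} peak density={pdf.max():8.3f}  integral={area:.4f}")
```

The peak scales as 1/sigma, so it is unbounded as sigma goes to zero, which
is the delta-function limit described above.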

The bandwidth for a KDE controls the width of the individual Gaussian
kernels used to construct the PDF. A larger bandwidth tends to yield more
diffuse PDFs.

On Mon, Sep 3, 2018 at 9:25 PM Mac Kevin Braza wrote:

> Thank you for the response.
> Here's what the figure looks like. I used a bandwidth of 0.25.
> But if I do not specify the bandwidth, the bandwidth calculated from the
> normal distribution approximation ranges from 0.02 to 0.05.
> I chose 0.25 because I have other systems to analyze with the same
> dataset, and I want to normalize the KDE calculations across all the
> systems I will be analyzing.

I'm not sure what you mean by "normalize" here. The definition I'm
familiar with in this context is imposing the constraint that the integral
over all space is 1. For that definition, this is *not* the effect that
the bandwidth has. The bandwidth for KDEs is analogous to the bin width in
a traditional histogram. It's a measure of the width of the individual
kernel functions in the KDE (I believe cpptraj only supports the use of
Gaussian kernels, so it's closely related to the variance of the kernel
function centered on each point in the sample). However, it doesn't affect
the value of the integrated KDE over all space.
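
A rough sketch of that last point, written by hand in NumPy rather than
with cpptraj: the function kde_gaussian below is a hypothetical helper, the
sample data are made up, and I am assuming the bandwidth is used directly
as the standard deviation of each Gaussian kernel (cpptraj's exact
convention may differ). The bandwidths are the ones mentioned in this
thread. The peak height changes with the bandwidth, but the integral stays
at approximately 1 in every case.

```python
import numpy as np

def kde_gaussian(grid, samples, bandwidth):
    """Average of Gaussian kernels, one centered on each sample.

    Assumption: `bandwidth` is the standard deviation of each kernel.
    """
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)  # each kernel integrates to 1, so the mean does too

# Made-up, tightly clustered sample standing in for real trajectory data.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=0.1, size=500)

grid = np.linspace(-3.0, 3.0, 6001)
dx = grid[1] - grid[0]

kde_results = {}
for bw in (0.02, 0.05, 0.25):
    density = kde_gaussian(grid, samples, bw)
    area = float(np.sum(density) * dx)  # Riemann-sum estimate of the integral
    kde_results[bw] = (float(density.max()), area)
    print(f"bandwidth={bw:<5} peak density={density.max():6.3f}  integral={area:.4f}")
```

Using one fixed bandwidth for every system makes the curves comparably
smoothed, but each curve is already normalized regardless of that choice.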


Jason M. Swails
AMBER mailing list
Received on Wed Sep 05 2018 - 21:00:01 PDT