From: Jason Swails <jason.swails.gmail.com>

Date: Wed, 5 Sep 2018 23:49:34 -0400

Hi Kevin,

As Adrian pointed out, these KDEs are plotting a probability density

function (PDF). Note that probability densities are *not* probabilities.

The only constraint on a PDF is that it integrates to 1 -- individual

points along the PDF can take just about any (non-negative) value you want.

As an extreme example, the Dirac delta function -- a PDF that is non-zero

at only a single point -- has *infinite* probability density at the only

point where it is non-zero.
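To make the density-versus-probability point concrete, here is a minimal sketch, assuming NumPy is available. The helper `gaussian_pdf` and the width of 0.05 are illustrative choices of mine, not anything taken from cpptraj. A narrow Gaussian has a peak density far above 1, yet still integrates to approximately 1.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Illustrative (assumed) width: a narrow Gaussian with sigma = 0.05.
x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]
density = gaussian_pdf(x, sigma=0.05)

peak = density.max()          # density value at the mode, well above 1
area = np.sum(density) * dx   # Riemann-sum estimate of the integral, close to 1

print(f"peak density = {peak:.2f}, integral = {area:.4f}")
```

Shrinking `sigma` further pushes the peak higher without changing the integral, which is the limit the delta function example describes.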

The bandwidth for a KDE controls the width of the individual Gaussian

kernels used to construct the PDF. A larger bandwidth tends to yield more

diffuse PDFs.
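The effect of the bandwidth can be sketched with a hand-rolled Gaussian KDE. This is a toy illustration assuming NumPy, not cpptraj's implementation: `gaussian_kde_1d` is a hypothetical helper and the sample values are made up. The two bandwidths are the ones mentioned in this thread (0.02 and 0.25).

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian kernels of width `bandwidth`, one centered on each sample."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Made-up sample, for illustration only.
samples = np.array([1.0, 1.1, 1.3, 2.0, 2.05, 3.5])
grid = np.linspace(-1.0, 5.5, 6501)

narrow = gaussian_kde_1d(samples, grid, bandwidth=0.02)
wide = gaussian_kde_1d(samples, grid, bandwidth=0.25)

# The narrow bandwidth gives tall, sharp spikes; the wide one gives a lower, smoother curve.
print(f"peak density, bandwidth 0.02: {narrow.max():.2f}")
print(f"peak density, bandwidth 0.25: {wide.max():.2f}")
```

With the small bandwidth the estimate is a set of tall spikes near the data points; with the large one the same probability mass is spread over a wider range, so the peak values drop.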

On Mon, Sep 3, 2018 at 9:25 PM Mac Kevin Braza <mebraza.up.edu.ph> wrote:

> Thank you for response.
>
> Here's how the figure look like. I used a bandwidth of 0.25.
> But if I do not specify the bandwidth, the calculated bandwidth from normal
> distribution approximation ranges from 0.02 - 0.05.
>
> I choose 0.25 because I have other systems also to analyzed with the same
> dataset and I want to normalize the KDE calculations for all the system I
> will be analyzing.
>

I'm not sure what you mean by "normalize" here. The definition I'm

familiar with in this context is imposing the constraint that the integral

over all space is 1. For that definition, this is *not* the effect that

the bandwidth has. The bandwidth for KDEs is analogous to the bin width in

a traditional histogram. It's a measure of the width of the individual

kernel functions in the KDE (I believe cpptraj only supports the use of

Gaussian kernels, so it's closely related to the variance of the kernel

function centered on each point in the sample). However, it doesn't affect

the value of the integrated KDE over all space.
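A rough numerical check of both halves of this analogy, assuming NumPy: `gaussian_kde_1d` is a hypothetical helper (not cpptraj's code) and the sample is made up. The sketch integrates a Gaussian KDE at two bandwidths and a density-normalized histogram at two bin counts.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian kernels of width `bandwidth`, one centered on each sample."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Made-up sample, for illustration only.
samples = np.array([1.0, 1.1, 1.3, 2.0, 2.05, 3.5])
grid = np.linspace(-1.0, 5.5, 6501)
dx = grid[1] - grid[0]

# Integral of the KDE over the grid (Riemann sum) for two bandwidths.
kde_areas = {
    bw: float(np.sum(gaussian_kde_1d(samples, grid, bw)) * dx)
    for bw in (0.02, 0.25)
}

# Integral of a density-normalized histogram for two bin counts.
hist_areas = {}
for n_bins in (3, 12):
    density, edges = np.histogram(samples, bins=n_bins, density=True)
    hist_areas[n_bins] = float(np.sum(density * np.diff(edges)))

print("KDE integrals by bandwidth:", kde_areas)
print("Histogram integrals by bin count:", hist_areas)
```

All four integrals come out at approximately 1. Changing the bandwidth or the bin width changes the shape and the peak heights, not the normalization.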

HTH,

Jason

-- 
Jason M. Swails
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Sep 05 2018 - 21:00:01 PDT