Re: [AMBER] Clustering analysis from Debarati DasGupta on 2020-04-15 (Amber Archive Apr 2020)

From: Debarati DasGupta <debarati_dasgupta.hotmail.com>
Date: Wed, 15 Apr 2020 14:38:37 +0000

Hi Daniel,

Could you let me know what inputs are actually needed to calculate the DB index (DBI)?
I googled a lot but could not find what I was looking for.
Thanks

From: Daniel Roe<mailto:daniel.r.roe.gmail.com>
Sent: 25 March 2020 15:27
To: AMBER Mailing List<mailto:amber.ambermd.org>
Subject: Re: [AMBER] Clustering analysis

Hi,

Unfortunately there's no magic formula for getting good results from
clustering. Like any other method, it requires careful scrutiny before
you can really have confidence in your results. Two things I can recommend:

1) Check your DBI and pseudo F values for the various cluster results
you have. In general you want a small DBI, high pseudo F. It also
helps to look at the cluster silhouettes.

2) I recommend this all the time and I'm sure people are tired of it
(but I also don't care): **read through the Shao & Cheatham et al.
clustering paper**, and specifically everything they do to validate
their results. I don't think you'll find a more comprehensive study on
clustering of MD trajectory data. It's where I go when I need new
ideas (or have to brush up on some old ones) on how to analyze
clustering data. If anyone has more recent recommendations please post
them! https://pubs.acs.org/doi/10.1021/ct700119m
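
For anyone wondering what goes into the two metrics in point 1, here is a
minimal sketch in plain Python. It is not cpptraj's implementation. It assumes
each frame has already been reduced to a coordinate or feature vector and that
you have one cluster label per frame; the function names (centroid,
group_by_label, davies_bouldin, pseudo_f) are made up for illustration, and
plain Euclidean distance stands in for whatever metric (e.g. RMSD) was used in
the actual clustering. The only inputs needed are the points, their cluster
assignments, and a distance function.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def davies_bouldin(points, labels):
    """Davies-Bouldin index: lower means tighter, better separated clusters."""
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")
    cents = {lab: centroid(members) for lab, members in clusters.items()}
    # Scatter of a cluster = average distance of its members to its centroid.
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in members) / len(members)
        for lab, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case (largest) similarity ratio between cluster i and any other.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def pseudo_f(points, labels):
    """Pseudo-F (Calinski-Harabasz) statistic: higher is better."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("pseudo-F needs 2 <= clusters < points")
    overall = centroid(points)
    # Between-cluster sum of squares, weighted by cluster population.
    ssb = sum(
        len(members) * math.dist(centroid(members), overall) ** 2
        for members in clusters.values()
    )
    # Within-cluster sum of squares around each cluster's own centroid.
    ssw = sum(
        math.dist(p, centroid(members)) ** 2
        for members in clusters.values()
        for p in members
    )
    return (ssb / (k - 1)) / (ssw / (n - k))


# Toy data: two tight, well separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # groups the nearby points together
bad_labels = [0, 1, 0, 1]   # splits each pair across clusters

print(davies_bouldin(points, good_labels), pseudo_f(points, good_labels))
print(davies_bouldin(points, bad_labels), pseudo_f(points, bad_labels))
```

On the toy data the sensible assignment gives the smaller DBI and the larger
pseudo-F, which is the trend to look for when comparing runs with different
cluster counts or epsilon values.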

-Dan

On Wed, Mar 18, 2020 at 6:34 PM Debarati DasGupta
<debarati_dasgupta.hotmail.com> wrote:
>
> Hello Daniel,
>
> I have been trying to follow your advice that *clustering is an art form and requires many trial-and-error sessions.*
>
> My aims were simple.
> I am working on an NMR structure (10 conformers in the PDB file) of a Med25 ACID protein domain and trying to answer 2 questions:
>
> 1. How different are the 10 models from each other?
> 2. We ran some plain explicit-solvent simulations in TIP3P water (10 different runs on the 10 different NMR models).
> I am trying to analyse the conformations and how the 10 conformers behave during the explicit-solvent simulations.
>
> So I focused on k-means and hierarchical clustering methods just to learn the basics of setting up clustering in cpptraj.
> I played with the cluster number and also the epsilon value (distance between cluster points) and got a wide array of results!
> Now I am drowning in statistics and am quite lost.
> Can anyone suggest “how to analyse” my outputs and how to make sense of my results?
> What should I compare between the different outputs, and how should I analyse the average structures and representative structures I got?
> Any help would be greatly appreciated.
>
> Regards
> Debarati
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Apr 15 2020 - 08:30:02 PDT