Re: [AMBER] [UCE] Re: DBSCAN CLUSTERING STARTS AT 0 CLUSTERS

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 18 Jun 2019 12:19:20 -0400

Hi,

On Mon, Jun 17, 2019 at 7:43 AM <dnlfr1994.gmail.com> wrote:
> Indeed, I am getting noise frames. I think they come from a large RMS change at the beginning of the simulation. Does that make sense? Are there any objections to a clustering graph like the one I have?
>
> I am also getting negative silhouette values for pop 0 and 1, and low positive values for the others, which makes me think DBSCAN may not be the best algorithm for my system.

So as I am very fond of saying, cluster analysis is very much an art
form. One clustering run is almost never enough. Choosing the "right"
parameters is usually a matter of trial and error, watching how
different parameters affect your metrics (Davies-Bouldin index, pseudo-F,
silhouette). This is why saving and reusing the pairwise distance matrix
with 'savepairdist'/'loadpairdist' can be very useful. Definitely try
varying your DBSCAN parameters ('minpoints', 'epsilon'), and feel free to
try a different algorithm like K-means. You may also find it useful to
cluster a smaller subset of your overall data (e.g. every 10th frame) to
get an idea of what parameters may be good for your system.
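
To make the trial-and-error loop concrete, here is a minimal sketch in plain
Python. It is not cpptraj's implementation: it uses a textbook DBSCAN on toy
1-D data standing in for frame-to-frame RMSDs, and the names 'pairwise',
'dbscan', 'mean_silhouette' and 'scan' are made up for illustration. The
point is only the workflow: compute the distance matrix once, reuse it for
each epsilon, and watch the number of clusters, the noise fraction, and the
silhouette. A nonzero noise fraction is also one reason normalized cluster
populations need not add up to 1.

```python
NOISE = -1  # label for points that belong to no cluster


def pairwise(points):
    """Symmetric distance matrix for 1-D points (toy stand-in for RMSDs)."""
    n = len(points)
    return [[abs(points[i] - points[j]) for j in range(n)] for i in range(n)]


def dbscan(dist, eps, min_points):
    """Textbook DBSCAN on a precomputed distance matrix.

    A point is a core point if at least min_points points (itself included)
    lie within eps. Returns one label per point; NOISE marks noise.
    """
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = [k for k in range(n) if dist[j][k] <= eps]
            if len(nbrs) >= min_points:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels


def mean_silhouette(dist, labels):
    """Mean silhouette over non-noise points; None with fewer than 2 clusters."""
    clusters = sorted(set(l for l in labels if l != NOISE))
    if len(clusters) < 2:
        return None
    members = {c: [i for i, l in enumerate(labels) if l == c] for c in clusters}
    scores = []
    for i, l in enumerate(labels):
        if l == NOISE:
            continue
        own = [j for j in members[l] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(sum(dist[i][j] for j in members[c]) / len(members[c])
                for c in clusters if c != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def scan(points, eps_values, min_points):
    """Reuse one distance matrix across several epsilon values."""
    dist = pairwise(points)  # computed once, like a saved pairwise matrix
    rows = []
    for eps in eps_values:
        labels = dbscan(dist, eps, min_points)
        n_clusters = len(set(labels) - {NOISE})
        noise_frac = labels.count(NOISE) / len(labels)
        rows.append((eps, n_clusters, noise_frac, mean_silhouette(dist, labels)))
    return rows


if __name__ == "__main__":
    # Two tight groups plus two outliers (hypothetical data).
    frames = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 2.5, 9.0]
    for eps, n_clusters, noise_frac, sil in scan(frames, [0.15, 0.5, 3.0], 3):
        print(eps, n_clusters, noise_frac, sil)
```

With real data the same loop would be driven by repeated cluster runs that
load the saved pairwise distances instead of recomputing them.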

Hope this helps,

-Dan

>
> Any ideas on that?
>
> Thank you very much
>
> Daniel
>
> -----Original Message-----
> From: Daniel Roe <daniel.r.roe.gmail.com>
> Sent: Friday, June 14, 2019 15:46
> To: AMBER Mailing List <amber.ambermd.org>
> Subject: Re: [AMBER] [UCE] Re: DBSCAN CLUSTERING STARTS AT 0 CLUSTERS
>
> Hi,
>
> I see what you mean now. So without seeing your cluster data myself (something like your cluster population vs time data) I can't be certain, but I suspect you may be encountering "noise" frames, i.e. frames that do not belong to any specific cluster. Could that be the case?
>
> -Dan
>
> On Thu, Jun 13, 2019 at 6:01 PM Daniel Fernández Remacha <dnlfr1994.gmail.com> wrote:
> >
> > Thank you very much for your answers. However, I think I did not
> > express myself correctly. My question is a different one.
> >
> > As far as I know, cluster population plots usually start at the maximum
> > of the y axis (1), meaning all conformations belong to the same cluster
> > until another cluster is found; from then on the first decreases as the
> > second increases, and so on.
> >
> > In my case, pop1 starts at the bottom of the y axis and rapidly
> > increases, without reaching 1. After that everything is as expected.
> > But what I don't understand is why DBSCAN does not assign 100% of
> > the data to the first cluster. Where does that difference come from?
> > Is this behaviour normal?
> >
> > I hope I explained it better this time.
> >
> > Thank you again,
> >
> > On Thu, Jun 13, 2019, 21:50, Daniel Roe <daniel.r.roe.gmail.com> wrote:
> >
> > > On Thu, Jun 13, 2019 at 3:45 PM Thomas Cheatham <tec3.utah.edu> wrote:
> > > >
> > > > Perhaps my fault, C programmer... Easy fix +1 or convert to
> > > > FORTRAN
> > >
> > > Ha, never! I wish everything started from 0...
> > >
> > > >
> > > > > > On Jun 13, 2019, at 9:12 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Historically, PTRAJ always numbered clusters starting from 0 and
> > > > > sorted by population (cluster 0 = most populated). CPPTRAJ has
> > > > > continued this convention. All cluster output should have a cluster 0.
> > > > >
> > > > > -Dan
> > > > >
> > > > >> On Thu, Jun 13, 2019 at 1:04 PM <dnlfr1994.gmail.com> wrote:
> > > > >>
> > > > >> Dear users,
> > > > >>
> > > > >>
> > > > >>
> > > > > >> I am currently working on clustering analysis of several 200 ns MD
> > > > > >> simulations of a system with 600 amino acids.
> > > > > >>
> > > > > >> I have plotted the kdist plots for kdist 1 to 10, and based on the
> > > > > >> results obtained I use the following input:
> > > > >>
> > > > >>
> > > > >>
> > > > > >> cluster C0 dbscan minpoints 4 epsilon 1 sievetoframe rms :1-279 sieve 10 \
> > > > > >>   out dbscan_clust/cnumvtime.dat sil Sil \
> > > > > >>   summary dbscan_clust/summary.dat info dbscan_clust/info.dat \
> > > > > >>   cpopvtime dbscan_clust/cpopvtime.agr normframe repout dbscan_clust/rep repfmt pdb \
> > > > > >>   singlerepout dbscan_clust/singlerep.nc singlerepfmt netcdf avgout Avg avgfmt restart
> > > > > >> run
> > > > > >> clear all
> > > > >>
> > > > >>
> > > > >>
> > > > > >> The clustering runs OK with no noticeable problems. However, I am
> > > > > >> intrigued to see that, unlike all other clustering plots I have
> > > > > >> seen (clusters starting at 1), the clusters in this one start from
> > > > > >> zero. I fail to find a reason for this behaviour. A picture of the
> > > > > >> population vs. frame plot I get is at the bottom of this message.
> > > > >>
> > > > >>
> > > > >>
> > > > > >> I would really appreciate it if anyone could help me solve this problem.
> > > > >>
> > > > >>
> > > > >>
> > > > >> Thank you in advance,
> > > > >>
> > > > >>
> > > > >>
> > > > >> Daniel Fernández
> > > > >>
> > > > >>
> > > > >>
> > > > >> _______________________________________________
> > > > >> AMBER mailing list
> > > > >> AMBER.ambermd.org
> > > > >> http://lists.ambermd.org/mailman/listinfo/amber
> > > > >
> > > > > _______________________________________________
> > > > > AMBER mailing list
> > > > > AMBER.ambermd.org
> > > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > > _______________________________________________
> > > > AMBER mailing list
> > > > AMBER.ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber
> > >
> > > _______________________________________________
> > > AMBER mailing list
> > > AMBER.ambermd.org
> > > http://lists.ambermd.org/mailman/listinfo/amber
> > >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Jun 18 2019 - 09:30:05 PDT