Re: [AMBER] Cluster Analysis with CPPTRAJ from Lod King on 2019-07-10 (Amber Archive Jul 2019)

From: Lod King <lodking407.gmail.com>
Date: Wed, 10 Jul 2019 22:02:52 -0700

Hi,

I then tried the following based on the manual:

> cluster C0 :1-42@CA clusters 3 epsilon 5.0 out test.dat summary avg.summary.dat

and obtained a total of 50 clusters. The first 11 (clusters 0 to 10) are:

#Cluster  Frames   Frac  AvgDist  Stdev  Centroid  AvgCDist
       0     752  0.376    3.980  1.153      1255    10.563
       1     341  0.171    2.364  0.643      1890    11.069
       2     137  0.069    3.497  0.868       467     9.567
       3     117  0.059    2.882  0.791       572    10.541
       4      90  0.045    3.704  1.086       122     9.680
       5      74  0.037    3.385  0.959       796    10.716
       6      63  0.032    2.320  0.668       878    11.336
       7      48  0.024    3.206  1.137       312    10.564
       8      40  0.020    3.487  0.845       752    10.079
       9      38  0.019    3.990  1.074       204     9.960
      10      36  0.018    4.254  1.026       650    10.361

I have a few questions:

1. How is the number 50 determined? Is it a default?

2. Why does the cluster numbering start from 0?

3. Is there any way to plot the results besides specifying "filename.agr"?

4. How should I read the data? What does the centroid number mean here?
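
Regarding question 3, one option that does not depend on an .agr file is to read the summary table into a script and plot it with whatever tool is at hand. Below is a minimal Python sketch of the parsing step, assuming the summary file keeps the whitespace-separated layout and '#'-prefixed header line shown above. The names parse_summary and SAMPLE are hypothetical and not part of cpptraj, and treating Cluster, Frames, and Centroid as integer columns is an assumption based on the values in the table.

```python
# Hypothetical helper (not part of cpptraj): parse a cluster summary table
# laid out like the one above, with a '#'-prefixed header line.

# Assumption: these columns hold integers; all other columns are floats.
INT_COLUMNS = ("Cluster", "Frames", "Centroid")


def parse_summary(text):
    """Return a list of dicts, one per cluster row, keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: drop the leading '#' and split into column names.
            header = line.lstrip("#").split()
            continue
        fields = line.split()
        if header is None or len(fields) != len(header):
            # Skip anything that does not match the header layout.
            continue
        row = {}
        for name, value in zip(header, fields):
            row[name] = int(value) if name in INT_COLUMNS else float(value)
        rows.append(row)
    return rows


# First three rows of the table above, used as sample input.
SAMPLE = """\
#Cluster Frames Frac AvgDist Stdev Centroid AvgCDist
       0 752 0.376 3.980 1.153 1255 10.563
       1 341 0.171 2.364 0.643 1890 11.069
       2 137 0.069 3.497 0.868 467 9.567
"""

clusters = parse_summary(SAMPLE)
for c in clusters:
    print(c["Cluster"], c["Frames"], c["Frac"], c["Centroid"])
```

The resulting list of dicts can then be passed to any plotting library, for example to draw the Frac column against the cluster index as a bar chart.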

On Wed, Jul 10, 2019 at 8:09 PM Elvis Martis <elvis_bcp.elvismartis.in> wrote:

> I guess there must be a typo or an extra "\" somewhere; please recheck
> all the commands. Also, the AMBER manual describes a method for choosing
> epsilon and minpts, so you might want to check that.
> On Thursday, July 11, 2019, Lod King <lodking407.gmail.com> wrote:
>
> > Hi Amber
> >
> > I saw a workshop tutorial online for clustering and tried to follow the
> > command:
> >
> > parm abc.prmtop
> > trajin abc.dcd
> > cluster c0 \ dbscan minpoints 25 epsilon 0.9 sievetoframe \ rms :1-42@CA
> > \ sieve 2000 random \ out cnumvtime.dat \ sil Sil \ summary summary.dat \
> > info info.dat\ cpopvtime cpopvtime.agr normframe \ repout rep repfmt pdb \
> > singlerepout siglerrep.nc singlerepfmt netcdf \ avgout Avg avgfmt restart
> >
> > but got the following message:
> >
> > CLUSTER: Using coords dataset _DEFAULTCRD_, clustering using RMSD (mask [:1-42@CA]) best-fit
> > DBSCAN:
> > Minimum pts to form cluster= 25
> > Cluster distance criterion= 0.900
> > Sieved frames will only be added back if they are within 0.900 of a frame in an existing cluster.
> > (This option is more accurate and will identify sieved frames as noise but is slower.)
> > Initial clustering will be randomly sieved (with value 2000).
> > Only non-sieved frames will be used to calc within-cluster average.
> > Cluster # vs time will be written to cnumvtime.dat
> > Cluster pop vs time will be written to cpopvtime.agr (normalized by frame)
> > Pairwise distance data set is 'c0[PWD]'
> > Cluster information will be written to info.dat\
> > Summary of cluster results will be written to summary.dat
> > Frame silhouettes will be written to Sil.frame.dat, cluster silhouettes will be written to Sil.cluster.dat
> > Silhouette calculation will use non-sieved frames ONLY.
> > Representative frames will be chosen by closest distance to cluster centroid.
> > Cluster representatives will be written to 1 traj (siglerrep.nc), format Amber NetCDF
> > Cluster representatives will be written to separate trajectories, prefix (rep), format PDB
> > Average structures for clusters will be written to Avg, format Amber Restart
> > Error: [cluster] Not all arguments handled: [ \ \ \ \ \ \ \ \ \ \ ]
> >
> > Did I miss anything?
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
>
>
> --
> Best Regards
> Elvis Martis
> Mumbai.
Received on Wed Jul 10 2019 - 22:30:01 PDT