Hi Dan,
Thank you for looking into this. We have actually been using dbscan and got some unsatisfactory results, in this case for clustering ligand snapshots during binding to a protein. Many of the snapshots were assigned to cluster -1, which seems to include widely distributed ligand conformations in both the bound and unbound states. How does this “-1” cluster arise? Do you have any thoughts on how to assign these snapshots more accurately to the correct clusters?
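For context on the question above: in DBSCAN-style clustering, the label -1 conventionally marks noise, i.e. points that are not dense enough to seed a cluster and are not within epsilon of any dense point. The toy 1-D sketch below illustrates that convention only. It is not cpptraj's implementation, and the names (dbscan_1d, region_query, min_points) and the data are hypothetical.

```python
from typing import List, Optional

NOISE = -1  # label conventionally given to points that belong to no cluster


def region_query(points: List[float], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, p in enumerate(points) if abs(p - points[i]) <= eps]


def dbscan_1d(points: List[float], eps: float, min_points: int) -> List[int]:
    """Toy DBSCAN on 1-D data; returns one label per point, NOISE for outliers."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_points:
            labels[i] = NOISE  # too sparse; may still become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise point reachable from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = region_query(points, j, eps)
            if len(more) >= min_points:
                queue.extend(more)  # j is itself dense, so keep expanding
    return labels


# Two tight groups plus two isolated points (hypothetical data).
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 2.6, 9.0]
print(dbscan_1d(data, eps=0.3, min_points=3))  # [0, 0, 0, 1, 1, 1, -1, -1]
```

In this sketch, raising eps or lowering min_points shrinks the noise set, at the cost of merging groups that a tighter setting would keep apart.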
Given what we have for dbscan, I wanted to try dpeaks, hoping it could resolve the issue. I will lower epsilon as you suggested and see if it can at least complete the calculation. From the initial outputs, though, there also seems to be a “-1” cluster. Since we have a very large number of simulation frames to cluster, the sieve option is essential to avoid memory problems. When might adding back sieved frames become available for dpeaks?
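The diagnosis quoted below (every '#Density' value equal to 1249 with 1250 sieved frames, i.e. N-1) can be illustrated with a small sketch. It assumes that discrete density means "number of other points closer than epsilon", as described in the quoted reply. The function names and sample coordinates are hypothetical, and this is not how cpptraj computes it internally.

```python
from itertools import combinations
from math import dist
from typing import List, Sequence

Point = Sequence[float]


def discrete_density(points: Sequence[Point], eps: float) -> List[int]:
    """For each point, count the other points lying closer than eps."""
    counts = [0] * len(points)
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) < eps:
            counts[i] += 1
            counts[j] += 1
    return counts


def max_pairwise_distance(points: Sequence[Point]) -> float:
    """Largest distance between any two points (what 'Max dist' would report)."""
    return max(dist(a, b) for a, b in combinations(points, 2))


def epsilon_too_large(points: Sequence[Point], eps: float) -> bool:
    """True when every point sees all the others, so density is N-1 everywhere."""
    n = len(points)
    return all(d == n - 1 for d in discrete_density(points, eps))


# Hypothetical 2-D points standing in for frames.
frames = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
print(max_pairwise_distance(frames))        # 5.0
print(discrete_density(frames, eps=6.0))    # [3, 3, 3, 3] -> no density contrast
print(discrete_density(frames, eps=2.0))    # [2, 2, 2, 0]
```

When epsilon exceeds the maximum pairwise distance, all densities are identical, so no point has a neighbor of strictly higher density; that would be consistent with the "nearest neighbor is -1" error further down the thread.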
Thanks again,
Yinglong
> On Jan 17, 2020, at 8:41 AM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>
> PPS - Also note that adding back sieved frames isn't yet implemented
> for 'dpeaks', so you may just want to use another clustering method.
> If you want to stick with density based there's 'dbscan'...
>
> On Fri, Jan 17, 2020 at 9:37 AM Daniel Roe <daniel.r.roe.gmail.com> wrote:
>>
>> PS - If you're really interested, prior to the clustering command you
>> can use 'debug <#>' (where <#> is greater than 0) to print more
>> potentially helpful information. In the output you will see 'DBG: Max
>> dist=' which will show the maximum distance observed between points;
>> epsilon should be less than this. I should probably have that printed
>> by default.
>>
>> Thanks for the report by the way.
>>
>> On Fri, Jan 17, 2020 at 9:34 AM Daniel Roe <daniel.r.roe.gmail.com> wrote:
>>>
>>> OK - I've been looking at this for a bit. I think the problem
>>> must be that all your points are too close, i.e. all points are within
>>> epsilon of each other. Your dvdfile backs that up - the first column
>>> is '#Density', which is just the number of points within epsilon
>>> of that point. In each case the #Density is 1249, indicating that
>>> everything is too tightly packed. I think if you lower epsilon you'll
>>> start to get better results.
>>>
>>> This is probably a case that cpptraj should trap. In my (limited)
>>> defense, it does state that the 'dpeaks' implementation is under
>>> development...
>>>
>>> So in summary, try lowering epsilon and see if that helps. I'll work
>>> on an update to trap the case where epsilon is too large.
>>>
>>> Hope this helps,
>>>
>>> -Dan
>>>
>>> On Tue, Jan 14, 2020 at 12:34 PM <yinglong.miao.gmail.com> wrote:
>>>>
>>>> I have also tried the gauss option. It gave the following output:
>>>> ACTION OUTPUT:
>>>>
>>>> ANALYSIS: Performing 1 analyses:
>>>> 0: [cluster C0 dpeaks epsilon 4 dvdfile dvdfile choosepoints auto runavg runavg.dat deltafile delta.dat sieve 200 gauss]
>>>> Starting clustering.
>>>> Mask [*] corresponds to 15 atoms.
>>>> Estimated pair-wise matrix memory usage: > 3.123 MB
>>>> Pair-wise matrix set up with sieve, 250000 frames, 1250 sieved frames.
>>>> Calculating pair-wise distances.
>>>> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%
>>>>
>>>> No error message was given but also no further output ...
>>>>
>>>> Thanks,
>>>> Yinglong
>>>>
>>>>
>>>> On Tue, Jan 14, 2020 at 9:47 AM Daniel Roe <daniel.r.roe.gmail.com> wrote:
>>>>
>>>>> Can you provide me (either in reply to this or off list) your entire
>>>>> cpptraj output and the contents of dvdfile?
>>>>>
>>>>> This could happen with very sparse density, I think, although it's
>>>>> difficult to say without replicating it exactly. You could potentially
>>>>> try the 'gauss' keyword for Gaussian density instead of discrete
>>>>> density.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Mon, Jan 13, 2020 at 8:10 PM Yinglong Miao <yinglong.miao.gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Dan,
>>>>>>
>>>>>> It’s the latest version from the AMBER git repository.
>>>>>>
>>>>>> Thanks,
>>>>>> Yinglong
>>>>>>
>>>>>>
>>>>>>> On Jan 13, 2020, at 6:20 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>>>>>>>
>>>>>>> What version of cpptraj are you using?
>>>>>>>
>>>>>>> -Dan
>>>>>>>
>>>>>>> On Mon, Jan 13, 2020 at 6:51 PM <yinglong.miao.gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I tried to use the dpeaks algorithm for clustering with the following
>>>>>>>> command:
>>>>>>>> cluster C0 dpeaks epsilon 4 dvdfile dvdfile choosepoints auto runavg runavg.dat deltafile delta.dat sieve 200
>>>>>>>>
>>>>>>>> But I keep getting the following output with an error:
>>>>>>>> ...
>>>>>>>> Finding closest neighbor point with higher density for each point.
>>>>>>>> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%
>>>>>>>> Internal Error: In Cluster_DPeaks::AssignClusterNum nearest neighbor is -1.
>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>
>>>>>>>> I would appreciate any suggestions for fixing this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Yinglong
>>>>>>>>
>>>>>>>> Yinglong Miao, Ph.D.
>>>>>>>> Assistant Professor
>>>>>>>> Center for Computational Biology and
>>>>>>>> Department of Molecular Biosciences
>>>>>>>> University of Kansas
>>>>>>>> http://miao.compbio.ku.edu
>>>>>>>> _______________________________________________
>>>>>>>> AMBER mailing list
>>>>>>>> AMBER.ambermd.org
>>>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Jan 17 2020 - 07:30:02 PST