Re: [AMBER] algorithm for clustering solvent molecules? from Hai Nguyen on 2015-08-03 (Amber Archive Aug 2015)

From: Hai Nguyen <nhai.qn.gmail.com>
Date: Mon, 3 Aug 2015 16:48:24 -0400

@Jose,
For the sake of completeness: you can use pytraj/cpptraj for this task (if
I understand your goal and Jason's idea correctly).

Since pytraj is not well documented (yet), I wrote an example here as a demo:
https://github.com/pytraj/pytraj/blob/master/examples/example_water_clustering.py

import pytraj as pt

# use `iterload` to save memory (like a Python generator, but with fancy
# indexing); you can also use `load`, which is similar to `mdtraj`
traj = pt.iterload("../tests/data/tz2.ortho.nc",
                        "../tests/data/tz2.ortho.parm7")
# get some info
print(traj)

# get new trajectory for specific waters (Oxygen atom only)
wat_traj = traj[':100-500@O']

# iterate every frame and do clustering
for frame in wat_traj:
    xyz = frame.xyz
    # clustering on the x-coordinates
    result = pt.clustering_dataset(xyz[:, 0], 'clusters 10')

    # cluster index for each atom
    print(result)

Hai

On Mon, Aug 3, 2015 at 3:21 PM, Jose Borreguero <borreguero.gmail.com>
wrote:

> I have created a graph for every frame. Nodes in the graph are the solvent
> molecules, and two nodes are connected with an edge if the distance
> between the associated solvent molecules is below a cutoff I chose. I have
> systems with different solvation levels, some of them featuring "pockets"
> of solvent molecules. These pockets are the clusters I'm interested in.
> The algorithm networkx.connected_components
> <https://networkx.github.io/documentation/latest/reference/generated/networkx.algorithms.components.connected.connected_components.html>
> can find the connected clusters from a graph. To create the graph, I am using
> MDAnalysis to obtain the contact map between solvent molecules. Regarding
> time, it takes 2.2 seconds to create a contact map for 4132 solvent
> molecules, which I think is reasonable (unless you have many thousands of
> frames)
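
The contact-graph idea quoted above can be sketched without networkx or
MDAnalysis, using only the standard library. This is a minimal illustration,
not Jose's actual script: the function name `contact_clusters`, the toy
coordinates, and the 3.5 cutoff are all assumptions made for the demo, and
periodic boundary conditions are ignored here.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def contact_clusters(coords, cutoff):
    """Return connected components of the contact graph as sorted index lists.

    Two points are connected when their distance is below `cutoff`.
    (Hypothetical helper for illustration; no periodic boundaries.)
    """
    n = len(coords)
    # Build the contact graph as an adjacency list (O(n^2) pair check).
    neighbors = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dist(coords[i], coords[j]) < cutoff:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # Depth-first search from every unvisited node collects one component.
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(component))
    return clusters


# Toy data: a chain of three points and a separate pair.
coords = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0),
          (20.0, 20.0, 20.0), (22.0, 20.0, 20.0)]
clusters = contact_clusters(coords, cutoff=3.5)
print(clusters)
```

Note that points 0 and 2 are 5.0 apart, beyond the cutoff, but still land in
the same cluster because both are in contact with point 1. That chaining is
what makes connected components suitable for finding solvent "pockets".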
>
> On Mon, Aug 3, 2015 at 1:17 PM, Jason Swails <jason.swails.gmail.com>
> wrote:
>
> > On Mon, 2015-08-03 at 12:57 -0400, Jose Borreguero wrote:
> > > Thanks a lot, Jason. I'll go along with Python for the clustering
> > > step. I found the networkx module, which is very straightforward for
> > > clustering, and quite fast.
> >
> > How are you using networkx for this? I've used it to define a bond
> > graph in molecular topology before, but I don't see how networkx maps to
> > clustering here. Do you have a fully connected graph whose edge weights
> > are the distance between the nodes or something? That sounds like it
> > would be an expensive graph to create. Keep in mind that the PBC will
> > have an effect on the clusters, so how you pick the unit cell
> > representation is likely important.
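
To make the PBC point above concrete: one common way to handle it is the
minimum-image convention when measuring distances. The sketch below assumes
an orthorhombic box; the function name `minimum_image_distance`, the box
lengths, and the coordinates are made up for illustration and do not come
from the thread.

```python
from math import sqrt


def minimum_image_distance(a, b, box):
    """Distance between points a and b in an orthorhombic periodic box.

    `box` holds the three box edge lengths. Each displacement component is
    wrapped to the nearest periodic image before the norm is taken.
    (Hypothetical helper for illustration.)
    """
    total = 0.0
    for xa, xb, length in zip(a, b, box):
        d = xa - xb
        d -= length * round(d / length)  # shift to the nearest image
        total += d * d
    return sqrt(total)


box = (30.0, 30.0, 30.0)
a = (1.0, 15.0, 15.0)
b = (29.0, 15.0, 15.0)

# Without PBC these two points look 28.0 apart; across the box edge
# they are neighbors.
d_pbc = minimum_image_distance(a, b, box)
print(d_pbc)
```

With a distance like this, two waters on opposite faces of the unit cell
count as being in contact, so a pocket that straddles the box boundary is
not split into two clusters.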
> >
> > I've used sklearn to cluster in the past, and I've found it to be pretty
> > easy to use, for what that's worth.
> >
> > All the best,
> > Jason
> >
> > --
> > Jason M. Swails
> > BioMaPS,
> > Rutgers University
> > Postdoctoral Researcher
> >
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
Received on Mon Aug 03 2015 - 14:00:03 PDT