Re: [AMBER] algorithm for clustering solvent molecules?

From: Jose Borreguero <borreguero.gmail.com>
Date: Mon, 3 Aug 2015 12:57:08 -0400

Thanks a lot, Jason. I'll go along with python for the clustering step. I
found the networkx module, which is very straightforward for clustering and
quite fast.
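
For the archive, here is a minimal sketch of the kind of per-frame
clustering networkx makes easy. It is an illustration under assumptions,
not a tested recipe: the function name cluster_frame and the cutoff value
are made up for the example, periodic boundary conditions are ignored, and
the all-pairs distance matrix needs O(n^2) memory. Two atoms are linked when
they are closer than the cutoff, and every connected component of the
resulting graph is taken as one cluster, which amounts to single-linkage
clustering cut at that distance.

```python
import numpy as np
import networkx as nx


def cluster_frame(xyz, cutoff):
    """Cluster the points of ONE frame by distance-cutoff connectivity.

    xyz    : (n_atoms, 3) array-like of coordinates (e.g. water oxygens)
    cutoff : link two atoms if their distance is below this value
             (same length unit as xyz; hypothetical parameter)
    Returns a list of clusters, each a sorted list of atom indices.
    """
    xyz = np.asarray(xyz, dtype=float)

    # All pairwise distances. No periodic images are considered here.
    diff = xyz[:, None, :] - xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    graph = nx.Graph()
    # Add every atom first so isolated atoms show up as singleton clusters.
    graph.add_nodes_from(range(len(xyz)))

    # Upper triangle only (k=1), so each pair is added once and i != j.
    i_idx, j_idx = np.where(np.triu(dist < cutoff, k=1))
    graph.add_edges_from(zip(i_idx.tolist(), j_idx.tolist()))

    # One connected component == one cluster.
    return [sorted(component) for component in nx.connected_components(graph)]
```

Calling cluster_frame(frame, cutoff) once per frame keeps the frames
independent of each other. Note that coordinates loaded through mdtraj are
in nanometers, so the cutoff would have to be given in nanometers too.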

On Mon, Aug 3, 2015 at 10:00 AM, Jason Swails <jason.swails.gmail.com>
wrote:

> On Sat, 2015-08-01 at 21:23 -0400, Jose Borreguero wrote:
> > Dear AMBER users,
> >
> > My system is very inhomogeneous regarding the spatial distribution of
> > solvent molecules. I want to cluster these molecules with the algorithm
> > "hieragglo" of the "cluster" command in cpptraj. However, it seems this
> > command can only cluster frames because the distance metric is evaluated
> > between frames. What I want to do is to cluster the solvent molecules for
> > each frame, independent of other frames. Is this possible to do with
> > cpptraj? If not, do you know of another program that can do this?
>
> cpptraj can do this with some massaging. You can tell cpptraj to
> cluster an arbitrary data set, so what you would need to do is generate
> datasets for the water (I presume you just want the x, y, and
> z-coordinates of each oxygen atom). You should create
> separate data sets for the X-, Y-, and Z-coordinates of every water
> oxygen atom (which is approximately the COM of the water
> molecule), and feed those 3 data sets to the "cluster" command in
> cpptraj for each frame.
>
> How you get the X-, Y-, and Z-coordinates of each water oxygen is up to
> you -- you can use the "vector" command in cpptraj, you can write a
> Python script to do it (perhaps with the help of a trajectory library
> like pytraj or mdtraj), etc.
>
> You can also implement the entire workflow in a couple lines of Python
> if you have the right libraries installed. For example, scikit-learn is
> a machine learning library written in Python that implements a wide
> array of clustering algorithms
> (http://scikit-learn.org/stable/modules/clustering.html#clustering) --
> so you can use pytraj or mdtraj to extract the X-, Y-, and Z-coordinates
> pretty easily. MDTraj is currently an easier package to install and
> start using immediately. So if you install anaconda or miniconda
> (http://conda.pydata.org/miniconda.html), you can use the following
> commands to install the necessary packages:
>
> conda install -c omnia mdtraj scikit-learn
>
> Then a simple Python script like the following should extract the
> information you want:
>
> import mdtraj as md
>
> traj = md.load('your_trajectory.nc', top='your_topology.prmtop')
> wat_xyz = traj.xyz[:,traj.topology.select('resname HOH and name O'),:]
>
> Then you can write a data file for each frame and feed it to
> cpptraj:
>
> import numpy as np
> for i, frame in enumerate(wat_xyz):
>     np.savetxt('frame_%d.dat' % i, frame)
>
> Note, though, that if you have a lot of water molecules, the clustering
> can take a long time for each frame.
>
> HTH,
> Jason
>
> --
> Jason M. Swails
> BioMaPS,
> Rutgers University
> Postdoctoral Researcher
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
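
The scikit-learn route mentioned in the quoted message could be sketched
roughly as follows. This is only an illustration under assumptions: the
function name cluster_each_frame and the threshold value are hypothetical,
the input is assumed to be an array of shape (n_frames, n_waters, 3) such as
the water-oxygen coordinates extracted with mdtraj, and periodic boundary
conditions are ignored. It uses AgglomerativeClustering with a distance
threshold instead of a fixed number of clusters, so each frame is clustered
on its own and may end up with a different number of clusters.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering


def cluster_each_frame(wat_xyz, distance_threshold, linkage="average"):
    """Hierarchical agglomerative clustering, one frame at a time.

    wat_xyz            : (n_frames, n_waters, 3) coordinate array
    distance_threshold : clusters farther apart than this are not merged
                         (same length unit as wat_xyz; hypothetical value)
    Returns an (n_frames, n_waters) integer array of cluster labels.
    Labels are only meaningful within a single frame.
    """
    labels = []
    for frame in wat_xyz:
        # n_clusters must be None when a distance threshold is used.
        model = AgglomerativeClustering(
            n_clusters=None,
            distance_threshold=distance_threshold,
            linkage=linkage,
        )
        labels.append(model.fit_predict(frame))
    return np.array(labels)
```

With mdtraj coordinates the threshold is in nanometers. The cost grows
quickly with the number of waters, so for large systems it may be worth
clustering only a subset of frames first.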
Received on Mon Aug 03 2015 - 10:00:03 PDT