Re: [AMBER] comparison of PDBs from clustering - representative, average and from rst file from Thomas Cheatham III on 2010-09-16 (Amber Archive Sep 2010)

From: Thomas Cheatham III <tec3.utah.edu>
Date: Thu, 16 Sep 2010 09:39:41 -0600 (Mountain Daylight Time)

> So maybe the main question is about difference between representative
> structures and average. As I understand the representative structure is
> one of the real structures in the trajectory.

The representative structure should be equivalent to one of the structures
read in from the specified "trajin" list of files. Although I know this
to be true, I went ahead and tested it to be sure, and it holds.

So, why might the representative display "unusual" backbone angles? It
comes down to issues similar to the arguments about the utility of the
average structure. If you have something in motion, perhaps oscillating
back and forth between two states, the average structure becomes unreal or
smeared. For example, consider the two conformers below in exchange,
followed by the expected average structure...

     \           /           |
     |    <->    |     =     |
     \           /           |

   conf1       conf2        avg
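
To put numbers on the smearing, here is a minimal sketch in Python. The
three-atom chain, the unit bond length, and the +/-40 degree bend are all
made up for illustration (nothing here comes from ptraj); the point is
only that averaging coordinates over two mirror-image conformers gives a
straight chain whose bonds are shorter than in either real conformer.

```python
import math

def conformer(angle_deg, bond=1.0):
    """Hypothetical 3-atom chain: central atom at the origin, both end
    atoms tilted by angle_deg away from the vertical axis."""
    t = math.radians(angle_deg)
    return [
        (bond * math.sin(t), bond * math.cos(t)),    # top atom
        (0.0, 0.0),                                  # central atom
        (bond * math.sin(t), -bond * math.cos(t)),   # bottom atom
    ]

def average_structure(frames):
    """Coordinate-wise mean over frames (no fitting/alignment is done)."""
    n = len(frames)
    return [
        (sum(f[a][0] for f in frames) / n, sum(f[a][1] for f in frames) / n)
        for a in range(len(frames[0]))
    ]

def bond_length(frame, i, j):
    """Distance between atoms i and j of one frame."""
    return math.hypot(frame[i][0] - frame[j][0], frame[i][1] - frame[j][1])

conf1 = conformer(-40.0)   # bent one way
conf2 = conformer(+40.0)   # bent the other way
avg = average_structure([conf1, conf2])

print("bond in conf1:", bond_length(conf1, 0, 1))
print("bond in conf2:", bond_length(conf2, 0, 1))
print("bond in avg  :", bond_length(avg, 0, 1))
```

In this toy case the averaged bond shrinks by a factor of cos(40 deg),
so the "avg" geometry is one that neither conformer ever visits.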

Now if we want the representative, is it conf1 or conf2? If we have lots
of conformations, the representative chosen will be the structure that
is the closest to the "average" of the cluster (and therefore may
display similarities to the motionally averaged structure).
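
As a rough sketch of that selection rule (not the ptraj code itself), the
Python below picks the frame with the lowest RMSD to the coordinate
average of a cluster. The toy three-atom chain, the function names, and
the plain unfitted RMSD are assumptions for illustration only; a real
analysis would fit the frames first and use whatever distance metric the
clustering was run with.

```python
import math

def bent_chain(angle_deg):
    """Hypothetical 3-atom chain with unit bonds, bent by angle_deg."""
    t = math.radians(angle_deg)
    return [(math.sin(t), math.cos(t)), (0.0, 0.0), (math.sin(t), -math.cos(t))]

def average_structure(frames):
    """Coordinate-wise mean over all frames in the cluster."""
    n = len(frames)
    return [
        (sum(f[a][0] for f in frames) / n, sum(f[a][1] for f in frames) / n)
        for a in range(len(frames[0]))
    ]

def rmsd(a, b):
    """Plain RMSD between two frames (assumes they are already aligned)."""
    sq = sum((xa - xb) ** 2 + (ya - yb) ** 2
             for (xa, ya), (xb, yb) in zip(a, b))
    return math.sqrt(sq / len(a))

def representative_index(frames):
    """Index of the frame closest to the cluster's average structure."""
    avg = average_structure(frames)
    return min(range(len(frames)), key=lambda i: rmsd(frames[i], avg))

# One cluster mixing two substates plus a single in-between frame.
cluster = [bent_chain(-40), bent_chain(-40),
           bent_chain(40), bent_chain(40),
           bent_chain(5)]
rep = representative_index(cluster)
print("representative frame index:", rep)
```

In this toy cluster the frame picked is the lone in-between frame, not a
member of either populated state. It is still a real frame from the
input, but it looks more like the motional average than like conf1 or
conf2.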

Can minimization "fix" either the average or the representative? Sure,
but in the example above it will move towards either conf1 or conf2.
Which is more representative, conf1 or conf2? [Both are.]

What this suggests is that if you want truly representative structures
that are not averages of interconverting substates, then you need to have
a sufficient number of clusters to match the number of definitive substates.

For a simple system like the above, this is easy; for a mobile molecular
system it becomes complicated. To see how complicated, read the
paper that discusses clustering in ptraj and the influence of various
metrics, algorithms, ... [Shao et al., JCTC (2007) 3, 2312-2334].

When there is a clear distinction among sub-states, then clustering is
easy. Real life (aka protein simulation) however is complicated.

[p.s.

For your docking investigations, I would suggest comparing protocols:
  (a) representative structures
  (b) minimized representative structures
  (c) randomly chosen structures
  (d) structures chosen at set intervals
      ...and also check whether the results depend on the
      # of structures chosen...
]
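
For protocols (c) and (d), a minimal Python sketch of how the frame
indices might be picked is below. The function names, the seed, and the
frame counts are hypothetical; extracting the chosen frames from the
trajectory is left to whatever tool you already use.

```python
import random

def pick_at_intervals(n_frames, n_pick):
    """Protocol (d): evenly spaced frame indices, starting at frame 0."""
    if not 0 < n_pick <= n_frames:
        raise ValueError("n_pick must be between 1 and n_frames")
    step = n_frames // n_pick
    return [i * step for i in range(n_pick)]

def pick_random(n_frames, n_pick, seed=0):
    """Protocol (c): distinct random frame indices (seeded, so repeatable)."""
    if not 0 < n_pick <= n_frames:
        raise ValueError("n_pick must be between 1 and n_frames")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_frames), n_pick))

# Repeat with several values of n_pick to check whether the docking
# results depend on the number of structures chosen.
for n_pick in (5, 10, 20):
    print(n_pick, pick_at_intervals(1000, n_pick), pick_random(1000, n_pick))
```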

--tec3

Received on Thu Sep 16 2010 - 09:00:04 PDT