
From: Thomas Cheatham <tec3.utah.edu>

Date: Wed, 29 Apr 2020 21:24:07 -0600 (MDT)

> I have a question on PCA. I did AMBER PCA tutorial and i have a

> question. How to decide how many vectors to use. The PCA tutorial uses 3.

> ( runanalysis diagmatrix cpu-gpu-covar out cpu-gpu-evecs.dat vecs 3 name

> myEvecs nmwiz nmwizvecs 3 nmwizfile dna.nmd nmwizmask :1-36&!@H= )

Well, no one has responded yet, so I will give it a shot, noting that

hopefully my people will correct me if I am wrong.

Like clustering, there is no right answer and it depends upon what you

want to learn. When I teach about PCA, I try to start from the concepts of

normal modes of motion. You do a QM minimization and you can visualize the

normal modes. The first eigenvalues/eigenvectors report on the slowest

collective modes of motion (the overall molecule bending / twisting with

many of the atoms moving). The later ones, the high frequency modes, are

just a couple of atoms moving, such as bond vibrations. If you haven't

ever viewed modes of motion in GaussView or equivalents, do...

When looking at the modes of motion from PCA, they are ordered from low

frequency to high (bonds). Bond vibrations do not tell you much about the

dynamics or function or really much more than the fact that the bond is

vibrating. Often the most informative are the first few modes of motion

since they give you a picture of the largest / collective motion. The

first few modes often account for ~90% of the motion. So, back to your

question: how many modes you need to look at depends on what you are trying to

learn.
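
One way to check how much of the motion the first few modes capture is to look at the cumulative fraction of the eigenvalue sum. The following is a minimal sketch, not anything CPPTRAJ itself provides: the function names are hypothetical, the eigenvalues are made-up numbers for illustration, and it assumes you have already read the eigenvalues out of your own diagmatrix output.

```python
from itertools import accumulate


def cumulative_variance_fraction(eigenvalues):
    """Running fraction of the total variance captured by the first k modes."""
    ordered = sorted(eigenvalues, reverse=True)  # largest-variance mode first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    return [running / total for running in accumulate(ordered)]


def modes_needed(eigenvalues, target=0.90):
    """Smallest number of leading modes whose variance fraction reaches target."""
    fractions = cumulative_variance_fraction(eigenvalues)
    for k, fraction in enumerate(fractions, start=1):
        if fraction >= target:
            return k
    return len(fractions)


# Made-up eigenvalues, for illustration only (not from any real simulation).
example_eigenvalues = [50.0, 25.0, 15.0, 5.0, 3.0, 2.0]
print(modes_needed(example_eigenvalues, target=0.80))
```

The threshold is a choice, not a rule: it only tells you how many modes cover a given share of the variance, not whether those modes answer your question.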

In the papers you reference from my lab, we were trying to demonstrate

reproducibility and convergence from independent simulations from different

initial conditions. If different simulations showed equivalence for the

first three modes of motion, probably good. If 15-20 modes agree (between

independent simulations) even better since this gets to ~95-99% of motion

-- hard to claim independent simulations are not equivalent if they

reproduce 15-20 of the modes. Does this guarantee convergence? No, since

the independent simulations could both have missed important states (i.e.

locally converged, not necessarily globally converged). Yet, do we need to

compare all modes? Probably not (noise), but we could with CPPTRAJ since it

was designed to be flexible (to allow you to investigate what you want).

CPPTRAJ being flexible means you should experiment; there is no one "correct" way

and the rate-limiting step in simulation these days is not running MD, but

figuring out what the MD means.
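
As one illustration of comparing modes between independent simulations, here is a minimal sketch of the root-mean-square inner product (RMSIP) between two sets of eigenvectors. This is a commonly used overlap measure, not necessarily the comparison used in the papers mentioned above. The function names are hypothetical, and the sketch assumes each set holds the same number of unit-length, mutually orthogonal vectors computed on the same atoms after fitting to a common reference.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """Root-mean-square inner product between two sets of n eigenvectors.

    Assumes each set is orthonormal. Returns 1.0 when the two sets span
    the same subspace and 0.0 when the subspaces are orthogonal.
    """
    if not modes_a or len(modes_a) != len(modes_b):
        raise ValueError("need two non-empty sets with the same number of modes")
    n = len(modes_a)
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return math.sqrt(total / n)


# Toy 3-component vectors, for illustration only.
run1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
run2 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0]]  # same subspace, reordered, one sign flipped
print(rmsip(run1, run2))
```

Because the inner products are squared, the measure ignores the arbitrary sign of each eigenvector and the order of modes within the compared set, both of which can differ between otherwise equivalent runs.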

To reframe your question: it is not about the number of modes; it is about what

question you are trying to answer with respect to the modes of motion.

--tec3

_______________________________________________

AMBER mailing list

AMBER.ambermd.org

http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Apr 29 2020 - 20:30:03 PDT
