From: Jason Swails <jason.swails.gmail.com>

Date: Wed, 13 Jul 2016 21:38:44 -0400

On Wed, Jul 13, 2016 at 7:09 PM, Thomas Evangelidis <tevang3.gmail.com>

wrote:

> > The slow part of the calculation is calculating the forces. Compared to
> > that, the cost of integrating positions and velocities is nearly
> > negligible. So the question here is "is there any way to efficiently
> > calculate only the forces on the atoms you wish to propagate"?
> >
> > Unfortunately the answer to that is "no", especially with GB. The
> > long-range effect of the nonbonded interactions means you need to compute
> > the interactions between all atoms you wish to study (near the active site)
> > with all other atoms in the system. And you also can't restrict the
> > nonbonded calculation between only the (very small number of) pairs in
> > which one atom is one of the "atoms of interest", because the GB potential
> > is not pairwise decomposable.
> >
> Which means in simple words that not calculating forces between _fixed
> atoms only_ in a GP simulation is wrong?
>

From a computational efficiency perspective, this is essentially correct.

[1] It's difficult to eliminate unnecessary calculations in a way that is

computationally efficient, so it was never done in Amber. For a lengthier,

more precise description (including a description of how performance

*could* be improved in a new code or modified Amber), see the postscript.

HTH,

Jason

[1] This is because the most expensive part of the GB calculation is the

double loop over nonbonded pairs. This needs to be done twice: the first

time to compute the effective GB radii and the second time to compute the

vdW, GB, and electrostatic energies (which use those effective radii).

The second double-loop can be eliminated by restricting the calculated

interactions to just include terms in which at least one atom is mobile (if

this is a small number of atoms, it makes that loop a very fast O(N)

instead of the very slow O(N^2) it is now).
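
To put rough numbers on that, here is a minimal sketch that counts how many pair terms survive when the second loop is restricted to pairs with at least one mobile atom. The function name `pair_counts` and the example system size are illustrative assumptions, not anything taken from Amber.

```python
def pair_counts(n_atoms, n_mobile):
    """Return (all_pairs, pairs_with_mobile_atom).

    all_pairs is N*(N-1)/2, the number of distinct atom pairs.
    pairs_with_mobile_atom excludes pairs in which both atoms are fixed.
    Hypothetical helper for illustration only.
    """
    if not 0 <= n_mobile <= n_atoms:
        raise ValueError("n_mobile must be between 0 and n_atoms")
    all_pairs = n_atoms * (n_atoms - 1) // 2
    n_fixed = n_atoms - n_mobile
    fixed_fixed_pairs = n_fixed * (n_fixed - 1) // 2
    return all_pairs, all_pairs - fixed_fixed_pairs


if __name__ == "__main__":
    # Assumed example: 5000 atoms, 50 of them free to move.
    total, kept = pair_counts(5000, 50)
    print(f"all pairs: {total}, pairs with a mobile atom: {kept}")
```

With a fixed, small number of mobile atoms M, the surviving count is M*(N-M) + M*(M-1)/2, which grows linearly in N.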

However, since the GB energies between any atom pairs depend on the

effective radii, it's impossible to eliminate the first double-loop. This

imposes a very real limit to how efficient you can make GB calculations

with only a small subset of the atoms being free to move. The effective

radii loop remains naively O(N^2), although it can be reduced to O(N) with

a large prefactor if you choose a cutoff for the effective radii (rgbmax).

The contributions to the effective radii fall off as r^-4, so you *can* choose a cutoff

and retain most of your accuracy, but it needs to be a lot larger than the

8 A typically used for vdW interactions in periodic simulations. The

default for Amber is 15. But then you'd need a pairlist to actually make

this efficient, and Amber does not implement one of those for GB (so you

wind up doing the full N*(N-1)/2 distance calculations between all distinct

atom pairs regardless).
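
The point about the missing pairlist can be sketched as follows: a cutoff reduces the number of terms that contribute, but without a pairlist every distinct pair still costs a distance check. This is a toy brute-force loop written for illustration; `count_pairs` and the coordinates are assumptions and do not reflect how Amber's code is organized.

```python
import math


def count_pairs(coords, cutoff):
    """Return (distance_checks, pairs_within_cutoff) for a brute-force loop.

    coords is a list of (x, y, z) tuples and cutoff is in the same length
    unit. Every distinct pair is visited, so distance_checks is always
    N*(N-1)/2 no matter how small the cutoff is.
    """
    n = len(coords)
    distance_checks = 0
    within_cutoff = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            distance_checks += 1
            if math.dist(coords[i], coords[j]) <= cutoff:
                within_cutoff += 1
    return distance_checks, within_cutoff


if __name__ == "__main__":
    # Assumed toy system: four atoms on a line, 10 A apart, 15 A cutoff.
    atoms = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    checks, kept = count_pairs(atoms, 15.0)
    print(f"distance checks: {checks}, pairs within cutoff: {kept}")
```

A pairlist would avoid most of those distance checks by only revisiting pairs already known to be close, which is the piece described above as not implemented for GB.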

Since the effective radii calculation is, IIRC, ~30% of the total cost of the GB

calculation, you could make the code run at most ~3x faster. You might

be able to bump it up to ~6-10x faster if you took more approximations

(like a RESPA integrator where you compute effective radii every other step

instead of every step). But to do that you would need to basically

completely rewrite the nonbonded loops to optimize for this use-case in

addition to adding a pairlist for use in calculating the effective radii

(and don't forget all the validation you would need to do to make sure you

implemented it correctly). You already get a bigger boost than that in

going to GPUs (and it would be even harder to get that 3x speedup on GPUs,

I suspect).
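
The speedup estimates above follow from simple arithmetic, sketched here under two loudly stated assumptions: the effective radii loop is ~30% of the cost (the recalled figure from this message), and everything else is idealized to zero cost. The function `max_speedup` is a hypothetical helper, not part of any package.

```python
def max_speedup(radii_fraction, radii_every_n_steps=1):
    """Upper bound on speedup when only the effective radii cost remains.

    radii_fraction: share of the original per-step cost spent on effective
    radii (assumed ~0.3 here).
    radii_every_n_steps: recompute radii only every n-th step, as in a
    RESPA-like scheme, so the average per-step radii cost is divided by n.
    All other costs are assumed to vanish, so this is a best case.
    """
    if not 0 < radii_fraction <= 1:
        raise ValueError("radii_fraction must be in (0, 1]")
    if radii_every_n_steps < 1:
        raise ValueError("radii_every_n_steps must be at least 1")
    remaining_cost = radii_fraction / radii_every_n_steps
    return 1.0 / remaining_cost


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"radii every {n} step(s): up to {max_speedup(0.3, n):.1f}x")
```

Under these assumptions the bound is about 3.3x with radii computed every step, about 6.7x every other step, and 10x every third step, in line with the ~3x and ~6-10x figures quoted above. Real gains would be lower, since the remaining loops are not free.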

And even if you did *all that*, you'd still have to be able to justify to

reviewers why fixing most of the protein makes no difference to the

results... which may very well require you to run the full, unconstrained

simulations, anyway :). And any justification for one system does not

automatically extend to others. So it's a lot of effort for little (and

quite possibly no) real benefit.


--
Jason M. Swails
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jul 13 2016 - 19:00:02 PDT
