On Tue, Dec 20, 2016 at 12:09 PM, Neale, Christopher Andrew <cneale.lanl.gov> wrote:
> Dear developers:
>
> I have finished including PV work in the T-REMD exchange criterion in
> amber16. I still have an open question about PBC changing too much and the
> fact that "The GPU code does not automatically reorganize grid cells",
This message seems to confuse a lot of people, so I will try to describe
why this message appears, which I hope will explain what it means to you
and how you should respond to it. You'll need to know big-O notation for
computational complexity (https://en.wikipedia.org/wiki/Big_O_notation).
It all boils down to computational efficiency through the judicious use of
good algorithms. In a naive implementation, the nonbonded interactions scale
as O(N^2), since you need a double loop over pairs of atoms:
for (i = 0; i < natom; i++) {
    for (j = i + 1; j < natom; j++) {
        calculate_nonbonded_potential(i, j);
    }
}
However, if you use a cutoff, then this scaling reduces to O(N), because
each atom (for typical particle densities) interacts with a more or less
predictable number of particles that does not increase as the system size
increases. (Adding PME to capture long-range electrostatics raises that
cost to O(N log N).) But the only way you can get to the O(N) scaling is to
use a pairlist -- otherwise you have to check all N^2/2 distances to see
whether you need to calculate each interaction, so that algorithm still
scales as O(N^2).
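To make that concrete, here is a minimal sketch in C-like code (the names
nneigh, neighbors, and calculate_nonbonded_potential are illustrative, not
pmemd's actual data structures): once a pairlist exists, each atom only
visits its own, roughly constant-size list of partners, so the force loop
does O(N) total work.

extern void calculate_nonbonded_potential(int i, int j);  /* as in the loop above */

/* With a pairlist, atom i only visits its own partners, so the total work is
 * (number of atoms) x (partners per atom), i.e. O(N) at typical densities. */
void nonbonded_from_pairlist(int natom, const int *nneigh, int *const *neighbors)
{
    for (int i = 0; i < natom; i++) {
        for (int k = 0; k < nneigh[i]; k++) {
            int j = neighbors[i][k];   /* j is within cutoff + skinnb of i */
            calculate_nonbonded_potential(i, j);
        }
    }
}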
OK, so we need a pairlist. The obvious way to build one is to check every
atom against every other atom to see if the distance is within the cutoff
(it's typical to add a little extra padding so you don't have to rebuild the
pairlist every step -- it lets particles move a little without the risk that
a particle inside the cutoff is left out of the pairlist; this is what the
"skinnb" parameter controls). But this, again,
scales as O(N^2), so we're still not really improving the net scalability
of our MD simulation by using a cutoff or a pairlist. So now we have to
figure out a way to build the pairlist in O(N) complexity.
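As an illustration only (the names and layout are mine, not pmemd's, and
periodic imaging is omitted for brevity), that naive build looks something
like this; note the (cutoff + skinnb) buffer that lets the list survive a
few steps of particle motion:

/* Naive O(N^2) pairlist build: every pair is distance-tested once.  Atoms
 * within cutoff + skinnb are stored so the list stays valid for a few steps
 * as particles move (periodic imaging omitted for brevity). */
void build_pairlist_naive(int natom, const double (*x)[3],
                          double cutoff, double skinnb,
                          int *nneigh, int **neighbors)
{
    double r2max = (cutoff + skinnb) * (cutoff + skinnb);
    for (int i = 0; i < natom; i++) {
        nneigh[i] = 0;
        for (int j = i + 1; j < natom; j++) {
            double dx = x[i][0] - x[j][0];
            double dy = x[i][1] - x[j][1];
            double dz = x[i][2] - x[j][2];
            if (dx*dx + dy*dy + dz*dz <= r2max)
                neighbors[i][nneigh[i]++] = j;
        }
    }
}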
The way we do this is to break the whole unit cell into cube-shaped chunks,
or "cells". The clever part about this approach is that we can very
quickly calculate which cell each atom belongs to (independent of every
other atom) by a single loop over all atoms, which is O(N). Then,
depending on the size of our cell, we can build an atom's pairlist by only
looking at distances to atoms in nearby cells. Since the size of these
cells is independent of N, this reduces the complexity of our pairlist to
O(N) (since adding more atoms does not increase the number of distances we
need to compute *per atom* as long as we have a typical density).
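A sketch of that O(N) assignment step (again illustrative, and assuming an
orthorhombic box with coordinates already wrapped into [0, box)):

/* One pass over the atoms: each atom's cell index is computed from its own
 * coordinates, independent of every other atom, so this step is O(N). */
void assign_cells(int natom, const double (*x)[3],
                  const double box[3], const int ncell[3], int *cell_of)
{
    for (int i = 0; i < natom; i++) {
        int cx = (int)(x[i][0] / box[0] * ncell[0]);
        int cy = (int)(x[i][1] / box[1] * ncell[1]);
        int cz = (int)(x[i][2] / box[2] * ncell[2]);
        cell_of[i] = (cz * ncell[1] + cy) * ncell[0] + cx;
    }
}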
This is what pmemd.cuda does -- it breaks the unit cell down into
individual sub-"cells" to facilitate building a pairlist quickly and
efficiently in a way that scales well with the number of particles. A
common choice for the sub-cell edge length is the cutoff (+ skinnb), so that
you only have to look at the 3x3x3 block of 27 cells (an atom's own cell
plus its 26 neighbors) for possible pairlist partners for each atom.
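That 27-cell search looks roughly like this (an illustrative sketch, not
pmemd.cuda's code; visit_cell is a hypothetical stand-in for whatever the
real code does with the atoms stored in a cell, and the modulo handles the
periodic wrap):

extern void visit_cell(int nx, int ny, int nz);  /* hypothetical: test distances
                                                    against atoms stored in that cell */

/* With cell edge >= cutoff + skinnb, every possible partner of an atom in
 * cell (cx, cy, cz) lives in the 3x3x3 block of 27 cells around it. */
void scan_neighbor_cells(int cx, int cy, int cz, const int ncell[3])
{
    for (int dz = -1; dz <= 1; dz++)
        for (int dy = -1; dy <= 1; dy++)
            for (int dx = -1; dx <= 1; dx++)
                visit_cell((cx + dx + ncell[0]) % ncell[0],
                           (cy + dy + ncell[1]) % ncell[1],
                           (cz + dz + ncell[2]) % ncell[2]);
}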
Well, in constant-pressure simulations, this cell size can change. If the
volume changes a *lot* (in particular, if it shrinks), then the edge of each
sub-cell can become smaller than the cutoff + skinnb. If this happens, then
building a pairlist from just the 27 nearest cells will miss a potentially
large number of interactions that *should* be included. You'd need to either
define new cells based on your new unit cell size, or search the (5^3 =) 125
nearest sub-cells (and so on, depending on how small the sub-cells have
become compared to the cutoff).
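The check behind the error message is, in spirit, just this (a sketch of the
idea, not pmemd.cuda's actual source): the cell counts are fixed from the
starting box, so once NpT shrinks the box enough, a cell edge drops below
cutoff + skinnb and the code stops rather than silently miss pairs.

/* Sketch of the idea behind the error (not pmemd.cuda's actual code).
 * ncell[] was chosen from the *initial* box, so a shrinking box shrinks
 * every cell; once an edge is below cutoff + skinnb, the 27-cell search is
 * no longer guaranteed to find every pair. */
int cells_still_valid(const double box[3], const int ncell[3],
                      double cutoff, double skinnb)
{
    for (int d = 0; d < 3; d++)
        if (box[d] / ncell[d] < cutoff + skinnb)
            return 0;   /* cell too small: quit rather than compute garbage */
    return 1;
}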
pmemd.cuda never did any of this, and one day someone ran into some pretty
crazy behavior as a result of this bug. The root cause was uncovered, and
a patch was thrown in to prevent this from happening. The easiest way to
prevent this bug from hitting people was to check to see if the cell size
got too small for the cutoff and quit before pmemd.cuda could do the wrong
thing (that's what this error message is saying). By far the better thing
to do would be to rebuild the cell decomposition used in the pairlist
builder, but that requires quite a bit more coding.
So rather than re-generating the subcells automatically, pmemd.cuda forces
users to do it by hand -- that is, they need to rerun their simulations
starting from their last restart file (since the subcells are defined as
soon as the simulation starts).
Note that this is specific to pmemd.cuda -- the CPU version of pmemd and
pmemd.MPI, as well as sander and sander.MPI, do not have this problem
(which is why the suggestion is often to heat with pmemd.MPI and do
production with pmemd.cuda).
Long story short -- this only happens in NpT, does not indicate you have
done something wrong (although it is worth checking to be sure something
weird is not happening), and is easily fixed by just re-running from the
restart file that was generated right before pmemd.cuda quit.
HTH,
Jason
--
Jason M. Swails