Any compiled GPU code that interfaces significantly with interpreted Python
code is going to be slow, much more so than CPU code that is merely 20-50x
slower.
That said, OpenMM is roughly 2/3 the speed of equivalent AMBER code. Its
Generalized Born code seems faster, but only because what it runs is a
simpler form of the GB code in AMBER 11 (unless things have changed
dramatically since I *wrote* the first cut of that GPU code). In that
spirit, I could replace the nonbonded VDW and electrostatic code in AMBER
with NOPs and it would become the fastest molecular dynamics code in
existence. I'm really tired of these contests, but like I said at GTC this
year, I'm happy to go there, because I used to be one of those guys the PC
hardware web sites condemned on a regular basis for hand-coded hacks to
display drivers to win crucial benchmarks. We've already added HMR support.
Want us to start skipping the Ewald sum and only rebuilding the neighbor
list when we feel like it, too? While we're at it, I know how to take the
timestep to 5.2 fs, and as we all know, 5.2 fs is 0.2 rocking fs bigger
than 5.0!
Scott
On Mon, May 26, 2014 at 3:35 PM, Jason Swails <jason.swails.gmail.com> wrote:
> We're getting a little OT here, so I branched the thread
>
>
> On Mon, May 26, 2014 at 5:09 PM, Robert McGibbon <rmcgibbo.gmail.com>
> wrote:
>
> > > OpenMM-Python suffers from frequent I/O
> > >
> > > OpenMM incurs a lot of overhead from the builtin dimensional analysis
> > > and the time spent in the Python interpreter.
> >
> > I don't think these statements are accurate. You can easily check by not
> > attaching any reporters to a simulation, or running context.step()
> > directly without a simulation object. But I'm a little biased. Perhaps
> > another forum would be more appropriate to discuss it in though.
> >
>
> The first statement (OpenMM-Python suffers from frequent I/O) is definitely
> accurate. Manipulating data in Python and running OpenMM-Python's built-in
> dimensional analysis are both expensive. The slowdown relative to "peak
> performance" (i.e., no printouts) when writing a trajectory snapshot every
> step is much more pronounced for OpenMM-Python than for any compiled
> program (sander, pmemd, OpenMM-accelerated sander, etc.). I've verified
> this directly with my own benchmarks and tests. [1]
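>
> To illustrate, here is a quick sketch of the kind of timing I mean
> (untested as written, and the prmtop/inpcrd file names are just
> placeholders):
>
>     import time
>     from simtk import unit
>     from simtk.openmm import app
>     import simtk.openmm as mm
>
>     # Build a typical explicit-solvent system from Amber input files.
>     prmtop = app.AmberPrmtopFile('system.prmtop')
>     inpcrd = app.AmberInpcrdFile('system.inpcrd')
>     system = prmtop.createSystem(nonbondedMethod=app.PME,
>                                  nonbondedCutoff=8*unit.angstroms,
>                                  constraints=app.HBonds)
>     integrator = mm.LangevinIntegrator(300*unit.kelvin, 1/unit.picosecond,
>                                        2*unit.femtoseconds)
>     simulation = app.Simulation(prmtop.topology, system, integrator)
>     simulation.context.setPositions(inpcrd.positions)
>
>     # Time 1000 steps with no reporters attached ("peak performance").
>     t0 = time.time()
>     simulation.step(1000)
>     simulation.context.getState(getEnergy=True)  # sync the GPU
>     print('no reporters:   %.2f s' % (time.time() - t0))
>
>     # Attach a DCD reporter that fires every single step and re-time.
>     simulation.reporters.append(app.DCDReporter('traj.dcd', 1))
>     t0 = time.time()
>     simulation.step(1000)
>     simulation.context.getState(getEnergy=True)  # sync the GPU
>     print('DCD every step: %.2f s' % (time.time() - t0))
>
> The gap between those two numbers is the Python and I/O overhead I am
> talking about, and it is much larger than the corresponding gap for a
> compiled program like pmemd.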
>
> The only trick you mentioned that gets rid of the additional OpenMM-Python
> overhead entirely is calling step() directly (on the context.integrator,
> really, since the Context itself has no step method), but for an OpenMM
> beginner, especially one coming only from Amber, this is non-obvious.
> Even a Simulation object with no reporters has _some_ additional Python
> overhead, since it only takes 10 steps at a time internally. [2]
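>
> For the record, the direct route looks something like this (again just a
> sketch, reusing the system and positions from above; an integrator can
> only ever be bound to one Context, so a fresh one is needed):
>
>     # Skip the Simulation wrapper: all 1000 steps in a single C++ call,
>     # with no Python loop and no reporter checks in between.
>     integrator2 = mm.LangevinIntegrator(300*unit.kelvin, 1/unit.picosecond,
>                                         2*unit.femtoseconds)
>     context = mm.Context(system, integrator2)
>     context.setPositions(inpcrd.positions)
>     integrator2.step(1000)
>
> Note that step() lives on the Integrator, not the Context, which is part
> of why this is so non-obvious to a newcomer.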
>
> All the best,
> Jason
>
> [1] This is hardly a strike against OpenMM -- in fact, quite the opposite.
> OpenMM will always be (possibly negligibly) faster than the Python
> application layer reports, due to this overhead. A fairer performance
> comparison, strictly speaking, would be between pmemd.cuda and an
> OpenMM-accelerated version of sander or pmemd (although you can get a
> fair comparison in the Python layer if you are careful).
>
> [2] I have never measured this cost. It could easily be completely
> negligible for any system of any size; I just don't know
> (order-of-magnitude estimates here are complicated by the jumping between
> Python and C/C++).
>
> --
> Jason M. Swails
> BioMaPS, Rutgers University
> Postdoctoral Researcher
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue May 27 2014 - 12:30:02 PDT