Hi all,
Scott, I agree about the overall difference between OpenMM and AMBER
performance, and furthermore that much too much energy has been spent on
creative benchmarking of various codes against one another. There's been a
lot of motivated reasoning
<http://en.wikipedia.org/wiki/Motivated_reasoning> going on, and it's a
shame. I hope we can move forward more constructively.
The particular claim of Jason's that I disputed was that the overhead from
the Python interpreter and/or file I/O has a significant effect on OpenMM
performance. Note that this isn't a claim about relative performance
between the two codebases, just what aspects influence OpenMM performance
(and to what degrees). I'm running some experiments to try to test my
hypotheses and see where they break down. Once I have the data, I'll make a
blog post and send a link here.
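For anyone who wants to poke at it in the meantime, the experiment is
roughly the following (a quick sketch against the simtk.openmm Python API;
input.pdb and the step counts are placeholders, not my actual benchmark
setup):

    import time
    from simtk import unit
    from simtk.openmm import app
    import simtk.openmm as mm

    # Build a generic solvated test system (input.pdb is a stand-in for
    # whatever structure you have on hand).
    pdb = app.PDBFile('input.pdb')
    forcefield = app.ForceField('amber99sb.xml', 'tip3p.xml')
    system = forcefield.createSystem(pdb.topology,
                                     nonbondedMethod=app.PME,
                                     constraints=app.HBonds)
    integrator = mm.LangevinIntegrator(300*unit.kelvin,
                                       1.0/unit.picosecond,
                                       2.0*unit.femtoseconds)
    simulation = app.Simulation(pdb.topology, system, integrator)
    simulation.context.setPositions(pdb.positions)

    def seconds_per_1000_steps():
        t0 = time.time()
        simulation.step(1000)
        return time.time() - t0

    # Baseline: no reporters attached, so no file I/O in the loop.
    bare = seconds_per_1000_steps()

    # Worst case: write a DCD snapshot every single step.
    simulation.reporters.append(app.DCDReporter('traj.dcd', 1))
    with_io = seconds_per_1000_steps()

    print('no reporters: %.2f s; DCD every step: %.2f s' % (bare, with_io))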
-Robert
On Tue, May 27, 2014 at 12:21 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
> Any compiled GPU code that interfaces significantly with interpreted Python
> code is going to be slow. Much more so than CPU code, which is already
> 20-50x slower.
>
> That said, OpenMM is roughly 2/3 the speed of equivalent AMBER code. Its
> Generalized Born code is seemingly somewhat faster, but that's because
> whatever it's running is a simpler form of the GB code in AMBER 11 (unless
> things have changed dramatically since I *wrote* the first cut of that GPU
> code). In that spirit, I could replace the nonbond VDW and Electrostatic
> code in AMBER with NOPs and it would become the fastest molecular dynamics
> code in existence. I'm really tired of these contests, but like I said at
> GTC this year, I'm happy to go there because I used to be one of those guys
> the PC hardware web sites used to condemn on a regular basis for handcoded
> hacks to display drivers to win crucial benchmarks. We've already added
> HMR support. Want us to start skipping the Ewald sum and only rebuilding
> the neighbor list when we feel like it too? While we're at it, I know how
> to take the timestep to 5.2 fs, and as we all know, 5.2 fs is 0.2 rocking
> fs bigger than 5.0!
>
> Scott
>
>
>
>
> On Mon, May 26, 2014 at 3:35 PM, Jason Swails <jason.swails.gmail.com>
> wrote:
>
> > We're getting a little OT here, so I branched the thread
> >
> >
> > On Mon, May 26, 2014 at 5:09 PM, Robert McGibbon <rmcgibbo.gmail.com>
> > wrote:
> >
> > > > OpenMM-Python suffers from frequent I/O
> > > >
> > > > OpenMM incurs a lot of overhead from the builtin dimensional analysis
> > > > and the time spent in the Python interpreter.
> > >
> > > I don't think these statements are accurate. You can easily check by not
> > > attaching any reporters to a simulation, or running context.step()
> > > directly without a simulation object. But I'm a little biased. Perhaps
> > > another forum would be more appropriate to discuss it in, though.
> > >
> >
> > The first statement (OpenMM-Python suffers from frequent I/O) is
> > definitely accurate. The overhead of manipulating data in Python and the
> > overhead of OpenMM-Python's dimensional analysis are both expensive. The
> > slowdown of OpenMM-Python relative to "peak performance" (i.e., no
> > printouts) when writing a trajectory snapshot every step is much more
> > pronounced than the slowdown of any compiled program (sander, pmemd,
> > OpenMM-accelerated sander, etc.). I've verified this directly with my
> > own benchmarks and tests. [1]
> >
> > The only trick you mentioned that removes the additional OpenMM-Python
> > overhead entirely is calling step on the context (really the
> > context.integrator) directly, but for an OpenMM beginner (especially one
> > coming only from Amber experience) this is non-obvious. Even a Simulation
> > object with no reporters has _some_ additional Python overhead,
> > introduced by taking only 10 steps at a time. [2]
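> >
> > To make that concrete, here is a rough, untested sketch of the two call
> > paths (assuming a Simulation object named simulation, built in the usual
> > way):
> >
> >     # Through the application layer: Simulation.step() advances in
> >     # small chunks (10 steps at a time) so it can poll its reporter
> >     # list between chunks, even when that list is empty.
> >     simulation.step(1000)
> >
> >     # Bypassing the wrapper: the Simulation keeps a reference to its
> >     # integrator, so all 1000 steps go to the compiled layer in a
> >     # single call, with no intervening Python work.
> >     simulation.integrator.step(1000)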
> >
> > All the best,
> > Jason
> >
> > [1] This is hardly a strike against OpenMM -- in fact quite the opposite.
> > OpenMM will always be (possibly negligibly) faster than the Python
> > application layer reports due to this overhead. A fairer performance
> > comparison, strictly speaking, would be between pmemd.cuda and an
> > OpenMM-accelerated version of sander or pmemd (although you can get a
> > fair comparison in the Python layer if you are careful).
> >
> > [2] I have never measured this cost. It could easily be completely
> > negligible for any system of any size. I just don't know
> > (order-of-magnitude estimates here are complicated by jumping between
> > Python and C/C++)...
> >
> > --
> > Jason M. Swails
> > BioMaPS, Rutgers University
> > Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue May 27 2014 - 14:30:02 PDT