You know what this means?
Independent Performance Analysis shows!
AMBER is up to 57x faster than "the leading brand(tm)!"
Accept no substitute...
On May 27, 2014 7:16 PM, "Jason Swails" <jason.swails.gmail.com> wrote:
> On Tue, May 27, 2014 at 5:11 PM, Robert McGibbon <rmcgibbo.gmail.com>
> wrote:
> >
> > The particular claim of Jason's that I disputed was that the overhead
> from
> > the python interpreter and/or file IO has a significant effect on OpenMM
> > performance.
>
>
> It can have a very significant impact. Writing a trajectory file, even in
> a binary format, incurs a _lot_ of overhead going through the Python
> application layer. (For my example I used the NetCDF reporter class in
> ParmEd, which uses the scipy.io.netcdf_file backend and does effectively
> the same thing as MDTraj's NetCDFReporter class.)
>
> You download all 3N coordinates into Python space (I am not sure of the
> mechanics, or how much memory, if any, is copied each time), do a unit
> conversion on all 3N coordinates, then write those coordinates to the
> trajectory file. In pmemd.cuda this overhead is effectively eliminated --
> the only costs are the NetCDF API call (cheap) and the I/O itself
> (expensive). If you write snapshots infrequently, the trajectory writing
> is not noticeable, since its cost is tiny compared to the MD done between
> writes. If you write frequently enough, the cost becomes measurable (and
> at some point even dominant). Per frame, trajectory I/O is ~2 orders of
> magnitude cheaper in pmemd.cuda (or in any program that uses the OpenMM
> C/C++/F90 API directly -- basically anything that _avoids_ Python and
> OMM's dimensional analysis) than it is through the OpenMM Python
> application layer.
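>
> To make the per-frame work concrete, a stripped-down reporter along the
> lines below has to do all of this in the interpreter for every frame it
> writes. (This is an illustrative sketch only -- it is not the actual
> ParmEd or MDTraj code, the names are made up, and it omits the AMBER
> NetCDF convention attributes.)
>
> from simtk import unit
> from scipy.io import netcdf_file
>
> class MinimalNetCDFReporter(object):
>     """Illustrative sketch of the Python-layer work done per frame."""
>
>     def __init__(self, filename, reportInterval, natom):
>         self._interval = reportInterval
>         self._frame = 0
>         self._ncfile = netcdf_file(filename, 'w')
>         self._ncfile.createDimension('frame', None)
>         self._ncfile.createDimension('atom', natom)
>         self._ncfile.createDimension('spatial', 3)
>         self._coords = self._ncfile.createVariable(
>                 'coordinates', 'f', ('frame', 'atom', 'spatial'))
>
>     def describeNextReport(self, simulation):
>         steps = self._interval - simulation.currentStep % self._interval
>         return (steps, True, False, False, False)  # positions only
>
>     def report(self, simulation, state):
>         # 1) pull all 3N coordinates into Python as a numpy Quantity
>         xyz = state.getPositions(asNumpy=True)
>         # 2) unit conversion (nm -> Angstroms) on all 3N coordinates
>         xyz = xyz.value_in_unit(unit.angstrom)
>         # 3) hand the array to the NetCDF backend and push it to disk
>         self._coords[self._frame] = xyz
>         self._ncfile.flush()
>         self._frame += 1
>
> Every one of those steps runs through the interpreter once per frame,
> which is where the extra time in the table below comes from.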
>
> I ran a quick benchmark on my desktop with a GTX 680 (writing to a standard
> SATA drive, 6 Gb/s IIRC) running all calculations using CUDA (5.5) on a
> 21,548-atom system with PME (SPFP in pmemd and mixed precision model in
> OpenMM). Same constraint tolerance, cutoff, etc. In the table below,
> "steps/write" is the number of MD steps between trajectory writes; the
> total number of MD steps is the same for every run. Results are shown
> in a fixed-width table, and the text file is attached in case the
> formatting comes out poorly and you want to see it:
>
>            OpenMM                                 pmemd.cuda
>            ------                                 ----------
>
>  steps/write | Total time (s)      steps/write | Total time (s)
>  -----------------------------     -----------------------------
>       10,000 |       374.9797           10,000 |          86.58
>        5,000 |       375.5790            5,000 |          85.89
>        2,000 |       378.6431            2,000 |          85.96
>        1,000 |       384.0025            1,000 |          85.93
>          500 |       395.7065              500 |          86.34
>          250 |       417.7666              250 |          86.31
>          100 |       484.1631              100 |          87.03
>           50 |       597.1071               50 |          88.20
>           10 |      1478.6362               10 |          98.74
>            5 |      2677.9380*               5 |         109.40*
>            1 |     11431.2420*               1 |         199.20*
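>
> (As a rough check on the "~2 orders of magnitude" figure above: at a write
> interval of 1 step, the overhead relative to the infrequent-write baseline
> is roughly 11431 - 375 ~= 11056 s for OpenMM-Python versus roughly
> 199 - 87 ~= 112 s for pmemd.cuda -- about a factor of 100 per write.)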
>
> I had seen this I/O effect before. What I have never measured is the cost
> of having the Simulation class break the MD into 10-step chunks so that
> Python can catch a SIGINT. It would be easy enough to subclass Simulation
> and implement "step" without the 10-step chunking; I've just never done
> it.
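>
> Something along these lines ought to do it, though -- an untested sketch,
> not code I have actually run:
>
> from simtk.openmm import app
>
> class UnchunkedSimulation(app.Simulation):
>     """Untested sketch: take all requested steps in one integrator call."""
>
>     def step(self, steps):
>         # No 10-step chunking: reporters do not fire and Python cannot
>         # catch a SIGINT until the integrator call returns.
>         self.integrator.step(steps)
>         self.currentStep += steps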
>
> All the best,
> Jason
>
> P.S. The timings presented here are only meant to show the impact of
> trajectory file I/O on the performance of OMM-Python and pmemd.cuda; the
> cross-comparisons between the two programs are probably less reliable.
>
> --
> Jason M. Swails
> BioMaPS,
> Rutgers University
> Postdoctoral Researcher
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber