# Re: [AMBER] GTX680

From: Aron Broom <broomsday.gmail.com>
Date: Wed, 28 Mar 2012 13:45:45 -0400

yeah exactly, I'll keep my fingers crossed. Thanks again for sending us
this info!

On Wed, Mar 28, 2012 at 1:40 PM, Scott Le Grand <varelse2005.gmail.com> wrote:

> I wouldn't give up at all. If GK104 had 2x 580 double-precision, I'd
> already be hitting 1.7x GTX 580 perf on it. And I can't see NVIDIA
> abandoning the high ground, ever. I can see them providing better
> differentiation between consumer and Tesla though. After all, what games
> right now make extensive use of GPU double-precision?
> Scott
> On Wed, Mar 28, 2012 at 10:30 AM, Aron Broom <broomsday.gmail.com> wrote:
> > Ha, thanks for the correction concerning the consumer cards.
> >
> > To follow up on your point then, do you think you'll be getting an
> > opportunity to test a GK110 when it comes out and give us the scoop? I
> > have so much hope for the new Keplers, it would be really sad if we have
> > to stick with the older technology. As much as I liked being able to buy
> > cheap consumer cards, I'm fine with the idea of having consumer-specialized
> > cards that deliver for gaming and another set that deliver for GPGPU.
> >
> > Anyway, I'll keep holding onto the hope that the GK110 gives something
> > amazing like 2x the speed of an M2090. I also heard a rumor, of dubious
> > reliability, that the GK110 would have a 384-bit memory bus rather than
> > the 256-bit bus of the 680.
> >
> > ~Aron
> >
> > On Wed, Mar 28, 2012 at 1:07 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
> >
> > > It's *not* artificially crippled. It *never* was. That's the sort of
> > > nonsense you get from reading Charlie Demerjian (whom I can only assume
> > > must have had his family fortune destroyed by speculating in NVIDIA
> > > stock)*. While Teslas and GeForces shared the same base chip, the
> > > Tesla-grade chips required all double-precision units to be functional
> > > while a GeForce only needed 1 out of 4 to work to be shippable. So you're
> > > getting slightly less functional chips for far less money. And if you
> > > think that's crazy, check out what Intel charges for a 100 MHz clock boost
> > > on a consumer CPU.
> > >
> > > You are correct however that AMBER performance in SPDP mode is more or less
> > > equivalent between a GTX 570 and an M2070. But that changes dramatically
> > > in full double-precision mode, where the M2070 is roughly 2x faster
> > > (go ahead, try it). There's also a hardware bug in GeForce chips that
> > > breaks parallel runs that has never manifested in Tesla so it's not quite
> > > as black and white.
> > >
> > > So I'd say stick with the M2090s for now (or just wait and see what GK110
> > > delivers before making such a call).
> > >
> > > Scott
> > >
> > > *He seemed much saner in the 1990s when we were both Atari Jaguar
> > > developers and talked regularly over IRC. He even stopped by my apartment
> > > once apparently while I was at a conference. Oh well, whatever floats his
> > > boat.
> > > On Wed, Mar 28, 2012 at 9:54 AM, Aron Broom <broomsday.gmail.com> wrote:
> > >
> > > > In terms of this double precision hit, how do you think that will work
> > > > out on the workstation cards?
> > > >
> > > > I know for the Fermi cards for instance, the double precision was
> > > > artificially crippled in the consumer cards (GTX 580) to be 1/4 of what
> > > > it was in the workstation (M2090). In testing AMBER and other programs
> > > > (NAMD) on a GTX 570 and M2070 (almost identical number of cores) I've
> > > > found no real improvement in speed with the M2070, which suggests that
> > > > the 4x double precision wasn't entirely needed for AMBER or NAMD.
> > > >
> > > > So the question then is: is the low double precision capability of the
> > > > 680 partially because it has again been artificially crippled, and the
> > > > corresponding workstation cards will actually have enough double
> > > > precision performance to show something fantastic?
> > > >
> > > > Personally I'm fine with buying up a bunch of GTX570s when people are
> > > > trying to clear out that model, but I know some people who will be
> > > > looking to purchase new workstation GPUs soon, and I'd love to have a
> > > > good sense of whether or not they should just continue on with the M2090s.
> > > >
> > > > Thanks,
> > > >
> > > > ~Aron
> > > >
> > > > On Wed, Mar 28, 2012 at 12:33 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
> > > >
> > > > > After a weekend with GTX 680, what I can say is that this is a great
> > > > > gaming GPU with amazing single-precision and texture performance, but it
> > > > > has the same overall memory bandwidth as a GTX 580 with significantly
> > > > > *less* double-precision performance.
> > > > >
> > > > > The upshot is that I expect it to only deliver 85-90% of the
> > > > > performance of a GTX 580. And that's partially because there's no
> > > > > increase in memory bandwidth and mostly because of the regression in
> > > > > double-precision performance. And that's a shame because
> > > > > single-precision *screams* on this chip.
> > > > >
> > > > > There are compensated single-precision accumulation algorithms that
> > > > > could be used here to ameliorate the performance hit. But this is a
> > > > > dangerous precedent to follow IMO that leads to code fragmentation because
> > > > > double-precision on all Fermi-class GPUs was faster, more precise, and
> > > > > simpler than such algorithms (which themselves were faster on GTX 2xx;
> > > > > see Tetsu Narumi's work, sigh). This is a nightmare to validate: GTX 680
> > > > > simply shouldn't have regressed on double-precision performance.
> > > > > Hopefully, the next chip won't. That said, this thing overclocks like
> > > > > crazy and the modder crowd has already doubled the base clock. So perhaps
> > > > > not all hope is lost...
> > > > >
> > > > > Scott
> > > > >
> > > > > On Fri, Mar 23, 2012 at 11:57 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
> > > > >
> > > > > > I am very optimistic about GTX680 performance...
> > > > > >
> > > > > > That said, anyone who hacks the configure script to make the current
> > > > > > code run will be severely (and unnecessarily) disappointed. Every AMBER
> > > > > > kernel has been meticulously shoehorned into GTX2xx and GTX5xx GPUs.
> > > > > > GTX680 is a radical redesign of Fermi (please don't listen to the
> > > > > > dunderheads on review sites blathering about matters that are beyond
> > > > > > them, seriously). That radical redesign has created a much more efficient
> > > > > > GPU (I'm expecting the perf/watt on AMBER to hit transwarp as opposed to
> > > > > > merely warp drive in the near future) but it's been at the expense of 33%
> > > > > > higher operational latency.
> > > > > >
> > > > > > 33% higher operational latency is fine - except that the shared memory
> > > > > > on GTX680 is exactly the same as on GTX580 and that's leading to a ~30%
> > > > > > performance deficit if one just runs the existing code. However, there
> > > > > > are 2x as many machine registers on GTX680 as on GTX580. Or TLDR: I need
> > > > > > to rewrite every single kernel for GTX680 from the ground up to hit
> > > > > > attainable performance.
> > > > > >
> > > > > > So give me a few weeks, mmkay?
> > > > > >
> > > > > > Scott
> > > > > >
> > > > > > On Fri, Mar 23, 2012 at 11:49 AM, Ross Walker <ross.rosswalker.co.uk> wrote:
> > > > > >
> > > > > >> Hi Filip,
> > > > > >>
> > > > > > >> > Hi all,
> > > > > > >> > I was wondering what we can expect from GTX680 and in general from
> > > > > > >> > the new Kepler line. I know that GTX680 is very limited in DP, but
> > > > > > >> > should be good in SP mode. Would we expect some speed boost compared
> > > > > > >> > to GTX580, and will it also work with Amber 11/12?
> > > > > >>
> > > > > > >> Amber 11 will NOT support the GTX680 cards (unless you hack the
> > > > > > >> configure script to compile it in what is effectively an emulation mode).
> > > > > > >> It will be too much work to make a patch against that. AMBER 12 will
> > > > > > >> support them but it is going to take around 6 weeks to 2 months to get
> > > > > > >> the optimization done and a patch released, so it won't support the cards
> > > > > > >> at release but it will as soon as we have the patch ready. I can't really
> > > > > > >> give you any performance expectations right now, only got my first
> > > > > > >> prototype board yesterday. ;-)
> > > > > >>
> > > > > > >> Right now if you compile AMBER 12 with PTX support so that it will at
> > > > > > >> least run on the GTX680 the performance sucks. It is about 70% of a
> > > > > > >> GTX580. NVIDIA changed the hardware too much (massively increasing the
> > > > > > >> threads but also the thread latency) so it will need some work to
> > > > > > >> optimize it, which is why I have chosen not to support the cards in
> > > > > > >> AMBER 12 until we have that optimization done. Once it is done I expect
> > > > > > >> considerable improvement over GTX580 speeds but can't give you anything
> > > > > > >> concrete right now.
> > > > > >>
> > > > > > >> > P.S. Indeed most of us will probably wait for GK110, but CUDA
> > > > > > >> > capability of GTX680 is very limited now.
> > > > > >>
> > > > > > >> This is probably a good idea, at least you should wait until we have
> > > > > > >> had a chance to get our hands dirty with the GK104 chip. So I'd urge
> > > > > > >> you to wait at least until we have the patch ready for AMBER 12.
> > > > > >>
> > > > > >> All the best
> > > > > >> Ross
> > > > > >>
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
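
For anyone reading the thread who hasn't met the "compensated single-precision accumulation algorithms" Scott mentions further down, here is a rough sketch of one well-known member of that family, Kahan summation. To be clear, this is not taken from the AMBER source and is not necessarily the variant Scott or Tetsu Narumi had in mind. The helper names (f32, naive_sum_f32, kahan_sum_f32) are made up for illustration, and single precision is emulated in plain Python by round-tripping through struct, which is an assumption made only so the example runs without a GPU.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(values):
    """Accumulate in single precision with no error correction."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def kahan_sum_f32(values):
    """Accumulate in single precision, carrying a compensation term
    that recovers the low-order bits lost by each addition."""
    total = 0.0
    comp = 0.0  # running estimate of the rounding error so far
    for v in values:
        y = f32(f32(v) - comp)          # apply the correction to the new term
        t = f32(total + y)              # this addition loses low-order bits of y
        comp = f32(f32(t - total) - y)  # what was actually added, minus y
        total = t
    return total


# Illustrative input only: one million copies of single-precision 0.1.
n = 1_000_000
values = [0.1] * n
reference = f32(0.1) * n  # computed in double precision as a baseline

naive = naive_sum_f32(values)
kahan = kahan_sum_f32(values)

print("reference  :", reference)
print("naive  f32 :", naive, " abs error:", abs(naive - reference))
print("kahan  f32 :", kahan, " abs error:", abs(kahan - reference))
```

The cost is visible in the loop body: four single-precision operations per term instead of one, plus an extra register for the compensation term. That is the trade-off being weighed in the thread against simply accumulating in double precision.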
