Re: [AMBER] GTX Titan was finally released

From: Aron Broom <broomsday.gmail.com>
Date: Thu, 14 Mar 2013 13:42:22 -0400

Yeah, the bandwidth limitations across the (I'm assuming PCIe 3.0)
motherboard connections are a bottleneck for using more than one GPU.
Still, those numbers are pretty extreme. What are we talking about in terms
of CPU equivalence? Even using NAMD or something that scales quite
linearly on the CPU, I think some of those would take >64 cores to match,
and you aren't getting 64 cores for anything even remotely close to the
$1000 for the Titan.
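Aron's back-of-envelope comparison can be made concrete. A minimal sketch; the per-core CPU throughput and per-core cost below are illustrative assumptions for the sake of the arithmetic, not figures from this thread:

```python
# Rough price/performance comparison for the argument above.
# The CPU-side figures are illustrative assumptions, not measurements.
titan_price = 1000.0   # USD, approximate Titan launch price
titan_jac = 110.65     # ns/day, DHFR (JAC) NVE on 1x Titan, from Filip's numbers

cpu_core_jac = 0.9     # ASSUMED ns/day per CPU core on a JAC-sized system
cores_needed = titan_jac / cpu_core_jac
print(f"~{cores_needed:.0f} CPU cores to match one Titan on JAC")

cost_per_core = 100.0  # ASSUMED USD per core (CPU + board + RAM share)
print(f"estimated CPU cost: ${cores_needed * cost_per_core:,.0f} "
      f"vs ${titan_price:,.0f} for the Titan")
```

Even with generous per-core assumptions, the CPU-side cost lands an order of magnitude above the card's price, which is the point being made.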

~Aron

On Thu, Mar 14, 2013 at 1:31 PM, Gustavo Seabra <gustavo.seabra.gmail.com> wrote:

> Looks like there's really little gain by using 2 GPUs in parallel. Is that
> expected?
>
> Gustavo Seabra
> Professor Adjunto
> Departamento de Química Fundamental
> Universidade Federal de Pernambuco
> Fone: +55-81-2126-7450
>
>
> On Thu, Mar 14, 2013 at 1:33 PM, filip fratev <filipfratev.yahoo.com>
> wrote:
>
> > Hi Ross and all,
> >
> > These are my test results for the GTX Titan. Just a great GPU! My
> > results differ by 1-3% from those obtained by Ian; I am not sure why.
> > First of all, the Titan is a cool-running card. When I set the fan speed
> > to only 70-75%, the temperature never goes above 60-65C. It is a pity
> > that the card clock is only 876 MHz. Under Windows I was not able to
> > heat the card above 74C, and the speed was 1150-1170 MHz, i.e., under
> > Windows the single-precision speed and the boost speed are equal to the
> > so-called maximum clock speed. Many folks have already hacked their
> > BIOSes. Anyway...
> >
> > I use an i7 3770K at 4.6 GHz, a GB GTX Titan, and my RAM is clocked
> > above 2400 MHz. The OS was SUSE 12.1 + CUDA 5.0.
> >
> > ----------------------------------
> > DHFR NVE = 23,558 atoms
> >
> > 1xGTX Titan = 110.65 ns/day
> > 2xGTX Titan = 125.28 ns/day
> >
> > --------------------------------------
> > DHFR NPT = 23,558 atoms
> >
> > 1xGTX Titan = 85.27 ns/day
> > 2xGTX Titan = 101.88 ns/day
> >
> > ---------------------------------------
> > FactorIX NVE = 90,906 atoms
> >
> > 1xGTX Titan = 31.55 ns/day
> > 2xGTX Titan = 38.05 ns/day
> >
> > ----------------------------------
> > FactorIX NPT = 90,906 atoms
> >
> > 1xGTX Titan = 25.85 ns/day
> > 2xGTX Titan = 32.54 ns/day
> >
> > ----------------------------------------
> > Cellulose NVE = 408,609 atoms
> >
> > 1xGTX Titan = 7.50 ns/day
> > 2xGTX Titan = 8.72 ns/day
> >
> > ----------------------------------------
> > Cellulose NPT = 408,609 atoms
> >
> > 1xGTX Titan = 6.31 ns/day
> > 2xGTX Titan = 7.71 ns/day
> >
> > -------------------------------------------
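From the numbers above one can quantify how weak the two-GPU scaling is; a small sketch computing speedup and parallel efficiency from Filip's ns/day figures:

```python
# Parallel efficiency of 2x GTX Titan vs 1x, from the ns/day numbers above.
benchmarks = {
    "DHFR NVE":      (110.65, 125.28),
    "DHFR NPT":      (85.27, 101.88),
    "FactorIX NVE":  (31.55, 38.05),
    "FactorIX NPT":  (25.85, 32.54),
    "Cellulose NVE": (7.50, 8.72),
    "Cellulose NPT": (6.31, 7.71),
}

for name, (one_gpu, two_gpu) in benchmarks.items():
    speedup = two_gpu / one_gpu          # ideal would be 2.0
    efficiency = 100.0 * speedup / 2.0   # percent of ideal scaling
    print(f"{name}: speedup {speedup:.2f}x, efficiency {efficiency:.0f}%")
```

Every case comes out well under 1.3x, i.e. below ~65% of ideal scaling, consistent with the interconnect bottleneck discussed above.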
> >
> > All the best,
> > Filip
> >
> >
> >
> > ________________________________
> > From: filip fratev <filipfratev.yahoo.com>
> > To: AMBER Mailing List <amber.ambermd.org>
> > Sent: Wednesday, February 27, 2013 9:21 PM
> > Subject: Re: [AMBER] GTX Titan was finally released
> >
> > Hi Marek,
> >
> >
> > Thanks for sharing your benchmarks! Also thanks to Ross and Scott!
> >
> > 8-10% is not an insignificant difference, considering that the
> > difference between one and two GPUs is 14% in JAC NPT. At least for
> > me :)
> >
> > EVGA revealed the Titan clock speed for their superclocked version:
> > 876 MHz, i.e., nothing intriguing.
> >
> >
> > Ross you mentioned:
> > >>Firstly you are referring to the double precision clock rate and not
> > >>the single precision clock.
> >
> > What will be the single precision clock?
> >
> >
> > All the best,
> > Filip
> >
> > P.S. GPU Boost 2.0 is different from CPU boost. Your GPU can be under
> > 100% load but will still run at the boost clock (876 MHz, single
> > precision?), and under Windows at nearly 1 GHz; this changes only if the
> > temperature rises above 80C. Thus, with water cooling under Windows, one
> > would be able to run the Titan at 1 GHz, which is around a 10-15% gain
> > in performance.
> >
> >
> >
> > ________________________________
> > From: Marek Maly <marek.maly.ujep.cz>
> > To: AMBER Mailing List <amber.ambermd.org>
> > Sent: Wednesday, February 27, 2013 8:23 PM
> > Subject: Re: [AMBER] GTX Titan was finally released
> >
> > Hi Guys,
> > here, finally, are the results of the factory-overclocked GTX680
> > (EVGA GeForce GTX680 Classified) in combination with an "ASUS P9X79 PRO"
> > motherboard.
> >
> > As one can see, the increase from the reference 1006 MHz to 1111 MHz
> > makes just a small difference in the results (reflecting, more or less,
> > the percentage difference in frequency). I did not test it at the boost
> > clock (1176 MHz) and I am not going to, as for long MD runs this regime
> > seems a bit dangerous to me :))
> >
> > Regarding the reliability of this OC version, I am fully satisfied; I
> > have already tested 2 of these in a few weeks of simulations.
> >
> > Best wishes,
> >
> > Marek
> >
> >
> > JAC_PRODUCTION_NVE - 23,558 atoms PME
> > -------------------------------------
> >
> > 1 x GTX680: | ns/day = 80.38  seconds/ns = 1074.90
> >
> > JAC_PRODUCTION_NPT - 23,558 atoms PME
> > -------------------------------------
> >
> > 1 x GTX680: | ns/day = 64.59  seconds/ns = 1337.60
> >
> > FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
> > -------------------------------------------
> >
> > 1 x GTX680: | ns/day = 20.99  seconds/ns = 4115.84
> >
> > FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
> > -------------------------------------------
> >
> > 1 x GTX680: | ns/day = 16.89  seconds/ns = 5115.66
> >
> > CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
> > --------------------------------------------
> >
> > 1 x GTX680: | ns/day = 4.67   seconds/ns = 18485.98
> >
> > CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
> > --------------------------------------------
> >
> > 1 x GTX680: | ns/day = 3.87   seconds/ns = 22323.57
> >
> > TRPCAGE_PRODUCTION - 304 atoms GB
> > ---------------------------------
> >
> > 1 x GTX680: | ns/day = 774.84 seconds/ns = 111.51
> > 2 x GTX680: N/A
> > 3 x GTX680: N/A
> > 4 x GTX680: N/A
> > MYOGLOBIN_PRODUCTION - 2,492 atoms GB
> > -------------------------------------
> >
> > 1 x GTX680: | ns/day = 166.44 seconds/ns = 519.10
> >
> > NUCLEOSOME_PRODUCTION - 25,095 atoms GB
> > ---------------------------------------
> >
> > 1 x GTX680: | ns/day = 2.90   seconds/ns = 29755.05
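The two columns in these results are reciprocal views of the same rate, since seconds/ns = 86400 / (ns/day). A quick cross-check of Marek's numbers:

```python
# Cross-check the ns/day vs seconds/ns columns above: s_per_ns = 86400 / ns_per_day.
SECONDS_PER_DAY = 86400.0

def s_per_ns(ns_per_day: float) -> float:
    return SECONDS_PER_DAY / ns_per_day

# (ns/day, reported seconds/ns) pairs from the benchmark output above.
for ns_day, reported in [(80.38, 1074.90), (20.99, 4115.84), (3.87, 22323.57)]:
    # Small residuals are expected because ns/day is printed rounded.
    assert abs(s_per_ns(ns_day) - reported) < 5.0, (ns_day, reported)
    print(f"{ns_day} ns/day -> {s_per_ns(ns_day):.2f} s/ns (reported {reported})")
```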
> >
> >
> >
> >
> >
> > On Tue, 26 Feb 2013 17:14:01 +0100, Scott Le Grand
> > <varelse2005.gmail.com> wrote:
> >
> > > As an aside, go run JAC NVE in SPFP mode...
> > >
> > > If you get ~75+ ns/day, you're running at 1.05+ GHz...
> > >
> > > Otherwise, something's up. And I second what Ross is saying - just sit
> > > back and ride Pixel's Law. In the mid-term, I think I'll get JAC to
> > > 200+ ns/day with a couple of GTX Titans once I get the time to
> > > optimize GPU-to-GPU communication...
> > >
> > >
> > >
> > > On Mon, Feb 25, 2013 at 5:46 PM, Ross Walker <ross.rosswalker.co.uk>
> > > wrote:
> > >
> > >> Hi Filip,
> > >>
> > >> I think you are worrying too much here. Firstly, you are referring
> > >> to the double precision clock rate and not the single precision
> > >> clock. AMBER stopped relying on the double precision side of things
> > >> and switched to fixed point accumulation with the release of the
> > >> GTX680 and K10. Secondly, the stock single precision clock will be
> > >> faster than the K20X's, so you can expect performance to be better
> > >> than the K20X. It also has more cores active, I think - I don't have
> > >> the specs here or internet access to check right now.
> > >>
> > >> Thirdly, the boost clock. AMBER pretty much runs the entire GPU flat
> > >> out ALL the time. The boost clock is only useful, as with CPUs, when
> > >> you are only using a fraction of the cores. In the case of GPUs,
> > >> unless you are running very small atom counts this is unlikely to
> > >> happen, so even if the boost clock was supported it wouldn't do you
> > >> any good.
> > >>
> > >> In short, I wouldn't worry about it. Let's just wait and see how it
> > >> truly
> > >> performs when the "vaporware" actually turns up.
> > >>
> > >> All the best
> > >> Ross
> > >>
> > >>
> > >> On 2/25/13 2:39 PM, "filip fratev" <filipfratev.yahoo.com> wrote:
> > >>
> > >> >Hi all,
> > >> >I received some test results. Here is the comparison between
> > >> >LuxMark results obtained by a GTX660 under Linux and Windows,
> > >> >respectively:
> > >> >
> > >> >http://img22.imageshack.us/img22/1279/luxmarkubuntu1204.png
> > >> >http://img692.imageshack.us/img692/9647/luxmarkwin7.png
> > >> >
> > >> >According to these results the GTX660 works at 1071 MHz, thus at the
> > >> >boost speed, and the results between Linux and Windows are similar.
> > >> >
> > >> >However, Nvidia answered me that the GTX Titan core speed under
> > >> >Linux will be 837 MHz, and about the boost technology said this:
> > >> >"unfortunately no, boost 1.0/2.0 are only supported on windows."
> > >> >Personally, I trust the above tests :)
> > >> >If they really capped their GTX GPUs under Linux to the base clock,
> > >> >presumably only the BIOS hack option will be possible, which
> > >> >is.... :)
> > >> >
> > >> >Regards,
> > >> >Filip
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >________________________________
> > >> > From: Aron Broom <broomsday.gmail.com>
> > >> >To: filip fratev <filipfratev.yahoo.com>; AMBER Mailing List
> > >> ><amber.ambermd.org>
> > >> >Sent: Saturday, February 23, 2013 10:12 PM
> > >> >Subject: Re: [AMBER] GTX Titan was finally released
> > >> >
> > >> >Just as another note, I checked out the AMBER output from running
> > >> >on a GTX570:
> > >> >
> > >> >|------------------- GPU DEVICE INFO --------------------
> > >> >|
> > >> >| CUDA Capable Devices Detected: 1
> > >> >| CUDA Device ID in use: 0
> > >> >| CUDA Device Name: GeForce GTX 570
> > >> >| CUDA Device Global Mem Size: 1279 MB
> > >> >| CUDA Device Num Multiprocessors: 15
> > >> >| CUDA Device Core Freq: 1.46 GHz
> > >> >|
> > >> >|--------------------------------------------------------
> > >> >
> > >> >So in that case the Core Freq reported is indeed the correct one,
> > >> >even though the GTX570 has two lower clock speeds it runs at
> > >> >depending on load (810 MHz, and 101 MHz).
> > >> >
> > >> >I know with the 500 series, the available nVidia tools for Linux
> > >> >will at least allow you to set the device to maintain the highest
> > >> >clock speeds regardless of load. I have NOT done that in the above
> > >> >case, but if such a thing is possible for the 600 series, it might
> > >> >be worth looking at. Sadly the tool is only easily usable if you
> > >> >have a display connected, although if you google "Axel Kohlmeyer"
> > >> >and go to his homepage there are some suggestions on installing
> > >> >these tools on a typical server where you can fake a display.
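As an aside, the "Core Freq" field can be pulled out of an mdout header programmatically. A minimal sketch, assuming the header format shown in the GTX 570 output above:

```python
import re

# Extract the "CUDA Device ..." fields from the GPU DEVICE INFO header
# that pmemd.cuda prints at the top of mdout (format as quoted above).
def parse_gpu_info(mdout_text: str) -> dict:
    info = {}
    for line in mdout_text.splitlines():
        m = re.match(r"\|\s*CUDA Device (.+?):\s*(.+)", line)
        if m:
            info[m.group(1).strip()] = m.group(2).strip()
    return info

sample = """\
|------------------- GPU DEVICE INFO --------------------
|
| CUDA Capable Devices Detected: 1
| CUDA Device ID in use: 0
| CUDA Device Name: GeForce GTX 570
| CUDA Device Global Mem Size: 1279 MB
| CUDA Device Num Multiprocessors: 15
| CUDA Device Core Freq: 1.46 GHz
|
|--------------------------------------------------------
"""

info = parse_gpu_info(sample)
print(info["Core Freq"])  # the frequency pmemd.cuda reported at startup
```

As discussed in this thread, that number is queried before any kernels run, so on cards that clock up or down with load it may not reflect the frequency during the actual simulation.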
> > >> >
> > >> >~Aron
> > >> >
> > >> >On Sat, Feb 23, 2013 at 2:33 PM, filip fratev
> > >> ><filipfratev.yahoo.com> wrote:
> > >> >
> > >> >> Hi Ross, Aron and all,
> > >> >> Thanks for your detail answers!!
> > >> >>
> > >> >> So, it seems that nobody knows whether Nvidia supports the boost
> > >> >> speed even on the GTX680. Moreover, because the core speed is
> > >> >> wrongly (I hope) printed, in Amber 12 as well as in all benchmark
> > >> >> applications, we can see the difference only if we compare the
> > >> >> GTX680 to the K10 (1 GPU), where we see a 37% performance increase
> > >> >> (JAC), which can come only from the core/memory clock.
> > >> >>
> > >> >> Ross, please ask Nvidia about these issues.
> > >> >> I've already asked them, but I don't believe I will receive any
> > >> >> adequate answer.
> > >> >> I also asked several users, but nobody knows, and they told me
> > >> >> that Nvidia never said anything about their Boost technology under
> > >> >> Linux. Thus, at this point I think that we can trust only your
> > >> >> information.
> > >> >>
> > >> >> Regards,
> > >> >> Filip
> > >> >>
> > >> >>
> > >> >> ________________________________
> > >> >> From: Ross Walker <ross.rosswalker.co.uk>
> > >> >> To: filip fratev <filipfratev.yahoo.com>; AMBER Mailing List <
> > >> >> amber.ambermd.org>
> > >> >> Sent: Saturday, February 23, 2013 6:45 AM
> > >> >> Subject: Re: [AMBER] GTX Titan was finally released
> > >> >>
> > >> >> Hi Filip,
> > >> >>
> > >> >>
> > >> >> >As you know, I plan to purchase a few GTX Titans :)
> > >> >> >but I am not sure actually at what speed they will run: 836, 876,
> > >> >> >or 993 MHz?
> > >> >> >It seems that by default (80C target) the Titan runs under
> > >> >> >Windows only at the maximal core speed (around 1 GHz), not the
> > >> >> >boost one. It goes back to 836 only if the temperature rises
> > >> >> >above 80C, but with 100% fan speed this looks almost impossible.
> > >> >> >At least this is what I saw from the reviews.
> > >> >>
> > >> >> No idea, since I am still waiting for NVIDIA to actually send me
> > >> >> a development card to try this with. I guess the Titans will be
> > >> >> vaporware for a while. I am intrigued to know how the clock speed
> > >> >> will work, and I am waiting for NVIDIA engineering to get back to
> > >> >> me with a definitive answer. Note the Titan can also be run in two
> > >> >> modes from what I gather: one with the DP cores turned down and
> > >> >> the SP cores clocked up (gaming mode), and one where it turns on
> > >> >> all the DP cores and clocks down the single precision (CUDA mode).
> > >> >> Note AMBER was retooled for the GK104 chip to not use double
> > >> >> precision anymore. It uses a combination of single and fixed
> > >> >> precision which we worked very hard to tune to match/better the
> > >> >> SPDP accuracy. Thus it is entirely possible that one will actually
> > >> >> want to run the Titan cards in gaming mode when running AMBER. Of
> > >> >> course this is entirely speculation until I lay my hands on one.
> > >> >> The thermal window also has potential issues for 4-GPU boxes, but
> > >> >> there may end up being a hack to disable the down-clocking and
> > >> >> allow temps over 80C. Note most cards I have (GTX680s) run around
> > >> >> 90C right now. SDSC runs its machine room at 85F in order to save
> > >> >> power - since disks and CPUs don't care if the room is 85F vs 60F.
> > >> >> This might be a different story if the GPUs throttle based on
> > >> >> temperature, but I guess we'll just have to wait and see.
> > >> >>
> > >> >> >
> > >> >> >I was also horrified to see that many GTX680 (and other cards)
> > >> >> >users complain that under Linux their cards run at only about
> > >> >> >700 MHz core speed instead of 1 GHz. What is your experience with
> > >> >> >the GTX680? I was also wondering whether the GTX680 uses the
> > >> >> >boost clock during the Amber calculations or just the base one?
> > >> >> >
> > >> >>
> > >> >> I think this is just speculation. When you run AMBER with a
> > >> >> GTX680, it prints the following:
> > >> >>
> > >> >>
> > >> >> |------------------- GPU DEVICE INFO --------------------
> > >> >> |
> > >> >> | CUDA Capable Devices Detected: 1
> > >> >> | CUDA Device ID in use: 0
> > >> >> | CUDA Device Name: GeForce GTX 680
> > >> >> | CUDA Device Global Mem Size: 2047 MB
> > >> >> | CUDA Device Num Multiprocessors: 8
> > >> >> | CUDA Device Core Freq: 0.71 GHz
> > >> >> |
> > >> >> |--------------------------------------------------------
> > >> >>
> > >> >>
> > >> >>
> > >> >> But this is a query that occurs at the very beginning of a run,
> > >> >> before any CUDA kernels have been run. I believe that when
> > >> >> unloaded the 680 in Linux clocks down to 705 MHz to save power.
> > >> >> When you stress it hard it automatically clocks up the frequency.
> > >> >> I am not sure if there is a way to check this though while the
> > >> >> card is under load. Certainly the performance we see would not be
> > >> >> what it is if the clock speed were only 705 MHz. I am asking
> > >> >> NVIDIA engineering to clarify though.
> > >> >>
> > >> >> >Finally, what is the performance difference of
> > >> >> >pmemdCuda under Linux and Cygwin?
> > >> >>
> > >> >> Never tried it, and I very much doubt you'll be able to get
> > >> >> pmemd.cuda compiled under Cygwin. Cygwin emulates things through
> > >> >> the Cygwin DLL, and so you'd need a Cygwin-compatible version of
> > >> >> the NVIDIA compiler, I'd expect.
> > >> >>
> > >> >> Note we have a native Windows version of pmemd.cuda but never
> > >> >> released the binary, since the performance is about half of what
> > >> >> it is on Linux due to a bug in CUDA 4.2 under Windows that limited
> > >> >> performance. CUDA 3 showed good performance under Windows, but you
> > >> >> can't use that with AMBER 12. We haven't had time to get back to
> > >> >> looking at this with CUDA 5, unfortunately.
> > >> >>
> > >> >> All the best
> > >> >> Ross
> > >> >>
> > >> >> /\
> > >> >> \/
> > >> >> |\oss Walker
> > >> >>
> > >> >> ---------------------------------------------------------
> > >> >> | Assistant Research Professor |
> > >> >> | San Diego Supercomputer Center |
> > >> >> | Adjunct Assistant Professor |
> > >> >> | Dept. of Chemistry and Biochemistry |
> > >> >> | University of California San Diego |
> > >> >> | NVIDIA Fellow |
> > >> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
> > >> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> > >> >> ---------------------------------------------------------
> > >> >>
> > >> >> Note: Electronic Mail is not secure, has no guarantee of
> > >> >> delivery, may not be read every day, and should not be used for
> > >> >> urgent or sensitive issues.
> > >> >>
> > >> >> _______________________________________________
> > >> >> AMBER mailing list
> > >> >> AMBER.ambermd.org
> > >> >> http://lists.ambermd.org/mailman/listinfo/amber
> > >> >>
> > >> >
> > >> >
> > >> >
> > >> >--
> > >> >Aron Broom M.Sc
> > >> >PhD Student
> > >> >Department of Chemistry
> > >> >University of Waterloo
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> >
> >
> > --
> > This message was created by Opera's revolutionary e-mail client:
> > http://www.opera.com/mail/
> >
> >
>



-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
Received on Thu Mar 14 2013 - 11:00:04 PDT