Re: [AMBER] GTX780s

From: ET <sketchfoot.gmail.com>
Date: Sat, 13 Jul 2013 06:35:10 +0100

no worries. I will look forward to the fix.


On 13 July 2013 06:28, Ross Walker <rosscwalker.gmail.com> wrote:

> People have a VERY good clue what is going on and a fix will be
> forthcoming. You just have to appreciate that a lot of people are under
> NDAs and therefore need permission to discuss it in a public forum.
>
> For now please just accept that you do not have all the information and
> just be patient like I told people at the beginning. A LOT of very
> experienced people are working on this behind the scenes. Continued testing
> at this point will not help.
>
> Trust me, NVIDIA are taking this very seriously. If I could tell you all
> the details I would. For now please just be patient and accept that a fix
> will be forthcoming soon.
>
> Thank you.
>
> All the best
> Ross
>
>
>
> On Jul 12, 2013, at 22:14, ET <sketchfoot.gmail.com> wrote:
>
> > .tec3. As nobody seems to have a clue as to what is going on and there
> > are many theories floating about, I think it's fair enough to question
> > some of them. If you are talking to a bunch of scientists you have got to
> > expect this kind of behaviour. The high volume of emails includes
> > benchmarking by people who are interested in solving the problem and IMO
> > has been instrumental in identifying and clarifying the scale of the
> > problem. Please correct me if this is not the case...
> >
> > Additionally, as people have shelled out £££ & time on these GPUs, I feel
> > it is appropriate to ask whether NVIDIA is actually going to release a fix
> > at all. Why would they release a "fix" for perfectly working consumer-grade
> > cards targeted at gamers, so that a bunch of people effectively get a
> > massive discount that they otherwise would not have had?
> >
> > cheers
> >
> >
> >
> >
> > On 13 July 2013 04:45, ET <sketchfoot.gmail.com> wrote:
> >
> >> Hi Ross,
> >>
> >> Thanks very much for the test results. :) In my case I do not believe the
> >> temperature is an issue. "Homebrew" gaming cases usually have equal or
> >> better air cooling than server cases if configured correctly IMO, because
> >> gamers are ridiculously OTT about these things and it all comes down to
> >> the design & placement of fans in the case. I use monitorix to graph the
> >> temperatures of the system at all times and it shows the following
> >> maximum temperatures for the following components when the system is
> >> under full load, i.e. 4xZotac 780, 1x17 CPU going for it:
> >>
> >> GPUs = max 80 degrees C (when all 4 are loaded)
> >> CPU = max 50 degrees C
> >> mb = 30 degrees C
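The load temperatures listed above could also be logged per GPU straight from the driver. A minimal sketch, assuming the installed nvidia-smi supports the --query-gpu option and using the 80 degrees C figure quoted above purely as an illustrative limit. The helper names (parse_gpu_temps, gpus_over_limit, read_gpu_temps) are hypothetical and are not part of monitorix or AMBER:

```python
import subprocess


def parse_gpu_temps(csv_text):
    """Parse 'index, temperature' CSV lines into {gpu_index: temp_in_C}."""
    temps = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def gpus_over_limit(temps, limit_c=80):
    """Return the sorted indices of GPUs hotter than limit_c (assumed limit)."""
    return sorted(i for i, t in temps.items() if t > limit_c)


def read_gpu_temps():
    """Query the driver once; requires nvidia-smi on PATH (assumption)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_temps(result.stdout)
```

Calling gpus_over_limit(read_gpu_temps()) at intervals during a run would flag any card that exceeds the chosen limit while all four are loaded.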
> >>
> >>
> >> What are the temperatures that you get for the components in the EXXACT
> >> cases?
> >>
> >> I'm starting to firmly believe that the Zotac card you have got there is
> >> working and will pass all the cellulose NPT tests with no problems,
> >> similar to the 1 working Zotac that I have. However, if you tested a whole
> >> load of Zotacs, a number would pass and a number would fail due to
> >> differences in the manufacturing process. The only way to find out is if
> >> someone with more of the Zotacs tests them.
> >>
> >>
> >>
> >>
> >> On 12 July 2013 16:53, Ross Walker <ross.rosswalker.co.uk> wrote:
> >>
> >>> Hi All,
> >>>
> >>> Ok, so overnight I repeated the JAC NPT tests on the two 4 x GTX780
> >>> machines I have access to. One of these has:
> >>>
> >>> http://tinyurl.com/prxlwy6 Zotac GTX780 ZT-70201-10P
> >>>
> >>>
> >>> and the other has:
> >>>
> >>> http://tinyurl.com/k3n2rqb EVGA GeForce GTX780 3GB GDDR5 384bit,
> >>> Dual-Link DVI-I, DVI-D, HDMI, DP, SLI Ready Graphics Card
> >>> (03G-P4-2781-KR)
> >>>
> >>>
> >>> BOTH machines have Supermicro motherboards in certified cases with
> >>> validated and ducted cooling. One is rack mount and the other is a
> >>> desktop - essentially the models shown here:
> >>> http://exxactcorp.com/index.php/solution/solu_list/65
> >>>
> >>> I ran JAC NPT with the following input:
> >>>
> >>> Typical Production MD NVT
> >>> &cntrl
> >>> ntx=5, irest=1,
> >>> ntc=2, ntf=2,
> >>> nstlim=1000000,
> >>> ntpr=1000, ntwx=5000,
> >>> ntwr=100000,
> >>> dt=0.002, cut=8.,
> >>> ntt=1, tautp=10.0,
> >>> temp0=300.0,
> >>> ntb=2, ntp=1, taup=10.0,
> >>> ioutfm=1,
> >>> /
> >>>
> >>> AMBER 12 with all the latest public updates
> >>>
> >>>
> >>> nvcc: NVIDIA (R) Cuda compiler driver
> >>> Copyright (c) 2005-2012 NVIDIA Corporation
> >>> Built on Fri_Sep_21_17:28:58_PDT_2012
> >>> Cuda compilation tools, release 5.0, V0.2.1221
> >>>
> >>>
> >>> NVRM version: NVIDIA UNIX x86_64 Kernel Module 319.17 Thu Apr 25 22:45:49 PDT 2013
> >>> GCC version: gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)
> >>>
> >>> Running on CentOS / RHEL 6
> >>>
> >>>
> >>> I ran 10 sequential runs on each of the 4 GPUs in each machine
> >>> simultaneously giving me a total of 80 output files all of which were
> >>> identical and ended after 1 million steps with the following output:
> >>>
> >>>  NSTEP =  1000000   TIME(PS) =    2006.000  TEMP(K) =   300.01  PRESS =   -47.5
> >>>  Etot   =    -58245.9096  EKtot   =     14421.9385  EPtot      =    -72667.8481
> >>>  BOND   =       486.0307  ANGLE   =      1238.6417  DIHED      =       972.1042
> >>>  1-4 NB =       558.3709  1-4 EEL =      6793.5638  VDWAALS    =      8465.9200
> >>>  EELEC  =    -91182.4793  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >>>  EKCMT  =      6392.4186  VIRIAL  =      6633.3942  VOLUME     =    234849.5754
> >>>                                                     Density    =         1.0218
> >>>
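One way to confirm that a batch of runs really is identical is to compare the final energy record of every mdout file exactly as printed. A minimal sketch, assuming the usual "LABEL = value" layout of mdout energy records shown above. The function names are hypothetical, and this is not necessarily how the 80 files above were compared:

```python
import re

# One "LABEL = value" pair; labels may contain a single space, e.g. "1-4 NB".
PAIR = re.compile(r"([A-Za-z0-9()\-]+(?: [A-Za-z]+)?)\s*=\s*(-?\d+(?:\.\d+)?)")


def energy_record(mdout_text, nstep):
    """Return {label: value-as-printed} for the first record at step nstep."""
    record = {}
    for line in mdout_text.splitlines():
        pairs = PAIR.findall(line)
        if not record:
            # Skip everything until the NSTEP line of the requested step.
            if pairs and pairs[0][0] == "NSTEP" and int(pairs[0][1]) == nstep:
                record.update(pairs)
        elif pairs:
            record.update(pairs)
        else:
            break  # a blank or dashed line ends the record
    return record


def all_identical(records):
    """True if every record matches the first one, compared as strings."""
    records = list(records)
    return all(r == records[0] for r in records[1:])
```

Reading each output file into a string, calling energy_record(text, 1000000) on it and passing the results to all_identical() gives a single True/False for the whole batch. Values are compared as printed strings, so a difference in the last printed digit counts as a mismatch.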
> >>>
> >>>
> >>> So right now the GTX780s look ok to me but it is very possible that
> >>> they don't work well in traditional consumer cases.
> >>>
> >>> I am going to repeat the tests with Cellulose NPT.
> >>>
> >>> All the best
> >>> Ross
> >>>
> >>> /\
> >>> \/
> >>> |\oss Walker
> >>>
> >>> ---------------------------------------------------------
> >>> | Associate Research Professor |
> >>> | San Diego Supercomputer Center |
> >>> | Adjunct Associate Professor |
> >>> | Dept. of Chemistry and Biochemistry |
> >>> | University of California San Diego |
> >>> | NVIDIA Fellow |
> >>> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
> >>> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> >>> ---------------------------------------------------------
> >>>
> >>> Note: Electronic Mail is not secure, has no guarantee of delivery, may
> >>> not be read every day, and should not be used for urgent or sensitive
> >>> issues.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> AMBER mailing list
> >>> AMBER.ambermd.org
> >>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>
> >>
> >>
> >>
> >> On 12 July 2013 23:18, Marek Maly <marek.maly.ujep.cz> wrote:
> >>
> >>> Yes of course,
> >>>
> >>> it was just my reaction to some recent Titan overheating hypotheses
> >>> in connection with Ross's hypothesis about his "super-cooled" working
> >>> GTX 780s versus some GTX 780s from Amber users which do not work properly
> >>> in spite of the fact that they are of the same type (ZOTAC) ...
> >>>
> >>> Anyway, my opinion is also that the Titan/cuFFT issue is rather more
> >>> complicated than simply a memory (or some other GPU part) overheating
> >>> problem.
> >>>
> >>> BTW, Scott's latest info about some preliminary optimistic cuFFT results
> >>> with Titans with downclocked memory and also with a heatsink seems
> >>> promising, although probably no Amber tests have been done with such
> >>> modified GPUs yet.
> >>>
> >>> So OK let's wait,
> >>>
> >>> Best,
> >>>
> >>> Marek
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, 12 Jul 2013 22:50:55 +0200 <tec3.utah.edu> wrote:
> >>>
> >>>>
> >>>>> Hi Ross,
> >>>>> would be interesting if you can do the same test
> >>>>> with Titan GPU using the same super-cooled machines
> >>>>> as you are using for testing of GTX 780 but perhaps, you
> >>>>> or Scott already tested Titans in such machines or it was some
> >>>>> normal consumer cases ?
> >>>>
> >>>> I think Ross and Scott have been extremely clear on this in the
> >>>> incredible volume of e-mail that has come through this list on this
> >>>> topic. There clearly is a cuFFT problem and Titan's hardware is also
> >>>> suspect. I would guess the skepticism of Scott and Ross will be apparent
> >>>> regardless of whether or not you immerse the cards in liquid N2...
> >>>>
> >>>> 99.5% likely a Titan hardware issue:
> >>>>
> >>>> http://archive.ambermd.org/201306/0007.html
> >>>>
> >>>> Continuing to push the AMBER developers will not make things move
> >>>> faster, and perhaps could make them move even slower. Patience please as
> >>>> we wait on nVidia to see if a cuFFT fix can emerge and better probe these
> >>>> issues.
> >>>>
> >>>> When Ross and Scott know, I am sure they will inform the list...
> >>>>
> >>>> --tec3
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
Received on Fri Jul 12 2013 - 23:00:03 PDT