Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Ross Walker <rosscwalker.gmail.com>
Date: Wed, 26 Jun 2013 16:02:48 -0700

I know what not to drink then. ;-)

I wonder how many of the NVIDIA people are reading this list (in non-digest form). ;-)



On Jun 26, 2013, at 15:12, Scott Le Grand <varelse2005.gmail.com> wrote:

> Handed NVIDIA a defective Titan on Saturday. No word since. Having dinner
> with the perps tonight so we'll see what comes of it. Since I'm bringing
> the wine, I'll be sure to spike it with sodium pentothal...
>
>
>
>
>
> On Wed, Jun 26, 2013 at 2:22 PM, Jonathan Gough <jonathan.d.gough.gmail.com> wrote:
>
>> Any updates on the bug with the Titan cards?
>>
>>
>> On Thu, Jun 20, 2013 at 2:05 PM, Marek Maly <marek.maly.ujep.cz> wrote:
>>
>>> Thanks, guys!
>>> Now it is all clear, even to me :))
>>>
>>> So the key problem here is that one cannot reproduce exactly the same
>>> "resource" conditions (e.g. the state of individual CUDA cores and memory
>>> segments) during different runs of parallel code.
>>>
>>> Best wishes,
>>>
>>> Marek
>>>
>>>
>>>
>>>
>>>
>>> On Thu, 20 Jun 2013 19:36:25 +0200, Ross Walker <ross.rosswalker.co.uk> wrote:
>>>
>>>>
>>>> On 6/20/13 9:57 AM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>>>
>>>>> On Thu, 20 Jun 2013 18:57:18 +0200, Scott Le Grand <varelse2005.gmail.com> wrote:
>>>>>
>>>>>> You're overthinking it. Neither NAMD nor GROMACS produces deterministic
>>>>>> outputs because they accumulate in 32-bit single precision in an
>>>>>> arbitrary order rather than do so in a deterministic order or use an
>>>>>> associative
>>>>>
>>>>> OK, but what is the reason for that "ARBITRARY order"?
>>>>>
>>>>> Why is the order in which the numbers are accumulated different in each
>>>>> run of the same code on the same machine? I would naturally assume that
>>>>> the order of all operations is the same in each run, unless it is for
>>>>> some reason determined by a pseudorandom number generator that is not
>>>>> reset (or cannot be reset) for each run of the code.
>>>>
>>>> Because these calculations are NOT being run in serial. GPUs are massively
>>>> threaded architectures running hundreds of thousands of threads, even when
>>>> using a single GPU. These threads are dispatched across multiple streaming
>>>> compute units, and essentially things are executed whenever the required
>>>> memory arrives. It is a VERY different situation from running
>>>> single-threaded on CPUs. I would suggest reading a couple of books on CUDA
>>>> and GPUs; that should make the differences very apparent.
>>>>
>>>> Essentially, CPUs are going the same way now; pretty much nothing is
>>>> serial anymore. So unless you take deliberate steps to control the way
>>>> things are rounded when an array is summed in an arbitrary order (either
>>>> by use of things like atomic operations, or various syncs and locks, which
>>>> make your code slow), you should expect different answers from different
>>>> runs.
>>>>
>>>> All the best
>>>> Ross
>>>>
>>>> /\
>>>> \/
>>>> |\oss Walker
>>>>
>>>> ---------------------------------------------------------
>>>> | Associate Research Professor |
>>>> | San Diego Supercomputer Center |
>>>> | Adjunct Associate Professor |
>>>> | Dept. of Chemistry and Biochemistry |
>>>> | University of California San Diego |
>>>> | NVIDIA Fellow |
>>>> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
>>>> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>>>> ---------------------------------------------------------
>>>>
>>>> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>>>> be read every day, and should not be used for urgent or sensitive issues.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Jun 26 2013 - 16:30:02 PDT
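
The point made in the quoted thread, that a single-precision sum depends on the order in which the terms arrive, can be illustrated with a minimal sketch. It is not taken from AMBER, NAMD, or GROMACS. The helper names f32, sum_f32, and sum_fixed and the 2**20 scale factor are hypothetical, and shuffling a list merely stands in for GPU threads delivering their contributions in a different order on each run. The fixed-point variant is an assumption of this sketch: it is one common way to make accumulation independent of order, shown only for contrast.

```python
import random
import struct


def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_f32(values):
    """Accumulate in single precision, rounding after every addition."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def sum_fixed(values, scale=2**20):
    """Convert each term to a scaled integer before adding.

    Integer addition is exact and associative, so the result cannot
    depend on the order of the terms (hypothetical scale factor).
    """
    return sum(round(v * scale) for v in values) / scale


# Same three numbers, two orders: float32 addition is not associative.
print(sum_f32([1e8, -1e8, 1.0]))  # (1e8 - 1e8) + 1
print(sum_f32([1e8, 1.0, -1e8]))  # (1e8 + 1) - 1e8, the 1 is rounded away

# Stand-in for threads finishing in a different order on each run.
rng = random.Random(0)
values = [rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(0, 6) for _ in range(1000)]
for run in range(3):
    order = values[:]
    random.Random(run).shuffle(order)
    # The float32 sum may change from run to run; the fixed-point sum does not.
    print(run, sum_f32(order), sum_fixed(order))
```

In the first pair, the 1.0 is lost when it is added to 1e8 before the cancellation, because neighbouring 32-bit floats near 1e8 are 8 apart. The fixed-point version trades range and some per-term precision for a result that is identical for every ordering.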