Lots of sodium pentothal just elicited the response "bbbuyyyy
Teeesssllaaaa" ;-)
Seriously though, they have some leads on what the problem could be and
have several engineers investigating. There are also reports of problems
from some other codes, so it is definitely being taken seriously. We just
have to be patient: first they need to determine exactly what the problem
is, and then they can start designing the solution. Right now it is still
at the stage of working out which of the hypothesized problems is the
real one.
All the best
Ross
On 7/1/13 3:44 AM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>Hi Scott,
>any news after this wine meeting?
>
> Best,
>
> Marek
>
>On Thu, 27 Jun 2013 00:12:54 +0200, Scott Le Grand
><varelse2005.gmail.com>
>wrote:
>
>> Handed NVIDIA a defective Titan on Saturday. No word since. Having
>> dinner with the perps tonight, so we'll see what comes of it. Since I'm
>> bringing the wine, I'll be sure to spike it with sodium pentothal...
>>
>>
>>
>>
>>
>> On Wed, Jun 26, 2013 at 2:22 PM, Jonathan Gough
>> <jonathan.d.gough.gmail.com> wrote:
>>
>>> Any updates on the bug with the Titan cards?
>>>
>>>
>>> On Thu, Jun 20, 2013 at 2:05 PM, Marek Maly <marek.maly.ujep.cz> wrote:
>>>
>>> > Thanks, guys!
>>> > Now it is all clear even to me :))
>>> >
>>> > So the key problem here is that one cannot reproduce exactly the
>>> > same "resource" conditions (e.g. the state of individual CUDA cores
>>> > and memory segments) across different runs of parallel code.
>>> >
>>> > Best wishes,
>>> >
>>> > Marek
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, 20 Jun 2013 19:36:25 +0200, Ross Walker
>>> > <ross.rosswalker.co.uk> wrote:
>>> >
>>> > >
>>> > > On 6/20/13 9:57 AM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>> > >
>>> > >> On Thu, 20 Jun 2013 18:57:18 +0200, Scott Le Grand
>>> > >> <varelse2005.gmail.com> wrote:
>>> > >>
>>> > >>> You're overthinking it. Neither NAMD nor GROMACS produces
>>> > >>> deterministic outputs, because they accumulate in 32-bit single
>>> > >>> precision in an arbitrary order, rather than accumulating in a
>>> > >>> deterministic order or using an associative accumulation scheme.
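>>> > >>>
>>> > >>> A minimal sketch of that rounding effect, with hypothetical
>>> > >>> values (plain host code, not from any of the MD packages):
>>> > >>>
>>> > >>> #include <stdio.h>
>>> > >>>
>>> > >>> int main(void) {
>>> > >>>     /* Single-precision addition is not associative: summing the
>>> > >>>        same three values in two different orders gives two
>>> > >>>        different rounded results. */
>>> > >>>     float a = 1.0e8f, b = -1.0e8f, c = 1.0f;
>>> > >>>     printf("(a + b) + c = %.1f\n", (a + b) + c); /* prints 1.0 */
>>> > >>>     printf("a + (b + c) = %.1f\n", a + (b + c)); /* prints 0.0 */
>>> > >>>     return 0;
>>> > >>> }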
>>> > >>
>>> > >> OK, but what is the reason for that "ARBITRARY order"?
>>> > >>
>>> > >> Why is the order of accumulation different in each run of the
>>> > >> same code on the same machine? I would naturally assume that the
>>> > >> order of all operations would be the same in each run, unless it
>>> > >> is for some reason determined by some pseudorandom number
>>> > >> generator that is not reset, or even cannot be reset (if
>>> > >> necessary), for each run of the code.
>>> > >
>>> > > Because these calculations are NOT being run in serial. GPUs are
>>> > > massively threaded architectures running hundreds of thousands of
>>> > > threads, even when using a single GPU. These threads are dispatched
>>> > > across multiple streaming multiprocessors, and essentially things
>>> > > are executed whenever the required memory arrives. It is a VERY
>>> > > different situation from running single threaded on a CPU. I would
>>> > > suggest reading a couple of books on CUDA and GPUs; that should
>>> > > make the differences very apparent.
>>> > >
>>> > > Essentially CPUs are going the same way now; pretty much nothing
>>> > > is serial anymore. So unless you take steps to deliberately control
>>> > > the way things are rounded when an array is summed in an arbitrary
>>> > > order (either by use of things like atomic operations, or various
>>> > > syncs and locks, which make your code slow), you will always get
>>> > > different answers from different runs.
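>>> > >
>>> > > To make the two behaviors concrete, here is a minimal CUDA sketch
>>> > > (hypothetical kernel and scale factor, not AMBER's actual code):
>>> > > the float atomicAdds land in whatever order the scheduler runs the
>>> > > blocks, so their sum can change between runs, while the 64-bit
>>> > > fixed-point integer sum is associative and therefore bitwise
>>> > > reproducible.
>>> > >
>>> > > #include <cstdio>
>>> > > #include <cuda_runtime.h>
>>> > >
>>> > > #define FIXED_SCALE (1LL << 40)  // hypothetical fixed-point scale
>>> > >
>>> > > __global__ void accumulate(const float *x, int n, float *fsum,
>>> > >                            unsigned long long *isum) {
>>> > >     int i = blockIdx.x * blockDim.x + threadIdx.x;
>>> > >     if (i < n) {
>>> > >         // Order of these adds depends on block scheduling, so the
>>> > >         // rounding can differ from run to run.
>>> > >         atomicAdd(fsum, x[i]);
>>> > >         // Integer addition is associative: same result every run.
>>> > >         long long fixed = (long long)((double)x[i] * FIXED_SCALE);
>>> > >         atomicAdd(isum, (unsigned long long)fixed);
>>> > >     }
>>> > > }
>>> > >
>>> > > int main() {
>>> > >     const int n = 1 << 20;
>>> > >     float *x, *fsum;
>>> > >     unsigned long long *isum;
>>> > >     cudaMallocManaged(&x, n * sizeof(float));
>>> > >     cudaMallocManaged(&fsum, sizeof(float));
>>> > >     cudaMallocManaged(&isum, sizeof(unsigned long long));
>>> > >     for (int i = 0; i < n; ++i) x[i] = 1.0f / (float)(i + 1);
>>> > >     *fsum = 0.0f;
>>> > >     *isum = 0ULL;
>>> > >     accumulate<<<(n + 255) / 256, 256>>>(x, n, fsum, isum);
>>> > >     cudaDeviceSynchronize();
>>> > >     printf("float sum: %.10f   fixed-point sum: %.10f\n",
>>> > >            (double)*fsum, (double)(long long)*isum / FIXED_SCALE);
>>> > >     return 0;
>>> > > }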
>>> > >
>>> > > All the best
>>> > > Ross
>>> > >
>>> > > /\
>>> > > \/
>>> > > |\oss Walker
>>> > >
>>> > > ---------------------------------------------------------
>>> > > | Associate Research Professor |
>>> > > | San Diego Supercomputer Center |
>>> > > | Adjunct Associate Professor |
>>> > > | Dept. of Chemistry and Biochemistry |
>>> > > | University of California San Diego |
>>> > > | NVIDIA Fellow |
>>> > > | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
>>> > > | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>>> > > ---------------------------------------------------------
>>> > >
>>> > > Note: Electronic Mail is not secure, has no guarantee of delivery,
>>> > > may not be read every day, and should not be used for urgent or
>>> > > sensitive issues.
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> >
>>> >
>>> > --
>>> > This message was created with Opera's revolutionary e-mail client:
>>> > http://www.opera.com/mail/
>>> >
>>
>>
>>
>
>
>--
>This message was created with Opera's revolutionary e-mail client:
>http://www.opera.com/mail/
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jul 01 2013 - 08:30:02 PDT