Hi Jonathon,
Could you let me know the make and model of the motherboard you are using
to run your 4x GeForce 780s, please? :)
br,
g
On 20 June 2013 17:03, Marek Maly <marek.maly.ujep.cz> wrote:
> OK, so if I understood correctly, the "nondeterministic" behavior should
> be attributed to the Titan GPUs, not to any code (even NAMD :)) ), and
> each code itself is always deterministic, as I assumed.
>
> On the other hand, in Titan (or all) GPUs there naturally exist some
> spontaneous, truly random processes which might be "modulated" by the
> given code or job. E.g. in Amber some jobs increase the probability that
> these "chaotic GPU processes" will be strengthened and will influence the
> calculation (some kind of "resonance", e.g. in the JAC case), while another
> job or a completely different code (NAMD) does not amplify that "natural
> GPU roulette", so the eventual influence of these random processes on the
> calculation (e.g. the frequency of bit flipping) is very low.
>
> It is perhaps something like:
>
> "genetic predisposition" + proper "starter" = "occurrence of the given
> disease" ?
>
>
> OK, anyway, thanks for your effort, and let's hope that Amber is important
> to NVIDIA not only because of the "Tesla/Amber" market. In any case, the
> fact that the Titan issue somehow affects cuFFT (and thus also Amber
> calculations) should, in my opinion, be sufficient reason for NVIDIA to
> solve this problem, as cuFFT is perhaps used in many other applications.
> It is also possible, though, that in many programs the "amplitude" of
> these errors is still within the acceptable tolerance, which is
> unfortunately not the case for Amber.
>
> Best,
>
> Marek
>
> On Thu, 20 Jun 2013 17:25:05 +0200, Scott Le Grand <varelse2005.gmail.com>
> wrote:
>
> > "How is it possible to run e.g. the JAC tests on Titan several times and
> > obtain a different result in each case (errors at different stages of
> > the calculation, or different final results)? Where is that "roulette"
> > hidden here???"
> >
> > The nondeterministic behavior I'm seeing from Titan is enough to throw
> > off simulations (D.E. Shaw estimated that the worst error one can
> > tolerate is 1e-5; I'm seeing 1e-3 in the IPS repro I found last week,
> > because random individual bonded interactions are going AWOL every 50K
> > iterations). But it's also very hard to detect unless one is running an
> > algorithm with deterministic output, because the chaos of a
> > nondeterministic algorithm is more than enough to obscure it.
> >
> > As to what is causing it, I have no idea at this point. Titan seems to
> > have texture issues similar to those of GTX4xx and GTX5xx, but there
> > seems to be something more going on here. And that's really hard to
> > diagnose, let alone fix. I've tried a multitude of dirty tricks to try
> > to convince the GPU to behave; nothing works. Give NVIDIA time here.
> > They'll do what it takes to make things right. AMBER is too important to
> > them to do otherwise.
> >
> > Scott
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Jun 20 2013 - 10:00:05 PDT