Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: filip fratev <filipfratev.yahoo.com>
Date: Wed, 5 Jun 2013 05:23:41 -0700 (PDT)

Hi Marek,
I updated to bugfix 18 and see a clear improvement in stability.

>Did you manage to finish both JAC tests (NVE/NPT) twice
>on your TITAN_1 with reproducible results?

Yes I did. I am able to finish all tests without any problems on both TITAN_1 and TITAN_0. I ran JAC (NVE/NPT) six times and Cellulose twice.
All NPT tests on both TITAN_1 and TITAN_0 are reproducible! However, I still have a problem with TITAN_0 in the NVE tests: 50% of the runs produced a small difference in Etot, in the range of 0.xxxx. My monitor is connected to TITAN_0. I cannot swap the cards because I am testing remotely, but if you are able to test this (probably stupid) hypothesis, that would be great!
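
A minimal sketch of how such a run-to-run check could be scripted, assuming the two repeats are available as mdout text and that each energy record consists of an "NSTEP =" line followed by an "Etot =" line, as in the outputs quoted further down in this thread. The function names etot_by_step and first_divergence are hypothetical and not part of Amber. Values are compared as printed strings, since "reproducible" here means identical output:

```python
import re

# Fields of an mdout energy record (layout assumed from the outputs in this thread).
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
ETOT_RE = re.compile(r"Etot\s*=\s*(\S+)")


def etot_by_step(mdout_text):
    """Map each printed step to its Etot field, kept as the printed string."""
    energies = {}
    step = None
    for line in mdout_text.splitlines():
        match = NSTEP_RE.search(line)
        if match:
            step = int(match.group(1))
            continue
        match = ETOT_RE.search(line)
        if match and step is not None:
            # Keep the first record for a step; the averages section at the
            # end of the file repeats the final NSTEP value.
            energies.setdefault(step, match.group(1))
            step = None
    return energies


def first_divergence(run_a, run_b):
    """Return the first common step at which Etot differs, or None."""
    for step in sorted(set(run_a) & set(run_b)):
        if run_a[step] != run_b[step]:
            return step
    return None
```

Overflowed fields printed as "**************" are compared like any other string, so a blown-up run shows up as a divergence at the step where the stars first appear.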

>Anyway, which motherboard do you have?
I use a GIGABYTE Z77X-UP7 motherboard.

Regards,
Filip



________________________________
 From: Marek Maly <marek.maly.ujep.cz>
To: filip fratev <filipfratev.yahoo.com>; AMBER Mailing List <amber.ambermd.org>
Sent: Wednesday, June 5, 2013 1:33 PM
Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?
 

Hi Filip,

this is interesting information.

Did you manage to finish both JAC tests (NVE/NPT) twice
on your TITAN_1 with reproducible results?

So what about trying to swap the GPUs between the PCI slots? I will try it.

Anyway, which motherboard do you have?
I have an ASUS P9X79 PRO.

BTW, the experiment with my system that I announced yesterday
again finished OK only on TITAN_1 and failed on TITAN_0 (as usual, the run
crashed) in the simultaneous GPU run (both GPUs working at the same time),
but surprisingly also in the subsequent single run (just TITAN_0), although
this GPU had previously done more than 750K steps on this system without
any problems ...
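
A minimal sketch of how the single-card and simultaneous runs could be pinned to a given GPU, assuming the CUDA_VISIBLE_DEVICES environment variable is used to select the device and that TITAN_0/TITAN_1 correspond to device indices 0 and 1 (the thread does not say how the cards were actually selected). The helper names gpu_run and run_simultaneously and all input/output file names are hypothetical:

```python
import os
import subprocess


def gpu_run(gpu_id, tag, base_env=None):
    """Build the command line and environment for one pmemd.cuda run on one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # Expose only the chosen card to the process (assumed selection method).
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    cmd = [
        "pmemd.cuda", "-O",
        "-i", "mdin", "-p", "prmtop", "-c", "inpcrd",  # hypothetical input names
        "-o", f"mdout.{tag}", "-r", f"restrt.{tag}", "-x", f"mdcrd.{tag}",
    ]
    return cmd, env


def run_simultaneously(gpu_ids, round_name):
    """Start one run per GPU at the same time and return their exit codes."""
    procs = []
    for gpu_id in gpu_ids:
        cmd, env = gpu_run(gpu_id, f"TITAN_{gpu_id}.{round_name}")
        procs.append(subprocess.Popen(cmd, env=env))
    return [proc.wait() for proc in procs]
```

Calling run_simultaneously([0, 1], "ROUND_1") would load both cards at once, while run_simultaneously([0], "ROUND_1") gives the single-card case; a non-zero exit code marks a crashed run.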

Uf ...

  M.






On Wed, 05 Jun 2013 11:12:54 +0200 filip fratev <filipfratev.yahoo.com> wrote:

> Hi all,
> To me it is very strange that only (or mainly?) the TITAN_0 cards are
> problematic (results not identical). I didn't apply any new patches (I
> still use patches up to 15) and I use driver 313.26.
> My TITAN_1 is OK, i.e. it gives reproducible results, and so does Marek's,
> but TITAN_0 does not?
>
>
> Regards,
> Filip
>
>
> ________________________________
>  From: Marek Maly <marek.maly.ujep.cz>
> To: AMBER Mailing List <amber.ambermd.org>
> Sent: Wednesday, June 5, 2013 1:20 AM
> Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - 
> memtestG80 - UNDERclocking in Linux ?
>
> Hi Scott,
>
> thanks for update.
>
> I just got the idea to simulate again, with the current config
> (driver 319.23, Amber12 bugfix 18 applied, cuda 5.0),
> the system on which my TITANs originally failed,
> which was the reason I started this
> "threaaaaaaaaaaaaaaaaaaaaaad" :))
>
> And what a surprise, the simulation seems to go well
> (I am now above 750K steps) even on my "less reliable"
> card, TITAN_0. So it seems that bugfix 18 helped here.
>
> I will try to use this system (protein + TIP3P water, 114852 atoms, NPT,
> ntt=3) for 100K reproducibility tests before I go to sleep.
>
> If I confirm reproducibility here, it might be a good idea to test
> systematically the hypothesis that, at least for PME calculations, the
> probability of a crash or of irreproducible results increases
> significantly as the size (number of atoms) of the simulated system
> decreases (see my and ET's results for JAC versus FACTOR_IX). If this is
> confirmed, it could help with the eventual "debugging", and of course it
> would also be good news for the whole "Amber/Titan club", as Titan/K20
> GPUs are indeed supposed to help especially with simulations of bigger
> systems (say 100k atoms and more), while for smaller ones the GTX 580/680
> are still acceptable solutions.
>
>    So let see ...
>
>          M.
>
>
>
>
>
>
>
>
>
> On Tue, 04 Jun 2013 22:36:00 +0200 Scott Le Grand <varelse2005.gmail.com> wrote:
>
>> It's harder to get a failure out of GB in Titan, but it does happen for
>> me as well...
>>
>> I am now running the GB tests on K20.  No failures observed yet.  Doesn't
>> exactly prove this is hardware, but it's really making it hard to make a
>> case that it isn't...
>>
>>
>>
>> On Tue, Jun 4, 2013 at 6:23 AM, ET <sketchfoot.gmail.com> wrote:
>>
>>> 100k nucleosome test = identical results:
>>>
>>>      A V E R A G E S  O V E R  100000 S T E P S                      A V E R A G E S  O V E R  100000 S T E P S
>>>
>>>  NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =  310.0            NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =  310.0
>>>  Etot  =    -66600.0926  EKtot  =    19654.9595  EPtot               Etot  =    -66600.0926  EKtot  =    19654.9595  EPtot
>>>  BOND  =      5795.1298  ANGLE  =    13672.2739  DIHED               BOND  =      5795.1298  ANGLE  =    13672.2739  DIHED
>>>  1-4 NB =      5612.4805  1-4 EEL =      1436.2790  VDWAALS          1-4 NB =      5612.4805  1-4 EEL =      1436.2790  VDWAALS
>>>  EELEC  =    -11449.2413  EGB    =  -105134.8815  RESTRAINT          EELEC  =    -11449.2413  EGB    =  -105134.8815  RESTRAINT
>>>  EAMBER (non-restraint)  =    -86607.8501                            EAMBER (non-restraint)  =    -86607.8501
>>>  ------------------------------------------------------------        ------------------------------------------------------------
>>>
>>>
>>>
>>> On 4 June 2013 12:39, Marek Maly <marek.maly.ujep.cz> wrote:
>>>
>>> > Hi,
>>> >  here are my results from the "NTPR" experiment:
>>> >
>>> >
>>> > Total energy at step 100 000 reported for ROUND_1 and ROUND_2
>>> > (driver 319.23, Amber12 bugfix 18 applied, cuda 5.0) (In all cases)
>>> >
>>> > GTX580 (NTPR=1000)
>>> > -66801.3274
>>> > -66801.3274
>>> >
>>> > TITAN_0 (NTPR=1)
>>> > -66854.0492
>>> > -66802.4419
>>> >
>>> > TITAN_1 (NTPR=1)
>>> >  -66858.7444
>>> >  -66858.7444
>>> >
>>> >
>>> >        M.
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, 04 Jun 2013 06:14:28 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
>>> >
>>> > > Hi Scott,
>>> > >
>>> > > I am sending again my very first tests/table (see attached), where
>>> > > I also ran GTX 580/GTX 680 tests as a control. As you can see there,
>>> > > I obtained perfect reproducibility on those GTX cards, but also on
>>> > > my second TITAN card (TITAN_1), for NUCLEOSOME! But that was with
>>> > > driver 319.17 (and also before bugfix 18).
>>> > >
>>> > > Now I will try my Titans again with ntpr=1, as you wish
>>> > > (driver 319.23, Amber12 bugfix 18 applied, cuda 5.0).
>>> > >
>>> > > Simultaneously I will repeat this test on the GTX 580 with ntpr=1000
>>> > > (driver 319.23, Amber12 bugfix 18 applied, cuda 5.0).
>>> > >
>>> > > BTW, I also experimented a bit. First I took some settings from
>>> > > NUCLEOSOME (e.g. igb=5, ntt=1/3, saltcon=0.1, tautp=1.0 + restraints)
>>> > > and used them for TRP cage and myoglobin, assuming that these
>>> > > parameters, which differ between NUCLEOSOME and TRP + MYO, would
>>> > > affect the TRP + MYO reproducibility.
>>> > >
>>> > > This was not confirmed, i.e. TRP + MYO were still perfectly
>>> > > reproducible.
>>> > >
>>> > > So then (to be sure) I did the opposite experiment and used the TRP
>>> > > mdin file for NUCLEOSOME to see whether it influences the NUCLEOSOME
>>> > > reproducibility, but in agreement with the "TRP-MYO" tests,
>>> > > NUCLEOSOME was again irreproducible ...
>>> > >
>>> > > So let's see the ntpr tests.
>>> > >
>>> > >    M.
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > On Tue, 04 Jun 2013 04:51:08 +0200 Scott Le Grand <varelse2005.gmail.com> wrote:
>>> > >
>>> > >> Update: The nucleosome GB irreproducibility is weird.  It goes away
>>> > >> on my Titan if I set ntpr to 1 (was trying to find the offending
>>> > >> energy component that diverges first).  Can you guys try this on
>>> > >> your machines?  I think this might be SW...
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >> On Mon, Jun 3, 2013 at 1:18 PM, ET <sketchfoot.gmail.com> wrote:
>>> > >>
>>> > >>> Hi Scott & Ross,
>>> > >>>
>>> > >>> I take it you will post to this thread once a fix has been found? :)
>>> > >>>
>>> > >>> br,
>>> > >>> g
>>> > >>>
>>> > >>>
>>> > >>> On 3 June 2013 20:31, Marek Maly <marek.maly.ujep.cz> wrote:
>>> > >>>
>>> > >>> > OK,
>>> > >>> > I just took a deep breath and started to pray :))
>>> > >>> >
>>> > >>> > BTW, the difference between the GB results for TRPcage/myoglobin
>>> > >>> > (perfectly reproducible) and Nucleosome (irreproducible) might be
>>> > >>> > connected with some differences in the mdin parameters:
>>> > >>> >
>>> > >>> > TRPcage/myoglobin (igb=1, ntt=3) versus Nucleosome (igb=5, ntt=1).
>>> > >>> > The Nucleosome simulation also uses a restraint:
>>> > >>> >
>>> > >>> > RESTRAIN DNA
>>> > >>> > 0.1
>>> > >>> > RES 1 294
>>> > >>> > END
>>> > >>> > END
>>> > >>> >
>>> > >>> > I will experiment here to learn which parameter is responsible for
>>> > >>> > the irreproducible Nucleosome results.
>>> > >>> >
>>> > >>> >    M.
>>> > >>> >
>>> > >>> >
>>> > >>> >
>>> > >>> >
>>> > >>> >
>>> > >>> > On Mon, 03 Jun 2013 21:17:23 +0200 Ross Walker <ross.rosswalker.co.uk> wrote:
>>> > >>> >
>>> > >>> > > Hi Marek,
>>> > >>> > >
>>> > >>> > > To be honest I would just take a deep breath and give us some
>>> > >>> > > time to figure out what is going on with the Titan and work
>>> > >>> > > around it. Hopefully this won't take too long and we can have a
>>> > >>> > > patch out shortly.
>>> > >>> > >
>>> > >>> > > All the best
>>> > >>> > > Ross
>>> > >>> > >
>>> > >>> > >
>>> > >>> > >
>>> > >>> > > On 6/3/13 11:47 AM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>> > >>> > >
>>> > >>> > >> Thanks Scott!
>>> > >>> > >>
>>> > >>> > >> It sounds to me like "Of course you can win the golden treasure
>>> > >>> > >> if you survive Russian roulette first ..."
>>> > >>> > >>
>>> > >>> > >> It seems that the difference in reliability for scientific
>>> > >>> > >> calculations between Teslas and "equivalent" stock GTXs is now
>>> > >>> > >> (with the GK110 chip) clearly bigger. I am curious how the
>>> > >>> > >> GTX 780 will compare to the Titans.
>>> > >>> > >>
>>> > >>> > >> So let's hope that, in the worst case, downclocking the Titans
>>> > >>> > >> might solve the problem.
>>> > >>> > >>
>>> > >>> > >> BTW, what is the working temperature of your K20c? My Titans run
>>> > >>> > >> under 80°C (ca. 60% fan utilization). For the older cards
>>> > >>> > >> (GTX 680/580 ...) this temperature should be OK, but maybe for
>>> > >>> > >> the GK110 it is already too high to ensure zero
>>> > >>> > >> "bit fluctuations".
>>> > >>> > >>
>>> > >>> > >> cuFFT may be responsible for the crashes and maybe also for some
>>> > >>> > >> of the irreproducibility, but the irreproducibility will also
>>> > >>> > >> have some other source, as suggested by the NUCLEOSOME GB test,
>>> > >>> > >> where perhaps no FFT is involved (just the real-space
>>> > >>> > >> calculation)?
>>> > >>> > >>
>>> > >>> > >> So thanks for the moment, and please let us know when you make
>>> > >>> > >> some progress.
>>> > >>> > >>
>>> > >>> > >>
>>> > >>> > >>        M.
>>> > >>> > >>
>>> > >>> > >>
>>> > >>> > >>
>>> > >>> > >> On Mon, 03 Jun 2013 20:12:04 +0200 Scott Le Grand <varelse2005.gmail.com> wrote:
>>> > >>> > >>
>>> > >>> > >>> Addressing Divi's two points:
>>> > >>> > >>>
>>> > >>> > >>> 1. We're trying to find a way to do this...
>>> > >>> > >>>
>>> > >>> > >>> 2. I am extremely paranoid and while I would still use the
>>> > >>> > >>> Titans for development and testing, I would also currently do
>>> > >>> > >>> my publishable runs on GK104 GPUs or K20s.  Given that, if
>>> > >>> > >>> you're comfortable with nondeterministic execution ala GROMACS,
>>> > >>> > >>> ACEMD, and NAMD, what's going on here is seemingly no worse.
>>> > >>> > >>> I'm *not* comfortable with that myself and I intend to find a
>>> > >>> > >>> fix or workaround like we did a couple years ago with GTX4xx
>>> > >>> > >>> and GTX5xx.  So your best strategy might just be to wait a week
>>> > >>> > >>> or two and see what comes of the bug hunt.
>>> > >>> > >>>
>>> > >>> > >>> Marek et al. if these GPU tests are failing on the Titans, then
>>> > >>> > >>> by all means return them without hesitation, but I don't think
>>> > >>> > >>> consumer level GPUs are tested with the same level of rigor as
>>> > >>> > >>> Teslas.  The upside is you get 30% better performance for 1/3
>>> > >>> > >>> the price.  The downside is that IMO you should carefully
>>> > >>> > >>> validate them before using them.  What I'm seeing here looks
>>> > >>> > >>> like single bit differences at the low-order bits that cause a
>>> > >>> > >>> tiny fluctuation that ultimately mushrooms and diverges the
>>> > >>> > >>> whole shebang along with occasional crashes.  The crashes seem
>>> > >>> > >>> to occur in cuFFT somewhere.  I have yet to see divergence
>>> > >>> > >>> there.
>>> > >>> > >>>
>>> > >>> > >>> Scott
>>> > >>> > >>>
>>> > >>> > >>>
>>> > >>> > >>> On Mon, Jun 3, 2013 at 9:42 AM, Marek Maly <marek.maly.ujep.cz> wrote:
>>> > >>> > >>>
>>> > >>> > >>>> Hi,
>>> > >>> > >>>> so here are my NUCLEOSOME test results. All tests finished,
>>> > >>> > >>>> although TITAN_0/ROUND_2 ended with "****" energies (the ***
>>> > >>> > >>>> records start at the 75K step, so I am surprised that the test
>>> > >>> > >>>> ran to the end). All the results are irreproducible
>>> > >>> > >>>> (driver 319.23, Amber12 bugfix 18 applied, cuda 5.5).
>>> > >>> > >>>> I will repeat it with CUDA 5.0.
>>> > >>> > >>>>
>>> > >>> > >>>>  M.
>>> > >>> > >>>>
>>> > >>> > >>>> >>>>>> TITAN_0
>>> > >>> > >>>>
>>> > >>> > >>>>  ROUND_1
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =   310.60  PRESS =    0.0
>>> > >>> > >>>>  Etot   =    -66843.8345  EKtot   =    19690.5156  EPtot     =    -86534.3502
>>> > >>> > >>>>  BOND   =      5887.3611  ANGLE   =    13673.5215  DIHED     =     16941.7678
>>> > >>> > >>>>  1-4 NB =      5576.6911  1-4 EEL =     1371.5924  VDWAALS   =    -13647.8461
>>> > >>> > >>>>  EELEC  =    -14410.1252  EGB     =  -102286.9459  RESTRAINT =       359.6331
>>> > >>> > >>>>  EAMBER (non-restraint)  =    -86893.9832
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  ROUND_2
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =*********  PRESS =    0.0
>>> > >>> > >>>>  Etot   = **************  EKtot   = **************  EPtot     =   4279668.7807
>>> > >>> > >>>>  BOND   =        -0.0000  ANGLE   =  4681740.3488  DIHED     =     67661.6797
>>> > >>> > >>>>  1-4 NB =        -0.0000  1-4 EEL =       -2.0373  VDWAALS   =       244.1012
>>> > >>> > >>>>  EELEC  =     72548.4049  EGB     =  -542523.7166  RESTRAINT =        -0.0000
>>> > >>> > >>>>  EAMBER (non-restraint)  =  4279668.7807
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>> STARS from the 75k step ...
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>> >>>>>> TITAN_1
>>> > >>> > >>>>
>>> > >>> > >>>>  ROUND_1
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =   310.36  PRESS =    0.0
>>> > >>> > >>>>  Etot   =    -66846.8801  EKtot   =    19675.0488  EPtot     =    -86521.9289
>>> > >>> > >>>>  BOND   =      5760.2422  ANGLE   =    13619.8710  DIHED     =     16996.9045
>>> > >>> > >>>>  1-4 NB =      5645.6416  1-4 EEL =     1774.6967  VDWAALS   =    -13622.9343
>>> > >>> > >>>>  EELEC  =    -14168.1788  EGB     =  -102880.8089  RESTRAINT =       352.6371
>>> > >>> > >>>>  EAMBER (non-restraint)  =    -86874.5660
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  ROUND_2
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>  NSTEP =  100000  TIME(PS) =    300.000  TEMP(K) =   311.00  PRESS =    0.0
>>> > >>> > >>>>  Etot   =    -66874.9016  EKtot   =    19715.3633  EPtot     =    -86590.2649
>>> > >>> > >>>>  BOND   =      5819.0667  ANGLE   =    13683.6633  DIHED     =     16918.8596
>>> > >>> > >>>>  1-4 NB =      5627.0932  1-4 EEL =     1576.9564  VDWAALS   =    -13747.1032
>>> > >>> > >>>>  EELEC  =    -15232.3280  EGB     =  -101590.5078  RESTRAINT =       354.0348
>>> > >>> > >>>>  EAMBER (non-restraint)  =    -86944.2997
>>> > >>> > >>>>
>>> > >>> > >>>> ------------------------------------------------------------------------------
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>>
>>> > >>> > >>>> On Mon, 03 Jun 2013 12:34:15 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
>>> > >>> > >>>>
>>> > >>> > >>>> > OK, I will try the NUCLEOSOME case as well with my latest
>>> > >>> > >>>> > settings: (driver 319.23, Amber12 bugfix 18 applied, cuda 5.5)
>>> > >>> > >>>> >
>>> > >>> > >>>> >    M.
>>> > >>> > >>>> >
>>> > >>> > >>>> >
>>> > >>> > >>>> >
>>> > >>> > >>>> >
>>> > >>> > >>>> > On Mon, 03 Jun 2013 11:51:46 +0200 ET <sketchfoot.gmail.com> wrote:
>>> > >>> > >>>> >
>>> > >>> > >>>> >> Hi all,
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> I reran the benchmarks with Amber recompiled, the latest
>>> > >>> > >>>> >> drivers, and the GPU in solo configuration. The results are
>>> > >>> > >>>> >> as follows:
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> When I run the tests on GPU-00_TeaNCake:
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> 1) All the tests (across 2x repeats) finish successfully.
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> 2) The sdiff logs indicate that reproducibility across the
>>> > >>> > >>>> >> two repeats is as follows:
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> GB_myoglobin: Reproducible across 1,000,000 steps
>>> > >>> > >>>> >> GB_nucleosome: No reproducibility shown from step 3,400
>>> > >>> > >>>> >> onwards. Also the outfile is not written properly - blank
>>> > >>> > >>>> >> gaps appear where something should have been written.
>>> > >>> > >>>> >> GB_TRPCage: Reproducible across 1,000,000 steps
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> PME_JAC_production_NVE: No reproducibility shown from step
>>> > >>> > >>>> >> 35,000 onwards. Also the outfile is not written properly -
>>> > >>> > >>>> >> blank gaps appear where something should have been written.
>>> > >>> > >>>> >> PME_JAC_production_NPT: No reproducibility shown from step
>>> > >>> > >>>> >> 69,000 onwards. Also the outfile is not written properly -
>>> > >>> > >>>> >> blank gaps appear where something should have been written.
>>> > >>> > >>>> >> PME_FactorIX_production_NVE: Reproducible across 100k steps
>>> > >>> > >>>> >> PME_FactorIX_production_NPT: Reproducible across 100k steps
>>> > >>> > >>>> >> PME_Cellulose_production_NVE: Reproducible across 100k steps
>>> > >>> > >>>> >> PME_Cellulose_production_NPT: No reproducibility shown from
>>> > >>> > >>>> >> step 17,000 onwards. Also the outfile is not written
>>> > >>> > >>>> >> properly - blank gaps appear where something should have
>>> > >>> > >>>> >> been written.
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> #################################################
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> So it looks like the problem does occur in GB runs too.
>>> > >>> > >>>> >> I notice, though, that running in single GPU mode seems to
>>> > >>> > >>>> >> make the problem appear much later than it does with dual
>>> > >>> > >>>> >> GPUs, although obviously this is quite qualitative and
>>> > >>> > >>>> >> based on only 1 repeat.
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> br,
>>> > >>> > >>>> >> g
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>
>>> > >>> > >>>> >> On 3 June 2013 10:28, ET <sketchfoot.gmail.com> wrote:
>>> > >>> > >>>> >>
>>> > >>> > >>>> >>> Hi Marek,
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >>> I think what you say about Valley and Heaven is true to a
>>> > >>> > >>>> >>> certain extent, but I think the links I posted to the EVGA
>>> > >>> > >>>> >>> overclock utility & MSI Kombustor are very good ways of
>>> > >>> > >>>> >>> testing the card. I don't know the details of memtestG80
>>> > >>> > >>>> >>> and cuda_memtest, but it seems to me that they are testing
>>> > >>> > >>>> >>> one very specific component, i.e. the memory. As the
>>> > >>> > >>>> >>> graphics card consists of more than this, it is better to
>>> > >>> > >>>> >>> have a test that checks the card in a more holistic manner
>>> > >>> > >>>> >>> IMO. :)
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >>> I think this argument is supported by the fact that tech
>>> > >>> > >>>> >>> support at the store used a program called FurMark to
>>> > >>> > >>>> >>> stress test the GPU. As the GPU I returned kept failing the
>>> > >>> > >>>> >>> benchmark, they realized in less than half a day it was
>>> > >>> > >>>> >>> faulty, whilst I wasted a couple of days mucking about with
>>> > >>> > >>>> >>> GPU memory tests using Gpuburn on linux.
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >>> http://www.ozone3d.net/benchmarks/fur/
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >>> I think if you are going to test on windows, you are better
>>> > >>> > >>>> >>> off getting MSI Kombustor, which I posted earlier. It
>>> > >>> > >>>> >>> contains the test contained in FurMark and many additional
>>> > >>> > >>>> >>> tests that test the compute capability of the card.
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >>> best regards,
>>> > >>> > >>>> >>> g
>>> > >>> > >>>> >>>
>>> > >>> > >>>> >> _______________________________________________
>>> > >>> > >>>> >> AMBER mailing list
>>> > >>> > >>>> >> AMBER.ambermd.org
>>> > >>> > >>>> >> http://lists.ambermd.org/mailman/listinfo/amber
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jun 05 2013 - 05:30:03 PDT