Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Scott Le Grand <varelse2005.gmail.com>
Date: Sat, 1 Jun 2013 15:28:45 -0700

Good news: K20c is deterministic... The software is likely fine...

Run #1:
 NSTEP =   100000   TIME(PS) =     220.020  TEMP(K) =   300.09  PRESS =     0.0
 Etot   =   -443244.4940  EKtot   =    257867.5625  EPtot      =   -701112.0565
 BOND   =     20165.5496  ANGLE   =     53463.6530  DIHED      =     23309.5801
 1-4 NB =     21772.6557  1-4 EEL =    743421.0008  VDWAALS    =     96020.3605
 EELEC  =  -1659264.8562  EHBOND  =         0.0000  RESTRAINT  =         0.0000


Run #2:
 NSTEP =   100000   TIME(PS) =     220.020  TEMP(K) =   300.09  PRESS =     0.0
 Etot   =   -443244.4940  EKtot   =    257867.5625  EPtot      =   -701112.0565
 BOND   =     20165.5496  ANGLE   =     53463.6530  DIHED      =     23309.5801
 1-4 NB =     21772.6557  1-4 EEL =    743421.0008  VDWAALS    =     96020.3605
 EELEC  =  -1659264.8562  EHBOND  =         0.0000  RESTRAINT  =         0.0000


The beauty of deterministic execution is that the likelihood of the final
energies coming out identical when determinism is broken is vanishingly
small, so comparing just the final energies is enough for me...
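If anyone wants to repeat the check at home, here is a rough sketch of the
two-run comparison (file names are placeholders; it assumes the stock
benchmark inputs and a fixed random seed, e.g. ig=43689, in mdin):

  export CUDA_VISIBLE_DEVICES=0   # pin the job to a single GPU
  $AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.run1
  $AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.run2
  # On a deterministic card every energy record matches bit for bit;
  # comparing the last Etot line of each run is the quickest check.
  grep 'Etot' mdout.run1 | tail -1
  grep 'Etot' mdout.run2 | tail -1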

Bad News: This strongly implies something bad is up with all of our Titans.

So question #1: What's the wattage and rating of your power supply? This
machine is OC Silver. The machine with the misbehaving Titan is OC
Bronze. I'm going to put the Titan in this machine next...

But barring something having to do with having 14 versus 13 SMs, the K20 and
GTX Titan are practically identical hardware, both running SM 3.5.

I will be filing a bug with NVIDIA this week to look into this, so please
don't panic. We had an issue a few years back with the GTX 4xx and GTX 5xx
that was indeed hardware, and the eventual workaround ate maybe 0.5% of
attainable performance.

But I think this situation does make a strong case for writing code that
provides bit-accurate reproducibility, and it boggles my mind why doing so
isn't a no-brainer.
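One practical note: a two-run comparison only means something if both repeats
use the same random seed. A throwaway sketch for pinning the seed across a set
of benchmark mdin files (the glob is a placeholder, 43689 is just the value
used earlier in this thread, and removing the ig=-1 setting so the default
seed applies works equally well):

  for f in */mdin; do
      sed -i 's/ig *= *-1/ig=43689/' "$f"   # ig=-1 means "seed from the wall clock"
  done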

Scott


On Sat, Jun 1, 2013 at 2:42 PM, ET <sketchfoot.gmail.com> wrote:

> cheers! :)
>
>
> On 1 June 2013 22:13, Marek Maly <marek.maly.ujep.cz> wrote:
>
> > Here is my install script, which may help you
> > to automate a bit the installation of the whole AT13/Amber12
> > package.
> >
> > Edit it (e.g. AMBERHOME) to match your settings.
> >
> > The attached script should be placed in the same directory
> > as the AT/Amber tar files (Amber12.tar.bz2, AmberTools13.tar.bz2), or
> > you have to set the proper path to them.
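[For anyone following along: a rough sketch of what such a wrapper typically
boils down to; the paths and configure options below are assumptions, not
necessarily what Marek's actual script does:

  export AMBERHOME=$HOME/amber12
  tar xjf AmberTools13.tar.bz2 && tar xjf Amber12.tar.bz2  # both unpack into amber12/
  cd $AMBERHOME
  ./configure -cuda gnu    # say yes if it offers to download and apply bugfixes
  make install
  make test
]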
> >
> > Good luck !
> >
> > M.
> >
> >
> > On Sat, 01 Jun 2013 23:12:26 +0200, ET <sketchfoot.gmail.com> wrote:
> >
> > Ahhhh ok! I will recompile in that case. :(
> >>
> >>
> >> On 1 June 2013 21:32, Marek Maly <marek.maly.ujep.cz> wrote:
> >>
> >>> If you just "apply" patches using the configure command/script,
> >>> only the source code is edited. You then need to recompile
> >>> to obtain the updated binary files "which are doing the real work".
> >>>
> >>> M.
> >>>
> >>>
> >>>
> >>> On Sat, 01 Jun 2013 22:18:34 +0200, ET <sketchfoot.gmail.com>
> >>> wrote:
> >>>
> >>> > Hi,
> >>> >
> >>> > Just saw your messages and am already half way through running the
> >>> > benchmarks again with driver version 319.23 and latest AMBER patches
> >>> > applied. I did not recompile. I know Marek suggested that
> >>> recompilation
> >>> > was the preferred route, but could one of the developers definitively
> >>> > confirm whether this is necessary as I would prefer to avoid this if
> >>> > possible.
> >>> >
> >>> > The additional questions I had are:
> >>> >
> >>> > 1) Does anyone know the reason for the gaps in the mdout files? Is
> this
> >>> > because both GPUs are running at the same time and there is some kind
> >>> of
> >>> > sync error in writing to the mdout file?
> >>> >
> >>> > 2) Are there any issues with running the cards simultaneously in PCIe
> >>> > v2.0
> >>> > slots operating at x16? There is no min requirement for PCIe v 3.0?
> >>> >
> >>> >
> >>> > br,
> >>> > g
> >>> >
> >>> >
> >>> > On 1 June 2013 20:34, Marek Maly <marek.maly.ujep.cz> wrote:
> >>> >
> >>> >> Sorry,
> >>> >>
> >>> >> regarding that double-precision mode, perhaps it means recompiling the
> >>> >> GPU Amber part with the "DPDP" configure setting. Am I right?
> >>> >>
> >>> >> M.
> >>> >>
> >>> >>
> >>> >> On Sat, 01 Jun 2013 21:26:42 +0200, Marek Maly <marek.maly.ujep.cz>
> >>> >> wrote:
> >>> >>
> >>> >> > Hi Scott,
> >>> >> >
> >>> >> > please how can I activate double-precision mode ?
> >>> >> >
> >>> >> > It is something which could be enabled using nvidia-smi ?
> >>> >> >
> >>> >> >
> >>> >> > Regarding this comment of yours:
> >>> >> >
> >>> >> > --------
> >>> >> > The only thing that's funny about these tests is how little they
> >>> >> diverge.
> >>> >> > So I am *hoping* this might be a bug in cuFFT rather than a GTX
> >>> Titan
> >>> >> HW
> >>> >> > ------
> >>> >> >
> >>> >> > So you mean that the problem here might be in the CUDA 5.0
> >>> >> > implementation of cuFFT ? If yes, it means that there is some kind
> >>> >> > of "incompatibility" just in the case of the Titans, as with the
> >>> >> > GTX 580 and GTX 680 I have obtained perfect reproducibility in all
> >>> >> > tests (see my older posts in this thread).
> >>> >> >
> >>> >> > So perhaps it would be a good idea to try CUDA 5.5, where maybe
> >>> >> > cuFFT will be more "compatible" also with the new Titans (or also
> >>> >> > the GTX 780s).
> >>> >> >
> >>> >> > I will do this experiment, but first I would like to check
> >>> >> > whether bugfix 18 solves at least some of the reported issues or
> >>> >> > not (still using CUDA 5.0, which is also the latest version
> >>> >> > officially compatible with the Amber code, as reported here:
> >>> >> > http://ambermd.org/gpus/ )
> >>> >> >
> >>> >> > M.
> >>> >> >
> >>> >> > On Sat, 01 Jun 2013 20:46:29 +0200, Scott Le Grand
> >>> >> > <varelse2005.gmail.com>
> >>> >> > wrote:
> >>> >> >
> >>> >> >> The acid test is running on a K20. If K20 is OK, then I really
> >>> think
> >>> >> >> (99.5%) Titan is hosed...
> >>> >> >>
> >>> >> >> If K20 shows the same irreproducible behavior, my life gets a
> whole
> >>> >> lot
> >>> >> >> more interesting...
> >>> >> >>
> >>> >> >> But along those lines, could you try activating double-precision
> >>> mode
> >>> >> >> and
> >>> >> >> retesting? That ought to clock the thing down significantly, and
> >>> if
> >>> >> it
> >>> >> >> suddenly runs reproducibly, then 99.5% this is a Titan HW
> issue...
> >>> >> >>
> >>> >> >> Scott
> >>> >> >>
> >>> >> >>
> >>> >> >> On Sat, Jun 1, 2013 at 11:26 AM, ET <sketchfoot.gmail.com>
> wrote:
> >>> >> >>
> >>> >> >>> Hi,
> >>> >> >>>
> >>> >> >>> I've put the graphics card into a machine with the working GTX
> >>> titan
> >>> >> >>> that I
> >>> >> >>> mentioned earlier.
> >>> >> >>>
> >>> >> >>> The Nvidia driver version is: 133.30
> >>> >> >>>
> >>> >> >>> Amber version is:
> >>> >> >>> AmberTools version 13.03
> >>> >> >>> Amber version 12.16
> >>> >> >>>
> >>> >> >>> I ran 50k steps with the amber benchmark using ig=43689 on both
> >>> >> cards.
> >>> >> >>> For
> >>> >> >>> the purpose of discriminating between them, the card I believe
> >>> >> (fingers
> >>> >> >>> crossed) is working is called GPU-00_TeaNCake, whilst the other
> >>> one
> >>> >> is
> >>> >> >>> called GPU-01_008.
> >>> >> >>>
> >>> >> >>> *When I run the tests on GPU-01_008:*
> >>> >> >>>
> >>> >> >>> 1) All the tests (across 2x repeats) finish apart from the
> >>> >> following
> >>> >> >>> which
> >>> >> >>> have the errors listed:
> >>> >> >>>
> >>> >> >>> --------------------------------------------
> >>> >> >>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
> >>> >> >>> Error: unspecified launch failure launching kernel kNLSkinTest
> >>> >> >>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> >>> >> >>>
> >>> >> >>> --------------------------------------------
> >>> >> >>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
> >>> >> >>> cudaMemcpy GpuBuffer::Download failed unspecified launch
> failure
> >>> >> >>>
> >>> >> >>> --------------------------------------------
> >>> >> >>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
> >>> >> >>> Error: unspecified launch failure launching kernel kNLSkinTest
> >>> >> >>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> >>> >> >>>
> >>> >> >>> --------------------------------------------
> >>> >> >>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
> >>> >> >>> cudaMemcpy GpuBuffer::Download failed unspecified launch
> failure
> >>> >> >>> grep: mdinfo.1GTX680: No such file or directory
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> 2) The sdiff logs indicate that reproducibility across the two
> >>> >> repeats
> >>> >> >>> is
> >>> >> >>> as follows:
> >>> >> >>>
> >>> >> >>> *GB_myoglobin: *Reproducible across 50k steps
> >>> >> >>> *GB_nucleosome:* Reproducible till step 7400
> >>> >> >>> *GB_TRPCage:* Reproducible across 50k steps
> >>> >> >>>
> >>> >> >>> *PME_JAC_production_NVE: *No reproducibility shown from step
> 1,000
> >>> >> >>> onwards
> >>> >> >>> *PME_JAC_production_NPT*: Reproducible till step 1,000. Also
> >>> >> outfile
> >>> >> >>> is
> >>> >> >>> not written properly - blank gaps appear where something should
> >>> have
> >>> >> >>> been
> >>> >> >>> written
> >>> >> >>>
> >>> >> >>> *PME_FactorIX_production_NVE:* Reproducible across 50k steps
> >>> >> >>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
> >>> >> >>>
> >>> >> >>> *PME_Cellulose_production_NVE:* Failure means that both runs do
> >>> >> >>> not finish (see point 1)
> >>> >> >>> *PME_Cellulose_production_NPT:* Failure means that both runs do
> >>> >> >>> not finish (see point 1)
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >>
> >>> #######################################################################################
> >>> >> >>>
> >>> >> >>> *When I run the tests on GPU-00_TeaNCake:*
> >>> >> >>> 1) All the tests (across 2x repeats) finish apart from the
> >>> >> following
> >>> >> >>> which
> >>> >> >>> have the errors listed:
> >>> >> >>> -------------------------------------
> >>> >> >>> JAC_PRODUCTION_NPT - 23,558 atoms PME
> >>> >> >>> PMEMD Terminated Abnormally!
> >>> >> >>> -------------------------------------
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> 2) The sdiff logs indicate that reproducibility across the two
> >>> >> repeats
> >>> >> >>> is
> >>> >> >>> as follows:
> >>> >> >>>
> >>> >> >>> *GB_myoglobin:* Reproducible across 50k steps
> >>> >> >>> *GB_nucleosome:* Reproducible across 50k steps
> >>> >> >>> *GB_TRPCage:* Reproducible across 50k steps
> >>> >> >>>
> >>> >> >>> *PME_JAC_production_NVE:* No reproducibility shown from step
> >>> 10,000
> >>> >> >>> onwards
> >>> >> >>> *PME_JAC_production_NPT: * No reproducibility shown from step
> >>> 10,000
> >>> >> >>> onwards. Also outfile is not written properly - blank gaps
> appear
> >>> >> where
> >>> >> >>> something should have been written. Repeat 2 Crashes with error
> >>> >> noted
> >>> >> >>> in 1.
> >>> >> >>>
> >>> >> >>> *PME_FactorIX_production_NVE:* No reproducibility shown from
> step
> >>> >> 9,000
> >>> >> >>> onwards
> >>> >> >>> *PME_FactorIX_production_NPT: *Reproducible across 50k steps
> >>> >> >>>
> >>> >> >>> *PME_Cellulose_production_NVE: *No reproducibility shown from
> step
> >>> >> >>> 5,000
> >>> >> >>> onwards
> >>> >> >>> *PME_Cellulose_production_NPT:* No reproducibility shown from step
> >>> >> >>> 29,000 onwards. Also outfile is not written properly - blank gaps
> >>> >> >>> appear where something should have been written.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Out files and sdiff files are included as attachments
> >>> >> >>>
> >>> >> >>> #################################################
> >>> >> >>>
> >>> >> >>> So I'm going to update my nvidia driver to the latest version
> and
> >>> >> patch
> >>> >> >>> amber to the latest version and rerun the tests to see if there
> is
> >>> >> any
> >>> >> >>> improvement. Could someone let me know if it is necessary to
> >>> >> recompile
> >>> >> >>> any
> >>> >> >>> or all of AMBER after applying the bugfixes?
> >>> >> >>>
> >>> >> >>> Additionally, I'm going to run memory tests and heaven
> benchmarks
> >>> on
> >>> >> >>> the
> >>> >> >>> cards to check whether they are faulty or not.
> >>> >> >>>
> >>> >> >>> I'm thinking that there is a mix of hardware error/configuration
> >>> >> (esp
> >>> >> >>> in
> >>> >> >>> the case of GPU-01_008) and amber software error in this
> >>> situation.
> >>> >> >>> What do
> >>> >> >>> you guys think?
> >>> >> >>>
> >>> >> >>> Also am I right in thinking (from what Scott was saying) that
> all
> >>> >> the
> >>> >> >>> benchmarks should be reproducible across 50k steps but begin to
> >>> >> diverge
> >>> >> >>> at
> >>> >> >>> around 100K steps? Is there any difference from in setting *ig
> *to
> >>> >> an
> >>> >> >>> explicit number to removing it from the mdin file?
> >>> >> >>>
> >>> >> >>> br,
> >>> >> >>> g
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On 31 May 2013 23:45, ET <sketchfoot.gmail.com> wrote:
> >>> >> >>>
> >>> >> >>> > I don't need sysadmins, but sysadmins need me, as it gives
> >>> >> >>> > purpose to their bureaucratic existence. An evil encountered
> >>> >> >>> > when working in an institution or company, IMO. Good science
> >>> >> >>> > and individuality being sacrificed for standardisation and
> >>> >> >>> > mediocrity in the interests of maintaining a system that
> >>> >> >>> > focusses on maintaining the system and not the objective.
> >>> >> >>> >
> >>> >> >>> > You need root to move fwd on these things, unfortunately, and
> >>> >> >>> > ppl with root are kinda like your parents when you try to
> >>> >> >>> > borrow money from them at age 12 :D
> >>> >> >>> > On May 31, 2013 9:34 PM, "Marek Maly" <marek.maly.ujep.cz>
> >>> wrote:
> >>> >> >>> >
> >>> >> >>> >> Sorry why do you need sysadmins :)) ?
> >>> >> >>> >>
> >>> >> >>> >> BTW here is the most recent driver:
> >>> >> >>> >>
> >>> >> >>> >>
> >>> >> >>> >> http://www.nvidia.com/object/linux-display-amd64-319.23-driver.html
> >>> >> >>> >>
> >>> >> >>> >> I do not remember anything easier than installing a driver
> >>> >> >>> >> (especially in the case of the binary (*.run) installer) :))
> >>> >> >>> >>
> >>> >> >>> >> M.
> >>> >> >>> >>
> >>> >> >>> >>
> >>> >> >>> >>
> >>> >> >>> >> On Fri, 31 May 2013 22:02:34 +0200, ET <sketchfoot.gmail.com>
> >>> >> >>> >> wrote:
> >>> >> >>> >>
> >>> >> >>> >> > Yup. I know. I replaced a 680 and the ever-knowing sysadmins
> >>> >> >>> >> > are reluctant to install drivers not in the repository, as
> >>> >> >>> >> > they are lame. :(
> >>> >> >>> >> > On May 31, 2013 7:14 PM, "Marek Maly" <marek.maly.ujep.cz>
> >>> >> wrote:
> >>> >> >>> >> >>
> >>> >> >>> >> >> As I already wrote you,
> >>> >> >>> >> >>
> >>> >> >>> >> >> the first driver which properly/officially supports
> Titans,
> >>> >> >>> should be
> >>> >> >>> >> >> 313.26 .
> >>> >> >>> >> >>
> >>> >> >>> >> >> Anyway I am curious mainly about your 100K repetitive
> tests
> >>> >> with
> >>> >> >>> >> >> your Titan SC card. Especially in case of these tests (
> >>> >> JAC_NVE,
> >>> >> >>> >> JAC_NPT
> >>> >> >>> >> >> and CELLULOSE_NVE ) where
> >>> >> >>> >> >> my Titans SC randomly failed or succeeded. In
> FACTOR_IX_NVE,
> >>> >> >>> >> >> FACTOR_IX_NPT
> >>> >> >>> >> >> tests both
> >>> >> >>> >> >> my cards are perfectly stable (independently from drv.
> >>> >> version)
> >>> >> >>> and
> >>> >> >>> >> also
> >>> >> >>> >> >> the runs
> >>> >> >>> >> >> are perfectly or almost perfectly reproducible.
> >>> >> >>> >> >>
> >>> >> >>> >> >> Also, if your tests crash, please report any errors.
> >>> >> >>> >> >>
> >>> >> >>> >> >> Up to now I have collected this library of errors on my
> >>> >> >>> >> >> Titan SC GPUs.
> >>> >> >>> >> >>
> >>> >> >>> >> >> #1 ERR written in mdout:
> >>> >> >>> >> >> ------
> >>> >> >>> >> >> | ERROR: max pairlist cutoff must be less than unit cell
> >>> max
> >>> >> >>> sphere
> >>> >> >>> >> >> radius!
> >>> >> >>> >> >> ------
> >>> >> >>> >> >>
> >>> >> >>> >> >>
> >>> >> >>> >> >> #2 no ERR written in mdout, ERR written in standard
> output
> >>> >> >>> (nohup.out)
> >>> >> >>> >> >>
> >>> >> >>> >> >> ----
> >>> >> >>> >> >> Error: unspecified launch failure launching kernel
> >>> kNLSkinTest
> >>> >> >>> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch
> >>> >> failure
> >>> >> >>> >> >> ----
> >>> >> >>> >> >>
> >>> >> >>> >> >>
> >>> >> >>> >> >> #3 no ERR written in mdout, ERR written in standard
> output
> >>> >> >>> (nohup.out)
> >>> >> >>> >> >> ----
> >>> >> >>> >> >> cudaMemcpy GpuBuffer::Download failed unspecified launch
> >>> >> failure
> >>> >> >>> >> >> ----
> >>> >> >>> >> >>
> >>> >> >>> >> >> Another question regarding your Titan SC: is it also EVGA,
> >>> >> >>> >> >> as in my case, or is it from another manufacturer?
> >>> >> >>> >> >>
> >>> >> >>> >> >> Thanks,
> >>> >> >>> >> >>
> >>> >> >>> >> >> M.
> >>> >> >>> >> >>
> >>> >> >>> >> >>
> >>> >> >>> >> >>
> >>> >> >>> >> >> On Fri, 31 May 2013 19:17:03 +0200, ET <sketchfoot.gmail.com>
> >>> >> >>> >> >> wrote:
> >>> >> >>> >> >>
> >>> >> >>> >> >> > Well, this is interesting...
> >>> >> >>> >> >> >
> >>> >> >>> >> >> > I ran 50k steps on the Titan on the other machine with
> >>> >> driver
> >>> >> >>> 310.44
> >>> >> >>> >> >> and
> >>> >> >>> >> >> > it
> >>> >> >>> >> >> > passed all the GB steps. i.e totally identical results
> >>> over
> >>> >> two
> >>> >> >>> >> >> repeats.
> >>> >> >>> >> >> > However, it failed all the PME tests after step 1000.
> I'm
> >>> >> going
> >>> >> >>> to
> >>> >> >>> >> > update
> >>> >> >>> >> >> > the driver and test it again.
> >>> >> >>> >> >> >
> >>> >> >>> >> >> > Files included as attachments.
> >>> >> >>> >> >> >
> >>> >> >>> >> >> > br,
> >>> >> >>> >> >> > g
> >>> >> >>> >> >> >
> >>> >> >>> >> >> >
> >>> >> >>> >> >> > On 31 May 2013 16:40, Marek Maly <marek.maly.ujep.cz>
> >>> wrote:
> >>> >> >>> >> >> >
> >>> >> >>> >> >> >> One more thing,
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> can you please check which frequency your Titan is
> >>> >> >>> >> >> >> running at?
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> As the base frequency of normal Titans is 837 MHz and the
> >>> >> >>> >> >> >> boost one is 876 MHz, I assume that your GPU is also
> >>> >> >>> >> >> >> automatically running at its boost frequency (876 MHz).
> >>> >> >>> >> >> >> You can find this information e.g. in the Amber mdout file.
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> You also mentioned some crashes in your previous email.
> >>> >> Your
> >>> >> >>> ERRs
> >>> >> >>> >> >> were
> >>> >> >>> >> >> >> something like those here:
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> #1 ERR written in mdout:
> >>> >> >>> >> >> >> ------
> >>> >> >>> >> >> >> | ERROR: max pairlist cutoff must be less than unit
> >>> cell
> >>> >> max
> >>> >> >>> >> sphere
> >>> >> >>> >> >> >> radius!
> >>> >> >>> >> >> >> ------
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> #2 no ERR written in mdout, ERR written in standard
> >>> output
> >>> >> >>> >> >> (nohup.out)
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> ----
> >>> >> >>> >> >> >> Error: unspecified launch failure launching kernel
> >>> >> kNLSkinTest
> >>> >> >>> >> >> >> cudaFree GpuBuffer::Deallocate failed unspecified
> launch
> >>> >> >>> failure
> >>> >> >>> >> >> >> ----
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> #3 no ERR written in mdout, ERR written in standard
> >>> output
> >>> >> >>> >> >> (nohup.out)
> >>> >> >>> >> >> >> ----
> >>> >> >>> >> >> >> cudaMemcpy GpuBuffer::Download failed unspecified
> launch
> >>> >> >>> failure
> >>> >> >>> >> >> >> ----
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> or did you obtain some new/additional errors?
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> M.
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> On Fri, 31 May 2013 17:30:57 +0200, filip fratev
> >>> >> >>> >> >> >> <filipfratev.yahoo.com> wrote:
> >>> >> >>> >> >> >>
> >>> >> >>> >> >> >> > Hi,
> >>> >> >>> >> >> >> > This is what I obtained for 50K tests and "normal"
> >>> >> GTXTitan:
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > run1:
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > A V E R A G E S O V E R 50 S T E P S
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > NSTEP = 50000  TIME(PS) = 120.020  TEMP(K) = 299.87  PRESS = 0.0
> >>> >> >>> >> >> >> > Etot = -443237.1079  EKtot = 257679.9750  EPtot = -700917.0829
> >>> >> >>> >> >> >> > BOND = 20193.1856  ANGLE = 53517.5432  DIHED = 23575.4648
> >>> >> >>> >> >> >> > 1-4 NB = 21759.5524  1-4 EEL = 742552.5939  VDWAALS = 96286.7714
> >>> >> >>> >> >> >> > EELEC = -1658802.1941  EHBOND = 0.0000  RESTRAINT = 0.0000
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > R M S F L U C T U A T I O N S
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > NSTEP = 50000  TIME(PS) = 120.020  TEMP(K) = 0.33  PRESS = 0.0
> >>> >> >>> >> >> >> > Etot = 11.2784  EKtot = 284.8999  EPtot = 289.0773
> >>> >> >>> >> >> >> > BOND = 136.3417  ANGLE = 214.0054  DIHED = 59.4893
> >>> >> >>> >> >> >> > 1-4 NB = 58.5891  1-4 EEL = 330.5400  VDWAALS = 559.2079
> >>> >> >>> >> >> >> > EELEC = 743.8771  EHBOND = 0.0000  RESTRAINT = 0.0000
> >>> >> >>> >> >> >> > |E(PBS) = 21.8119
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > run2:
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > A V E R A G E S O V E R 50 S T E P S
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > NSTEP = 50000  TIME(PS) = 120.020  TEMP(K) = 299.89  PRESS = 0.0
> >>> >> >>> >> >> >> > Etot = -443240.0999  EKtot = 257700.0950  EPtot = -700940.1949
> >>> >> >>> >> >> >> > BOND = 20241.9174  ANGLE = 53644.6694  DIHED = 23541.3737
> >>> >> >>> >> >> >> > 1-4 NB = 21803.1898  1-4 EEL = 742754.2254  VDWAALS = 96298.8308
> >>> >> >>> >> >> >> > EELEC = -1659224.4013  EHBOND = 0.0000  RESTRAINT = 0.0000
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > R M S F L U C T U A T I O N S
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > NSTEP = 50000  TIME(PS) = 120.020  TEMP(K) = 0.41  PRESS = 0.0
> >>> >> >>> >> >> >> > Etot = 10.7633  EKtot = 348.2819  EPtot = 353.9918
> >>> >> >>> >> >> >> > BOND = 106.5314  ANGLE = 196.7052  DIHED = 69.7476
> >>> >> >>> >> >> >> > 1-4 NB = 60.3435  1-4 EEL = 400.7466  VDWAALS = 462.7763
> >>> >> >>> >> >> >> > EELEC = 651.9857  EHBOND = 0.0000  RESTRAINT = 0.0000
> >>> >> >>> >> >> >> > |E(PBS) = 17.0642
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> ------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> ------------------------------**------------------------------**
> >>> --------------------
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > ______________________________**__
> >>> >> >>> >> >> >> > From: Marek Maly <marek.maly.ujep.cz>
> >>> >> >>> >> >> >> > To: AMBER Mailing List <amber.ambermd.org>
> >>> >> >>> >> >> >> > Sent: Friday, May 31, 2013 3:34 PM
> >>> >> >>> >> >> >> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN
> >>> >> >>> Superclocked
> >>> >> >>> >> -
> >>> >> >>> >> >> >> > memtestG80 - UNDERclocking in Linux ?
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > Hi here are my 100K results for driver 313.30 (and
> >>> still
> >>> >> >>> Cuda
> >>> >> >>> >> 5.0).
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > The results are rather similar to those obtained
> >>> >> >>> >> >> >> > under my original driver 319.17 (see the first table
> >>> >> >>> >> >> >> > which I sent in this thread).
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > M.
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> > On Fri, 31 May 2013 12:29:59 +0200, Marek Maly
> >>> >> >>> >> >> >> > <marek.maly.ujep.cz> wrote:
> >>> >> >>> >> >> >> >
> >>> >> >>> >> >> >> >> Hi,
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> please try to run at least 100K-step tests twice to
> >>> >> >>> >> >> >> >> verify exact reproducibility of the results on the
> >>> >> >>> >> >> >> >> given card. If you find ig=-1 in any mdin file, just
> >>> >> >>> >> >> >> >> delete it to ensure that you are using the identical
> >>> >> >>> >> >> >> >> random seed for both runs. You can possibly omit the
> >>> >> >>> >> >> >> >> NUCLEOSOME test as it is too time-consuming.
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> Driver 310.44 ?????
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> As far as I know the proper support for titans is
> from
> >>> >> >>> version
> >>> >> >>> >> > 313.26
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> see e.g. here :
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >>
> >>> >> >>> >> >
> >>> >> >>> >>
> >>> >> >>>
> >>> >>
> >>> >> >>> >> >> >> >> http://www.geeks3d.com/20130306/nvidia-releases-r313-26-for-linux-with-gtx-titan-support/
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> BTW: On my side, downgrading to driver 313.30 did not
> >>> >> >>> >> >> >> >> solve the situation; I will post my results here soon.
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> M.
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >> On Fri, 31 May 2013 12:21:21 +0200, ET
> >>> >> >>> >> >> >> >> <sketchfoot.gmail.com> wrote:
> >>> >> >>> >> >> >> >>
> >>> >> >>> >> >> >> >>> ps. I have another install of amber on another
> >>> computer
> >>> >> >>> with a
> >>> >> >>> >> >> >> >>> different
> >>> >> >>> >> >> >> >>> Titan and different Driver Version: 310.44.
> >>> >> >>> >> >> >> >>>
> >>> >> >>> >> >> >> >>> In the interests of thrashing the proverbial horse,
> >>> >> I'll
> >>> >> >>> run
> >>> >> >>> the
> >>> >> >>> >> >> >> >>> benchmark
> >>> >> >>> >> >> >> >>> for 50k steps. :P
> >>> >> >>> >> >> >> >>>
> >>> >> >>> >> >> >> >>> br,
> >>> >> >>> >> >> >> >>> g
> >>> >> >>> >> >> >> >>>
> >>> >> >>> >> >> >> >>>
> >>> >> >>> >> >> >> >>> On 31 May 2013 11:17, ET <sketchfoot.gmail.com>
> >>> wrote:
> >>> >> >>> >> >> >> >>>
> >>> >> >>> >> >> >> >>>> Hi, I just ran the Amber benchmark for the default
> >>> >> (10000
> >>> >> >>> >> steps)
> >>> >> >>> >> >> >> on my
> >>> >> >>> >> >> >> >>>> Titan.
> >>> >> >>> >> >> >> >>>>
> >>> >> >>> >> >> >> >>>> Using sdiff -sB showed that the two runs were
> >>> >> completely
> >>> >> >>> >> > identical.
> >>> >> >>> >> >> >> >>>> I've
> >>> >> >>> >> >> >> >>>> attached compressed files of the mdout & diff
> files.
> >>> >> >>> >> >> >> >>>>
> >>> >> >>> >> >> >> >>>> br,
> >>> >> >>> >> >> >> >>>> g
> >>> >> >>> >> >> >> >>>>
> >>> >> >>> >> >> >> >>>>
> >>> >> >>> >> >> >> >>>> On 30 May 2013 23:41, Marek Maly <
> >>> marek.maly.ujep.cz>
> >>> >> >>> wrote:
> >>> >> >>> >> >> >> >>>>
> >>> >> >>> >> >> >> >>>>> OK, let's see. The eventual downclocking I see as
> >>> the
> >>> >> >>> very
> >>> >> >>> >> last
> >>> >> >>> >> >> >> >>>>> possibility
> >>> >> >>> >> >> >> >>>>> (if I don't decide for RMAing). But now still
> some
> >>> >> other
> >>> >> >>> >> >> >> experiments
> >>> >> >>> >> >> >> >>>>> are
> >>> >> >>> >> >> >> >>>>> available :))
> >>> >> >>> >> >> >> >>>>> I just started 100K tests under 313.30 driver.
> For
> >>> >> today
> >>> >> >>> good
> >>> >> >>> >> >> >> night
> >>> >> >>> >> >> >> >>>>> ...
> >>> >> >>> >> >> >> >>>>>
> >>> >> >>> >> >> >> >>>>> M.
> >>> >> >>> >> >> >> >>>>>
> >>> >> >>> >> >> >> >>>>> On Fri, 31 May 2013 00:45:49 +0200, Scott Le Grand
> >>> >> >>> >> >> >> >>>>> <varelse2005.gmail.com> wrote:
> >>> >> >>> >> >> >> >>>>>
> >>> >> >>> >> >> >> >>>>> > It will be very interesting if this behavior
> >>> >> persists
> >>> >> >>> after
> >>> >> >>> >> >> >> >>>>> downclocking.
> >>> >> >>> >> >> >> >>>>> >
> >>> >> >>> >> >> >> >>>>> > But right now, Titan 0 *looks* hosed and Titan
> 1
> >>> >> >>> *looks*
> >>> >> >>> >> like
> >>> >> >>> >> > it
> >>> >> >>> >> >> >> >>>>> needs
> >>> >> >>> >> >> >> >>>>> > downclocking...
> >>> >> >>> >> >> >> >>>>> > On May 30, 2013 3:20 PM, "Marek Maly"
> >>> >> >>> <marek.maly.ujep.cz
> >>> >> >>> >
> >>> >> >>> >> >> >> wrote:
> >>> >> >>> >> >> >> >>>>> >
> >>> >> >>> >> >> >> >>>>> >> Hi all,
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> here are my results from the 500K steps 2 x
> >>> >> repeated
> >>> >> >>> >> > benchmarks
> >>> >> >>> >> >> >> >>>>> >> under 319.23 driver and still Cuda 5.0 (see
> the
> >>> >> >>> attached
> >>> >> >>> >> >> table
> >>> >> >>> >> >> >> ).
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> It is hard to say if the results are better or
> >>> >> worse
> >>> >> >>> than
> >>> >> >>> >> in
> >>> >> >>> >> > my
> >>> >> >>> >> >> >> >>>>> >> previous 100K test under driver 319.17.
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> While results from Cellulose test were
> improved
> >>> >> and
> >>> >> >>> the
> >>> >> >>> >> > TITAN_1
> >>> >> >>> >> >> >> >>>>> card
> >>> >> >>> >> >> >> >>>>> >> even
> >>> >> >>> >> >> >> >>>>> >> successfully finished all 500K steps moreover
> >>> with
> >>> >> >>> exactly
> >>> >> >>> >> >> the
> >>> >> >>> >> >> >> >>>>> same
> >>> >> >>> >> >> >> >>>>> >> final
> >>> >> >>> >> >> >> >>>>> >> energy !
> >>> >> >>> >> >> >> >>>>> >> (TITAN_0 at least finished more than 100K
> steps
> >>> >> and
> >>> >> >>> in
> >>> >> >>> >> >> RUN_01
> >>> >> >>> >> >> >> even
> >>> >> >>> >> >> >> >>>>> more
> >>> >> >>> >> >> >> >>>>> >> than 400K steps)
> >>> >> >>> >> >> >> >>>>> >> In JAC_NPT test no GPU was able to finish at
> >>> least
> >>> >> >>> 100K
> >>> >> >>> >> >> steps
> >>> >> >>> >> >> >> and
> >>> >> >>> >> >> >> >>>>> the
> >>> >> >>> >> >> >> >>>>> >> results from JAC_NVE
> >>> >> >>> >> >> >> >>>>> >> test are also not too much convincing.
> >>> >> FACTOR_IX_NVE
> >>> >> >>> and
> >>> >> >>> >> >> >> >>>>> FACTOR_IX_NPT
> >>> >> >>> >> >> >> >>>>> >> were successfully
> >>> >> >>> >> >> >> >>>>> >> finished with 100% reproducibility in
> >>> >> FACTOR_IX_NPT
> >>> >> >>> case
> >>> >> >>> >> >> (on
> >>> >> >>> >> >> >> both
> >>> >> >>> >> >> >> >>>>> >> cards)
> >>> >> >>> >> >> >> >>>>> >> and almost
> >>> >> >>> >> >> >> >>>>> >> 100% reproducibility in case of FACTOR_IX_NVE
> >>> >> (again
> >>> >> >>> 100%
> >>> >> >>> >> in
> >>> >> >>> >> >> >> case
> >>> >> >>> >> >> >> >>>>> of
> >>> >> >>> >> >> >> >>>>> >> TITAN_1). TRPCAGE, MYOGLOBIN
> >>> >> >>> >> >> >> >>>>> >> again finished without any problem with 100%
> >>> >> >>> >> >> reproducibility.
> >>> >> >>> >> >> >> >>>>> NUCLEOSOME
> >>> >> >>> >> >> >> >>>>> >> test was not done
> >>> >> >>> >> >> >> >>>>> >> this time due to high time requirements. If
> you
> >>> >> find
> >>> >> >>> in
> >>> >> >>> the
> >>> >> >>> >> >> >> table
> >>> >> >>> >> >> >> >>>>> >> positive
> >>> >> >>> >> >> >> >>>>> >> number finishing with
> >>> >> >>> >> >> >> >>>>> >> K (which means "thousands") it means the last
> >>> >> number
> >>> >> >>> of
> >>> >> >>> >> step
> >>> >> >>> >> >> >> >>>>> written in
> >>> >> >>> >> >> >> >>>>> >> mdout before crash.
> >>> >> >>> >> >> >> >>>>> >> Below are all the 3 types of detected errs
> with
> >>> >> >>> relevant
> >>> >> >>> >> >> >> >>>>> systems/rounds
> >>> >> >>> >> >> >> >>>>> >> where the given err
> >>> >> >>> >> >> >> >>>>> >> appeared.
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> Now I will try just 100K tests under ETs
> >>> favourite
> >>> >> >>> driver
> >>> >> >>> >> >> >> version
> >>> >> >>> >> >> >> >>>>> 313.30
> >>> >> >>> >> >> >> >>>>> >> :)) and then
> >>> >> >>> >> >> >> >>>>> >> I will eventually try to experiment with cuda
> >>> 5.5
> >>> >> >>> which I
> >>> >> >>> >> >> >> already
> >>> >> >>> >> >> >> >>>>> >> downloaded from the
> >>> >> >>> >> >> >> >>>>> >> cuda zone ( I had to become cuda developer for
> >>> >> this
> >>> >> >>> :)) )
> >>> >> >>> >> >> BTW
> >>> >> >>> >> >> >> ET
> >>> >> >>> >> >> >> >>>>> thanks
> >>> >> >>> >> >> >> >>>>> >> for the frequency info !
> >>> >> >>> >> >> >> >>>>> >> and I am still ( perhaps not alone :)) ) very
> >>> >> curious
> >>> >> >>> about
> >>> >> >>> >> >> >> your 2
> >>> >> >>> >> >> >> >>>>> x
> >>> >> >>> >> >> >> >>>>> >> repeated Amber benchmark tests with
> superclocked
> >>> >> >>> Titan.
> >>> >> >>> >> >> Indeed
> >>> >> >>> >> >> >> >>>>> that
> >>> >> >>> >> >> >> >>>>> I
> >>> >> >>> >> >> >> >>>>> am
> >>> >> >>> >> >> >> >>>>> >> very curious also about that Ross "hot" patch.
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> M.
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> ERRORS DETECTED DURING THE 500K steps tests
> with
> >>> >> >>> driver
> >>> >> >>> >> >> 319.23
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >>>>> >> #1 ERR written in mdout:
> >>> >> >>> >> >> >> >>>>> >> ------
> >>> >> >>> >> >> >> >>>>> >> | ERROR: max pairlist cutoff must be less
> than
> >>> >> unit
> >>> >> >>> cell
> >>> >> >>> >> >> max
> >>> >> >>> >> >> >> >>>>> sphere
> >>> >> >>> >> >> >> >>>>> >> radius!
> >>> >> >>> >> >> >> >>>>> >> ------
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> TITAN_0 ROUND_1 JAC_NPT (at least 5000 steps
> >>> >> >>> successfully
> >>> >> >>> >> > done
> >>> >> >>> >> >> >> >>>>> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NPT (at least 8000 steps
> >>> >> >>> successfully
> >>> >> >>> >> > done
> >>> >> >>> >> >> >> >>>>> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >>>>> >> #2 no ERR written in mdout, ERR written in
> >>> >> standard
> >>> >> >>> output
> >>> >> >>> >> >> >> >>>>> (nohup.out)
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> ----
> >>> >> >>> >> >> >> >>>>> >> Error: unspecified launch failure launching
> >>> kernel
> >>> >> >>> >> >> kNLSkinTest
> >>> >> >>> >> >> >> >>>>> >> cudaFree GpuBuffer::Deallocate failed
> >>> unspecified
> >>> >> >>> launch
> >>> >> >>> >> >> >> failure
> >>> >> >>> >> >> >> >>>>> >> ----
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> TITAN_0 ROUND_1 CELLULOSE_NVE (at least 437
> 000
> >>> >> steps
> >>> >> >>> >> >> >> successfully
> >>> >> >>> >> >> >> >>>>> done
> >>> >> >>> >> >> >> >>>>> >> before crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NVE (at least 162 000
> steps
> >>> >> >>> >> >> successfully
> >>> >> >>> >> >> >> done
> >>> >> >>> >> >> >> >>>>> >> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_0 ROUND_2 CELLULOSE_NVE (at least 117
> 000
> >>> >> steps
> >>> >> >>> >> >> >> successfully
> >>> >> >>> >> >> >> >>>>> done
> >>> >> >>> >> >> >> >>>>> >> before crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NVE (at least 119 000
> steps
> >>> >> >>> >> >> successfully
> >>> >> >>> >> >> >> done
> >>> >> >>> >> >> >> >>>>> >> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NVE (at least 43 000
> steps
> >>> >> >>> >> successfully
> >>> >> >>> >> >> >> done
> >>> >> >>> >> >> >> >>>>> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >>>>> >> #3 no ERR written in mdout, ERR written in
> >>> >> standard
> >>> >> >>> output
> >>> >> >>> >> >> >> >>>>> (nohup.out)
> >>> >> >>> >> >> >> >>>>> >> ----
> >>> >> >>> >> >> >> >>>>> >> cudaMemcpy GpuBuffer::Download failed
> >>> unspecified
> >>> >> >>> launch
> >>> >> >>> >> >> >> failure
> >>> >> >>> >> >> >> >>>>> >> ----
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NPT (at least 77 000
> steps
> >>> >> >>> >> successfully
> >>> >> >>> >> >> >> done
> >>> >> >>> >> >> >> >>>>> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NPT (at least 58 000
> steps
> >>> >> >>> >> successfully
> >>> >> >>> >> >> >> done
> >>> >> >>> >> >> >> >>>>> before
> >>> >> >>> >> >> >> >>>>> >> crash)
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >>>>> >> On Thu, 30 May 2013 21:27:17 +0200, Scott Le Grand
> >>> >> >>> >> >>>>> >> <varelse2005.gmail.com> wrote:
> >>> >> >>> >> >> >> >>>>> >>
> >>> >> >>> >> >> >> >>>>> >> Oops meant to send that to Jason...
> >>> >> >>> >> >> >> >>>>> >>>
> >>> >> >>> >> >> >> >>>>> >>> Anyway, before we all panic, we need to get
> >>> K20's
> >>> >> >>> behavior
> >>> >> >>> >> >> >> >>>>> analyzed
> >>> >> >>> >> >> >> >>>>> >>> here.
> >>> >> >>> >> >> >> >>>>> >>> If it's deterministic, this truly is a
> hardware
> >>> >> >>> issue.
> >>> >> >>> If
> >>> >> >>> >> >> >> not,
> >>> >> >>> >> >> >> >>>>> then
> >>> >> >>> >> >> >> >>>>> it
> >>> >> >>> >> >> >> >>>>> >>> gets interesting because 680 is deterministic
> >>> as
> >>> >> far
> >>> >> >>> as
> >>> >> >>> I
> >>> >> >>> >> >> can
> >>> >> >>> >> >> >> >>>>> tell...
> >>> >> >>> >> >> >> >>>>> >>> On May 30, 2013 12:24 PM, "Scott Le Grand"
> >>> >> >>> >> >> >> >>>>> <varelse2005.gmail.com>
> >>> >> >>> >> >> >> >>>>> >>> wrote:
> >>> >> >>> >> >> >> >>>>> >>>
> >>> >> >>> >> >> >> >>>>> >>> If the errors are not deterministically
> >>> >> triggered,
> >>> >> >>> they
> >>> >> >>> >> >> >> probably
> >>> >> >>> >> >> >> >>>>> >>> won't be
> >>> >> >>> >> >> >> >>>>> >>>> fixed by the patch, alas...
> >>> >> >>> >> >> >> >>>>> >>>> On May 30, 2013 12:15 PM, "Jason Swails"
> >>> >> >>> >> >> >> >>>>> <jason.swails.gmail.com>
> >>> >> >>> >> >> >> >>>>> >>>> wrote:
> >>> >> >>> >> >> >> >>>>> >>>>
> >>> >> >>> >> >> >> >>>>> >>>> Just a reminder to everyone based on what
> >>> Ross
> >>> >> >>> said:
> >>> >> >>> >> >> there
> >>> >> >>> >> >> >> is a
> >>> >> >>> >> >> >> >>>>> >>>> pending
> >>> >> >>> >> >> >> >>>>> >>>>> patch to pmemd.cuda that will be coming out
> >>> >> >>> shortly
> >>> >> >>> >> >> (maybe
> >>> >> >>> >> >> >> even
> >>> >> >>> >> >> >> >>>>> >>>>> within
> >>> >> >>> >> >> >> >>>>> >>>>> hours). It's entirely possible that
> several
> >>> of
> >>> >> >>> these
> >>> >> >>> >> > errors
> >>> >> >>> >> >> >> >>>>> are
> >>> >> >>> >> >> >> >>>>> >>>>> fixed
> >>> >> >>> >> >> >> >>>>> >>>>> by
> >>> >> >>> >> >> >> >>>>> >>>>> this patch.
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>> All the best,
> >>> >> >>> >> >> >> >>>>> >>>>> Jason
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>> On Thu, May 30, 2013 at 2:46 PM, filip
> >>> fratev <
> >>> >> >>> >> >> >> >>>>> filipfratev.yahoo.com>
> >>> >> >>> >> >> >> >>>>> >>>>> wrote:
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>> > I have observed the same crashes from
> time
> >>> to
> >>> >> >>> time.
> >>> >> >>> I
> >>> >> >>> >> > will
> >>> >> >>> >> >> >> >>>>> run
> >>> >> >>> >> >> >> >>>>> >>>>> cellulose
> >>> >> >>> >> >> >> >>>>> >>>>> > nve for 100k and will past results here.
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > All the best,
> >>> >> >>> >> >> >> >>>>> >>>>> > Filip
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > ______________________________****__
> >>> >> >>> >> >> >> >>>>> >>>>> > From: Scott Le Grand <
> >>> varelse2005.gmail.com
> >>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > To: AMBER Mailing List <
> amber.ambermd.org>
> >>> >> >>> >> >> >> >>>>> >>>>> > Sent: Thursday, May 30, 2013 9:01 PM
> >>> >> >>> >> >> >> >>>>> >>>>> > Subject: Re: [AMBER] experiences with
> EVGA
> >>> >> GTX
> >>> >> >>> TITAN
> >>> >> >>> >> >> >> >>>>> Superclocked
> >>> >> >>> >> >> >> >>>>> -
> >>> >> >>> >> >> >> >>>>> >>>>> > memtestG80 - UNDERclocking in Linux ?
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > Run cellulose nve for 100k iterations
> >>> twice .
> >>> >> >>> If
> >>> >> >>> the
> >>> >> >>> >> >> >> final
> >>> >> >>> >> >> >> >>>>> >>>>> energies
> >>> >> >>> >> >> >> >>>>> >>>>> don't
> >>> >> >>> >> >> >> >>>>> >>>>> > match, you have a hardware issue. No
> need
> >>> to
> >>> >> >>> play
> >>> >> >>> >> with
> >>> >> >>> >> >> >> ntpr
> >>> >> >>> >> >> >> >>>>> or
> >>> >> >>> >> >> >> >>>>> any
> >>> >> >>> >> >> >> >>>>> >>>>> other
> >>> >> >>> >> >> >> >>>>> >>>>> > variable.
> >>> >> >>> >> >> >> >>>>> >>>>> > On May 30, 2013 10:58 AM,
> >>> >> <pavel.banas.upol.cz>
> >>> >> >>> >> wrote:
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > Dear all,
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > I would also like to share one of my
> >>> >> >>> experience
> >>> >> >>> with
> >>> >> >>> >> >> >> titan
> >>> >> >>> >> >> >> >>>>> >>>>> cards. We
> >>> >> >>> >> >> >> >>>>> >>>>> have
> >>> >> >>> >> >> >> >>>>> >>>>> > > one gtx titan card and with one system
> >>> >> (~55k
> >>> >> >>> atoms,
> >>> >> >>> >> > NVT,
> >>> >> >>> >> >> >> >>>>> >>>>> RNA+waters)
> >>> >> >>> >> >> >> >>>>> >>>>> we
> >>> >> >>> >> >> >> >>>>> >>>>> > run
> >>> >> >>> >> >> >> >>>>> >>>>> > > into same troubles you are describing.
> I
> >>> >> was
> >>> >> >>> also
> >>> >> >>> >> >> >> playing
> >>> >> >>> >> >> >> >>>>> with
> >>> >> >>> >> >> >> >>>>> >>>>> ntpr
> >>> >> >>> >> >> >> >>>>> >>>>> to
> >>> >> >>> >> >> >> >>>>> >>>>> > > figure out what is going on, step by
> >>> step.
> >>> >> I
> >>> >> >>> >> >> understand
> >>> >> >>> >> >> >> >>>>> that
> >>> >> >>> >> >> >> >>>>> the
> >>> >> >>> >> >> >> >>>>> >>>>> code
> >>> >> >>> >> >> >> >>>>> >>>>> is
> >>> >> >>> >> >> >> >>>>> >>>>> > > using different routines for
> calculation
> >>> >> >>> >> >> >> energies+forces or
> >>> >> >>> >> >> >> >>>>> only
> >>> >> >>> >> >> >> >>>>> >>>>> forces.
> >>> >> >>> >> >> >> >>>>> >>>>> > > The
> >>> >> >>> >> >> >> >>>>> >>>>> > > simulations of other systems are
> >>> perfectly
> >>> >> >>> stable,
> >>> >> >>> >> >> >> running
> >>> >> >>> >> >> >> >>>>> for
> >>> >> >>> >> >> >> >>>>> >>>>> days
> >>> >> >>> >> >> >> >>>>> >>>>> and
> >>> >> >>> >> >> >> >>>>> >>>>> > > weeks. Only that particular system
> >>> >> >>> systematically
> >>> >> >>> >> >> ends
> >>> >> >>> >> >> >> up
> >>> >> >>> >> >> >> >>>>> with
> >>> >> >>> >> >> >> >>>>> >>>>> this
> >>> >> >>> >> >> >> >>>>> >>>>> > error.
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > However, there was one interesting
> issue.
> >>> >> When
> >>> >> >>> I
> >>> >> >>> set
> >>> >> >>> >> >> >> >>>>> ntpr=1,
> >>> >> >>> >> >> >> >>>>> the
> >>> >> >>> >> >> >> >>>>> >>>>> error
> >>> >> >>> >> >> >> >>>>> >>>>> > > vanished (systematically in multiple
> >>> runs)
> >>> >> and
> >>> >> >>> the
> >>> >> >>> >> >> >> >>>>> simulation
> >>> >> >>> >> >> >> >>>>> was
> >>> >> >>> >> >> >> >>>>> >>>>> able to
> >>> >> >>> >> >> >> >>>>> >>>>> > > run for more than millions of steps (I
> >>> was
> >>> >> not
> >>> >> >>> let
> >>> >> >>> >> it
> >>> >> >>> >> >> >> >>>>> running
> >>> >> >>> >> >> >> >>>>> for
> >>> >> >>> >> >> >> >>>>> >>>>> weeks
> >>> >> >>> >> >> >> >>>>> >>>>> > as
> >>> >> >>> >> >> >> >>>>> >>>>> > > in the meantime I shifted that
> simulation
> >>> >> to
> >>> >> >>> other
> >>> >> >>> >> >> card
> >>> >> >>> >> >> >> -
> >>> >> >>> >> >> >> >>>>> need
> >>> >> >>> >> >> >> >>>>> >>>>> data,
> >>> >> >>> >> >> >> >>>>> >>>>> not
> >>> >> >>> >> >> >> >>>>> >>>>> > > testing). All other setting of ntpr
> >>> >> failed. As
> >>> >> >>> I
> >>> >> >>> >> read
> >>> >> >>> >> >> >> this
> >>> >> >>> >> >> >> >>>>> >>>>> discussion, I
> >>> >> >>> >> >> >> >>>>> >>>>> > > tried to set ene_avg_sampling=1 with
> some
> >>> >> high
> >>> >> >>> value
> >>> >> >>> >> >> of
> >>> >> >>> >> >> >> >>>>> ntpr
> >>> >> >>> >> >> >> >>>>> (I
> >>> >> >>> >> >> >> >>>>> >>>>> expected
> >>> >> >>> >> >> >> >>>>> >>>>> > > that this will shift the code to
> >>> >> permanently
> >>> >> >>> use
> >>> >> >>> the
> >>> >> >>> >> >> >> >>>>> >>>>> force+energies
> >>> >> >>> >> >> >> >>>>> >>>>> part
> >>> >> >>> >> >> >> >>>>> >>>>> > of
> >>> >> >>> >> >> >> >>>>> >>>>> > > the code, similarly to ntpr=1), but the
> >>> >> error
> >>> >> >>> >> >> occurred
> >>> >> >>> >> >> >> >>>>> again.
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > I know it is not very conclusive for
> >>> >> finding
> >>> >> >>> out
> >>> >> >>> >> what
> >>> >> >>> >> > is
> >>> >> >>> >> >> >> >>>>> >>>>> happening,
> >>> >> >>> >> >> >> >>>>> >>>>> at
> >>> >> >>> >> >> >> >>>>> >>>>> > > least
> >>> >> >>> >> >> >> >>>>> >>>>> > > not for me. Do you have any idea, why
> >>> >> ntpr=1
> >>> >> >>> might
> >>> >> >>> >> > help?
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > best regards,
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > Pavel
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > --
> >>> >> >>> >> >> >> >>>>> >>>>> > > Pavel Banáš
> >>> >> >>> >> >> >> >>>>> >>>>> > > pavel.banas.upol.cz
> >>> >> >>> >> >> >> >>>>> >>>>> > > Department of Physical Chemistry,
> >>> >> >>> >> >> >> >>>>> >>>>> > > Palacky University Olomouc
> >>> >> >>> >> >> >> >>>>> >>>>> > > Czech Republic
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > ---------- Původní zpráva ----------
> >>> >> >>> >> >> >> >>>>> >>>>> > > Od: Jason Swails <
> jason.swails.gmail.com
> >>> >
> >>> >> >>> >> >> >> >>>>> >>>>> > > Datum: 29. 5. 2013
> >>> >> >>> >> >> >> >>>>> >>>>> > > Předmět: Re: [AMBER] experiences with
> >>> EVGA
> >>> >> GTX
> >>> >> >>> TITAN
> >>> >> >>> >> >> >> >>>>> >>>>> Superclocked -
> >>> >> >>> >> >> >> >>>>> >>>>> > > memtestG
> >>> >> >>> >> >> >> >>>>> >>>>> > > 80 - UNDERclocking in Linux ?
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > "I'll answer a little bit:
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > NTPR=10 Etot after 2000 steps
> >>> >> >>> >> >> >> >>>>> >>>>> > > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > > -443256.6711
> >>> >> >>> >> >> >> >>>>> >>>>> > > > -443256.6711
> >>> >> >>> >> >> >> >>>>> >>>>> > > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > > NTPR=200 Etot after 2000 steps
> >>> >> >>> >> >> >> >>>>> >>>>> > > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > > -443261.0705
> >>> >> >>> >> >> >> >>>>> >>>>> > > > -443261.0705
> >>> >> >>> >> >> >> >>>>> >>>>> > > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > > Any idea why energies should depend
> on
> >>> >> >>> frequency
> >>> >> >>> >> of
> >>> >> >>> >> >> >> >>>>> energy
> >>> >> >>> >> >> >> >>>>> >>>>> records
> >>> >> >>> >> >> >> >>>>> >>>>> > (NTPR)
> >>> >> >>> >> >> >> >>>>> >>>>> > > ?
> >>> >> >>> >> >> >> >>>>> >>>>> > > >
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > It is a subtle point, but the answer is
> >>> >> >>> 'different
> >>> >> >>> >> >> code
> >>> >> >>> >> >> >> >>>>> paths.'
> >>> >> >>> >> >> >> >>>>> >>>>> In
> >>> >> >>> >> >> >> >>>>> >>>>> > > general, it is NEVER necessary to
> compute
> >>> >> the
> >>> >> >>> actual
> >>> >> >>> >> >> >> energy
> >>> >> >>> >> >> >> >>>>> of a
> >>> >> >>> >> >> >> >>>>> >>>>> molecule
> >>> >> >>> >> >> >> >>>>> >>>>> > > during the course of standard molecular
> >>> >> >>> dynamics
> >>> >> >>> (by
> >>> >> >>> >> >> >> >>>>> analogy, it
> >>> >> >>> >> >> >> >>>>> >>>>> is
> >>> >> >>> >> >> >> >>>>> >>>>> NEVER
> >>> >> >>> >> >> >> >>>>> >>>>> > > necessary to compute atomic forces
> during
> >>> >> the
> >>> >> >>> course
> >>> >> >>> >> >> of
> >>> >> >>> >> >> >> >>>>> random
> >>> >> >>> >> >> >> >>>>> >>>>> Monte
> >>> >> >>> >> >> >> >>>>> >>>>> > Carlo
> >>> >> >>> >> >> >> >>>>> >>>>> > > sampling).
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > For performance's sake, then,
> pmemd.cuda
> >>> >> >>> computes
> >>> >> >>> >> >> only
> >>> >> >>> >> >> >> the
> >>> >> >>> >> >> >> >>>>> force
> >>> >> >>> >> >> >> >>>>> >>>>> when
> >>> >> >>> >> >> >> >>>>> >>>>> > > energies are not requested, leading to
> a
> >>> >> >>> different
> >>> >> >>> >> >> >> order of
> >>> >> >>> >> >> >> >>>>> >>>>> operations
> >>> >> >>> >> >> >> >>>>> >>>>> > for
> >>> >> >>> >> >> >> >>>>> >>>>> > > those runs. This difference ultimately
> >>> >> causes
> >>> >> >>> >> >> >> divergence.
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > To test this, try setting the variable
> >>> >> >>> >> >> >> ene_avg_sampling=10
> >>> >> >>> >> >> >> >>>>> in
> >>> >> >>> >> >> >> >>>>> the
> >>> >> >>> >> >> >> >>>>> >>>>> &cntrl
> >>> >> >>> >> >> >> >>>>> >>>>> > > section. This will force pmemd.cuda to
> >>> >> compute
> >>> >> >>> >> >> energies
> >>> >> >>> >> >> >> >>>>> every 10
> >>> >> >>> >> >> >> >>>>> >>>>> steps
> >>> >> >>> >> >> >> >>>>> >>>>> > > (for energy averaging), which will in
> >>> turn
> >>> >> >>> make
> >>> >> >>> the
> >>> >> >>> >> >> >> >>>>> followed
> >>> >> >>> >> >> >> >>>>> code
> >>> >> >>> >> >> >> >>>>> >>>>> path
> >>> >> >>> >> >> >> >>>>> >>>>> > > identical for any multiple-of-10 value
> of
> >>> >> >>> ntpr.
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> > > --
> >>> >> >>> >> >> >> >>>>> >>>>> > > Jason M. Swails
> >>> >> >>> >> >> >> >>>>> >>>>> > > Quantum Theory Project,
> >>> >> >>> >> >> >> >>>>> >>>>> > > University of Florida
> >>> >> >>> >> >> >> >>>>> >>>>> > > Ph.D. Candidate
> >>> >> >>> >> >> >> >>>>> >>>>> > > 352-392-4032
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >>>>> >>>>> > > _______________________________________________
> >>> >> >>> >> >>>>> >>>>> > > AMBER mailing list
> >>> >> >>> >> >>>>> >>>>> > > AMBER.ambermd.org
> >>> >> >>> >> >>>>> >>>>> > > http://lists.ambermd.org/mailman/listinfo/amber
> >>> >> >>> >> >> >> >>>>> >>>>> "
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >>>>> >>>>> > > _______________________________________________
> >>> >> >>> >> >>>>> >>>>> > > AMBER mailing list
> >>> >> >>> >> >>>>> >>>>> > > AMBER.ambermd.org
> >>> >> >>> >> >>>>> >>>>> > > http://lists.ambermd.org/mailman/listinfo/amber
> >>> >> >>> >> >> >> >>>>> >>>>> > >
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >>>>> >>>>> > _______________________________________________
> >>> >> >>> >> >>>>> >>>>> > AMBER mailing list
> >>> >> >>> >> >>>>> >>>>> > AMBER.ambermd.org
> >>> >> >>> >> >>>>> >>>>> > http://lists.ambermd.org/mailman/listinfo/amber
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >>>>> >>>>> > _______________________________________________
> >>> >> >>> >> >>>>> >>>>> > AMBER mailing list
> >>> >> >>> >> >>>>> >>>>> > AMBER.ambermd.org
> >>> >> >>> >> >>>>> >>>>> > http://lists.ambermd.org/mailman/listinfo/amber
> >>> >> >>> >> >> >> >>>>> >>>>> >
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>> --
> >>> >> >>> >> >> >> >>>>> >>>>> Jason M. Swails
> >>> >> >>> >> >> >> >>>>> >>>>> Quantum Theory Project,
> >>> >> >>> >> >> >> >>>>> >>>>> University of Florida
> >>> >> >>> >> >> >> >>>>> >>>>> Ph.D. Candidate
> >>> >> >>> >> >> >> >>>>> >>>>> 352-392-4032
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>> _______________________________________________
> >>> >> >>> >> >> >> >>>>> >>>>> AMBER mailing list
> >>> >> >>> >> >> >> >>>>> >>>>> AMBER.ambermd.org
> >>> >> >>> >> >> >> >>>>> >>>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>>
> >>> >> >>> >> >> >> >>>>> >>>>
> >>
> >>
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
> >
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Jun 01 2013 - 15:30:02 PDT