Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Marek Maly <marek.maly.ujep.cz>
Date: Sat, 01 Jun 2013 21:34:17 +0200

Sorry,

regarding that double-precision mode: does it perhaps mean recompiling the
GPU part of Amber with the "DPDP" configure setting? Am I right?
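
(For completeness, a sketch of what I assume such a rebuild looks like,
assuming the Amber 12 configure script accepts a -cuda_DPDP flag; please
correct me if the flag is named differently:)

    cd $AMBERHOME
    make clean
    # assumed flag; the default mixed-precision GPU build is "./configure -cuda gnu"
    ./configure -cuda_DPDP gnu
    make install    # should produce a separate pmemd.cuda_DPDP binary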

   M.


On Sat, 01 Jun 2013 21:26:42 +0200 Marek Maly <marek.maly.ujep.cz>
wrote:

> Hi Scott,
>
> please, how can I activate double-precision mode?
>
> Is it something that could be enabled using nvidia-smi?
>
>
> Regarding this comment of yours:
>
> --------
> The only thing that's funny about these tests is how little they diverge.
> So I am *hoping* this might be a bug in cuFFT rather than a GTX Titan HW
> ------
>
> So you mean the problem here might be in the CUDA 5.0 implementation of
> cuFFT? If so, it would mean some kind of "incompatibility" specific to
> the Titans, as with the GTX 580 and GTX 680 I obtained perfect
> reproducibility in all tests (see my older posts in this thread).
>
> So perhaps it would be a good idea to try CUDA 5.5, where cuFFT may be
> more "compatible" with the new Titans (or also the GTX 780s).
>
> I will do this experiment, but before that I would like to check
> whether bugfix 18 solves at least some of the reported issues
> (still using CUDA 5.0, which is also the latest version officially
> compatible with the Amber code, as reported at http://ambermd.org/gpus/ ).
>
> M.
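
(A note on the double-precision-mode question above: as far as I can tell,
on Linux this switch lives in nvidia-settings rather than nvidia-smi. A
sketch, assuming the attribute is called GPUDoublePrecisionBoostImmediate;
"nvidia-settings -q all | grep -i double" should reveal the real name:)

    # enable full-rate FP64 on GPU 0 until reboot; this also downclocks the card
    nvidia-settings -a [gpu:0]/GPUDoublePrecisionBoostImmediate=1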
>
>
> On Sat, 01 Jun 2013 20:46:29 +0200 Scott Le Grand <varelse2005.gmail.com>
> wrote:
>
>> The acid test is running on a K20. If K20 is OK, then I really think
>> (99.5%) Titan is hosed...
>>
>> If K20 shows the same irreproducible behavior, my life gets a whole lot
>> more interesting...
>>
>> But along those lines, could you try activating double-precision mode
>> and
>> retesting? That ought to clock the thing down significantly, and if it
>> suddenly runs reproducibly, then 99.5% this is a Titan HW issue...
>>
>> Scott
>>
>>
>> On Sat, Jun 1, 2013 at 11:26 AM, ET <sketchfoot.gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I've put the graphics card into a machine with the working GTX titan
>>> that I
>>> mentioned earlier.
>>>
>>> The NVIDIA driver version is: 313.30
>>>
>>> Amber version is:
>>> AmberTools version 13.03
>>> Amber version 12.16
>>>
>>> I ran 50k steps with the Amber benchmark using ig=43689 on both cards.
>>> For
>>> the purpose of discriminating between them, the card I believe (fingers
>>> crossed) is working is called GPU-00_TeaNCake, whilst the other one is
>>> called GPU-01_008.
>>>
>>> *When I run the tests on GPU-01_008:*
>>>
>>> 1) All the tests (across 2x repeats) finish apart from the following
>>> which
>>> have the errors listed:
>>>
>>> --------------------------------------------
>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>
>>> --------------------------------------------
>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>
>>> --------------------------------------------
>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>
>>> --------------------------------------------
>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>> grep: mdinfo.1GTX680: No such file or directory
>>>
>>>
>>>
>>> 2) The sdiff logs indicate that reproducibility across the two repeats
>>> is
>>> as follows:
>>>
>>> *GB_myoglobin: *Reproducible across 50k steps
>>> *GB_nucleosome:* Reproducible till step 7400
>>> *GB_TRPCage:* Reproducible across 50k steps
>>>
>>> *PME_JAC_production_NVE:* No reproducibility shown from step 1,000
>>> onwards
>>> *PME_JAC_production_NPT:* Reproducible till step 1,000. Also the outfile
>>> is not written properly - blank gaps appear where something should have
>>> been written
>>>
>>> *PME_FactorIX_production_NVE:* Reproducible across 50k steps
>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
>>>
>>> *PME_Cellulose_production_NVE:* Failure means that both runs do not
>>> finish (see point 1)
>>> *PME_Cellulose_production_NPT:* Failure means that both runs do not
>>> finish (see point 1)
>>>
>>>
>>> #######################################################################################
>>>
>>> *When I run the tests on GPU-00_TeaNCake:*
>>> 1) All the tests (across 2x repeats) finish apart from the following
>>> which
>>> have the errors listed:
>>> -------------------------------------
>>> JAC_PRODUCTION_NPT - 23,558 atoms PME
>>> PMEMD Terminated Abnormally!
>>> -------------------------------------
>>>
>>>
>>> 2) The sdiff logs indicate that reproducibility across the two repeats
>>> is
>>> as follows:
>>>
>>> *GB_myoglobin:* Reproducible across 50k steps
>>> *GB_nucleosome:* Reproducible across 50k steps
>>> *GB_TRPCage:* Reproducible across 50k steps
>>>
>>> *PME_JAC_production_NVE:* No reproducibility shown from step 10,000
>>> onwards
>>> *PME_JAC_production_NPT:* No reproducibility shown from step 10,000
>>> onwards. Also the outfile is not written properly - blank gaps appear
>>> where something should have been written. Repeat 2 crashes with the
>>> error noted in 1.
>>>
>>> *PME_FactorIX_production_NVE:* No reproducibility shown from step 9,000
>>> onwards
>>> *PME_FactorIX_production_NPT: *Reproducible across 50k steps
>>>
>>> *PME_Cellulose_production_NVE:* No reproducibility shown from step
>>> 5,000 onwards
>>> *PME_Cellulose_production_NPT:* No reproducibility shown from step
>>> 29,000 onwards. Also the outfile is not written properly - blank gaps
>>> appear where something should have been written.
>>>
>>>
>>> Out files and sdiff files are included as attachments.
>>>
>>> #################################################
>>>
>>> So I'm going to update my NVIDIA driver to the latest version, patch
>>> Amber to the latest version, and rerun the tests to see if there is any
>>> improvement. Could someone let me know if it is necessary to recompile
>>> any or all of AMBER after applying the bugfixes?
>>>
>>> Additionally, I'm going to run memory tests and Heaven benchmarks on the
>>> cards to check whether they are faulty or not.
>>>
>>> I'm thinking that there is a mix of hardware error/configuration (esp.
>>> in the case of GPU-01_008) and Amber software error in this situation.
>>> What do you guys think?
>>>
>>> Also, am I right in thinking (from what Scott was saying) that all the
>>> benchmarks should be reproducible across 50k steps but begin to diverge
>>> at around 100K steps? Is there any difference between setting *ig* to an
>>> explicit number and removing it from the mdin file?
>>>
>>> br,
>>> g
>>>
>>>
>>> On 31 May 2013 23:45, ET <sketchfoot.gmail.com> wrote:
>>>
>>> > I don't need sysadmins, but sysadmins need me, as it gives purpose to
>>> > their bureaucratic existence. An evil encountered when working in an
>>> > institution or company, IMO. Good science and individuality being
>>> > sacrificed for standardisation and mediocrity in the interests of
>>> > maintaining a system that focuses on maintaining the system and not
>>> > the objective.
>>> >
>>> > You need root to move fwd on these things, unfortunately. And ppl with
>>> > root are kinda like your parents when you try to borrow money from them
>>> > at age 12 :D
>>> > On May 31, 2013 9:34 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>> >
>>> >> Sorry, why do you need sysadmins :)) ?
>>> >>
>>> >> BTW here is the most recent driver:
>>> >>
>>> >> http://www.nvidia.com/object/linux-display-amd64-319.23-driver.html
>>> >>
>>> >> I do not remember anything easier than installing a driver (especially
>>> >> in the case of the binary (*.run) installer) :))
>>> >>
>>> >> M.
>>> >>
>>> >>
>>> >>
>>> >> On Fri, 31 May 2013 22:02:34 +0200 ET <sketchfoot.gmail.com> wrote:
>>> >>
>>> >> > Yup, I know. I replaced a 680, and the ever-knowing sysadmins are
>>> >> > reluctant to install drivers not in the repository, as they are lame. :(
>>> >> > On May 31, 2013 7:14 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>> >> >>
>>> >> >> As I already wrote you,
>>> >> >>
>>> >> >> the first driver which properly/officially supports Titans should be
>>> >> >> 313.26.
>>> >> >>
>>> >> >> Anyway, I am curious mainly about your 100K repeated tests with your
>>> >> >> Titan SC card, especially in the case of those tests (JAC_NVE, JAC_NPT
>>> >> >> and CELLULOSE_NVE) where my Titan SC cards randomly failed or
>>> >> >> succeeded. In the FACTOR_IX_NVE and FACTOR_IX_NPT tests both my cards
>>> >> >> are perfectly stable (independently of the driver version) and the
>>> >> >> runs are also perfectly or almost perfectly reproducible.
>>> >> >>
>>> >> >> Also, if your test crashes, please report any errors.
>>> >> >>
>>> >> >> Up to this moment I have this library of errors on my Titan SC
>>> >> >> GPUs.
>>> >> >>
>>> >> >> #1 ERR written in mdout:
>>> >> >> ------
>>> >> >> | ERROR: max pairlist cutoff must be less than unit cell max
>>> sphere
>>> >> >> radius!
>>> >> >> ------
>>> >> >>
>>> >> >>
>>> >> >> #2 no ERR written in mdout, ERR written in standard output
>>> >> >> (nohup.out)
>>> >> >>
>>> >> >> ----
>>> >> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>> >> >> ----
>>> >> >>
>>> >> >>
>>> >> >> #3 no ERR written in mdout, ERR written in standard output
>>> >> >> (nohup.out)
>>> >> >> ----
>>> >> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>> >> >> ----
>>> >> >>
>>> >> >> Another question regarding your Titan SC: is it also EVGA, as in my
>>> >> >> case, or from another manufacturer?
>>> >> >>
>>> >> >> Thanks,
>>> >> >>
>>> >> >> M.
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Fri, 31 May 2013 19:17:03 +0200 ET <sketchfoot.gmail.com> wrote:
>>> >> >>
>>> >> >> > Well, this is interesting...
>>> >> >> >
>>> >> >> > I ran 50k steps on the Titan on the other machine with driver 310.44
>>> >> >> > and it passed all the GB tests, i.e. totally identical results over
>>> >> >> > two repeats. However, it failed all the PME tests after step 1000.
>>> >> >> > I'm going to update the driver and test it again.
>>> >> >> >
>>> >> >> > Files included as attachments.
>>> >> >> >
>>> >> >> > br,
>>> >> >> > g
>>> >> >> >
>>> >> >> >
>>> >> >> > On 31 May 2013 16:40, Marek Maly <marek.maly.ujep.cz> wrote:
>>> >> >> >
>>> >> >> >> One more thing,
>>> >> >> >>
>>> >> >> >> can you please check at which frequency that Titan of yours is
>>> >> >> >> running?
>>> >> >> >>
>>> >> >> >> As the base frequency of normal Titans is 837 MHz and the boost one
>>> >> >> >> is 876 MHz, I assume that your GPU is automatically running at its
>>> >> >> >> boost frequency (876 MHz).
>>> >> >> >> You can find this information e.g. in the Amber mdout file.
>>> >> >> >>
>>> >> >> >> You also mentioned some crashes in your previous email. Were your
>>> >> >> >> errors something like those here:
>>> >> >> >>
>>> >> >> >> #1 ERR written in mdout:
>>> >> >> >> ------
>>> >> >> >> | ERROR: max pairlist cutoff must be less than unit cell max
>>> >> sphere
>>> >> >> >> radius!
>>> >> >> >> ------
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> #2 no ERR written in mdout, ERR written in standard output
>>> >> >> >> (nohup.out)
>>> >> >> >>
>>> >> >> >> ----
>>> >> >> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>> >> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>> >> >> >> ----
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> #3 no ERR written in mdout, ERR written in standard output
>>> >> >> >> (nohup.out)
>>> >> >> >> ----
>>> >> >> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>> >> >> >> ----
>>> >> >> >>
>>> >> >> >> or did you obtain some new/additional errors?
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> M.
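
(For anyone who wants to check the clocks the same way: a sketch; the grep
assumes the "CUDA Device Core Freq" line that pmemd.cuda prints in the
mdout header, so verify against your own mdout first:)

    nvidia-smi -q -d CLOCK     # current graphics/SM/memory clocks from the driver
    grep "Core Freq" mdout     # clock as reported by pmemd.cuda itself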
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> On Fri, 31 May 2013 17:30:57 +0200 filip fratev
>>> >> >> >> <filipfratev.yahoo.com> wrote:
>>> >> >> >>
>>> >> >> >> > Hi,
>>> >> >> >> > This is what I obtained for 50K tests and a "normal" GTX Titan:
>>> >> >> >> >
>>> >> >> >> > run1:
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >    A V E R A G E S   O V E R      50 S T E P S
>>> >> >> >> >
>>> >> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =   299.87  PRESS =     0.0
>>> >> >> >> >  Etot   =   -443237.1079  EKtot   =    257679.9750  EPtot      =   -700917.0829
>>> >> >> >> >  BOND   =     20193.1856  ANGLE   =     53517.5432  DIHED      =     23575.4648
>>> >> >> >> >  1-4 NB =     21759.5524  1-4 EEL =    742552.5939  VDWAALS    =     96286.7714
>>> >> >> >> >  EELEC  =  -1658802.1941  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >
>>> >> >> >> >    R M S  F L U C T U A T I O N S
>>> >> >> >> >
>>> >> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =     0.33  PRESS =     0.0
>>> >> >> >> >  Etot   =        11.2784  EKtot   =       284.8999  EPtot      =       289.0773
>>> >> >> >> >  BOND   =       136.3417  ANGLE   =       214.0054  DIHED      =        59.4893
>>> >> >> >> >  1-4 NB =        58.5891  1-4 EEL =       330.5400  VDWAALS    =       559.2079
>>> >> >> >> >  EELEC  =       743.8771  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>> >> >> >> >  |E(PBS) =       21.8119
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >
>>> >> >> >> > run2:
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >    A V E R A G E S   O V E R      50 S T E P S
>>> >> >> >> >
>>> >> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =   299.89  PRESS =     0.0
>>> >> >> >> >  Etot   =   -443240.0999  EKtot   =    257700.0950  EPtot      =   -700940.1949
>>> >> >> >> >  BOND   =     20241.9174  ANGLE   =     53644.6694  DIHED      =     23541.3737
>>> >> >> >> >  1-4 NB =     21803.1898  1-4 EEL =    742754.2254  VDWAALS    =     96298.8308
>>> >> >> >> >  EELEC  =  -1659224.4013  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >
>>> >> >> >> >    R M S  F L U C T U A T I O N S
>>> >> >> >> >
>>> >> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =     0.41  PRESS =     0.0
>>> >> >> >> >  Etot   =        10.7633  EKtot   =       348.2819  EPtot      =       353.9918
>>> >> >> >> >  BOND   =       106.5314  ANGLE   =       196.7052  DIHED      =        69.7476
>>> >> >> >> >  1-4 NB =        60.3435  1-4 EEL =       400.7466  VDWAALS    =       462.7763
>>> >> >> >> >  EELEC  =       651.9857  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>> >> >> >> >  |E(PBS) =       17.0642
>>> >> >> >> > ------------------------------------------------------------------------------
>>> >> >> >> >
>>> >> >> >> > --------------------------------------------------------------------------------
>>> >> >> >> >
>>> >> >> >> > ________________________________
>>> >> >> >> > From: Marek Maly <marek.maly.ujep.cz>
>>> >> >> >> > To: AMBER Mailing List <amber.ambermd.org>
>>> >> >> >> > Sent: Friday, May 31, 2013 3:34 PM
>>> >> >> >> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked -
>>> >> >> >> > memtestG80 - UNDERclocking in Linux ?
>>> >> >> >> >
>>> >> >> >> > Hi, here are my 100K results for driver 313.30 (and still CUDA 5.0).
>>> >> >> >> >
>>> >> >> >> > The results are rather similar to those obtained
>>> >> >> >> > under my original driver 319.17 (see the first table
>>> >> >> >> > which I sent in this thread).
>>> >> >> >> >
>>> >> >> >> > M.
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Fri, 31 May 2013 12:29:59 +0200 Marek Maly <marek.maly.ujep.cz>
>>> >> >> >> > wrote:
>>> >> >> >> >
>>> >> >> >> >> Hi,
>>> >> >> >> >>
>>> >> >> >> >> please try to run at least 100K-step tests twice to verify exact
>>> >> >> >> >> reproducibility of the results on the given card. If you find
>>> >> >> >> >> ig=-1 in any mdin file, just delete it to ensure that you are
>>> >> >> >> >> using the identical random seed for both runs. You can omit the
>>> >> >> >> >> NUCLEOSOME test as it is too time-consuming.
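
(A sketch of such a repeated run, with placeholder file names, assuming a
benchmark directory that already contains mdin/prmtop/inpcrd and an mdin
with no ig=-1 line:)

    # run the identical input twice and compare the outputs
    for i in 1 2; do
        $AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c inpcrd \
            -o mdout.run$i -r restrt.run$i -x mdcrd.run$i
    done
    # apart from the timing section, no output here means identical runs
    sdiff -sB mdout.run1 mdout.run2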
>>> >> >> >> >>
>>> >> >> >> >> Driver 310.44 ?????
>>> >> >> >> >>
>>> >> >> >> >> As far as I know, proper support for Titans starts from
>>> version
>>> >> > 313.26
>>> >> >> >> >>
>>> >> >> >> >> see e.g. here :
>>> >> >> >> >>
>>> >> >> >>
>>> >> >
>>> >>
>>> http://www.geeks3d.com/20130306/nvidia-releases-r313-26-for-linux-with-gtx-titan-support/
>>> >> >> >> >>
>>> >> >> >> >> BTW: on my side the downgrade to drv. 313.30 did not solve the
>>> >> >> >> >> situation; I will post my results here soon.
>>> >> >> >> >>
>>> >> >> >> >> M.
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >> On Fri, 31 May 2013 12:21:21 +0200 ET <sketchfoot.gmail.com>
>>> >> >> >> >> wrote:
>>> >> >> >> >>
>>> >> >> >> >>> PS: I have another install of Amber on another computer with a
>>> >> >> >> >>> different Titan and a different driver version: 310.44.
>>> >> >> >> >>>
>>> >> >> >> >>> In the interests of thrashing the proverbial horse, I'll
>>> run
>>> the
>>> >> >> >> >>> benchmark
>>> >> >> >> >>> for 50k steps. :P
>>> >> >> >> >>>
>>> >> >> >> >>> br,
>>> >> >> >> >>> g
>>> >> >> >> >>>
>>> >> >> >> >>>
>>> >> >> >> >>> On 31 May 2013 11:17, ET <sketchfoot.gmail.com> wrote:
>>> >> >> >> >>>
>>> >> >> >> >>>> Hi, I just ran the Amber benchmark for the default (10000
>>> >> steps)
>>> >> >> >> on my
>>> >> >> >> >>>> Titan.
>>> >> >> >> >>>>
>>> >> >> >> >>>> Using sdiff -sB showed that the two runs were completely
>>> >> > identical.
>>> >> >> >> >>>> I've
>>> >> >> >> >>>> attached compressed files of the mdout & diff files.
>>> >> >> >> >>>>
>>> >> >> >> >>>> br,
>>> >> >> >> >>>> g
>>> >> >> >> >>>>
>>> >> >> >> >>>>
>>> >> >> >> >>>> On 30 May 2013 23:41, Marek Maly <marek.maly.ujep.cz>
>>> wrote:
>>> >> >> >> >>>>
>>> >> >> >> >>>>> OK, let's see. I see eventual downclocking as the very last
>>> >> >> >> >>>>> resort (if I don't decide to RMA). But for now some other
>>> >> >> >> >>>>> experiments are still available :))
>>> >> >> >> >>>>> I just started 100K tests under the 313.30 driver. For today,
>>> >> >> >> >>>>> good night ...
>>> >> >> >> >>>>>
>>> >> >> >> >>>>> M.
>>> >> >> >> >>>>>
>>> >> >> >> >>>>> On Fri, 31 May 2013 00:45:49 +0200 Scott Le Grand
>>> >> >> >> >>>>> <varelse2005.gmail.com> wrote:
>>> >> >> >> >>>>>
>>> >> >> >> >>>>> > It will be very interesting if this behavior persists
>>> after
>>> >> >> >> >>>>> downclocking.
>>> >> >> >> >>>>> >
>>> >> >> >> >>>>> > But right now, Titan 0 *looks* hosed and Titan 1 *looks* like
>>> >> >> >> >>>>> > it needs downclocking...
>>> >> >> >> >>>>> > On May 30, 2013 3:20 PM, "Marek Maly" <marek.maly.ujep.cz>
>>> >> >> >> >>>>> > wrote:
>>> >> >> >> >>>>> >
>>> >> >> >> >>>>> >> Hi all,
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> here are my results from the 500K-step, twice-repeated
>>> >> >> >> >>>>> >> benchmarks under the 319.23 driver and still CUDA 5.0 (see
>>> >> >> >> >>>>> >> the attached table).
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> It is hard to say if the results are better or worse than in
>>> >> >> >> >>>>> >> my previous 100K test under driver 319.17.
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> The results from the Cellulose test improved, and the TITAN_1
>>> >> >> >> >>>>> >> card even successfully finished all 500K steps, moreover with
>>> >> >> >> >>>>> >> exactly the same final energy! (TITAN_0 at least finished more
>>> >> >> >> >>>>> >> than 100K steps, and in RUN_01 even more than 400K steps.)
>>> >> >> >> >>>>> >> In the JAC_NPT test neither GPU was able to finish even 100K
>>> >> >> >> >>>>> >> steps, and the results from the JAC_NVE test are also not too
>>> >> >> >> >>>>> >> convincing. FACTOR_IX_NVE and FACTOR_IX_NPT finished
>>> >> >> >> >>>>> >> successfully, with 100% reproducibility in the FACTOR_IX_NPT
>>> >> >> >> >>>>> >> case (on both cards) and almost 100% reproducibility in the
>>> >> >> >> >>>>> >> FACTOR_IX_NVE case (again 100% for TITAN_1). TRPCAGE and
>>> >> >> >> >>>>> >> MYOGLOBIN again finished without any problem, with 100%
>>> >> >> >> >>>>> >> reproducibility. The NUCLEOSOME test was not done this time
>>> >> >> >> >>>>> >> due to its high time requirements. If you find in the table a
>>> >> >> >> >>>>> >> positive number ending with K (which means "thousands"), it is
>>> >> >> >> >>>>> >> the last step number written in mdout before the crash.
>>> >> >> >> >>>>> >> Below are all 3 types of detected errors, with the relevant
>>> >> >> >> >>>>> >> systems/rounds where each error appeared.
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> Now I will try just 100K tests under ET's favourite driver
>>> >> >> >> >>>>> >> version 313.30 :)) and then I will eventually try to
>>> >> >> >> >>>>> >> experiment with CUDA 5.5, which I already downloaded from the
>>> >> >> >> >>>>> >> CUDA zone (I had to become a CUDA developer for this :)) ).
>>> >> >> >> >>>>> >> BTW ET, thanks for the frequency info! I am still (perhaps not
>>> >> >> >> >>>>> >> alone :)) ) very curious about your 2x repeated Amber
>>> >> >> >> >>>>> >> benchmark tests with the superclocked Titan. Indeed, I am also
>>> >> >> >> >>>>> >> very curious about that Ross "hot" patch.
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> ERRORS DETECTED DURING THE 500K-STEP TESTS WITH DRIVER 319.23
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> #1 ERR written in mdout:
>>> >> >> >> >>>>> >> ------
>>> >> >> >> >>>>> >> | ERROR: max pairlist cutoff must be less than unit cell max
>>> >> >> >> >>>>> >> sphere radius!
>>> >> >> >> >>>>> >> ------
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> TITAN_0 ROUND_1 JAC_NPT (at least 5000 steps successfully done
>>> >> >> >> >>>>> >> before crash)
>>> >> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NPT (at least 8000 steps successfully done
>>> >> >> >> >>>>> >> before crash)
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> #2 no ERR written in mdout, ERR written in standard output
>>> >> >> >> >>>>> >> (nohup.out):
>>> >> >> >> >>>>> >> ----
>>> >> >> >> >>>>> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>> >> >> >> >>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>> >> >> >> >>>>> >> ----
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> TITAN_0 ROUND_1 CELLULOSE_NVE (at least 437 000 steps
>>> >> >> >> >>>>> >> successfully done before crash)
>>> >> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NVE (at least 162 000 steps successfully
>>> >> >> >> >>>>> >> done before crash)
>>> >> >> >> >>>>> >> TITAN_0 ROUND_2 CELLULOSE_NVE (at least 117 000 steps
>>> >> >> >> >>>>> >> successfully done before crash)
>>> >> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NVE (at least 119 000 steps successfully
>>> >> >> >> >>>>> >> done before crash)
>>> >> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NVE (at least 43 000 steps successfully
>>> >> >> >> >>>>> >> done before crash)
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> #3 no ERR written in mdout, ERR written in standard output
>>> >> >> >> >>>>> >> (nohup.out):
>>> >> >> >> >>>>> >> ----
>>> >> >> >> >>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>> >> >> >> >>>>> >> ----
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NPT (at least 77 000 steps successfully
>>> >> >> >> >>>>> >> done before crash)
>>> >> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NPT (at least 58 000 steps successfully
>>> >> >> >> >>>>> >> done before crash)
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> On Thu, 30 May 2013 21:27:17 +0200 Scott Le Grand
>>> >> >> >> >>>>> >> <varelse2005.gmail.com> wrote:
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >> Oops, meant to send that to Jason...
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>> Anyway, before we all panic, we need to get K20's behavior
>>> >> >> >> >>>>> >>> analyzed here. If it's deterministic, this truly is a
>>> >> >> >> >>>>> >>> hardware issue. If not, then it gets interesting, because the
>>> >> >> >> >>>>> >>> 680 is deterministic as far as I can tell...
>>> >> >> >> >>>>> >>> On May 30, 2013 12:24 PM, "Scott Le Grand" <varelse2005.gmail.com>
>>> >> >> >> >>>>> >>> wrote:
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>>> If the errors are not deterministically triggered, they
>>> >> >> >> >>>>> >>>> probably won't be fixed by the patch, alas...
>>> >> >> >> >>>>> >>>> On May 30, 2013 12:15 PM, "Jason Swails" <jason.swails.gmail.com>
>>> >> >> >> >>>>> >>>> wrote:
>>> >> >> >> >>>>> >>>>
>>> >> >> >> >>>>> >>>>> Just a reminder to everyone based on what Ross said: there
>>> >> >> >> >>>>> >>>>> is a pending patch to pmemd.cuda that will be coming out
>>> >> >> >> >>>>> >>>>> shortly (maybe even within hours). It's entirely possible
>>> >> >> >> >>>>> >>>>> that several of these errors are fixed by this patch.
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>> All the best,
>>> >> >> >> >>>>> >>>>> Jason
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>> On Thu, May 30, 2013 at 2:46 PM, filip fratev
>>> >> >> >> >>>>> >>>>> <filipfratev.yahoo.com> wrote:
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>> > I have observed the same crashes from time to time. I
>>> >> >> >> >>>>> >>>>> > will run cellulose NVE for 100k and will paste the
>>> >> >> >> >>>>> >>>>> > results here.
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> > All the best,
>>> >> >> >> >>>>> >>>>> > Filip
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> > ________________________________
>>> >> >> >> >>>>> >>>>> > From: Scott Le Grand <varelse2005.gmail.com>
>>> >> >> >> >>>>> >>>>> > To: AMBER Mailing List <amber.ambermd.org>
>>> >> >> >> >>>>> >>>>> > Sent: Thursday, May 30, 2013 9:01 PM
>>> >> >> >> >>>>> >>>>> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN
>>> >> >> >> >>>>> >>>>> > Superclocked - memtestG80 - UNDERclocking in Linux ?
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> > Run cellulose NVE for 100k iterations twice. If the
>>> >> >> >> >>>>> >>>>> > final energies don't match, you have a hardware issue.
>>> >> >> >> >>>>> >>>>> > No need to play with ntpr or any other variable.
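
(A sketch of that check, assuming two finished mdout files from identical
inputs; the last Etot line printed in each file should match exactly
between the two runs:)

    grep "Etot" mdout.run1 | tail -1
    grep "Etot" mdout.run2 | tail -1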
>>> >> >> >> >>>>> >>>>> > On May 30, 2013 10:58 AM, <pavel.banas.upol.cz>
>>> >> >> >> >>>>> >>>>> > wrote:
>>> >> >> >> >>>>> >>>>> >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > Dear all,
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > I would also like to share one of my experiences with
>>> >> >> >> >>>>> >>>>> > > Titan cards. We have one GTX Titan card, and with one
>>> >> >> >> >>>>> >>>>> > > system (~55k atoms, NVT, RNA+waters) we ran into the
>>> >> >> >> >>>>> >>>>> > > same troubles you are describing. I was also playing
>>> >> >> >> >>>>> >>>>> > > with ntpr to figure out what is going on, step by step.
>>> >> >> >> >>>>> >>>>> > > I understand that the code uses different routines for
>>> >> >> >> >>>>> >>>>> > > calculating energies+forces or only forces. The
>>> >> >> >> >>>>> >>>>> > > simulations of other systems are perfectly stable,
>>> >> >> >> >>>>> >>>>> > > running for days and weeks. Only that particular system
>>> >> >> >> >>>>> >>>>> > > systematically ends up with this error.
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > However, there was one interesting issue. When I set
>>> >> >> >> >>>>> >>>>> > > ntpr=1, the error vanished (systematically, in multiple
>>> >> >> >> >>>>> >>>>> > > runs) and the simulation was able to run for more than
>>> >> >> >> >>>>> >>>>> > > millions of steps (I did not let it run for weeks, as in
>>> >> >> >> >>>>> >>>>> > > the meantime I shifted that simulation to another card:
>>> >> >> >> >>>>> >>>>> > > need data, not testing). All other settings of ntpr
>>> >> >> >> >>>>> >>>>> > > failed. As I read this discussion, I tried to set
>>> >> >> >> >>>>> >>>>> > > ene_avg_sampling=1 with some high value of ntpr (I
>>> >> >> >> >>>>> >>>>> > > expected that this would shift the code to permanently
>>> >> >> >> >>>>> >>>>> > > use the force+energies part of the code, similarly to
>>> >> >> >> >>>>> >>>>> > > ntpr=1), but the error occurred again.
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > I know it is not very conclusive for finding out what is
>>> >> >> >> >>>>> >>>>> > > happening, at least not for me. Do you have any idea why
>>> >> >> >> >>>>> >>>>> > > ntpr=1 might help?
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > best regards,
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > Pavel
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > --
>>> >> >> >> >>>>> >>>>> > > Pavel Banáš
>>> >> >> >> >>>>> >>>>> > > pavel.banas.upol.cz
>>> >> >> >> >>>>> >>>>> > > Department of Physical Chemistry,
>>> >> >> >> >>>>> >>>>> > > Palacky University Olomouc
>>> >> >> >> >>>>> >>>>> > > Czech Republic
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > ---------- Original message ----------
>>> >> >> >> >>>>> >>>>> > > From: Jason Swails <jason.swails.gmail.com>
>>> >> >> >> >>>>> >>>>> > > Date: 29. 5. 2013
>>> >> >> >> >>>>> >>>>> > > Subject: Re: [AMBER] experiences with EVGA GTX TITAN
>>> >> >> >> >>>>> >>>>> > > Superclocked - memtestG80 - UNDERclocking in Linux ?
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > "I'll answer a little bit:
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > NTPR=10 Etot after 2000 steps
>>> >> >> >> >>>>> >>>>> > > >
>>> >> >> >> >>>>> >>>>> > > > -443256.6711
>>> >> >> >> >>>>> >>>>> > > > -443256.6711
>>> >> >> >> >>>>> >>>>> > > >
>>> >> >> >> >>>>> >>>>> > > > NTPR=200 Etot after 2000 steps
>>> >> >> >> >>>>> >>>>> > > >
>>> >> >> >> >>>>> >>>>> > > > -443261.0705
>>> >> >> >> >>>>> >>>>> > > > -443261.0705
>>> >> >> >> >>>>> >>>>> > > >
>>> >> >> >> >>>>> >>>>> > > > Any idea why energies should depend on
>>> frequency
>>> >> of
>>> >> >> >> >>>>> energy
>>> >> >> >> >>>>> >>>>> records
>>> >> >> >> >>>>> >>>>> > (NTPR)
>>> >> >> >> >>>>> >>>>> > > ?
>>> >> >> >> >>>>> >>>>> > > >
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > It is a subtle point, but the answer is 'different code
>>> >> >> >> >>>>> >>>>> > > paths.' In general, it is NEVER necessary to compute the
>>> >> >> >> >>>>> >>>>> > > actual energy of a molecule during the course of
>>> >> >> >> >>>>> >>>>> > > standard molecular dynamics (by analogy, it is NEVER
>>> >> >> >> >>>>> >>>>> > > necessary to compute atomic forces during the course of
>>> >> >> >> >>>>> >>>>> > > random Monte Carlo sampling).
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > For performance's sake, then, pmemd.cuda computes only
>>> >> >> >> >>>>> >>>>> > > the force when energies are not requested, leading to a
>>> >> >> >> >>>>> >>>>> > > different order of operations for those runs. This
>>> >> >> >> >>>>> >>>>> > > difference ultimately causes divergence.
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > To test this, try setting the variable
>>> >> >> >> >>>>> >>>>> > > ene_avg_sampling=10 in the &cntrl section. This will
>>> >> >> >> >>>>> >>>>> > > force pmemd.cuda to compute energies every 10 steps (for
>>> >> >> >> >>>>> >>>>> > > energy averaging), which will in turn make the code path
>>> >> >> >> >>>>> >>>>> > > followed identical for any multiple-of-10 value of ntpr.
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> > > --
>>> >> >> >> >>>>> >>>>> > > Jason M. Swails
>>> >> >> >> >>>>> >>>>> > > Quantum Theory Project,
>>> >> >> >> >>>>> >>>>> > > University of Florida
>>> >> >> >> >>>>> >>>>> > > Ph.D. Candidate
>>> >> >> >> >>>>> >>>>> > > 352-392-4032
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>> "
>>> >> >> >> >>>>> >>>>> > >
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>>
>>> >> >> >> >>>>> >>>>> --
>>> >> >> >> >>>>> >>>>> Jason M. Swails
>>> >> >> >> >>>>> >>>>> Quantum Theory Project,
>>> >> >> >> >>>>> >>>>> University of Florida
>>> >> >> >> >>>>> >>>>> Ph.D. Candidate
>>> >> >> >> >>>>> >>>>> 352-392-4032
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>>
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >>
>>> >> >> >> >>>>> >
>>> >> >> >> >>>>> >
>>> >> >> >> >>>>> >
>>> >> >> >> >>>>>
>>> >> >> >> >>>>>
>>> >> >> >> >>>>>
>>> >> >> >> >>>>
>>> >> >> >> >>>>
>>> >> >> >> >>>
>>> >> >> >> >>>
>>> >> >> >> >>>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>
>>
>>
>
>


Received on Sat Jun 01 2013 - 13:00:04 PDT