Hi, so I finally succeeded in compiling the GPU part of Amber under CUDA 5.5
(after "hacking" the configure2 file), with the usual results in the
subsequent tests:
------
80 file comparisons passed
9 file comparisons failed
0 tests experienced errors
------
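
For reference, the "hack" was nothing more than relaxing the CUDA version
check in the configure2 file (the block quoted further down in this thread)
so that 5.5 is accepted with the same sm_30/sm_35 flags the script already
uses for CUDA 5.0. A minimal sketch of the kind of edit involved (an
unsupported local change, not an official patch):

-----------
# added branch in configure2 - reuses the CUDA 5.0 flags for 5.5 (unsupported)
if [ "$cudaversion" == "5.0" ]; then
  echo "CUDA Version $cudaversion detected"
  nvccflags="$nvccflags $sm30flags $sm35flags"
elif [ "$cudaversion" == "5.5" ]; then
  echo "CUDA Version $cudaversion detected"
  nvccflags="$nvccflags $sm30flags $sm35flags"
elif [ "$cudaversion" == "4.2" ]; then
  echo "CUDA Version $cudaversion detected"
  nvccflags="$nvccflags $sm30flags"
else
  echo "Error: Unsupported CUDA version $cudaversion detected."
  echo "AMBER requires CUDA version == 4.2 .or. 5.0"
  exit 1
fi
-----------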
So now I am running the repetitive benchmark tests (100K steps for PME,
1000K steps for GB) under this configuration: driver 319.23, CUDA 5.5,
bugfix 18 installed.
When they finish I will report the results here.
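
For anyone who wants to repeat the same check, the procedure is just the one
Scott described earlier in the thread: run each benchmark twice with an
identical random seed (default ig, no ig=-1) and compare the mdout files; the
final energies should match exactly on healthy hardware. A rough sketch (the
file and directory names are only illustrative, taken from the layout of the
Amber GPU benchmark suite on my machine):

-----------
# run the same benchmark twice with identical input
cd PME/JAC_production_NVE
$AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.run1
$AMBERHOME/bin/pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.run2

# identical runs should be byte-identical; any divergence of Etot is suspect
sdiff -sB mdout.run1 mdout.run2 | head
grep 'Etot' mdout.run1 | tail -3
grep 'Etot' mdout.run2 | tail -3
-----------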
M.
On Sun, 02 Jun 2013 18:44:23 +0200, Marek Maly <marek.maly.ujep.cz>
wrote:
> Hi Scott, thanks for the update!
>
> Anyway, is there any explanation, regarding the "cuFFT hypothesis", for why
> there are no problems
> with the GTX 580, GTX 680, or even the K20c?
>
>
> Meanwhile I also tried to recompile the GPU part of Amber with
> CUDA 5.5 installed; I obtained these errors
> already in the configure phase:
>
> --------
> [root.dyn-138-272 amber12]# ./configure -cuda -noX11 gnu
> Checking for updates...
> Checking for available patches online. This may take a few seconds...
>
> Available AmberTools 13 patches:
>
> No patches available
>
> Available Amber 12 patches:
>
> No patches available
> Searching for python2... Found python2.6: /usr/bin/python2.6
> Error: Unsupported CUDA version 5.5 detected.
> AMBER requires CUDA version == 4.2 .or. 5.0
> Configure failed due to the errors above!
> ---------
>
> so it seems that, at the moment, Amber can be compiled only with CUDA 4.2
> or 5.0,
>
> and this part of the configure2 file has to be edited:
>
>
> -----------
> nvcc="$CUDA_HOME/bin/nvcc"
> sm35flags='-gencode arch=compute_35,code=sm_35'
> sm30flags='-gencode arch=compute_30,code=sm_30'
> sm20flags='-gencode arch=compute_20,code=sm_20'
> sm13flags='-gencode arch=compute_13,code=sm_13'
> nvccflags="$sm13flags $sm20flags"
> cudaversion=`$nvcc --version | grep 'release' | cut -d' ' -f5 | cut
> -d',' -f1`
> if [ "$cudaversion" == "5.0" ]; then
> echo "CUDA Version $cudaversion detected"
> nvccflags="$nvccflags $sm30flags $sm35flags"
> elif [ "$cudaversion" == "4.2" ]; then
> echo "CUDA Version $cudaversion detected"
> nvccflags="$nvccflags $sm30flags"
> else
> echo "Error: Unsupported CUDA version $cudaversion detected."
> echo "AMBER requires CUDA version == 4.2 .or. 5.0"
> exit 1
> fi
> nvcc="$nvcc $nvccflags"
>
> fi
>
> -----------
>
> would it be OK just to change
> "if [ "$cudaversion" == "5.0" ]; then"
>
> to
>
> "if [ "$cudaversion" == "5.5" ]; then"
>
>
> or do some more flags etc. need to be defined here to proceed successfully?
>
>
> BTW it seems, Scott, that you are on the way to isolating the problem soon,
> so maybe it's better to wait and not lose time on CUDA 5.5
> experiments.
>
> I just thought that CUDA 5.5 might be more "friendly" to the Titans :)) e.g.
> in terms of the cuFFT functions ...
>
>
> I will keep fingers crossed :))
>
> M.
>
>
>
>
>
>
>
>
>
>
> On Sun, 02 Jun 2013 18:33:52 +0200, Scott Le Grand
> <varelse2005.gmail.com>
> wrote:
>
>> PS this *might* indicate a software bug in cuFFT, but it needs more
>> characterization... And things are going to get a little stream of
>> consciousness from here because you're getting unfiltered raw data, so
>> please don't draw any conclusions towards anything yet - I'm just
>> letting
>> you guys know what I'm finding out as I find it...
>>
>>
>>
>> On Sun, Jun 2, 2013 at 9:31 AM, Scott Le Grand
>> <varelse2005.gmail.com>wrote:
>>
>>> And bingo...
>>>
>>> At the very least, the reciprocal sum is intermittently inconsistent...
>>> This explains the irreproducible behavior...
>>>
>>> And here's the level of inconsistency:
>>> 31989.38940628897399 vs
>>> 31989.39168370794505
>>>
>>> That's an error at the level of 1e-7, or a somehow missed single-precision
>>> transaction somewhere...
>>>
>>> The next question is figuring out why... This may or may not
>>> ultimately
>>> explain the crashes you guys are also seeing...
>>>
>>>
>>>
>>> On Sun, Jun 2, 2013 at 9:07 AM, Scott Le Grand
>>> <varelse2005.gmail.com>wrote:
>>>
>>>>
>>>> Observations:
>>>> 1. The degree to which the reproducibility is broken *does* appear to
>>>> vary between individual Titan GPUs. One of my Titans breaks within
>>>> 10K
>>>> steps on cellulose, the other one made it to 100K steps twice without
>>>> doing
>>>> so leading me to believe it could be trusted (until yesterday where I
>>>> now
>>>> see it dies between 50K and 100K steps most of the time).
>>>>
>>>> 2. GB hasn't broken (yet). So could you run myoglobin for 500K and
>>>> TRPcage for 1,000,000 steps and let's see if that's universal.
>>>>
>>>> 3. Turning on double-precision mode makes my Titan crash rather than
>>>> run
>>>> irreproducibly, sigh...
>>>>
>>>> So whatever is going on is triggered by something in PME but not GB.
>>>> So
>>>> that's either the radix sort, the FFT, the Ewald grid interpolation,
>>>> or the
>>>> neighbor list code. Fixing this involves isolating this and figuring
>>>> out
>>>> what exactly goes haywire. It could *still* be software at some very
>>>> small
>>>> probability but the combination of both 680 and K20c with ECC off
>>>> running
>>>> reliably is really pointing towards the Titans just being clocked too
>>>> fast.
>>>>
>>>> So how long will this take? Asking people how long it takes to fix a
>>>> bug
>>>> never really works out well. That said, I found the 480 bug within a
>>>> week
>>>> and my usual turnaround for a bug with a solid repro is <24 hours.
>>>>
>>>> Scott
>>>>
>>>> On Sun, Jun 2, 2013 at 7:58 AM, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> here are my results after applying bugfix 18 (see attachment).
>>>>>
>>>>> In principle I don't see any "drastic" changes.
>>>>>
>>>>> FACTOR_IX is still perfectly stable/reproducible on both cards.
>>>>>
>>>>> JAC tests - the same problems with finishing AND/OR reproducibility; the
>>>>> same for CELLULOSE_NVE, although here it seems that my TITAN_1
>>>>> has no problems with this test (but I saw the same trend also
>>>>> before bugfix 18 - see my older 500K-step test).
>>>>>
>>>>> But anyway, bugfix 18 did bring one change here.
>>>>>
>>>>> The error
>>>>>
>>>>>
>>>>> #1 ERR written in mdout:
>>>>> ------
>>>>> | ERROR: max pairlist cutoff must be less than unit cell max sphere
>>>>> radius!
>>>>> ------
>>>>>
>>>>> was substituted with this error/warning?
>>>>>
>>>>> #0 no ERR written in mdout, ERR written to standard output
>>>>> (nohup.out)
>>>>> -----
>>>>> Nonbond cells need to be recalculated, restart simulation from
>>>>> previous
>>>>> checkpoint
>>>>> with a higher value for skinnb.
>>>>>
>>>>> -----
>>>>>
>>>>> Another thing:
>>>>>
>>>>> recently I started, on another machine with a GTX 580 GPU, a simulation
>>>>> of a relatively
>>>>> big system ( 364275 atoms / PME ). The system also contains
>>>>> "exotic" molecules like polymers; the ff12SB, gaff and GLYCAM force fields
>>>>> are used
>>>>> here. I had a problem even with the minimization part, with a big
>>>>> energy
>>>>> at the start:
>>>>>
>>>>> -----
>>>>> NSTEP ENERGY RMS GMAX NAME
>>>>> NUMBER
>>>>> 1 2.8442E+09 2.1339E+02 1.7311E+04 O
>>>>> 32998
>>>>>
>>>>> BOND = 11051.7467 ANGLE = 17720.4706 DIHED =
>>>>> 18977.7584
>>>>> VDWAALS = ************* EEL = -1257709.6203 HBOND =
>>>>> 0.0000
>>>>> 1-4 VDW = 7253.7412 1-4 EEL = 149867.0207 RESTRAINT =
>>>>> 0.0000
>>>>>
>>>>> ----
>>>>>
>>>>> with no chance to minimize the system even with 50 000 steps in both
>>>>> min cycles (with constrained and unconstrained solute), and hence the
>>>>> NVT
>>>>> heating crashed immediately even with a very small dt. I patched Amber12
>>>>> here
>>>>> with
>>>>> bugfix 18 and the minimization then completed without any problem with the
>>>>> usual
>>>>> 5000 steps
>>>>> (obtaining a final energy of -1.4505E+06, whereas the initial one was that
>>>>> written
>>>>> above).
>>>>>
>>>>> So indeed bugfix 18 solved some issues, but unfortunately not those
>>>>> related to the
>>>>> Titans.
>>>>>
>>>>> Here I will try to install CUDA 5.5, recompile the GPU part of Amber with
>>>>> this
>>>>> new
>>>>> CUDA version and repeat the 100K tests.
>>>>>
>>>>> Scott, let us know how your experiment with downclocking the Titan
>>>>> turned out.
>>>>> Maybe the best choice here would be to flash the Titan directly with your
>>>>> K20c BIOS :))
>>>>>
>>>>> M.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, 01 Jun 2013 21:09:46 +0200, Marek Maly <marek.maly.ujep.cz>
>>>>> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> first of all thanks for providing of your test results !
>>>>>>
>>>>>> It seems that your results are more or less similar to
>>>>>> mine, maybe with the exception of the results of the FactorIX tests,
>>>>>> where I had perfect stability and 100% or close to 100%
>>>>>> reproducibility.
>>>>>>
>>>>>> Anyway, the types of errors you reported are the same as those I
>>>>>> obtained.
>>>>>>
>>>>>> So let's see whether bugfix 18 will help here (or at least in the NPT
>>>>>> tests)
>>>>>> or not. As I wrote a few minutes ago, it seems that it has still not been
>>>>>> uploaded
>>>>>> to the given server, although its description is already present on
>>>>>> the
>>>>>> given
>>>>>> web page ( see
>>>>>> http://ambermd.org/bugfixes12.html ).
>>>>>>
>>>>>> As you can see, this bugfix also contains changes to the CPU code,
>>>>>> although
>>>>>> the majority is devoted to the GPU code, so it is probably best to
>>>>>> recompile
>>>>>> the whole of Amber with this patch. The patch could perhaps be
>>>>>> applied
>>>>>> even after just the
>>>>>> GPU configure command ( i.e. ./configure -cuda -noX11 gnu ), but
>>>>>> after
>>>>>> the subsequent
>>>>>> build only the GPU binaries would be updated. Anyway, I would
>>>>>> rather
>>>>>> recompile
>>>>>> the whole of Amber after this patch.
>>>>>>
>>>>>> Regarding GPU testing under Linux, you may try memtestG80
>>>>>> (please use the updated/patched version from here:
>>>>>> https://github.com/ihaque/memtestG80
>>>>>> )
>>>>>>
>>>>>> just use a git command like:
>>>>>>
>>>>>> git clone https://github.com/ihaque/memtestG80.git PATCHED_MEMTEST-G80
>>>>>>
>>>>>> to download all the files and save them into a directory named
>>>>>> PATCHED_MEMTEST-G80.
>>>>>>
>>>>>> Another possibility is to try a similar (but maybe more up to
>>>>>> date)
>>>>>> test,
>>>>>> cuda_memtest (
>>>>>> http://sourceforge.net/projects/cudagpumemtest/ ).
>>>>>>
>>>>>> Regarding the ig value: if ig is not present in mdin, the default value
>>>>>> is
>>>>>> used
>>>>>> (i.e. 71277); if ig=-1, the random seed will be based on the current
>>>>>> date
>>>>>> and time, and hence will be different for every run (not a good
>>>>>> variant
>>>>>> for our tests). I simply deleted any ig records from all mdins,
>>>>>> so
>>>>>> I
>>>>>> assume that in each run the default seed 71277 was automatically
>>>>>> used.
>>>>>>
>>>>>> M.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 01 Jun 2013 20:26:16 +0200, ET <sketchfoot.gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>> I've put the graphics card into a machine with the working GTX
>>>>>>> titan
>>>>>>> that I
>>>>>>> mentioned earlier.
>>>>>>>
>>>>>>> The Nvidia driver version is: 133.30
>>>>>>>
>>>>>>> Amber version is:
>>>>>>> AmberTools version 13.03
>>>>>>> Amber version 12.16
>>>>>>>
>>>>>>> I ran 50k steps with the amber benchmark using ig=43689 on both
>>>>>>> cards.
>>>>>>> For
>>>>>>> the purpose of discriminating between them, the card I believe
>>>>>>> (fingers
>>>>>>> crossed) is working is called GPU-00_TeaNCake, whilst the other one
>>>>>>> is
>>>>>>> called GPU-01_008.
>>>>>>>
>>>>>>> *When I run the tests on GPU-01_008:*
>>>>>>>
>>>>>>> 1) All the tests (across 2x repeats) finish apart from the
>>>>>>> following
>>>>>>> which
>>>>>>> have the errors listed:
>>>>>>>
>>>>>>> --------------------------------------------
>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>
>>>>>>> --------------------------------------------
>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>
>>>>>>> --------------------------------------------
>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>
>>>>>>> --------------------------------------------
>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>> grep: mdinfo.1GTX680: No such file or directory
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2) The sdiff logs indicate that reproducibility across the two
>>>>>>> repeats
>>>>>>> is
>>>>>>> as follows:
>>>>>>>
>>>>>>> *GB_myoglobin: *Reproducible across 50k steps
>>>>>>> *GB_nucleosome:* Reproducible till step 7400
>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
>>>>>>>
>>>>>>> *PME_JAC_production_NVE: *No reproducibility shown from step 1,000
>>>>>>> onwards
>>>>>>> *PME_JAC_production_NPT*: Reproducible till step 1,000. Also
>>>>>>> outfile
>>>>>>> is
>>>>>>> not written properly - blank gaps appear where something should
>>>>>>> have
>>>>>>> been
>>>>>>> written
>>>>>>>
>>>>>>> *PME_FactorIX_production_NVE:* Reproducible across 50k steps
>>>>>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
>>>>>>>
>>>>>>> *PME_Cellulose_production_NVE:* Failure means that both runs do
>>>>>>> not
>>>>>>> finish
>>>>>>> (see point 1)
>>>>>>> *PME_Cellulose_production_NPT:* Failure means that both runs do not
>>>>>>> finish
>>>>>>> (see point 1)
>>>>>>>
>>>>>>> ############################################################
>>>>>>>
>>>>>>> *When I run the tests on * *GPU-00_TeaNCake:*
>>>>>>> *
>>>>>>> *
>>>>>>> 1) All the tests (across 2x repeats) finish apart from the
>>>>>>> following
>>>>>>> which
>>>>>>> have the errors listed:
>>>>>>> -------------------------------------
>>>>>>> JAC_PRODUCTION_NPT - 23,558 atoms PME
>>>>>>> PMEMD Terminated Abnormally!
>>>>>>> -------------------------------------
>>>>>>>
>>>>>>>
>>>>>>> 2) The sdiff logs indicate that reproducibility across the two
>>>>>>> repeats
>>>>>>> is
>>>>>>> as follows:
>>>>>>>
>>>>>>> *GB_myoglobin:* Reproducible across 50k steps
>>>>>>> *GB_nucleosome:* Reproducible across 50k steps
>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
>>>>>>>
>>>>>>> *PME_JAC_production_NVE:* No reproducibility shown from step 10,000
>>>>>>> onwards
>>>>>>> *PME_JAC_production_NPT: * No reproducibility shown from step
>>>>>>> 10,000
>>>>>>> onwards. Also outfile is not written properly - blank gaps appear
>>>>>>> where
>>>>>>> something should have been written. Repeat 2 Crashes with error
>>>>>>> noted
>>>>>>> in
>>>>>>> 1.
>>>>>>>
>>>>>>> *PME_FactorIX_production_NVE:* No reproducibility shown from step
>>>>>>> 9,000
>>>>>>> onwards
>>>>>>> *PME_FactorIX_production_NPT: *Reproducible across 50k steps
>>>>>>>
>>>>>>> *PME_Cellulose_production_NVE: *No reproducibility shown from step
>>>>>>> 5,000
>>>>>>> onwards
>>>>>>> *PME_Cellulose_production_NPT:* No reproducibility shown from
>>>>>>> step
>>>>>>> 29,000 onwards. Also outfile is not written properly - blank gaps
>>>>>>> appear
>>>>>>> where something should have been written.
>>>>>>>
>>>>>>>
>>>>>>> Out files and sdiff files are included as attachments.
>>>>>>>
>>>>>>> #################################################
>>>>>>>
>>>>>>> So I'm going to update my nvidia driver to the latest version and
>>>>>>> patch
>>>>>>> amber to the latest version and rerun the tests to see if there is
>>>>>>> any
>>>>>>> improvement. Could someone let me know if it is necessary to
>>>>>>> recompile
>>>>>>> any
>>>>>>> or all of AMBER after applying the bugfixes?
>>>>>>>
>>>>>>> Additionally, I'm going to run memory tests and heaven benchmarks
>>>>>>> on
>>>>>>> the
>>>>>>> cards to check whether they are faulty or not.
>>>>>>>
>>>>>>> I'm thinking that there is a mix of hardware error/configuration
>>>>>>> (esp
>>>>>>> in
>>>>>>> the case of GPU-01_008) and amber software error in this situation.
>>>>>>> What
>>>>>>> do
>>>>>>> you guys think?
>>>>>>>
>>>>>>> Also, am I right in thinking (from what Scott was saying) that all
>>>>>>> the
>>>>>>> benchmarks should be reproducible across 50k steps but begin to
>>>>>>> diverge
>>>>>>> at
>>>>>>> around 100K steps? Is there any difference between setting *ig* to
>>>>>>> an
>>>>>>> explicit number and removing it from the mdin file?
>>>>>>>
>>>>>>> br,
>>>>>>> g
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 31 May 2013 23:45, ET <sketchfoot.gmail.com> wrote:
>>>>>>>
>>>>>>> I don't need sysadmins, but sysadmins need me as it gives purpose
>>>>>>> to
>>>>>>>> their
>>>>>>>> bureaucratic existence. An evil encountered when working in an
>>>>>>>> institution
>>>>>>>> or
>>>>>>>> company IMO. Good science and individuality being sacrificed for
>>>>>>>> standardisation and mediocrity in the interests of maintaining a
>>>>>>>> system
>>>>>>>> that
>>>>>>>> focusses on maintaining the system and not the objective.
>>>>>>>>
>>>>>>>> You need root to move fwd on these things, unfortunately. and ppl
>>>>>>>> with
>>>>>>>> root are kinda like your parents when you try to borrow money from
>>>>>>>> them
>>>>>>>> .
>>>>>>>> age 12 :D
>>>>>>>> On May 31, 2013 9:34 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>>>>>>>
>>>>>>>> Sorry why do you need sysadmins :)) ?
>>>>>>>>>
>>>>>>>>> BTW here is the most recent driver:
>>>>>>>>>
>>>>>>>>> http://www.nvidia.com/object/linux-display-amd64-319.23-driver.html
>>>>>>>>>
>>>>>>>>> I do not remember anything easier than installing a driver
>>>>>>>>> (especially
>>>>>>>>> in the case of the binary (*.run) installer) :))
>>>>>>>>>
>>>>>>>>> M.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, 31 May 2013 22:02:34 +0200, ET <sketchfoot.gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> > Yup. I know. I replaced a 680 and the all-knowing sysadmins are
>>>>>>>>> reluctant
>>>>>>>>> > to install drivers not in the repository as they are lame. :(
>>>>>>>>> > On May 31, 2013 7:14 PM, "Marek Maly" <marek.maly.ujep.cz>
>>>>>>>>> wrote:
>>>>>>>>> >>
>>>>>>>>> >> As I already wrote you,
>>>>>>>>> >>
>>>>>>>>> >> the first driver which properly/officially supports Titans,
>>>>>>>>> should
>>>>>>>>> be
>>>>>>>>> >> 313.26 .
>>>>>>>>> >>
>>>>>>>>> >> Anyway I am curious mainly about your 100K repetitive tests
>>>>>>>>> with
>>>>>>>>> >> your Titan SC card. Especially in case of these tests (
>>>>>>>>> JAC_NVE,
>>>>>>>>> JAC_NPT
>>>>>>>>> >> and CELLULOSE_NVE ) where
>>>>>>>>> >> my Titans SC randomly failed or succeeded. In FACTOR_IX_NVE,
>>>>>>>>> >> FACTOR_IX_NPT
>>>>>>>>> >> tests both
>>>>>>>>> >> my cards are perfectly stable (independently from drv.
>>>>>>>>> version)
>>>>>>>>> and
>>>>>>>>> also
>>>>>>>>> >> the runs
>>>>>>>>> >> are perfectly or almost perfectly reproducible.
>>>>>>>>> >>
>>>>>>>>> >> Also if your test will crash please report the eventual errs.
>>>>>>>>> >>
>>>>>>>>> >> To this moment I have this actual library of errs on my Titans
>>>>>>>>> SC
>>>>>>>>> GPUs.
>>>>>>>>> >>
>>>>>>>>> >> #1 ERR writtent in mdout:
>>>>>>>>> >> ------
>>>>>>>>> >> | ERROR: max pairlist cutoff must be less than unit cell max
>>>>>>>>> sphere
>>>>>>>>> >> radius!
>>>>>>>>> >> ------
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> #2 no ERR writtent in mdout, ERR written in standard output
>>>>>>>>> (nohup.out)
>>>>>>>>> >>
>>>>>>>>> >> ----
>>>>>>>>> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified launch
>>>>>>>>> failure
>>>>>>>>> >> ----
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> #3 no ERR writtent in mdout, ERR written in standard output
>>>>>>>>> (nohup.out)
>>>>>>>>> >> ----
>>>>>>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified launch
>>>>>>>>> failure
>>>>>>>>> >> ----
>>>>>>>>> >>
>>>>>>>>> >> Another question, regarding your Titan SC, it is also EVGA as
>>>>>>>>> in
>>>>>>>>> my
>>>>>>>>> case
>>>>>>>>> >> or it is another producer ?
>>>>>>>>> >>
>>>>>>>>> >> Thanks,
>>>>>>>>> >>
>>>>>>>>> >> M.
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> On Fri, 31 May 2013 19:17:03 +0200, ET <sketchfoot.gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> >>
>>>>>>>>> >> > Well, this is interesting...
>>>>>>>>> >> >
>>>>>>>>> >> > I ran 50k steps on the Titan on the other machine with
>>>>>>>>> driver
>>>>>>>>> 310.44
>>>>>>>>> >> and
>>>>>>>>> >> > it
>>>>>>>>> >> > passed all the GB steps. i.e totally identical results over
>>>>>>>>> two
>>>>>>>>> >> repeats.
>>>>>>>>> >> > However, it failed all the PME tests after step 1000. I'm
>>>>>>>>> going
>>>>>>>>> to
>>>>>>>>> > update
>>>>>>>>> >> > the driver and test it again.
>>>>>>>>> >> >
>>>>>>>>> >> > Files included as attachments.
>>>>>>>>> >> >
>>>>>>>>> >> > br,
>>>>>>>>> >> > g
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > On 31 May 2013 16:40, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> >> One more thing,
>>>>>>>>> >> >>
>>>>>>>>> >> >> can you please check under which frequency is running that
>>>>>>>>> your
>>>>>>>>> >> titan ?
>>>>>>>>> >> >>
>>>>>>>>> >> >> As the base frequency of normal Titans is 837MHz and the
>>>>>>>>> Boost
>>>>>>>>> one
>>>>>>>>> is
>>>>>>>>> >> >> 876MHz, I
>>>>>>>>> >> >> assume that your GPU is automatically running at its
>>>>>>>>> boost
>>>>>>>>> >> >> frequency (876MHz).
>>>>>>>>> >> >> You can find this information e.g. in the Amber mdout file.
>>>>>>>>> >> >>
>>>>>>>>> >> >> You also mentioned some crashes in your previous email.
>>>>>>>>> Your
>>>>>>>>> ERRs
>>>>>>>>> >> were
>>>>>>>>> >> >> something like those here:
>>>>>>>>> >> >>
>>>>>>>>> >> >> #1 ERR writtent in mdout:
>>>>>>>>> >> >> ------
>>>>>>>>> >> >> | ERROR: max pairlist cutoff must be less than unit cell
>>>>>>>>> max
>>>>>>>>> sphere
>>>>>>>>> >> >> radius!
>>>>>>>>> >> >> ------
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> #2 no ERR writtent in mdout, ERR written in standard output
>>>>>>>>> >> (nohup.out)
>>>>>>>>> >> >>
>>>>>>>>> >> >> ----
>>>>>>>>> >> >> Error: unspecified launch failure launching kernel
>>>>>>>>> kNLSkinTest
>>>>>>>>> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch
>>>>>>>>> failure
>>>>>>>>> >> >> ----
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> #3 no ERR writtent in mdout, ERR written in standard output
>>>>>>>>> >> (nohup.out)
>>>>>>>>> >> >> ----
>>>>>>>>> >> >> cudaMemcpy GpuBuffer::Download failed unspecified launch
>>>>>>>>> failure
>>>>>>>>> >> >> ----
>>>>>>>>> >> >>
>>>>>>>>> >> >> or you obtained some new/additional errs ?
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> M.
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> On Fri, 31 May 2013 17:30:57 +0200, filip fratev
>>>>>>>>> >> <filipfratev.yahoo.com
>>>>>>>>> >>
>>>>>>>>> >> >> wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >> > Hi,
>>>>>>>>> >> >> > This is what I obtained for 50K tests and "normal"
>>>>>>>>> GTXTitan:
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > run1:
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > A V E R A G E S O V E R 50 S T E P S
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) =
>>>>>>>>> 299.87
>>>>>>>>> >> PRESS
>>>>>>>>> >> >> > = 0.0
>>>>>>>>> >> >> > Etot = -443237.1079 EKtot = 257679.9750 EPtot
>>>>>>>>> =
>>>>>>>>> >> >> > -700917.0829
>>>>>>>>> >> >> > BOND = 20193.1856 ANGLE = 53517.5432 DIHED
>>>>>>>>> =
>>>>>>>>> >> >> > 23575.4648
>>>>>>>>> >> >> > 1-4 NB = 21759.5524 1-4 EEL = 742552.5939
>>>>>>>>> VDWAALS
>>>>>>>>> =
>>>>>>>>> >> >> > 96286.7714
>>>>>>>>> >> >> > EELEC = -1658802.1941 EHBOND = 0.0000
>>>>>>>>> RESTRAINT
>>>>>>>>> =
>>>>>>>>> >> >> > 0.0000
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > R M S F L U C T U A T I O N S
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) =
>>>>>>>>> 0.33
>>>>>>>>> >> PRESS
>>>>>>>>> >> >> > = 0.0
>>>>>>>>> >> >> > Etot = 11.2784 EKtot = 284.8999 EPtot
>>>>>>>>> =
>>>>>>>>> >> >> > 289.0773
>>>>>>>>> >> >> > BOND = 136.3417 ANGLE = 214.0054 DIHED
>>>>>>>>> =
>>>>>>>>> >> >> > 59.4893
>>>>>>>>> >> >> > 1-4 NB = 58.5891 1-4 EEL = 330.5400
>>>>>>>>> VDWAALS
>>>>>>>>> =
>>>>>>>>> >> >> > 559.2079
>>>>>>>>> >> >> > EELEC = 743.8771 EHBOND = 0.0000
>>>>>>>>> RESTRAINT
>>>>>>>>> =
>>>>>>>>> >> >> > 0.0000
>>>>>>>>> >> >> > |E(PBS) = 21.8119
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > run2:
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > A V E R A G E S O V E R 50 S T E P S
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) =
>>>>>>>>> 299.89
>>>>>>>>> >> PRESS
>>>>>>>>> >> >> > = 0.0
>>>>>>>>> >> >> > Etot = -443240.0999 EKtot = 257700.0950 EPtot
>>>>>>>>> =
>>>>>>>>> >> >> > -700940.1949
>>>>>>>>> >> >> > BOND = 20241.9174 ANGLE = 53644.6694 DIHED
>>>>>>>>> =
>>>>>>>>> >> >> > 23541.3737
>>>>>>>>> >> >> > 1-4 NB = 21803.1898 1-4 EEL = 742754.2254
>>>>>>>>> VDWAALS
>>>>>>>>> =
>>>>>>>>> >> >> > 96298.8308
>>>>>>>>> >> >> > EELEC = -1659224.4013 EHBOND = 0.0000
>>>>>>>>> RESTRAINT
>>>>>>>>> =
>>>>>>>>> >> >> > 0.0000
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > R M S F L U C T U A T I O N S
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) =
>>>>>>>>> 0.41
>>>>>>>>> >> PRESS
>>>>>>>>> >> >> > = 0.0
>>>>>>>>> >> >> > Etot = 10.7633 EKtot = 348.2819 EPtot
>>>>>>>>> =
>>>>>>>>> >> >> > 353.9918
>>>>>>>>> >> >> > BOND = 106.5314 ANGLE = 196.7052 DIHED
>>>>>>>>> =
>>>>>>>>> >> >> > 69.7476
>>>>>>>>> >> >> > 1-4 NB = 60.3435 1-4 EEL = 400.7466
>>>>>>>>> VDWAALS
>>>>>>>>> =
>>>>>>>>> >> >> > 462.7763
>>>>>>>>> >> >> > EELEC = 651.9857 EHBOND = 0.0000
>>>>>>>>> RESTRAINT
>>>>>>>>> =
>>>>>>>>> >> >> > 0.0000
>>>>>>>>> >> >> > |E(PBS) = 17.0642
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> ------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> ------------------------------**------------------------------**
>>>>>>>>> --------------------
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > ______________________________**__
>>>>>>>>> >> >> > From: Marek Maly <marek.maly.ujep.cz>
>>>>>>>>> >> >> > To: AMBER Mailing List <amber.ambermd.org>
>>>>>>>>> >> >> > Sent: Friday, May 31, 2013 3:34 PM
>>>>>>>>> >> >> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN
>>>>>>>>> Superclocked
>>>>>>>>> -
>>>>>>>>> >> >> > memtestG80 - UNDERclocking in Linux ?
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > Hi here are my 100K results for driver 313.30 (and still
>>>>>>>>> Cuda
>>>>>>>>> 5.0).
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > The results are rather similar to those obtained
>>>>>>>>> >> >> > under my original driver 319.17 (see the first table
>>>>>>>>> >> >> > which I sent in this thread).
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > M.
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > On Fri, 31 May 2013 12:29:59 +0200, Marek Maly <
>>>>>>>>> marek.maly.ujep.cz>
>>>>>>>>> >> >> > wrote:
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >> Hi,
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> please try to run at lest 100K tests twice to verify
>>>>>>>>> exact
>>>>>>>>> >> >> >> reproducibility
>>>>>>>>> >> >> >> of the results on the given card. If you find in any
>>>>>>>>> mdin
>>>>>>>>> file
>>>>>>>>> >> ig=-1
>>>>>>>>> >> >> >> just
>>>>>>>>> >> >> >> delete it to ensure that you are using the identical
>>>>>>>>> random
>>>>>>>>> seed
>>>>>>>>> >> for
>>>>>>>>> >> >> >> both
>>>>>>>>> >> >> >> runs. You can eventually omit NUCLEOSOME test
>>>>>>>>> >> >> >> as it is too much time consuming.
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> Driver 310.44 ?????
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> As far as I know the proper support for titans is from
>>>>>>>>> version
>>>>>>>>> > 313.26
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> see e.g. here :
>>>>>>>>> >> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >
>>>>>>>>> http://www.geeks3d.com/20130306/nvidia-releases-r313-26-for-linux-with-gtx-titan-support/
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> BTW: On my site downgrade to drv. 313.30 did not solved
>>>>>>>>> the
>>>>>>>>> >> >> situation, I
>>>>>>>>> >> >> >> will post
>>>>>>>>> >> >> >> my results soon here.
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> M.
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> On Fri, 31 May 2013 12:21:21 +0200, ET <
>>>>>>>>> sketchfoot.gmail.com>
>>>>>>>>> >> >> wrote:
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>> ps. I have another install of amber on another computer
>>>>>>>>> with
>>>>>>>>> a
>>>>>>>>> >> >> >>> different
>>>>>>>>> >> >> >>> Titan and different Driver Version: 310.44.
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>> In the interests of thrashing the proverbial horse,
>>>>>>>>> I'll
>>>>>>>>> run
>>>>>>>>> the
>>>>>>>>> >> >> >>> benchmark
>>>>>>>>> >> >> >>> for 50k steps. :P
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>> br,
>>>>>>>>> >> >> >>> g
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>> On 31 May 2013 11:17, ET <sketchfoot.gmail.com> wrote:
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>>> Hi, I just ran the Amber benchmark for the default
>>>>>>>>> (10000
>>>>>>>>> steps)
>>>>>>>>> >> >> on my
>>>>>>>>> >> >> >>>> Titan.
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>> Using sdiff -sB showed that the two runs were
>>>>>>>>> completely
>>>>>>>>> > identical.
>>>>>>>>> >> >> >>>> I've
>>>>>>>>> >> >> >>>> attached compressed files of the mdout & diff files.
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>> br,
>>>>>>>>> >> >> >>>> g
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>> On 30 May 2013 23:41, Marek Maly <marek.maly.ujep.cz>
>>>>>>>>> wrote:
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>>> OK, let's see. The eventual downclocking I see as the
>>>>>>>>> very
>>>>>>>>> last
>>>>>>>>> >> >> >>>>> possibility
>>>>>>>>> >> >> >>>>> (if I don't decide for RMAing). But now still some
>>>>>>>>> other
>>>>>>>>> >> >> experiments
>>>>>>>>> >> >> >>>>> are
>>>>>>>>> >> >> >>>>> available :))
>>>>>>>>> >> >> >>>>> I just started 100K tests under 313.30 driver. For
>>>>>>>>> today
>>>>>>>>> good
>>>>>>>>> >> >> night
>>>>>>>>> >> >> >>>>> ...
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>> M.
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>> On Fri, 31 May 2013 00:45:49 +0200, Scott Le Grand
>>>>>>>>> >> >> >>>>> <varelse2005.gmail.com
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> wrote:
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>> > It will be very interesting if this behavior
>>>>>>>>> persists
>>>>>>>>> after
>>>>>>>>> >> >> >>>>> downclocking.
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> > But right now, Titan 0 *looks* hosed and Titan 1
>>>>>>>>> *looks*
>>>>>>>>> like
>>>>>>>>> > it
>>>>>>>>> >> >> >>>>> needs
>>>>>>>>> >> >> >>>>> > downclocking...
>>>>>>>>> >> >> >>>>> > On May 30, 2013 3:20 PM, "Marek Maly"
>>>>>>>>> <marek.maly.ujep.cz>
>>>>>>>>> >> >> wrote:
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> >> Hi all,
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> here are my results from the 500K steps 2 x
>>>>>>>>> repeated
>>>>>>>>> > benchmarks
>>>>>>>>> >> >> >>>>> >> under 319.23 driver and still Cuda 5.0 (see the
>>>>>>>>> attached
>>>>>>>>> >> table
>>>>>>>>> >> >> ).
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> It is hard to say if the results are better or
>>>>>>>>> worse
>>>>>>>>> than
>>>>>>>>> in
>>>>>>>>> > my
>>>>>>>>> >> >> >>>>> >> previous 100K test under driver 319.17.
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> While results from Cellulose test were improved
>>>>>>>>> and
>>>>>>>>> the
>>>>>>>>> > TITAN_1
>>>>>>>>> >> >> >>>>> card
>>>>>>>>> >> >> >>>>> >> even
>>>>>>>>> >> >> >>>>> >> successfully finished all 500K steps moreover with
>>>>>>>>> exactly
>>>>>>>>> >> the
>>>>>>>>> >> >> >>>>> same
>>>>>>>>> >> >> >>>>> >> final
>>>>>>>>> >> >> >>>>> >> energy !
>>>>>>>>> >> >> >>>>> >> (TITAN_0 at least finished more than 100K steps
>>>>>>>>> and in
>>>>>>>>> >> RUN_01
>>>>>>>>> >> >> even
>>>>>>>>> >> >> >>>>> more
>>>>>>>>> >> >> >>>>> >> than 400K steps)
>>>>>>>>> >> >> >>>>> >> In JAC_NPT test no GPU was able to finish at least
>>>>>>>>> 100K
>>>>>>>>> >> steps
>>>>>>>>> >> >> and
>>>>>>>>> >> >> >>>>> the
>>>>>>>>> >> >> >>>>> >> results from JAC_NVE
>>>>>>>>> >> >> >>>>> >> test are also not too much convincing.
>>>>>>>>> FACTOR_IX_NVE
>>>>>>>>> and
>>>>>>>>> >> >> >>>>> FACTOR_IX_NPT
>>>>>>>>> >> >> >>>>> >> were successfully
>>>>>>>>> >> >> >>>>> >> finished with 100% reproducibility in
>>>>>>>>> FACTOR_IX_NPT
>>>>>>>>> case
>>>>>>>>> >> (on
>>>>>>>>> >> >> both
>>>>>>>>> >> >> >>>>> >> cards)
>>>>>>>>> >> >> >>>>> >> and almost
>>>>>>>>> >> >> >>>>> >> 100% reproducibility in case of FACTOR_IX_NVE
>>>>>>>>> (again
>>>>>>>>> 100%
>>>>>>>>> in
>>>>>>>>> >> >> case
>>>>>>>>> >> >> >>>>> of
>>>>>>>>> >> >> >>>>> >> TITAN_1). TRPCAGE, MYOGLOBIN
>>>>>>>>> >> >> >>>>> >> again finished without any problem with 100%
>>>>>>>>> >> reproducibility.
>>>>>>>>> >> >> >>>>> NUCLEOSOME
>>>>>>>>> >> >> >>>>> >> test was not done
>>>>>>>>> >> >> >>>>> >> this time due to high time requirements. If you
>>>>>>>>> find
>>>>>>>>> in
>>>>>>>>> the
>>>>>>>>> >> >> table
>>>>>>>>> >> >> >>>>> >> positive
>>>>>>>>> >> >> >>>>> >> number finishing with
>>>>>>>>> >> >> >>>>> >> K (which means "thousands") it means the last
>>>>>>>>> number
>>>>>>>>> of
>>>>>>>>> step
>>>>>>>>> >> >> >>>>> written in
>>>>>>>>> >> >> >>>>> >> mdout before crash.
>>>>>>>>> >> >> >>>>> >> Below are all the 3 types of detected errs with
>>>>>>>>> relevant
>>>>>>>>> >> >> >>>>> systems/rounds
>>>>>>>>> >> >> >>>>> >> where the given err
>>>>>>>>> >> >> >>>>> >> appeared.
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> Now I will try just 100K tests under ETs favourite
>>>>>>>>> driver
>>>>>>>>> >> >> version
>>>>>>>>> >> >> >>>>> 313.30
>>>>>>>>> >> >> >>>>> >> :)) and then
>>>>>>>>> >> >> >>>>> >> I will eventually try to experiment with cuda 5.5
>>>>>>>>> which
>>>>>>>>> I
>>>>>>>>> >> >> already
>>>>>>>>> >> >> >>>>> >> downloaded from the
>>>>>>>>> >> >> >>>>> >> cuda zone ( I had to become cuda developer for
>>>>>>>>> this
>>>>>>>>> :))
>>>>>>>>> )
>>>>>>>>> >> BTW
>>>>>>>>> >> >> ET
>>>>>>>>> >> >> >>>>> thanks
>>>>>>>>> >> >> >>>>> >> for the frequency info !
>>>>>>>>> >> >> >>>>> >> and I am still ( perhaps not alone :)) ) very
>>>>>>>>> curious
>>>>>>>>> about
>>>>>>>>> >> >> your 2
>>>>>>>>> >> >> >>>>> x
>>>>>>>>> >> >> >>>>> >> repeated Amber benchmark tests with superclocked
>>>>>>>>> Titan.
>>>>>>>>> >> Indeed
>>>>>>>>> >> >> >>>>> that
>>>>>>>>> >> >> >>>>> I
>>>>>>>>> >> >> >>>>> am
>>>>>>>>> >> >> >>>>> >> very curious also about that Ross "hot" patch.
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> M.
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> ERRORS DETECTED DURING THE 500K steps tests with
>>>>>>>>> driver
>>>>>>>>> >> 319.23
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> #1 ERR writtent in mdout:
>>>>>>>>> >> >> >>>>> >> ------
>>>>>>>>> >> >> >>>>> >> | ERROR: max pairlist cutoff must be less than
>>>>>>>>> unit
>>>>>>>>> cell
>>>>>>>>> >> max
>>>>>>>>> >> >> >>>>> sphere
>>>>>>>>> >> >> >>>>> >> radius!
>>>>>>>>> >> >> >>>>> >> ------
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_1 JAC_NPT (at least 5000 steps
>>>>>>>>> successfully
>>>>>>>>> > done
>>>>>>>>> >> >> >>>>> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NPT (at least 8000 steps
>>>>>>>>> successfully
>>>>>>>>> > done
>>>>>>>>> >> >> >>>>> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> #2 no ERR writtent in mdout, ERR written in
>>>>>>>>> standard
>>>>>>>>> output
>>>>>>>>> >> >> >>>>> (nohup.out)
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>> >> >> >>>>> >> Error: unspecified launch failure launching kernel
>>>>>>>>> >> kNLSkinTest
>>>>>>>>> >> >> >>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified
>>>>>>>>> launch
>>>>>>>>> >> >> failure
>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_1 CELLULOSE_NVE (at least 437 000
>>>>>>>>> steps
>>>>>>>>> >> >> successfully
>>>>>>>>> >> >> >>>>> done
>>>>>>>>> >> >> >>>>> >> before crash)
>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NVE (at least 162 000 steps
>>>>>>>>> >> successfully
>>>>>>>>> >> >> done
>>>>>>>>> >> >> >>>>> >> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 CELLULOSE_NVE (at least 117 000
>>>>>>>>> steps
>>>>>>>>> >> >> successfully
>>>>>>>>> >> >> >>>>> done
>>>>>>>>> >> >> >>>>> >> before crash)
>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NVE (at least 119 000 steps
>>>>>>>>> >> successfully
>>>>>>>>> >> >> done
>>>>>>>>> >> >> >>>>> >> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NVE (at least 43 000 steps
>>>>>>>>> successfully
>>>>>>>>> >> >> done
>>>>>>>>> >> >> >>>>> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> #3 no ERR writtent in mdout, ERR written in
>>>>>>>>> standard
>>>>>>>>> output
>>>>>>>>> >> >> >>>>> (nohup.out)
>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>> >> >> >>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified
>>>>>>>>> launch
>>>>>>>>> >> >> failure
>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NPT (at least 77 000 steps
>>>>>>>>> successfully
>>>>>>>>> >> >> done
>>>>>>>>> >> >> >>>>> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NPT (at least 58 000 steps
>>>>>>>>> successfully
>>>>>>>>> >> >> done
>>>>>>>>> >> >> >>>>> before
>>>>>>>>> >> >> >>>>> >> crash)
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> On Thu, 30 May 2013 21:27:17 +0200, Scott Le Grand
>>>>>>>>> >> >> >>>>> >> <varelse2005.gmail.com>
>>>>>>>>> >> >> >>>>> >> wrote:
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> Oops meant to send that to Jason...
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>> Anyway, before we all panic, we need to get K20's
>>>>>>>>> behavior
>>>>>>>>> >> >> >>>>> analyzed
>>>>>>>>> >> >> >>>>> >>> here.
>>>>>>>>> >> >> >>>>> >>> If it's deterministic, this truly is a hardware
>>>>>>>>> issue. If
>>>>>>>>> >> >> not,
>>>>>>>>> >> >> >>>>> then
>>>>>>>>> >> >> >>>>> it
>>>>>>>>> >> >> >>>>> >>> gets interesting because 680 is deterministic as
>>>>>>>>> far
>>>>>>>>> as I
>>>>>>>>> >> can
>>>>>>>>> >> >> >>>>> tell...
>>>>>>>>> >> >> >>>>> >>> On May 30, 2013 12:24 PM, "Scott Le Grand"
>>>>>>>>> >> >> >>>>> <varelse2005.gmail.com>
>>>>>>>>> >> >> >>>>> >>> wrote:
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>> If the errors are not deterministically
>>>>>>>>> triggered,
>>>>>>>>> they
>>>>>>>>> >> >> probably
>>>>>>>>> >> >> >>>>> >>> won't be
>>>>>>>>> >> >> >>>>> >>>> fixed by the patch, alas...
>>>>>>>>> >> >> >>>>> >>>> On May 30, 2013 12:15 PM, "Jason Swails"
>>>>>>>>> >> >> >>>>> <jason.swails.gmail.com>
>>>>>>>>> >> >> >>>>> >>>> wrote:
>>>>>>>>> >> >> >>>>> >>>>
>>>>>>>>> >> >> >>>>> >>>> Just a reminder to everyone based on what Ross
>>>>>>>>> said:
>>>>>>>>> >> there
>>>>>>>>> >> >> is a
>>>>>>>>> >> >> >>>>> >>>> pending
>>>>>>>>> >> >> >>>>> >>>>> patch to pmemd.cuda that will be coming out
>>>>>>>>> shortly
>>>>>>>>> >> (maybe
>>>>>>>>> >> >> even
>>>>>>>>> >> >> >>>>> >>>>> within
>>>>>>>>> >> >> >>>>> >>>>> hours). It's entirely possible that several of
>>>>>>>>> these
>>>>>>>>> > errors
>>>>>>>>> >> >> >>>>> are
>>>>>>>>> >> >> >>>>> >>>>> fixed
>>>>>>>>> >> >> >>>>> >>>>> by
>>>>>>>>> >> >> >>>>> >>>>> this patch.
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>> All the best,
>>>>>>>>> >> >> >>>>> >>>>> Jason
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>> On Thu, May 30, 2013 at 2:46 PM, filip fratev <
>>>>>>>>> >> >> >>>>> filipfratev.yahoo.com>
>>>>>>>>> >> >> >>>>> >>>>> wrote:
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>> > I have observed the same crashes from time to
>>>>>>>>> time. I
>>>>>>>>> > will
>>>>>>>>> >> >> >>>>> run
>>>>>>>>> >> >> >>>>> >>>>> cellulose
>>>>>>>>> >> >> >>>>> >>>>> > nve for 100k and will past results here.
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > All the best,
>>>>>>>>> >> >> >>>>> >>>>> > Filip
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > ______________________________****__
>>>>>>>>> >> >> >>>>> >>>>> > From: Scott Le Grand <varelse2005.gmail.com>
>>>>>>>>> >> >> >>>>> >>>>> > To: AMBER Mailing List <amber.ambermd.org>
>>>>>>>>> >> >> >>>>> >>>>> > Sent: Thursday, May 30, 2013 9:01 PM
>>>>>>>>> >> >> >>>>> >>>>> > Subject: Re: [AMBER] experiences with EVGA
>>>>>>>>> GTX
>>>>>>>>> TITAN
>>>>>>>>> >> >> >>>>> Superclocked
>>>>>>>>> >> >> >>>>> -
>>>>>>>>> >> >> >>>>> >>>>> > memtestG80 - UNDERclocking in Linux ?
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > Run cellulose nve for 100k iterations twice
>>>>>>>>> . If
>>>>>>>>> the
>>>>>>>>> >> >> final
>>>>>>>>> >> >> >>>>> >>>>> energies
>>>>>>>>> >> >> >>>>> >>>>> don't
>>>>>>>>> >> >> >>>>> >>>>> > match, you have a hardware issue. No need to
>>>>>>>>> play
>>>>>>>>> with
>>>>>>>>> >> >> ntpr
>>>>>>>>> >> >> >>>>> or
>>>>>>>>> >> >> >>>>> any
>>>>>>>>> >> >> >>>>> >>>>> other
>>>>>>>>> >> >> >>>>> >>>>> > variable.
>>>>>>>>> >> >> >>>>> >>>>> > On May 30, 2013 10:58 AM,
>>>>>>>>> <pavel.banas.upol.cz>
>>>>>>>>> wrote:
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > Dear all,
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > I would also like to share one of my
>>>>>>>>> experience
>>>>>>>>> with
>>>>>>>>> >> >> titan
>>>>>>>>> >> >> >>>>> >>>>> cards. We
>>>>>>>>> >> >> >>>>> >>>>> have
>>>>>>>>> >> >> >>>>> >>>>> > > one gtx titan card and with one system
>>>>>>>>> (~55k
>>>>>>>>> atoms,
>>>>>>>>> > NVT,
>>>>>>>>> >> >> >>>>> >>>>> RNA+waters)
>>>>>>>>> >> >> >>>>> >>>>> we
>>>>>>>>> >> >> >>>>> >>>>> > run
>>>>>>>>> >> >> >>>>> >>>>> > > into same troubles you are describing. I
>>>>>>>>> was
>>>>>>>>> also
>>>>>>>>> >> >> playing
>>>>>>>>> >> >> >>>>> with
>>>>>>>>> >> >> >>>>> >>>>> ntpr
>>>>>>>>> >> >> >>>>> >>>>> to
>>>>>>>>> >> >> >>>>> >>>>> > > figure out what is going on, step by step.
>>>>>>>>> I
>>>>>>>>> >> understand
>>>>>>>>> >> >> >>>>> that
>>>>>>>>> >> >> >>>>> the
>>>>>>>>> >> >> >>>>> >>>>> code
>>>>>>>>> >> >> >>>>> >>>>> is
>>>>>>>>> >> >> >>>>> >>>>> > > using different routines for calculation
>>>>>>>>> >> >> energies+forces or
>>>>>>>>> >> >> >>>>> only
>>>>>>>>> >> >> >>>>> >>>>> forces.
>>>>>>>>> >> >> >>>>> >>>>> > > The
>>>>>>>>> >> >> >>>>> >>>>> > > simulations of other systems are perfectly
>>>>>>>>> stable,
>>>>>>>>> >> >> running
>>>>>>>>> >> >> >>>>> for
>>>>>>>>> >> >> >>>>> >>>>> days
>>>>>>>>> >> >> >>>>> >>>>> and
>>>>>>>>> >> >> >>>>> >>>>> > > weeks. Only that particular system
>>>>>>>>> systematically
>>>>>>>>> >> ends
>>>>>>>>> >> >> up
>>>>>>>>> >> >> >>>>> with
>>>>>>>>> >> >> >>>>> >>>>> this
>>>>>>>>> >> >> >>>>> >>>>> > error.
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > However, there was one interesting issue.
>>>>>>>>> When
>>>>>>>>> I
>>>>>>>>> set
>>>>>>>>> >> >> >>>>> ntpr=1,
>>>>>>>>> >> >> >>>>> the
>>>>>>>>> >> >> >>>>> >>>>> error
>>>>>>>>> >> >> >>>>> >>>>> > > vanished (systematically in multiple runs)
>>>>>>>>> and
>>>>>>>>> the
>>>>>>>>> >> >> >>>>> simulation
>>>>>>>>> >> >> >>>>> was
>>>>>>>>> >> >> >>>>> >>>>> able to
>>>>>>>>> >> >> >>>>> >>>>> > > run for more than millions of steps (I was
>>>>>>>>> not
>>>>>>>>> let
>>>>>>>>> it
>>>>>>>>> >> >> >>>>> running
>>>>>>>>> >> >> >>>>> for
>>>>>>>>> >> >> >>>>> >>>>> weeks
>>>>>>>>> >> >> >>>>> >>>>> > as
>>>>>>>>> >> >> >>>>> >>>>> > > in the meantime I shifted that simulation
>>>>>>>>> to
>>>>>>>>> other
>>>>>>>>> >> card
>>>>>>>>> >> >> -
>>>>>>>>> >> >> >>>>> need
>>>>>>>>> >> >> >>>>> >>>>> data,
>>>>>>>>> >> >> >>>>> >>>>> not
>>>>>>>>> >> >> >>>>> >>>>> > > testing). All other setting of ntpr failed.
>>>>>>>>> As
>>>>>>>>> I
>>>>>>>>> read
>>>>>>>>> >> >> this
>>>>>>>>> >> >> >>>>> >>>>> discussion, I
>>>>>>>>> >> >> >>>>> >>>>> > > tried to set ene_avg_sampling=1 with some
>>>>>>>>> high
>>>>>>>>> value
>>>>>>>>> >> of
>>>>>>>>> >> >> >>>>> ntpr
>>>>>>>>> >> >> >>>>> (I
>>>>>>>>> >> >> >>>>> >>>>> expected
>>>>>>>>> >> >> >>>>> >>>>> > > that this will shift the code to
>>>>>>>>> permanently
>>>>>>>>> use
>>>>>>>>> the
>>>>>>>>> >> >> >>>>> >>>>> force+energies
>>>>>>>>> >> >> >>>>> >>>>> part
>>>>>>>>> >> >> >>>>> >>>>> > of
>>>>>>>>> >> >> >>>>> >>>>> > > the code, similarly to ntpr=1), but the
>>>>>>>>> error
>>>>>>>>> >> occurred
>>>>>>>>> >> >> >>>>> again.
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > I know it is not very conclusive for
>>>>>>>>> finding
>>>>>>>>> out
>>>>>>>>> what
>>>>>>>>> > is
>>>>>>>>> >> >> >>>>> >>>>> happening,
>>>>>>>>> >> >> >>>>> >>>>> at
>>>>>>>>> >> >> >>>>> >>>>> > > least
>>>>>>>>> >> >> >>>>> >>>>> > > not for me. Do you have any idea, why
>>>>>>>>> ntpr=1
>>>>>>>>> might
>>>>>>>>> > help?
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > best regards,
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > Pavel
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > --
>>>>>>>>> >> >> >>>>> >>>>> > > Pavel Banáš
>>>>>>>>> >> >> >>>>> >>>>> > > pavel.banas.upol.cz
>>>>>>>>> >> >> >>>>> >>>>> > > Department of Physical Chemistry,
>>>>>>>>> >> >> >>>>> >>>>> > > Palacky University Olomouc
>>>>>>>>> >> >> >>>>> >>>>> > > Czech Republic
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > ---------- Original message ----------
>>>>>>>>> >> >> >>>>> >>>>> > > From: Jason Swails <jason.swails.gmail.com>
>>>>>>>>> >> >> >>>>> >>>>> > > Date: 29. 5. 2013
>>>>>>>>> >> >> >>>>> >>>>> > > Subject: Re: [AMBER] experiences with EVGA
>>>>>>>>> GTX
>>>>>>>>> TITAN
>>>>>>>>> >> >> >>>>> >>>>> Superclocked -
>>>>>>>>> >> >> >>>>> >>>>> > > memtestG
>>>>>>>>> >> >> >>>>> >>>>> > > 80 - UNDERclocking in Linux ?
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > "I'll answer a little bit:
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > NTPR=10 Etot after 2000 steps
>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>> >> >> >>>>> >>>>> > > > -443256.6711
>>>>>>>>> >> >> >>>>> >>>>> > > > -443256.6711
>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>> >> >> >>>>> >>>>> > > > NTPR=200 Etot after 2000 steps
>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>> >> >> >>>>> >>>>> > > > -443261.0705
>>>>>>>>> >> >> >>>>> >>>>> > > > -443261.0705
>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>> >> >> >>>>> >>>>> > > > Any idea why energies should depend on
>>>>>>>>> frequency
>>>>>>>>> of
>>>>>>>>> >> >> >>>>> energy
>>>>>>>>> >> >> >>>>> >>>>> records
>>>>>>>>> >> >> >>>>> >>>>> > (NTPR)
>>>>>>>>> >> >> >>>>> >>>>> > > ?
>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > It is a subtle point, but the answer is
>>>>>>>>> 'different
>>>>>>>>> >> code
>>>>>>>>> >> >> >>>>> paths.'
>>>>>>>>> >> >> >>>>> >>>>> In
>>>>>>>>> >> >> >>>>> >>>>> > > general, it is NEVER necessary to compute
>>>>>>>>> the
>>>>>>>>> actual
>>>>>>>>> >> >> energy
>>>>>>>>> >> >> >>>>> of a
>>>>>>>>> >> >> >>>>> >>>>> molecule
>>>>>>>>> >> >> >>>>> >>>>> > > during the course of standard molecular
>>>>>>>>> dynamics
>>>>>>>>> (by
>>>>>>>>> >> >> >>>>> analogy, it
>>>>>>>>> >> >> >>>>> >>>>> is
>>>>>>>>> >> >> >>>>> >>>>> NEVER
>>>>>>>>> >> >> >>>>> >>>>> > > necessary to compute atomic forces during
>>>>>>>>> the
>>>>>>>>> course
>>>>>>>>> >> of
>>>>>>>>> >> >> >>>>> random
>>>>>>>>> >> >> >>>>> >>>>> Monte
>>>>>>>>> >> >> >>>>> >>>>> > Carlo
>>>>>>>>> >> >> >>>>> >>>>> > > sampling).
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > For performance's sake, then, pmemd.cuda
>>>>>>>>> computes
>>>>>>>>> >> only
>>>>>>>>> >> >> the
>>>>>>>>> >> >> >>>>> force
>>>>>>>>> >> >> >>>>> >>>>> when
>>>>>>>>> >> >> >>>>> >>>>> > > energies are not requested, leading to a
>>>>>>>>> different
>>>>>>>>> >> >> order of
>>>>>>>>> >> >> >>>>> >>>>> operations
>>>>>>>>> >> >> >>>>> >>>>> > for
>>>>>>>>> >> >> >>>>> >>>>> > > those runs. This difference ultimately
>>>>>>>>> causes
>>>>>>>>> >> >> divergence.
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > To test this, try setting the variable
>>>>>>>>> >> >> ene_avg_sampling=10
>>>>>>>>> >> >> >>>>> in
>>>>>>>>> >> >> >>>>> the
>>>>>>>>> >> >> >>>>> >>>>> &cntrl
>>>>>>>>> >> >> >>>>> >>>>> > > section. This will force pmemd.cuda to
>>>>>>>>> compute
>>>>>>>>> >> energies
>>>>>>>>> >> >> >>>>> every 10
>>>>>>>>> >> >> >>>>> >>>>> steps
>>>>>>>>> >> >> >>>>> >>>>> > > (for energy averaging), which will in turn
>>>>>>>>> make
>>>>>>>>> the
>>>>>>>>> >> >> >>>>> followed
>>>>>>>>> >> >> >>>>> code
>>>>>>>>> >> >> >>>>> >>>>> path
>>>>>>>>> >> >> >>>>> >>>>> > > identical for any multiple-of-10 value of
>>>>>>>>> ntpr.
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > > --
>>>>>>>>> >> >> >>>>> >>>>> > > Jason M. Swails
>>>>>>>>> >> >> >>>>> >>>>> > > Quantum Theory Project,
>>>>>>>>> >> >> >>>>> >>>>> > > University of Florida
>>>>>>>>> >> >> >>>>> >>>>> > > Ph.D. Candidate
>>>>>>>>> >> >> >>>>> >>>>> > > 352-392-4032
>>>>>>>>> >> >> >>>>> >>>>> > > ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>>>> > > AMBER mailing list
>>>>>>>>> >> >> >>>>> >>>>> > > AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> http://lists.ambermd.org/****
>>>>>>>>> mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>>> "
>>>>>>>>> >> >> >>>>> >>>>> > > ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>>>> > > AMBER mailing list
>>>>>>>>> >> >> >>>>> >>>>> > > AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> http://lists.ambermd.org/****
>>>>>>>>> mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>> >> >> >>>>> >>>>> > ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>>>> > AMBER mailing list
>>>>>>>>> >> >> >>>>> >>>>> > AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> http://lists.ambermd.org/****
>>>>>>>>> mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>>> > ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>>>> > AMBER mailing list
>>>>>>>>> >> >> >>>>> >>>>> > AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>> http://lists.ambermd.org/****
>>>>>>>>> mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>> --
>>>>>>>>> >> >> >>>>> >>>>> Jason M. Swails
>>>>>>>>> >> >> >>>>> >>>>> Quantum Theory Project,
>>>>>>>>> >> >> >>>>> >>>>> University of Florida
>>>>>>>>> >> >> >>>>> >>>>> Ph.D. Candidate
>>>>>>>>> >> >> >>>>> >>>>> 352-392-4032
>>>>>>>>> >> >> >>>>> >>>>> ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>>>> AMBER mailing list
>>>>>>>>> >> >> >>>>> >>>>> AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>>> http://lists.ambermd.org/****
>>>>>>>>> mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>> >> >> >>>>> >>>> ______________________________**
>>>>>>>>> **_________________
>>>>>>>>> >> >> >>>>> >>> AMBER mailing list
>>>>>>>>> >> >> >>>>> >>> AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> http://lists.ambermd.org/****mailman/listinfo/amber<http://lists.ambermd.org/**mailman/listinfo/amber>
>>>>>>>>> <
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >> ______________________________**_________________
>>>>>>>>> >> >> >>>>> >> AMBER mailing list
>>>>>>>>> >> >> >>>>> >> AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> >>
>>>>>>>>> >> >> >>>>> > ______________________________**_________________
>>>>>>>>> >> >> >>>>> > AMBER mailing list
>>>>>>>>> >> >> >>>>> > AMBER.ambermd.org
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>> >
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>> ______________________________**_________________
>>>>>>>>> >> >> >>>>> AMBER mailing list
>>>>>>>>> >> >> >>>>> AMBER.ambermd.org
>>>>>>>>> >> >> >>>>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >> >> >>>>>
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>>>
>>>>>>>>> >> >> >>> ______________________________**_________________
>>>>>>>>> >> >> >>> AMBER mailing list
>>>>>>>>> >> >> >>> AMBER.ambermd.org
>>>>>>>>> >> >> >>>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> ______________________________**_________________
>>>>>>>>> >> >> AMBER mailing list
>>>>>>>>> >> >> AMBER.ambermd.org
>>>>>>>>> >> >>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >> >>
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> ______________________________**_________________
>>>>>>>>> >> AMBER mailing list
>>>>>>>>> >> AMBER.ambermd.org
>>>>>>>>> >>
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> > ______________________________**_________________
>>>>>>>>> > AMBER mailing list
>>>>>>>>> > AMBER.ambermd.org
>>>>>>>>> >
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ______________________________**_________________
>>>>>>>>> AMBER mailing list
>>>>>>>>> AMBER.ambermd.org
>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> AMBER mailing list
>>>>> AMBER.ambermd.org
>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>
>>>>>
>>>>
>>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>>
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Jun 02 2013 - 11:00:02 PDT