Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Marek Maly <marek.maly.ujep.cz>
Date: Wed, 19 Jun 2013 17:20:15 +0200

Hi all,

just a small update from my side.

Yesterday I received the announcement that CUDA 5.5 is
now available to the public (not just to developers).

I downloaded it from here:

https://developer.nvidia.com/cuda-pre-production

It is still "just" a release candidate (as all Amber/Titan club members
know perfectly well :)) ).

So I installed this newest release and recompiled the Amber CUDA code.
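
For anyone who wants to repeat this: below is a minimal sketch of the kind
of configure2 "hack" that lets the CUDA version check accept 5.5, modelled
on the configure2 fragment quoted further down in this thread. It assumes
that CUDA 5.5 can simply take the same -gencode flags as CUDA 5.0, which is
not confirmed here; the function name select_nvcc_flags is hypothetical and
is not part of Amber.

```shell
#!/bin/sh
# Hypothetical helper modelled on the configure2 version check quoted
# later in this thread. It is NOT part of Amber.
# ASSUMPTION: CUDA 5.5 accepts the same -gencode flags as CUDA 5.0.
select_nvcc_flags() {
    cudaversion="$1"
    sm35flags='-gencode arch=compute_35,code=sm_35'
    sm30flags='-gencode arch=compute_30,code=sm_30'
    sm20flags='-gencode arch=compute_20,code=sm_20'
    sm13flags='-gencode arch=compute_13,code=sm_13'
    nvccflags="$sm13flags $sm20flags"
    case "$cudaversion" in
        5.0|5.5)
            # 5.5 is treated like 5.0 here (assumption, see above)
            nvccflags="$nvccflags $sm30flags $sm35flags"
            ;;
        4.2)
            nvccflags="$nvccflags $sm30flags"
            ;;
        *)
            echo "Error: Unsupported CUDA version $cudaversion detected." >&2
            return 1
            ;;
    esac
    echo "$nvccflags"
}

# Example: print the flags that would be chosen for CUDA 5.5.
select_nvcc_flags 5.5
```

Whether additional flags are needed for 5.5 is still an open question, so
this should be taken only as a starting point.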

I was hoping that some improvement (e.g. in cuFFT) had been "silently"
incorporated, perhaps as a result of Scott's bug report.

The results of my 100K tests are attached. Compared to my latest
tests with the CUDA 5.5 release candidate from June 3rd (when it was
accessible only to CUDA developers, as a *.run binary installer),
there seems to be a slight improvement: e.g. my more stable TITAN was able
to finish all the 100K tests successfully, including Cellulose twice. But
there is still an issue with irreproducible JAC NVE/NPT results. On my
"less stable" TITAN the results are also slightly better than the older
ones, but still not error-free (JAC/CELLULOSE) - see the attached file.

FACTOR IX NVE/NPT again finished with 100% reproducibility on both GPUs, as
usual.
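
The reproducibility check itself can be sketched as below: take the total
energy printed at a given step in two mdout files from repeated runs and
compare the values as strings. This is only an illustration, assuming the
usual mdout layout (an "NSTEP = ..." line followed by an "Etot = ..." line);
the function names etot_at_step and compare_runs are hypothetical, not
Amber tools.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Amber.
# ASSUMPTION: mdout prints " NSTEP =   <step> ..." followed by a line
# starting with " Etot   =   <value> ...", with spaces around "=".

# Print Etot from the first record in file $1 whose NSTEP equals $2.
etot_at_step() {
    awk -v step="$2" '
        $1 == "NSTEP" && $2 == "=" && $3 == step { want = 1; next }
        want && $1 == "Etot" { print $3; exit }
    ' "$1"
}

# Compare Etot at step $3 between mdout files $1 and $2.
# Returns 0 if the printed energies are identical, 1 otherwise.
compare_runs() {
    e1=$(etot_at_step "$1" "$3")
    e2=$(etot_at_step "$2" "$3")
    if [ -n "$e1" ] && [ "$e1" = "$e2" ]; then
        echo "step $3 reproducible: Etot = $e1"
    else
        echo "step $3 DIFFERS: ${e1:-missing} vs ${e2:-missing}"
        return 1
    fi
}

# Usage (file names are placeholders):
#   compare_runs run1/mdout run2/mdout 100000
```

A crashed run simply shows up as "missing", because the requested step was
never written.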

Scott, do you have any update regarding the "cuFFT"/TITAN issue which you
reported/described to the NVIDIA guys? The latest info from you on this
was that they were able to reproduce the "cuFFT"/TITAN error as well. Do
you have any more recent information? In your opinion, how long might it
take the NVIDIA developers to fully solve such a problem?

Another thing. It seems that you successfully solved the "GB/TITAN"
problem for bigger molecular systems; here is your relevant message
from June 7th.

----------------------------------------------
Really really interesting...

I seem to have found a fix for the GB issues on my Titan - not so
surprisingly, it's the same fix as on GTX4xx/GTX5xx...

But this doesn't yet explain the weirdness with cuFFT so we're not done
here yet...
---------------------------------------------

That was already after the latest Amber12 bugfix 18 was released, and no
additional bugfix has been released since then. So will the "GB/TITAN" patch
be released later, maybe as part of some bigger bugfix? Or did you simply
add it to bugfix 18 after its release?


My last question may deserve a separate thread, but anyway it would be
interesting to have some information on how "Amber-stable" GTX780s are
compared to TITANs (based, of course, on the experience of more users or
on testing more than 1 or 2 GTX780 GPUs).

     Best wishes,

          Marek

On Mon, 03 Jun 2013 01:57:36 +0200 Marek Maly <marek.maly.ujep.cz>
wrote:

> Hi, here are my results with CUDA 5.5
> (total energy at step 100K (PME) / 1000K (GB); driver 319.23, Amber12 bugfix
> 18 applied, CUDA 5.5).
>
>
> No significant differences compared to the previous test with CUDA 5.0
> (I also added those data to the attached table with the CUDA 5.5 test).
>
> Still the same trend: instability in the JAC tests, perfect stability and
> reproducibility in the FACTOR_IX tests (interesting, isn't it? especially
> if we consider 23K atoms in the JAC case and 90K atoms in the FACTOR_IX
> case). Again the same crashes in the CELLULOSE test, now also in the case
> of TITAN_1. Also, in the stable and reproducible FACTOR_IX tests the final
> energy values changed slightly compared to the CUDA 5.0 case.
>
> GB simulations (1M steps) again perfectly stable and reproducible.
>
> So to conclude, Scott we trust you :)) !
>
> If you have any idea what else to try (except GPU BIOS editing, perhaps
> too premature a step at this moment), let me know. I have just one last
> idea, which would be to try changing the random seed and see if it has
> any influence on the current trends (e.g. JAC versus FACTOR_IX).
>
> TO ET: I am curious about your test in the single-GPU configuration.
> Regarding your Windows tests, in my opinion they are just a waste of time.
> They perhaps tell you something about the GPU performance, but not about
> possible GPU "soft" errors.
>
> If the intensive memtestG80 and/or cuda_memtest results were negative, it
> is in my opinion very unlikely that Windows performance testers will find
> any errors, but I am not an expert here ...
>
> Anyway, if you learn which tests the ebuyer is using to confirm GPU errors,
> let us know.
>
> M.
>
>
>
>
>
>
>
> On Sun, 02 Jun 2013 19:22:54 +0200 Marek Maly <marek.maly.ujep.cz>
> wrote:
>
>> Hi, so I finally succeeded in compiling the GPU part of Amber under CUDA 5.5
>> (after "hacking" the configure2 file), with the usual results in the
>> subsequent tests:
>>
>> ------
>> 80 file comparisons passed
>> 9 file comparisons failed
>> 0 tests experienced errors
>> ------
>>
>> So now I am running the 100K (PME) / 1000K (GB) repeated benchmark tests
>> under this configuration: driver 319.23, CUDA 5.5, bugfix 18 installed.
>>
>> When they finish I will report the results here.
>>
>> M.
>>
>>
>>
>>
>>
>> On Sun, 02 Jun 2013 18:44:23 +0200 Marek Maly <marek.maly.ujep.cz>
>> wrote:
>>
>>> Hi Scott, thanks for the update!
>>>
>>> Anyway, is there any explanation, under the "cuFFT hypothesis", of why
>>> there are no problems with GTX 580, GTX 680 or even K20c?
>>>
>>>
>>> Meanwhile I also tried to recompile the GPU part of Amber with
>>> CUDA 5.5, which I had installed before, and I got these errors
>>> already in the configure phase:
>>>
>>> --------
>>> [root.dyn-138-272 amber12]# ./configure -cuda -noX11 gnu
>>> Checking for updates...
>>> Checking for available patches online. This may take a few seconds...
>>>
>>> Available AmberTools 13 patches:
>>>
>>> No patches available
>>>
>>> Available Amber 12 patches:
>>>
>>> No patches available
>>> Searching for python2... Found python2.6: /usr/bin/python2.6
>>> Error: Unsupported CUDA version 5.5 detected.
>>> AMBER requires CUDA version == 4.2 .or. 5.0
>>> Configure failed due to the errors above!
>>> ---------
>>>
>>> So it seems that at the moment Amber can be compiled only with CUDA 4.2
>>> or 5.0, and this part of the configure2 file has to be edited:
>>>
>>>
>>> -----------
>>> nvcc="$CUDA_HOME/bin/nvcc"
>>> sm35flags='-gencode arch=compute_35,code=sm_35'
>>> sm30flags='-gencode arch=compute_30,code=sm_30'
>>> sm20flags='-gencode arch=compute_20,code=sm_20'
>>> sm13flags='-gencode arch=compute_13,code=sm_13'
>>> nvccflags="$sm13flags $sm20flags"
>>> cudaversion=`$nvcc --version | grep 'release' | cut -d' ' -f5 | cut -d',' -f1`
>>> if [ "$cudaversion" == "5.0" ]; then
>>> echo "CUDA Version $cudaversion detected"
>>> nvccflags="$nvccflags $sm30flags $sm35flags"
>>> elif [ "$cudaversion" == "4.2" ]; then
>>> echo "CUDA Version $cudaversion detected"
>>> nvccflags="$nvccflags $sm30flags"
>>> else
>>> echo "Error: Unsupported CUDA version $cudaversion detected."
>>> echo "AMBER requires CUDA version == 4.2 .or. 5.0"
>>> exit 1
>>> fi
>>> nvcc="$nvcc $nvccflags"
>>>
>>> fi
>>>
>>> -----------
>>>
>>> Would it be OK just to change
>>> "if [ "$cudaversion" == "5.0" ]; then"
>>>
>>> to
>>>
>>> "if [ "$cudaversion" == "5.5" ]; then"
>>>
>>>
>>> or should some more flags etc. be defined here to proceed successfully?
>>>
>>>
>>> BTW it seems, Scott, that you are on the way to isolating the problem soon,
>>> so maybe it's better to wait and not lose time with CUDA 5.5
>>> experiments.
>>>
>>> I just thought that CUDA 5.5 might be more "friendly" to Titans :)),
>>> e.g. in terms of the cuFFT functions ....
>>>
>>>
>>> I will keep fingers crossed :))
>>>
>>> M.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sun, 02 Jun 2013 18:33:52 +0200 Scott Le Grand
>>> <varelse2005.gmail.com> wrote:
>>>
>>>> PS this *might* indicate a software bug in cuFFT, but it needs more
>>>> characterization... And things are going to get a little stream of
>>>> consciousness from here because you're getting unfiltered raw data, so
>>>> please don't draw any conclusions towards anything yet - I'm just
>>>> letting
>>>> you guys know what I'm finding out as I find it...
>>>>
>>>>
>>>>
>>>> On Sun, Jun 2, 2013 at 9:31 AM, Scott Le Grand
>>>> <varelse2005.gmail.com>wrote:
>>>>
>>>>> And bingo...
>>>>>
>>>>> At the very least, the reciprocal sum is intermittently
>>>>> inconsistent... This explains the irreproducible behavior...
>>>>>
>>>>> And here's the level of inconsistency:
>>>>> 31989.38940628897399 vs
>>>>> 31989.39168370794505
>>>>>
>>>>> That's error at the level of 1e-7 or a somehow missed
>>>>> single-precision transaction somewhere...
>>>>>
>>>>> The next question is figuring out why... This may or may not
>>>>> ultimately explain the crashes you guys are also seeing...
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Jun 2, 2013 at 9:07 AM, Scott Le Grand
>>>>> <varelse2005.gmail.com>wrote:
>>>>>
>>>>>>
>>>>>> Observations:
>>>>>> 1. The degree to which the reproducibility is broken *does* appear to
>>>>>> vary between individual Titan GPUs. One of my Titans breaks within 10K
>>>>>> steps on cellulose, the other one made it to 100K steps twice without
>>>>>> doing so, leading me to believe it could be trusted (until yesterday,
>>>>>> where I now see it dies between 50K and 100K steps most of the time).
>>>>>>
>>>>>> 2. GB hasn't broken (yet). So could you run myoglobin for 500K and
>>>>>> TRPcage for 1,000,000 steps and let's see if that's universal.
>>>>>>
>>>>>> 3. Turning on double-precision mode makes my Titan crash rather than
>>>>>> run irreproducibly, sigh...
>>>>>>
>>>>>> So whatever is going on is triggered by something in PME but not GB. So
>>>>>> that's either the radix sort, the FFT, the Ewald grid interpolation, or
>>>>>> the neighbor list code. Fixing this involves isolating this and figuring
>>>>>> out what exactly goes haywire. It could *still* be software at some very
>>>>>> small probability but the combination of both 680 and K20c with ECC off
>>>>>> running reliably is really pointing towards the Titans just being
>>>>>> clocked too fast.
>>>>>>
>>>>>> So how long will this take? Asking people how long it takes to fix a
>>>>>> bug never really works out well. That said, I found the 480 bug within
>>>>>> a week and my usual turnaround for a bug with a solid repro is <24 hours.
>>>>>>
>>>>>> Scott
>>>>>>
>>>>>> On Sun, Jun 2, 2013 at 7:58 AM, Marek Maly <marek.maly.ujep.cz>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> here are my results after applying bugfix 18 (see attachment).
>>>>>>>
>>>>>>> In principle I don't see any drastic changes.
>>>>>>>
>>>>>>> FACTOR_IX is still perfectly stable/reproducible on both cards.
>>>>>>>
>>>>>>> JAC tests: problems with finishing AND/OR reproducibility. The same
>>>>>>> holds for CELLULOSE_NVE, although here it seems that my TITAN_1
>>>>>>> has no problems with this test (but I saw the same trend also
>>>>>>> before bugfix 18 - see my older 500K steps test).
>>>>>>>
>>>>>>> But anyway, bugfix 18 did bring one change here.
>>>>>>>
>>>>>>> The error
>>>>>>>
>>>>>>>
>>>>>>> #1 ERR written in mdout:
>>>>>>> ------
>>>>>>> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
>>>>>>> ------
>>>>>>>
>>>>>>> was replaced by this error (or warning?):
>>>>>>>
>>>>>>> #0 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>> -----
>>>>>>> Nonbond cells need to be recalculated, restart simulation from previous checkpoint
>>>>>>> with a higher value for skinnb.
>>>>>>>
>>>>>>> -----
>>>>>>>
>>>>>>> Another thing,
>>>>>>>
>>>>>>> recently I started, on another machine with a GTX 580 GPU, a simulation
>>>>>>> of a relatively big system (364275 atoms/PME). The system also contains
>>>>>>> "exotic" molecules like polymers; the ff12SB, gaff and GLYCAM force
>>>>>>> fields are used here. I had problems even with the minimization part,
>>>>>>> having a huge energy at the start:
>>>>>>>
>>>>>>> -----
>>>>>>>    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
>>>>>>>       1       2.8442E+09     2.1339E+02     1.7311E+04     O       32998
>>>>>>>
>>>>>>>  BOND    =    11051.7467  ANGLE   =    17720.4706  DIHED      =    18977.7584
>>>>>>>  VDWAALS = *************  EEL     = -1257709.6203  HBOND      =        0.0000
>>>>>>>  1-4 VDW =     7253.7412  1-4 EEL =   149867.0207  RESTRAINT  =        0.0000
>>>>>>>
>>>>>>> ----
>>>>>>>
>>>>>>> with no chance of minimizing the system even with 50 000 steps in both
>>>>>>> minimization cycles (with constrained and unconstrained solute), and
>>>>>>> hence the NVT heating crashed immediately even with a very small dt. I
>>>>>>> patched Amber12 here with bugfix 18 and the minimization completed
>>>>>>> without any problem in the usual 5000 steps (reaching the target energy
>>>>>>> -1.4505E+06, while the initial one was as written above).
>>>>>>>
>>>>>>> So indeed bugfix 18 solved some issues, but unfortunately not those
>>>>>>> related to Titans.
>>>>>>>
>>>>>>> Here I will try to install CUDA 5.5, recompile the GPU part of Amber with
>>>>>>> this new CUDA version and repeat the 100K tests.
>>>>>>>
>>>>>>> Scott, let us know how your experiment with downclocking the Titan
>>>>>>> turned out. Maybe the best choice here would be to flash the Titan
>>>>>>> directly with your K20c BIOS :))
>>>>>>>
>>>>>>> M.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 01 Jun 2013 21:09:46 +0200 Marek Maly <marek.maly.ujep.cz>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>>
>>>>>>>> first of all, thanks for providing your test results!
>>>>>>>>
>>>>>>>> It seems that your results are more or less similar to mine,
>>>>>>>> maybe with the exception of the FactorIX tests,
>>>>>>>> where I had perfect stability and 100% or close to 100%
>>>>>>>> reproducibility.
>>>>>>>>
>>>>>>>> Anyway, the types of errors which you reported are the same as those I
>>>>>>>> obtained.
>>>>>>>>
>>>>>>>> So let's see whether bugfix 18 will help here (or at least on the NPT
>>>>>>>> tests) or not. As I wrote just a few minutes ago, it seems that it has
>>>>>>>> not yet been uploaded to the server, although its description is
>>>>>>>> already present on the web page (see
>>>>>>>> http://ambermd.org/bugfixes12.html).
>>>>>>>>
>>>>>>>> As you can see, this bugfix also contains changes to the CPU code,
>>>>>>>> although the majority is devoted to the GPU code, so perhaps the best
>>>>>>>> option is to recompile the whole of Amber with this patch. The patch
>>>>>>>> could perhaps be applied even after just the GPU configure command
>>>>>>>> (i.e. ./configure -cuda -noX11 gnu), but then the subsequent build
>>>>>>>> would update only the GPU binaries. Anyway, I would rather recompile
>>>>>>>> the whole of Amber after this patch.
>>>>>>>>
>>>>>>>> Regarding GPU tests under Linux, you may try memtestG80
>>>>>>>> (please use the updated/patched version from here:
>>>>>>>> https://github.com/ihaque/memtestG80 )
>>>>>>>>
>>>>>>>> just use a git command like:
>>>>>>>>
>>>>>>>> git clone https://github.com/ihaque/memtestG80.git PATCHED_MEMTEST-G80
>>>>>>>>
>>>>>>>> to download all the files and save them into a directory named
>>>>>>>> PATCHED_MEMTEST-G80.
>>>>>>>>
>>>>>>>> Another possibility is to try the perhaps similar (but maybe more
>>>>>>>> up-to-date) test cuda_memtest (
>>>>>>>> http://sourceforge.net/projects/cudagpumemtest/ ).
>>>>>>>>
>>>>>>>> Regarding the ig value: if ig is not present in the mdin file, the
>>>>>>>> default value (i.e. 71277) is used; if ig=-1, the random seed will be
>>>>>>>> based on the current date and time, and hence will be different for
>>>>>>>> every run (not a good variant for our tests). I simply deleted any ig
>>>>>>>> records from all the mdin files, so I assume that in each run the
>>>>>>>> default seed 71277 was automatically used.
>>>>>>>>
>>>>>>>> M.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, 01 Jun 2013 20:26:16 +0200 ET <sketchfoot.gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I've put the graphics card into a machine with the working GTX Titan
>>>>>>>>> that I mentioned earlier.
>>>>>>>>>
>>>>>>>>> The Nvidia driver version is: 133.30
>>>>>>>>>
>>>>>>>>> Amber version is:
>>>>>>>>> AmberTools version 13.03
>>>>>>>>> Amber version 12.16
>>>>>>>>>
>>>>>>>>> I ran 50k steps with the amber benchmark using ig=43689 on both cards.
>>>>>>>>> For the purpose of discriminating between them, the card I believe
>>>>>>>>> (fingers crossed) is working is called GPU-00_TeaNCake, whilst the
>>>>>>>>> other one is called GPU-01_008.
>>>>>>>>>
>>>>>>>>> *When I run the tests on GPU-01_008:*
>>>>>>>>>
>>>>>>>>> 1) All the tests (across 2x repeats) finish apart from the following,
>>>>>>>>> which have the errors listed:
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>>> grep: mdinfo.1GTX680: No such file or directory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2) The sdiff logs indicate that reproducibility across the two repeats
>>>>>>>>> is as follows:
>>>>>>>>>
>>>>>>>>> *GB_myoglobin:* Reproducible across 50k steps
>>>>>>>>> *GB_nucleosome:* Reproducible till step 7400
>>>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
>>>>>>>>>
>>>>>>>>> *PME_JAC_production_NVE:* No reproducibility shown from step 1,000
>>>>>>>>> onwards
>>>>>>>>> *PME_JAC_production_NPT:* Reproducible till step 1,000. Also outfile is
>>>>>>>>> not written properly - blank gaps appear where something should have
>>>>>>>>> been written
>>>>>>>>>
>>>>>>>>> *PME_FactorIX_production_NVE:* Reproducible across 50k steps
>>>>>>>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
>>>>>>>>>
>>>>>>>>> *PME_Cellulose_production_NVE:* Failure means that both runs do not
>>>>>>>>> finish (see point 1)
>>>>>>>>> *PME_Cellulose_production_NPT:* Failure means that both runs do not
>>>>>>>>> finish (see point 1)
>>>>>>>>>
>>>>>>>>> ######################################################################
>>>>>>>>>
>>>>>>>>> *When I run the tests on GPU-00_TeaNCake:*
>>>>>>>>>
>>>>>>>>> 1) All the tests (across 2x repeats) finish apart from the following,
>>>>>>>>> which have the errors listed:
>>>>>>>>> -------------------------------------
>>>>>>>>> JAC_PRODUCTION_NPT - 23,558 atoms PME
>>>>>>>>> PMEMD Terminated Abnormally!
>>>>>>>>> -------------------------------------
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2) The sdiff logs indicate that reproducibility across the two repeats
>>>>>>>>> is as follows:
>>>>>>>>>
>>>>>>>>> *GB_myoglobin:* Reproducible across 50k steps
>>>>>>>>> *GB_nucleosome:* Reproducible across 50k steps
>>>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
>>>>>>>>>
>>>>>>>>> *PME_JAC_production_NVE:* No reproducibility shown from step 10,000
>>>>>>>>> onwards
>>>>>>>>> *PME_JAC_production_NPT:* No reproducibility shown from step 10,000
>>>>>>>>> onwards. Also outfile is not written properly - blank gaps appear where
>>>>>>>>> something should have been written. Repeat 2 crashes with the error
>>>>>>>>> noted in 1.
>>>>>>>>>
>>>>>>>>> *PME_FactorIX_production_NVE:* No reproducibility shown from step 9,000
>>>>>>>>> onwards
>>>>>>>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
>>>>>>>>>
>>>>>>>>> *PME_Cellulose_production_NVE:* No reproducibility shown from step
>>>>>>>>> 5,000 onwards
>>>>>>>>> *PME_Cellulose_production_NPT:* No reproducibility shown from step
>>>>>>>>> 29,000 onwards. Also outfile is not written properly - blank gaps
>>>>>>>>> appear where something should have been written.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Out files and sdiff files are included as attachments
>>>>>>>>>
>>>>>>>>> #################################################
>>>>>>>>>
>>>>>>>>> So I'm going to update my nvidia driver to the latest version and patch
>>>>>>>>> amber to the latest version and rerun the tests to see if there is any
>>>>>>>>> improvement. Could someone let me know if it is necessary to recompile
>>>>>>>>> any or all of AMBER after applying the bugfixes?
>>>>>>>>>
>>>>>>>>> Additionally, I'm going to run memory tests and heaven benchmarks on
>>>>>>>>> the cards to check whether they are faulty or not.
>>>>>>>>>
>>>>>>>>> I'm thinking that there is a mix of hardware error/configuration (esp
>>>>>>>>> in the case of GPU-01_008) and amber software error in this situation.
>>>>>>>>> What do you guys think?
>>>>>>>>>
>>>>>>>>> Also am I right in thinking (from what Scott was saying) that all the
>>>>>>>>> benchmarks should be reproducible across 50k steps but begin to diverge
>>>>>>>>> at around 100K steps? Is there any difference between setting *ig* to
>>>>>>>>> an explicit number and removing it from the mdin file?
>>>>>>>>>
>>>>>>>>> br,
>>>>>>>>> g
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 31 May 2013 23:45, ET <sketchfoot.gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I don't need sysadmins, but sysadmins need me as it gives purpose to
>>>>>>>>>> their bureaucratic existence. An evil one encounters if working in an
>>>>>>>>>> institution or company, IMO. Good science and individuality being
>>>>>>>>>> sacrificed for standardisation and mediocrity in the interests of
>>>>>>>>>> maintaining a system that focuses on maintaining the system and not
>>>>>>>>>> the objective.
>>>>>>>>>>
>>>>>>>>>> You need root to move fwd on these things, unfortunately, and ppl
>>>>>>>>>> with root are kinda like your parents when you try to borrow money
>>>>>>>>>> from them at age 12 :D
>>>>>>>>>> On May 31, 2013 9:34 PM, "Marek Maly" <marek.maly.ujep.cz>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Sorry why do you need sysadmins :)) ?
>>>>>>>>>>>
>>>>>>>>>>> BTW here is the most recent driver:
>>>>>>>>>>>
>>>>>>>>>>> http://www.nvidia.com/object/linux-display-amd64-319.23-driver.html
>>>>>>>>>>>
>>>>>>>>>>> I do not remember anything easier than installing a driver
>>>>>>>>>>> (especially in the case of a binary (*.run) installer) :))
>>>>>>>>>>>
>>>>>>>>>>> M.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 31 May 2013 22:02:34 +0200 ET <sketchfoot.gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > Yup. I know. I replaced a 680 and the all-knowing sysadmins are
>>>>>>>>>>> > reluctant to install drivers not in the repository as they are
>>>>>>>>>>> > lame. :(
>>>>>>>>>>> > On May 31, 2013 7:14 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> As I already wrote to you,
>>>>>>>>>>> >>
>>>>>>>>>>> >> the first driver which properly/officially supports Titans should be
>>>>>>>>>>> >> 313.26.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Anyway, I am curious mainly about your 100K repeated tests with
>>>>>>>>>>> >> your Titan SC card, especially in the case of those tests (JAC_NVE,
>>>>>>>>>>> >> JAC_NPT and CELLULOSE_NVE) where my Titans SC randomly failed or
>>>>>>>>>>> >> succeeded. In the FACTOR_IX_NVE and FACTOR_IX_NPT tests both my cards
>>>>>>>>>>> >> are perfectly stable (independently of the driver version) and the
>>>>>>>>>>> >> runs are also perfectly or almost perfectly reproducible.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Also, if your tests crash, please report any errors.
>>>>>>>>>>> >>
>>>>>>>>>>> >> So far I have this library of errors on my Titan SC GPUs:
>>>>>>>>>>> >>
>>>>>>>>>>> >> #1 ERR written in mdout:
>>>>>>>>>>> >> ------
>>>>>>>>>>> >> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
>>>>>>>>>>> >> ------
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >> #2 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >>
>>>>>>>>>>> >> ----
>>>>>>>>>>> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>>>>> >> ----
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >> #3 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >> ----
>>>>>>>>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>>>>> >> ----
>>>>>>>>>>> >>
>>>>>>>>>>> >> Another question regarding your Titan SC: is it also from EVGA, as in
>>>>>>>>>>> >> my case, or is it from another manufacturer?
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thanks,
>>>>>>>>>>> >>
>>>>>>>>>>> >> M.
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >> On Fri, 31 May 2013 19:17:03 +0200 ET <sketchfoot.gmail.com> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> > Well, this is interesting...
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > I ran 50k steps on the Titan on the other machine with driver 310.44
>>>>>>>>>>> >> > and it passed all the GB steps, i.e. totally identical results over
>>>>>>>>>>> >> > two repeats. However, it failed all the PME tests after step 1000.
>>>>>>>>>>> >> > I'm going to update the driver and test it again.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Files included as attachments.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > br,
>>>>>>>>>>> >> > g
>>>>>>>>>>> >> >
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > On 31 May 2013 16:40, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> >> One more thing,
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> can you please check at which frequency your Titan is running?
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> As the base frequency of normal Titans is 837MHz and the Boost one is
>>>>>>>>>>> >> >> 876MHz, I assume that your GPU is also automatically running at its
>>>>>>>>>>> >> >> boost frequency (876MHz).
>>>>>>>>>>> >> >> You can find this information e.g. in the Amber mdout file.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> You also mentioned some crashes in your previous email. Were your
>>>>>>>>>>> >> >> errors something like those here:
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> #1 ERR written in mdout:
>>>>>>>>>>> >> >> ------
>>>>>>>>>>> >> >> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
>>>>>>>>>>> >> >> ------
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> #2 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> ----
>>>>>>>>>>> >> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>>>> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>>>>> >> >> ----
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> #3 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >> >> ----
>>>>>>>>>>> >> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>>>>> >> >> ----
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> or did you obtain some new/additional errors?
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> M.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> On Fri, 31 May 2013 17:30:57 +0200 filip fratev
>>>>>>>>>>> >> >> <filipfratev.yahoo.com> wrote:
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> > Hi,
>>>>>>>>>>> >> >> > This is what I obtained for 50K tests and "normal" GTXTitan:
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > run1:
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >       A V E R A G E S   O V E R      50 S T E P S
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =   299.87  PRESS =     0.0
>>>>>>>>>>> >> >> >  Etot   =   -443237.1079  EKtot   =    257679.9750  EPtot      =   -700917.0829
>>>>>>>>>>> >> >> >  BOND   =     20193.1856  ANGLE   =     53517.5432  DIHED      =     23575.4648
>>>>>>>>>>> >> >> >  1-4 NB =     21759.5524  1-4 EEL =    742552.5939  VDWAALS    =     96286.7714
>>>>>>>>>>> >> >> >  EELEC  =  -1658802.1941  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >       R M S  F L U C T U A T I O N S
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >  NSTEP =    50000   TIME(PS) =     120.020  TEMP(K) =     0.33  PRESS =     0.0
>>>>>>>>>>> >> >> >  Etot   =        11.2784  EKtot   =       284.8999  EPtot      =       289.0773
>>>>>>>>>>> >> >> >  BOND   =       136.3417  ANGLE   =       214.0054  DIHED      =        59.4893
>>>>>>>>>>> >> >> >  1-4 NB =        58.5891  1-4 EEL =       330.5400  VDWAALS    =       559.2079
>>>>>>>>>>> >> >> >  EELEC  =       743.8771  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>>>>>>>>>> >> >> >  |E(PBS) =       21.8119
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > run2:
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > A V E R A G E S O V E R 50 S T E P S
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) = 299.89 PRESS = 0.0
>>>>>>>>>>> >> >> > Etot = -443240.0999 EKtot = 257700.0950 EPtot = -700940.1949
>>>>>>>>>>> >> >> > BOND = 20241.9174 ANGLE = 53644.6694 DIHED = 23541.3737
>>>>>>>>>>> >> >> > 1-4 NB = 21803.1898 1-4 EEL = 742754.2254 VDWAALS = 96298.8308
>>>>>>>>>>> >> >> > EELEC = -1659224.4013 EHBOND = 0.0000 RESTRAINT = 0.0000
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > R M S F L U C T U A T I O N S
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > NSTEP = 50000 TIME(PS) = 120.020 TEMP(K) = 0.41 PRESS = 0.0
>>>>>>>>>>> >> >> > Etot = 10.7633 EKtot = 348.2819 EPtot = 353.9918
>>>>>>>>>>> >> >> > BOND = 106.5314 ANGLE = 196.7052 DIHED = 69.7476
>>>>>>>>>>> >> >> > 1-4 NB = 60.3435 1-4 EEL = 400.7466 VDWAALS = 462.7763
>>>>>>>>>>> >> >> > EELEC = 651.9857 EHBOND = 0.0000 RESTRAINT = 0.0000
>>>>>>>>>>> >> >> > |E(PBS) = 17.0642
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > ------------------------------------------------------------------------------
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > --------------------------------------------------------------------------------
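To make the gap between the two runs explicit, the averaged energies quoted above can be subtracted term by term. This is only a small sketch: the numbers are copied from the run1 and run2 averages, while the helper name energy_deltas and the rounding to four decimals are assumptions made for illustration.

```python
# Averages quoted above for the two 50K-step runs (same card, same input).
run1 = {"Etot": -443237.1079, "EKtot": 257679.9750, "EPtot": -700917.0829}
run2 = {"Etot": -443240.0999, "EKtot": 257700.0950, "EPtot": -700940.1949}


def energy_deltas(a, b):
    """Return b - a for every energy term present in both dicts.

    Hypothetical helper; rounds to the 4 decimals that mdout prints.
    """
    return {key: round(b[key] - a[key], 4) for key in a if key in b}


deltas = energy_deltas(run1, run2)
for name, delta in deltas.items():
    print(f"{name:6s} run2 - run1 = {delta:+.4f}")

# A bit-for-bit reproducible pair of runs would give 0.0 for every term.
reproducible = all(delta == 0.0 for delta in deltas.values())
print("reproducible:", reproducible)
```

Any non-zero difference means the two runs diverged, which is the criterion used throughout this thread.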
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > ______________________________**__
>>>>>>>>>>> >> >> > From: Marek Maly <marek.maly.ujep.cz>
>>>>>>>>>>> >> >> > To: AMBER Mailing List <amber.ambermd.org>
>>>>>>>>>>> >> >> > Sent: Friday, May 31, 2013 3:34 PM
>>>>>>>>>>> >> >> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN
>>>>>>>>>>> Superclocked
>>>>>>>>>>> -
>>>>>>>>>>> >> >> > memtestG80 - UNDERclocking in Linux ?
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > Hi, here are my 100K results for driver 313.30 (and still CUDA 5.0).
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > The results are rather similar to those obtained
>>>>>>>>>>> >> >> > under my original driver 319.17 (see the first table
>>>>>>>>>>> >> >> > which I sent in this thread).
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > M.
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> > On Fri, 31 May 2013 12:29:59 +0200, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >> Hi,
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> please run at least the 100K tests twice to verify exact
>>>>>>>>>>> >> >> >> reproducibility of the results on the given card. If you find
>>>>>>>>>>> >> >> >> ig=-1 in any mdin file, just delete it to ensure that both runs
>>>>>>>>>>> >> >> >> use the identical random seed. You can omit the NUCLEOSOME test,
>>>>>>>>>>> >> >> >> as it is too time-consuming.
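The seed advice above can be checked mechanically before a pair of runs is started. The following is only a sketch: it assumes the mdin file is a plain namelist in which the seed is written literally as ig=-1, and the helper names uses_random_seed and strip_random_seed are made up for illustration.

```python
import re

# Matches "ig=-1" (any spacing or case) plus an optional trailing comma.
# Assumption: the seed is written literally as ig=-1 in the &cntrl section.
_IG_RANDOM = re.compile(r"\big\s*=\s*-1\b[ \t]*,?[ \t]*", re.IGNORECASE)


def uses_random_seed(mdin_text):
    """Return True if the input text contains an ig=-1 entry."""
    return _IG_RANDOM.search(mdin_text) is not None


def strip_random_seed(mdin_text):
    """Return the mdin text with every ig=-1 entry removed."""
    return _IG_RANDOM.sub("", mdin_text)


# Made-up input fragment, only to show the effect.
example = " &cntrl\n   ntt=3, ig=-1, ntpr=1000,\n /\n"
cleaned = strip_random_seed(example)
print(cleaned)
```

Running the check over every mdin file of the benchmark set before the two repeats guards against comparing runs that were never meant to be identical.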
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> Driver 310.44 ?????
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> As far as I know, proper support for Titans starts with version 313.26
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> see e.g. here :
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> http://www.geeks3d.com/20130306/nvidia-releases-r313-26-for-linux-with-gtx-titan-support/
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> BTW: On my side, downgrading to driver 313.30 did not solve the
>>>>>>>>>>> >> >> >> situation. I will post my results here soon.
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> M.
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >> On Fri, 31 May 2013 12:21:21 +0200, ET <sketchfoot.gmail.com> wrote:
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >>> ps. I have another install of Amber on another computer with a
>>>>>>>>>>> >> >> >>> different Titan and a different driver version: 310.44.
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> In the interests of thrashing the proverbial horse, I'll run the
>>>>>>>>>>> >> >> >>> benchmark for 50k steps. :P
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> br,
>>>>>>>>>>> >> >> >>> g
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> On 31 May 2013 11:17, ET <sketchfoot.gmail.com> wrote:
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>>> Hi, I just ran the Amber benchmark with the default 10000 steps
>>>>>>>>>>> >> >> >>>> on my Titan.
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>> Using sdiff -sB showed that the two runs were completely identical.
>>>>>>>>>>> >> >> >>>> I've attached compressed files of the mdout & diff files.
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>> br,
>>>>>>>>>>> >> >> >>>> g
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>> On 30 May 2013 23:41, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>>> OK, let's see. I see downclocking as the very last resort
>>>>>>>>>>> >> >> >>>>> (if I don't decide to RMA the card). But for now some other
>>>>>>>>>>> >> >> >>>>> experiments are still available :))
>>>>>>>>>>> >> >> >>>>> I have just started the 100K tests under the 313.30 driver.
>>>>>>>>>>> >> >> >>>>> Good night for today ...
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>> M.
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>> On Fri, 31 May 2013 00:45:49 +0200, Scott Le Grand <varelse2005.gmail.com> wrote:
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>> > It will be very interesting if this behavior persists after
>>>>>>>>>>> >> >> >>>>> > downclocking.
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> > But right now, Titan 0 *looks* hosed and Titan 1 *looks* like it
>>>>>>>>>>> >> >> >>>>> > needs downclocking...
>>>>>>>>>>> >> >> >>>>> > On May 30, 2013 3:20 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> >> Hi all,
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> here are my results from the 2 x repeated 500K-step benchmarks
>>>>>>>>>>> >> >> >>>>> >> under the 319.23 driver and still CUDA 5.0 (see the attached table).
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> It is hard to say whether the results are better or worse than in my
>>>>>>>>>>> >> >> >>>>> >> previous 100K test under driver 319.17.
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> The results of the Cellulose test improved, and the TITAN_1 card even
>>>>>>>>>>> >> >> >>>>> >> finished all 500K steps successfully, moreover with exactly the same
>>>>>>>>>>> >> >> >>>>> >> final energy! (TITAN_0 at least finished more than 100K steps, and in
>>>>>>>>>>> >> >> >>>>> >> RUN_01 even more than 400K steps.)
>>>>>>>>>>> >> >> >>>>> >> In the JAC_NPT test no GPU was able to finish even 100K steps, and the
>>>>>>>>>>> >> >> >>>>> >> results from the JAC_NVE test are not very convincing either.
>>>>>>>>>>> >> >> >>>>> >> FACTOR_IX_NVE and FACTOR_IX_NPT finished successfully, with 100%
>>>>>>>>>>> >> >> >>>>> >> reproducibility in the FACTOR_IX_NPT case (on both cards) and almost
>>>>>>>>>>> >> >> >>>>> >> 100% reproducibility in the FACTOR_IX_NVE case (again 100% on TITAN_1).
>>>>>>>>>>> >> >> >>>>> >> TRPCAGE and MYOGLOBIN again finished without any problem and with 100%
>>>>>>>>>>> >> >> >>>>> >> reproducibility. The NUCLEOSOME test was not run this time because of
>>>>>>>>>>> >> >> >>>>> >> its long run time. If you find in the table a positive number ending
>>>>>>>>>>> >> >> >>>>> >> with K (which means "thousands"), it is the last step number written
>>>>>>>>>>> >> >> >>>>> >> to mdout before the crash.
>>>>>>>>>>> >> >> >>>>> >> Below are all 3 types of detected errors, together with the
>>>>>>>>>>> >> >> >>>>> >> systems/rounds in which each error appeared.
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> Now I will try just the 100K tests under ET's favourite driver version
>>>>>>>>>>> >> >> >>>>> >> 313.30 :)) and then I will possibly experiment with CUDA 5.5, which I
>>>>>>>>>>> >> >> >>>>> >> have already downloaded from the CUDA Zone (I had to become a CUDA
>>>>>>>>>>> >> >> >>>>> >> developer for this :)) ). BTW ET, thanks for the frequency info!
>>>>>>>>>>> >> >> >>>>> >> I am still (perhaps not alone :)) ) very curious about your 2 x
>>>>>>>>>>> >> >> >>>>> >> repeated Amber benchmark tests with the superclocked Titan. Indeed, I
>>>>>>>>>>> >> >> >>>>> >> am also very curious about Ross's "hot" patch.
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> M.
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> ERRORS DETECTED DURING THE 500K-step tests with driver 319.23
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> #1 ERR written in mdout:
>>>>>>>>>>> >> >> >>>>> >> ------
>>>>>>>>>>> >> >> >>>>> >> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
>>>>>>>>>>> >> >> >>>>> >> ------
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_1 JAC_NPT (at least 5000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NPT (at least 8000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> #2 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>>>> >> >> >>>>> >> Error: unspecified launch failure launching kernel kNLSkinTest
>>>>>>>>>>> >> >> >>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>>>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_1 CELLULOSE_NVE (at least 437 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 JAC_NVE (at least 162 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_0 ROUND_2 CELLULOSE_NVE (at least 117 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NVE (at least 119 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NVE (at least 43 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> #3 no ERR written in mdout, ERR written in standard output (nohup.out)
>>>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>>>> >> >> >>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>>>>>> >> >> >>>>> >> ----
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_1 JAC_NPT (at least 77 000 steps successfully done before crash)
>>>>>>>>>>> >> >> >>>>> >> TITAN_1 ROUND_2 JAC_NPT (at least 58 000 steps successfully done before crash)
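When many benchmark rounds have to be sorted into these three error types, the messages listed above can be matched automatically. The snippet below is a rough sketch, not part of Amber: the signature strings are copied from the error texts quoted in this message, while the function name classify_crash and the idea of passing it the contents of mdout or nohup.out as one string are assumptions.

```python
# Signatures copied from the error messages quoted above; the numbers are
# simply the labels (#1, #2, #3) used in this message.
SIGNATURES = [
    (1, "max pairlist cutoff must be less than unit cell max sphere radius"),
    (2, "unspecified launch failure launching kernel kNLSkinTest"),
    (3, "cudaMemcpy GpuBuffer::Download failed"),
]


def classify_crash(log_text):
    """Return the first error number (in list order) found in the log, else None.

    Hypothetical helper: log_text is the content of mdout or nohup.out.
    """
    flat = " ".join(log_text.split())  # undo line wrapping before matching
    for number, signature in SIGNATURES:
        if signature in flat:
            return number
    return None


sample = "Error: unspecified launch failure launching kernel\nkNLSkinTest\n"
print(classify_crash(sample))
```

A run whose log matches none of the signatures returns None, so a clean finish and an unknown failure still have to be told apart by checking the last step written to mdout.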
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> On Thu, 30 May 2013 21:27:17 +0200, Scott Le Grand <varelse2005.gmail.com> wrote:
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> Oops meant to send that to Jason...
>>>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>>>> >> >> >>>>> >>> Anyway, before we all panic, we need to get K20's behavior analyzed
>>>>>>>>>>> >> >> >>>>> >>> here. If it's deterministic, this truly is a hardware issue. If not,
>>>>>>>>>>> >> >> >>>>> >>> then it gets interesting because 680 is deterministic as far as I can
>>>>>>>>>>> >> >> >>>>> >>> tell...
>>>>>>>>>>> >> >> >>>>> >>> On May 30, 2013 12:24 PM, "Scott Le Grand" <varelse2005.gmail.com>
>>>>>>>>>>> >> >> >>>>> >>> wrote:
>>>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>>>> >> >> >>>>> >>>> If the errors are not deterministically triggered, they probably
>>>>>>>>>>> >> >> >>>>> >>>> won't be fixed by the patch, alas...
>>>>>>>>>>> >> >> >>>>> >>>> On May 30, 2013 12:15 PM, "Jason Swails" <jason.swails.gmail.com>
>>>>>>>>>>> >> >> >>>>> >>>> wrote:
>>>>>>>>>>> >> >> >>>>> >>>>
>>>>>>>>>>> >> >> >>>>> >>>>> Just a reminder to everyone based on what Ross said: there is a
>>>>>>>>>>> >> >> >>>>> >>>>> pending patch to pmemd.cuda that will be coming out shortly (maybe
>>>>>>>>>>> >> >> >>>>> >>>>> even within hours). It's entirely possible that several of these
>>>>>>>>>>> >> >> >>>>> >>>>> errors are fixed by this patch.
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>> All the best,
>>>>>>>>>>> >> >> >>>>> >>>>> Jason
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>> On Thu, May 30, 2013 at 2:46 PM, filip fratev <filipfratev.yahoo.com> wrote:
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>> > I have observed the same crashes from time to time. I will run
>>>>>>>>>>> >> >> >>>>> >>>>> > cellulose nve for 100k and will post the results here.
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> > All the best,
>>>>>>>>>>> >> >> >>>>> >>>>> > Filip
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> > ______________________________****__
>>>>>>>>>>> >> >> >>>>> >>>>> > From: Scott Le Grand
>>>>>>>>>>> <varelse2005.gmail.com>
>>>>>>>>>>> >> >> >>>>> >>>>> > To: AMBER Mailing List <amber.ambermd.org>
>>>>>>>>>>> >> >> >>>>> >>>>> > Sent: Thursday, May 30, 2013 9:01 PM
>>>>>>>>>>> >> >> >>>>> >>>>> > Subject: Re: [AMBER] experiences with EVGA
>>>>>>>>>>> GTX
>>>>>>>>>>> TITAN
>>>>>>>>>>> >> >> >>>>> Superclocked
>>>>>>>>>>> >> >> >>>>> -
>>>>>>>>>>> >> >> >>>>> >>>>> > memtestG80 - UNDERclocking in Linux ?
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> > Run cellulose nve for 100k iterations twice. If the final energies
>>>>>>>>>>> >> >> >>>>> >>>>> > don't match, you have a hardware issue. No need to play with ntpr or
>>>>>>>>>>> >> >> >>>>> >>>>> > any other variable.
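The two-run check described above is easy to script. What follows is a sketch under stated assumptions: energy records in mdout are assumed to look like "Etot = -443256.6711", the helper names etot_values and runs_match are invented here, and the sample values are two Etot figures quoted elsewhere in this thread rather than a real pair of mdout files.

```python
import re

# Assumption: energy records in mdout look like "Etot = -443256.6711".
_ETOT = re.compile(r"\bEtot\s*=\s*(-?\d+\.\d+)")


def etot_values(mdout_text):
    """Return every printed Etot value, in order, as strings."""
    return _ETOT.findall(mdout_text)


def runs_match(mdout_a, mdout_b):
    """Two runs count as reproducible only if all printed Etot agree exactly."""
    a, b = etot_values(mdout_a), etot_values(mdout_b)
    return len(a) > 0 and a == b


# In practice the two arguments would be the contents of the two mdout files.
first = " Etot = -443256.6711 EKtot = 0.0000 EPtot = 0.0000\n"
second = " Etot = -443261.0705 EKtot = 0.0000 EPtot = 0.0000\n"
print(runs_match(first, first))
print(runs_match(first, second))
```

Comparing the printed strings rather than parsed floats keeps the test exact, which matches the bit-for-bit criterion being applied here.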
>>>>>>>>>>> >> >> >>>>> >>>>> > On May 30, 2013 10:58 AM,
>>>>>>>>>>> <pavel.banas.upol.cz>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > Dear all,
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > I would also like to share one of my experiences with Titan cards. We
>>>>>>>>>>> >> >> >>>>> >>>>> > > have one GTX Titan card, and with one system (~55k atoms, NVT,
>>>>>>>>>>> >> >> >>>>> >>>>> > > RNA+waters) we ran into the same troubles you are describing. I was
>>>>>>>>>>> >> >> >>>>> >>>>> > > also playing with ntpr to figure out what is going on, step by step. I
>>>>>>>>>>> >> >> >>>>> >>>>> > > understand that the code uses different routines for calculating
>>>>>>>>>>> >> >> >>>>> >>>>> > > energies+forces and for forces only. The simulations of other systems
>>>>>>>>>>> >> >> >>>>> >>>>> > > are perfectly stable, running for days and weeks. Only that particular
>>>>>>>>>>> >> >> >>>>> >>>>> > > system systematically ends up with this error.
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > However, there was one interesting issue. When I set ntpr=1, the error
>>>>>>>>>>> >> >> >>>>> >>>>> > > vanished (systematically, in multiple runs) and the simulation was able
>>>>>>>>>>> >> >> >>>>> >>>>> > > to run for millions of steps (I did not let it run for weeks, as in
>>>>>>>>>>> >> >> >>>>> >>>>> > > the meantime I moved that simulation to another card; I need data,
>>>>>>>>>>> >> >> >>>>> >>>>> > > not testing). All other settings of ntpr failed. As I read this
>>>>>>>>>>> >> >> >>>>> >>>>> > > discussion, I tried setting ene_avg_sampling=1 with some high value of
>>>>>>>>>>> >> >> >>>>> >>>>> > > ntpr (I expected that this would shift the code to permanently use the
>>>>>>>>>>> >> >> >>>>> >>>>> > > force+energies part of the code, similarly to ntpr=1), but the error
>>>>>>>>>>> >> >> >>>>> >>>>> > > occurred again.
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > I know it is not very conclusive for finding out what is happening, at
>>>>>>>>>>> >> >> >>>>> >>>>> > > least not for me. Do you have any idea why ntpr=1 might help?
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > best regards,
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > Pavel
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > --
>>>>>>>>>>> >> >> >>>>> >>>>> > > Pavel Banáš
>>>>>>>>>>> >> >> >>>>> >>>>> > > pavel.banas.upol.cz
>>>>>>>>>>> >> >> >>>>> >>>>> > > Department of Physical Chemistry,
>>>>>>>>>>> >> >> >>>>> >>>>> > > Palacky University Olomouc
>>>>>>>>>>> >> >> >>>>> >>>>> > > Czech Republic
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > ---------- Original message ----------
>>>>>>>>>>> >> >> >>>>> >>>>> > > From: Jason Swails <jason.swails.gmail.com>
>>>>>>>>>>> >> >> >>>>> >>>>> > > Date: 29 May 2013
>>>>>>>>>>> >> >> >>>>> >>>>> > > Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked -
>>>>>>>>>>> >> >> >>>>> >>>>> > > memtestG80 - UNDERclocking in Linux ?
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > "I'll answer a little bit:
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > NTPR=10 Etot after 2000 steps
>>>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > > -443256.6711
>>>>>>>>>>> >> >> >>>>> >>>>> > > > -443256.6711
>>>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > > NTPR=200 Etot after 2000 steps
>>>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > > -443261.0705
>>>>>>>>>>> >> >> >>>>> >>>>> > > > -443261.0705
>>>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > > Any idea why energies should depend on the frequency of energy
>>>>>>>>>>> >> >> >>>>> >>>>> > > > records (NTPR)?
>>>>>>>>>>> >> >> >>>>> >>>>> > > >
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > It is a subtle point, but the answer is 'different code paths.' In
>>>>>>>>>>> >> >> >>>>> >>>>> > > general, it is NEVER necessary to compute the actual energy of a
>>>>>>>>>>> >> >> >>>>> >>>>> > > molecule during the course of standard molecular dynamics (by analogy,
>>>>>>>>>>> >> >> >>>>> >>>>> > > it is NEVER necessary to compute atomic forces during the course of
>>>>>>>>>>> >> >> >>>>> >>>>> > > random Monte Carlo sampling).
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > For performance's sake, then, pmemd.cuda computes only the force when
>>>>>>>>>>> >> >> >>>>> >>>>> > > energies are not requested, leading to a different order of operations
>>>>>>>>>>> >> >> >>>>> >>>>> > > for those runs. This difference ultimately causes divergence.
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > To test this, try setting the variable ene_avg_sampling=10 in the
>>>>>>>>>>> >> >> >>>>> >>>>> > > &cntrl section. This will force pmemd.cuda to compute energies every 10
>>>>>>>>>>> >> >> >>>>> >>>>> > > steps (for energy averaging), which will in turn make the followed code
>>>>>>>>>>> >> >> >>>>> >>>>> > > path identical for any multiple-of-10 value of ntpr.
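The point about a different order of operations can be seen with ordinary floating-point arithmetic. This is a generic illustration, not pmemd.cuda code: the three numbers are arbitrary and only show that regrouping a sum can change its last bits, which a long trajectory then amplifies.

```python
# Floating-point addition is not associative: summing the same terms in a
# different order can change the last bits of the result.
terms = [0.1, 0.2, 0.3]

left_to_right = (terms[0] + terms[1]) + terms[2]
right_to_left = terms[0] + (terms[1] + terms[2])

print(left_to_right)
print(right_to_left)
print("identical:", left_to_right == right_to_left)
```

Two code paths that accumulate the same contributions in a different order can therefore both be correct and still not agree bit for bit, which is why runs are only compared at the same ntpr setting.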
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> > > --
>>>>>>>>>>> >> >> >>>>> >>>>> > > Jason M. Swails
>>>>>>>>>>> >> >> >>>>> >>>>> > > Quantum Theory Project,
>>>>>>>>>>> >> >> >>>>> >>>>> > > University of Florida
>>>>>>>>>>> >> >> >>>>> >>>>> > > Ph.D. Candidate
>>>>>>>>>>> >> >> >>>>> >>>>> > > 352-392-4032
>>>>>>>>>>> >> >> >>>>> >>>>> > > _______________________________________________
>>>>>>>>>>> >> >> >>>>> >>>>> > > AMBER mailing list
>>>>>>>>>>> >> >> >>>>> >>>>> > > AMBER.ambermd.org
>>>>>>>>>>> >> >> >>>>> >>>>> > > http://lists.ambermd.org/mailman/listinfo/amber
>>>>>>>>>>> >> >> >>>>> >>>>> "
>>>>>>>>>>> >> >> >>>>> >>>>> > >
>>>>>>>>>>> >> >> >>>>> >>>>> >
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>> --
>>>>>>>>>>> >> >> >>>>> >>>>> Jason M. Swails
>>>>>>>>>>> >> >> >>>>> >>>>> Quantum Theory Project,
>>>>>>>>>>> >> >> >>>>> >>>>> University of Florida
>>>>>>>>>>> >> >> >>>>> >>>>> Ph.D. Candidate
>>>>>>>>>>> >> >> >>>>> >>>>> 352-392-4032
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>>>
>>>>>>>>>>> >> >> >>>>> >>>
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >> --
>>>>>>>>>>> >> >> >>>>> >> Tato zpráva byla vytvořena převratným poštovním
>>>>>>>>>>> klientem
>>>>>>>>>>> > Opery:
>>>>>>>>>>> >> >> >>>>> >> http://www.opera.com/mail/
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> ______________________________**_________________
>>>>>>>>>>> >> >> >>>>> >> AMBER mailing list
>>>>>>>>>>> >> >> >>>>> >> AMBER.ambermd.org
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> >>
>>>>>>>>>>> >> >> >>>>> > ______________________________**_________________
>>>>>>>>>>> >> >> >>>>> > AMBER mailing list
>>>>>>>>>>> >> >> >>>>> > AMBER.ambermd.org
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> > __________ Informace od ESET NOD32 Antivirus,
>>>>>>>>>>> verze
>>>>>>>>>>> databaze
>>>>>>>>>>> >> >> 8394
>>>>>>>>>>> >> >> >>>>> > (20130530) __________
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> > Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> > http://www.eset.cz
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>> >
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>> --
>>>>>>>>>>> >> >> >>>>> Tato zpráva byla vytvořena převratným poštovním
>>>>>>>>>>> klientem
>>>>>>>>>>> Opery:
>>>>>>>>>>> >> >> >>>>> http://www.opera.com/mail/
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>> ______________________________**_________________
>>>>>>>>>>> >> >> >>>>> AMBER mailing list
>>>>>>>>>>> >> >> >>>>> AMBER.ambermd.org
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >> >> >>>>>
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>> >> >> >>> ______________________________**_________________
>>>>>>>>>>> >> >> >>> AMBER mailing list
>>>>>>>>>>> >> >> >>> AMBER.ambermd.org
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> __________ Informace od ESET NOD32 Antivirus, verze
>>>>>>>>>>> databaze
>>>>>>>>>>> 8395
>>>>>>>>>>> >> >> >>> (20130531) __________
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>> http://www.eset.cz
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>>
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >>
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >> >
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> --
>>>>>>>>>>> >> >> Tato zpráva byla vytvořena převratným poštovním klientem
>>>>>>>>>>> Opery:
>>>>>>>>>>> >> >> http://www.opera.com/mail/
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> ______________________________**_________________
>>>>>>>>>>> >> >> AMBER mailing list
>>>>>>>>>>> >> >> AMBER.ambermd.org
>>>>>>>>>>> >> >>
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >
>>>>>>>>>>> >> >
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > __________ Informace od ESET NOD32 Antivirus, verze
>>>>>>>>>>> databaze
>>>>>>>>>>> 8397
>>>>>>>>>>> >> > (20130531) __________
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar.gz - poskozeny archiv
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> GB_out_plus_diff_Files.tar
>>>>>>>>>>> >> > - poskozeny archiv
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> GB_out_plus_diff_Files.tar.gz -
>>>>>>>>>>> >> > poskozeny archiv
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> GB_out_plus_diff_Files.tar.gz >
>>>>>>>>>>> >> GZIP
>>>>>>>>>>> >> > > GB_out_plus_diff_Files.tar - poskozeny archiv
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > GB_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> GB_out_plus_diff_Files.tar.gz >
>>>>>>>>>>> >> GZIP
>>>>>>>>>>> >> > > GB_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> GB_nucleosome-sim3.mdout-full -
>>>>>>>>>>> >> > vyskytl se problem pri cteni archivu
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar.gz - poskozeny archiv
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar - poskozeny archiv
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> PME_out_plus_diff_Files.tar.gz -
>>>>>>>>>>> >> > poskozeny archiv
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> PME_out_plus_diff_Files.tar.gz >
>>>>>>>>>>> >> > GZIP > PME_out_plus_diff_Files.tar - poskozeny archiv
>>>>>>>>>>> >> > PME_out_plus_diff_Files.tar.gz > GZIP >
>>>>>>>>>>> >> > PME_out_plus_diff_Filestar > TAR >
>>>>>>>>>>> PME_out_plus_diff_Files.tar.gz
>>>>>>>>>>> >
>>>>>>>>>>> >> GZIP
>>>>>>>>>>> >> > > PME_out_plus_diff_Files.tar > TAR >
>>>>>>>>>>> >> > PME_JAC_production_NPT-sim3.**mdout-full - vyskytl se
>>>>>>>>>>> problem
>>>>>>>>>>> pri
>>>>>>>>>>> cteni
>>>>>>>>>>> >> > archivu
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > http://www.eset.cz
>>>>>>>>>>> >> >
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >> --
>>>>>>>>>>> >> Tato zpráva byla vytvořena převratným poštovním klientem
>>>>>>>>>>> Opery:
>>>>>>>>>>> >> http://www.opera.com/mail/
>>>>>>>>>>> >>
>>>>>>>>>>> >> ______________________________**_________________
>>>>>>>>>>> >> AMBER mailing list
>>>>>>>>>>> >> AMBER.ambermd.org
>>>>>>>>>>> >>
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> > ______________________________**_________________
>>>>>>>>>>> > AMBER mailing list
>>>>>>>>>>> > AMBER.ambermd.org
>>>>>>>>>>> >
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>> >
>>>>>>>>>>> > __________ Informace od ESET NOD32 Antivirus, verze databaze
>>>>>>>>>>> 8398
>>>>>>>>>>> > (20130531) __________
>>>>>>>>>>> >
>>>>>>>>>>> > Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>>>>>>>> >
>>>>>>>>>>> > http://www.eset.cz
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Tato zpráva byla vytvořena převratným poštovním klientem Opery:
>>>>>>>>>>> http://www.opera.com/mail/
>>>>>>>>>>>
>>>>>>>>>>> ______________________________**_________________
>>>>>>>>>>> AMBER mailing list
>>>>>>>>>>> AMBER.ambermd.org
>>>>>>>>>>> http://lists.ambermd.org/**mailman/listinfo/amber<http://lists.ambermd.org/mailman/listinfo/amber>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> __________ Informace od ESET NOD32 Antivirus, verze databaze 8401
>>>>>>>>> (20130601) __________
>>>>>>>>>
>>>>>>>>> Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>>>>>>
>>>>>>>>> http://www.eset.cz
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Tato zpráva byla vytvořena převratným poštovním klientem Opery:
>>>>>>> http://www.opera.com/mail/
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> AMBER mailing list
>>>>>>> AMBER.ambermd.org
>>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>
>>>> __________ Informace od ESET NOD32 Antivirus, verze databaze 8403
>>>> (20130602) __________
>>>>
>>>> Tuto zpravu proveril ESET NOD32 Antivirus.
>>>>
>>>> http://www.eset.cz
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>


-- 
This message was created with Opera's revolutionary e-mail client:
http://www.opera.com/mail/


Total energy at step 100,000 (driver 319.23, Amber12 with bugfix 18 applied, CUDA 5.5, the latest from the RPM package)

*TITAN_0 JAC_NVE      JAC_NPT      FACTOR_IX_NVE  FACTOR_IX_NPT  CELLULOSE_NVE  TRPCAGE    MYOGLOBIN
ROUND_1  -58142.1136  -58209.5130  -234190.7819   -234511.3041   -443241.9627   -254.9558  -1349.6880
ROUND_2  -58139.0027  -58227.5847  -234190.7819   -234511.3041   -443241.9627   -254.9558  -1349.6880

*TITAN_1
ROUND_1  -58143.0475  57K #0       -234190.7819   -234511.3041   -443237.4517   -254.9558  -1349.6880
ROUND_2  -58137.9516  2K #0        -234190.7819   -234511.3041   12K #2         -254.9558  -1349.6880
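
The reproducibility check behind this table can be sketched in a few lines of Python. This is only an illustration, not part of the original test scripts: the names `BENCHMARKS`, `RESULTS`, `classify` and `summarize` are hypothetical, and it assumes that entries such as "57K #0" mean the run stopped early (at roughly that step, with the numbered error), so they are stored as `None`.

```python
# Final energies at step 100,000, copied from the table above.
# None marks a run that did not reach step 100,000 (assumed reading of
# entries such as "57K #0").
BENCHMARKS = ["JAC_NVE", "JAC_NPT", "FACTOR_IX_NVE", "FACTOR_IX_NPT",
              "CELLULOSE_NVE", "TRPCAGE", "MYOGLOBIN"]

RESULTS = {
    "TITAN_0": (
        [-58142.1136, -58209.5130, -234190.7819, -234511.3041,
         -443241.9627, -254.9558, -1349.6880],
        [-58139.0027, -58227.5847, -234190.7819, -234511.3041,
         -443241.9627, -254.9558, -1349.6880],
    ),
    "TITAN_1": (
        [-58143.0475, None, -234190.7819, -234511.3041,
         -443237.4517, -254.9558, -1349.6880],
        [-58137.9516, None, -234190.7819, -234511.3041,
         None, -254.9558, -1349.6880],
    ),
}


def classify(round_1, round_2):
    """Compare the final energies of two identical runs."""
    if round_1 is None or round_2 is None:
        return "failed"
    # Identical inputs on the same GPU should give identical printed energies.
    return "reproducible" if round_1 == round_2 else "irreproducible"


def summarize(results):
    """Return {gpu: {benchmark: status}} for every GPU in `results`."""
    return {
        gpu: {name: classify(a, b)
              for name, a, b in zip(BENCHMARKS, rounds[0], rounds[1])}
        for gpu, rounds in results.items()
    }


if __name__ == "__main__":
    for gpu, statuses in summarize(RESULTS).items():
        for name, status in statuses.items():
            print(f"{gpu:8s} {name:14s} {status}")
```

Exact equality is used on purpose here: the point of the test is that two runs of the same input on the same card should print the same energy to the last digit.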


It seems that error #1 was replaced by the warning (?) #0 after bugfix 18.


#0 no error written in mdout, error written to standard output (nohup.out)
-----
Nonbond cells need to be recalculated, restart simulation from previous checkpoint
with a higher value for skinnb.

-----


#1 error written in mdout:
------
| ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
------


#2 no error written in mdout, error written to standard output (nohup.out)

----
Error: unspecified launch failure launching kernel kNLSkinTest
cudaFree GpuBuffer::Deallocate failed unspecified launch failure
----


#3 no error written in mdout, error written to standard output (nohup.out)
----
cudaMemcpy GpuBuffer::Download failed unspecified launch failure
----
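
For anyone who wants to sort their own runs into these four categories, a minimal Python sketch follows. It only searches the text of a log (nohup.out or mdout) for the message fragments quoted above; the names `ERROR_PATTERNS` and `find_errors` are hypothetical and the fragments are simply copied from this file, so treat it as a starting point rather than a complete classifier.

```python
# Message fragments copied from the error listings above.
# Per the notes above, #1 appears in mdout; #0, #2 and #3 appear on
# standard output (nohup.out).
ERROR_PATTERNS = {
    "#0": "Nonbond cells need to be recalculated",
    "#1": "max pairlist cutoff must be less than unit cell max sphere radius",
    "#2": "unspecified launch failure launching kernel kNLSkinTest",
    "#3": "cudaMemcpy GpuBuffer::Download failed",
}


def find_errors(log_text):
    """Return the labels (#0..#3) of all known messages found in log_text."""
    return [label for label, fragment in ERROR_PATTERNS.items()
            if fragment in log_text]


if __name__ == "__main__":
    import sys

    # Usage: python find_errors.py nohup.out [mdout ...]
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, find_errors(handle.read()) or "no known error")
```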


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jun 19 2013 - 09:00:02 PDT