So except for the EGB, this is actually a feature. DPFP does slightly
different energy accumulation in the nonbond kernels to work around the
possibility of the forces exploding. This usually produces harmless
differences in the 7th decimal place or so that do not affect the
trajectory in any way...
But since the energy is used to drive the line search, you're seeing
differences...
I can address that.
That said, there is likely something weird about EGB because that's way,
way too high a difference...
< VDWAALS = -13625.88655233662575 EEL = 5414.48355789761990 EGB = -119070.11351310089231
---
> VDWAALS = -13625.88655233010650 EEL = 5414.48355790693313 EGB = -119068.21364945080131
EEL and EVDW are OK above and are the sorts of differences I would expect.
Again, utterly harmless since these energies don't drive MD trajectories.
But EGB is *weird*. I'll look into it.
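
To put numbers on that, here is a minimal sketch that computes the absolute and relative differences for the three terms in the diff above. The values are copied from the diff; the helper name `diffs` and the dictionary layout are just illustrative choices, not anything from the AMBER code base.

```python
# Energies copied from the diff above: (first run, second run).
energies = {
    "VDWAALS": (-13625.88655233662575, -13625.88655233010650),
    "EEL": (5414.48355789761990, 5414.48355790693313),
    "EGB": (-119070.11351310089231, -119068.21364945080131),
}


def diffs(a, b):
    """Return (absolute difference, relative difference) of two energies."""
    abs_diff = abs(a - b)
    rel_diff = abs_diff / max(abs(a), abs(b))
    return abs_diff, rel_diff


report = {name: diffs(a, b) for name, (a, b) in energies.items()}

for name, (abs_diff, rel_diff) in report.items():
    print(f"{name:8s} abs = {abs_diff:.3e}  rel = {rel_diff:.3e}")
```

VDWAALS and EEL differ far below the 7th decimal place, while EGB differs in the units digit, which is why it stands out.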
I don't think this is a significant bug right now or else SPFP would get
whacked as well, but that's just a gut feeling.
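
For anyone following along, the points in the quoted message below about summation order (items 1 and 3) and fixed-point overflow (item 4) can be sketched in a few lines. This is a toy illustration only: the scale factor `SCALE = 2**40`, the helper names, and the random test data are assumptions made for the example and are not taken from the pmemd.cuda source.

```python
import random

# Hypothetical fixed-point scale chosen for illustration only.
SCALE = 1 << 40


def to_fixed(x):
    """Convert a float to a scaled integer (fixed-point representation)."""
    return int(round(x * SCALE))


def float_sum(values):
    """Accumulate in double precision; the result depends on the order."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values):
    """Accumulate as integers; integer addition is exact, so order is irrelevant."""
    total = 0
    for v in values:
        total += to_fixed(v)
    return total / SCALE


def wrap_int64(n):
    """Wrap to signed 64-bit two's complement, as a 64-bit accumulator would."""
    return (n + (1 << 63)) % (1 << 64) - (1 << 63)


rng = random.Random(0)
# Terms spanning several orders of magnitude, loosely mimicking nonbond terms.
terms = [rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(-6, 3) for _ in range(10000)]
shuffled = terms[:]
rng.shuffle(shuffled)

# Floating point: reordering may change the last few bits of the sum.
print("float diff:", float_sum(terms) - float_sum(shuffled))
# Fixed point: reordering gives a bit-identical sum.
print("fixed diff:", fixed_sum(terms) - fixed_sum(shuffled))

# Overflow: two large terms that each fit in 64 bits, but whose sum does not.
big = to_fixed(5.0e6)
print("wrapped sum:", wrap_int64(big + big))
```

Python integers never overflow, so `wrap_int64` is there only to mimic what a 64-bit accumulator would do when a "bad" structure produces very large terms: the sum silently wraps to a negative number.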
On Tue, Jan 27, 2015 at 3:31 PM, Scott Le Grand <varelse2005.gmail.com>
wrote:
> Could you send me the example and the run command such that I can see if
> this reproduces at my end. DPFP should be 100%-reproducible if the
> underlying HW, OS, and Toolkit/Driver are unchanged.
>
> Scott
>
> On Tue, Jan 27, 2015 at 12:06 PM, David A Case <case.biomaps.rutgers.edu>
> wrote:
>
>> On Tue, Jan 27, 2015, Rosemary Mantell wrote:
>> >
>> > I should probably mention that I first saw this problem when using the
>> > AMBER 12 DPDP model, so it's not just a problem with fixed precision.
>>
>> Let me try to clarify things a bit here:
>>
>> 1. We do not expect the DPDP model to give exactly the same answer every
>> time.
>> This arises because floating point calculations depend on the order of
>> computation, and this will change from one run to the next.
>>
>> 2. However, run-to-run differences in DPDP should be very small for a
>> single energy calculation. This was why I asked earlier about how big
>> the differences were with DPDP. If the run-to-run diffs with DPDP are as
>> large as you report for DPFP, something must be amiss. If you get diffs
>> that are more like round-off errors, that is more in line with what we
>> expect.
>>
>> 3. SPFP and DPFP should be reproducible, since all parallel reductions
>> are done with fixed point arithmetic, which is independent of the order
>> of calculation. The fact that you see reproducibility with SPFP but not
>> DPFP is pointing the finger at the latter. This is all the more true
>> since very few people ever use the DPFP mode, and there might be something
>> lurking there that we haven't come across before.
>>
>> 4. One possible explanation for the DPFP failure is that you have unusual
>> dynamic range in your example, such that the fixed point representation is
>> overflowing. This is why people are eager to know what the RMS gradient
>> of your structure is, since "bad" initial structures could have large
>> terms that violated the fixed-point assumptions. (This is the reason for
>> the general recommendation that initial minimizations, which are very
>> cheap, be carried out on a CPU.) Although you haven't explicitly said so,
>> it sounds like this is not the case here.
>>
>> 5. It should be up to us to track this down, and I know Ross is working
>> with your input files. But if you come across useful information (such
>> as: the problem goes away if you turn off gb; or set igb to some other
>> value; or do the calculation on a smaller system, etc.) please let us
>> know.
>>
>> ...thx...dac
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
Received on Wed Jan 28 2015 - 10:30:02 PST