Re: [AMBER] AMBER 14 DPFP single energy calculations inconsistent

From: Scott Le Grand <varelse2005.gmail.com>
Date: Wed, 28 Jan 2015 10:00:11 -0800

---
>  VDWAALS =    -13625.88655233010650  EEL     =      5414.48355790693313
>  EGB        =   -119068.21364945080131
EEL and EVDW are OK above and are the sorts of differences I would expect.
Again, utterly harmless since these energies don't drive MD trajectories.
But EGB is *weird*.  I'll look into it.
I don't think this is a significant bug right now or else SPFP would get
whacked as well, but that's just a gut feeling.
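
For anyone following the reproducibility argument in the quoted message below, here is a minimal sketch of why fixed-point accumulation is order-independent while floating-point accumulation is not, and how a large term can overflow the fixed-point range. This is not AMBER code: the scale factor SCALE, the 64-bit wrap-around behaviour, and the helper names (wrap64, to_fixed, fixed_sum, float_sum) are all assumptions made for illustration only.

```python
import random

# Hypothetical fixed-point scale (assumption for illustration,
# not the constant AMBER actually uses).
SCALE = 2 ** 40


def wrap64(v):
    """Wrap a Python int to signed 64-bit two's complement."""
    v &= (1 << 64) - 1
    return v - (1 << 64) if v >= (1 << 63) else v


def to_fixed(x):
    """Convert a float to a signed 64-bit fixed-point integer."""
    return wrap64(int(round(x * SCALE)))


def fixed_sum(terms):
    """Accumulate in integer fixed point, then convert back to float."""
    acc = 0
    for t in terms:
        acc = wrap64(acc + to_fixed(t))
    return acc / SCALE


def float_sum(terms):
    """Plain left-to-right floating-point accumulation."""
    acc = 0.0
    for t in terms:
        acc += t
    return acc


if __name__ == "__main__":
    # Floating point: the result depends on the order of the additions.
    print(float_sum([0.1, 0.2, 0.3]), float_sum([0.3, 0.2, 0.1]))

    # Fixed point: integer addition is associative and commutative,
    # so every ordering gives a bit-identical result.
    rng = random.Random(0)
    terms = [rng.uniform(-100.0, 100.0) for _ in range(1000)]
    shuffled = terms[:]
    rng.shuffle(shuffled)
    print(fixed_sum(terms) == fixed_sum(shuffled))

    # Overflow: with this scale the representable range is about +/- 2**23,
    # so a single term of 1e7 wraps around to a wrong (negative) value.
    print(fixed_sum([1.0e7]))
```

With these assumed parameters the representable range is roughly +/- 8.4e6, so a "bad" structure with a few very large terms would silently wrap rather than lose a little precision, which is the failure mode point 4 below describes.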
On Tue, Jan 27, 2015 at 3:31 PM, Scott Le Grand <varelse2005.gmail.com>
wrote:
> Could you send me the example and the run command such that I can see if
> this reproduces at my end.  DPFP should be 100%-reproducible if the
> underlying HW, OS, and Toolkit/Driver are unchanged.
>
> Scott
>
> On Tue, Jan 27, 2015 at 12:06 PM, David A Case <case.biomaps.rutgers.edu>
> wrote:
>
>> On Tue, Jan 27, 2015, Rosemary Mantell wrote:
>> >
>> > I should probably mention that I first saw this problem when using the
>> > AMBER 12 DPDP model, so it's not just a problem with fixed precision.
>>
>> Let me try to clarify things a bit here:
>>
>> 1. We do not expect the DPDP model to give exactly the same answer every
>> time.
>> This arises because floating point calculations depend on the order of
>> computation, and this will change from one run to the next.
>>
>> 2. However, run-to-run differences in DPDP should be very small for a
>> single energy calculation.  This was why I asked earlier about how big
>> the differences were with DPDP.  If the run-to-run diffs with DPDP are as
>> large as you report for DPFP, something must be amiss.  If you get diffs
>> that are more like round-off errors, that is more in line with what we
>> expect.
>>
>> 3. SPFP and DPFP should be reproducible, since all parallel reductions
>> are done with fixed point arithmetic, which is independent of the order
>> of calculation.  The fact that you see reproducibility with SPFP but not
>> DPFP is pointing the finger at the latter.  This is all the more true
>> since very few people ever use the DPFP mode, and there might be something
>> lurking there that we haven't come across before.
>>
>> 4. One possible explanation for the DPFP failure is that you have unusual
>> dynamic range in your example, such that the fixed point representation is
>> overflowing.  This is why people are eager to know what the RMS gradient
>> of your structure is, since "bad" initial structures could have large
>> terms that violated the fixed-point assumptions.  (This is the reason for
>> the general recommendation that initial minimizations, which are very
>> cheap, be carried out on a CPU.)  Although you haven't explicitly said so,
>> it sounds like this is not the case here.
>>
>> 5.  It should be up to us to track this down, and I know Ross is working
>> with your input files.  But if you come across useful information (such
>> as: the problem goes away if you turn off gb; or set igb to some other
>> value; or do the calculation on a smaller system, etc.) please let us
>> know.
>>
>> ...thx...dac
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>