Re: [AMBER] AMBER 14 DPFP single energy calculations inconsistent from Scott Le Grand on 2015-01-27 (Amber Archive Jan 2015)

From: Scott Le Grand <varelse2005.gmail.com>
Date: Tue, 27 Jan 2015 15:31:19 -0800

Could you send me the example and the run command so that I can see whether
this reproduces at my end? DPFP should be 100% reproducible if the
underlying HW, OS, and Toolkit/Driver are unchanged.

Scott

On Tue, Jan 27, 2015 at 12:06 PM, David A Case <case.biomaps.rutgers.edu>
wrote:

> On Tue, Jan 27, 2015, Rosemary Mantell wrote:
> >
> > I should probably mention that I first saw this problem when using the
> > AMBER 12 DPDP model, so it's not just a problem with fixed precision.
>
> Let me try to clarify things a bit here:
>
> 1. We do not expect the DPDP model to give exactly the same answer every
> time. This arises because the result of a floating point calculation
> depends on the order of the operations, and that order will change from
> one run to the next.
>
> 2. However, run-to-run differences in DPDP should be very small for a
> single energy calculation. This was why I asked earlier about how big
> the differences were with DPDP. If the run-to-run diffs with DPDP are as
> large as you report for DPFP, something must be amiss. If you get diffs
> that are more like round-off errors, that is more in line with what we
> expect.
>
> 3. SPFP and DPFP should be reproducible, since all parallel reductions
> are done with fixed point arithmetic, which is independent of the order
> of calculation. The fact that you see reproducibility with SPFP but not
> DPFP is pointing the finger at the latter. This is all the more true
> since very few people ever use the DPFP mode, and there might be something
> lurking there that we haven't come across before.
>
> 4. One possible explanation for the DPFP failure is that you have unusual
> dynamic range in your example, such that the fixed point representation is
> overflowing. This is why people are eager to know what the RMS gradient
> of your structure is, since "bad" initial structures could have large
> terms that violate the fixed-point assumptions. (This is the reason for
> the general recommendation that initial minimizations, which are very
> cheap, be carried out on a CPU.) Although you haven't explicitly said so,
> it sounds like this is not the case here.
>
> 5. It should be up to us to track this down, and I know Ross is working
> with your input files. But if you come across useful information (such
> as: the problem goes away if you turn off gb; or set igb to some other
> value; or do the calculation on a smaller system, etc.) please let us
> know.
>
> ...thx...dac
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
Received on Tue Jan 27 2015 - 16:00:02 PST
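
Points 1, 3 and 4 of the quoted message can be illustrated with a minimal
Python sketch. This is not AMBER code: the 2**40 scale factor, the signed
64-bit accumulator, and the helper names (to_fixed, wrap_int64, float_sum,
fixed_sum) are assumptions chosen only for illustration. The sketch shows
that a floating point sum depends on the order of its terms, that a sum
accumulated as scaled integers does not, and that one large term can
overflow the integer accumulator.

```python
# Illustrative sketch only; SCALE and the 64-bit width are assumed values,
# not taken from the AMBER source.
SCALE = 1 << 40  # hypothetical fixed-point scale factor


def to_fixed(x, scale=SCALE):
    """Convert a float to a scaled integer, rounding to the nearest integer."""
    return round(x * scale)


def wrap_int64(n):
    """Emulate two's-complement wraparound of a signed 64-bit integer."""
    return (n + (1 << 63)) % (1 << 64) - (1 << 63)


def float_sum(values):
    """Accumulate in double precision, in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values, scale=SCALE):
    """Accumulate scaled integers in an emulated 64-bit accumulator."""
    acc = 0
    for v in values:
        acc = wrap_int64(acc + to_fixed(v, scale))
    return acc / scale


terms = [0.1, 0.2, 0.3]
reordered = terms[::-1]

# Point 1: floating point addition is not associative, so order matters.
print(float_sum(terms) == float_sum(reordered))

# Point 3: integer addition is associative, so any order gives the same bits.
print(fixed_sum(terms) == fixed_sum(reordered))

# Point 4: a term too large for the assumed scale wraps around silently.
print(fixed_sum([1.0e7]))
```

With these assumed parameters an accumulated total of magnitude 2**23
(about 8.4e6) or more wraps around, which is the kind of silent failure that
very large terms from a "bad" starting structure could cause.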