Date: Tue, 27 Jan 2015 15:06:03 -0500

On Tue, Jan 27, 2015, Rosemary Mantell wrote:

Let me try to clarify things a bit here:

1. We do not expect the DPDP model to give exactly the same answer every time.

This arises because floating point addition is not associative: the result

depends on the order of summation, and this will change from one run to the next.

2. However, run-to-run differences in DPDP should be very small for a

single energy calculation. This was why I asked earlier about how big

the differences were with DPDP. If the run-to-run diffs with DPDP are as

large as you report for DPFP, something must be amiss. If you get diffs

that are more like round-off errors, that is more in line with what we

expect.

3. SPFP and DPFP should be reproducible, since all parallel reductions

are done with fixed point arithmetic, which is independent of the order

of calculation. The fact that you see reproducibility with SPFP but not

DPFP is pointing the finger at the latter. This is all the more true

since very few people ever use the DPFP mode, and there might be something

lurking there that we haven't come across before.

4. One possible explanation for the DPFP failure is that you have unusual

dynamic range in your example, such that the fixed point representation is

overflowing. This is why people are eager to know what the RMS gradient

of your structure is, since "bad" initial structures could have large

terms that violate the fixed-point assumptions. (This is the reason for

the general recommendation that initial minimizations, which are very

cheap, be carried out on a CPU.) Although you haven't explicitly said so,

it sounds like this is not the case here.

5. It should be up to us to track this down, and I know Ross is working

with your input files. But if you come across useful information (such

as: the problem goes away if you turn off gb; or set igb to some other

value; or do the calculation on a smaller system, etc.) please let us

know.
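
To make points 1, 3 and 4 concrete, here is a small Python sketch. It is only an illustration of the general idea, not the actual pmemd.cuda code: the scale factor SCALE, the 64-bit accumulator width, and the function names are all assumptions chosen for the example. It shows that a floating point sum depends on the order of the terms, that a fixed point sum does not, and that a single very large term can overflow the fixed point accumulator.

```python
SCALE = 2 ** 40  # hypothetical fixed point scale factor, for illustration only


def wrap_int64(x):
    """Mimic two's-complement wraparound of a signed 64-bit accumulator."""
    return (x + 2 ** 63) % 2 ** 64 - 2 ** 63


def float_sum(values):
    """Plain floating point accumulation: the result depends on term order."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values, scale=SCALE):
    """Fixed point accumulation: each term is converted to an integer first.

    Integer addition is associative (even with wraparound), so the result
    is the same for any ordering of the terms.
    """
    acc = 0
    for v in values:
        acc = wrap_int64(acc + round(v * scale))
    return acc / scale


if __name__ == "__main__":
    terms = [0.1, 0.2, 0.3]
    # Floating point: forward and reverse order differ in the last bit.
    print(float_sum(terms), float_sum(terms[::-1]))
    # Fixed point: forward and reverse order agree exactly.
    print(fixed_sum(terms), fixed_sum(terms[::-1]))
    # A very large term overflows the 64-bit accumulator and gives garbage.
    print(fixed_sum([1e10]))
```

With these assumed settings, any term larger than about 2**63 / SCALE (roughly 8.4e6) no longer fits in the accumulator, which is the kind of failure a "bad" structure with very large forces could trigger.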

...thx...dac

