Re: Problem with bench test on PIII450 from ross_at_cgl.ucsf.EDU on 1999-08-13 (Amber Archive Aug 1999)

From: <ross_at_cgl.ucsf.EDU>
Date: Fri 13 Aug 1999 12:55:32 -0700 (PDT)

        Yes, this problem is familiar to us. Even for Amber 6 (coming out this
        fall) we are unable to exactly reproduce Ewald energies on Intel chips.

Unless I'm missing something, the step 1 energy diffs are
well within the norm. The interesting thing about Intel
chips in the past (when I was responsible for the demo/test
stuff, i.e. 4.1) was that Intels (using f2c at the time) were
the _only_ machines that gave results _exactly_ the same as
the HP numbers obtained with no compiler optimization, i.e.
no diffs at all, whereas diffs more like these Ewald ones were
found across the board when comparing with e.g. SGI or even the
same HP machine with the normal compiler optimization flags.

Note that the benchmark cases _were_ run with HP optimization
flags turned on, so I'd expect the exact same diffs if you had
run with unoptimized HP binaries instead of on Intel, assuming
Intels now are numerically the same as then.
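
One common reason optimization flags can change the last digits (a general
floating-point fact, not something verified for these particular binaries) is
that floating-point addition is not associative, so a compiler that reorders a
sum can change the result. A minimal Python sketch, using made-up terms that
have nothing to do with Amber's actual energy sums:

```python
import math


def sum_in_order(values):
    """Accumulate left to right in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


# Hypothetical terms with widely differing magnitudes.
terms = [1.0e16, 1.0, -1.0e16, 1.0]

forward = sum_in_order(terms)            # original order
reordered = sum_in_order(sorted(terms))  # same terms, different order
exact = math.fsum(terms)                 # correctly rounded reference sum

print("forward:  ", forward)
print("reordered:", reordered)
print("fsum:     ", exact)
```

The two orderings disagree with each other and with the correctly rounded sum,
even though the inputs are identical. Real energy sums are far less extreme,
which is why the diffs usually show up only in the trailing digits.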

As I wrote in 4.1 test/0README:

    The demos were run on a
    HP 9000/735 (HP-UX 9.05) machine using unoptimized executables.
    Amber does not give uniform answers to high precision across
    different machines. (This is especially true of the single
    precision version of sander; Cray users do not have single
    precision; single precision will be dropped in future.) Here
    are 2 examples of acceptable differences, in this case resulting
    from different compiler options:

    *** single precision sander ***

    < Etot = 811.2754 EKtot = 0.0000 EPtot = 811.2754
    < BOND = 421.147 ANGLE = 836.2664 DIHED = 229.3823
    < 1-4 NB = 254.0287 1-4 EEL = 1318.7067 NONBOND = -164.1002
    < NBEEL = -2062.8323 EHBOND = -21.3228 CONSTRAINT = 0.0000
    ---
    > Etot = 811.3116 EKtot = 0.0000 EPtot = 811.3116
    > BOND = 421.147 ANGLE = 836.2667 DIHED = 229.3821
    > 1-4 NB = 254.0287 1-4 EEL = 1318.7061 NONBOND = -164.0688
    > NBEEL = -2062.8269 EHBOND = -21.3228 CONSTRAINT = 0.0000

    *** nmode - the "large" differences are between minuscule amounts ***

    < F = 0.161125E+02 GRDMAX = 0.470665E-12 GNORM = 0.141243E-12
    ---
    > F = 0.161125E+02 GRDMAX = 0.345793E-12 GNORM = 0.101255E-12
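
A rough way to automate the judgment "acceptable difference" for lines like
those above is to compare each field with both a relative and an absolute
tolerance, so that a large relative change between two values near zero (the
nmode case) is not flagged. The sketch below is only an illustration: the
function names and the tolerance values are assumptions of this example, not
Amber's test criteria.

```python
import math
import re

# Matches "LABEL = number" pairs, e.g. "1-4 EEL = 1318.7067" or "GNORM = 0.141243E-12".
FIELD = re.compile(r"([A-Za-z0-9\- ]+?)\s*=\s*(-?\d+\.\d*(?:[Ee][+-]?\d+)?)")


def parse_fields(line):
    """Return {label: value} for every 'LABEL = number' pair in a line."""
    return {name.strip(): float(value) for name, value in FIELD.findall(line)}


def significant_diffs(old_line, new_line, rel_tol=1e-3, abs_tol=1e-9):
    """List (label, old, new) for fields that differ beyond both tolerances.

    rel_tol and abs_tol are illustrative defaults, not Amber's thresholds.
    """
    old, new = parse_fields(old_line), parse_fields(new_line)
    return [
        (name, old[name], new[name])
        for name in old
        if name in new
        and not math.isclose(old[name], new[name], rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Lines copied from the README excerpt above.
sander_old = "< 1-4 NB = 254.0287 1-4 EEL = 1318.7067 NONBOND = -164.1002"
sander_new = "> 1-4 NB = 254.0287 1-4 EEL = 1318.7061 NONBOND = -164.0688"
nmode_old = "< F = 0.161125E+02 GRDMAX = 0.470665E-12 GNORM = 0.141243E-12"
nmode_new = "> F = 0.161125E+02 GRDMAX = 0.345793E-12 GNORM = 0.101255E-12"

print(significant_diffs(sander_old, sander_new))
print(significant_diffs(nmode_old, nmode_new))
# With no absolute tolerance, the near-zero gradient fields are flagged.
print(significant_diffs(nmode_old, nmode_new, abs_tol=0.0))
```

The absolute tolerance is what keeps GRDMAX and GNORM from being reported:
their relative change is large, but both values are effectively zero.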


        It is of interest that "traditional" Unix boxes (e.g. from SGI,
        Compaq(DEC), HP, Cray T3E, Sparc Solaris) all give identical results,

This would be new as of at least 5.0. In that release the use
of unoptimized binaries for generating the reference numbers
was discontinued, so it may be that machines tend to optimize
the same way. However, unoptimized code should give more 'correct'
answers, insofar as there is a worthwhile distinction, so in general
I'd trust the Intel numbers 'more'. There may be something
in Dave Case's observation about the erfc function, but it didn't
give a diff at step 1, and at step 100 the 'Ewald error' was still
a small number. So the diff might not be significant even if
it had appeared at step 1, if I am interpreting this 'estimate' correctly:

< Ewald error estimate: 0.2978E-04
---
> Ewald error estimate: 0.7049E-04

The main thing for practical purposes, as Dave pointed out, is that
the ensemble characteristics be consistent, and these types of diffs
are a micro attempt to probe this macro aspect.

Bill Ross

Received on Fri Aug 13 1999 - 12:55:32 PDT