
From: Robert Duke <rduke.email.unc.edu>

Date: Fri, 2 Mar 2007 12:49:01 -0500

Hi Julie,

Well, in a perfect universe one would think a computer program would produce the exact same results, given the same input, every time. However, there are confounding factors in something that involves billions of calculations every second (especially if you are talking parallel code). For sander and pmemd, we actually do take fairly determined measures to enhance reproducibility of results, and our calculations are all done in double precision, with spline table-derived values also determined to high accuracy (down around 1E-11 error). This means that sander and pmemd will appear about as reproducible as anything. However, you will only see perfect reproducibility if the following conditions are met:

1) you run a single processor version of the code.
2) you don't use FFTW fft's.
3) you either use the exact same executable or an executable generated by the exact same version/make of the compiler, all system libraries, and the amber source code.
4) you run on the exact same hardware.

Why is this so? In short: rounding error in billions of floating point calculations. As it happens, rounding error is heavily influenced by 1) the exact implementation of various algorithms, especially for things like transcendental functions - sqrt, trig functions, exp, all of which we have to use, and 2) the exact order in which math operations are performed. Here is an interesting point. In the world of pure math, addition is associative - the order in which a series of additions is grouped and performed does not matter. In the world of computer-implemented floating point addition, addition is NOT associative - the rounding error will vary depending on the order of additions. So,
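You can see the non-associativity directly in a few lines of Python (the behavior comes from IEEE 754 double precision arithmetic, so Fortran or C give the same result):

```python
# Floating point addition is not associative: the grouping of the
# operations changes where rounding happens, and therefore the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # round after a+b, then after adding c
right = a + (b + c)  # round after b+c, then after adding a

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

The two sums differ by one unit in the last place - and in MD, a last-place difference is all it takes to start two trajectories diverging.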

1) when you run a parallel version of the code, force summations across the net occur in a different order depending on network interconnect indeterminacy (the order of completion of asynchronous net calls is a function of other things going on in the system - essentially random things relating to what other system background tasks happen to be running, even slight differences in real clock rates between processors).
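A toy sketch of the effect (these are made-up numbers, not real force terms - with realistic forces the discrepancy is confined to the last bits, but the mechanism is identical):

```python
# The same three contributions, reduced in two different arrival orders.
# The values here differ wildly in magnitude to make the effect visible;
# with actual force terms only the last bits of the total change.
arrival_1 = [1e16, 1.0, -1e16]   # the small term arrives in the middle
arrival_2 = [1e16, -1e16, 1.0]   # the big terms cancel before the small one

print(sum(arrival_1))  # 0.0  (1e16 + 1.0 rounds back to 1e16)
print(sum(arrival_2))  # 1.0
```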

2) FFTW, when linked to pmemd, is used in such a way as to optimize performance. FFTW is adaptive code; during initialization it determines the fastest algorithms for the current hardware on the fly, and the answer it gets actually varies depending on operating system-related indeterminacy (basically, how the task time slices happen to go and what else the OS happens to schedule - say on your workstation with that nice big GUI, or even just the usual system background tasks). So FFTW will produce ever so slightly different results, depending on what it thinks the fastest algorithm is for the current phase of the moon (if you don't like this, use my public fft's - they are almost as fast, and deterministic).

3) Compilers play fast and loose with order of operations when order of operations theoretically, i.e., according to math rules, does not matter. So you get different rounding error if you change compiler version, source code, compiler manufacturer, etc. Same basic story for math libraries and other system libraries where the transcendental implementations or order of operations may change.

4) Things like sqrt() are these days implemented in hardware, so different CPUs may be the cause of differences in transcendental results. The difference may be small, but do the operation a few trillion times and you will start seeing differences that are visible in the printout.
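How does a last-bit difference become visible in the printout? MD dynamics are chaotic, so tiny differences grow roughly exponentially. A stand-in sketch (the logistic map here is just a standard chaotic toy iteration, not anything from sander or pmemd) shows the mechanism:

```python
# Seed two copies of a chaotic iteration with a difference on the order
# of one part in 10^15 - about the rounding error of a single double
# precision operation - and watch it grow.
x1 = 0.3
x2 = 0.3 + 1e-15             # a "last bit" sized perturbation

diffs = []
for step in range(100):
    x1 = 4.0 * x1 * (1.0 - x1)   # logistic map at r = 4 (chaotic)
    x2 = 4.0 * x2 * (1.0 - x2)
    diffs.append(abs(x1 - x2))

print(diffs[0])    # still ~1e-15 after one step
print(max(diffs))  # order 0.1-1 within 100 steps: plainly visible
```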

I am extremely paranoid about all this kind of stuff myself, and track it carefully; in my mind one big reason to use 64-80 bits of precision in calcs is to minimize rounding error, thus allowing one to more easily spot other errors that might creep in during s/w development. There are other reasons - like better energy conservation in an NVE ensemble, which tends (or at least should tend) to give folks warm fuzzies about their simulation. There are no reasons in terms of preserving accuracy of results for a single step - the parameters in classical force fields are not exactly 20 digit precision values.

For pmemd run in parallel on systems near equilibrium (highly energetic stuff tends to cause larger errors in shake, which can converge differently to the specified tolerances), pmemd should produce the same numbers in parallel with the public fft's for the first 300 to 500 steps. For uniprocessor code, you should get the exact same trajectory; I have probably only checked for a few thousand steps (don't remember the details - did it early on in dev work).
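The payoff of extra precision is easy to demonstrate. The sketch below (plain Python, using the standard struct module to round intermediate sums to 32-bit floats; not code from amber) accumulates the same value a million times in double versus simulated single precision:

```python
import struct

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack('f', struct.pack('f', x))[0]

n = 1_000_000
step = 0.1              # not exactly representable in binary

total64 = 0.0           # double precision accumulator
total32 = 0.0           # accumulator rounded to single precision each step
for _ in range(n):
    total64 += step
    total32 = f32(total32 + f32(step))

# The true sum is 100000; double precision is off by ~1e-6,
# single precision by hundreds.
print(abs(total64 - 100000.0))
print(abs(total32 - 100000.0))
```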

Well, probably more than you wanted to know... We are careful :-)

Best Regards - Bob Duke (pmemd developer)

----- Original Message -----

From: "Stern, Julie" <jvstern.bnl.gov>

To: <amber.scripps.edu>

Sent: Friday, March 02, 2007 11:34 AM

Subject: AMBER: reproducibility between software

> Hello,
>
> Have there been any studies or comparisons done regarding reproducibility
> of an MD result in amber vs. namd? If all the parameter options are set
> the same and the initial conditions are the same, are the algorithms in
> amber and namd implemented the same so that an exact trajectory would
> come out the same?
>
> Any comments or pointers would be helpful.
>
> Thanks.
>
> --Julie


-----------------------------------------------------------------------

The AMBER Mail Reflector

To post, send mail to amber.scripps.edu

To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu

Received on Sun Mar 04 2007 - 06:07:56 PST
