
From: Robert Duke <rduke.email.unc.edu>

Date: Mon, 20 Oct 2003 09:01:47 -0400

Stephane -

In theory, at least from my perspective, the programming language, be it
Fortran 77, Fortran 95, C, or whatever, should basically be where the
precision of a calculation is determined. This is certainly the intention in
the various Fortrans, and Fortran 90/95 tries to abstract itself even further
from the machine with the concept of "kind" (the standard is almost reluctant
to let you figure out how many bytes are in a data type, which is very
annoying for system-oriented folks), as well as intrinsic functions that
explicitly let you determine the ranges of values a type can hold. If you
know anything about C programming, the language at one point tied its data
types more to the machine than to the precision and range of values required.
Thus an "int" was the "best length" of integer to use on a given machine,
which basically meant whatever fit in the chip's base register set and gave
the best performance. Never mind that it might overflow because it was 16
bits and you needed 32; it was fastest, and that was what mattered. This mess
was ameliorated somewhat with the introduction of defined constants in
limits.h, but C, to my way of thinking, remains annoying to work with where
range and precision across a variety of platforms are concerned.
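Just to illustrate the C side of this with a minimal sketch (nothing to do
with the sander/pmemd sources, purely an example): the actual width of a
plain int has to be discovered from limits.h, whereas the C99 stdint.h types
let the programmer state the width up front, which is much closer in spirit
to what the Fortran 90 kind machinery is after.

/* Sketch only: what a plain "int" actually gives you on this machine,
 * versus asking for an explicit width.                                */
#include <stdio.h>
#include <limits.h>   /* INT_MAX, CHAR_BIT */
#include <stdint.h>   /* C99 fixed-width types */

int main(void)
{
    /* A plain int is only guaranteed to hold +/-32767; its real range
     * is machine/compiler dependent and must be read from limits.h.  */
    printf("int is %zu bits, INT_MAX = %d\n",
           sizeof(int) * CHAR_BIT, INT_MAX);

    /* C99 finally lets you state the width you need up front. */
    int32_t n32 = 2147483647;            /* guaranteed 32 bits */
    int64_t n64 = 9223372036854775807LL; /* guaranteed 64 bits */
    printf("int32_t example: %ld\n", (long)n32);
    printf("int64_t example: %lld\n", (long long)n64);
    return 0;
}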

So, my point in all of this is that, in theory, the guy who wrote the code
should have defined the precision the software requires. Now, several things
mess this up.

First of all, when you look at different machines, some CPUs implement more
precision than the underlying data type promises. For instance, the Intel
chips have a widely used 80-bit internal-precision floating-point mode
(though note that every time you store a value into an array, it gets rounded
back to 64 bits). Other chips may have extra precision for other reasons.
For instance, the IBM POWER3/POWER4 chips can fuse two operations (a
multiplication and an addition) with less loss of precision than the number
of bits in the operands would imply.
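To make the fused multiply-add point concrete, here is a tiny sketch in C
(again just an illustration, not anything from AMBER) using the C99 fma()
function, which rounds a*b + c only once, the way the POWER hardware does,
rather than rounding the product and the sum separately:

/* Sketch: a*b + c rounded twice (separate multiply and add) versus the
 * fused version that rounds only once, as C99's fma() does.  The two
 * can differ in the last bits; compilers may also contract a*b + c
 * into an fma on their own unless told not to (e.g. gcc's
 * -ffp-contract=off), which is one source of cross-machine drift.    */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double a = 1.0 + 1e-8;
    double b = 1.0 - 1e-8;
    double c = -1.0;

    double separate = a * b + c;    /* product rounded, then sum rounded */
    double fused    = fma(a, b, c); /* single rounding of a*b + c        */

    printf("separate : %.17g\n", separate);
    printf("fused    : %.17g\n", fused);
    printf("diff     : %.17g\n", separate - fused);
    return 0;
}

The difference is only in the last bits, but whether such contractions
happen at all is, on many compilers, itself controlled by the precision
options discussed next.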

Because of these architecture-specific issues, the various compilers offer a
slew of options to control precision, especially floating-point precision,
and these of course also affect performance. I tend to select options that
give compliance with the IEEE standard while still giving good performance.
In the Amber group I have heard the opinion expressed that folks don't worry
about being consistent on this stuff because it is all below various other
sources of noise; I agree, but I like to see close agreement between two
machines, at least for a few hundred steps.

Then there is the networking. Here the order of operations becomes
indeterminate, so you get different rounding errors from run to run. This is
even slightly more pronounced with pmemd, because it does dynamic load
balancing. Thus, if in one run a CPU slows down for some external reason, it
will receive less work than it would in an average run, and the order of
operations (basically, the order in which forces are added) will be affected.
Note that when I say "order of operations" here, I mean the order in which
the specific additions, multiplications, and so on are performed, which
determines the rounding errors that actually occur; in computer science,
"order of operations" more often refers to the ordering implicit in
expressions through operator precedence (e.g., multiplication binds tighter
than addition, so 3 + 4 * 5 is 23, not 35).
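Here is a trivial C sketch (an illustration only, not AMBER code) of why the
summation order matters: floating-point addition is not associative, so
adding the same set of numbers in a different order generally changes the
last bits of the result, which is exactly the kind of difference a dynamic
load balancer shuffles around from run to run.

/* Sketch: the same values summed in two different orders generally do
 * not give bit-identical results, because each partial sum is rounded.
 * This is the effect that makes parallel runs, where the reduction
 * order is indeterminate, differ in the last bits from run to run.   */
#include <stdio.h>

#define N 100000

int main(void)
{
    static double x[N];
    for (int i = 0; i < N; i++)
        x[i] = 1.0 / (double)(i + 1);   /* terms of widely varying size */

    double forward = 0.0, backward = 0.0;
    for (int i = 0; i < N; i++)         /* large terms first */
        forward += x[i];
    for (int i = N - 1; i >= 0; i--)    /* small terms first */
        backward += x[i];

    printf("forward  sum: %.17g\n", forward);
    printf("backward sum: %.17g\n", backward);
    printf("difference  : %.17g\n", forward - backward);
    return 0;
}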

Now, the sad truth is that all of the above is mostly academic to just about
everybody but guys like me who worry about whether their software is doing
the right thing: 1) the actual errors from the various force field
assumptions are huge in comparison; 2) the actual errors from the choice of
numerical methods are huge in comparison; and 3) floating point as
implemented on computers is a bigger, more inaccurate mess than is widely
appreciated. For 1), just run different MD implementations. For 2), a book
by Hamming is wonderful: R. W. Hamming, "Numerical Methods for Scientists and
Engineers", Dover. For 3), there is a great review: D. Goldberg, "What Every
Computer Scientist Should Know About Floating-Point Arithmetic", ACM
Computing Surveys, Vol. 23, pp. 5-48 (1991). I am basically a systems guy at
heart and love integers. ;-)

Regards - Bob

----- Original Message -----

From: "Teletchéa Stéphane" <steletch.biomedicale.univ-paris5.fr>

To: <amber.scripps.edu>

Sent: Monday, October 20, 2003 4:45 AM

Subject: Re: AMBER: Implicit precision in sander vs architecture


Received on Mon Oct 20 2003 - 14:53:01 PDT
