Safety in Numbers

Warning: Floating Point Error

Every major computer built today carries a defect that has existed since the first computer, the Zuse Z3, in 1941.  That defect stays invisible until it causes a major catastrophe.  I can show it to you on your cell phone, tablet, notebook, or personal computer.

Failure’s end result

Calculate the square root of 0.1 (screenshot below):

1.png

Square that and you will see the result: “0.1” (screenshot below):

2.png

Now subtract 0.1 (screenshot below):

3.png

The result should be zero — but it is not.

This is not a program bug.  It is built into the very design of your computer hardware.  There is even an international standard, IEEE 754, that specifies that computers be designed this way.

This problem is well known.  It is floating point error.

Floating Point Error

The example above is a simple one, and the resulting error is very small; it is insignificant unless the result is tested against zero.  In the early days of computing, problems were necessarily simple and floating point errors seldom resulted in system failures.  Today, however, the performance of supercomputers is measured in petaFLOPS (quadrillions of floating point operations per second), so there are vastly more operations in which errors can arise and accumulate.

In 1965, RCA introduced the IBM 360-compatible Spectra 70.  The joke at that time was “The Spectra 70 commits round-off error five times faster than the IBM 360.”

Rounding error occurs because some real numbers cannot be represented in a fixed number of digits, just as 1/3 cannot be represented exactly as a decimal fraction (0.333…).  In the same way, 0.1 cannot be represented exactly in today’s binary computers.  But rounding error is not the only kind of floating point error.  The errors in the example above result from a combination of two kinds of error.
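To see what a binary computer actually stores when you type 0.1, here is a small sketch using Python's standard `decimal` and `fractions` modules. It assumes 64-bit binary floating point, and the names `stored_value` and `stored_ratio` are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# Converting the float 0.1 to Decimal reveals the exact value that is stored,
# digit for digit, rather than the rounded "0.1" that is normally displayed.
stored_value = Decimal(0.1)
print("stored value of 0.1:", stored_value)

# The same stored value as an exact ratio of integers.
# The denominator is a power of two, so it can never equal exactly 1/10.
stored_ratio = Fraction(0.1)
print("as a fraction      :", stored_ratio)
print("equal to 1/10?     :", stored_ratio == Fraction(1, 10))
```

The stored number is the closest available binary fraction to 0.1, not 0.1 itself, in the same way that 0.333 is only close to 1/3.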

In addition to rounding error, there is also truncation error.  Truncation error occurs when two nearly equal numbers are subtracted (numerical analysts more often call this effect cancellation): the leading digits cancel, and the few digits that remain are dominated by earlier rounding.  Unfortunately, rounding error and truncation error cannot be combined directly, because rounding is linear and truncation is exponential.  It is like adding apples and oranges: to understand and combine them, we need a generalization of floating point error, just as “fruit” generalizes “apples and oranges.”  This will be the subject of a future blog post.
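The effect of subtracting similar numbers can be sketched in a few lines. This is a minimal illustration under the assumption of 64-bit binary floating point; the values and the names `a`, `b`, `expected`, and `relative_error` are chosen only for this example.

```python
# Two numbers that agree in almost all of their leading digits.
a = 1.0 + 1e-15   # already slightly rounded when it is stored
b = 1.0

# Mathematically, a - b is exactly 1e-15.
expected = 1e-15
difference = a - b

# The leading digits cancel, so the small rounding error in `a`
# becomes a large fraction of the result.
relative_error = abs(difference - expected) / expected

print("computed difference:", repr(difference))
print("expected difference:", repr(expected))
print("relative error     :", relative_error)
```

The rounding error in `a` is far too small to notice on its own, yet after the subtraction it accounts for a visible share of the answer.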