The Trouble With Rounding Floats
lukfil writes "We all know of floating point numbers, so much so that we reach for them each time we write code that does math. But do we ever stop to think what goes on inside that floating point unit and whether we can really trust it?"
What about encoding floats as a pair of ints or longs: one to express the numerical value, and the other its power of ten; that is, decimal arithmetic?
This is why I use the decNumber library from IBM.
The decNumber library implements the General Decimal Arithmetic Specification[1] in ANSI C. This specification defines a decimal arithmetic which meets the requirements of commercial, financial, and human-oriented applications.
http://www2.hursley.ibm.com/decimal/decnumber.htm
The library fully implements the specification, and hence supports integer, fixed-point, and floating-point decimal numbers directly, including infinite, NaN (Not a Number), and subnormal values.
The code is optimized and tunable for common values (tens of digits) but can be used without alteration for up to a billion digits of precision and 9-digit exponents. It also provides functions for conversions between concrete representations of decimal numbers, including Packed Decimal (4-bit Binary Coded Decimal) and three compressed formats of decimal floating-point (4-, 8-, and 16-byte).
If you are actually concerned about rounding and precision, use decimal instead.
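A minimal sketch of the difference, using Python's standard decimal module rather than decNumber itself (the variable names are just for illustration). The decimal operands are built from strings so that they never pass through a binary float first.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact base-2 representation,
# so their sum is not exactly the double nearest to 0.3.
binary_sum = 0.1 + 0.2

# Decimal arithmetic: operands created from strings are stored exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(binary_sum == 0.3)              # False
print(decimal_sum == Decimal("0.3"))  # True
```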
I am Intel of Borg, you will be approximated.
There have been many examples, such as the original Pentium bug. Of course, there was also a bug in Windows Calc: 2.01 - 2.0 = 0 (if I remember correctly).
Fight Spammers!
Apparently the author of the article didn't read the stories in RISKS that he cited. In particular, the 'pensioners being shortchanged' one talks about them not being paid interest on 'float'-- cash flow on transactions in progress. This has little to do with floating point numbers.
Similarly, the spacecraft problem mentioned is one of an errant cast, not because of dilution of precision in floating point calculations.
The author could really pick his examples better-- as mistakes in numerical programming happen often and are often of great import.
This is not a new problem. Or an unsolved one. Is there any modern programming language that does not supply a data type or library with exact decimal arithmetic support? Using a float to represent monetary amounts and expecting them to be free of rounding errors is as stupid as using integers to store zip codes and wondering where the leading zeros went from all the addresses in New England. If you can't be arsed to choose the right data type get out of the business.
Sigs? Sigs? We don't need no steenkin' sigs.
He talks about scientific applications, but actually very few scientific calculations are sensitive to rounding error. Remember, they sent astronauts to the moon using slide rules. Generally for scientific applications, you just don't want to roll your own crappy subroutines for stuff like matrix inversion; use routines written by people who know what they're doing. (And know the limitations of the algorithm you're using. For example, there are certain goofy matrices that will make a lot of matrix inversion algorithms blow chunks.)
For business apps, the classic solution was to use BCD arithmetic. But today, is it more practical (and simple) just to use a language like Ruby, that has arbitrary-precision integers, so you can just store everything in units of cents? A lot of machines used to have special BCD instructions; do those exist on modern CPUs?
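A rough sketch of the "store everything in cents" idea, written in Python rather than Ruby (both have arbitrary-precision integers). The prices and the format_dollars helper are made up for illustration.

```python
# Amounts held as integer cents; Python ints never overflow or round.
prices_cents = [1999, 499, 1]   # $19.99, $4.99, $0.01 (illustrative values)

total_cents = sum(prices_cents)

def format_dollars(cents: int) -> str:
    """Render a non-negative cent count as dollars and cents."""
    dollars, remainder = divmod(cents, 100)
    return f"${dollars}.{remainder:02d}"

print(format_dollars(total_cents))  # $24.99
```

Conversion to a display string happens only at the edges; all of the arithmetic in between stays exact.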
Find free books.
float (and the big brother double) is inaccurate. It's no surprise. A 32-bit Float is but a single simple tool in a programming language. If anyone is surprised by how Floats behave then they are, most likely, inexperienced.
You don't start addressing a problem in software just by assuming Float or Double will magically fill every need. An experienced programmer needs to have a knowledge of how to use, and how not to use, the programming tools at hand. TFA about floating point numbers is very introductory (at the end it mentions that the next article will tell us how to "avoid the problem"... I assume it will go on to cover some basic idioms.) In a way it misses the point: Floating-point rounding is not a "problem". Floats and Doubles always do their job, but you have to know what that job is! The behaviour of floating point numbers should not be a big surprise to a seasoned coder.
For example: You can't use float or double to store the numerical result of a 160-bit SHA-1 hash... you have to use the full 160 bits. (Duh, right?) So, if you use a mere 32 bits (float) or 64 bits (double) to store that number, you are going to sacrifice a lot of accuracy!
You beat me to it. I was just thinking about how much better TFA would be if they explained the specifics of how the Office Space team ripped off the banking system.
Information wants a fueled airplane waiting at the hangar and no one gets hurt.
We have the same problem in everyday numbers. Try representing 1/3 in any finite number of digits. You can't. The big thing about floating-point numbers that trips people up is that we're used to thinking in base 10. Floating-point numbers in computers typically aren't in base 10, they're in base 2. The rounding problem he describes is simply us getting confused and wondering why a fraction with an exact representation in base 10 doesn't have an exact representation in base 2. The obvious solution is the one he alludes to at the end: don't use base 2. Computers have had base-10 arithmetic in them for decades; in fact, the x86 family has base-10 arithmetic instructions built in (the packed-BCD instructions). COBOL has used packed-BCD since its beginning, which is why you don't find this sort of calculation error in ancient COBOL financial packages running on mainframes.
Due to the efforts of William Kahan at U.C. Berkeley, IEEE 754 floating point, which is what we have today on almost everything, is far, far better than earlier implementations.
Just for starters, IEEE floating point guarantees that, for integer values that fit in the mantissa, addition, subtraction, and multiplication will give the correct integer result. Some earlier FPUs would give results like 2+2 = 3.99999. IEEE 754 also guarantees exact equality for integer results; you're guaranteed that 6*9 == 9*6. Fixing that made spreadsheets acceptable to people who haven't studied numerical analysis.
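The integer guarantee is easy to check. This sketch assumes Python floats are IEEE 754 doubles (true on essentially every current platform), which have a 53-bit significand; the name limit is just for illustration.

```python
# Every integer up to 2**53 fits in a double's 53-bit significand.
limit = 2 ** 53

print(2.0 + 2.0 == 4.0)                    # True
print(6.0 * 9.0 == 9.0 * 6.0 == 54.0)      # True
# One past the limit no longer fits, so the +1 is rounded away.
print(float(limit) + 1.0 == float(limit))  # True
```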
The "not a number" feature of IEEE floating point handles annoying cases, like division by zero. Underflow is handled well. Overflow works. 80-bit floating point is supported (except on PowerPC, which broke many engineering apps when Apple went to PowerPC.)
Those of us who do serious number crunching have to deal with this all the time. It's a big deal for game physics engines, many of which have to run on the somewhat lame FPUs of game consoles.
The use of transforms for handling numerical calculations is an old trick. It is probably best-known in its use as a very quick way to multiply or divide using logarithms and a slide-rule, prior to the advent of widely-available scientific calculators and computers. Nonetheless, devices based on logarithmic calculations (such as the mechanical CURTA calculator) can wipe the floor with most floating-point maths units - this despite the fact that the CURTA dates back to the mid 1940s.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
The author goes on and on about how floating point numbers are inaccurate, and unable to precisely represent real values, like this is something new, or even something different from the number approximations we normally use.
The reason the examples the author cites can't be represented precisely is that floating point numbers are ultimately represented as base-2 fractions, and there are a bunch of finite-length base-10 fractions that don't have a non-repeating base-2 representation. Guess what? We have *exactly* the same problem with the base-10 fractions that everyone uses all the time. Show me how you write 1/3 as a decimal!
The problem isn't that floating point numbers are inherently problematic, the problem is that we typically use them by converting base-10 numbers to them, doing a bunch of calculations and then converting them back to base 10. Floating point rounding isn't an unsolved problem -- floating point rounding works perfectly, and always has. It's just that the approximations you get when you round in base 2 don't match the approximations you get when you round in base 10.
Bottom line: If you care about getting the same results you'd get in base 10, do your work in base 10. This is why financial applications should not use floating point numbers.
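To see the base-2 approximation directly, here is a small sketch that asks Python for the exact value a double stores when you write 0.1. It assumes IEEE 754 doubles; stored is an illustrative name.

```python
from decimal import Decimal
from fractions import Fraction

# Fraction(float) recovers the exact rational value held in the double.
stored = Fraction(0.1)

print(stored == Fraction(1, 10))      # False: 1/10 is not a base-2 fraction
print(stored.denominator == 2 ** 55)  # True: the denominator is a power of two
print(Decimal(0.1))                   # the exact stored value, in base 10
```

The last line prints a long decimal that begins 0.1000000000000000055, which is the "repeating" base-2 fraction cut off at 53 significant bits.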
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
and
When bad things happen to good numbers (as well as Becker's other floating-point columns on that same page)
This is why I use DECIMAL and not FLOAT in MySQL. Problem solved. I'm not a big fan of floats, the extreme precision that they seem to have is mostly an illusion.
If you think your calculator doesn't give rounded off numbers, I hope you're not working in science or engineering.
Welcome to a very poor article on what's been taught in early Comp Sci for many many years.
Any serious developer of business software knows all about this and avoids floating point at all cost for financial calculations. Scientists however do use them carefully since the math they do is usually much more performance (speed) sensitive and the calculations are a little more complex than what tends to be done on the business side (ie _most_ business calcs are relatively simple).
These posts express my own personal views, not those of my employer
Actually the problem was that they used a float to store the system time (time since power on) in the ground radar unit. It allowed the clock to be used in calculations without a conversion. A float will store an integer just fine (and accurately) until the number gets too large, and then the units part drops off the bottom of the precision and the increment operator no longer makes any sense. This was a design decision that made sense for the role for which the missile platform was originally designed. The Patriot was originally designed to be used in the European Theater (if the Cold War ever turned hot) and as such would never remain in one location for more than a very few days. The clock is reset every time they move the battery (they power off the ground tracking radar when they move). The use in the Gulf War was in a strategic role (not tactical), which kept them continuously operating in a single location for long periods of time, and the shortcut they used came back to haunt them (as usual). If they had reset the system every few days, the problem would not have occurred.
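The "increment no longer makes sense" effect can be reproduced with a 32-bit float. This is only a sketch of the general behaviour, not a reconstruction of the Patriot software; to_float32 is a hypothetical helper that rounds a Python double to single precision via the struct module.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# A 32-bit float has a 24-bit significand, so 2**24 is where
# consecutive integers stop being representable.
ticks = to_float32(2 ** 24)                 # 16,777,216
after_increment = to_float32(ticks + 1.0)

print(after_increment == ticks)             # True: the increment was lost
```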
Atlas stands on the earth and carries the celestial sphere on his shoulders.
And I'm not sure if it can be solved altogether. When you spend a little time meditating over the IEEE 754, you notice a few flaws. The first and most obvious is, of course, that, no matter how precise you want to make it, somewhere there's a cutoff. And, especially when you multiply with floats, that error grows as well. But there's another problem. Two actually.
The first one is the one mentioned in the article, and something everyone who didn't sleep through his IT classes should know: Computers calculate in binary, and converting decimal fractions to binary floats isn't always possible without error. There is no way to represent 0.37 exactly in binary, in IEEE 754, no matter how many bits you spend on the mantissa. Now, you can argue that, if you make it "big enough", it doesn't matter anymore since it's well within the error margin and when you round it to, say, 5 places after the decimal point, the error vanishes. True. But when you start calculating, when you multiply or, worse, exponentiate, the error grows in big leaps.
Another, less obvious, problem is hidden underneath the way the IEEE754 works: Your error grows as your numbers grow. This might seem obvious, but it is interesting how many people overlook this flaw and problem in everyday life. Since according to the IEEE754 standard, real numbers are stored as exponent and mantissa, if you're dealing with BIG numbers, a fair deal of your mantissa is spent on the "pre-comma" part of your number, so you're losing precision. You can't reliably say that "a double is good for 5 behind dot, no matter what", you have to take into account how many of those precious mantissa bits are spent before you even get to ponder what's left for your precision.
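A quick way to watch precision shrink as magnitude grows is to print the gap between a double and its next neighbour. This sketch assumes Python 3.9 or later, where math.ulp is available; the sample magnitudes are arbitrary.

```python
import math

# math.ulp(x) is the spacing between x and the next representable double.
for value in (1.0, 1.0e6, 1.0e12, 1.0e16):
    print(f"{value:.0e}: gap to next double = {math.ulp(value):.3e}")

# At 1e16 the gap is 2.0, so adding 1.0 changes nothing.
print(1.0e16 + 1.0 == 1.0e16)  # True
```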
This isn't so much a problem of processors. It's a problem of people understanding how their processors work.
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
GIMPS looks for Mersenne primes. This is clearly an exact integer operation. However, for speed, they use Fast Fourier Transforms to do the big squaring operation with floating point. Obviously, they need an exact result.
The trick is to carefully calculate exactly how much error each operation can generate. It is possible to know exactly how many bits of your result contain valid information. If you need more accuracy, you can split it into multiple operations. As long as the final accumulated error in their result is less than .5, you have the integer answer they need. Note that it's basically impossible to do this without using assembly language, because the order of operations and subexpression elimination definitely matter.
Another interesting problem occurs with floating point results. You cannot expect the complete answer to be exactly identical on all machines. Even on the same machine, compiler settings affect the answer: x87 differs significantly from SSE. If you are doing something that needs bitwise identical results on all machines, you need to either implement it with integer math, or do what GIMPS does and do error tracking.
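For the curious, the "floating-point FFT, then round to the nearest integer" trick can be sketched in a few lines. This is a toy illustration and certainly not GIMPS's actual code: the fft and square_digits functions, the base-10 digit layout, and the test number are all assumptions made for the example.

```python
import cmath

def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def square_digits(digits):
    """Square a number given as base-10 digits, least significant first.

    Returns (products, worst_error): the integer convolution terms
    recovered by rounding, and the largest distance from an integer seen.
    """
    size = 1
    while size < 2 * len(digits):
        size *= 2
    padded = [complex(d) for d in digits] + [0j] * (size - len(digits))
    spectrum = fft(padded)
    squared = fft([s * s for s in spectrum], invert=True)
    products, worst_error = [], 0.0
    for term in squared:
        approx = term.real / size      # undo the inverse-FFT scaling
        nearest = round(approx)        # snap to the integer it should be
        worst_error = max(worst_error, abs(approx - nearest))
        products.append(nearest)
    return products, worst_error

n = 123456789
digits = [int(c) for c in reversed(str(n))]
products, worst_error = square_digits(digits)
result = sum(p * 10 ** i for i, p in enumerate(products))

print(result == n * n)    # True
print(worst_error < 0.5)  # True: the rounding was safe
```

The worst_error value plays the role of the error tracking described above: if it ever crept close to 0.5, the rounded answer could no longer be trusted.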
Melissa
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
If you're interested in rounding (and who isn't?) you might want to read An introduction to different rounding algorithms.
"when i use my calculator, it doesn't give rounded off numbers."
Not true.
In the math class I teach, I do the following: have everyone take a calculator and do "2/3".
Half of the calculators say this: "0.666666666" (rounded down).
Half of the calculators say this: "0.666666667" (rounded up).
In truth, an exact answer requires an infinite sequence of "6"'s. The calculator (or any computer) must decide whether to round up or down to fit it into its display space (or memory). You always have some round-off error -- and the more calculations you do, the more the round-off error builds up and up and up....
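Both calculator behaviours can be mimicked with Python's decimal module. This is just a sketch; the nine-digit display width is taken from the outputs quoted above, and the variable names are illustrative.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

two_thirds = Decimal(2) / Decimal(3)   # 28 significant digits by default
nine_places = Decimal("0.000000001")   # a nine-digit display

truncated = two_thirds.quantize(nine_places, rounding=ROUND_DOWN)
rounded = two_thirds.quantize(nine_places, rounding=ROUND_HALF_UP)

print(truncated)  # 0.666666666
print(rounded)    # 0.666666667
```

Neither display is exact; the two calculators simply make different choices about the digit they cannot show.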
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Do you understand thatis not the same asand thatCan be true?
Every day, many people make the mistake of using floats when what they really wanted was the ability just to represent large numbers. For example, in Mac OS X, the system uses doubles as representations of time. This is the worst idea I can think of. First of all, floats are imprecise and time is the thing that man can subdivide the most precisely. Secondly, if a Mac OS X machine is on long enough, time will cease to progress because of the 2nd statement above!
Plenty of people who thought they knew what they were doing have used floats in places where they are a bad idea. And it continues to today.
So I say no, Computer Science 101 doesn't seem to cover all you need to know about floats.
http://lkml.org/lkml/2005/8/20/95
I once wrote interest-calculation software for a bank. This was new software to replace their old stuff. Naturally, I stored the values in cents, not guilders/dollars/euros, to avoid rounding errors (which really have a big effect in interest calculations).
When I delivered my software, they compared my output to the output their old software produced. There were small differences. They asked me where these came from, and I traced them back to rounding errors in their old software. I showed them this by example, thinking that they would be happy that their new software did not have this problem.
Their response? "The new software should produce exactly the same numbers as the old software." "But the old numbers are WRONG!" "That does not matter, the new software should produce exactly the same numbers as the old software."
It is really hell to make good new software error-compatible with faulty old garbage.
It's Transcendental number.
Not trolling, just helping since you asked.
So you can laugh all you want to...
One solution, as you put it, is to priority-sort your operations.
Another solution is the Kahan summation algorithm,
which, roughly speaking, keeps track of the error at each step and injects it back at the next.
In your example, in each iteration, the algorithm notices that the 1.0e1 is missing from the sum and carries it to the next addition. A few iterations later, the carry is big enough to be added to the result.
The advantages are: you don't need to first load all components in a tree, then iteratively sort them and process them all until you're done. In fact, you can even use this algorithm in a streaming fashion, where you don't even need to know how many values will come.
The disadvantages are: some compilers are able to guess that the carry "should mathematically be 0" (actually true in a perfect world with infinite-precision numbers) and could "optimise" the code back to a plain normal sum function, bypassing the algorithm (and won't subsequently use any other sum-correction algorithm).
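A minimal sketch of Kahan summation in Python, where the optimiser concern does not apply. The input values are my own example (one huge term followed by ten small ones), not the ones from the parent post.

```python
def kahan_sum(values):
    """Sum an iterable of floats with Kahan's compensated summation."""
    total = 0.0
    compensation = 0.0   # running estimate of the low-order bits lost so far
    for value in values:
        adjusted = value - compensation      # re-inject the earlier error
        new_total = total + adjusted         # low bits of adjusted may be lost
        compensation = (new_total - total) - adjusted  # recover what was lost
        total = new_total
    return total

values = [1.0e16] + [1.0] * 10

naive = 0.0
for v in values:
    naive += v           # each 1.0 is rounded away against 1e16

print(naive == 1.0e16)                     # True: all ten 1.0s vanished
print(kahan_sum(values) == 1.0e16 + 10.0)  # True: the carries were kept
```

Because it processes one value at a time, the same function works on a generator, which is the streaming use mentioned above.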
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Actually, the answer is 42.
PS: That is what part of the alphabet would look like if the letters "Q" and "R" were removed.
Basically, I agree with you, but as Hamming pointed out in the 1950s you can get yourself into trouble with something like:
A = small number, e.g. size of smallest feature in a Celeron-M CPU in microns
B = big number, e.g. distance to Andromeda galaxy in microns
C = (A+B) ... (some set of clever operations) ... - B
The math is fine, but the implementation won't yield the correct answer because the value of interest got scaled off. That isn't rocket science, but it is surprisingly hard to catch.
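The scaling-off is easy to reproduce. In this sketch the two magnitudes are rough placeholders picked only to be very far apart, not measured values, and the "clever operations" in the middle are left out entirely.

```python
a = 0.09      # small number, in microns (placeholder value)
b = 2.4e28    # big number, in microns (placeholder order of magnitude)

# Mathematically (a + b) - b equals a, but a is far smaller than the
# gap between adjacent doubles near b, so it is lost in the addition.
c = (a + b) - b

print(c)  # 0.0
```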
That said, Floating Point works surprisingly well for most things. Back in the 1960s, things like 2+2=3.99999 were a fairly common phenomenon. Hardly ever happens today. I spent about three decades in the defense industry writing and testing systems that got things from point X to point Y. I think I saw a reasonable selection of the possible errors that folks can make. Other than subtracting big numbers from other big numbers and expecting the result to be precise, I can't recall a single problem with floating point.
If we're looking for something that is basically broken to write an article about, forget floating point. Try event driven software architectures. Or the notion that it is possible to write unambiguous specifications that are also comprehensible.
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
I am very small, utmostly microscopic.
You do realize that there's more to business than cash registers, right?
A 32-bit fixed point number maxes out at 21,474,836.47 which is severely limiting for all but small-sized businesses and tiny governments.
64-bit fixed point numbers (max 92,233,720,368,547,758.07) are obviously better, but are only efficient on 64-bit machines, which are still a minority of installed machines.
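Those limits follow from storing a signed count of cents in the given word size. A small sketch to check them; max_fixed_point is a hypothetical helper, and two decimal places are assumed.

```python
def max_fixed_point(bits: int) -> str:
    """Largest amount a signed integer of `bits` bits can hold, in cents."""
    max_cents = 2 ** (bits - 1) - 1
    units, cents = divmod(max_cents, 100)
    return f"{units:,}.{cents:02d}"

print(max_fixed_point(32))  # 21,474,836.47
print(max_fixed_point(64))  # 92,233,720,368,547,758.07
```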
"I don't know, therefore Aliens" Wafflebox1
Everyone knows about rounding problems with floats. What people don't seem to realize (and this goes for many real "Rocket Scientists" I've known) is that a float only gives you a fixed number of digits - that's all. You can have a highly precise small number or a really big number. You cannot ever have a precise, big number. It just so happens that these same "Rocket Scientists" like to represent time as a floating point based on some 1970 epoch. Guess how accurate that is going to be in a couple of years. How do they solve this problem? The "double double" type of course!
Why would this count as "news"? Everyone who has to deal with this would already know about it.