Slashdot Mirror


The Trouble With Rounding Floats

lukfil writes "We all know of floating point numbers, so much so that we reach for them each time we write code that does math. But do we ever stop to think what goes on inside that floating point unit and whether we can really trust it?"

12 of 456 comments (clear)

  1. decNumber libary from IBM by Not+The+Real+Me · · Score: 5, Informative

    This is why I use the decNumber library from IBM.

    http://www2.hursley.ibm.com/decimal/decnumber.html The decNumber library implements the General Decimal Arithmetic Specification[1] in ANSI C. This specification defines a decimal arithmetic which meets the requirements of commercial, financial, and human-oriented applications.

    The library fully implements the specification, and hence supports integer, fixed-point, and floating-point decimal numbers directly, including infinite, NaN (Not a Number), and subnormal values.

    The code is optimized and tunable for common values (tens of digits) but can be used without alteration for up to a billion digits of precision and 9-digit exponents. It also provides functions for conversions between concrete representations of decimal numbers, including Packed Decimal (4-bit Binary Coded Decimal) and three compressed formats of decimal floating-point (4-, 8-, and 16-byte).

  2. Re:Decimal Arithmetic by Anonymous Coward · · Score: 5, Insightful

    This is not newsworthy. This is computer science 101.

  3. The author is seriously confused by mlyle · · Score: 5, Insightful

    Apparently the author of the article didn't read the stories in RISKS that he cited. In particular, the 'pensioners being shortchanged' one talks about them not being paid interest on 'float'-- cash flow on transactions in progress. This has little to do with floating point numbers.

    Similarly, the spacecraft problem mentioned is one of an errant cast, not because of dilution of precision in floating point calculations.

    The author could really pick his examples better-- as mistakes in numerical programming happen often and are often of great import.

  4. Not news. by SJasperson · · Score: 5, Insightful

    This is not a new problem. Or an unsolved one. Is there any modern programming language that does not supply a data type or library with exact decimal arithmetic support? Using a float to represent monetary amounts and expecting them to be free of rounding errors is as stupid as using integers to store zip codes and wondering where the leading zeros went from all the addresses in New England. If you can't be arsed to choose the right data type get out of the business.

    --
    Sigs? Sigs? We don't need no steenkin' sigs.
  5. It used to be much worse. Kahan fixed it. by Animats · · Score: 5, Interesting

    Due to the efforts of Willam Kahan at U.C. Berkeley, IEEE 754 floating point, which is what we have today on almost everything, is far, far better than earlier implementations.

    Just for starters, IEEE floating point guarantees that, for integer values that fit in the mantissa, addition, subtraction, and multiplication will give the correct integer result. Some earlier FPUs would give results like 2+2 = 3.99999. IEEE 754 also guarantees exact equality for integer results; you're guaranteed that 6*9 == 9*6. Fixing that made spreadsheets acceptable to people who haven't studied numerical analysis.

    The "not a number" feature of IEEE floating point handles annoying cases, like division by zero. Underflow is handled well. Overflow works. 80-bit floating point is supported (except on PowerPC, which broke many engineering apps when Apple went to PowerPC.)

    Those of us who do serious number crunching have to deal with this all the time. It's a big deal for game physics engines, many of which have to run on the somewhat lame FPUs of game consoles.

  6. This is why you would choose... by jd · · Score: 5, Informative
    One of the many many solutions:


    • Fixed-point numbers
    • Berkeley MP or Gnu MP arbritary-length floating-point
    • Co-processors with truly massive internal registers (I refuse to use less than 80-bit)
    • Delayed calculation (ie: actually process a calculation at the end, storing the inputs and operators until you absolutely need the value - eliminates intermediate rounding errors and if the value is never needed, you don't waste the clock cycles)
    • Don't use real numbers - apply a scaler or a transform such that ALL components of any scaled/transformed calculations must be integer, then only transform back for display purposes


    The use of transforms for handling numerical calculations is an old trick. It is probably best-known in its use as a very quick way to multiply or divide using logarithms and a slide-rule, prior to the advent of widely-available scientific calculators and computers. Nonetheless, devices based on logarithmic calculations (such as the mechanical CURTA calculator) can wipe the floor with most floating-point maths units - this despite the fact that the CURTA dates back to the mid 1940s.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  7. Bah. Author doesn't understand arithmetic. by swillden · · Score: 5, Insightful

    The author goes on and on about how floating point numbers are inaccurate, and unable to precisely represent represent real values, like this is something new, or even something different from the number approximations we normally use.

    The reason the examples the author cites can't be represented precisely is that floating point numbers are ultimately represented as base-2 fractions, and there are a bunch of finite-length base-10 fractions that don't have a non-repeating base-2 representation. Guess what? We have *exactly* the same problem with the base-10 fractions that everyone uses all the time. Show me how you write 1/3 as a decimal!

    The problem isn't that floating point numbers are inherently problematic, the problem is that we typically use them by converting base-10 numbers to them, doing a bunch of calculations and then converting them back to base 10. Floating point rounding isn't an unsolved problem -- floating point rounding works perfectly, and always has. It's just that the approximations you get when you round in base 2 don't match the approximations you get when you round in base 10.

    Bottom line: If you care about getting the same results you'd get in base 10, do your work in base 10. This is why financial applications should not use floating point numbers.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  8. Re:Decimal Arithmetic by gweihir · · Score: 5, Informative

    Is there any fundamental reason why decimal arithmetic in a computer should be more accurate than binary arithmetic in a computer?

    No, no, the problem is not with the precision! The problem is that when input and output is decimal, but the calculation is binary, then you get additional errors from the conversion that badly educated programmers do not expect.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  9. Re:Decimal Arithmetic by Fordiman · · Score: 5, Informative

    No, C will automatically recast a number as needed in cases like the above.

    The issue is actually a pretty commonly understood situation when going from decimal floating point numbers to binary IEEE floats (I have another comment on here describing how they're stored), and it basically comes down to this:

    Floats of any sort are stored as an int with an int shift (a.aa x b^c). As such, there will be aliasing problems based on the prime components of b. A known percentage of divisors will produce repeating numbers. For example, any division of 3,5,7,11.... in base 2 will be repeating. Any division of 3,7,11,13... in base 10 will be repeating.

    No, there's nothing you can do about it. Use higher precision if needed, and otherwise get over it.

    --
    110100 1101000 1101000 1100110 0 1101111 1101000 1100011 1
  10. The problem is using floating point improperly by Myria · · Score: 5, Interesting

    GIMPS looks for Mersenne primes. This is clearly an exact integer operation. However, for speed, they use Fast Fourier Transforms to do the big squaring operation with floating point. Obviously, they need an exact result.

    The trick is to carefully calculate exactly how much error each operation can generate. It is possible to know exactly how many bits of your result contain valid information. If you need more accuracy, you can split it into multiple operations. As long as the final accumulated error in their result is less than .5, you have the integer answer they need. Note that it's basically impossible to do this without using assembly language, because the order of operations and subexpression elimination definitely matter.

    Another interesting problem occurs with floating point results. You cannot expect the complete answer to be exactly identical on all machines. Even on the same machine, compiler settings affect the answer: x87 differs significantly from SSE. If you are doing something that needs bitwise identical results on all machines, you need to either implement it with integer math, or do what GIMPS does and do error tracking.

    Melissa

    --
    "Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
  11. Re:Decimal Arithmetic by Eivind · · Score: 5, Informative
    There's other funkyness too, besides the precision. For example, if you're adding up a lot of floating-point numbers, it makes a difference what sequence you do the additions in.

    For example, if your input consist of one large number, and tons of small ones, then rounding-errors mean that starting with the large number gives a much smaller result than starting with the small ones.

    If I scale it down to smaller numbers, you see why:

    1.0*10^5 + 1.0*10^1 = 1.0*10^5

    So, adding a "small" number to a "large" number gives you simply the large number.

    If you repeat this, a million times, your result is still simply the large number.

    So you could end up concluding that 1.0*10^5 + (1.0*10^1 + 1.0*10^1 ..[1000000 times]...) = 1.0*10^5

    That is an order of magnitude wrong. The correct result is 1.1*10^6

    Practical result ? You need to think about your input. If it *may* look like this, you need to add up by repeatedly adding the two smallest numbers. Easy to do with a priority-tree. pseudocode like this:

    • Insert all numbers in priority-tree.
    • Extract two smallest numbers from tree.
    • Add the two numbers, producing a new number.
    • Push this single new number into the tree.
    • Repeat from step 2 until you're left with a single number.

    MS-Excel, by the way, does *NOT* do this in it's SUM() function, if you feed it a "large" number and *many* "small" numbers, you get horrendously wrong results. Because of the relatively high precision of floats and doubles though, you need to use larger numbers than in my example here.

  12. Re:Decimal Arithmetic by johnw · · Score: 5, Interesting
    Friends of mine went off to work "In The City", when I quizzed them about their use of numbers for stock prices etc they were equally dismayed that things were being passed around as doubles.

    I'd be not only dismayed but very surprised to find anything which interfaces to the London Stock Exchange passing stock prices around as doubles, or as any other kind of floating point number.

    The LSE feeds all use 18 digits for values, with the first 10 being implicitly before the decimal point and the remaining eight being after the implicit decimal point. This is very handy because it means all the values can be manipulated using 64 bit integers. The LSE rules also state very precisely how rounding must be handled. If you try to submit a multi-million pound deal and your calculation of the consideration is out by just one penny then the deal will be rejected.

    No-one with the slightest clue about how to code would use floating point maths in any kind of financial program, particularly not one where they're working with the LSE.