Slashdot Mirror


The Trouble With Rounding Floats

lukfil writes "We all know of floating point numbers, so much so that we reach for them each time we write code that does math. But do we ever stop to think what goes on inside that floating point unit and whether we can really trust it?"

70 of 456 comments

  1. Decimal Arithmetic by (1+-sqrt(5))*(2**-1) · · Score: 4, Insightful
    From TFA:
    Example 1: showing approximation error.

    #include <stdio.h>

    // some code to print a floating point number to a lot of
    // decimal places
    int main(void)
    {
        float f = .37;
        printf("%.20f\n", f);
        return 0;
    }
    The main problem with that example, I take it, is that single-precision datatypes are only good for roughly seven significant decimal digits; using double, of course, only defers the problem.

    What about encoding floats as a pair of ints or longs: one to express the numerical value, and the other its power of ten; id est, decimal arithmetic?
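
    A minimal sketch of that pair-of-integers idea, in Python rather than C so the integers cannot overflow. The (coefficient, exponent) layout and every function name here are assumptions made up for illustration, not anything from TFA.

```python
from typing import Tuple

# A decimal value is modelled as (coefficient, exponent), meaning
# coefficient * 10**exponent. Both parts are plain integers.
Dec = Tuple[int, int]


def align(a: Dec, b: Dec) -> Tuple[int, int, int]:
    """Rescale both values to the smaller exponent so they can be added."""
    exp = min(a[1], b[1])
    return a[0] * 10 ** (a[1] - exp), b[0] * 10 ** (b[1] - exp), exp


def dec_add(a: Dec, b: Dec) -> Dec:
    """Add two decimal pairs exactly."""
    ca, cb, exp = align(a, b)
    return ca + cb, exp


def dec_mul(a: Dec, b: Dec) -> Dec:
    """Multiply two decimal pairs exactly."""
    return a[0] * b[0], a[1] + b[1]


def dec_str(a: Dec) -> str:
    """Render a pair as a decimal string."""
    coeff, exp = a
    if exp >= 0:
        return str(coeff * 10 ** exp)
    sign = "-" if coeff < 0 else ""
    digits = str(abs(coeff)).rjust(-exp + 1, "0")
    return f"{sign}{digits[:exp]}.{digits[exp:]}"


print(dec_str(dec_add((37, -2), (1, -1))))   # 0.47
```

    With fixed-width ints or longs, as the question proposes, the coefficient could still overflow, and division would still need a rounding rule.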

    1. Re:Decimal Arithmetic by Anonymous Coward · · Score: 5, Insightful

      This is not newsworthy. This is computer science 101.

    2. Re:Decimal Arithmetic by Anomie-ous+Cow-ard · · Score: 3, Informative

      Since these are computers, and they deal primarily with binary internally, why not store the numerical value and the 2nd power instead? Oh, and since we generally need more bits of accuracy in the numerical value than the exponent (do you often deal with numbers 2**(2**32)?), why not allocate a "reasonable" number of bits to the exponent and leave more for the numerical value.

      Uh oh, we just re-invented floating point. Oh well, nice try.
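
      To make the "numerical value plus power of two" layout concrete, here is a rough Python sketch that pulls a double apart into an integer significand and a binary exponent. It uses a 64-bit double because that is what a Python float is; `split_double` and `exact_value` are hypothetical helper names.

```python
import math
from fractions import Fraction


def split_double(x: float):
    """Return (significand, exponent) with x == significand * 2**exponent.

    The significand is an integer of at most 53 bits, which is the
    precision of an IEEE-754 double.
    """
    mantissa, exponent = math.frexp(x)      # x == mantissa * 2**exponent
    significand = int(mantissa * 2 ** 53)   # exact: mantissa has 53 bits
    return significand, exponent - 53


def exact_value(x: float) -> Fraction:
    """Rebuild the exact rational value a double actually stores."""
    significand, exponent = split_double(x)
    return Fraction(significand) * Fraction(2) ** exponent


print(split_double(1.0) == (2 ** 52, -52))        # True
print(exact_value(0.37) == Fraction(37, 100))     # False: 0.37 is approximated
```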

      If you were just trying to get better accuracy by using base 10 rather than base 2, you're just hiding the problem (and making the hardware quite a bit more complex). If you want true accuracy, abandon floating point and use a bignum system.

      --
      perl -e'$_=shift;die eval' '"$^X $0\047\$_=shift;die eval\047 \047$_\047"' at -e line 1.

    3. Re:Decimal Arithmetic by SageMusings · · Score: 4, Insightful

      Okay,

      Show of hands: Who did not already understand that floats are approximations? Anyone? I didn't think so. I've gotta wonder why this story ever made it into Slashdot. This is more worthy of Time magazine where it can be spun as a startling new revelation into the dirtier corners of computer science and foisting a lie on the public.

      --
      -- Posted from my parent's basement
    4. Re:Decimal Arithmetic by tomstdenis · · Score: 2, Insightful

      My college taught "numerical analysis" in the software comp.eng side. You learn the format of IEEE types, the range, accuracy, precision issues, etc.

      We had assignments to not only perform matrix ops but also give the expected error, etc.

      Maybe the author of the article should either go to a better school or pay more attention to the classes.

      Tom

      --
      Someday, I'll have a real sig.
    5. Re:Decimal Arithmetic by Duhavid · · Score: 2, Interesting

      It is a little newsworthy.

      I bothered to ask the question of what to use for monetary
      usage at a financial institution in my recent past. I was
      a bit ( pardon the pun ) surprised to get a blank stare, to
      have to explain what I was talking about. Floats where good
      enough. Course, I had a problem in .net with iterating thru
      a list of values ( testing, each was .1, for 10% ), and the
      sum wasn't 1.0. Had to do a bunch of

      decimal.parse(value.ToString())

      to get things to sum up correctly.
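
      The same effect can be reproduced outside .NET. This is a Python sketch, not the original code: `Decimal(str(share))` stands in for the string round trip described above, and the function names are invented for the example.

```python
from decimal import Decimal


def total_as_float(shares):
    """Add the shares using binary floating point."""
    total = 0.0
    for share in shares:
        total += share
    return total


def total_as_decimal(shares):
    """Convert each share through its string form, then add in decimal."""
    total = Decimal(0)
    for share in shares:
        total += Decimal(str(share))
    return total


shares = [0.1] * 10                      # ten shares of 10% each
print(total_as_float(shares) == 1.0)     # False
print(total_as_decimal(shares) == 1)     # True
```

      The round trip only helps because the printed form of each float happens to be the decimal that was meant; keeping the values in a decimal type from the start avoids relying on that.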

      --
      emt 377 emt 4
    6. Re:Decimal Arithmetic by Bender0x7D1 · · Score: 4, Insightful

      Exactly. Unfortunately, there are too many people out there who are programmers, even good ones, who don't know, or understand, the basics. While I'm not claiming that formal education is the only way to get the knowledge you need, it is a good way to avoid gaps in your knowledge. I hated some of the computer science classes I had to take, but I did learn something important in each and every one of them.

      Another advantage in the formal classes is you get the theory that allows you to make decisions on what data types to use and when. Sometimes you need the precision of BigNum systems, (crypto for example), and sometimes the accuracy of float is enough. For example, in a lot of financial applications, float would be good enough since 2 decimal places is enough. If you need performance, float will beat any BigNum system hands down. However, if you are dealing with decimals on top of decimals, (such as calculating someone's dividend from a mutual fund where they own partial shares), you might need BigNum. Either way, with the proper theory and good understanding of the formats, you can make these decisions.

      These situations are why I am a big supporter of actual software engineering instead of programming. Sure, standard programming is great for a lot of situations, but serious applications need to use software engineering practices. You wouldn't build a bridge without an engineer, so why build an application that handles billions of dollars without applying the same rules and principles?

      --
      Reading code is like reading the dictionary - you have to read half of it before you can go back and understand it.
    7. Re:Decimal Arithmetic by gweihir · · Score: 2, Insightful

      These situations are why I am a big supporter of actual software engineering instead of programming. Sure, standard programming is great for a lot of situations, but serious applications need to use software engineering practices. You wouldn't build a bridge without an engineer, so why build an application that handles billions of dollars without applying the same rules and principles?

      I could not agree more. The issue is not to get it done fast or cheap. The issue is that the person designing the solution understands the limitations of the tools used. Anybody who builds mission-critical stuff without good engineers as designers and supervisors gets what they deserve. The same is true anywhere. The trouble with programming is that a bridge collapse is far more newsworthy than a software failure.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    8. Re:Decimal Arithmetic by gweihir · · Score: 5, Informative

      Is there any fundamental reason why decimal arithmetic in a computer should be more accurate than binary arithmetic in a computer?

      No, no, the problem is not with the precision! The problem is that when input and output are decimal, but the calculation is binary, then you get additional errors from the conversion that badly educated programmers do not expect.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    9. Re:Decimal Arithmetic by Fordiman · · Score: 5, Informative

      No, C will automatically recast a number as needed in cases like the above.

      The issue is actually a pretty commonly understood situation when going from decimal floating point numbers to binary IEEE floats (I have another comment on here describing how they're stored), and it basically comes down to this:

      Floats of any sort are stored as an int with an int shift (a.aa x b^c). As such, there will be aliasing problems based on the prime components of b. A known percentage of divisors will produce repeating numbers. For example, any division by 3, 5, 7, 11, ... in base 2 will be repeating. Any division by 3, 7, 11, 13, ... in base 10 will be repeating.

      No, there's nothing you can do about it. Use higher precision if needed, and otherwise get over it.
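
      The prime-factor rule above can be checked mechanically. A small Python sketch, assuming a positive denominator; `terminates` is a made-up name:

```python
from math import gcd


def terminates(numerator: int, denominator: int, base: int) -> bool:
    """True if numerator/denominator has a finite expansion in `base`.

    The fraction is reduced first; the expansion is finite exactly when
    every prime factor of the reduced denominator also divides the base.
    """
    denominator //= gcd(numerator, denominator)
    # Strip out every factor the denominator shares with the base.
    shared = gcd(denominator, base)
    while shared > 1:
        denominator //= shared
        shared = gcd(denominator, base)
    return denominator == 1


print(terminates(1, 10, 10))   # True:  0.1 is finite in decimal
print(terminates(1, 10, 2))    # False: the factor 5 does not divide 2
print(terminates(1, 3, 10))    # False: repeats in decimal as well
```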

      --
      110100 1101000 1101000 1100110 0 1101111 1101000 1100011 1
    10. Re:Decimal Arithmetic by modeless · · Score: 3, Insightful
      Floats where [sic] good enough. [...] Had to do a bunch of
      decimal.parse(value.ToString())
      to get things to sum up correctly.
      Oh god. Please tell us which financial institution you worked for so we can all avoid it like the plague.
    11. Re:Decimal Arithmetic by gstoddart · · Score: 2, Funny
      Since these are computers, and they deal primarily with binary internally

      Last I checked, they use binary internally exclusively, not primarily. ;-)

      Unless things have changed and nobody told me. :-P

      Cheers
      --
      Lost at C:>. Found at C.
    12. Re:Decimal Arithmetic by innosent · · Score: 4, Informative

      For the uneducated, the reason that this is stupid is that IEEE-754 floating point numbers cannot REPRESENT all values, they APPROXIMATE them. There is no way to properly represent the value 0.01 as a float (0.01 is best approximated by 3C23D70A, or 9.9999998e-3). So, for instance, if you were to add up 100 pennies, you would have 99.999998 cents, not 100. Repetitive additions (like credits and debits from an account) or multiplications (interest calculations, amortizations, etc.) simply make the problem worse, which is why floats should NEVER be used to track money. A fixed decimal system should always be used for financial systems.
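
      The bit pattern quoted above is easy to check. The sketch below uses Python's struct module to round a value to IEEE-754 single precision; the helper names are invented for the example. It also keeps a running single-precision total of 100 "pennies", which may well not land on exactly 1.0 and need not match the 99.999998 figure either, since every addition rounds again.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def float32_hex(x: float) -> str:
    """Big-endian bit pattern of x as a single-precision float."""
    return struct.pack(">f", x).hex().upper()


def sum_float32(value: float, count: int) -> float:
    """Add `value` to itself `count` times, rounding to single each step."""
    step = to_float32(value)
    total = 0.0
    for _ in range(count):
        total = to_float32(total + step)
    return total


print(float32_hex(0.01))               # 3C23D70A
print(Decimal(to_float32(0.01)))       # exact value actually stored
print(sum_float32(0.01, 100))          # running total of 100 "pennies"
```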

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    13. Re:Decimal Arithmetic by DeadChobi · · Score: 2, Informative

      I believe that we solved that problem in my first Computer Science course by using Integers and long integers. Essentially we took the monetary value, turned it from dollars into cents for calculation, then back into dollars for output. The only error would be in our method of using division for conversion, which could have been remedied by some other methods I can think of which are probably solved by much better people than I.
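
      Roughly what that looks like, sketched in Python with made-up helper names. Parsing the text directly and formatting with integer divmod means no floating-point division is involved at either end.

```python
def to_cents(amount: str) -> int:
    """Parse a string such as "12.34" into an integer number of cents."""
    sign = -1 if amount.startswith("-") else 1
    dollars, _, cents = amount.lstrip("+-").partition(".")
    cents = (cents + "00")[:2]          # pad "5" -> "50"; extra digits dropped
    return sign * (int(dollars or "0") * 100 + int(cents))


def to_dollars(cents: int) -> str:
    """Format integer cents as dollars using only integer arithmetic."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}{dollars}.{remainder:02d}"


total = sum(to_cents("0.01") for _ in range(100))
print(to_dollars(total))                # 1.00
```

      This sketch silently drops anything past two decimal places; a real system would have to decide how to round those.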

      --
      SRSLY.
    14. Re:Decimal Arithmetic by innosent · · Score: 3, Informative

      No, although a double or larger increases the precision, all floating point-style numbers suffer from this problem. You might get away with it for a larger number, but you are wasting bits, and still cannot guarantee that the number is always going to be accurate. In addition, as the number gets larger, more precision is lost. For instance, 782533.37 as a float is 493F0C55, or 782533.31, and as a rounded float is 493F0C56, or 782533.38. Neither would be acceptable, and even a double will lose precision at about 16 significant digits, which could be a problem for daily interest calculations or another high-precision calculation. What should be used is a packed decimal (BCD) or fixed decimal (int or long and an associated scale factor). The whole point is that for a financial institution, there is likely a fixed precision level that is acceptable, but floats, doubles, etc., cannot guarantee that any given precision will be maintained, as it depends upon all of the factors involved (value and size of each number in the calculation). For the same reasons, floating point numbers should never be used in program flow control, either, as doing a for(float x = 0.0; x < 1.0; x = x + 0.01) will iterate 101 times, since the first time x is not less than 1.0 is when it reaches 1.00999..., and if you are looking for a specific value, it may never reach it (like for(x = 0.0; x != 1.0; x = x + 0.01)). If you instead assume that you will always deal with 2 decimal places (or 3, or whatever), you can guarantee that your addition or multiplication will be accurate, and can scale the answer later if necessary.
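
      The flow-control trap shows up in double precision too. This Python sketch uses a step of 0.1 instead of the 0.01 single-precision loop described above, purely so it runs with ordinary Python floats; the function names are invented.

```python
def count_float_steps(step: float, limit: float) -> int:
    """Count iterations of `x = 0.0; while x < limit: x += step`."""
    x, iterations = 0.0, 0
    while x < limit:
        iterations += 1
        x += step
    return iterations


def count_integer_steps(steps_per_unit: int) -> int:
    """Drive the loop with an integer and derive the float from it."""
    iterations = 0
    for i in range(steps_per_unit):
        x = i / steps_per_unit      # x is computed, never accumulated
        iterations += 1
    return iterations


print(count_float_steps(0.1, 1.0))   # 11, one more than intended
print(count_integer_steps(10))       # 10
```

      The accumulated x reaches 0.9999999999999999 after ten additions, which is still below the limit, so the loop body runs an eleventh time.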

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    15. Re:Decimal Arithmetic by Eivind · · Score: 2, Insightful
      It's not more "accurate". It's just that the inputs and outputs are often decimal, in which case using something other than decimal can give "unexpected" results.

      For example for finance, floating-point is useless, people generally do something like use a single int to store number-of-cents.

      The issue ain't accuracy per se, it's accuracy with *certain* numbers (those representable in base 10).

      In a financial program people expect $0.40 * 1000000000 to come out as *precisely* 400000000 and not 399999999.99

      There's lots of numbers that base10 can't accurately represent, such as 1/3. The thing is, you'll never have those as inputs if the inputs are stuff like, for example, prices.

    16. Re:Decimal Arithmetic by Eivind · · Score: 5, Informative
      There's other funkyness too, besides the precision. For example, if you're adding up a lot of floating-point numbers, it makes a difference what sequence you do the additions in.

      For example, if your input consist of one large number, and tons of small ones, then rounding-errors mean that starting with the large number gives a much smaller result than starting with the small ones.

      If I scale it down to smaller numbers, you see why:

      1.0*10^5 + 1.0*10^1 = 1.0*10^5

      So, adding a "small" number to a "large" number gives you simply the large number.

      If you repeat this a million times, your result is still simply the large number.

      So you could end up concluding that 1.0*10^5 + (1.0*10^1 + 1.0*10^1 ..[1000000 times]...) = 1.0*10^5

      That is two orders of magnitude wrong. The correct result is 1.01*10^7

      Practical result? You need to think about your input. If it *may* look like this, you need to add up by repeatedly adding the two smallest numbers. This is easy to do with a priority-tree. Pseudocode:

      • Insert all numbers in priority-tree.
      • Extract two smallest numbers from tree.
      • Add the two numbers, producing a new number.
      • Push this single new number into the tree.
      • Repeat from step 2 until you're left with a single number.
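
      The steps above can be sketched with Python's heapq module standing in for the priority-tree. This assumes non-negative inputs, and the function names are invented; math.fsum from the standard library is shown alongside as a ready-made alternative.

```python
import heapq
import math


def naive_sum(values):
    """Add left to right, in whatever order the input arrives."""
    total = 0.0
    for value in values:
        total += value
    return total


def smallest_first_sum(values):
    """Repeatedly add the two smallest numbers, as in the steps above.

    Assumes non-negative inputs, so "smallest" means smallest magnitude.
    """
    heap = list(values)
    if not heap:
        return 0.0
    heapq.heapify(heap)                      # insert everything into the tree
    while len(heap) > 1:
        a = heapq.heappop(heap)              # extract the two smallest
        b = heapq.heappop(heap)
        heapq.heappush(heap, a + b)          # add them, push the result back
    return heap[0]                           # a single number is left


data = [1e16] + [1.0] * 1000                 # one large value, many small ones
print(naive_sum(data) == 1e16)               # True: every 1.0 was lost
print(smallest_first_sum(data) == 1e16 + 1000)   # True
print(math.fsum(data) == 1e16 + 1000)            # True
```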

      MS-Excel, by the way, does *NOT* do this in its SUM() function, if you feed it a "large" number and *many* "small" numbers, you get horrendously wrong results. Because of the relatively high precision of floats and doubles though, you need to use larger numbers than in my example here.

    17. Re:Decimal Arithmetic by theshowmecanuck · · Score: 2, Interesting

      Occasionally I see stuff like this in the real world. For example, at a bar I was in once, the debit machine which received input from the cash register had a difference of 1 cent from the bill calculated in the register. I asked them what was up with that. They said something like "yeah, that happens every once in a while". To me it seemed obvious that whoever did the coding for the interface didn't have a clue about floating point rounding errors. So I tend to agree with the grandparent post... it seems floating point rounding errors are not always obvious to some programmers. I really can't fathom how someone who is supposed to be a professional doesn't understand how their tools work... or why they don't care to.

      --
      -- I ignore anonymous replies to my comments and postings.
    18. Re:Decimal Arithmetic by StressedEd · · Score: 4, Interesting
      I suspect you will end up having to avoid most of them.

      Friends of mine went off to work "In The City", when I quizzed them about their use of numbers for stock prices etc they were equally dismayed that things were being passed around as doubles. Often encoded as ASCII text in data streams as well, requiring different people to write their own ASCII->DOUBLE conversion depending on the representation of the stock tick. I think this kind of madness is quite prevalent.

      As someone else pointed out, if you want to do things properly you can end up needing very big integers.

      Perhaps the best option is to make sure people can only buy and sell equities etc in numbers that can be exactly represented as doubles on a computer. It sounds crazy, but it's not as crazy as it looks. One of the reasons stocks etc are quoted as they are is probably due to the ease of the mental arithmetic.

      Kudos to the parent of your post. At least he knows what he is having to do is dodgy and cares enough to check!

      --
      Be nice to people on the way up. You will meet them again on your way down!
    19. Re:Decimal Arithmetic by Nutria · · Score: 2, Insightful
      The RPG language also had an implementation of BCD,

      Argh, "forgot" about RPG.

      and probably any compiler for the IBM S/370 line would at least have a library for it,

      Having a library isn't the issue. Early versions of TurboPascal also had BCD libraries.

      That you used via function calls. Very not useful.

      To be practical, a datatype needs to be usable by the 5 base arithmetic operators.

      as I believe the IBM mini/mainframe architectures had implemented it in hardware.

      The System 3x0 "CPU" is extremely CISC.

      --
      "I don't know, therefore Aliens" Wafflebox1
    20. Re:Decimal Arithmetic by johnw · · Score: 5, Interesting
      Friends of mine went off to work "In The City", when I quizzed them about their use of numbers for stock prices etc they were equally dismayed that things were being passed around as doubles.

      I'd be not only dismayed but very surprised to find anything which interfaces to the London Stock Exchange passing stock prices around as doubles, or as any other kind of floating point number.

      The LSE feeds all use 18 digits for values, with the first 10 being implicitly before the decimal point and the remaining eight being after the implicit decimal point. This is very handy because it means all the values can be manipulated using 64 bit integers. The LSE rules also state very precisely how rounding must be handled. If you try to submit a multi-million pound deal and your calculation of the consideration is out by just one penny then the deal will be rejected.
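
      To illustrate why that layout fits a 64-bit integer, here is a rough Python sketch of a 10+8 digit fixed-point encoding. It is not the LSE's actual format or API: the names are invented, it handles non-negative values only, and it refuses to round because the exchange's rounding rules are not given here.

```python
SCALE_DIGITS = 8                 # digits after the implied decimal point
TOTAL_DIGITS = 18                # digits in the whole field
INT64_MAX = 2 ** 63 - 1


def encode(value: str) -> int:
    """Turn a decimal string into an integer scaled by 10**8.

    Raises ValueError rather than rounding if the value has more than
    8 fractional digits or does not fit in 18 digits overall.
    """
    whole, _, frac = value.partition(".")
    if len(frac) > SCALE_DIGITS:
        raise ValueError("more than 8 fractional digits")
    scaled = int(whole or "0") * 10 ** SCALE_DIGITS
    scaled += int(frac.ljust(SCALE_DIGITS, "0"))
    if scaled >= 10 ** TOTAL_DIGITS:
        raise ValueError("does not fit in 18 digits")
    return scaled


def decode(scaled: int) -> str:
    """Format a scaled integer with the decimal point put back."""
    whole, frac = divmod(scaled, 10 ** SCALE_DIGITS)
    return f"{whole}.{frac:08d}"


print(encode("123.45"))                        # 12345000000
print(10 ** TOTAL_DIGITS - 1 <= INT64_MAX)     # True: 18 digits fit in int64
```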

      No-one with the slightest clue about how to code would use floating point maths in any kind of financial program, particularly not one where they're working with the LSE.
    21. Re:Decimal Arithmetic by johnw · · Score: 4, Informative
      Are there any people in financial institutions that can comment (anonymously) on this?

      I'm happy to comment on it without being anonymous. I designed and oversaw the implementation of the LSE feeds (to and from) for the stockbroking part of a large UK high street bank which shall be NatW^H^Hmeless. If you tried to implement the internals using floating point arithmetic it would be pretty much impossible to get it to pass the LSE's conformance tests, which all assume you will use integer arithmetic and explicit rounding according to their rules.
    22. Re:Decimal Arithmetic by nickos · · Score: 2, Interesting

      MS-Excel, by the way, does *NOT* do this in its SUM() function, if you feed it a "large" number and *many* "small" numbers, you get horrendously wrong results.

      This caused some big problems for me in a previous job. I was using something similar to server-side javascript to generate financial reports (including summing and currency conversion) which the customer was testing by trying to get the same results in Excel. I knew there was a floating point issue in my code, but even after I fixed it it didn't match the customer's Excel spreadsheet. Imagine trying to tell a customer that the problem is with Excel! :(

    23. Re:Decimal Arithmetic by StressedEd · · Score: 2, Interesting
      Interesting.

      Can you outline examples of these conformance tests, or even better, are they freely available? I assume these are intended to make sure things that go on the wire have a sane value, fall between certain daily trading limits etc (to prevent things like the Mizuho cock-up) [*].

      I suspect that the main culprit of "dodgy doubles" is likely to be people throwing together ad hoc codes behind the scenes, not the official interface to the exchanges.. (the "front" and "back" doors I mentioned earlier).

      NatW^H^Hmeless huh? That gives me some confidence. Though it still doesn't make me happy with their online banking (the numbers don't add up). I still need to find someone that can make my balance - you know - balance..... Yes, I know, "retail" banking... Yuk!

      I assume you know plenty of people that work deeper in the bowels of such organisations. Would you do an unofficial survey and find out how other people implement financial numerics? I'd wager you will be shocked!

      [*] Tee hee, that story did make me smile. Poor guy I bet he felt terrible....

      --
      Be nice to people on the way up. You will meet them again on your way down!
    24. Re:Decimal Arithmetic by k8to · · Score: 3, Informative

      It's worse than that. In some kinds of calculations, the error can be so large that the result is entirely meaningless. That is, for example when doing a subtraction between two nearly equal floating point values, both of which were approximated, the result may have more noise than signal. This isn't the usual case, but when writing general code that cares about precise values, you can code yourself a beartrap that only springs in rare circumstances.
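
      A small Python illustration of that subtraction problem, using values picked for the example. The subtraction itself is exact; the damage was done when 1.0 + 1e-15 was rounded to fit in a double, and the subtraction exposes it.

```python
from fractions import Fraction


def relative_error(computed: float, exact: Fraction) -> float:
    """Relative error of a floating-point result against an exact value."""
    return float(abs(Fraction(computed) - exact) / exact)


a = 1.0 + 1e-15          # already rounded when it is stored
difference = a - 1.0     # this subtraction itself is exact
exact = Fraction(1, 10 ** 15)

print(difference)                          # 1.1102230246251565e-15
print(relative_error(difference, exact))   # about 0.11, i.e. 11% off
```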

      There is an excellent article about all of this detail, linked from TFA at sun: http://docs.sun.com/source/806-3568/ncg_goldberg.html

      Granted, I have never written any code where this matters, but I had never realized really just how bad some of the implications are in some cases.

      --
      -josh
  2. decNumber library from IBM by Not+The+Real+Me · · Score: 5, Informative

    This is why I use the decNumber library from IBM.

    http://www2.hursley.ibm.com/decimal/decnumber.html

    The decNumber library implements the General Decimal Arithmetic Specification[1] in ANSI C. This specification defines a decimal arithmetic which meets the requirements of commercial, financial, and human-oriented applications.

    The library fully implements the specification, and hence supports integer, fixed-point, and floating-point decimal numbers directly, including infinite, NaN (Not a Number), and subnormal values.

    The code is optimized and tunable for common values (tens of digits) but can be used without alteration for up to a billion digits of precision and 9-digit exponents. It also provides functions for conversions between concrete representations of decimal numbers, including Packed Decimal (4-bit Binary Coded Decimal) and three compressed formats of decimal floating-point (4-, 8-, and 16-byte).

    1. Re:decNumber library from IBM by piranha(jpl) · · Score: 3, Informative

      Rational number arithmetic is a more general solution. Any number that can be expressed in decimal or floating-point notation is rational; any rational number can be expressed as (n/d), where n and d are integers. We have "bigints;" unbounded-magnitude integers constrained only by the memory of the computer they are stored on. Rational numeric data types pair two bigints together to give you unbounded magnitude and precision, and have been implemented for decades.

      They probably aren't directly supported in your favorite programming language because they are slow to work with when you need very high precision; after each calculation, the rational number needs to be reduced to its lowest terms. This involves computing the greatest common divisor of the two terms, which takes time that grows with the size of the terms themselves.

      Consider the use of integers, floats, or decimals only as an optimization when it has been shown that an application is suffering a serious performance hit because of rational arithmetic, and when you can use a faster data type knowing that your program will perform within accuracy goals.

      For 90% of computing problems, monetary calculations included, you shouldn't even have to worry about what numeric type you're using. Your language should assume rationals unless told otherwise. Common Lisp, Scheme, and Nickle do exactly that.

      C developers can use GMP. Other developers can use one of many bindings to GMP.
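
      For a quick feel of rational arithmetic without GMP, Python's standard fractions module pairs two unbounded integers in the same way. A short sketch:

```python
from fractions import Fraction

tenth = Fraction(1, 10)
third = Fraction(1, 3)

print(sum([tenth] * 10) == 1)          # True: no rounding at any step
print(third + third + third == 1)      # True: 1/3 is stored as a ratio
print(Fraction(6, 8))                  # 3/4: reduced to lowest terms

# Converting a float captures the binary value it really holds.
print(Fraction(0.1) == tenth)          # False
print(Fraction("0.1") == tenth)        # True: parsed as decimal text
```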

  3. Use A Proper Decimal Library by Anonymous Coward · · Score: 2, Informative

    If you are actually concerned about rounding and precision, use decimal instead.

  4. I am Intel of Borg by www.sorehands.com · · Score: 4, Funny

    I am Intel of Borg, you will be approximated.

    There have been many examples, such as the original Pentium bug. Of course, there was a bug in Windows Calc, it was 2.01 - 2.0 = 0 (If I remember correctly).

  5. The author is seriously confused by mlyle · · Score: 5, Insightful

    Apparently the author of the article didn't read the stories in RISKS that he cited. In particular, the 'pensioners being shortchanged' one talks about them not being paid interest on 'float'-- cash flow on transactions in progress. This has little to do with floating point numbers.

    Similarly, the spacecraft problem mentioned is one of an errant cast, not because of dilution of precision in floating point calculations.

    The author could really pick his examples better-- as mistakes in numerical programming happen often and are often of great import.

    1. Re:The author is seriously confused by mlyle · · Score: 3, Informative
  6. Not news. by SJasperson · · Score: 5, Insightful

    This is not a new problem. Or an unsolved one. Is there any modern programming language that does not supply a data type or library with exact decimal arithmetic support? Using a float to represent monetary amounts and expecting them to be free of rounding errors is as stupid as using integers to store zip codes and wondering where the leading zeros went from all the addresses in New England. If you can't be arsed to choose the right data type get out of the business.

    --
    Sigs? Sigs? We don't need no steenkin' sigs.
    1. Re:Not news. by mcrbids · · Score: 4, Funny

      Using a float to represent monetary amounts and expecting them to be free of rounding errors is as stupid as using integers to store zip codes and wondering where the leading zeros went from all the addresses in New England.

      Hrrmm, well...

      That would explain our lack of customer response in New England...

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    2. Re:Not news. by Jerry+Coffin · · Score: 2, Informative
      If you're dealing with amounts of cents that could possibly start overflowing even a 32-bit int (that is, billions of cents, or tens of millions of dollars), then the application's important enough to be worth the cost of further research on the matter.

      ...especially when you can find roughly 10 gazillion alternatives for about $1 worth of research time.

      Unfortunately, most of the obvious alternatives are either somewhat restrictive, or have relatively poor performance. For example, on a 64-bit machine, a 64-bit integer works quite nicely -- but of course, most people aren't (yet) using 64-bit machines. There are quite a few arbitrary precision integer packages available, but most of them are substantially slower than a float or a double for most calculations.

      Unfortunately, quite a bit of calculation with money really will run into problems with being stored in a 32-bit integer. If you're dealing strictly with US currency, you're right: the problem only arises with quite large amounts of money. OTOH, if you have to deal with international currency, the problem can (and will) arise far sooner. Just for a couple obvious examples, there are around 27 Russian Kopecks or about 39 Zambian Kwacha to one US penny. Even inside of the US, some things have prices that include fractions of a cent (e.g. Gasoline often has .9 cents tacked onto the end of its price, and some stock/bond/commodities markets use prices in eighths of a cent).

      In a lot of cases, the best alternative to using floating point is to use floating point. No, that's not a typo. What you want to do is store an integer number of pennies -- but store it in a double (or on occasion, a long double). A typical implementation of double can also be used as a 53-bit integer type. A 32-bit number is right on the edge where it's often usable, but can easily run into problems. A 53-bit integer makes those problems much more remote -- to the point that by the time you're dealing with such a large amount of money, you probably don't want to know about pennies anymore. A long double will store a 20 digit integer -- so it should be usable even for something like figuring the stock dividend of a large company.
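
      The 53-bit limit is easy to see in Python, whose float is an IEEE-754 double. The helper name is made up for this sketch.

```python
LIMIT = 2 ** 53          # 9007199254740992


def exact_in_double(n: int) -> bool:
    """True if the integer n survives a round trip through a double."""
    return int(float(n)) == n


print(exact_in_double(LIMIT - 1))   # True
print(exact_in_double(LIMIT))       # True
print(exact_in_double(LIMIT + 1))   # False: rounds to an even neighbour

# Pennies stored in a double add up exactly while totals stay below LIMIT.
pennies = float(0)
for _ in range(100):
    pennies += 1.0
print(pennies == 100.0)             # True
```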

      As long as you're careful about order of calculation and when you do rounding, this gives the added bonus of working nicely for things like converting one currency to another, that are relatively difficult to do with pure integers.

      --
      The universe is a figment of its own imagination.
  7. science; business by bcrowell · · Score: 4, Insightful

    He talks about scientific applications, but actually very few scientific calculations are sensitive to rounding error. Remember, they sent astronauts to the moon using slide rules. Generally for scientific applications, you just don't want to roll your own crappy subroutines for stuff like matrix inversion; use routines written by people who know what they're doing. (And know the limitations of the algorithm you're using. For example, there are certain goofy matrices that will make a lot of matrix inversion algorithms blow chunks.)

    For business apps, the classic solution was to use BCD arithmetic. But today, is it more practical (and simple) just to use a language like Ruby, that has arbitrary-precision integers, so you can just store everything in units of cents? A lot of machines used to have special BCD instructions; do those exist on modern CPUs?

    1. Re:science; business by Jeremi · · Score: 2, Funny
      But today, is it more practical (and simple) just to use a language like Ruby, that has arbitrary-precision integers, so you can just store everything in units of cents?


      Hmm.... if you use integers of any given finite precision, aren't you still subjecting yourself to round-off error? (e.g. ((int)4)/((int)3) == 1!!) On the other hand, if you use a string-based infinite-precision datatype, what happens when you try to compute a non-terminating number (e.g. 1.0/3.0)? Perhaps your program crashes after trying to allocate an infinite amount of RAM to store the result? ;^)


      Seems to me the only full solution to round-off error would be to store the results of certain math operations as strings indicating the underlying mathematical/algebraic expressions (e.g. 1.0/3.0 == "1/3"), a la Matlab... but then, I'm no expert, perhaps there is a better way.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
  8. This is not a "problem" per se by Null+Nihils · · Score: 2, Interesting

    float (and the big brother double) is inaccurate. It's no surprise. A 32-bit Float is but a single simple tool in a programming language. If anyone is surprised by how Floats behave then they are, most likely, inexperienced.

    You don't start addressing a problem in software just by assuming Float or Double will magically fill every need. An experienced programmer needs to have a knowledge of how to use, and how not to use, the programming tools at hand. TFA about floating point numbers is very introductory (at the end it mentions that the next article will tell us how to "avoid the problem"... I assume it will go on to cover some basic idioms.) In a way it misses the point: Floating-point rounding is not a "problem". Floats and Doubles always do their job, but you have to know what that job is! The behaviour of floating point numbers should not be a big surprise to a seasoned coder.

    For example: You can't use float or double to store the numerical result of a 160-bit SHA-1 hash... you have to use the full 160 bits. (Duh, right?) So, if you use a mere 32 bits (float) or 64 bits (double) to store that number, you are going to sacrifice a lot of accuracy!
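
    A quick Python sketch of that example, using the SHA-1 of an empty input and a 64-bit double:

```python
import hashlib

digest = hashlib.sha1(b"").hexdigest()
as_int = int(digest, 16)            # the hash as a 160-bit integer
as_double = float(as_int)           # keeps only the top 53 bits

print(digest)                       # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(as_int.bit_length())          # 160
print(int(as_double) == as_int)     # False: the low bits are gone
```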

    1. Re:This is not a "problem" per se by Wonko+the+Sane · · Score: 2, Informative

      A much better article is linked from this one near the bottom: what every computer scientist should know about floating-point arithmetic

  9. Re:Obligatory by andrewman327 · · Score: 2, Informative

    You beat me to it. I was just thinking about how much better TFA would be if they explained the specifics of how the Office Space team ripped off the banking system.

    --
    Information wants a fueled airplane waiting at the hangar and no one gets hurt.
  10. Numbers and bases by Todd+Knarr · · Score: 4, Insightful

    We have the same problem in everyday numbers. Try representing 1/3 in any finite number of digits. You can't. The big thing about floating-point numbers that trips people up is that we're used to thinking in base 10. Floating-point numbers in computers typically aren't in base 10, they're in base 2. The rounding problem he describes is simply us getting confused and wondering why a fraction with an exact representation in base 10 doesn't have an exact representation in base 2. The obvious solution is the one he alludes to at the end: don't use base 2. Computers have had base-10 arithmetic in them for decades; in fact, the x86 family has base-10 arithmetic instructions built in (the packed-BCD instructions). COBOL has used packed-BCD since its beginning, which is why you don't find this sort of calculation error in ancient COBOL financial packages running on mainframes.

    1. Re:Numbers and bases by tawhaki · · Score: 3, Funny

      Try representing 1/3 in any finite number of digits.

      0.3. All you need is base 9 :)

    2. Re:Numbers and bases by flibbajobber · · Score: 3, Insightful

      Try representing 1/3 in any finite number of digits. You can't.

      "1/3"

      You can. I just did. So did you. In base-10, even. In fact, the answer is the same for base-4 or higher. Using only two digits, "1" and "3". Any rational number can be represented using a finite number of digits, using... (wait for it) a RATIO.

      (Represent one-third in Base 2? why that would be "1/11". One-third in Base 3 would be "1/10".)
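
      A minimal C sketch of the "store a ratio" idea, for illustration only. The `ratio` type and its helper functions are hypothetical names, not part of any standard library, and the sketch assumes the numbers stay small enough that the 64-bit products do not overflow.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical exact fraction: value = num / den, den > 0, lowest terms. */
typedef struct {
    int64_t num;
    int64_t den;
} ratio;

/* Greatest common divisor by Euclid's algorithm. */
static int64_t gcd64(int64_t a, int64_t b)
{
    a = llabs(a);
    b = llabs(b);
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Build a ratio in lowest terms; the caller must pass den != 0. */
ratio ratio_make(int64_t num, int64_t den)
{
    if (den < 0) {
        num = -num;
        den = -den;
    }
    int64_t g = gcd64(num, den);
    ratio r = { num / g, den / g };
    return r;
}

/* Exact addition: a/b + c/d = (a*d + c*b) / (b*d). No overflow checks. */
ratio ratio_add(ratio a, ratio b)
{
    return ratio_make(a.num * b.den + b.num * a.den, a.den * b.den);
}
```

      Adding `ratio_make(1, 3)` to itself three times gives exactly 1/1, with no rounding anywhere. The cost is that numerators and denominators can grow without bound in longer calculations, which is where a bignum library would come in.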

  11. It used to be much worse. Kahan fixed it. by Animats · · Score: 5, Interesting

    Due to the efforts of William Kahan at U.C. Berkeley, IEEE 754 floating point, which is what we have today on almost everything, is far, far better than earlier implementations.

    Just for starters, IEEE floating point guarantees that, for integer values that fit in the mantissa, addition, subtraction, and multiplication will give the correct integer result. Some earlier FPUs would give results like 2+2 = 3.99999. IEEE 754 also guarantees exact equality for integer results; you're guaranteed that 6*9 == 9*6. Fixing that made spreadsheets acceptable to people who haven't studied numerical analysis.

    The "not a number" feature of IEEE floating point handles annoying cases, like division by zero. Underflow is handled well. Overflow works. 80-bit floating point is supported (except on PowerPC, which broke many engineering apps when Apple went to PowerPC.)

    Those of us who do serious number crunching have to deal with this all the time. It's a big deal for game physics engines, many of which have to run on the somewhat lame FPUs of game consoles.

  12. This is why you would choose... by jd · · Score: 5, Informative
    One of the many many solutions:


    • Fixed-point numbers
    • Berkeley MP or GNU MP arbitrary-length floating-point
    • Co-processors with truly massive internal registers (I refuse to use less than 80-bit)
    • Delayed calculation (ie: actually process a calculation at the end, storing the inputs and operators until you absolutely need the value - eliminates intermediate rounding errors and if the value is never needed, you don't waste the clock cycles)
    • Don't use real numbers - apply a scale factor or a transform such that ALL components of any scaled/transformed calculations must be integer, then only transform back for display purposes


    The use of transforms for handling numerical calculations is an old trick. It is probably best-known in its use as a very quick way to multiply or divide using logarithms and a slide-rule, prior to the advent of widely-available scientific calculators and computers. Nonetheless, devices based on logarithmic calculations (such as the mechanical CURTA calculator) can wipe the floor with most floating-point maths units - this despite the fact that the CURTA dates back to the mid 1940s.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:This is why you would choose... by philipgar · · Score: 3, Informative

      Logarithmic number systems (LNS) for computers were first proposed by Marasa and Matula in 1973, as a "better" approximation of numbers than floating point units. This paper compared the cumulative error from different floating point standards with LNS standards. LNS offers some advantages over floating point; however, its performance degrades significantly as you add more bits of precision.

      LNS can be effective to around 24 bits of precision, and then the hardware requirements for the LNS unit's adder/subtracter become too overwhelming. This is because multiplications and divisions are fast on LNS units (with minimal hardware) as they just require an adder; however, handling addition and subtraction is much more difficult. The simplest (naive) methods of making an adder and subtracter involve using large ROM lookup tables. Fancier, more efficient units use smaller ROMs and small multipliers to help get better values (I don't remember all the details offhand). Sometimes they'll even trade precision for faster performance. This can result in chips with single-cycle multiplies and divides, but multi-cycle additions and subtractions. For low-precision calculations requiring many divides and multiplies, LNS processors can often achieve the best performance. However, for many applications an efficient LNS unit with sufficient precision just isn't practical.

      Phil

  13. Bah. Author doesn't understand arithmetic. by swillden · · Score: 5, Insightful

    The author goes on and on about how floating point numbers are inaccurate, and unable to precisely represent real values, like this is something new, or even something different from the number approximations we normally use.

    The reason the examples the author cites can't be represented precisely is that floating point numbers are ultimately represented as base-2 fractions, and there are a bunch of finite-length base-10 fractions that don't have a non-repeating base-2 representation. Guess what? We have *exactly* the same problem with the base-10 fractions that everyone uses all the time. Show me how you write 1/3 as a decimal!

    The problem isn't that floating point numbers are inherently problematic, the problem is that we typically use them by converting base-10 numbers to them, doing a bunch of calculations and then converting them back to base 10. Floating point rounding isn't an unsolved problem -- floating point rounding works perfectly, and always has. It's just that the approximations you get when you round in base 2 don't match the approximations you get when you round in base 10.

    Bottom line: If you care about getting the same results you'd get in base 10, do your work in base 10. This is why financial applications should not use floating point numbers.
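
    The base-2 versus base-10 point can be checked mechanically: a fraction in lowest terms has a finite expansion in a given base only if every prime factor of its denominator also divides the base. Here is a small C sketch of that test; the function name `terminates_in_base` is made up for illustration, and it assumes `den > 0` and `base >= 2`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* True if num/den can be written with finitely many digits in `base`. */
bool terminates_in_base(uint64_t num, uint64_t den, uint64_t base)
{
    /* Reduce the fraction to lowest terms first. */
    den /= gcd_u64(num, den);

    /* Strip from the denominator every factor it shares with the base. */
    for (;;) {
        uint64_t g = gcd_u64(den, base);
        if (g == 1) {
            break;
        }
        den /= g;
    }

    /* Anything left over is a prime factor the base does not have. */
    return den == 1;
}
```

    With this, 37/100 terminates in base 10 but not in base 2 (the leftover factor is 25), while 1/3 terminates in neither, though it does in base 3 or base 9.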

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  14. Must read floating-point articles by emarkp · · Score: 2, Interesting
    What every computer scientist should know about floating point numbers (HTML, PDF).

    and

    When bad things happen to good numbers (as well as Becker's other floating-point columns on that same page)

  15. Why I only use decimal values by eliot1785 · · Score: 3, Interesting

    This is why I use DECIMAL and not FLOAT in MySQL. Problem solved. I'm not a big fan of floats, the extreme precision that they seem to have is mostly an illusion.

    1. Re:Why I only use decimal values by flupps · · Score: 3, Interesting

      Up until MySQL 5.0 calculations with DECIMALs were still done as DOUBLEs, so you could get unexpected results.

  16. Re:average joe by Anonymous Coward · · Score: 2, Interesting

    If you think your calculator doesn't give rounded off numbers, I hope you're not working in science or engineering.

  17. Comp Sci 101 by syousef · · Score: 4, Informative

    Welcome to a very poor article on what's been taught in early Comp Sci for many many years.

    Any serious developer of business software knows all about this and avoids floating point at all cost for financial calculations. Scientists however do use them carefully since the math they do is usually much more performance (speed) sensitive and the calculations are a little more complex than what tends to be done on the business side (ie _most_ business calcs are relatively simple).

    --
    These posts express my own personal views, not those of my employer
  18. Re:A good example of the evils of math. by codegen · · Score: 4, Informative
    Part of the problem there was that the missile's clock values were such that they would not convert to base 2 (and hence to float) accurately and so the tracking was off

    Actually the problem was that they used a float to store the system time (time since power on) in the ground radar unit. It allowed the clock to be used in calculations without a conversion. A float will store an integer just fine (and accurately) until the number gets too large and then the units part drops off the bottom of the precision and the increment operator no longer makes any sense. This was a design decision that made sense for the role for which the missile platform was originally designed. The Patriot was originally designed to be used in the European Theater (if the Cold War ever turned hot) and as such would never remain in one location for more than a very few days. The clock is reset every time they move the battery (they power off the ground tracking radar when they move). The use in the Gulf War was in a strategic role (not tactical) which kept them continuously operating in a single location for long periods of time, and the shortcut they used came back to haunt them (as usual). If they had reset the system every few days, the problem would not have occurred.
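
    The "increment no longer makes any sense" effect is easy to reproduce with a 32-bit float counter. The sketch below is only an illustration of that general effect, not a reconstruction of the actual system; `float_counter_stall` is a hypothetical name, and it assumes `float` is the IEEE 754 single-precision type with round-to-nearest.

```c
#include <stdint.h>

/* Count upward from zero in a float, one tick at a time, and return the
   value at which adding 1.0f stops changing it (or the count reached
   after `limit` ticks, whichever comes first). */
float float_counter_stall(uint32_t limit)
{
    /* volatile forces each result to be stored as a real 32-bit float. */
    volatile float ticks = 0.0f;
    for (uint32_t i = 0; i < limit; i++) {
        volatile float next = ticks + 1.0f;
        if (next == ticks) {
            break;          /* the increment was rounded away */
        }
        ticks = next;
    }
    return ticks;
}
```

    Called with a limit of 20 million, the counter stops at 16777216 (2^24): from there the gap between neighbouring floats is 2, so adding 1 rounds straight back to the same value.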

    --
    Atlas stands on the earth and carries the celestial sphere on his shoulders.
  19. Old news, but an unsolved problem by Opportunist · · Score: 2, Informative

    And I'm not sure if it can be solved altogether. When you spend a little time meditating over the IEEE 754, you notice a few flaws. The first and most obvious is, of course, that, no matter how precise you want to make it, somewhere there's a cutoff. And, especially when you multiply with floats, that error grows as well. But there's another problem. Two actually.

    The first one is the one mentioned in the article, and something everyone who didn't sleep through his IT classes should know: Computers calculate in binary, and most decimal fractions can't be converted to binary without error. There is no way to represent 0.37 exactly in binary, in IEEE754, no matter how many bits you spend on the mantissa. Now, you can argue that, if you make it "big enough", it doesn't matter anymore since it's well within the error margin and when you round it to, say, 5 after decimal, the error vanishes. True. But when you start calculating, when you multiply or, worse, exponentiate, the error grows in big leaps.

    Another, less obvious, problem is hidden underneath the way the IEEE754 works: Your error grows as your numbers grow. This might seem obvious, but it is interesting how many people overlook this flaw and problem in everyday life. Since according to the IEEE754 standard, real numbers are stored as exponent and mantissa, if you're dealing with BIG numbers, a fair deal of your mantissa is spent on the "pre-comma" part of your number, so you're losing precision. You can't reliably say that "a double is good for 5 behind dot, no matter what", you have to take into account how many of those precious mantissa bits are spent before you even get to ponder what's left for your precision.
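
    To see how the gap between neighbouring doubles widens as the numbers grow, one can probe for it directly. This is a rough C sketch with a hypothetical helper name, `spacing_above`; it assumes IEEE 754 doubles with round-to-nearest and a positive, finite, normal argument. (The standard library's `nextafter` gives the same information more directly.)

```c
/* Distance from x to the next larger representable double, found by
   halving a step until adding half of it no longer changes x. */
double spacing_above(double x)
{
    /* volatile keeps every intermediate result in true double precision. */
    volatile double step = x;              /* x + x certainly differs from x */
    volatile double probe = x + step / 2;

    while (probe != x) {
        step = step / 2;
        probe = x + step / 2;
    }

    /* x + step is now the nearest double above x. */
    volatile double next = x + step;
    return next - x;
}
```

    Near 1.0 the gap is 2^-52 (about 2.2e-16), near 1e15 it is already 0.125, and at 2^52 it is a full 1.0, so nothing is left "behind the dot" at all.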

    This isn't so much a problem of processors. It's a problem of people understanding how their processors work.

    --
    We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  20. The problem is using floating point improperly by Myria · · Score: 5, Interesting

    GIMPS looks for Mersenne primes. This is clearly an exact integer operation. However, for speed, they use Fast Fourier Transforms to do the big squaring operation with floating point. Obviously, they need an exact result.

    The trick is to carefully calculate exactly how much error each operation can generate. It is possible to know exactly how many bits of your result contain valid information. If you need more accuracy, you can split it into multiple operations. As long as the final accumulated error in the result is less than .5, you have the integer answer you need. Note that it's basically impossible to do this without using assembly language, because the order of operations and subexpression elimination definitely matter.

    Another interesting problem occurs with floating point results. You cannot expect the complete answer to be exactly identical on all machines. Even on the same machine, compiler settings affect the answer: x87 differs significantly from SSE. If you are doing something that needs bitwise identical results on all machines, you need to either implement it with integer math, or do what GIMPS does and do error tracking.

    Melissa

    --
    "Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
    1. Re:The problem is using floating point improperly by ponos · · Score: 3, Informative
      This is clearly an exact integer operation. However, for speed, they use Fast Fourier Transforms to do the big squaring operation with floating point. Obviously, they need an exact result.
      All serious bignum libraries use (or should use!) the FFT to multiply very big numbers. This has been studied extensively (see Knuth, The Art of Computer Programming, vol. 2, for example) and is the fastest way to multiply. The general idea is that after the transform you can multiply in O(N), which is much faster than the naive O(N*N) one would expect from a simplistic digit-by-digit approach.

      P.

  21. rounding algorithms by trb · · Score: 3, Informative

    If you're interested in rounding (and who isn't?) you might want to read An introduction to different rounding algorithms.

  22. Re:average joe by dcollins · · Score: 2, Interesting

    "when i use my calculator, it doesn't give rounded off numbers."

    Not true.

    In the math class I teach, I do the following: have everyone take a calculator and do "2/3".
    Half of the calculators say this: "0.666666666" (rounded down).
    Half of the calculators say this: "0.666666667" (rounded up).

    In truth, an exact answer requires an infinite sequence of "6"'s. The calculator (or any computer) must decide whether to round up or down to fit it into its display space (or memory). You always have some round-off error -- and the more calculations you do, the more the round-off error builds up and up and up....

    --
    We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
  23. not really by YesIAmAScript · · Score: 2
    The number of people who really understand floats is less than 1% of the people who think they do.

    Do you understand that
    (A < B)
    is not the same as
    !(A >= B)
    and that
    ((A + 1) == (A))
    Can be true?
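
    Both statements can be demonstrated in a few lines of C, assuming IEEE 754 doubles. The helper names below are made up for the illustration; the NaN is produced by dividing zero by zero at run time.

```c
#include <stdbool.h>

/* Produce a NaN without needing <math.h>: 0.0 / 0.0 is NaN in IEEE 754. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* The two comparisons from the comment, kept separate on purpose. */
bool less_than(double a, double b)
{
    return a < b;
}

bool not_greater_equal(double a, double b)
{
    return !(a >= b);
}

/* True when adding 1.0 to a leaves it unchanged. */
bool absorbs_one(double a)
{
    volatile double sum = a + 1.0;
    return sum == a;
}
```

    With `a = make_nan()`, `less_than(a, 1.0)` is false but `not_greater_equal(a, 1.0)` is true, because every ordered comparison involving NaN is false. And `absorbs_one(9007199254740992.0)` (that is, 2^53) is true, since 2^53 + 1 is not representable and rounds back to 2^53.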

    Every day, many people make the mistake of using floats when what they really wanted was the ability just to represent large numbers. For example, in Mac OS X, the system uses doubles as representations of time. This is the worst idea I can think of. First of all, floats are imprecise and time is the thing that man can subdivide the most precisely. Secondly, if a Mac OS X machine is on long enough, time will cease to progress because of the 2nd statement above!

    Plenty of people who thought they knew what they were doing have used floats in places where they are a bad idea. And it continues to today.

    So I say no, Computer Science 101 doesn't seem to cover all you need to know about floats.
    --
    http://lkml.org/lkml/2005/8/20/95
    1. Re:not really by mike2R · · Score: 2, Informative

      This Wikipedia page has good background information.

      --
      This sig all sigs devours
    2. Re:not really by Anonymous+Brave+Guy · · Score: 3, Informative

      The first paper recommended for learning more about floating point arithmetic is usually Goldberg's famous What Every Computer Scientist Should Know About Floating-Point Arithmetic .

      I can't remember whether the paper specifically discusses the failure of floating point arithmetic to obey the mathematical laws of arithmetic, but even if not, the background it provides is probably enough for you to understand the reasoning yourself.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  24. Another true story by Flyboy+Connor · · Score: 2, Interesting

    I once wrote interest-calculation software for a bank. This was new software to replace their old stuff. Naturally, I stored the values in cents, not guilders/dollars/euros, to avoid rounding errors (which really have a big effect in interest calculations).
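
    A small C sketch of why counting in cents helps, assuming IEEE 754 doubles. The function names are hypothetical and this is only the repeated-addition part, not an interest calculation.

```c
#include <stdint.h>

/* Add `amount` to a running total `times` times, in binary floating point. */
double add_repeatedly_double(double amount, int times)
{
    volatile double total = 0.0;   /* volatile: keep true double rounding */
    for (int i = 0; i < times; i++) {
        total = total + amount;
    }
    return total;
}

/* The same bookkeeping with the amount held as a whole number of cents. */
int64_t add_repeatedly_cents(int64_t cents, int times)
{
    int64_t total = 0;
    for (int i = 0; i < times; i++) {
        total += cents;
    }
    return total;
}
```

    Ten additions of 0.10 as a double do not come out equal to 1.0, because 0.10 has no exact binary representation and the small errors accumulate. Ten additions of 10 cents are exactly 100 cents.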

    When I delivered my software, they compared my output to the output their old software produced. There were small differences. They asked me where these came from, and I traced them back to rounding errors in their old software. I showed them this by example, thinking that they would be happy that their new software did not have this problem.

    Their response? "The new software should produce exactly the same numbers as the old software." "But the old numbers are WRONG!" "That does not matter, the new software should produce exactly the same numbers as the old software."

    It is really hell to make good new software error-compatible with faulty old garbage.

  25. Re:did you just call pi an infinite value?!? by spiffyman · · Score: 2, Informative

    It's a transcendental number.

    Not trolling, just helping since you asked.

    --
    So you can laugh all you want to...
  26. Error diffusion is another way. by DrYak · · Score: 3, Informative

    One solution, as you put it, is to priority-sort your operations.

    Another solution is the Kahan summation algorithm.
    Which, grosso modo, keeps track of the error at each step, and injects it back at the next.
    In your example, in each iteration, the algorithm notices that the 1.0e1 is missing from the sum and carries it to the next addition. A few iterations later, the carry is big enough to be added to the result.

    The advantages are: you don't need to first load all components in a tree, then iteratively sort them and process them all until you're done. In fact you can even use this algorithm in a streaming fashion, where you don't even need to know how many values will come.

    The disadvantages are : some compilers are able to guess that the carry "should mathematically be 0" (actually true in a perfect world with infinite precision numbers) and could "optimise" the code back to a plain normal sum function bypassing the algorithm (and won't subsequently use any other sum-correction algorithm).
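
    For reference, a compact C sketch of Kahan summation next to a plain running sum, using 32-bit floats so the effect shows up quickly. It assumes IEEE 754 single precision with round-to-nearest. The `volatile` qualifiers are one blunt way to discourage the kind of "optimisation" described above; they are a choice made for this sketch, not a requirement of the algorithm.

```c
#include <stddef.h>

/* Plain left-to-right summation. */
float naive_sum(const float *x, size_t n)
{
    volatile float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum = sum + x[i];
    }
    return sum;
}

/* Kahan (compensated) summation: c carries the low-order part that was
   lost in the previous addition and feeds it back into the next one. */
float kahan_sum(const float *x, size_t n)
{
    volatile float sum = 0.0f;
    volatile float c = 0.0f;
    for (size_t i = 0; i < n; i++) {
        volatile float y = x[i] - c;     /* next term, corrected by the carry */
        volatile float t = sum + y;      /* low-order bits of y may be lost */
        c = (t - sum) - y;               /* recover what was lost, negated */
        sum = t;
    }
    return sum;
}
```

    Fed an array holding 16777216 (2^24) followed by a thousand 1.0 values, the naive sum never moves off 16777216, while the compensated sum arrives at 16778216.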

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
    1. Re:Error diffusion is another way. by Eivind · · Score: 2, Interesting
      Sure. There's many ways of solving the problem, none of them very hard to grasp once you've done the fundamental task: which is to be aware of and understand the problem.

      Lots of programmers though, are unaware of the finer details of floating-point numbers.

      As evidenced by MS-Excel failing to give the correct answer, even when as we've now demonstrated, there's multiple, simple, correct algorithms for doing so.

      It *is* surprising to do the equivalent of 1000 + (1+1+1 ...[10000 times] +1) and get 1000 as the answer. The answer *should* be on the order of 11000. Anything else is a bug.

      Thus we can conclude that MS-Excel does not even manage to get its SUM() function correct. One of the simplest functions there can be on a set of floats.

  27. The Answer by BSonline · · Score: 2, Funny

    Actually, the answer is 42.

    --
    PS: That is what part of the alphabet would look like if the letters "Q" and "R" were removed.
  28. Well, Yeah, but .... by vtcodger · · Score: 2, Informative
    ***Well, kinda, yeah. Can you think of any applications for which 10 digits of precision isn't enough? ***

    Basically, I agree with you, but as Hamming pointed out in the 1950s you can get yourself into trouble with something like:

    A = small number, e.g. size of smallest feature in a Celeron-M CPU in microns

    B = big number, e.g. distance to Andromeda galaxy in microns

    C = (A+B) ... (some set of clever operations) ... - B

    The math is fine, but the implementation won't yield the correct answer, because the value of interest got scaled off. That isn't rocket science, but it is surprisingly hard to catch.
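
    The simplest version of that trap, with the "clever operations" left out, fits in one short C function. The name `add_then_subtract` is made up, IEEE 754 doubles are assumed, and the figures quoted afterwards are rough stand-ins for A and B, not measured values.

```c
/* Compute (a + b) - b in double precision, one rounded step at a time. */
double add_then_subtract(double a, double b)
{
    volatile double sum = a + b;      /* a may vanish entirely here */
    volatile double back = sum - b;   /* subtracting b cannot bring it back */
    return back;
}
```

    With a = 0.065 and b = 2.4e28 the result is 0.0 rather than 0.065: neighbouring doubles near 2.4e28 are about 4.4e12 apart, so the small term is rounded away in the very first addition.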

    That said, Floating Point works surprisingly well for most things. Back in the 1960s, things like 2+2=3.99999 were a fairly common phenomenon. Hardly ever happens today. I spent about three decades in the defense industry writing and testing systems that got things from point X to point Y. I think I saw a reasonable selection of the possible errors that folks can make. Other than subtracting big numbers from other big numbers and expecting the result to be precise, I can't recall a single problem with floating point.

    If we're looking for something that is basically broken to write an article about, forget floating point. Try event driven software architectures. Or the notion that it is possible to write unambiguous specifications that are also comprehensible.

    --
    You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
  29. Good thing I use double instead of float.... by gatkinso · · Score: 2, Funny

    ...THAT should keep me safe from thems nasty float errors!

    --
    I am very small, utmostly microscopic.
  30. Re:BCD isn't the answer by Nutria · · Score: 2, Insightful
    I mean, do you need a cash register than can tally sums > $1000000?

    You do realize that there's more to business than cash registers, right?

    A 32-bit fixed point number maxes out at 21,474,836.47 which is severely limiting for all but small-sized businesses and tiny governments.

    64-bit fixed point numbers (max 92,233,720,368,547,758.07) are obviously better, but are only efficient on 64-bit machines, which are still a minority of installed machines.
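
    Those two limits follow directly from the largest signed integer of each width, read as a count of hundredths. A short C sketch, with a hypothetical `money_limit` type and `limit_for_max` helper:

```c
#include <stdint.h>

/* A maximum amount split into whole currency units and hundredths. */
typedef struct {
    int64_t units;
    int64_t hundredths;
} money_limit;

/* Interpret max_raw as a count of hundredths (two implied decimal places). */
money_limit limit_for_max(int64_t max_raw)
{
    money_limit m = { max_raw / 100, max_raw % 100 };
    return m;
}
```

    `limit_for_max(INT32_MAX)` gives 21474836 units and 47 hundredths, and `limit_for_max(INT64_MAX)` gives 92233720368547758 units and 7 hundredths, matching the figures above.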

    --
    "I don't know, therefore Aliens" Wafflebox1
  31. Rounding is no big deal by etresoft · · Score: 2, Informative

    Everyone knows about rounding problems with floats. What people don't seem to realize (and this goes for many real "Rocket Scientists" I've known) is that a float only gives you a fixed amount of digits - that's all. You can have a highly precise small number or a really big number. You cannot ever have a precise, big number. It just so happens that these same "Rocket Scientists" like to represent time as a floating point based on some 1970 epoch. Guess how accurate that is going to be in a couple of years. How do they solve this problem? The "double double" type of course!

  32. Classic textbook issue by ChrisA90278 · · Score: 2, Interesting
    I still have this textbook I got in 1971. It's called "Computer Science, A first course". It talks about this same exact problem of representation. It compared integer, floating and decimal representations.

    Why would this count as "news"? Everyone who has to deal with this would already know about it.