Slashdot Mirror


Floating Point Programming, Today?

An anonymous reader asks: "I'm rather new to programming and stumbled across these two articles: The Perils of Floating Point from 1996 and What Every Computer Scientist Should Know About Floating-Point Arithmetic from 1991. I tried some of the examples in these articles with Intel's Fortran Compiler and g77 and noted that some of the issues reported no longer seem valid, whereas quite a few are still very much around. Could someone please give me a pointer to some newer thoughts and/or new facts surrounding floating point programming? What has been improved since those articles were written? What is still the same? How does the future look, especially with the new platforms IA64 and AMD64? I am most interested in the x86 and x86-64 architectures. Thank you for your kind help."

111 comments

  1. The articles are quite up-to-date by Mik!tAAt · · Score: 5, Informative

    Both articles are still valid today, mostly because current processors use the same IEEE floating point format as the ones available in '96 (or '91).

    --
    This is the place where you write something that will make you seem like a complete idiot.
    1. Re:The articles are quite up-to-date by jmccay · · Score: 2, Informative

      You should probably give a little more detail. For those who don't know, IEEE floating point is basically a number expressed in a form similar to scientific notation (although there are serious differences in what must be done), expressed in powers of -2 (x**-(n**2) if I remember my FORTRAN right). An example of a number that cannot be expressed exactly in IEEE floating point (if I remember correctly) is .1. You can approach the number, but you never really reach it.
      If you want to avoid the error, the best thing to do is to use integer math and convert to decimal at the last moment. Make sure you leave plenty of room in the number so as to avoid rounding errors. I rarely see the problem, but I usually only go out 2 decimal places (3 tops) because a lot of what I was working with at my last job was financial numbers. After rounding, there usually wasn't a problem. A lot of the financial institutions use integer math in their software to avoid potential problems. If the program is mission critical, you're better off using integer math unless you don't need great precision.
      As a side note, the last time I checked, Mozilla's version of JavaScript used floating point notation to store dates. Because of this, some platforms will experience date problems from time to time with JavaScript. Unless they have changed it to an integer type for storing the number representing the date and time, a date may be wrong from time to time. To test whether or not Mozilla has a problem, you have to create a date object with a specific date and time that you know can't be represented in IEEE floating point notation. I don't have the patience to do that right now, and they may have changed it, because I haven't looked at the code since before July 26, 2002, when I was laid off. It really is only a problem for certain instants in time on certain dates, but I think the precision required to represent a date and time as a number really requires it to be implemented with some variation of integer mathematics.
      I was looking at the code for javascript because my last company was looking into it for scripting purposes for their product. I got laid off and the project died.
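
      A quick way to check the claim about .1 is the sketch below. It is in Python only because a Python float is an IEEE double on common platforms (that is an assumption about the platform, not something from this thread), and the variable names are made up for illustration.

```python
from decimal import Decimal

# Decimal(float) shows the exact binary value the double actually holds.
stored_tenth = Decimal(0.1)
print(stored_tenth)  # slightly more than 0.1, not 0.1 itself

# Adding the inexact value ten times does not land exactly on 1.0.
float_total = 0.0
for _ in range(10):
    float_total += 0.1
print(float_total == 1.0)

# The integer approach: count cents, convert to a decimal string only for display.
cents_total = 0
for _ in range(10):
    cents_total += 10          # ten payments of 10 cents
display = f"{cents_total // 100}.{cents_total % 100:02d}"
print(display)
```

      The integer version stays exact because no fractional value is ever stored; the decimal point only appears when the string is built.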

      --
      At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
    2. Re:The articles are quite up-to-date by jmccay · · Score: 2

      I made a mistake: that should be 2**-n (1/2, 1/4, 1/8, etc.). That's what I get for not stopping to double check my math.

      --
      At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
  2. Don't worry by Hard_Code · · Score: 5, Funny

    ...those articles are only 99.99999891 percent true

    --

    It's 10 PM. Do you know if you're un-American?
  3. Unsolvable problem by Anonymous Coward · · Score: 5, Informative

    Floating point stuff hasn't really changed much since then. Basic rule of thumb, if you want it to be accurate don't use floating point.

    Much the same problem as you have with decimals. Many fractions cannot be evaluated evenly in certain bases. It will always cause you headaches if you don't realize this.

    Try writing a bunch of numbers in hex but then do all of your calculations in decimal. You'll have the same problem.

    1. Re:Unsolvable problem by john_many_jars · · Score: 4, Informative

      The use of floating point numbers isn't all bad. Those of us who use them are often solving problems with condition numbers that render the answer we get less accurate than the number of digits of accuracy provided.

      Think about tan(89.99) versus tan(89.991) (which is very ill-conditioned around 90). Both numbers are not terribly truncated by floating point, but the results are different by about 1,000. Try it and you'll see floating point error isn't as dangerous as things like cancellation, ill-conditioning and the like.
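
      To try this, the angles have to be treated as degrees, which is an assumption about what the comment means. A minimal Python sketch under that assumption (the variable names are illustrative only):

```python
import math

# Interpret the angles as degrees and convert before calling tan().
t1 = math.tan(math.radians(89.99))
t2 = math.tan(math.radians(89.991))
print(t1, t2, t2 - t1)

# Relative change in the input versus relative change in the output.
rel_in = (89.991 - 89.99) / 89.99
rel_out = (t2 - t1) / t1
print(rel_out / rel_in)  # rough amplification factor of the function itself
```

      The large amplification factor comes from the function, not from rounding: both inputs are stored to about 16 significant digits.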

    2. Re:Unsolvable problem by rpresser · · Score: 1
      Think about tan(89.99) versus tan(89.991) (which is very ill-conditioned around 90). Both numbers are not terribly truncated by floating point, but the results are different by about 1,000.


      tan(89.99) and tan(89.991) are *supposed* to be different by about 1000. This is not a good example of floating point instability.

      Maybe I misunderstood; maybe you were trying to say that the uncertainty in the angle is likely to be larger than 0.001 degree (or even a tenth of that), so you shouldn't be taking tan() around there anyway. If that's what you meant, sorry for this post; I agree with you.
    3. Re:Unsolvable problem by Daleks · · Score: 4, Insightful

      Basic rule of thumb, if you want it to be accurate don't use floating point.

      Basic rule of thumb, determine what accuracy you need, then pick your number representation.

    4. Re:Unsolvable problem by Phronesis · · Score: 3, Informative
      Think about tan(89.99) versus tan(89.991) (which is very ill-conditioned around 90). Both numbers are not terribly truncated by floating point, but the results are different by about 1,000. Try it and you'll see floating point error isn't as dangerous as things like cancellation, ill-conditioning and the like.

      tan(89.990) = -2.0460
      tan(89.991) = -2.0408

      perhaps you're thinking of

      tan(1.571) = -4909.8
      and
      tan(1.578) = -138.8

    5. Re:Unsolvable problem by looseBits · · Score: 3, Funny

      Wouldn't it be simpler if humans only had 2 fingers instead of 10. Hell, that's how many I type with anyway.

      --
      Lord, bless my users that they may stop being such fucking idiots!!
    6. Re:Unsolvable problem by TLI_ · · Score: 0

      Thank YOU. Now I will run out of thumbs...

      What will I do when I see next "rule of thumb"?

      Hmm... Really unsolvable problem

    7. Re:Unsolvable problem by Anonymous Coward · · Score: 0

      try 1.5707 and 1.5708 for something really pathological. The condition number for tan is so great around 90 degrees or pi/2 that there are no digits of accuracy for any reasonable input, regardless of floating point error.
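
      The condition number mentioned above can be computed directly. This is a sketch using the standard definition of the relative condition number, |x f'(x) / f(x)|, which for tan simplifies to |2x / sin(2x)|; the function name is made up for illustration.

```python
import math

def tan_condition_number(x):
    """Relative condition number of tan at x: |x * tan'(x) / tan(x)| = |2x / sin(2x)|."""
    return abs(2.0 * x / math.sin(2.0 * x))

# Angles in radians, moving toward pi/2 and then just past it.
for x in (1.0, 1.5, 1.5707, 1.5708):
    print(x, math.tan(x), tan_condition_number(x))
```

      A condition number of 10^k means roughly k digits of the input's accuracy are lost in the output, whatever number format is used.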

    8. Re:Unsolvable problem by jonadab · · Score: 1

      > Wouldn't it be simpler if humans only had 2 fingers instead
      > of 10. Hell, that's how many I type with anyway.

      If we just didn't use our thumbs for counting, that would be octal.
      I personally think it would be interesting if we had two thumbs
      plus six other fingers on _each hand_, so then we could work in hex.

      I do favour place value over two's complement for the representation
      of fractional parts, though. Either that or rational notation. I
      am not fond of two's complement.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    9. Re:Unsolvable problem by Anonymous Coward · · Score: 0

      I type using only two.

    10. Re:Unsolvable problem by Froggie · · Score: 2, Informative

      An interesting point is that if you do integer calculations that you expect to work with perfect accuracy on a 32 bit integer, then they will also work with perfect accuracy on a float with a 32+ bit mantissa.

      Quite useful if you're adding integer numbers together on a 32 bit machine in C and you want the carry bit, for instance (and you're too lazy to write the code entirely in integer arithmetic): you can't easily find the sum and carry bit if you're using 32 bit ints, but it's trivial if you have a larger FP type (e.g. a double).

      This is not to say FP numbers are better than ints, but if you know what you're doing you can do anything with an FP type that you can with an int type. The cost is usually the slowness of the FP operation - highly unlikely to take 1 ALU cycle, even on modern processors.
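
      A minimal sketch of the carry trick, written in Python rather than C (a Python float is a double on common platforms). The function name and the choice of unsigned 32-bit inputs are assumptions for illustration.

```python
def add_u32_with_carry(a, b):
    """Add two unsigned 32-bit values using double arithmetic; return (sum mod 2**32, carry)."""
    total = float(a) + float(b)        # exact: the result is below 2**33, far under 2**53
    carry = 1 if total >= 2.0 ** 32 else 0
    low = int(total - carry * 2.0 ** 32)
    return low, carry

print(add_u32_with_carry(0xFFFFFFFF, 1))
print(add_u32_with_carry(123, 456))

# The trick stops working once integers exceed the 53-bit significand of a double.
print(float(2 ** 53) == float(2 ** 53 + 1))
```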

    11. Re:Unsolvable problem by Admiral+Burrito · · Score: 3, Informative
      Try writing a bunch of numbers in hex but then do all of your calculations in decimal. you'll have the same problem.

      Actually, you won't. You would the other way though.

      The problem occurs when you try to represent a (properly reduced) fraction whose denominator has one or more prime factors not in common with your number base.

      You can represent one tenth in base 10 because all the prime factors in the denominator (10: 5,2) are found within the factorization of the base (also 10: 5,2). You can not represent one sixth in base 10 because one of the factors of 6 is not found in the factorization of 10 (3). Likewise, you cannot represent one tenth in base 2, because the denominator (10) is a multiple of 5, which is a prime not found in the factorization of the base (2).

      Because the factorization of 16 contains only primes that are in the factorization of 10 (2) all fractions that can be represented in hexadecimal can be represented in decimal. The reverse is not true, because 10 is the product of a prime (5) that is not found in the factorization of 16. So there is no way to get the "fifths" aspect of a decimal number into a hexadecimal number.
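
      The prime-factor rule above can be sketched as a small check. The function name is hypothetical, and the fraction is assumed to be 1/denominator (already reduced).

```python
from math import gcd

def terminates(denominator, base):
    """True if 1/denominator has a finite expansion in the given base."""
    d = denominator
    g = gcd(d, base)
    while g > 1:
        # Strip out every factor the denominator shares with the base.
        while d % g == 0:
            d //= g
        g = gcd(d, base)
    # Anything left over is a prime factor the base does not have.
    return d == 1

print(terminates(10, 10), terminates(6, 10), terminates(10, 2))
print(terminates(16, 10), terminates(10, 16))
```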

    12. Re:Unsolvable problem by Anonymous Coward · · Score: 1, Informative

      tan(89.990) = -2.0460
      tan(89.991) = -2.0408

      perhaps you're thinking of

      tan(1.571) = -4909.8
      and
      tan(1.578) = -138.8


      ...or perhaps he's thinking of angles in degrees. ;)

    13. Re:Unsolvable problem by rabidcow · · Score: 1

      Basic rule of thumb, if you want it to be accurate don't use floating point.

      Well ok, but only if accuracy is infinitely more important than space and time. A fixed-point format that can handle from 10^-308 to 10^308 with at least 53 bits of precision wouldn't be terribly useful.

      You need to understand how much accuracy you have and not expect to get any more out of the calculations.

    14. Re:Unsolvable problem by taphu · · Score: 1

      Or better yet, we could build a computer that is based on base 10..

      (before you say I'm crazy, remember that I'm just talking about building a machine, whereas you're talking about altering millions of years of evolution and thousands of years of a particular thought pattern.. :)

    15. Re:Unsolvable problem by jonadab · · Score: 1

      Or we could just dispense with counting on our fingers and learn
      to actually (gasp) add. Then we could work in hex even though we
      only have ten fingers. It would sure make a lot of things easier.
      And in the process we could obsolete that dang metric system and
      replace it with something decent based on powers of two.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    16. Re:Unsolvable problem by FuzzyDaddy · · Score: 1
      Hey! I use gradians!

      --
      It's not wasting time, I'm educating myself.
    17. Re:Unsolvable problem by dankelley · · Score: 1

      I see this comment was modded up as informative. I think it needs a few funny points, as well. I laughed out loud, at the delightfully unexplained humour.

    18. Re:Unsolvable problem by Phronesis · · Score: 1

      I'm glad somebody noticed! Sometimes I wonder...

  4. Platform and all by Stary · · Score: 5, Informative

    It all depends on what platform you program on and so on. Newer x86 processors do their floating point in an 80-bit format and only truncate when copying back to your original 32 or 64 bit floats. That saves you some precision but not that much. As others have said, there are probably situations where almost all of the material in those articles is valid.

    --
    Tomorrow will be cancelled due to lack of interest
    1. Re:Platform and all by norwoodites · · Score: 4, Insightful

      It only truncates when saving to memory, that is why you can get different results when optimizing than not optimizing with gcc (you can force gcc to truncate all the time by using -ffloat-store).
      With gcc you can force the floating point calculations in the sse registers by -mfpmath=sse.

    2. Re:Platform and all by Pseudonym · · Score: 2, Informative

      It also screws royally with your numerics. Take, for example:

      float x = something(), y = something_else();
      if (x < y) assert(x < y);

      This assertion can fail on Intel hardware, because by the time the assert comes around, x may equal y as one or both of them have been truncated from 80 bits to 32.
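
      The 80-bit effect itself depends on the compiler and hardware, so it is not reproduced here. As an analogy only, this Python sketch narrows doubles to 32-bit floats (instead of 80-bit values to 32 bits) to show how two values that compare as different can become equal after rounding. The helper name is made up.

```python
import struct

def to_float32(value):
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]

x = 1.0
y = 1.0 + 2.0 ** -30      # distinct from x as a double

print(x < y)                          # compared at the wider precision
print(to_float32(x) < to_float32(y))  # compared after narrowing
```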


      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    3. Re:Platform and all by chthon · · Score: 2, Interesting

      FWIW, the Intel numeric coprocessors have always done their math in 80-bit floats since their introduction, what was it, about 20 years ago?

    4. Re:Platform and all by t · · Score: 1

      The real problem is not the 80bit fp itself, but its lack of reproducibility. The problem arises whenever there is enough register pressure to push the floats in registers on the stack or other memory, at which point they get rounded to 64 bits (double). This register pressure is pretty random, it can even change between compiles. Also whenever your process is switched out, the registers get saved in memory. That's why there exists a flag for gcc that says follow strict IEEE floating-point arithmetic, which rounds the 80bit fp values after every computation.

  5. Common mistake by PD · · Score: 5, Informative

    Don't count money as floating point. You'll just have rounding errors. Using long doubles instead of floats won't help you at all.

    The solution is to count pennies instead, or if you need values bigger than 22 million dollars, use a BCD library. BCD is Binary Coded Decimal.
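
    The 22 million figure presumably comes from counting pennies in a signed 32-bit integer; that reading is an assumption. A one-off check:

```python
INT32_MAX = 2 ** 31 - 1          # largest value of a signed 32-bit integer

# Treat the value as a count of pennies and split it into dollars and cents.
dollars, cents = divmod(INT32_MAX, 100)
print(f"${dollars:,}.{cents:02d}")
```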

    1. Re:Common mistake by alonsoac · · Score: 2

      Counting pennies is not always good, in real financial applications the resolution is to 4 decimals since sometimes you can get 4 decimals when calculating percentages, and if you ignore it someone somewhere eventually will not get as much as he expected and will come down to hunt you.

    2. Re:Common mistake by PD · · Score: 3, Informative

      That's not the error I was addressing. Here's some definitions of a subtotal:

      float subtotal; // wrong way to represent money
      long subtotal_pennies; // right way to represent money

      And, if you're at a gas station, you need to represent money like this:

      long subtotal_mils; // gas per gallon has a 9/10 of a cent on the end - $1.34 9/10

      The calculations that you perform on the money are a completely different story. There's no point in worrying about 4 decimal places of percentages if you don't start from the right place.

    3. Re:Common mistake by AvantLegion · · Score: 4, Funny
      Don't count money as floating point. You'll just have rounding errors.

      But that's the point! And you transfer those fractions of cents (that just get rounded off anyway) into an account you control!

      "back up in your ass with the resurrection...."

    4. Re:Common mistake by mdielmann · · Score: 2, Informative

      You're not looking far enough ahead. What do you do when you have two taxes collected, where each is a fraction of a cent? What do you do when the governing bodies allow you to combine the fractional cents for economy, and pay the tax based on total taxable sales at the end of the month? Where I live we have two VAT taxes, both 7%. There is no amount short of whole dollars for which that comes to whole pennies. Since the governing bodies allow combining of taxes, 50 cents gives a whole number, too, but what do you do for the other 98% of the time? Now, just for fun, let's talk foreign currencies (how much is the peso in U.S. dollars?). You need at least 4 decimals. The accounting system I develop value-added mods for typically allows 5 decimals for financial values, and we've run into situations where this isn't enough for our clients.

      Now, how the system stores that value may be up for grabs, but from what I've seen of the specs, they use a floating point.

      --
      Sure I'm paranoid, but am I paranoid enough?
    5. Re:Common mistake by PD · · Score: 1

      The answer to your questions: use BCD. When you've got to do math on your numbers and you can't have rounding errors, then you need BCD.

      Floats are never the answer for storing money values. Sometimes you might have to use floating point math, but every effort to avoid it should be taken.

      Look at this

    6. Re:Common mistake by chthon · · Score: 2

      Yep, like in Cobol. I think that's the main reason that financial institutions keep a whole lot of Cobol code and associated hardware around. The closest language that I have found up to now for handling such numbers is Oracle PL/SQL. Ada does have the possibility to specify the precision of a number, but I am not sure that it reverts to a BCD-based library to do arithmetic with those.

      Does anyone know another language with the same capabilities as Cobol?

      Btw., about Intel Coprocessors again, you can use BCD numbers, but they are translated back and forth between 80-bit FP representation.

    7. Re:Common mistake by Anonymous Coward · · Score: 0

      Haskell has built in arbitrary precision integer and real types. I think Common LISP and Scheme have similar features too. I doubt they store the numbers as BCD, but that doesn't make much difference to the programmer.

    8. Re:Common mistake by PD · · Score: 1

      There are C++ libraries that implement BCD. A great use of operator overloading.

    9. Re:Common mistake by larry+bagina · · Score: 1
      C++. Or any object-oriented language that would let you specify a class to store that information.

      x86 has instructions for BCD addition/subtraction (and conversion, IIRC), dating back to the 8086 w/o an FPU coprocessor. The 6502 (apple II, commodore, Nintendo) had a setting for if math (add subtract) was BCD or normal.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    10. Re:Common mistake by ralphclark · · Score: 2

      Well, duh. Could there really be any programmer working for a living anywhere in the world who doesn't know that already? And you with such a low UserId too.

      Your very first college lesson on float data types should have explicitly stated that they should never be used with the equality comparison operator, so even a completely-wet-behind the-ears rookie should know it.

    11. Re:Common mistake by Ian+Bicking · · Score: 1
      I suppose that VAT may be collected differently than taxes in the US, since it's taken from the total, rather than added to the total. In the US I believe most things get rounded to the nearest cent. In cases where you are apportioning the money (like VAT), then you want to be sure the divided money adds up to the same total as the original total, so if you are dividing a dollar into thirds you want to come up with $0.33, $0.33, and $0.34 -- somebody gets a little extra, maybe it doesn't matter who, but the total is maintained. I don't think accountants would like it at all if this wasn't always, completely true -- unfairness is better than losing money to the ether.

      I'm sure there are some transactions where a one cent rounding per transaction could add up to significant money -- maybe you need more decimal points. But you still have to round at some point (unless you are using rational numbers), and you want to maintain totals quite strictly. Floating point values are not predictable when it comes to this rounding.
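
      The dollar-into-thirds example can be sketched with integer cents. The function name and the rule for who gets the extra cent are assumptions; real accounting rules may specify a different order.

```python
def apportion(total_cents, parts):
    """Split total_cents into `parts` integer shares that sum to exactly total_cents."""
    base, remainder = divmod(total_cents, parts)
    # The last `remainder` shares get one extra cent each.
    return [base] * (parts - remainder) + [base + 1] * remainder

shares = apportion(100, 3)
print(shares, sum(shares))
```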

    12. Re:Common mistake by PD · · Score: 1

      You've also got such a low userid, that I have to believe that the only reason for your rudeness would be a mental illness of some kind.

      If "everyone" (and just who is everyone exactly) knows about floating point money values, then why am I working on code right now (owned by someone who should know better - think New York finance house) that has all sorts of float money values? Seems to me that this increasingly hypothetical "everyone" that you speak of missed class that day.

    13. Re:Common mistake by metamatic · · Score: 1

      European regulations for VAT (amongst other things) require 4 decimals for amounts of currency, with computation done to 5 decimals.

      However, they require that accuracy regardless of the magnitude of the number, so floating point is still the wrong solution. The right answer is to use fixed-point BCD with five decimals and round to four for display.
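
      A rough sketch of "compute to five decimals, round to four for display". Python's decimal module is decimal arithmetic but not BCD, so this only illustrates the rounding steps; the 7% rate, the amount, and the half-up rounding mode are assumptions, not taken from any regulation.

```python
from decimal import Decimal, ROUND_HALF_UP

FIVE_PLACES = Decimal("0.00001")
FOUR_PLACES = Decimal("0.0001")

net = Decimal("19.99")
rate = Decimal("0.07")                       # hypothetical 7% rate

vat = (net * rate).quantize(FIVE_PLACES, rounding=ROUND_HALF_UP)   # computed to 5 decimals
shown = vat.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)          # rounded to 4 for display
print(vat, shown)
```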

      (I used to write business finance software.)

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
    14. Re:Common mistake by ralphclark · · Score: 1

      Sorry, I didn't really mean to be rude. "Mental illness"? Goodness me, you are a little tetchy aren't you. I suppose you must have some "issues" with this particular topic.

    15. Re:Common mistake by p3d0 · · Score: 1
      Yep, people do use floats for money. I have seen it.

      And it's not about the equality comparison operator either (which, by the way, is just fine if you use it right). It's because there is no finite binary representation for 0.01, so all your dollars-and-cents values will have rounding error, as will your percent-interest values, and all the other things that use decimal fractions.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    16. Re:Common mistake by ralphclark · · Score: 1
      it's not about the equality comparison operator either (which, by the way, is just fine if you use it right)
      Hey that's crazy talk. I can't let that go unchallenged!

      The only way to use the equality comparison operator is to calculate everything at a precision level way beyond where you will truncate the result before doing the comparison. The question of just how much extra precision you need to throw away really depends on how many arithmetical operations you'll be doing on it before you get to the comparison. This is ambiguous, which is a Very Bad Thing (subsequent programmers assume the existing code works, then they re-use it in repeated calcs and BAM!).

      So no, it's just not worth the trouble. When you use floating point you should never expect to compare results for equality, instead use less-than-or-equal/greater-than-or-equal to place the result within a specified range. It's OK for representing analogue (i.e. continuous) quantities but should *never* be used for discrete quantities like money. Use a "big integer" library for that, or use normal longword/quadword integers and throw an exception if you get an overflow, depending on how likely you are to get up to the magic +/-2.147bn figure.

    17. Re:Common mistake by PD · · Score: 1

      I guess that comment was overkill. Sorry.

    18. Re:Common mistake by p3d0 · · Score: 1
      You're right, equality only works in very restricted circumstances. To say it's "just fine" is a big overstatement.

      If you do IEEE 754 math with fractions with power-of-two denominators (like 13/256 + 7/64), and you stay within the mantissa's range, then you'll get exact results, and equality comparison will work. In practical terms, this never happens, so yes, you can't rely on the equality operator.
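
      Both cases are easy to try. A small sketch, assuming IEEE doubles (Python's float on common platforms); the tolerance passed to math.isclose is an arbitrary choice for illustration.

```python
import math

# Dyadic fractions: every operand and the result fit exactly in a double.
print(13 / 256 + 7 / 64 == 41 / 256)

# Decimal fractions: each operand is already rounded, so equality fails.
print(0.1 + 0.2 == 0.3)

# A tolerance-based comparison, in the spirit of "place the result within a range".
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9))
```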

      Maybe it would be nice if FP computations came with built-in error bars, and equality were defined with could-be-equal semantics.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    19. Re:Common mistake by Anonymous Coward · · Score: 0

      Don't count money as floating point.

      Right, the key is that money is a discrete quantity, not continuous.

      if you need values bigger than 22 million dollars, use a BCD library.

      FWIW, BCD and floating point are not exclusive. (not that you're saying that they are) TI's graphing calculators do their math with floating point BCD.

    20. Re:Common mistake by Anonymous Coward · · Score: 0

      AFAIK governments like to specify the exact computation rules for such percentage computations. The usual rule is to use fixed point (in practice BCD) with a specified number of extra digits in intermediate computations, and rounding at specific points. They never specify floating point, for the reasons mentioned earlier.

    21. Re:Common mistake by Anonymous Coward · · Score: 0

      Well, there are more cases where equality is sensible: first, if you use the same computation at different places, you can expect to get the exact same result from the exact same operands (but this requires extra compiler options on x86 with its extra intermediate precision). Second, there are iterative numerical algorithms that converge to the exact correct result, so your termination test is simply whether you got the same value as on the previous iteration.
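
      The second case can be sketched with Newton's iteration for a square root, chosen here only as a familiar example. The function name is made up, and the iteration cap is a precaution in case an iteration alternates between two neighbouring values instead of settling.

```python
def newton_sqrt(a, max_iter=100):
    """Heron/Newton iteration for sqrt(a), a > 0; stops when the iterate repeats exactly."""
    x = a if a >= 1.0 else 1.0
    for _ in range(max_iter):      # the cap guards against a possible two-value oscillation
        nxt = 0.5 * (x + a / x)
        if nxt == x:               # exact equality as the termination test
            break
        x = nxt
    return x

print(newton_sqrt(2.0))
```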

    22. Re:Common mistake by mdielmann · · Score: 1

      Curious that you mention the European laws, that's where the software I work with originated. Like I said, though, they didn't use any fixed-point decimals. In fact, their DLL for external programming has conversion routines to go from/to their (proprietary) datatype to/from BCD. You could fix the decimal size (it would trim/pad it to match), though, and the defaults for most of their fields was 5. Now I know why.

      --
      Sure I'm paranoid, but am I paranoid enough?
    23. Re:Common mistake by Old+Wolf · · Score: 1

      I don't know why you guys are going on about BCD as if it is a godsend or something -- it's horrible. Not only is it wasteful of space (instead of 256 possible values per byte, you can only have 100), but when converting [ascii|int] [to|from] bcd, you have all sorts of possible flags:
      - what to do if it's an odd number of nibbles (left 0 ? left F ? right 0 ? right F ?)
      - what to do if it's too small or too big for the buffer you're writing into (error ? 0 ? 0-fill ? F-fill ? )
      - when reading an odd number of nibbles, do you want to start from the second nibble? or is the last nibble invalid? how do you even know in advance if it's an odd number of nibbles or an even?

      Of course you can define a set of conventions to avoid these problems, but it all makes for quite voluminous code, and then if you need to go against convention sometime then your library won't support it.

      Here's a better solution: use (*gasp*) integers. And if your language's integer size isn't big enough for what you want, then use some library that allows for arbitrarily-sized integers, such as GNUmp. Heck, if you only need 64-bit and you have a 32-bit machine, just define your own library! (stick 2 ints together). It'd take all of ten minutes of coding..

      PS. Why does the GNU 'search' page never return any results? Searching for 'mp' didn't find that; I had to look it up in the directory of all packages.

    24. Re:Common mistake by Anonymous Coward · · Score: 0

      BCD? I think that a general multi-precision integer library (or language where they are available as native) would be far better.

      But one thing that many people don't understand is that you can count money accurately in floating point. Just use floating point numbers to count the pennies, like you would an integer.

      A 64-bit IEEE float has a 52-bit stored mantissa plus an implicit leading bit (and a separate sign bit), so it exactly represents every integer up to 2^53.

      The reason it fails for decimal fractions is because it is a binary representation. If we had a currency where the subunit were e.g. 1/128th of a whole monetary unit, that would be completely accurate in floating point.

    25. Re:Common mistake by dvdeug · · Score: 1

      Ada does have the possibility to specify the precision of a number, but I am not sure that it reverts to BCD based library to do arithmetic with those.

      It has the ability to use fixed point decimal numbers.

      about Intel Coprocessors again, you can use BCD numbers

      Actually, x86-64 doesn't support the old BCD instructions in 64 bit mode, reusing those codes for other things.

  6. Here's an important one. by Apuleius · · Score: 4, Informative
  7. If you need more precision... by cfallin · · Score: 4, Informative

    Hardware floating point is only so accurate - if you need more floating point (or integer) precision, use GNU MP - a library for C with bindings for many other languages too. It came in quite handy when I wrote some cryptography code with very large numbers.

  8. Floating point operations are not that bad. by stj · · Score: 5, Insightful
    Well, I have a lot of experience with that since I've been doing numerical computations for the last 7 years. First of all, it's not all that bad. With a 64-bit 'double' in C, you get around 15 decimal digits of accuracy (theoretically almost 16, but in practice don't count on the last one). You have to understand that numbers are stored in a logarithmic-style format: a mantissa and a factor to multiply it by (in computers, a power of 2). If there is no overlap between two numbers in an addition (for example, one number is 1.234*2^64 and the other is 1.234*2^-15), the smaller one is always lost. There are two ways to get around it:

    extend mantissa so there is enough overlap - this usually involves some kind of multiple precision library, like GNU MP (mentioned in another post) and many others. I've implemented one for my own use, too. It generally means lots of overhead, since fewer than 5% of operations will actually benefit from the greater precision.

    postpone such operations until there is overlap - store such numbers together and do operations on them together, too. Sometimes additions in loops will add up small parts so actually there will be overlap with big part and additions can be done with enough precision.
    As an aside, an interesting thing is that in computers multiplications and divisions are better (that is, more accurate) than additions and subtractions because of the logarithmic format.
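
    The lost-addend effect and the "postpone until there is overlap" workaround can be sketched as follows, assuming IEEE doubles. The values are arbitrary, and math.fsum is shown as one ready-made alternative rather than as the method described above.

```python
import math

big = 2.0 ** 64
small = 2.0 ** -15
print(big + small == big)          # the small term has no overlap with big's mantissa

# Adding many small terms to a large one, one at a time, loses every one of them...
total = 1e16
for _ in range(10):
    total += 1.0
print(total - 1e16)

# ...while combining the small terms first lets their sum overlap with the big value.
small_sum = 0.0
for _ in range(10):
    small_sum += 1.0
print((1e16 + small_sum) - 1e16)

# math.fsum tracks the lost low-order parts and returns a correctly rounded sum.
print(math.fsum([1e16] + [1.0] * 10) - 1e16)
```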
    I know that Sun was working on a variable precision floating-point CPU. I'm not sure how that project is going and what the end effect is, but I remember it being an interesting idea.

    Multiple precision libraries are usually decent with only one problem, they are always slower by a couple orders of magnitude than regular CPU operations, so using them is just such a pain.

    --
    iThink iHate iMod
    1. Re:Floating point operations are not that bad. by mvw · · Score: 1
      But that is black art, maybe engineering, but definitely not science.

      This works because most problems in applied science and engineering are rather well behaved.

      If some numerical analyst comes up with a counter example, one can often deem that as a pathological case, without having too much a bad conscience.

      I would really like to know, if there are real world engineering examples, where simulations produced dangerous products, because the simulation was inadequate because of numerical errors. Perhaps in aerodynamics, who knows how they perform their flight simulations.

      Regards,
      Marc

  9. Python-specific, but contains useful info for all. by tdelaney · · Score: 3, Informative
  10. Intervall Analysis by mvw · · Score: 3, Informative
    Ok, known issues with floating point routines that can be fixed (unintentional pun :-) should be fixed.

    On the other hand it is clear that a finite representation of real numbers has tradeoffs. But only a few seem to care about the accumulated errors.

    My experience in engineering (simulation of cast turbine blades) was that people know that bad things can occur during complex floating point calculations, but the matter was too complicated to be investigated.

    Example: if during a finite element simulation a timestep did not end up with a valid solution (the iterative/approximate solver for the large linear systems did not converge, or even crashed), some control parameters were simply varied (time step, perhaps material curves) until the calculation seemed to produce a valid-looking result. Needless to say, only obvious errors can be spotted that way.

    The strange thing about all that is that in recent years the mathematical discipline of interval analysis has been developed. Here every number is represented by an interval of known error bounds. These error intervals are kept and updated during calculations. Thus at the end of a large complex calculation, you know the error. That is a very valuable property.

    Moreover, what one gets in many cases is not just a standard calculation but a machine proof of the error bounds.
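
    As a rough illustration of intervals being "kept and updated during calculations", here is a minimal sketch in Python (3.9 or later, for math.nextafter). The Interval class and its methods are hypothetical names made up for this example, not the API of any real interval library. Python cannot switch the hardware rounding mode, so the sketch assumes it is good enough to widen every result outward by one representable step; that is slightly more pessimistic than true directed rounding.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical closed interval [lo, hi] that contains the true value."""
    lo: float
    hi: float

    @staticmethod
    def _widen(lo, hi):
        # Push each bound one float outward so rounding cannot exclude the truth.
        return Interval(math.nextafter(lo, -math.inf),
                        math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._widen(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval._widen(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._widen(min(products), max(products))

    def width(self):
        return self.hi - self.lo


# Add the double nearest to 0.1 ten times, carrying the bounds along.
step = Interval(0.1, 0.1)
total = Interval(0.0, 0.0)
naive = 0.0
for _ in range(10):
    total = total + step
    naive += 0.1

print(total)           # a narrow interval around 1.0
print(total.width())   # the accumulated error bound
```

    The width of the final interval is the error bound you get "for free" at the end of the calculation; the plain float sum lies somewhere inside it.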

    This offers some unique properties, e.g. for rigorous global searches.

    So we have far better technology available. Why is this stuff not used more widely?

    As far as I know, only SUN puts interval analysis enabled data types in its FORTRAN and C/C++ compilers. But I have not seen that stuff in gcc, which would have a big impact.

    Very strange.

    For anyone who is interested, here is a homepage of the intervals community.

    Regards,
    Marc

    1. Re:Intervall Analysis by zenyu · · Score: 1

      I know of at least one geometric library written in the 80's that used interval arithmetic when needed. I think it is fairly common to run a calculation with floating point until your error bounds get too big, then you roll back and do the math in infinite precision, or at least with as many bits as needed for your error bounds. This way 99% of the calculations use fast hardware floating point, and the 1% that needs more gets it. (You end up spending most of your time on that 1% since it's all in software, but you gotta do what you gotta do.)

      In graphics you can sometimes just look for error conditions and then go back and perturb your data with some small random values and try again. I don't know if this is done in other fields.

      Back when I took CS there was a class in numerical computing sophomore year, is that no longer a requirement?

    2. Re:Intervall Analysis by mvw · · Score: 2, Insightful
      You are right that in many cases the exact calculation (which is computationally more expensive) should be used only when needed.

      But how is that achieved, if at all?

      I guess one would go and hunt for some arbitrary precision library for integers or some intervals lib for exact error bounds.

      Think for a moment that compilers came with just integer data types and you had to get a floating point arithmetic library every time you wanted to use floating point arithmetic! (I can only remember old Apple ][ integer basic, where something like that might really have happened :-)

      Wouldn't you say that is too uncomfortable?

      So why not make arbitrary precision integer calculations and interval arithmetics part of the compilers? (A compiler switch?)

      My guess is that people would start to use these features more, if they were easy to add to existing software.

      To some extent functional languages already offer certain integer operations with arbitrary precision. But I believe one could do much better.

      Let us hope that future languages will have such extended support built right in.

      Regards,
      Marc

    3. Re:Intervall Analysis by Anonymous+Brave+Guy · · Score: 2, Informative
      So why not make arbitrary precision integer calculations and interval arithmetics part of the compilers?

      I'm sure I've read about a language where there's basically one integer type, which normally maps to a typical 32- or 64-bit value on current machines, but is subject to over/underflow tests and switches to an arbitrary precision mode dynamically. As I recall, its efficiency was comparable to an average compiled language today unless it flipped over, and obviously after flipping it got the right answer where other languages simply hit error conditions. I think the language was probably cited in a Slashdot article. Can anyone else remember reading about this?

      Not sure how well the same approach would work for floating point, but if you can do it reasonably efficiently for integral types, I guess why not?

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    4. Re:Intervall Analysis by berenddeboer · · Score: 1

      > To whom is interested, here is a homepage
      > of the intervals community.

      Both replies to "What is interval arithmetic?" gave a 404. Perhaps this answers your question: "Why is this stuff not used more widely?"

      --
      If I had a sig, I would put it here.
    5. Re:Intervall Analysis by Jerf · · Score: 2, Informative

      Don't know if this is what you're referring to, but Python after (I believe) 2.2 works this way; int calculations will transparently overflow into arbitrary precision integers.
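
      A quick sketch of what that looks like from the programmer's side, assuming a present-day Python 3 interpreter (where int is a single arbitrary precision type, so the switch is invisible). The 64-bit limit below is just a typical machine word chosen for illustration.

```python
import math

# Largest value a signed 64-bit machine integer could hold.
word_max = 2 ** 63 - 1

# One past the limit: no overflow, no wraparound, still a plain int.
bigger = word_max + 1
print(bigger)          # 9223372036854775808
print(type(bigger))    # <class 'int'>

# A result far beyond any machine word works the same way.
huge = math.factorial(30)
print(huge > 2 ** 64)  # True
```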

      Theoretically one could do the same for real numbers but it's not as easy as you think. I'm not sure a library that was both practical and fully general could be produced; reals are nasty little buggers.

      In fact my intuition (normally pretty good at these things) is poking me and suggesting that it may be provable that such a library would be impossible in the general case since one can always construct a situation where any algorithm deciding how much precision to keep will decide to keep an arbitrary amount of it, meaning the calculations would take arbitrarily long, rendering the library useless. The question then is whether it's an odd corner case or something rather more likely to come up, and I suspect the latter because of the sensitivity of iterative algorithms.

      You also have the problem of doing a "pre-calculation" to decide how much precision to keep, then doing the "real calculation", which isn't impossible but would be impossible to retrofit into a language; you'd need a special language just for this library. You can imagine a scenario where you're doing all sorts of fiddly calculations, and at the very end you do one last comparison against, say, 1.0, and your number is .9999 +/- .01, so you'd need to redo the entire calculation with more precision. This problem isn't insurmountable like the previous problem, but it would still be quite tricky to get right, and tricky to use correctly, too. (Precalculation may even turn out to be impossible, so you'd have to speculatively execute the calculations and then discover you needed more precision, which would cause even worse performance in the worst case.)
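
      The "speculatively execute, then discover you needed more precision" loop can be sketched with Python's standard decimal module. Everything here is a made-up illustration: compare_to_one and example are hypothetical names, and the error allowance of 100 units in the last place is an assumed, hand-picked bound for this three-operation calculation, not a rigorous one.

```python
from decimal import Decimal, localcontext


def compare_to_one(compute, start_prec=10, max_prec=200):
    """Return (sign, precision_used); sign is +1 or -1 once decided.

    Raises ArithmeticError if max_prec digits still cannot decide.
    """
    prec = start_prec
    while prec <= max_prec:
        with localcontext() as ctx:
            ctx.prec = prec
            diff = compute() - Decimal(1)
            # ASSUMPTION: the result is off by at most 100 units in the last place.
            allowance = Decimal(10) ** (2 - prec)
            if abs(diff) > allowance:
                return (1 if diff > 0 else -1), prec
        prec *= 2   # undecided: discard the work and redo all of it
    raise ArithmeticError("comparison still undecided at maximum precision")


def example():
    # The true value is 1 + 1e-40; at low precision it looks like 0.999...
    return Decimal(1) / Decimal(3) * 3 + Decimal("1e-40")


sign, used = compare_to_one(example)
print(sign, used)   # 1 80
```

      If the true value were exactly 1 (drop the 1e-40 term), no finite precision would ever decide the comparison and the loop would run until it hit max_prec, which is the worst case described above.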

    6. Re:Intervall Analysis by joto · · Score: 2, Interesting
      I'm sure I've read about a language where there's basically one integer type, which normally maps to a typical 32- or 64-bit value on current machines, but is subject to over/underflow tests and switches to an arbitrary precision mode dynamically.

      Yes, this is pretty typical in most lisp or scheme implementations (it should have been in Python too, but for some reason isn't). Testing for overflow on e.g. x86 can be done by simply testing the overflow flag. Some 20 years ago, that might have been conceived of as fast.

      But in order to be able to switch to a larger representation, there needs to be an if-test somewhere. And that means there is a branch instruction behind every addition. Not fast. Especially on today's pipelined processors. In other words, the documentation/propaganda you've seen is lying.

      However, what you lose in speed might not be that important because, if you are lucky, you can arrange for that test to be needed anyway. This is because, when you have dynamic typing, an if-test would be needed anyway, and if you can arrange for your integers to be in some other range than pointers, then you can speculatively add (or whatever arithmetic operation you want) two fixed size integers before having tested that they really are fixed size integers, and then do something slow only if the result isn't what you should expect.

      Not sure how well the same approach would work for floating point, but if you can do it reasonably efficiently for integral types, I guess why not?

      Sure, you can do something similar. But it isn't necessarily faster or better because of that.

      First, we can't check for just underflow or overflow or NaN's, if what we really are interested in is precision. So if we want to test precision, we need interval arithmetic (or something similar). This is already slowing down stuff by at least a factor of two.

      Second, we need not just maintain this calculated precision value. We also need to monitor it all the time. This adds a lot of if-tests, slowing down the calculation even more.

      Finally, if precision is too bad, we need to be able to rollback the current calculation. Because, if we do a calculation, and find that precision is lost beyond what is acceptable, then we need to redo the whole calculation, not just the last step. I have no idea what this will cost, but it will most likely be very expensive, and certainly complex.

      My guess is that these three factors combine to make the proposed scheme rather unattractive compared with simpler solutions such as just using more bits in the first place.

    7. Re:Intervall Analysis by zenyu · · Score: 1


      Second, we need not just maintain this calculated precision value. We also need to monitor it all the time. This adds a lot of if-tests, slowing down the calculation even more.

      Finally, if precision is too bad, we need to be able to rollback the current calculation. Because, if we do a calculation, and find that precision is lost beyond what is acceptable, then we need to redo the whole calculation, not just the last step. I have no idea what this will cost, but it will most likely be very expensive, and certainly complex.


      Rolling back the current calculation won't give you much, the problem isn't underflow on the current operation. That is pretty easy to handle even in plain old C. The problem is when you calculate A & B and then do a C=A-B. Now you may have lost all your bits of precision. In order to get them back you need to recalculate A & B and you'll know when you have enough bits, but how you go about it is a combinatorial problem. Usually the N is pretty small so you could automate trying all the combinations and adding more precision until it works. But usually a couple minutes of looking at the formula along with your knowledge of the inputs and you can figure out a good way to rearrange the algebra. I don't know if I really would want this to be automatic, especially on a compilation level. A tool that looked at your code and provided alternatives and was interactive so you could grow how far up the tree the tool looked would be nice.
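
      A small sketch of the C=A-B problem and of "rearranging the algebra", in Python. The formula sqrt(x+1) - sqrt(x) and the value x = 1e12 are illustrative choices only; the reference value is computed with the standard decimal module at 50 digits.

```python
import math
from decimal import Decimal, localcontext

x = 1e12

# A and B agree in their first ~13 digits, so A - B keeps only a few.
naive = math.sqrt(x + 1) - math.sqrt(x)

# Same quantity, rearranged so that no subtraction of close values occurs:
# sqrt(x+1) - sqrt(x) == 1 / (sqrt(x+1) + sqrt(x))
stable = 1.0 / (math.sqrt(x + 1) + math.sqrt(x))


def relative_error(value):
    """Relative error against a 50-digit decimal reference."""
    with localcontext() as ctx:
        ctx.prec = 50
        reference = Decimal(x + 1).sqrt() - Decimal(x).sqrt()
        return float(abs((Decimal(value) - reference) / reference))


print(relative_error(naive))    # roughly 1e-5: most of the digits are gone
print(relative_error(stable))   # around 1e-16: full double precision
```

      No extra bits were needed; a minute of looking at the formula was enough, which is the point made above.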


      My guess is that these three factors combine to make the proposed scheme rather unattractive compared with simpler solutions such as just using more bits in the first place.

      But you might not know where the problem is, or worse that there is a problem at all. Sometimes things look reasonable even when the math is all wrong. But now when I think about it you could create a compiler flag that converted all your doubles to doubles with max error. The number formatting routines could be modified to add error, then at least you would know when you were getting bad numbers. You could implement this as a C++ class as a proof of concept, just do a "#define double interval_double" then you could have this class throw some error when the error grew too large. This would make it much easier to implement rollback specific to your application, and it would be easy to at least know you were getting an error and narrowing it down...
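
      A proof-of-concept along those lines, sketched in Python rather than C++ (3.9 or later, for math.ulp). TrackedFloat, PrecisionLost and the REL_TOL threshold are all hypothetical names and values invented for this sketch. The error bound is a first-order estimate that is itself computed in floating point, so treat it as an indicator rather than a proof.

```python
import math


class PrecisionLost(ArithmeticError):
    """Raised when the tracked error bound grows too large."""


class TrackedFloat:
    """Hypothetical 'double with max error': a value plus an absolute error bound."""
    REL_TOL = 1e-9   # assumed threshold: complain below ~9 good digits

    def __init__(self, value, err=None):
        self.value = float(value)
        # Default: assume the input is off by at most half a unit in the last place.
        self.err = math.ulp(self.value) / 2 if err is None else err
        if self.err > self.REL_TOL * abs(self.value):
            raise PrecisionLost(f"{self.value!r} +/- {self.err:.3g}")

    def __add__(self, other):
        v = self.value + other.value
        return TrackedFloat(v, self.err + other.err + math.ulp(v) / 2)

    def __sub__(self, other):
        v = self.value - other.value
        return TrackedFloat(v, self.err + other.err + math.ulp(v) / 2)

    def __mul__(self, other):
        v = self.value * other.value
        err = (abs(self.value) * other.err + abs(other.value) * self.err
               + self.err * other.err + math.ulp(v) / 2)
        return TrackedFloat(v, err)


a = TrackedFloat(math.sqrt(1e12 + 1))
b = TrackedFloat(1e6)
product = a * b          # fine: the relative error stays tiny

try:
    difference = a - b   # two nearly equal values: the bound explodes
    lost = False
except PrecisionLost as exc:
    lost = True
    print("precision lost:", exc)
```

      Catching PrecisionLost at a suitable level is where an application-specific rollback could be hooked in.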

      AND You could hold this over all the LISP, Java, TCL, etc. peoples talking about their big integer implementations. (INF precision int's are easy to implement, but this would be cleverer, and prolly impossible to do nicely in those languages. Either because they don't have exceptions or don't have overloading.)

    8. Re:Intervall Analysis by joto · · Score: 2, Informative
      Rolling back the current calculation won't give you much, the problem isn't underflow on the current operation

      That is exactly why you need rollback. I never intended this to be interpreted as rollback of the current opcode, it was intended to mean rollback as far as you really needed. But I agree that I didn't write it clearly. And I should have thought more about it before writing. With side-effects, such rollbacks would soon become very tricky to implement correctly. But if you want to increase precision automatically, there is no other way to do it.

      But you might not know where the problem is, or worse that there is a problem at all. Sometimes things look reasonable even when the math is all wrong.

      Exactly. Accept no substitute, sometimes actual use of the brain can simplify computational problems immensely.

      But now when I think about it you could create a compiler flag that converted all your doubles to doubles with max error.

      Well, I don't see why we need to #define everything. But a good interval arithmetic library would be nice. Now that boost has one for C++ (haven't tried it myself), maybe people will even start using it.

      precision int's are easy to implement, but this would be cleverer, and prolly impossible to do nicely in those languages.

      This makes no sense. If you think you can do #define double interval_double in C/C++, I see no reason why it should be harder to do in lisp. In fact, as everything else, it should be much easier in lisp, and it has most likely also been done by lispers since at least the 1950's.

    9. Re:Intervall Analysis by zenyu · · Score: 1

      Well, I don't see why we need to #define everything. But a good interval arithmetic library would be nice. Now that boost has one for C++ (haven't tried it myself), maybe people will even start using it.

      The reason for the #define is so that you can turn the use of this library off and on without changing any code. I usually call my floats Real with a typedef so that I can change their underlying representation without a define, but you can't count on this in everyone's code.

      This makes no sense. If you think you can do #define double interval_double in C/C++, I see no reason why it should be harder to do in lisp. In fact, as everything else, it should be much easier in lisp, and it has most likely also been done by lispers since at least the 1950's.

      That part is easy to do in LISP, I meant that as a jibe against Java lacking operator overloading. Are exceptions part of LISP now though? I guess you don't need them, you can wrap your expressions in a function that takes an interval_double and deals with error conditions like a catch statement in C++ would. So I guess LISP wins the "you can do it in LISP!" argument, but at least not the "That's been in LISP since 1832" argument ;)

      I'll have to check out this boost interval arithmetic library, I didn't know about it.

    10. Re:Intervall Analysis by mvw · · Score: 1
      What you describe is an adaptive scheme, something that kicks in automatically and uses the fast version where possible, otherwise the slow but more accurate one, thus yielding the fastest result that is still accurate. That autocontrol is more than I would ask for.

      I would already like to brace my code with some

      use_reliable_calculation {
      // old code
      }

      declaration, or flip some compiler switch, and with minimal changes to the code have the same stuff calculated slow but safe.
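
      Something close to that switch can be approximated today by writing the calculation against a replaceable number type and running it twice. The sketch below is in Python and uses the standard fractions module as the "slow but safe" arithmetic; accumulate and its Real parameter are hypothetical names for illustration, and exact rationals only cover +, -, * and /, not functions like sqrt.

```python
from fractions import Fraction


def accumulate(step, count, Real):
    """The 'old code', written once against whatever number type Real is."""
    total = Real(0)
    increment = Real(step)
    for _ in range(count):
        total += increment
    return total


fast = accumulate("0.1", 10, float)      # hardware floating point
safe = accumulate("0.1", 10, Fraction)   # exact rational arithmetic

print(fast)                      # 0.9999999999999999
print(safe)                      # 1
print(abs(fast - float(safe)))   # the discrepancy between the two runs
```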

      After that I would like to compare results, of course. :)

      Regards, Marc

    11. Re:Intervall Analysis by joto · · Score: 1
      That part is easy to do in LISP, I meant that as a jibe against Java lacking operator overloading

      Well, lisp is not java :-)

      Are exceptions part of LISP now though?

      What do you mean?

      Scheme was first created in 1975. Given that scheme had continuations from the start, I think it is quite likely that people in the lisp community had at least dabbled with exception-handling mechanisms earlier. Continuations are of course the ultimate generalization of that.

      I would be very surprised to hear that either MacLisp or Interlisp (neither of which I am old enough to ever have used) did not support any kind of exception-handling mechanism. But it's possible that Lisp 1.5 missed it.

      By the way, Common Lisp was first specified in 1984 (and became an ANSI standard in 1994), and certainly supports stack unwinding and exceptions, although not general continuations as in scheme.

      So I guess LISP wins the "you can do it in LISP!" argument, but at least not the "That's been in LISP since 1832" argument ;)

      I would be very surprised to find out that people haven't done transparent interval arithmetic in lisp since the dawn of time. Remember that a large part of early development of lisp was done simply to have a better environment for building computational algebra systems such as Macsyma, Reduce, and probably others. AI also was an important part of it of course. In any case, interval arithmetic seems to be an obvious idea that I am sure many people would dabble with, given early computing's emphasis on math.

    12. Re:Intervall Analysis by sir99 · · Score: 1
      I'm glad you brought this up, allowing me to mention a neat alternative to interval arithmetic, namely Affine Arithmetic. Whereas interval arithmetic is a constant approximation to a function, affine arithmetic is sort of a linear approximation, which enables a much better error bound, especially for monotonic functions. Some cool properties:
      • Addition and subtraction are exact: The identity x - x = 0 holds, unlike interval arithmetic.
      • The error in the output is quadratically related to the error in the input (instead of intervals' linear relation).
      • For every calculation, you can extract an estimate of the gradient, as a side effect.
      The main drawback is that computations are slower. Depending on application, the improved accuracy may or may not make up for this. I've written an affine arithmetic library, which I plan to eventually release, along with a program to graph implicitly-defined functions. I plan to use the third property above to drastically improve the root-finding algorithm. Instead of a quad/octree, I'll be able to recurse on parallelograms that much better follow the function's contours.
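
      To show where the x - x = 0 property comes from, here is a minimal sketch in Python. It is not the library mentioned above; the Affine class and its method names are invented for this example, it handles only addition and subtraction, and it ignores rounding error entirely. The idea is that each uncertain input gets its own noise symbol, so the same input appearing twice cancels exactly.

```python
import itertools

_symbols = itertools.count()   # source of fresh noise-symbol ids


class Affine:
    """x = center + sum(coeff_i * eps_i), with each eps_i unknown in [-1, 1]."""

    def __init__(self, center, terms):
        self.center = center
        self.terms = terms     # dict: noise-symbol id -> coefficient

    @classmethod
    def from_interval(cls, lo, hi):
        # A new independent quantity gets a brand-new noise symbol.
        return cls((lo + hi) / 2, {next(_symbols): (hi - lo) / 2})

    def _combine(self, other, sign):
        terms = dict(self.terms)
        for sym, coeff in other.terms.items():
            terms[sym] = terms.get(sym, 0.0) + sign * coeff
        return Affine(self.center + sign * other.center, terms)

    def __add__(self, other):
        return self._combine(other, 1)

    def __sub__(self, other):
        return self._combine(other, -1)

    def to_interval(self):
        radius = sum(abs(c) for c in self.terms.values())
        return (self.center - radius, self.center + radius)


x = Affine.from_interval(1.0, 2.0)
y = Affine.from_interval(1.0, 2.0)   # same range, but an independent quantity

print((x - x).to_interval())   # (0.0, 0.0): the shared noise symbol cancels
print((x - y).to_interval())   # (-1.0, 1.0): what plain intervals give for both
```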

      Unfortunately, I've decided that it would be too much work for me to rigorously bound rounding errors, which would be necessary to get the "machine proof" of correctness you mention. I currently have my own interval library as part of the implementation, but I might use Boost's, if it works well enough.

      Actually, the other reason I haven't tried to bound rounding errors is that rounding control is broken in GNU Libc on pretty much every architecture.

      --
      The ocean parts and the meteors come down
      Laid out in amber, baby.
    13. Re:Intervall Analysis by Anonymous Coward · · Score: 0

      The Common Lisp Condition System dates back to at least the early 1990s, and THROW/CATCH are much older.

  11. The world is not all float by Jouni · · Score: 3, Insightful
    Most desktop architectures have gone all the way to push wide bands of parallel processed float and double calculations through the pipes, but the mobile world is a whole different story.

    PDA level mobile FPUs are very rare indeed. In practice, devices using the ARM family processors have no hardware float support. It's thus very important for developers to understand floating point intimately, so that they won't be left at the mercy of awful compiler-emulated floating point code. Of course, in those cases most code tends to orient itself for fixed point arithmetic. Fixed point calculations are much better suited for the integer crunching power of, say, the Intel XScale.

    There are also good tradeoffs developers can make between floats and fixed point, for example by using block floating point (BFP) formats, where a whole block of values shares the same common exponent.
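
    For readers who have not met block floating point, here is a minimal sketch in Python of a block of values sharing one exponent. The function names bfp_encode and bfp_decode and the 16-bit mantissa width are assumptions for illustration; a real implementation on an integer-only CPU would work on integers throughout rather than converting from floats.

```python
import math


def bfp_encode(values, mantissa_bits=16):
    """Encode floats as signed integer mantissas sharing a single exponent."""
    largest = max(abs(v) for v in values)
    _, exponent = math.frexp(largest)     # largest = m * 2**exponent, 0.5 <= m < 1
    shift = (mantissa_bits - 1) - exponent
    limit = 2 ** (mantissa_bits - 1) - 1  # clamp in case rounding hits 2**15
    mantissas = [max(-limit, min(limit, round(math.ldexp(v, shift))))
                 for v in values]
    return mantissas, -shift              # value_i ~= mantissa_i * 2**(-shift)


def bfp_decode(mantissas, exponent):
    return [math.ldexp(m, exponent) for m in mantissas]


block = [0.5, -0.25, 0.125, 1000.0]
mantissas, exponent = bfp_encode(block)
print(mantissas, exponent)               # [16, -8, 4, 32000] -5
print(bfp_decode(mantissas, exponent))   # [0.5, -0.25, 0.125, 1000.0]

# The tradeoff: values much smaller than the block maximum lose their bits.
lossy, lossy_exp = bfp_encode([1000.0, 0.01])
print(bfp_decode(lossy, lossy_exp))      # [1000.0, 0.0]
```

    Once a block is encoded, additions within the block are plain integer additions, which is what makes the format attractive on hardware without an FPU.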

    Now that 3D is really coming to mobile devices, plenty of people will get first-hand experience of emulating floating point for the first time since the 80's. :-)

    Jouni

    --
    Jouni Mannonen | Game Designer, Consultant
    1. Re:The world is not all float by BiggerIsBetter · · Score: 1

      Now that 3D is really coming to mobile devices, plenty of people will get first-hand experience of emulating floating point for the first time since the 80's. :-)

      I don't think so. Enough research has been done into using 3D hardware as a general purpose FPU that the next generation of PDA display chips will probably take care of this anyway (if needed at all). If the choice is between an FPU for software 3D vs real hardware 3D, PC history has shown that the better answer is the latter.

      --
      Forget thrust, drag, lift and weight. Airplanes fly because of money.
  12. Any college level engineering numerical methods by alyandon · · Score: 3, Insightful

    Any college level engineering numerical methods course will teach you all the pitfalls involved with using floating point calculations on modern processors and how to minimize the impact of rounding errors (cumulative and otherwise) on your calculations.

    Hell, any decent numerical methods book should cover stuff like that as well.

  13. Lahey on inexactness by DSP_Geek · · Score: 2, Informative

    The inexactness portion of his argument is quite wrong. His example claims single precision floating point only allows for 8K values between 1023.0 and 1024.0. Consider that under IEEE-754 the numbers would be represented respectively as 1.998046875 * 2^9 and 1.00000000 * 2^10, _with a full 24 bits of precision in the mantissa_, thus ensuring the number of possible values between 1023.0 and 1024.0 actually reaches 2^24.

    Francois.

    1. Re:Lahey on inexactness by DSP_Geek · · Score: 1

      Ooops, that's actually 2^23 values between 1023.0 and 1024.0 (IEEE-754 single precision mantissa is actually 23 bits, extended to 24 by considering the leading bit to be 1 when the number is normalised).

      Francois.

    2. Re:Lahey on inexactness by Anonymous Coward · · Score: 0

      You are wrong on this point.

      1024 in hex: 400
      1023 in hex: 3FF

      1023 has 10 significant bits, 1024 has 1.

      In FP:
      1024.0: mantissa of 1.0, unbiased exponent of 10
      1023.0: mantissa of 1.FF8 (hex), unbiased exponent of 9

      standard single precision floating point: 1 bit sign, 8 bits biased exponent, 23 bits mantissa.

      1023.0 (minus the implied one) mantissa:

      111 1111 1100 0000 0000 0000

      1023.0 becomes 1024.0 when that mantissa rolls over, so you have about 14 bits to play with.

      So, you have between 0000 and 3FFF, which is about 16k. Either I missed a bit somewhere, or Lahey made the mistake of allotting 11 significant bits to 1023.0, but 16,383 is a *far* cry from 2^24.
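
      The count is easy to check directly. The sketch below, in Python, uses the standard struct module to reinterpret each single precision value as its 32-bit pattern; for positive floats, consecutive bit patterns are consecutive representable values, so the difference of the patterns is the number of steps between them. float32_bits is a helper name made up for this example.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 single precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


steps = float32_bits(1024.0) - float32_bits(1023.0)
print(hex(float32_bits(1023.0)))   # 0x447fc000
print(hex(float32_bits(1024.0)))   # 0x44800000
print(steps)                       # 16384, i.e. 2**14
```

      So there are 16,383 single precision values strictly between 1023.0 and 1024.0, spaced 2^-14 apart.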

  14. Huh? by joto · · Score: 4, Insightful
    I tried some of the examples in these articles with Intel's Fortran Compiler and g77 and noted that some of those issue reported no longer seem valid whereas quite a few still very much are around.

    Would you mind telling us what those "issues" were? Because the articles hardly deal with "issues" at all. What they deal with is the theoretical limitations that must exist in floating point, due to the fact that we have finite hardware, while real analysis assumes infinite precision. This should not have changed between 1991 and now (especially since we have all standardized on IEEE floating point formats, but even if the article was from 1960, you should easily be able to "translate" it to your favourite floating point format (which is probably IEEE)).

    Could someone, please, give me a pointer to some newer thoughts and/or new facts surrounding floating point programming.

    There are very few new thoughts with regards to floating point programming, just as there are very few new thoughts on the use of "if-then-else"-branches or "while"-loops. Floating point programming is basically a solved problem. The only problem with it is that it sometimes flies in the face of intuition, and most programmers are ignorant about it. This has not changed since 1991 either.

    The articles you mentioned are very good articles for understanding issues surrounding floating point. Just make sure you read them with your brain, instead of just feeding your favourite compiler with any examples you see.

    What has been improved since those articles were written?

    Speed. Computers have become faster. (It's possible that there also have been some minor software improvements such as an ISO C addendum clarifying tricky areas with rounding modes, or something like that.)

    What is still the same?

    Essentially, nothing has changed.

    How is the future, especially with the new platforms IA64 and AMD64?

    Very predictable. Nothing will change there either. Non-IEEE floating point vector instructions, or "multimedia" instruction sets will probably continue to be unstandardized and platform-dependent.

    I am most interested in the x86 and x86-64 architectures

    There is nothing special about those architectures with respect to floating point (well, the x86 reuses its floating point registers for MMX instructions, but you shouldn't need to know that unless you use assembler).

    1. Re:Huh? by H*(BZ_2)-Module · · Score: 2, Informative

      IEEE 754/854 has not changed for some time now, but it does have some problems and a revision is currently being worked on. See http://grouper.ieee.org/groups/754/revision.html.

    2. Re:Huh? by greppling · · Score: 1
      There is nothing special about those architectures with respect to floating point. (Talking of x86 and x86-64 architectures.)

      Well, there is one thing that may be a nasty surprise: The fact that x86 processors use registers with 80-bit precision can mean that two absolutely identical-looking computations, when compiled (e.g.) with an optimizing C compiler, can lead to non-identical results. That's because just storing a result from a register to memory changes the result (rounding it to the narrower format).

      (If using GCC, you can use -ffloat-store to avoid this problem.)

    3. Re:Huh? by turgid · · Score: 1

      Very predictable. Nothing will change there either. Non-IEEE floating point vector instructions, or "multimedia" instruction sets will probably continue to be unstandardized and platform-dependent.
      This shouldn't worry you on x86 since 3DNow!, SSE and SSE2 all conform to the IEEE spec.

    4. Re:Huh? by Anonymous Coward · · Score: 0

      Your dosage of attitude relative to useful information is too high. Consider being nice - it makes you seem smarter.

      As others have pointed out, x86 can be different in that internally it uses 80 bits. This is the most likely reason why some of the examples the original poster tried didn't fail as expected.

    5. Re:Huh? by Anonymous Coward · · Score: 0

      I dunno, smart knowledgeable people who answer stupid questions nicely and with kindness are so rare that I'd probably end up wondering if they really knew what they were talking about.

    6. Re:Huh? by G3ckoG33k · · Score: 1

      "Floating point programming is basically a solved problem." -- joto, 2003

      "640 K ought to be enough for anybody." -- Bill Gates, 1981

  15. Nooooooooo! by Anonymous+Brave+Guy · · Score: 3, Funny

    It's past 1am and some **** is throwing inexact representations and fuzzy logic at me.... This must be a nightmare... Must... wake... up... Aaaaaargh!

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  16. Real world and catastrophic failures by Anonymous+Brave+Guy · · Score: 4, Interesting
    I would really like to know, if there are real world engineering examples, where simulations produced dangerous products, because the simulation was inadequate because of numerical errors. Perhaps in aerodynamics, who knows how they perform their flight simulations.

    I've worked on a couple of projects where this is very important. One was writing control software for metrology equipment, industrial strength QA kit that measured manufactured parts down to fractions of a micron or even nanometres to make sure they were in spec. Another was a geometric modelling tool used in CAD applications and the like.

    In neither case am I aware of any physical real world failure caused by a problem with the floating point calculations. You do have to be really careful with manipulating the numbers, though.

    For example, the loss of significance when you subtract can be horrible if you've got two position vectors close together, and you're trying to calculate a translation vector from one to the other. The error in that translation vector can be enormous if the points you started with were very close: you might get only one or two significant figures, when the rest of your values have 15 or more. If you're interested in the direction of the vector, that can give you errors of +/- several degrees!
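
    A small sketch of that effect in Python (64-bit doubles). The coordinates, the 30 degree direction and the 3e-10 separation are made-up numbers chosen so that the two points differ only in the last couple of representable steps; they are not from any real measurement.

```python
import math

true_angle = 30.0          # intended direction of the translation, in degrees
length = 3e-10             # separation, tiny relative to the coordinates

# Point a, and point b placed `length` away from it in the intended direction.
ax, ay = 1000000.123456789, 2000000.987654321
bx = ax + length * math.cos(math.radians(true_angle))
by = ay + length * math.sin(math.radians(true_angle))

# Translation vector recovered from the stored coordinates.
dx, dy = bx - ax, by - ay
recovered = math.degrees(math.atan2(dy, dx))

print(dx, dy)        # each is just one or two float steps long
print(recovered)     # about 45 degrees instead of the intended 30
```

    The subtraction itself is exact; the damage was done when bx and by were rounded to the nearest representable coordinates.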

    Inevitably, there are always going to be bugs in complex mathematical software, and I've seen plenty of wrong answers from programs like the above. Fortunately, it's normally possible to have checks and balances that at least identify and highlight inconsistencies so, in the worst case, at least nobody relies on them. You can also use ruthless automated testing procedures, which run zillions of calculations every night and flag the smallest changes in the results, so no-one accidentally breaks a verified algorithm with a change later. The combination makes it reasonably unlikely that any algorithm would fail catastrophically with the sort of consequences you're talking about.

    The possibility is always there, of course, because programming is subject to human error. However, FWIW, I've worked on software that's used to design cars, and software that controls the QA machinery to make sure they're put together right, and I still drive one. :-)

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    1. Re:Real world and catastrophic failures by Anonymous Coward · · Score: 0

      The possibility is always there, of course, because programming is subject to human error. However, FWIW, I've worked on software that's used to design cars, and software that controls the QA machinery to make sure they're put together right, and I still drive one. :-)

      Heh, which one? I worked in the office where the math library for one of them was made and there were a couple hundred known bugs. There were workarounds in the code that usually worked. At the time there was an effort to merge the fork of the library used for the car company product with the one used for the other big product, but it was slow going because of the disparate workarounds in the code. Using one or the other would often break the product the workaround wasn't made for. There were like 6 mathematicians per programmer too; a mathematician could take a couple of weeks just to change one line of code.

  17. Arbitrary length by Tablizer · · Score: 2, Interesting

    One interesting approach I have seen is the use of strings to store almost arbitrary decimal positions. You can set a maximum length, at which point it rounds. But the nice thing is that the rounding is done in the decimal number system instead of binary, so it is closer to how business managers expect it to be rounded (like you would do it on paper). This approach obviously is not ideal for scientific computing, but is geared toward business uses where rounding accuracy is more important than speed. PHP used to include the "BC" library that did this kind of thing. I don't know what happened to it.
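
    For comparison, a rough sketch of the same idea using Python's standard `decimal` module (not the PHP "BC" library mentioned above). The `round_money` helper and the sample amount are assumptions made up for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_money(amount: str) -> Decimal:
    """Round a decimal string to two places the way it is done on paper."""
    return Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# The binary double nearest to 2.675 is slightly below 2.675, so it rounds down.
binary_result = round(2.675, 2)

# The decimal value is exactly 2.675, so half-up rounding gives 2.68.
decimal_result = round_money("2.675")

print(binary_result)   # 2.67
print(decimal_result)  # 2.68
```

    Passing the amount as a string matters: `Decimal(2.675)` would inherit the binary rounding error before any decimal rounding takes place.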

    1. Re:Arbitrary length by metamatic · · Score: 1

      Right, that's what's known as the correct approach :-)

      Floating point is basically a convenience for people who don't know (or don't care to work out) how accurate they need the answer to be or what the range of input will be. It was also convenient years ago when computers lacked the power to deal with multiword numeric representations. These days, unless you're doing *really* heavy number crunching, there's not much point using floats.

      In fact, real numbers are a mathematical abstraction of questionable relevance to the real world. Given the size of the known universe, we could quite easily represent all length measurements as 256 bit integer multiples of the Planck length. The resulting figures would be as accurate as reality itself--or rather, they would be as accurate as the initial measurements they depended on.
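
      A quick check of the 256-bit figure, as a sketch only. The two constants are assumptions that are not in the post: a Planck length of about 1.616255e-35 m and a rough 8.8e26 m for the diameter of the observable universe.

```python
from fractions import Fraction

# Assumed values, not from the post above.
PLANCK_LENGTH_M = Fraction(1616255, 10**41)  # about 1.616255e-35 m
UNIVERSE_DIAMETER_M = 88 * 10**25            # about 8.8e26 m, a rough figure

# Number of Planck lengths across the observable universe, as an exact integer.
planck_lengths = int(UNIVERSE_DIAMETER_M / PLANCK_LENGTH_M)
bits_needed = planck_lengths.bit_length()

print(f"about {float(planck_lengths):.2e} Planck lengths")
print(f"{bits_needed} bits needed, fits in 256 bits: {bits_needed <= 256}")
```

      Under those assumptions the count comes out a little over 200 bits, so a 256-bit integer leaves room to spare.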

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
    2. Re:Arbitrary length by tg5027 · · Score: 1

      Yep, I was doing that back in 1967 on an IBM 1130 doing commercial apps.

  18. News Flash: Go To considered harmful by bellings · · Score: 5, Funny
    I'm assuming that sometime in the next week one of the slashdot editors will be trolled with an article like:
    I'm rather new with programming and stumbled across the article Go To Statement Considered Harmful from 1968. I tried some of the examples in this article, and noted that some of those issue reported no longer seem valid whereas quite a few still very much are around. What has been improved since the article was written? Will the new 64-bit architectures finally fix all the problems with Go To Statements, or is this something that the hardware designers still need to work on?
    --
    Slashdot is jumping the shark. I'm just driving the boat.
  19. picking nits by soulcutter · · Score: 1
    Because the articles hardly deal with "issues" at all.

    The articles you mentioned are very good articles for understanding issues surrounding floating point
    (emphasis mine)

    Great post, though. You're absolutely correct.
    --
    Old programmers don't die, they're just cast into a void
  20. What about BCD by oliverthered · · Score: 1

    BCD (binary coded decimal) is floating point too and perfectly accurate.

    --
    thank God the internet isn't a human right.
    1. Re:What about BCD by sir99 · · Score: 1

      Really? What's the perfectly accurate BCD representation of 1/3?

      --
      The ocean parts and the meteors come down
      Laid out in amber, baby.
    2. Re:What about BCD by larry+bagina · · Score: 2, Interesting

      in lisp 1/3 is stored as 1/3. Maybe the rest of the computer languages will catch up some day.
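
      A small Python sketch of the difference. The standard `decimal` module stands in for BCD-style decimal arithmetic here (an assumption, since it is not literally BCD), and `fractions.Fraction` plays the role of a Lisp rational.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal arithmetic has to cut 1/3 off somewhere (28 digits by default).
decimal_third = Decimal(1) / Decimal(3)

# A rational keeps numerator and denominator, so 1/3 is stored as 1/3.
rational_third = Fraction(1, 3)

print(decimal_third * 3 == 1)   # False
print(rational_third * 3 == 1)  # True
```

      Decimal digits fix the 0.1 problem but not the 1/3 problem; only a rational (or symbolic) representation stores 1/3 exactly.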

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    3. Re:What about BCD by haystor · · Score: 1

      Yea, until emacs-lisp handles rationals the world is not complete.

      Oh, there is some other language than common lisp and emacs-lisp?

      --
      t
  21. More time to research the accuracy ... by Anonymous Coward · · Score: 0
    For 32-bit machines: a 32-bit float and a 64-bit double.

    For 64-bit machines: a 64-bit float and a 128-bit double.

    For 128-bit doubles, it is better to give more bits of precision to the mantissa and fewer bits to the exponent.

    What is the accuracy of tan(tan(tan(tan(tan(tan(tan(tan(tan(tan(sqrt(PI))))))))))) with a 128-bit double?

    More precision means slower computation of cosine!!!

    open4free

  22. Overloading...(Re:Intervall Analysis) by joto · · Score: 1
    Oh yeah, I forgot about your question about overloading.

    Overloading only makes sense in statically typed languages. When you have dynamic typing, the compiler can't statically determine the types of the arguments; this needs to be done at runtime (at least in the worst case, see below). The "+" function in lisp is just that, a function (although the compiler will typically inline parts of it). And you can override it (well, many implementations would disallow that for reasons of speed, but you can always use a new name, such as "add", "sum", "plus", "my+", or something similar). Since lisp uses the same syntax whether or not the symbol looks like an operator, this isn't as bad as it sounds: writing (plus a b) or (add a b) is not much worse than simply writing (+ a b).

    The reasoning behind not letting you override the built-in "+" is simple. "+" needs to be at least partially inlined to get good speed. Furthermore, good lisp implementations (such as cmucl, sbcl, Franz Lisp, etc) all have insanely advanced compilers that can do type-inference to analyze your code and try to find guarantees for e.g. numbers used as arguments to "+" always being fixed-size integers, thus removing any kind of run-time testing, and allowing the same speed as statically typed languages. Common Lisp has a number of primitives for adding type declarations to expressions to help the compiler in this process.

    If you must insist on using the name "+", you would typically save your old "+" function under a new name, and use it to define a new function called "+" that does the appropriate thing for all kinds of arguments, including your newly defined data-type. Needless to say, this is not particularly efficient, but it's the most obvious way to do it in a dynamically typed language. If you use namespaces, you can ensure that your new enhanced "+" function will only be used in those modules that need it.

    If common lisp were designed today, it would probably have a much more integrated object system, making built-in functions such as "+" methods (some scheme implementations do this). Then, instead of overriding the fast built-in "+" to call a slow user-defined function, you would simply define new methods for "+" and your new datatype.

    This means that fast things remain fast, even when you extend the meaning of "+" to new datatypes. Under the hood, this will be implemented as a fast "+" for builtin types, such as fixed-size integers, and anything else, such as complex numbers, interval arithmetic, strings, etc, going through the object system.

    Adding new "slow" methods would not disturb the fast stuff, but since method lookup would typically go through a hash-table (vtables are only useful for "static" classes where you can't add new methods at runtime), it could potentially slow down other "slow" methods, but not by much, hash-tables are pretty fast regardless of size.

    Unfortunately, according to the common lisp standard, "+" is a function and not a method. And typically, most implementations will fight hard to keep you from overriding it (I am not entirely sure what the standard says about it). Your best bet is to simply use a different name than "+", together with the CLOS object system. This is also more in line with the classic lisp philosophy, which often provides a "fast" function for simpler arguments, and a "slow" one for more generic arguments, with different names. But ideally, we want a new funky fully object-oriented lisp.

    I am not sure what python, ruby, javascript, or other more modern object-oriented scripting languages do here, but I would guess most of them allow you to do what you want (at least I'm pretty sure javascript will do the trick).
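
    For what it's worth, Python does allow this: "+" on a user-defined class dispatches to its `__add__` / `__radd__` methods, while addition on built-in numbers is left alone. A minimal sketch follows; the `Interval` class is hypothetical, and it ignores the outward rounding that real interval arithmetic needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Toy closed interval [lo, hi]; endpoint rounding is ignored here."""
    lo: float
    hi: float

    def __add__(self, other):
        if isinstance(other, Interval):
            return Interval(self.lo + other.lo, self.hi + other.hi)
        if isinstance(other, (int, float)):
            return Interval(self.lo + other, self.hi + other)
        return NotImplemented  # let Python try the other operand

    # Addition is commutative, so number + Interval reuses the same code.
    __radd__ = __add__


print(Interval(1.0, 2.0) + Interval(10.0, 20.0))  # Interval(lo=11.0, hi=22.0)
print(1 + Interval(1.0, 2.0))                     # Interval(lo=2.0, hi=3.0)
print(1 + 2)                                      # 3, built-in path untouched
```

    This is close to the "methods on +" design described above: the new type adds its own behaviour without replacing or wrapping the built-in addition.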

  23. Re:Any collegel level engineering numerical method by Anonymous Coward · · Score: 0

    Hmm, you had a good course. I took my undergraduate numerical methods course from the math department at my school, and the instructor pounded home that for everything we did, we needed to be able to put a bound on the error. I thought it was a great course.

    Later, in graduate school (this time in physics), I had a similar course, but now the instructor spent very little time talking about error bounds. In fact, nobody I talked to in physics talked about that sort of stuff, it wasn't just something funny with my instructor. I don't know a single physicist [at least in my field...] who thinks about this stuff. It strikes me as weird.

    I don't blame it particularly on my instructor, because he's done amazing number crunching in his time, and he taught us some fabulous things. I lament that I've lost some of those error-bounding skills I had developed, and when I go to conferences and see people present results, I occasionally ponder if any of these new results are the result of subtracting one giant number from another giant number, and then doing the wrong thing with that result.

  24. IEEE FP is the peril by 73939133 · · Score: 2, Interesting

    Floating point is a well-defined and easy to understand representation. Of course, that doesn't mean it's easy to use--mathematically, it can be pretty complicated to deal with at times. Perhaps the biggest sin is to think of floating point numbers as "real numbers"--they aren't.

    Unfortunately, IEEE 754, the most widely used floating point standard, fixes none of the complexities of using floating point but creates many completely unnecessary complexities of its own. Many CPUs just give up and throw any kind of specialized IEEE features into software, making them nominally compliant but unusable. And many programming languages refuse to implement the inane and broken semantics specified for IEEE comparison operators.

    The only good thing that can be said about IEEE 754 is that even a lousy standard is better than nothing at all. And, on the bright side, you can usually put CPUs and compilers into modes where they behave somewhat sanely (no denormalized numbers, sane comparisons, no NaNs).
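
    For readers who have not run into the comparison semantics in question, a short Python sketch of how a NaN behaves under the IEEE rules. It only illustrates the behaviour and takes no side on whether it is sane.

```python
import math

nan = float("nan")

print(nan == nan)            # False: a NaN is not equal to anything, itself included
print(nan != nan)            # True
print(nan < 1.0, nan > 1.0)  # False False: every ordered comparison fails

# Code built on "<" and ">" quietly becomes order-dependent around a NaN.
print(max(nan, 1.0))         # nan
print(max(1.0, nan))         # 1.0

# The reliable test is an explicit one.
print(math.isnan(nan))       # True
```

    The `max` lines show why this bites in practice: the same two values give different answers depending on argument order.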

  25. Counting money by jvalenzu · · Score: 1

    You can usually count my money using a 1-bit integral type with 1-bit for error checking.

  26. Only one R in "integer" by Anonymous Coward · · Score: 0

    "Interger" is not a word.

    HTH.

  27. Who cares? by Anonymous Coward · · Score: 0

    Maybe I'm just not experienced enough with certain areas of programming (quite possible), but who cares if the 30th decimal place is rounded? As you add each decimal place to the right, you increase the precision by a factor of 10.

    After all, 1.0000000000 is 10 billion times more accurate than 1. Why would you need any more precision?

    1. Re:Who cares? by Anonymous Coward · · Score: 0

      Obviously you didn't read this link as suggested by a previous submitter.

  28. It's all still true. by sbaker · · Score: 2, Insightful

    Those articles are still quite valid - and will remain so.

    So long as a float is still 32 bits and a double 64, you'll get about that degree of precision. It's not that the hardware is inaccurate - they all do pretty much the best they can with the information provided.
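
    To put numbers on "about that degree of precision", a small Python sketch. Python floats are 64-bit doubles, so `to_float32` is a helper made up here that rounds a value through the 32-bit format using the standard `struct` module.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# A 32-bit float carries 24 significant bits, roughly 7 decimal digits.
print(to_float32(0.1))                 # differs from 0.1 after about 8 digits
print(to_float32(16777217.0))          # 2**24 + 1 collapses to 16777216.0

# A 64-bit double carries 53 significant bits, roughly 16 decimal digits.
print(float(2**53 + 1) == float(2**53))  # True: 2**53 + 1 is not representable
```

    Neither result is a hardware fault; each format is returning the nearest value it can hold.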

    Roundoff errors and other evils of floating point representations are here to stay.

    However, you can't just automatically decide to punt and use fixed point arithmetic. There is a 'tension' between dynamic range and precision. If you want reliable precision, you can't have large dynamic ranges for your numbers and vice-versa.
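
    A rough Python sketch of that tension (it needs Python 3.9 or later for `math.ulp`). The micro-unit fixed-point format is a made-up example, not something from the articles.

```python
import math

# Floating point: huge range, but the gap between neighbouring values grows
# with the magnitude of the number.
print(math.ulp(1.0))       # gap near 1.0, which is 2**-52
print(math.ulp(1e16))      # gap near 1e16 is 2.0
print(1e16 + 1.0 == 1e16)  # True: the added 1.0 is lost entirely

# Fixed point (hypothetical format: signed 64-bit count of millionths):
# the gap is always exactly 1e-6, but the range stops near 9.2e12.
SCALE = 10**6
largest_whole_value = (2**63 - 1) // SCALE
print(largest_whole_value)
```

    Floating point spends its bits on range and lets the absolute precision drift; fixed point pins the precision and gives up the range.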

    The biggest and best improvement we've seen since the early '90s is that doing your work in double precision is much less of a penalty than it used to be (when compared to working in single precision or integers).

    With 64 bit machines, we should expect that penalty to become yet smaller.

    So if speed is an issue, modern machines can be more precise - but if speed was not an issue, machines of the early '90s were every bit as precise as the latest whiz-bang 64 bit CPU. IEEE math hasn't changed much (at all?) in that time.

    --
    www.sjbaker.org
  29. Oh but it gets worse.... by Anonymous Coward · · Score: 0

    More importantly, because some processors do their FP at 80 bits and others at 64 bits, you can't depend on answers to even simple FP math coming out bit-identical on two machines (not even between certain Intel parts), which really sucks if you are developing anything peer-to-peer with the expectation of synchronicity.
    The IEEE "standard" is an insult to the word.
    It allows for uniform FP programming, but they fell short of drawing any useful lines in the sand w.r.t. bit patterns and other implementation details.