Floating Point Programming, Today?

Posted by Cliff on Thursday June 26, 2003 @08:42AM from the learning-new-techniques dept.

An anonymous reader asks: "I'm rather new to programming and stumbled across these two articles: The Perils of Floating Point from 1996 and What Every Computer Scientist Should Know About Floating-Point Arithmetic from 1991. I tried some of the examples in these articles with Intel's Fortran Compiler and g77 and noted that some of the issues reported no longer seem valid, whereas quite a few are still very much around. Could someone please give me a pointer to some newer thoughts and/or new facts surrounding floating point programming? What has been improved since those articles were written? What is still the same? What does the future look like, especially with the new platforms IA64 and AMD64? I am most interested in the x86 and x86-64 architectures. Thank you for your kind help."

The articles are quite up-to-date by Mik!tAAt · 2003-06-26 08:48 · Score: 5, Informative

Both articles are still valid today, mostly because current processors use the same IEEE floating point format as the ones available in '96 (or '91).

--
This is the place where you write something that will make you seem like a complete idiot.
1. Re:The articles are quite up-to-date by jmccay · 2003-06-26 12:25 · Score: 2, Informative
  
  You should probably give a little more detail. For those who don't know, an IEEE floating point number is stored in a form similar to scientific notation (although there are serious differences in what must be done), with a significand scaled by a power of 2 (x * 2**n, if I remember my FORTRAN right). An example of a number that cannot be expressed exactly in IEEE floating point (if I remember correctly) is .1. You can approach the number, but you never really reach it.
  If you want to avoid the error, the best thing to do is to use integer math and convert to decimal at the last moment. Make sure you leave plenty of room in the number so as to avoid rounding errors. I rarely see the problem, but I usually only go out 2 decimal places (3 tops) because a lot of what I was working with at my last job was financial numbers. After rounding, there usually wasn't a problem. A lot of financial institutions use integer math in their software to avoid potential problems. If the program is mission critical, you're better off using integer math unless you don't need great precision.
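
A minimal Python sketch of the point about .1, using only the standard library. The variable names are made up for illustration. It prints the exact value a double stores for 0.1 and contrasts a float sum with an integer count of tenths.

```python
from decimal import Decimal
from fractions import Fraction

# The double closest to 0.1, written out exactly: slightly more than 0.1.
exact_tenth = Decimal(0.1)
print(exact_tenth)

# The same value as a ratio: the denominator is a power of two, never 10.
tenth_as_fraction = Fraction(0.1)
print(tenth_as_fraction.denominator == 2**55)

# Ten float additions of 0.1 drift; ten integer additions of "one tenth" do not.
float_total = sum([0.1] * 10)
tenths_total = sum([1] * 10)          # integer count of tenths
print(float_total == 1.0)             # False
print(tenths_total / 10 == 1.0)       # True: convert at the last moment
```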
  As a side note, the last time I checked, Mozilla's version of JavaScript used floating point notation to store dates. Because of this, some platforms will experience date problems from time to time with JavaScript. Unless they have changed it to an integer type for storing the number representing the date and time, a date may be wrong from time to time. To test whether or not Mozilla has a problem, you have to create a date object with a specific date and time that you know can't be represented in IEEE floating point notation. I don't have the patience to do that right now, and they may have changed it, because I haven't looked at the code since before July 26, 2002, when I was laid off. It really is only a problem for certain instants in time on certain dates, but I think the precision required to represent a date and time as a number really requires it to be implemented with some variation of integer mathematics.
  I was looking at the code for javascript because my last company was looking into it for scripting purposes for their product. I got laid off and the project died.
  
  --
  At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
Unsolvable problem by Anonymous Coward · 2003-06-26 09:01 · Score: 5, Informative

Floating point stuff hasn't really changed much since then. Basic rule of thumb, if you want it to be accurate don't use floating point.

Much the same problem as you have with decimals. Many fractions cannot be evaluated evenly in certain bases. It will always cause you headaches if you don't realize this.

Try writing a bunch of numbers in hex but then do all of your calculations in decimal. you'll have the same problem.
1. Re:Unsolvable problem by john_many_jars · 2003-06-26 10:06 · Score: 4, Informative
  
  The use of floating point numbers isn't all bad. Those of us who use them are often solving problems with condition numbers that render the answer we get less accurate than the number of digits of accuracy provided.
  
  Think about tan(89.99) versus tan(89.991) (which is very ill-conditioned around 90). Both numbers are not terribly truncated by floating point, but the results are different by about 1,000. Try it and you'll see floating point error isn't as dangerous as things like cancellation, ill-conditioning and the like.
2. Re:Unsolvable problem by Phronesis · 2003-06-26 13:51 · Score: 3, Informative
  
  Think about tan(89.99) versus tan(89.991) (which is very ill-conditioned around 90). Both numbers are not terribly truncated by floating point, but the results are different by about 1,000. Try it and you'll see floating point error isn't as dangerous as things like cancellation, ill-conditioning and the like.
  tan(89.990) = -2.0460
  tan(89.991) = -2.0408
  perhaps you're thinking of
  tan(1.571) = -4909.8
  and
  tan(1.578) = -138.8
3. Re:Unsolvable problem by Froggie · 2003-06-27 04:07 · Score: 2, Informative
  
  An interesting point is that if you do integer calculations that you expect to work with perfect accuracy on a 32 bit integer, then they will also work with perfect accuracy on a float with a 32+ bit mantissa.
  
  Quite useful if you're adding integer numbers together on a 32 bit machine in C and you want the carry bit, for instance (and you're too lazy to write the code entirely in integer arithmetic): you can't easily find the sum and carry bit if you're using 32 bit ints, but it's trivial if you have a larger FP type (e.g. a double).
  
  This is not to say FP numbers are better than ints, but if you know what you're doing you can do anything with an FP type that you can with an int type. The cost is usually the slowness of the FP operation - highly unlikely to take 1 ALU cycle, even on modern processors.
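
A rough Python sketch of the sum-and-carry trick described above, assuming IEEE doubles (which Python floats are on common platforms). The function name `add_u32_with_carry` is invented for this example.

```python
TWO_32 = 4294967296.0   # 2**32 as a double

def add_u32_with_carry(a, b):
    """Add two unsigned 32-bit values using double arithmetic only.

    The sum needs at most 33 bits, well inside a double's 53-bit
    significand, so every step below is exact.
    """
    total = float(a) + float(b)
    carry = 1 if total >= TWO_32 else 0
    low = total - carry * TWO_32
    return int(low), carry

print(add_u32_with_carry(0xFFFFFFFF, 1))
print(add_u32_with_carry(2, 3))

# The guarantee ends at 53 bits: 2**53 + 1 is not representable.
print(float(2**53) + 1.0 == float(2**53))   # True
```

The last line shows why the "32+ bit mantissa" condition matters: past the significand width, integer results are silently rounded.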
4. Re:Unsolvable problem by Admiral+Burrito · 2003-06-27 14:43 · Score: 3, Informative
  
  Try writing a bunch of numbers in hex but then do all of your calculations in decimal. you'll have the same problem.
  
  Actually, you won't. You would the other way though.
  
  The problem occurs when you try to represent a (properly reduced) fraction whose denominator has one or more prime factors not in common with your number base.
  
  You can represent one tenth in base 10 because all the prime factors in the denominator (10: 5,2) are found within the factorization of the base (also 10: 5,2). You cannot represent one sixth exactly in base 10 because one of the factors of 6 is not found in the factorization of 10 (3). Likewise, you cannot represent one tenth exactly in base 2, because the denominator (10) is a multiple of 5, which is a prime not found in the factorization of the base (2).
  
  Because the factorization of 16 contains only primes that are in the factorization of 10 (2) all fractions that can be represented in hexadecimal can be represented in decimal. The reverse is not true, because 10 is the product of a prime (5) that is not found in the factorization of 16. So there is no way to get the "fifths" aspect of a decimal number into a hexadecimal number.
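
The prime-factor rule above can be sketched in a few lines of Python. The helper name `terminates` is hypothetical. It strips from the denominator every factor it shares with the base and checks whether anything is left.

```python
from math import gcd

def terminates(denominator, base):
    """True if 1/denominator has a finite expansion in the given base."""
    d = denominator
    while True:
        common = gcd(d, base)
        if common == 1:
            break
        d //= common        # remove factors shared with the base
    return d == 1           # leftover primes mean a repeating expansion

print(terminates(10, 10))   # one tenth in decimal
print(terminates(6, 10))    # one sixth in decimal
print(terminates(10, 2))    # one tenth in binary
print(terminates(16, 10))   # a hexadecimal digit place in decimal
print(terminates(5, 16))    # one fifth in hexadecimal
```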
5. Re:Unsolvable problem by Anonymous Coward · 2003-06-28 13:15 · Score: 1, Informative
  
  tan(89.990) = -2.0460
  tan(89.991) = -2.0408
  
  perhaps you're thinking of
  
  tan(1.571) = -4909.8
  and
  tan(1.578) = -138.8
  
  ...or perhaps he's thinking of angles in degrees. ;)
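
The disagreement in this sub-thread comes down to units, which is easy to check. The sketch below (variable names are illustrative) evaluates both readings with Python's `math` module, which works in radians. In degrees, the two results come out roughly 637 apart, so "about 1,000" is the right order of magnitude but not the exact gap.

```python
import math

def tan_degrees(angle):
    """Tangent of an angle given in degrees."""
    return math.tan(math.radians(angle))

deg_a = tan_degrees(89.99)      # roughly 5729.6
deg_b = tan_degrees(89.991)     # roughly 6366.2
rad_a = math.tan(89.99)         # roughly -2.046
rad_b = math.tan(89.991)        # roughly -2.041

# A relative input change of about 1e-5 moves the degree result by about 11%.
relative_input_change = 0.001 / 89.99
relative_output_change = (deg_b - deg_a) / deg_a
print(deg_b - deg_a, relative_output_change / relative_input_change)
```

The ratio printed last is a crude estimate of the condition number near 90 degrees: the error amplification comes from the problem itself, not from the floating point format.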
Platform and all by Stary · 2003-06-26 09:19 · Score: 5, Informative

It all depends on what platform you program on and so on. The x87 floating point unit in x86 processors does its arithmetic in an 80-bit format and only rounds when copying back to your original 32 or 64 bit floats. That saves you some precision but not that much. As others have said, there are probably still situations where almost all of the material in those articles is valid.

--
Tomorrow will be cancelled due to lack of interest
1. Re:Platform and all by Pseudonym · 2003-06-26 18:29 · Score: 2, Informative
  
  It also screws royally with your numerics. Take, for example:
  
  float x = something(), y = something_else();
  if (x < y)
      assert(x < y);
  
  This assertion can fail on Intel hardware, because by the time the assert comes around, x may equal y as one or both of them have been truncated from 80 bits to 32.
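
Python cannot reach the 80-bit registers, so the sketch below only imitates the effect one level down: two values that differ as 64-bit doubles become equal once rounded to 32 bits. The helper `to_float32` is invented for this example. It is an analogy to the x87 behaviour, not a reproduction of it.

```python
import struct

def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

x = 1.0
y = 1.0 + 2.0**-30            # distinct from x in the wider format

wide_compare = x < y                              # compared before narrowing
narrow_compare = to_float32(x) < to_float32(y)    # compared after narrowing

print(wide_compare, narrow_compare)   # True False
```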
  
  --
  sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Common mistake by PD · 2003-06-26 09:21 · Score: 5, Informative

Don't count money as floating point. You'll just have rounding errors. Using long doubles instead of floats won't help you at all.

The solution is to count pennies instead, or if you need values bigger than about 21 million dollars (the limit of a signed 32-bit count of pennies), use a BCD library. BCD is Binary Coded Decimal.

--
If tits were wings it'd be flying around.
1. Re:Common mistake by PD · 2003-06-26 10:28 · Score: 3, Informative
  
  That's not the error I was addressing. Here's some definitions of a subtotal:
  
  float subtotal; // wrong way to represent money
  long subtotal_pennies; // right way to represent money
  
  And, if you're at a gas station, you need to represent money like this:
  
  long subtotal_mils; // gas per gallon has a 9/10 of a cent on the end - $1.34 9/10
  
  The calculations that you perform on the money are a completely different story. There's no point in worrying about 4 decimal places of percentages if you don't start from the right place.
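
A short Python sketch of the pennies-and-mils idea. All names here are hypothetical, and the round-half-up rule in `mils_to_pennies` is an assumption, since the thread does not say how a gas pump rounds.

```python
# Float dollars: ten 10-cent items do not sum to exactly one dollar.
float_subtotal = sum([0.10] * 10)

# Integer pennies: the same ten items sum exactly.
subtotal_pennies = sum([10] * 10)

# Gas priced at $1.34 9/10 per gallon, held as mils (tenths of a cent).
PRICE_MILS_PER_GALLON = 1349

def mils_to_pennies(mils):
    """Round a non-negative mil amount to the nearest penny, halves up."""
    return (mils + 5) // 10

def format_pennies(pennies):
    """Render an integer penny count as dollars and cents."""
    return f"${pennies // 100}.{pennies % 100:02d}"

ten_gallons_mils = 10 * PRICE_MILS_PER_GALLON
ten_gallons = format_pennies(mils_to_pennies(ten_gallons_mils))

# Largest amount a signed 32-bit penny counter can hold.
max_32bit = format_pennies(2**31 - 1)

print(float_subtotal == 1.0, subtotal_pennies == 100)
print(ten_gallons, max_32bit)
```

Python integers do not overflow, so the 32-bit ceiling is only computed here for reference. In C it depends on the width of `long` on the platform.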
  
  --
  If tits were wings it'd be flying around.
2. Re:Common mistake by mdielmann · 2003-06-26 17:12 · Score: 2, Informative
  
  You're not looking far enough ahead. What do you do when you have two taxes collected, where each is a fraction of a cent? What do you do when the governing bodies allow you to combine the fractional cents for economy, and pay the tax based on total taxable sales at the end of the month? Where I live we have two VAT taxes, both 7%. Short of whole dollars, there is no amount on which 7% comes to a whole number of pennies. Since the governing bodies allow combining of taxes, 50 cents gives a whole number, too, but what do you do for the other 98% of the time? Now, just for fun, let's talk foreign currencies (how much is the peso in U.S. dollars?). You need at least 4 decimals. The accounting system I develop value-added mods for typically allows 5 decimals for financial values, and we've run into situations where this isn't enough for our clients.
  
  Now, how the system stores that value may be up for grabs, but from what I've seen of the specs, they use a floating point.
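
The claim about 7% and whole pennies can be checked by brute force over one dollar's worth of cent amounts. This is a sketch only, and `whole_penny_amounts` is a made-up helper name.

```python
def whole_penny_amounts(rate_percent):
    """Sale amounts (in cents, 1..100) whose tax is a whole number of cents."""
    return [cents for cents in range(1, 101)
            if (cents * rate_percent) % 100 == 0]

single_tax = whole_penny_amounts(7)      # one 7% tax on its own
combined_tax = whole_penny_amounts(14)   # both 7% taxes combined

print(single_tax)     # [100]
print(combined_tax)   # [50, 100]
```

Two amounts out of every hundred work for the combined tax, which matches the "other 98% of the time" remark. Every other amount needs sub-cent precision somewhere.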
  
  --
  Sure I'm paranoid, but am I paranoid enough?
Here's an important one. by Apuleius · 2003-06-26 09:40 · Score: 4, Informative

Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications, also available here if you're one of the unwashed masses. It has algorithms to see if your app is facing floating point trouble.
If you need more precision... by cfallin · 2003-06-26 09:44 · Score: 4, Informative

Hardware floating point is only so accurate - if you need more floating point (or integer) precision, use GNU MP - a library for C with bindings for many other languages too. It came in quite handy when I wrote some cryptography code with very large numbers.
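
To illustrate what software precision buys, here is a small sketch using Python's standard `decimal` module as a stand-in. This is not GNU MP and does not use its API. It only shows the general idea of choosing the working precision yourself.

```python
import math
from decimal import Decimal, localcontext

# Hardware double: squaring sqrt(2) misses 2 by about 4e-16.
float_residual = math.sqrt(2.0) ** 2 - 2.0

# Software arithmetic at 50 significant digits.
with localcontext() as ctx:
    ctx.prec = 50
    root_two = Decimal(2).sqrt()
    residual = root_two * root_two - 2

print(float_residual)
print(root_two)
print(residual)
```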
Python-specific, but contains useful info for all. by tdelaney · 2003-06-26 09:59 · Score: 3, Informative

Floating Point Arithmetic: Issues and Limitations.
Intervall Analysis by mvw · 2003-06-26 10:02 · Score: 3, Informative

Ok, known issues with floating point routines that can be fixed (unintentional pun :-) should be fixed.
On the other hand it is clear that a finite representation of real numbers has tradeoffs. But only a few seem to care about the accumulated errors.
My experience in engineering (simulation of cast turbine blades) was that people knew that bad things can occur during complex floating point calculations, but the matter was too complicated to be investigated.
Example: if during a finite element simulation a timestep did not end up with a valid solution (the iterative/approximate solver of the large linear systems did not converge or even crashed), some control parameters were simply varied (time step, perhaps material curves) until the calculation seemed to produce a valid-looking result. Needless to say, only obvious errors can be spotted that way.
The strange thing about all this is that in recent years the mathematical discipline of interval analysis has been developed. Here every number is represented by an interval of known error bounds. These error intervals are kept and updated during calculations. Thus at the end of a large complex calculation, you know the error. That is a very valuable property.
Moreover, what one does in many cases is not only a standard calculation but rather a machine proof of error bounds.
This offers some unique properties, e.g. for rigorous global searches.
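
A toy sketch of the interval idea in Python, assuming Python 3.9 or later for `math.nextafter`. The `Interval` class is invented here and is far cruder than a real interval library. It simply pushes each bound outward by one representable step after every operation, so the true result always stays inside.

```python
import math
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi], widened outward after every operation."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @classmethod
    def around(cls, value):
        """Enclose the intended real number by one step on each side."""
        return cls(math.nextafter(value, -math.inf),
                   math.nextafter(value, math.inf))

    def __add__(self, other):
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi)
                    for b in (other.lo, other.hi)]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    def width(self):
        return self.hi - self.lo

# Add "one tenth" ten times and keep track of where the true answer can be.
total = Interval(0.0, 0.0)
for _ in range(10):
    total = total + Interval.around(0.1)

# Fraction converts each bound exactly, so this containment check is rigorous.
contains_one = Fraction(total.lo) <= 1 <= Fraction(total.hi)
print(total.lo, total.hi, contains_one)
```

The plain float sum of ten 0.1 values is not 1.0, but the interval result is guaranteed to contain 1, and its width is the error bound carried through the calculation.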
So we have far better technology available. Why is this stuff not used more widely?
As far as I know, only Sun puts interval analysis enabled data types in its FORTRAN and C/C++ compilers. But I have not seen that stuff in gcc, where it would have a big impact.
Very strange.
For anyone who is interested, here is a homepage of the intervals community.
Regards,
Marc
1. Re:Intervall Analysis by Anonymous+Brave+Guy · 2003-06-26 12:56 · Score: 2, Informative
  
  So why not make arbitrary precision integer calculations and interval arithmetics part of the compilers?
  
  I'm sure I've read about a language where there's basically one integer type, which normally maps to a typical 32- or 64-bit value on current machines, but is subject to over/underflow tests and switches to an arbitrary precision mode dynamically. As I recall, its efficiency was comparable to an average compiled language today unless it flipped over, and obviously after flipping it got the right answer where other languages simply hit error conditions. I think the language was probably cited in a Slashdot article. Can anyone else remember reading about this?
  
  Not sure how well the same approach would work for floating point, but if you can do it reasonably efficiently for integral types, I guess why not?
  
  --
  If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
2. Re:Intervall Analysis by Jerf · 2003-06-26 14:41 · Score: 2, Informative
  
  Don't know if this is what you're referring to, but Python after (I believe) 2.2 works this way; int calculations will transparently overflow into arbitrary precision integers.
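
For what it is worth, the behaviour looks like this in a current Python 3, where there is a single `int` type. The variable names are illustrative.

```python
import math

largest_32bit = 2**31 - 1
promoted = largest_32bit + 1      # no overflow, no wraparound
big = math.factorial(30)          # far beyond 64 bits

print(promoted)
print(big, big.bit_length())
```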
  
  Theoretically one could do the same for real numbers but it's not as easy as you think. I'm not sure a library that was both practical and fully general could be produced; reals are nasty little buggers.
  
  In fact my intuition (normally pretty good at these things) is poking me and suggesting that it may be provable that such a library would be impossible in the general case since one can always construct a situation where any algorithm deciding how much precision to keep will decide to keep an arbitrary amount of it, meaning the calculations would take arbitrarily long, rendering the library useless. The question then is whether it's an odd corner case or something rather more likely to come up, and I suspect the latter because of the sensitivity of iterative algorithms.
  
  You also have the problem of doing a "pre-calculation" to decide how much precision to keep, then doing the "real calculation", which isn't impossible but would be impossible to retrofit into a language; you'd need a special language just for this library. You can imagine a scenario where you're doing all sorts of fiddly calculations, and at the very end you do one last comparison against, say, 1.0, and your number is .9999 +/- .01, so you'd need to re-do the entire calculation with more precision. This problem isn't insurmountable like the previous problem, but it would still be quite tricky to get right, and tricky to use correctly, too. (Precalculation may even turn out to be impossible, so you'd have to speculatively execute the calculations and then discover you needed more precision, which would cause even worse performance in the worst case.)
3. Re:Intervall Analysis by joto · 2003-06-27 02:54 · Score: 2, Informative
  
  Rolling back the current calculation won't give you much, the problem isn't underflow on the current operation
  That is exactly why you need rollback. I never intended this to be interpreted as rollback of the current opcode, it was intended to mean rollback as far as you really needed. But I agree that I didn't write it clearly. And I should have thought more about it before writing. With side-effects, such rollbacks would soon become very tricky to implement correctly. But if you want to increase precision automatically, there is no other way to do it.
  But you might not know where the problem is, or worse that there is a problem at all. Sometimes things look reasonable even when the math is all wrong.
  Exactly. Accept no substitute, sometimes actual use of the brain can simplify computational problems immensely.
  But now when I think about it you could create a compiler flag that converted all your doubles to doubles with max error.
  Well, I don't see why we need to #define everything. But a good interval arithmetic library would be nice. Now that Boost has one for C++ (haven't tried it myself), maybe people will even start using it.
  precision int's are easy to implement, but this would be cleverer, and prolly impossible to do nicely in those languages.
  This makes no sense. If you think you can do #define double interval_double in C/C++, I see no reason why it should be harder to do in Lisp. In fact, like everything else, it should be much easier in Lisp, and it has most likely also been done by Lispers since at least the 1950's.
Lahey on inexactness by DSP_Geek · 2003-06-26 11:56 · Score: 2, Informative

The inexactness portion of his argument is quite wrong. His example claims single precision floating point only allows for 8K values between 1023.0 and 1024.0. Consider that under IEEE-754 the numbers would be represented respectively as 1.998046875 * 2^9 and 1.00000000 * 2^10, _with a full 24 bits of precision in the mantissa_, thus ensuring the number of possible values between 1023.0 and 1024.0 actually reaches 2^24.
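
Counting the representable values directly is a useful check on both figures. The sketch below assumes IEEE-754 single precision as packed by Python's `struct` module. The helper `float32_bits` is a made-up name. For positive floats, consecutive values have consecutive bit patterns, so subtracting the patterns counts the steps. By this count there are 2^14 = 16384 steps from 1023.0 to 1024.0, which matches neither 8K nor 2^24.

```python
import struct

def float32_bits(value):
    """Bit pattern of a value stored as an IEEE-754 single, as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

steps_1023_to_1024 = float32_bits(1024.0) - float32_bits(1023.0)
steps_1_to_2 = float32_bits(2.0) - float32_bits(1.0)
spacing_near_1023 = 1.0 / steps_1023_to_1024

print(steps_1023_to_1024)   # 16384
print(steps_1_to_2)         # 8388608
print(spacing_near_1023)    # 6.103515625e-05
```

The 24-bit significand covers the whole binade from 512 to 1024, so a slice of width 1 near the top of it gets 2^23 / 512 of the available values.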

Francois.
Re:Huh? by H*(BZ_2)-Module · 2003-06-26 16:31 · Score: 2, Informative

IEEE 754/854 has not changed for some time now, but it does have some problems and a revision is currently being worked on. See http://grouper.ieee.org/groups/754/revision.html.