Has the Decades-Old Floating Point Error Problem Been Solved? (insidehpc.com)

Posted by EditorDavid on Sunday January 21, 2018 @04:34AM from the carrying-the-one dept.

overheardinpdx quotes HPCwire: Wednesday a company called Bounded Floating Point announced a "breakthrough patent in processor design, which allows representation of real numbers accurate to the last digit for the first time in computer history. This bounded floating point system is a game changer for the computing industry, particularly for computationally intensive functions such as weather prediction, GPS, and autonomous vehicles," said the inventor, Alan Jorgensen, PhD. "By using this system, it is possible to guarantee that the display of floating point values is accurate to plus or minus one in the last digit..."

The innovative bounded floating point system computes two limits (or bounds) that contain the represented real number. These bounds are carried through successive calculations. When the calculated result is no longer sufficiently accurate the result is so marked, as are all further calculations made using that value. It is fail-safe and performs in real time.
Jorgensen is described as a cyber bounty hunter and part-time instructor at the University of Nevada, Las Vegas teaching computer science to non-computer science students. In November he received US Patent number 9,817,662 -- "Apparatus for calculating and retaining a bound on error during floating point operations and methods thereof." But in a followup, HPCwire reports: After this article was published, a number of readers raised concerns about the originality of Jorgensen's techniques, noting the existence of prior art going back years. Specifically, there is precedent in John Gustafson's work on unums and interval arithmetic both at Sun and in his 2015 book, The End of Error, which was published 19 months before Jorgensen's patent application was filed. We regret the omission of this information from the original article.

7 of 174 comments

Reinvented interval arithmetic by Anonymous Coward · 2018-01-21 04:51 · Score: 2, Interesting

As mentioned in the summary, this sounds no different from the age old interval arithmetic. The reason interval arithmetic never took off is that for the vast majority of problems where error is actually a problem, the bounds on the error become so large as to be worthless. To fix this you still need to employ those specialist numerical programmers, so this doesn't actually get you anywhere.
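A minimal Python sketch of why naive interval bounds blow up, as the comment above describes. The `Interval` class is a hypothetical helper for illustration, and it ignores the outward (directed) rounding a real interval library would apply. The expression `(y + x) - x` leaves `y` unchanged in exact arithmetic, yet each pass widens the interval because interval arithmetic treats the two occurrences of `x` as unrelated:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; hypothetical helper, no directed rounding."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Worst case: smallest minus largest, and largest minus smallest.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    @property
    def width(self):
        return self.hi - self.lo


x = Interval(0.9, 1.1)
y = x
for _ in range(10):
    y = (y + x) - x  # exact arithmetic would leave y unchanged

print(x.width, y.width)
```

After ten passes the width has grown from about 0.2 to about 4.2, although the true value never moved.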
Re:Built-in error bars by Anne Thwacks · 2018-01-21 04:58 · Score: 3, Interesting

In the olden days (1970s) we did the computation in standard and double precision. If the answers were different, then you probably needed to change the method. (The underlying problem is "underflow": you do not have any significant digits.)
No special hardware or patents needed.
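The two-precision check is easy to sketch in Python. This is an illustration under assumptions, not the commenter's original code: `to_f32` and `running_sum` are hypothetical helpers, and single precision is emulated by rounding every intermediate result to binary32 with the standard `struct` module.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def running_sum(term, n, round_fn=lambda v: v):
    """Add `term` to a running total n times, rounding after every step."""
    total = round_fn(0.0)
    t = round_fn(term)
    for _ in range(n):
        total = round_fn(total + t)
    return total


n = 1_000_000
single = running_sum(0.1, n, to_f32)  # emulated single precision
double = running_sum(0.1, n)          # native double precision
rel_diff = abs(single - double) / abs(double)

print(single, double, rel_diff)
```

If `rel_diff` is far larger than single-precision rounding alone would explain, as it is for this naive sum, that is the signal to change the method (for example, to a compensated summation).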

--
Sent from my ASR33 using ASCII
back to basics ? by swell · 2018-01-21 05:38 · Score: 4, Interesting

In the 1950s, when the public was first becoming aware of computers, computers were considered to be large calculators. They could do math. They could be used by the IRS to compute your taxes or by the military to analyze sensor inputs and guide missiles. Few people could envision a future where computers could manipulate strings, images, sounds and communicate in the many ways that we now enjoy.
But today we have all those unimaginable benefits but one: They can't really do math well. Oh, the irony!

--
...omphaloskepsis often...
Re:Built-in error bars by Impy the Impiuos Imp · 2018-01-21 06:04 · Score: 4, Interesting

Every time you multiply two floats you lose a digit of precision. It's a little more complicated but that is the essence.
The butterfly effect was discovered with weather simulations. They saved their initial data and ran it again the next day -- and got a different result, which is impossible.
Turns out they only saved the initial conditions to 5 digits and not the entire float, or what passed for it in the 1970s.
Lo and behold! The downstream numbers, far from returning to the same value, diverged wildly. And no matter how small the difference, it always diverged.
Up until then, scientists had believed small differences would get absorbed away in larger trends. Here was evidence the big trends were completely dependent on initial conditions to the smallest detail.
That's why one stray photon would screw up time travel -- any difference whatsoever would cause the weather to be different in about a month and soon different sperm are meeting different eggs, and the entire next generation is different.
So any time you do more than a trivial number of float multiplies, you are in a whole different world. This is ok if you are looking for statistical averages over many, but miserable if you want to rely on any particular calculation.
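The divergence described above can be reproduced in a few lines. This sketch uses the logistic map as a stand-in for a weather model, which is an assumption for illustration (the comment does not name a model); `logistic_orbit` and the starting value are likewise hypothetical. One run starts from the full value, the other from the same value rounded to 5 digits:

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a standard chaotic toy system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


full = logistic_orbit(0.123456789, 60)
rounded = logistic_orbit(round(0.123456789, 5), 60)

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(full, rounded)]

print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
```

The initial gap is a few parts in a million, but within a few dozen steps the two runs are as far apart as two unrelated trajectories.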

--
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
Re:Built-in error bars by Cassini2 · 2018-01-21 06:46 · Score: 3, Interesting

The reason it is not "twice the work" is that a crude interval approach only works for linear equations. Consider y=x^2 for small x. If the lower limit is x=-0.1 and the upper limit is x=0.1, then y=0.01 in either case, so evaluating only the endpoints predicts the range [0.01, 0.01]. However, if the actual x=0, the actual y=0. In fact, for every -0.1 < x < 0.1, y is below 0.01 and therefore outside the predicted range.
For many simple systems of equations, it is possible to get good error analysis by using some Calculus. Specifically, multiply the expected error in the inputs by the derivative (or an upper bound on the derivative).
Unfortunately, most of the people working in this area are solving complex systems of equations, often involving large matrices. This makes the problem difficult to detect with computationally fast approaches. Some people use Monte-Carlo simulations to get estimates of the likely error. These situations require far more computations than double (often thousands of runs). It is also possible to do some serious error analysis to determine when the linear interval analysis applies, and when it doesn't. However, this also requires far more calculations.
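The y=x^2 example and the derivative rule can be checked with a short Python sketch. All three function names are hypothetical, and the code ignores rounding of the bounds themselves. `square_endpoints` is the crude approach that only evaluates the limits, `square_interval` accounts for the minimum at x=0, and `first_order_error` multiplies the input error by the derivative 2x:

```python
def square_endpoints(lo, hi):
    """Crude approach: evaluate y = x^2 only at the two limits."""
    a, b = lo * lo, hi * hi
    return min(a, b), max(a, b)


def square_interval(lo, hi):
    """Range of y = x^2 over [lo, hi], including the minimum at x = 0."""
    a, b = lo * lo, hi * hi
    if lo <= 0.0 <= hi:
        return 0.0, max(a, b)
    return min(a, b), max(a, b)


def first_order_error(x, dx):
    """Estimated error in y = x^2: |dy/dx| * dx = |2x| * dx."""
    return abs(2.0 * x) * dx


crude = square_endpoints(-0.1, 0.1)
proper = square_interval(-0.1, 0.1)
estimate = first_order_error(3.0, 0.01)
actual = 3.01 ** 2 - 3.0 ** 2

print(crude, proper)
print(estimate, actual)
```

The crude range collapses to a single point near 0.01 and misses y=0, while the corrected range contains it. Away from x=0 the derivative estimate tracks the actual change closely.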
Re:Built-in error bars by whit3 · 2018-01-21 09:34 · Score: 2, Interesting

instead of using one floating point value, they use two and say the real answer is between those two. If the two floats are consistent when rounded to the requested precision, it declares the value correct. If they differ, it gives an accuracy error.
So, for only twice the work and a little overhead on top, this process can tell you when to switch to a high precision fixed-point model ...

There are flaws in the principle other than the obvious doubles-the-work feature, though. Error 'bars' only show a pair of worst-case values that is purported to represent the error, but careful measurement processes give a distribution of errors.
One can analyze a voltage and temperature tolerance range for a CPU chip; it gives rise to 2^2 = 4 vertices for testing.
That DOES fit the model proposed. Measured-accuracy is usually less precise than a floating point number, so LSB error isn't our accuracy limit.
Taking four measured numbers, each with an associated thirty-sample selection of deviants, and calculating all the combinations: 30^4 (about a million) calculations later, you know not only the result, but a collection of deviants that can be distilled down to a new thirty samples, representing the error distribution of the result.
While the LSB granularity of a computer number is a kind of 'error bar', and a calculation on that kind of error IS just two worst-case values, there's still the nagging problem that the worst cases are overestimates of the error to be expected. In a statistical sense, worst-case is always wrong for large-scale calculations (an overestimate). When the error distribution is a bell curve, there IS no defined worst-case that really represents the situation.
By the central limit theorem, we always expect the result of a many-input calculation to have a bell-curve distribution, so the result of a large calculation is NEVER well-characterized if you propagate worst-cases.
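A small Monte Carlo sketch of the gap between worst-case and statistical error. It assumes, purely for illustration, that each of 1000 inputs carries an independent error uniformly distributed in [-0.5, 0.5]; the constants and the `total_error` helper are hypothetical:

```python
import random
import statistics

N_TERMS = 1000     # inputs summed in one calculation (assumed)
HALF_WIDTH = 0.5   # each input error is uniform in [-0.5, 0.5] (assumed)
N_TRIALS = 1000    # number of simulated calculations

rng = random.Random(0)  # fixed seed so the run is repeatable


def total_error():
    """Accumulated error of one sum of N_TERMS inputs."""
    return sum(rng.uniform(-HALF_WIDTH, HALF_WIDTH) for _ in range(N_TERMS))


errors = [total_error() for _ in range(N_TRIALS)]

worst_case = N_TERMS * HALF_WIDTH                 # propagated error bar
observed_max = max(abs(e) for e in errors)
observed_std = statistics.pstdev(errors)
predicted_std = HALF_WIDTH * (N_TERMS / 3) ** 0.5  # sqrt(n) * h / sqrt(3)

print(worst_case, observed_max, observed_std, predicted_std)
```

The propagated worst case is 500, while the simulated errors cluster in a bell curve with a standard deviation near 9, so the worst-case bar overstates the typical error by well over an order of magnitude.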
Re:Built-in error bars by Pseudonym · 2018-01-21 10:36 · Score: 3, Interesting

That's not the only flaw, of course. If you take a workhorse numeric problem like integrating an ordinary differential equation, interval arithmetic can give you a bound on how accurate the calculation is relative to an infinite-precision calculation, but not on how accurate the calculated solution is relative to the true solution. I'll give you one guess which bound is the one you actually want.
In the case of ODE solving or numeric quadrature, the thing that determines the accuracy of the solution is how well-behaved the function is in the regions that fall between your samples. Neither interval arithmetic nor this "bounded floating point" is going to help you here.
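A sketch of that distinction, assuming the simplest possible setup: forward Euler on y' = y with y(0) = 1, integrated to t = 1, where the true answer is e. The `euler_exp` helper is hypothetical. Running the same steps in exact rational arithmetic stands in for the "infinite-precision calculation":

```python
import math
from fractions import Fraction


def euler_exp(n_steps, one, h):
    """Forward Euler for y' = y, y(0) = 1, using the number type of `one`."""
    y = one
    for _ in range(n_steps):
        y = y + h * y
    return y


n = 10
y_float = euler_exp(n, 1.0, 1.0 / n)                 # ordinary doubles
y_exact = euler_exp(n, Fraction(1), Fraction(1, n))  # no rounding at all

# What interval arithmetic can bound: float result vs exact-arithmetic result.
rounding_error = abs(y_float - float(y_exact))
# What it cannot see: exact-arithmetic result vs the true solution e.
truncation_error = abs(math.e - float(y_exact))

print(rounding_error, truncation_error)
```

The rounding error is negligible, while the method's own error exceeds 0.1. Tight bounds on the first number say nothing about the second.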
Now having said that, I did read it, and the method described is not quite interval arithmetic. It's a little more subtle than that: the programmer sets the desired error bounds on the solution, and the FPU does something like a quiet NaN if the calculation exceeds those bounds.
And yes, just like with interval arithmetic, all our floating point libraries will need to be rewritten.
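The "quiet NaN when the bound is exceeded" behaviour can be imitated in software. The sketch below is only a guess at the general idea and is not the patented design: `Tracked`, `_check`, and `MAX_REL_ERR` are hypothetical names, the error bound uses textbook first-order propagation, and `math.ulp` requires Python 3.9 or later.

```python
import math
from dataclasses import dataclass

MAX_REL_ERR = 1e-3  # hypothetical tolerance chosen by the programmer


@dataclass(frozen=True)
class Tracked:
    value: float
    err: float  # bound on the absolute error of `value`

    def __add__(self, other):
        return _check(self.value + other.value, self.err + other.err)

    def __sub__(self, other):
        return _check(self.value - other.value, self.err + other.err)

    def __mul__(self, other):
        e = (abs(self.value) * other.err + abs(other.value) * self.err
             + self.err * other.err)
        return _check(self.value * other.value, e)


def _check(value, err):
    """Return a Tracked result, or a flagged NaN if the bound is too wide."""
    if math.isnan(value):
        return Tracked(math.nan, math.inf)  # the flag propagates
    err += math.ulp(value) / 2  # roughly the rounding of this operation
    if err > MAX_REL_ERR * abs(value):
        return Tracked(math.nan, math.inf)
    return Tracked(value, err)


a = Tracked(1.0000001, 1e-9)
b = Tracked(1.0, 1e-9)

total = a + b                      # relative error stays tiny
diff = a - b                       # cancellation: bound is ~2% of the value
later = diff * Tracked(2.0, 0.0)   # anything built on `diff` stays flagged

print(total.value, diff.value, later.value)
```

The subtraction is flagged because the surviving digits are mostly error, and every later result that depends on it is flagged too, which matches the summary's description of marking "all further calculations made using that value".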

the result of a large calculation is NEVER well-characterized if you propagate worst-cases.
That's true, but TFA is talking up the safety-critical aspect, which I suppose is the one place where worst-case behaviour is what you want. TFA also mentions weather simulations, and that's exactly the case where the distribution of solutions is the answer you want, not worst-case bounds.
It's a little bit interesting, but Betteridge's Law definitely wins this round. I can see this as being slightly more useful than existing techniques (e.g. interval arithmetic) in some safety-critical systems, but I'd rather just have finer rounding control on a per-variable basis. We've had SIMD instructions on commodity hardware for almost two decades now, so I don't know why we don't already have essentially-free interval arithmetic within a register.
This doesn't "solve" "the floating-point error problem", as if there's only one problem outstanding. The full-employment theorem for numeric analysts has not been violated.

--
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});