Same Programs + Different Computers = Different Weather Forecasts

Posted by timothy on Sunday July 28, 2013 @01:26AM from the climate-change-without-leaving-the-room dept.

knorthern knight writes "Most major weather services (US NWS, Britain's Met Office, etc) have their own supercomputers, and their own weather models. But there are some models which are used globally. A new paper has been published, comparing outputs from one such program on different machines around the world. Apparently, the same code, running on different machines, can produce different outputs due to accumulation of differing round-off errors. The handling of floating-point numbers in computing is a field in its own right. The paper apparently deals with 10-day weather forecasts. Weather forecasts are generally done in steps of 1 hour. I.e. the output from hour 1 is used as the starting condition for the hour 2 forecast. The output from hour 2 is used as the starting condition for hour 3, etc. The paper is paywalled, but the abstract says: 'The global model program (GMP) of the Global/Regional Integrated Model system (GRIMs) is tested on 10 different computer systems having different central processing unit (CPU) architectures or compilers. There exist differences in the results for different compilers, parallel libraries, and optimization levels, primarily due to the treatment of rounding errors by the different software systems. The system dependency, which is the standard deviation of the 500-hPa geopotential height averaged over the globe, increases with time. However, its fractional tendency, which is the change of the standard deviation relative to the value itself, remains nearly zero with time. In a seasonal prediction framework, the ensemble spread due to the differences in software system is comparable to the ensemble spread due to the differences in initial conditions that is used for the traditional ensemble forecasting.'"

240 comments

Damn you people by Anonymous Coward · 2013-07-28 01:29 · Score: 1

Why don't you use 128bit Integers to represent some form of fixed point? I highly doubt you need any more precision than that.
1. Re:Damn you people by YoungManKlaus · 2013-07-28 02:19 · Score: 2, Informative
  
  actually, that would be really good because you have a fixed spacing of values throughout the whole range which is a very important property in simulations (at least as far as I learned in numerical mathematics).
2. Re:Damn you people by Anonymous Coward · 2013-07-28 03:38 · Score: 5, Insightful
  
  Precision is the point. Mathematical chaos diverges exponentially. This means that if one calculation takes a value of 9.3440281 and returns 3.5, another calculation that takes 9.344028147 can return something completely different (say, 8.1). Now you say: well, let's just make it more precise then! So you put in the value of 9.34402814672 and get yet another completely different result (1.7), and so on*. If you weren't dealing with mathematical chaos, the results would steadily converge as you refined the input (e.g. 3.5, 3.45, 3.467, etc.).
  * Note: I should be careful with this layman's description to point out that more precise values technically shrink the window down. But since it is exponentially divergent in the first place, this might not ever do you any good in a realistic setting. Ref Lyapunov exponents and mathematical chaos
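A minimal sketch of the divergence described above, not taken from the paper or any weather model. It uses the logistic map with r = 4, a standard toy chaotic system, as a stand-in. The starting values 0.3440281 and 0.344028147 are just the fractional parts of the numbers in the comment, and `logistic_trajectory` is a hypothetical helper name.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x); for r = 4 this map is chaotic on (0, 1)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two inputs that agree to seven decimal places.
a = logistic_trajectory(0.3440281, 60)
b = logistic_trajectory(0.344028147, 60)

# Absolute gap between the two runs at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]

for step in (0, 10, 20, 30, 40):
    print(step, gaps[step])
```

The gap starts below 1e-7 and roughly doubles per step until it is as large as the values themselves, after which the two runs have nothing to do with each other.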
3. Re:Damn you people by Anonymous Coward · 2013-07-28 03:55 · Score: 2, Funny
  
  For being the first person ever to use exponentially correctly on slashdot I literally award you one (1) internet.
4. Re:Damn you people by Anonymous Coward · 2013-07-28 05:24 · Score: 1
  
  To potentially make this more clear: when the difference in output grows exponentially with the difference in input, you need a massive change in the precision of the inputs for minor gains in the output. It is not the case that doubling the precision of the input doubles the precision of the output. You could double the precision of the input and get only a small fraction of an improvement in the output.
  The exponents in weather simulation work out such that it is nearly impossible to get more than two weeks of detailed weather prediction (some large-scale systems are simpler though, and can go much, much further). And that is the ideal case: 10-day forecasts like the ones discussed here push pretty far into the region where chaos prevents accurate predictions, and this has been known for some time. In this case, you could double the precision of the inputs, and struggle to get more than an extra day or two out of the prediction.
5. Re:Damn you people by Anonymous Coward · 2013-07-28 05:44 · Score: 0
  
  In this case, you could double the precision of the inputs, and struggle to get more than an extra day or two out of the prediction.
  Double the precision in the sense of doubling the number of digits, not halving the size of the error. Going from something like single precision to double precision floating point numbers won't gain you an extra week of weather predictions, you'll be lucky to get an extra day out of that drastic of a change.
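To illustrate the "more digits buy little extra time" point, here is a rough sketch that again uses the logistic map as a toy chaotic system (an assumption for illustration, not a weather model). `steps_until_split` is a hypothetical helper that counts how many iterations pass before two runs started `eps` apart disagree by more than a threshold.

```python
def steps_until_split(x0, eps, threshold=0.1, r=4.0, max_steps=1000):
    """Count iterations until runs started eps apart differ by more than threshold."""
    x, y = x0, x0 + eps
    for step in range(max_steps):
        if abs(x - y) > threshold:
            return step
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
    return max_steps

# Initial error of 10^-4, 10^-8 and 10^-12: each case is 10,000 times more precise.
horizons = {digits: steps_until_split(0.3, 10.0 ** -digits) for digits in (4, 8, 12)}
print(horizons)
```

Each 10,000-fold improvement in the input only adds a roughly constant number of usable steps: the horizon grows with the number of digits, not with the size of the improvement.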
6. Re:Damn you people by Anonymous Coward · 2013-07-28 05:56 · Score: 0, Funny
  
  For being one of many to use literally incorrectly on slashdot, I should of given you one (1) cockpunch.
7. Re:Damn you people by Anonymous Coward · 2013-07-28 06:49 · Score: 1
  
  We also need to be careful here by noting that weather predictions are not always chaotic. Sometimes, a 10 day forecast is not affected by mathematical chaos, and more precise values have significant impacts. And even worse, chaos might not impact the entire forecast. It might only affect days 3-6, and 10, while days 7-9 are still accurate. This sounds crazy, but sometimes chaos has windows of stability.
8. Re:Damn you people by Anonymous Coward · 2013-07-28 06:50 · Score: 5, Funny
  
  For being one of the many to use should of where the correct phrase is should have (often abbreviated should've, I just point at you and laugh.
9. Re:Damn you people by fast+turtle · 2013-07-28 09:53 · Score: 2
  
  TL/RTS:
  From what I'm seeing it's a three-fold issue:
  1) The tool chain is violating the rules by rounding before the calculations are completed
  2) The programmers have broken the rule of not rounding until your calculations are complete
  3) The hardware does not have enough precision to actually deal with the number of places desired by the programmers
  Someone else made the comment about going with a 128-bit int, and I see absolutely no reason to think that's going to be sufficient. Instead, what they need to use is a 1024-bit (or larger) int to handle the issue
  
  --
  Mod me up/Mod me down: I wont frown as I've no crown
10. Re:Damn you people by blueg3 · 2013-07-28 12:54 · Score: 2
  
  I highly doubt you need any more precision than that.
  People doubting that you "need any more precision than that" is, roughly speaking, the origin of problems like this in the first place and, more generally, the origin of our understanding of chaos theory.
  It turns out you, ultimately, need more precision than you can get. Always.
11. Re:Damn you people by xQx · 2013-07-28 18:26 · Score: 1, Offtopic
  
  For continuing this pointless thread by highlighting your failure to close your parenthesis. I give myself (-2) redundant and off-topic.
12. Re:Damn you people by RaceProUK · 2013-07-28 23:35 · Score: 2
  
  I highly doubt you need any more precision than that.
  I'm reminded of something about 640k being enough...
  
  --
  No colour or religion ever stopped the bullet from a gun
13. Re:Damn you people by Darinbob · 2013-07-29 08:29 · Score: 1
  
  This reminds me of working with scientists in the 80s, who complained that the new compiler (using IEEE FP) was wrong because the answers did not agree with their old compiler (using VAX FP). The differences they were pointing out were much smaller than the precision that could be handled, and it felt weird to have to point this out to people with much more mathematical education than I had. But because the print of the number showed 12+ digits they assumed that every one of those digits was accurate.
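A small sketch of the "printed digits are not accurate digits" trap. It does not model VAX floating point; as an assumption, it uses IEEE single versus double precision as a stand-in for two formats that disagree in the trailing digits. `to_single` is a hypothetical helper built on the standard `struct` module.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

value = 0.1
as_single = to_single(value)

# Both lines print 15 decimals, but only about 7 digits of the single are meaningful.
print(f"double: {value:.15f}")
print(f"single: {as_single:.15f}")
print(f"difference: {abs(value - as_single):.3e}")
```

The two printouts disagree from roughly the ninth decimal place on, which is far below what single precision can resolve in the first place.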
14. Re:Damn you people by zipn00b · 2013-07-29 08:37 · Score: 1
  
  I have no clue why but this thread is making me hungry..........
15. Re:Damn you people by uninformedLuddite · 2013-07-29 17:14 · Score: 1
  
  It's probably the cock punch. Just don't drive.
  
  --
  The new right fascists are bilingual. They speak English and Bullshit.
Have these people never heard of IEEE754???? by gweihir · 2013-07-28 01:30 · Score: 0, Flamebait

WTF are these amateurs doing? This is a solved problem and has been for several decades. Base float is solved. How to condition your computations so that order remains the same or does not impact the results is solved. Pathetic.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
1. Re:Have these people never heard of IEEE754???? by cnettel · 2013-07-28 01:35 · Score: 5, Insightful
  
  No, it isn't, when the system itself is not well-conditioned. And I bet you don't want your compiler to run a real codebase in an IEEE754 strict interpretation, as that will disallow almost any optimization. Even if you would allow it, then "trivial" rearrangements, that don't affect the theoretical analysis of stability, correctness or condition number, will still introduce different rounding perturbations. Perturb weather or some other systems, and you will get a completely different trajectory.
  That said, many applied fields, including meteorology, could benefit from more well-disciplined computational science approaches. But don't expect all that much of a difference.
2. Re:Have these people never heard of IEEE754???? by SlayerofGods · 2013-07-28 01:40 · Score: 3, Informative
  
  Yes... because that never rounds off numbers.
  https://en.wikipedia.org/wiki/IEEE_floating_point#Rounding_rules
  
  --
  
  Technology, the cause of and solution to all of life's problems.
3. Re:Have these people never heard of IEEE754???? by gweihir · 2013-07-28 01:42 · Score: 1, Insightful
  
  I was in particular thinking about the section on rounding in IEEE754. You are also overlooking that badly conditioned != behaves in a random fashion. My guess is they did not involve the numerics people in the optimization process, which is a complete fail when you know your problem is not well conditioned.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
4. Re:Have these people never heard of IEEE754???? by Goaway · 2013-07-28 01:42 · Score: 4, Insightful
  
  When floating point roundoff errors grow big enough to affect the outcome of the simulation, you have long since reached the point where you are not predicting anything useful any longer. It is not exactly a problem if the results differ at that point.
5. Re:Have these people never heard of IEEE754???? by cnettel · 2013-07-28 01:47 · Score: 5, Informative
  
  It doesn't help you that individual operations are rounded deterministically, if the order of your operations is non-deterministic. You cannot expect bit-identical results if you parallelize or allow any level of operation reordering. Even very well-written code might implement a reduce operation in different hierarchies depending on memory layout. Enforcing that all these things are done in exactly the same order, with full IEEE754 compliance, carries a significant performance cost. By taking numerical aspects into account, you can ensure that your result is not invalid or unreasonable. However, for a chaotic problem where a machine epsilon difference in input data might be enough for a macroscopically different end result, there is nothing you can do and still expect reasonable utilization of modern architectures.
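The reduce-order effect can be sketched in a few lines. This is a toy stand-in for a parallel reduction, not code from any real MPI or OpenMP library: `chunked_sum` is a hypothetical helper that splits the data into contiguous chunks (one per imagined worker), sums each chunk, then combines the partial sums. Explicit loops are used instead of the built-in `sum` so the additions happen in exactly the order shown.

```python
import math

def naive_sum(values):
    """Add left to right, one rounded addition per element."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, workers):
    """Sum `values` as `workers` contiguous partial sums, then combine the partials."""
    size = math.ceil(len(values) / workers)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)

data = [1e16, 1.0, -1e16, 1.0]

serial = chunked_sum(data, 1)    # one worker: plain left-to-right order
two_way = chunked_sum(data, 2)   # two workers: (1e16 + 1.0) + (-1e16 + 1.0)
exact = math.fsum(data)          # correctly rounded reference

print(serial, two_way, exact)
```

Every individual addition is rounded deterministically, yet the one-worker and two-worker results differ from each other and from the correctly rounded sum, purely because of how the work was split.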
6. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 01:49 · Score: 1
  
  That is the problem when people start compiling with things like -ffast-math.
7. Re:Have these people never heard of IEEE754???? by EvanED · 2013-07-28 01:58 · Score: 1
  
  I wish I still had my mod points from a few days ago, because this post deserves some.
8. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 02:06 · Score: 2, Interesting
  
  WTF are these amateurs doing? This is a solved problem and has been for several decades. Base float is solved. How to condition your computations so that order remains the same or does not impact the results is solved. Pathetic.
  I ran into this once when working on support for an AIX compiler - got a bug report that we were doing floating point wrong because the code gave different results on AIX than some other machine (HP I think). After looking into it, it turned out that the algorithm accumulated roundoff errors quite badly, and basically wasn't working right on _any_ platform, but would give different results due to slightly different handling of round-off on the different platforms.
  The problem is, this kind of code is very often written by scientists, who have most likely never heard of this issue, or forgot about it, or thought they handled it right but didn't - it's not their area of expertise, so it's not surprising if you think about it. I only hope that for engineering software that designs bridges, airplanes, etc, they realized that they better have it looked over by someone who knows what they are doing.
  BTW, this is one reason why I take all the global warming predictions with a big grain of salt - they are all based on computer simulations which are difficult if not impossible to validate, and given what I've seen, I don't trust the results from them at all.
9. Re:Have these people never heard of IEEE754???? by swilver · 2013-07-28 02:19 · Score: 1
  
  They didn't predict the rain correctly yesterday here, that's why I believe those predictions are obviously incorrect.
10. Re:Have these people never heard of IEEE754???? by korgitser · 2013-07-28 02:22 · Score: 2
  
  So are you saying that enforcing predictable and correct answers has a significant performance cost?
  
  --
  FCKGW 09F9 42
11. Re:Have these people never heard of IEEE754???? by amorsen · 2013-07-28 02:56 · Score: 2
  
  WTF are these amateurs doing?
  Enjoying decent performance. Doing weather forecasts slower than real time is a lot easier but somewhat less useful.
  My interpretation of the abstract (I cannot access the actual paper) is that they could not show that any particular compiler or architecture made the predictions any better, just different. In that case you just go with whichever runs fastest.
  
  --
  Finally! A year of moderation! Ready for 2019?
12. Re:Have these people never heard of IEEE754???? by amorsen · 2013-07-28 02:58 · Score: 3, Insightful
  
  When floating point roundoff errors grow big enough to affect the outcome of the simulation, you have long since reached the point where you are not predicting anything useful any longer.
  This is not true. If the model predicts rain at 2 pm two days out and different rounding moves it to 3 pm, that is still a useful forecast in a lot of cases.
  
  --
  Finally! A year of moderation! Ready for 2019?
13. Re:Have these people never heard of IEEE754???? by Goaway · 2013-07-28 03:01 · Score: 0
  
  If rounding error moves the time from 2 pm to 3 pm, then the errors in your input data will probably switch it between raining at all, and sunshine. You are already past the point where your model can predict anything at all.
14. Re:Have these people never heard of IEEE754???? by Xtifr · 2013-07-28 03:08 · Score: 5, Informative
  
  That would be a case of solving the wrong problem. Getting the exact same result every time doesn't much matter if that result is dominated by noise and rounding errors. In fact, the diverging results are a good thing, since, once they start to diverge, you know you've reached the point where you can no longer trust any of the results. If all the machines worked exactly the same, you could figure the same thing out, but it would require some very advanced mathematical analysis. With the build-the-machines-slightly-differently approach, the point where your results are becoming meaningless leaps out at you.
  Remember, the desired result here is not a set of identical numbers everywhere. It is an accurate simulation. Getting the same results everywhere would not make the simulation one bit more accurate. So really, this is a good thing.
15. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 03:23 · Score: 1
  
  WTF are these amateurs doing? This is a solved problem and has been for several decades. Base float is solved. How to condition your computations so that order remains the same or does not impact the results is solved. Pathetic.
  Go read up on chaotic systems, then come back to us.
16. Re:Have these people never heard of IEEE754???? by nogginthenog · 2013-07-28 03:31 · Score: 1
  
  But it does it in a consistent way across platforms.
17. Re:Have these people never heard of IEEE754???? by Rockoon · 2013-07-28 03:39 · Score: 5, Informative
  
  So are you saying that enforcing predictable and correct answers has a significant performance cost?
  He said nothing about "correct."
  
  And yes, enforcing predictable answers across toolchains and architectures has significant performance cost. Even ignoring optimizations, on the x87 FPU it means the compiler needs to emit a rounding operation after every single intermediate operation, because the x87 holds intermediates in 80-bit registers but IEEE754 specifies that all operations, even intermediate ones, are always to be performed as if rounded like 32-bit or 64-bit floats.
  
  When you get into the effects of order-of-operations type optimizations even on hardware that only uses 64-bit floats, you find that in most cases (x + y + z) != (z + y + x) even when the same floating point precision is present in each step of the calculation. Even things like common-divisor optimizations (if z is used as a divisor many times, compute 1/z a single time and multiply because multiplication is much faster than division) destroy the chance of equal outcome between compilers that will do it and compilers that will not.
  
  The best way to get insight into the issues is to become familiar with the single-digit-of-precision estimation technique.
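Both effects mentioned above are easy to reproduce in double precision. This is only an illustrative sketch in Python rather than compiler output; `divide` and `multiply_by_reciprocal` are hypothetical helpers that mimic what a compiler does with and without the reciprocal optimization.

```python
# Same three numbers, added in opposite orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = (0.3 + 0.2) + 0.1
print(left_to_right, right_to_left, left_to_right == right_to_left)

def divide(x, z):
    """Plain division: one correctly rounded operation."""
    return x / z

def multiply_by_reciprocal(x, z):
    """The 'optimized' form: 1/z is rounded once, then the product is rounded again."""
    inv = 1.0 / z
    return x * inv

# Small integer pairs for which the two forms disagree in the last bit.
mismatches = [
    (x, z)
    for x in range(1, 11)
    for z in range(1, 11)
    if divide(float(x), float(z)) != multiply_by_reciprocal(float(x), float(z))
]
print(mismatches)
```

A compiler that applies the reciprocal trick and one that does not will therefore hand slightly different inputs to every later step of the simulation.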
  
  --
  "His name was James Damore."
18. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 03:54 · Score: 5, Insightful
  
  Almost nothing you do with IEEE754 floating point numbers is correct in the strict mathematical sense. You can't even represent 0.1 (1/10) as an IEEE754 floating point number. There are entire series of lectures on the topic of scientific computing with floating point numbers. The errors are usually small enough that a few simple rules keep you safe (e.g., never compare floating point numbers for equality), but when you do many iterations, the errors can accumulate and mess with your results, and if in that case you do the calculations in a different order, the accumulated error will mess with your results in a different way. That's what's happening here.
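A short sketch of the two points above, using only the Python standard library: 0.1 is not stored exactly, and equality tests on accumulated floats are unreliable, so a tolerance-based comparison such as `math.isclose` is the usual workaround. The tolerance of 1e-9 is an arbitrary choice for illustration.

```python
import math
from decimal import Decimal
from fractions import Fraction

# The exact value actually stored for the literal 0.1.
stored = Decimal(0.1)
print(stored)

# Compare the stored binary value with the true fraction 1/10.
exactly_one_tenth = Fraction(0.1) == Fraction(1, 10)
print(exactly_one_tenth)

# Accumulate 0.1 ten times; the rounding errors do not cancel.
total = 0.0
for _ in range(10):
    total += 0.1

print(total == 1.0)                            # exact comparison
print(math.isclose(total, 1.0, rel_tol=1e-9))  # comparison with a tolerance
```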
19. Re:Have these people never heard of IEEE754???? by gweihir · 2013-07-28 03:58 · Score: 1
  
  I am not arguing about that, I know that this is true. What gets me is that this is a surprise to anyone. I mean, have they done optimization without error estimation? Have they completely ignored error when optimizing? You do not just calculate away on these problems and then check whether the results seem to match reality. The results are far too important for that amateur-level approach.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
20. Re:Have these people never heard of IEEE754???? by lightknight · 2013-07-28 03:59 · Score: 1
  
  Nice, but no. He's pointing out the obvious: Climate scientists are usually reliant on their own coding skills, which love it or hate it, are not quite on the same level (usually) as a Computer scientist / Software engineer.
  And yes, little errors do matter, since a little error in a preceding calculation may be used in the next series of calculations, and so on...the snowball effect.
  
  --
  I am John Hurt.
21. Re: Have these people never heard of IEEE754???? by alen · 2013-07-28 04:06 · Score: 0, Flamebait
  
  But what if one model predicts the end of the world due to higher temps and another one says the earth will absorb the heat
  Which one do you trust?
22. Re:Have these people never heard of IEEE754???? by kyrsjo · 2013-07-28 04:12 · Score: 5, Interesting
  
  *SNIP*
  BTW, this is one reason why I take all the global warming predictions with a big grain of salt - they are all based on computer simulations which are difficult if not impossible to validate, and given what I've seen, I don't trust the results from them at all.
  In the case of climate simulations, different models (both physics-wise and code-wise) are run with different computers on the same input data, and yield basically the same results.
  When simulating chaotic behaviour, very small differences can make a big difference in the outcome of your simulations. As an example, I'm currently working on simulations of sparks in vacuum, which is a "runaway" process. In this case, adding a single particle early in the simulations (before the spark actually happens) can change the time for the spark to appear by several tens of %. This also happens if we are running with different library versions (SuperLU, Lapack), different compilers, and different compiler flags. Once the spark happens, the behaviour is predictable and repeatable - but the time for it to happen, as the system is "balancing on the edge, before falling over", is quite random.
23. Re:Have these people never heard of IEEE754???? by kyrsjo · 2013-07-28 04:13 · Score: 1
  
  Please mod parent up!
24. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 05:01 · Score: 0
  
  Indeed. If they don't like the problem, the solution is not to introduce more accurate rounding, but to do the calculations in greater precision, assuming that their models are accurate enough that the results would be useful.
  This is what makes the fact that "-ffast-math" isn't a default setting in GCC kind of silly. If you honestly care how your floating point numbers are rounded, then you're not using floating point correctly.
25. Re:Have these people never heard of IEEE754???? by ckatko · 2013-07-28 05:02 · Score: 0
  
  But doesn't the fact that rounding errors give drastically different results mean it's ADDING error? As opposed to being able to see where the results change based on THE DATA and THE ALGORITHMS, we're now supposed to be fitting to meaningless rounding error? That would be like saying I have 5 significant figure data (123.45), I use integer data types, and now I say results are only possible with three significant figures (123) because "it shows where the diverging results start."
26. Re:Have these people never heard of IEEE754???? by Laxori666 · 2013-07-28 05:29 · Score: 1
  
  In the case of climate simulations, different models (both physics-wise and code-wise) are run with different computers on the same input data, and yield basically the same results.
  Yes, but how many of those basically same results were achieved by tweaking the model until the output was basically the same?
  
  The problem with climate science is that it's not experimental. You cannot run controlled experiments on the climate. Thus, the quality of climate science research is determined not by how accurately it models reality (since it's impossible to test), but by how accepted your research is by other climate scientists. This can easily lead to the point where the science becomes totally disconnected from the reality. Much like astronomy with its dark matter & energy & ridiculous constants to attempt to fit together the observed structure of the universe into a failing model.
27. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 05:30 · Score: 1
  
  Because it is in the nature of chaotic systems for two similar but different initial conditions to diverge exponentially, or in other words, for any accumulated error to grow exponentially later on, it is a losing battle. Of course you want to avoid any obvious error sources and do the best you can. But at some point, you could double the number of digits of precision you use, and only squeeze out a fraction of a day more in accurate predictions.
28. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 05:49 · Score: 1
  
  When a complete re-implementation, led by a physicist who cited such coding issues among his reasons for doubt and funded via the Koch brothers, whose views on global warming are well known, comes to the same positive conclusion as all the other models, then I would say that this is unlikely to be an issue. If these models were so flawed that rounding errors changed the results this much, they would not be coming to the same conclusion. You would expect such a study to be extra careful to avoid any errors that might tilt it in favour of human-caused global warming. (see https://en.wikipedia.org/wiki/Berkeley_Earth_Surface_Temperature)
29. Re: Have these people never heard of IEEE754???? by statusbar · 2013-07-28 05:59 · Score: 3, Interesting
  
  Good points - in fact in this case one can say that ALL of the calculations done by the different computer architectures are in fact wrong, to varying degrees. When floating point math is done without any rounding analysis, all bets are off. Measurements always have accuracies, and floating point math also adds its own inaccuracies.
  The Boost library can help: http://www.boost.org/doc/libs/1_54_0/libs/numeric/interval/doc/interval.htm
  Of course all this extra interval management costs in terms of development and performance. But what is the cost of having supercomputers coming up with answers with unknown accuracy?
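The idea behind interval arithmetic can be sketched without Boost. The following is a hypothetical, minimal `Interval` class in Python, not the Boost.Interval API: it approximates outward rounding by nudging each bound one floating-point step with `math.nextafter` (available from Python 3.9), and it only implements addition.

```python
import math
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    @staticmethod
    def around(x):
        """Enclose a float and its neighbours, covering the error of writing a decimal literal."""
        return Interval(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    def __add__(self, other):
        # Push each bound outward by one step so the true sum always stays inside.
        return Interval(
            math.nextafter(self.lo + other.lo, -math.inf),
            math.nextafter(self.hi + other.hi, math.inf),
        )

    def width(self):
        return self.hi - self.lo

# Add "0.1" ten times while tracking a guaranteed enclosure of the true result.
total = Interval(0.0, 0.0)
for _ in range(10):
    total = total + Interval.around(0.1)

print(total, total.width())
print(Fraction(total.lo) <= 1 <= Fraction(total.hi))
```

The answer comes with its own error bar: the exact result 1 is guaranteed to lie inside the interval, and the width shows how much accuracy the computation has lost. The price is at least twice the arithmetic per operation.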
  
  --
  ipv6 is my vpn
30. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 06:05 · Score: 1
  
  the Berkeley Earth Surface Temperature (BEST) project
  https://en.wikipedia.org/wiki/Berkeley_Earth_Surface_Temperature
  was done by and funded by people who wanted to show global warming wrong or already thought it was. No way would they tweak their model to fit the consensus of other climate researchers, yet they came to the same conclusion.
  You can run experiments without changing things: make a prediction based on the current state. Does it come true? If so, tested positive! This is not hard to understand and has been happening.
  (see http://www.rawstory.com/rs/2013/03/27/climate-change-models-predict-remarkably-accurate-results/ for this, the real article http://www.nature.com/ngeo/journal/v6/n4/full/ngeo1788.html is pay-walled unfortunately)
  Also you can test a climate model's ability to match reality: build it using a limited data set (e.g. 20k-1k years ago) and then test it on another (e.g. the last 1k years) to see whether they match. Again this is not a hard method to understand: if the new set does not match the predictions you're wrong, if it does then you are more likely correct. This method is standard across biology as well as several other fields; not ideal, but good enough.
31. Re:Have these people never heard of IEEE754???? by 93+Escort+Wagon · 2013-07-28 06:12 · Score: 1
  
  When floating point roundoff errors grow big enough to affect the outcome of the simulation, you have long since reached the point where you are not predicting anything useful any longer. It is not exactly a problem if the results differ at that point.
  Weather model forecasts are run as an ensemble, not a single run. Generally forecast modelers, like climate modelers, start with numerous small variations in the initial state, run the model multiple times, and average the results.
  Thing is, reading the abstract (since the article is paywalled) - it's not clear that the summary here is correct. To me, anyway, it seems like they may be saying that, in practice, ensemble forecasting solves this problem even though it's present in individual runs.
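A toy sketch of the ensemble idea, with the logistic map standing in for the forecast model (an assumption for illustration; real ensembles perturb full atmospheric states). The seed, the ensemble size of 20, and the perturbation size of 1e-8 are arbitrary choices, and `run_member` is a hypothetical helper.

```python
import random
import statistics

def run_member(x0, steps, r=4.0):
    """One ensemble member: iterate the toy model from a slightly perturbed start."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

rng = random.Random(2013)  # fixed seed so the run is repeatable
members = [run_member(0.3 + 1e-8 * rng.gauss(0.0, 1.0), 50) for _ in range(20)]

# Ensemble mean and spread (standard deviation across members) at each step.
mean = [statistics.fmean([m[t] for m in members]) for t in range(51)]
spread = [statistics.pstdev([m[t] for m in members]) for t in range(51)]

for t in (0, 5, 20, 35, 50):
    print(t, round(mean[t], 4), spread[t])
```

While the spread is tiny, all members tell the same story and the forecast is usable; once the spread is as large as the signal, the individual runs only say something about the statistics. A perturbation from a different compiler or library plays the same role as the perturbed starting value here.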
  
  --
  #DeleteChrome
32. Re:Have these people never heard of IEEE754???? by dfghjk · 2013-07-28 06:20 · Score: 4, Insightful
  
  "Remember, the desired result here is not a set of identical numbers everywhere. It is an accurate simulation."
  *An* accurate simulation is not the desired result either, an accurate model is. Without reproducibility you don't have a model.
  Reproducibility is important always.
33. Re:Have these people never heard of IEEE754???? by godrik · 2013-07-28 06:27 · Score: 1
  
  I guess you have never written actual simulation code. The IEEE754 standard tells you what happens and what kind of precision you get when applying basic operations to floats. But that does not guarantee anything at the higher level.
  The order of operations is extremely important if you do not want to lose precision. For instance, how do you sum a set of floats to achieve maximal precision? Hint: you do not start from the first one and iterate to the last one. You basically need to keep them sorted by increasing absolute value. Whenever you sum two of them, you need to insert the result back into the set and recurse until only one value is left. So essentially, if you want to sum a set of floats, you need to sort first, which induces a significant overhead.
  Now when you think about a complex simulation code, you might not have all the numbers available at once. So you do not actually know in which order the numbers should be summed up. Suppose you have a value that you previously computed as a sum, and later on you get a new value to add to it. To get the best precision you might need to redo the whole sum.
  Obviously, finding the best tradeoff between the precision of the computation, the amount of memory you need to keep, and the time of the calculation is challenging. That is why precision is often mostly ignored in these calculations. Also, most simulation codes pay a lot of attention to how much precision is lost during the computation.
  These problems are non-trivial.
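A rough comparison of summation strategies. This is a simplified sketch: `sorted_sum` only sorts by magnitude once and adds in that order (it does not re-insert partial results as described above), and Kahan compensated summation and `math.fsum` are swapped in as common alternatives that avoid the sort. The test data is made up to make the effect obvious.

```python
import math

def naive_sum(values):
    """Add in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total

def sorted_sum(values):
    """Add smallest magnitudes first so they can accumulate before meeting large terms."""
    return naive_sum(sorted(values, key=abs))

def kahan_sum(values):
    """Kahan compensated summation: carry the rounding error of each addition forward."""
    total = 0.0
    carry = 0.0
    for v in values:
        y = v - carry
        t = total + y
        carry = (t - total) - y
        total = t
    return total

# One large term followed by many terms that are each too small to register against it.
data = [1.0] + [1e-16] * 100_000

naive = naive_sum(data)
by_magnitude = sorted_sum(data)
kahan = kahan_sum(data)
exact = math.fsum(data)  # correctly rounded reference

print(naive, by_magnitude, kahan, exact)
```

In the naive order every small term is rounded away; the other three agree to within the last bit. The sort costs O(n log n) and needs all the data up front, which is exactly the tradeoff described above.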
34. Re:Have these people never heard of IEEE754???? by TubeSteak · 2013-07-28 06:40 · Score: 1
  
  My interpretation of the abstract (I cannot access the actual paper) is that they could not show that any particular compiler or architecture made the predictions any better, just different. In that case you just go with whichever runs fastest.
  Or you could, you know, compare the results with reality and go with whichever one is most accurate.
  
  --
  [Fuck Beta]
  o0t!
35. Re:Have these people never heard of IEEE754???? by Vintermann · 2013-07-28 06:47 · Score: 2
  
  Averaging the result makes sense for climate modeling. But for meteorological forecasts, it makes more sense to report the most commonly occurring prediction in the ensemble, plus something about risks if you're talking about dangerous weather.
  
  --
  xkcd is not in the sudoers file. This incident will be reported.
36. Re:Have these people never heard of IEEE754???? by Vintermann · 2013-07-28 06:55 · Score: 1
  
  Climate predictions are not vulnerable to rounding errors the way meteorological predictions are. Meteorologists are solving an initial value problem, climate scientists are solving a boundary value problem.
  You can make simple climate models that do not rely on computer simulations (energy budget calculations of various sorts), and those are certainly enough to predict big problems from anthropogenic global warming. Heavy-duty numerical climate models aren't used to "prove" global warming, they're used to get better estimates for various things.
  
  --
  xkcd is not in the sudoers file. This incident will be reported.
37. Re:Have these people never heard of IEEE754???? by Vintermann · 2013-07-28 06:59 · Score: 1
  
  Propagation of rounding errors is not a big problem in climate modeling. These models are run thousands of times in order to establish averages, very different from meteorological models (although they are basically the same!) which are run many times to find the most likely specific events.
  
  --
  xkcd is not in the sudoers file. This incident will be reported.
38. Re:Have these people never heard of IEEE754???? by amorsen · 2013-07-28 07:01 · Score: 3, Funny
  
  It is so unfortunate that academics do not have the wisdom of Slashdot available before they submit papers. Alas, that is the reality they have to live with.
  
  --
  Finally! A year of moderation! Ready for 2019?
39. Re:Have these people never heard of IEEE754???? by Vintermann · 2013-07-28 07:09 · Score: 1
  
  Yes, it is possible to estimate how well a climate model models reality. The parameters that vary in climate models are not unconstrained, but constrained by physics (experimental evidence). If your climate model accurately hindcasts the climate developments of the 20th century (say), but the parameters are at the extreme range of what's plausible from experimental physics, then it probably isn't a very good model.
  Not all climate scientists focus on general circulation models either. If your particular GCM isn't accepted by climate scientists, it's probably because it has trouble accounting for things we know from other sub-disciplines of climate science.
  These are pretty old, discredited talking points.
  
  --
  xkcd is not in the sudoers file. This incident will be reported.
40. Re: Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 07:12 · Score: 1
  
  does it really matter what am i meant to do with that information exactly:
  A) panic
  B) picnic
41. Re: Have these people never heard of IEEE754???? by kenh · 2013-07-28 08:00 · Score: 1
  
  Yes, guessing always has been, and always will be, easier than deriving the correct answer.
  
  --
  Ken
42. Re: Have these people never heard of IEEE754???? by dylan_- · 2013-07-28 08:12 · Score: 3, Insightful
  
  another one says the earth will absorb the heat Which one do you trust?
  I think I'd have to go with the one that doesn't redefine "absorb" to mean "magically disappear".
  
  --
  Igor Presnyakov stole my hat
43. Re:Have these people never heard of IEEE754???? by Laxori666 · 2013-07-28 08:20 · Score: 1
  
  Yes, it is possible to estimate how well a climate model models reality.
  It's possible to make a climate model, then wait for reality to happen, then see how well they matched, yes. But you can't run experiments to see if your model is sound. And climate models do diverge from reality as reality happens, see this graph for example.
  
  The parameters that vary in climate models are not unconstrained, but constrained by physics (experimental evidence). If your climate model accurately hindcasts the climate developments of the 20th century (say), but the parameters are at the extreme range of what's plausible from experimental physics, then it probably isn't a very good model.
  That hasn't stopped astronomers from positing ridiculous things such as dark matter and dark energy.
44. Re:Have these people never heard of IEEE754???? by Laxori666 · 2013-07-28 08:23 · Score: 1
  
  the Berkeley Earth Surface Temperature (BEST) project https://en.wikipedia.org/wiki/Berkeley_Earth_Surface_Temperature was done by and funded by people who wanted to show global warming wrong or already thought it was, no way would they tweak their model to fit the consensus of other climate researchers yet they came to the same conclusion.
  They didn't make a model, they measured temperatures. I agree that you can measure temperatures accurately. From skimming the article it seems they discredited the 'urban heat bias' hypothesis which is interesting to know.
  
  Also you can test climate models' ability to match reality: make them using a limited data set (e.g. 20k-1k years ago) and then test them on another (e.g. the last 1k years) to see whether they match. Again, this is not a hard method to understand: if the new set does not match predictions you're wrong, if it does then you are more likely correct. This method is standard across biology as well as several other fields; not ideal, but good enough.
  That doesn't show your model matches reality, it shows that you managed to make a complicated mathematical formula that managed to use some data points to generate some other data points.
45. Re:Have these people never heard of IEEE754???? by Rich0 · 2013-07-28 13:33 · Score: 1
  
  Remember, the desired result here is not a set of identical numbers everywhere. It is an accurate simulation.
  Well, I'd say a useful simulation, which entails some reasonable level of accuracy, but speed and cost are also important.
  It isn't helpful if an algorithm gives you a slightly better simulation of tomorrow's weather if it takes a week to run. If your algorithm is faster or less expensive to run then you can run it more often, or use the saved computer time to run other models. Having an ensemble of models or more frequent updates might be more useful to forecasters than having one model that stays coherent for an extra 30 minutes out. The weather is so chaotic that it gets exponentially more expensive to predict further out.
46. Re:Have these people never heard of IEEE754???? by Xtifr · 2013-07-28 14:07 · Score: 2
  
  But tweaking the FP to ensure reproducibility doesn't improve the accuracy of the model. In fact, it hides the inaccuracies of the model. So, while I completely agree with you in principle, I think that what you said has no bearing on this particular case.
47. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 14:50 · Score: 0
  
  That hasn't stopped astronomers from positing ridiculous things such as dark matter and dark energy.
  Gee, if only they had considered other ideas like alternative gravity theories. Oh wait, there are whole research groups dedicated to that, but they have yet to find any alternatives that explain observations anywhere near as well.
48. Re:Have these people never heard of IEEE754???? by mveloso · 2013-07-28 15:03 · Score: 1
  
  "In the case of climate simulations, different models (both physics-wise and code-wise) are run with different computers on the same input data, and yield basically the same results."
  Maybe that means that their models are bad and they're all fudging their data?
49. Re:Have these people never heard of IEEE754???? by Michael+Woodhams · 2013-07-28 15:11 · Score: 1
  
  Much more useful than running your simulation on multiple different supercomputers is to run it multiple times on one supercomputer, but with your input variables perturbed slightly on each run. If you randomly perturb your input measurements proportional to the standard error in those measurements, then the differences between runs will directly tell you how accurate your forecast is. (This should work independent of whether inaccuracy is dominated by initial condition inaccuracy, or by round off. It doesn't help so much if your model is bad.) You probably don't need to do this for every single forecast. After you've done it often enough for different weather conditions, you should get to know what your accuracy profile looks like.
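  The idea can be sketched on a toy system. The code below is not a weather model: it assumes the logistic map x -> 4x(1-x) as a stand-in chaotic "model", and it spaces the perturbations evenly instead of drawing them at random from the measurement error, purely so the result is repeatable. The names toy_model_step and ensemble_variance are made up for this sketch.

```c
#include <assert.h>

/* Toy chaotic "model": one step of the logistic map. It stands in for a
 * real forecast model only to show how ensemble spread is measured. */
double toy_model_step(double x) { return 4.0 * x * (1.0 - x); }

/* Sample variance across the ensemble after `steps` model steps.
 * Member k starts from x0 + (k - members/2) * delta. */
double ensemble_variance(double x0, double delta, int members, int steps) {
    assert(members > 1 && members <= 64 && steps >= 0);
    double state[64];
    for (int k = 0; k < members; k++) {
        state[k] = x0 + (k - members / 2) * delta;   /* perturbed input */
        for (int s = 0; s < steps; s++) state[k] = toy_model_step(state[k]);
    }
    double mean = 0.0;
    for (int k = 0; k < members; k++) mean += state[k];
    mean /= members;
    double var = 0.0;
    for (int k = 0; k < members; k++)
        var += (state[k] - mean) * (state[k] - mean);
    return var / (members - 1);
}
```

  With 11 members spaced 1e-10 apart, the variance starts near 1e-19 and grows by many orders of magnitude within 100 steps; how fast the spread grows is the accuracy profile the comment describes.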
  
  --
  Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
50. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 20:44 · Score: 0
  
  What we need is another "never use goto" (Which by the way creates much more readable code and better output in some circumstances.)
  I suggest "Never use float or double as an accumulator.", "Never use float or double in a loop." or possibly "Never use float or double with addition."
  It is even harder to master floats than it is to master when and where to use gotos, yet people tend to sprinkle their code with them whenever they encounter a value that might be fractional.
  Another thing that could solve these programming errors would be 256bit integer calculations. That is sufficient to address the observable universe with planck length precision but I guess there still will be this dude that uses floats for whatever reason.
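  The 256-bit claim can be checked with a few lines of arithmetic. The sketch below assumes a diameter of about 8.8e26 m for the observable universe and a Planck length of about 1.616e-35 m; both figures are supplied here and do not come from the comment, and the function name bits_needed is made up.

```c
#include <assert.h>

/* Number of bits needed to store extent / resolution (the count of cells
 * along one axis) as an unsigned integer. */
int bits_needed(double extent, double resolution) {
    assert(extent > 0.0 && resolution > 0.0);
    double cells = extent / resolution;
    int bits = 0;
    while (cells >= 1.0) {   /* each halving accounts for one bit */
        cells /= 2.0;
        bits++;
    }
    return bits;
}
```

  Under those assumptions bits_needed(8.8e26, 1.616e-35) comes to 206 bits per axis, so a 256-bit integer coordinate has room to spare. It says nothing about whether the arithmetic on such coordinates would be fast enough.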
51. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 21:02 · Score: 0
  
  256 bit integer math is sufficient to address the observable universe with planck length precision.
  Unless you intend to run your simulation from components you found in a dumpster.
  Climate research and weather research are individually large enough fields to justify developing a customized CPU but considering that the x86 architecture already has support for 256bit integer operations through the AVX instruction set this seems more to be a problem of languages not supporting it in a good way.
52. Re:Have these people never heard of IEEE754???? by nadaou · 2013-07-28 21:30 · Score: 1
  
  > In that case you just go with whichever runs fastest.
  Not quite, optimizing to "result = 1" will be fastest, but obviously not correct. If you know -Ofast will degrade numerics compared to -O0 you do know something.
  So you do a sensitivity analysis and learn what parts of the results you can trust and what parts you can't.
  Or you re-run your forecast models from 10 days ago with what-you-knew 10 days ago and see which ones got closest to reality. After doing those hindcasts for a while you can build up some confidence about model performance.
  That doesn't work so well when trying to model a 1 in 500 year storm which you have no hindcast experience with, but it's better than nothing.
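  A minimal sketch of the hindcast comparison, assuming each model's old forecast and the later observations are available as plain arrays. Mean squared error is used as the score only for illustration (real verification uses a range of skill measures), and the names hindcast_mse and best_hindcast are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Mean squared error of one hindcast against what was later observed. */
double hindcast_mse(const double *forecast, const double *observed, size_t n) {
    assert(forecast != NULL && observed != NULL && n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        double err = forecast[i] - observed[i];
        total += err * err;
    }
    return total / (double)n;
}

/* Index of the hindcast with the lowest error. `forecasts` is a flat,
 * row-major array holding n_models rows of n_points values each. */
size_t best_hindcast(const double *forecasts, const double *observed,
                     size_t n_models, size_t n_points) {
    assert(n_models > 0);
    size_t best = 0;
    double best_mse = hindcast_mse(forecasts, observed, n_points);
    for (size_t m = 1; m < n_models; m++) {
        double mse = hindcast_mse(forecasts + m * n_points, observed, n_points);
        if (mse < best_mse) {
            best_mse = mse;
            best = m;
        }
    }
    return best;
}
```

  Repeating this over many past start dates, rather than one, is what builds the confidence described above.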
  
  --
  ~.~
  I'm a peripheral visionary.
53. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-28 21:48 · Score: 0
  
  Ask a sociologist. They're lucky to get 2 digits identical. GP has a point: the desired result set is not a set of identical numbers, and "reproducibility" does not imply you will get them. "Similar enough" is sufficient, with a definition that varies from case to case.
54. Re: Have these people never heard of IEEE754???? by semi-extrinsic · 2013-07-28 21:54 · Score: 1
  
  Remembering that supercomputers span a large range of architectures and compilers, in particular when accelerators are employed, we quote the Patriarch Torvalds:
  
  Anyone who claims Boost is stable and portable is so full of BS it's not even funny.
  
  --
  for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
55. Re: Have these people never heard of IEEE754???? by statusbar · 2013-07-29 00:24 · Score: 1
  
  In the case of the boost-interval library, the link I posted has a very clear warning about that; so I don't understand why that quote is relevant here. This warning shows that "floating point is hard" and that is MORE reason to be careful with your intervals!
  
  Warning! Guaranteed interval arithmetic for native floating-point format is not supported on every combination of processor, operating system, and compiler. This is a list of systems known to work correctly when using interval<float> and interval<double> with basic arithmetic operators.
  x86-like hardware is supported by the library with GCC, Visual C++ ≥ 7.1, Intel compiler (≥ 8 on Windows), CodeWarrior (≥ 9), as long as the traditional x87 floating-point unit is used for floating-point computations (no -mfpmath=sse2 support).
  Sparc hardware is supported with GCC and Sun compiler.
  PowerPC hardware is supported with GCC and CodeWarrior, when floating-point computations are not done with the Altivec unit.
  Alpha hardware is supported with GCC, except maybe for the square root. The options -mfp-rounding-mode=d -mieee have to be used.
  The previous list is not exhaustive. And even if a system does not provide guaranteed computations for hardware floating-point types, the interval library is still usable with user-defined types and for doing box arithmetic.
  
  --
  ipv6 is my vpn
56. Re:Have these people never heard of IEEE754???? by OakDragon · 2013-07-29 01:52 · Score: 1
  
  They didn't predict the rain correctly yesterday here, that's why I believe those predictions are obviously incorrect.
  And there's really no excuse. I can tell you yesterday's weather with 100% accuracy.*
  * - Unless I didn't write it down.
  
  --
  Dark Reflection
57. Re:Have these people never heard of IEEE754???? by Rich0 · 2013-07-29 03:28 · Score: 1
  
  256 bit integer math is sufficient to address the observable universe with planck length precision.
  Unless you intend to run your simulation from components you found in a dumpster.
  Climate research and weather research are individually large enough fields to justify developing a customized CPU but considering that the x86 architecture already has support for 256bit integer operations through the AVX instruction set this seems more to be a problem of languages not supporting it in a good way.
  Sure, but in the same time that you can process one 256-bit integer you could use SIMD to process several shorter integers.
  If you're running a simulation I suspect that processing more points is probably more important than processing each individual point down to the planck length.
  So now we're just arguing over where to draw the line. Oh, and all of this is setting aside all the arguments over order-of-operations and such. If you want a fully deterministic process it probably means far more locking/synchronization/etc. If you can get a perfect model run in 12 hours, and a good-enough one in 2 hours, most would take the latter (and 5 more to go along with it).
58. Re:Have these people never heard of IEEE754???? by amorsen · 2013-07-29 04:34 · Score: 1
  
  You didn't read what I wrote.
  To repeat myself:
  
  they could not show that any particular compiler or architecture made the predictions any better, just different.
  If one of the architectures/compilers had come up with a constant "result = 1" or indeed any degradation of model performance at all, they would have been able to make much stronger statements in the abstract.
  
  --
  Finally! A year of moderation! Ready for 2019?
59. Re:Have these people never heard of IEEE754???? by Anonymous Coward · 2013-07-29 06:22 · Score: 0
  
  Hi, atmospheric science PhD here who does research with ensembles. For weather forecasts, it is actually quite easy to demonstrate that the average of an ensemble (the consensus) is more skillful than any one deterministic forecast on average. If the forecasts have a gaussian distribution, which they often do, then of course this will also be the mode.
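  One part of this can be checked directly: for squared error, the error of the ensemble mean can never exceed the average of the members' individual errors, because squaring is convex. The sketch below computes both quantities for a single verification value; the function names are made up, and it does not show that the mean beats every member on every day, only that it beats the typical member.

```c
#include <assert.h>
#include <stddef.h>

/* Squared error of the ensemble mean for one observed value. */
double ensemble_mean_sq_error(const double *members, size_t n, double observed) {
    assert(members != NULL && n > 0);
    double mean = 0.0;
    for (size_t i = 0; i < n; i++) mean += members[i];
    mean /= (double)n;
    return (mean - observed) * (mean - observed);
}

/* Average of the individual members' squared errors. */
double average_member_sq_error(const double *members, size_t n, double observed) {
    assert(members != NULL && n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += (members[i] - observed) * (members[i] - observed);
    return total / (double)n;
}
```

  For members 9, 11, 13 and 10 against an observed 11, the ensemble mean is 10.75 with squared error 0.0625, while the members average a squared error of 2.25.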
60. Re: Have these people never heard of IEEE754???? by semi-extrinsic · 2013-07-29 09:14 · Score: 1
  
  You don't understand why your proposed solution is bad when it has a negative impact on performance and is not portable between different supercomputers? Where do I even begin...
  
  Let's put it this way: your university/research organization will eventually buy a new supercomputer. It may have a different architecture from the old one. Do you then rewrite your 20,000 SLOC code which is using ? Do you really imagine anyone is going to pay you for that?
  
  --
  for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
61. Re: Have these people never heard of IEEE754???? by statusbar · 2013-07-29 09:33 · Score: 1
  
  The reality is that the original code was not portable between supercomputers and already comes up with incorrect answers but yet people didn't realize it until now! Do you realize that this means that all the weather forecasts from the first supercomputer implementation of this program are now known to be wrong too? What is the cost of having answers that have unknown accuracy?
  You don't have to use Boost - but you HAVE TO manage your intervals and accuracy and rounding errors! If you don't then you can not know what the accuracy is of your answers! Note this has relevance beyond supercomputing too - Digital Signal Processing of Audio also is adversely affected by people programming floating point filters incorrectly, causing noise artifacts and inharmonic distortion due to improper noise shaping and bad coefficient rounding and fading.
  Jeff
  
  --
  ipv6 is my vpn
62. Re: Have these people never heard of IEEE754???? by semi-extrinsic · 2013-07-29 09:55 · Score: 1
  
  The reality is that the original code was not portable between supercomputers and already comes up with incorrect answers but yet people didn't realize it until now!
  
  Ah, jeez. If you think this is the first time someone noticed that different computers give different results, I would like to introduce you to Edward Lorenz, a prominent physicist in the 1960s, and the field of science called Chaos Theory which he fathered. There is nothing new about the fact that this happens. It is taught in Computational Physics 101. The novelty of the study reported is that it quantifies the variation on different supercomputers in a comprehensive way.
  
  And you may want to look up the definition of portable. If you take "portable" to mean "gives exactly the same results on all computers evar", then there are no portable programs in the entire world. By Boost not being portable, I mean that it doesn't even run on your new SGI Altix where there is no GCC compiler available.
  
  --
  for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
63. Re: Have these people never heard of IEEE754???? by statusbar · 2013-08-04 06:08 · Score: 1
  
  Ah, jeez. If you think this is the first time someone noticed that different computers give different results,
  Well, apparently the people who wrote the software that this whole article was about did not know that their software was broken because of this. http://journals.ametsoc.org/doi/abs/10.1175/MWR-D-12-00352.1
  
  --
  ipv6 is my vpn
It is the butterfly effect. by 140Mandak262Jamuna · 2013-07-28 01:34 · Score: 4, Interesting

Almost all CFD (Computational Fluid Dynamics) simulations use time marching of the Navier-Stokes equations. Despite being very nonlinear and very hard, one great thing about them is that they naturally parallelize very well. They partition the solution domain into many subdomains and distribute the finite volume mesh associated with each subdomain to a different node. Each mesh is also parallelized using GPUs. At the end of the day these threads complete execution at slightly different times and post updates asynchronously. So even if you use the same OS and the same basic cluster, if you run it twice you get two different results if you run it far enough, like 10 days. I am not at all surprised that if you change the OS, the architecture, the endianness, the math processor, or the GPU brand, the solutions differ a lot when you make a 10-day forecast.
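The reason arrival order matters is that floating-point addition is not associative. A minimal sketch, assuming IEEE 754 doubles evaluated in double precision (not x87 extended precision); the function name combine_partials and the example values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Adds partial results in the order given by `arrival`, mimicking worker
 * threads that post their contributions at different times. */
double combine_partials(const double *partials, const size_t *arrival, size_t n) {
    assert(partials != NULL && arrival != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += partials[arrival[i]];
    return total;
}
```

With partials {1e17, -1e17, 1.0}, the order 0, 1, 2 gives 1.0 while the order 2, 1, 0 gives 0.0, because 1.0 is smaller than half the spacing between doubles near 1e17 and is lost when it is added first. Real runs differ by far less per step, but a chaotic model amplifies the difference.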

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
1. Re:It is the butterfly effect. by CODiNE · 2013-07-28 02:40 · Score: 0
  
  Damn. Keep Ashton Kutcher AWAY from my computer!
  
  --
  Cwm, fjord-bank glyphs vext quiz
2. Re:It is the butterfly effect. by bill_mcgonigle · 2013-07-28 05:11 · Score: 1
  
  Coincidentally, I went to a presentation a couple weeks ago that largely focused on HPC CFD work. The presenter's company doesn't use GPU's because things like memory bandwidth are more important, but that aside, the thing that surprised me the most was that the simulations are not independently clocked (self-clocking) - they use the hardware clock, so things like latency and state are extremely important. Self-clocking would be too expensive with current hardware. Depending on the HPC cluster setup (and even things like BIOS versions matching on different nodes) the simulation clocks can drift and ruin the simulation. It's very exacting work in the current state of the art, and very easy to get wrong.
  Of course, now the weatherman can blame the sysadmin...
  
  --
  My God, it's Full of Source!
  OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
3. Re:It is the butterfly effect. by 4wdloop · 2013-07-28 05:26 · Score: 1
  
  So since the CFD N-S equations are not (presently?) solvable, and simulation is therefore used to approximate the answer, would that mean, all imperfections of computation aside, that exact weather forecasting is not possible?
  I suppose that since the N-S equations do not (yet?) have mathematical proofs of existence and smoothness of solutions, we cannot yet predict whether precise weather forecasting is even theoretically possible?
  
  --
  4wdloop
4. Re:It is the butterfly effect. by Anonymous Coward · 2013-07-28 06:35 · Score: 0
  
  We've known for decades that exact weather forecasting is NOT possible. You cannot get the initial conditions exact, and we know that such a system will cause any initial error to grow exponentially. This would be true even if computers could do exact, perfect math with real numbers instead of some limited binary representation. Even if you had a 1 meter grid of weather stations, you would have trouble getting beyond a couple weeks.
5. Re:It is the butterfly effect. by Anonymous Coward · 2013-07-28 08:57 · Score: 0
  
  upvoted for truth
6. Re:It is the butterfly effect. by Anonymous Coward · 2013-07-28 20:17 · Score: 0
  
  So basically, it's unscientific... unreproducible.
  Nice excuse to earn bigger salaries, but less scientific benefits. How do you even debug this? It's more faith than science.
7. Re:It is the butterfly effect. by 140Mandak262Jamuna · 2013-07-29 02:15 · Score: 1
  
  Navier-Stokes equations come under continuum mechanics. The fundamental assumption is that even at infinitely small control volumes the field quantities like mass, momentum and pressure are continuous. But we know that fluids are made up of molecules. These quantities are not really continuous when the control volumes are comparable to atomic/molecular dimensions. These molecules undergo random motion induced by temperature (Brownian motion). This is supposed to be the fundamental reason for turbulence. So you cannot use N-S to predict the fluid flow far into the future. These things were known before computers were even invented.
  But if, instead of expecting a perfect weather forecast, you are willing to settle for reasonably accurate forecasts over reasonable periods of time, then yes, it can be done.
  Another problem with weather forecasting is communicating the massive data that emerges from these simulations to the masses. Listeners want to know if they have to pack an umbrella. How are you going to tell thousands of users with million different daily routines, whether it would rain long enough when and where they are outside vehicles or buildings to require an umbrella, over a few sentences in radio?
  
  --
  sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
I've seen this before by slashgordo. · 2013-07-28 01:38 · Score: 5, Interesting

When doing spice simulations of a circuit many years ago, we ran across one interesting feature. When using the exact same inputs and the exact same executable, the sim would converge and run on one machine, but it would fail to converge on another. It just happened that one of the machines was an Intel server, and the other was an AMD, and we attributed it to ever so slightly different round off errors between the floating point implementation of the two. It didn't help that we were trying to simulate a bad circuit design that was on the hairy edge of convergence, but it was eye opening that you could not guarantee 100% identical results between different hardware platforms.
1. Re:I've seen this before by Livius · 2013-07-28 02:12 · Score: 4, Funny
  
  Well, Arrakis melange is a pretty strong drug, so consistency in spice simulations is probably a little too much to expect.
  (Yes, I know the parent really meant SPICE.)
2. Re:I've seen this before by rossdee · 2013-07-28 02:14 · Score: 4, Funny
  
  "When doing spice simulations "
  Weather forecasting on Arrakis is somewhat tricky, not only do you have the large storms, but also giant sandworms.
  (And sabotage by the Fremen)
3. Re:I've seen this before by Anonymous Coward · 2013-07-28 02:26 · Score: 1
  
  Yes, this is known:
  Deep inside your CPU's pipelining circuits, there is a possibility of reordering the execution of apparently innocent operations, each conforming to IEEE754 standard. Reordering itself can trigger interesting ripples as regards error bounds across operations, versus time.
  Especially, fellows have developed a slight distrust for the Intel compiler because it has a tendency to optimize so much in favor of speed, that these effects may be even more pronounced. So, the advice is: do a few alternative builds and compare runs' results, just to keep the most basic risks at bay. Alternatively, you may go at ensemble runs (weather & climate people often do), in order to decimate some common noise errors. Overall, be aware that computational science may not be as easy as it seems at first sight.
4. Re:I've seen this before by Anonymous Coward · 2013-07-28 03:56 · Score: 0
  
  And the use of nuclear weapons for landscaping.
5. Re:I've seen this before by mrbester · 2013-07-28 04:40 · Score: 1
  
  Maybe he's South African and was typing up a dictated post
  
  --
  "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
6. Re:I've seen this before by Cassini2 · 2013-07-28 05:09 · Score: 4, Insightful
  
  This often happens when the simulation results are influenced by variations in the accuracy of the built-in functions. Every floating point unit (FPU) returns an approximation of the correct result to an arbitrary level of accuracy, and the accuracy level of these results varies considerably when built-in functions like sqrt(), sin(), cos(), ln(), and exp() are considered. Normally, the accuracy of these results is pretty high. However, the initial 8087 FPU hardware from Intel was pretty old, and it necessarily made approximations.
  At one point, Cyrix released an 80287 clone FPU that was faster and more accurate than Intel's 80287 equivalent. This broke many programs. Since then, Intel and AMD have been developing FPUs that are compatible with the 8087, ideally at least as accurate, and much faster. The GPU vendors have been doing something similar, however in video games, speed is more important than accuracy. For compatibility reasons (CPUs) and speed reasons (GPUs), vendors have focused on returning fast, compatible and reasonably accurate results.
  In terms of accuracy, the results of the key transcendental functions, exponential functions, logarithmic functions, and the sqrt function should be viewed with suspicion. At high-accuracy levels, the least-significant bits of the results may vary considerably between processor generations, and CPU/GPU vendors. Additionally, slight differences in the results of double-precision floating point to 64-bit integer conversion functions can be detected, especially when 80-bit intermediate values are considered. Given these approximations, getting repeatable results for accuracy-sensitive simulations is tough.
  It is likely that the article's weather simulations and the parent poster's simulations have differing results due to the approximations in built-in functions. Inaccuracies in the built-in functions are often much more significant than the differences due to round-off errors.
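  To see how far two machines disagree in those least-significant bits, one option is to count the representable doubles between their results (the distance in ULPs). This is a sketch assuming 64-bit IEEE 754 doubles; the names ordered_bits and ulp_distance are made up, and NaN inputs are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(int64_t), "expects 64-bit doubles");

/* Maps a double to an integer that increases monotonically with its value. */
static int64_t ordered_bits(double x) {
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);           /* reinterpret without UB */
    return bits < 0 ? INT64_MIN - bits : bits; /* flip the negative range */
}

/* Number of representable steps from a to b. 0 means the same value,
 * with +0.0 and -0.0 counted as equal. */
uint64_t ulp_distance(double a, double b) {
    assert(a == a && b == b);   /* reject NaN */
    int64_t oa = ordered_bits(a);
    int64_t ob = ordered_bits(b);
    return oa >= ob ? (uint64_t)oa - (uint64_t)ob
                    : (uint64_t)ob - (uint64_t)oa;
}
```

  Logging ulp_distance between, say, the sin() or exp() results of two builds for the same inputs shows whether they differ in the last bit or two, or by more than rounding alone would explain.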
7. Re:I've seen this before by AmiMoJo · 2013-07-28 06:18 · Score: 3, Insightful
  
  In theory both should have been the same, if they stuck rigidly to the IEEE specifications. There may be other explanations though.
  Sometimes compilers create multiple code paths optimized for different CPU architectures. One might use SSE4 and be optimized for Intel CPUs, another might use AMD extensions and be tuned for performance on their hardware. There was actually some controversy when it was discovered that Intel's compiler disabled code paths that would execute quickly on AMD CPUs just because they were not Intel CPUs. Anyway, the point is that perhaps one machine was using different code and different super-scalar instructions, which operate at different word lengths. Compilers sometimes extend a 64 bit double to 80 bit super-scalar registers, for example.
  Or one machine was a Pentium. Intel will never live that one down.
  
  --
  const int one = 65536; (Silvermoon, Texture.cs)
  SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
8. Re:I've seen this before by matfud · 2013-07-28 07:35 · Score: 3, Interesting
  
  Trig functions are nasty. CPU's (FPU's) tend to use lookup tables to get a starting point and then iteratively refine that to provide more accuracy. How they do this depends on the precision and rounding of the intermediate steps and how many iterations they will undertake. Very few FPUs produce IEEE compliant results for trig. Multiple simple math operations also tend to be rounded and kept at different precisions on different processors (let alone instruction reordering done by the cpu and compiler).
  GPU's are great performance wise at float (sometimes double) math but tend to be poor at giving the result you expect. Now IEEE-754 does not remove these issues it just ensures that the issues are always the same.
  It is why languages like Java have java.lang.Math and java.lang.StrictMath for trig and the strictfp keyword for float and double primitives. (Math tends to just delegate to StrictMath but does not have to). strictfp can kill performance as a lot of fixups have to be done in software in the better cases (also hotspot compilation can be hindered by it) and in the worst cases the entire simple operation (+,-,*,/) has to be performed in software.
9. Re:I've seen this before by matfud · 2013-07-28 07:42 · Score: 1
  
  As an additional comment:
  There are reasons why people will pay a lot of money to use a POWER 6 and later processors
10. Re:I've seen this before by Anonymous Coward · 2013-07-28 15:41 · Score: 0
  
  I've seen this before too. It was a hellish bug to track down! Back in the day, ancient version of Linux and GCC... (I might add, the folks at the FSF fixed it awfully fast. I wish I could get turn around times like that from commercial vendors!)
  
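  #include <stdio.h>

  #define VALUE .01

  int main(void)
  {
      /* start every accumulator at zero */
      double a = 0, b = 0, c = 0, d = 0;
      int i;
      for ( i=0; 3>i; i++ )
      {
          a += VALUE;              /* two separate additions */
          a += VALUE;
          b += VALUE + VALUE;      /* VALUE + VALUE is summed first, then added */
          c = c + VALUE;
          c = c + VALUE;
          d = d + VALUE + VALUE;   /* parsed as (d + VALUE) + VALUE */
      }
      printf( "Results:\n%35.30lf\n%35.30lf\n%35.30lf\n%35.30lf\n", a, b, c, d);
      return 0;
  }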
  Results:
  0.060000000000000004718447854657
  0.059999999999999997779553950750
  0.060000000000000004718447854657
  0.060000000000000004718447854657
11. Re:I've seen this before by Muad'Dave · 2013-07-29 02:12 · Score: 1
  
  Let me guess, the rounding difference between b = b + (VALUE + VALUE) and d = (d + VALUE) + VALUE ?
  For what it's worth, Java returns the following using the default double type and with the strictfp keyword on the class:
  Results:
  0.060000000000000005000000000000
  0.060000000000000000000000000000
  0.060000000000000005000000000000
  0.060000000000000005000000000000
  
  --
  Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
12. Re:I've seen this before by tlhIngan · 2013-07-29 03:48 · Score: 1
  
  This often happens when the simulation results are influenced by variations in the accuracy of the built-in functions. Every floating point unit (FPU) returns an approximation of the correct result to an arbitrary level of accuracy, and the accuracy level of these results varies considerably when built-in functions like sqrt(), sin(), cos(), ln(), and exp() are considered. Normally, the accuracy of these results is pretty high. However, the initial 8087 FPU hardware from Intel was pretty old, and it necessarily made approximations.
  
  Floating point arithmetic is in itself a bunch of approximations. It's the least precise computation you can do, and if you're not careful, errors accumulate rather rapidly.
  In fact, most people using floating point probably don't realize that the order in which they do computations matters, nor that, despite there being a standard, most hardware floating point units are NOT fully compliant.
  This is probably a lot harder on the sciences (whether it's computer, weather, climate, whatever) who assume their computation hardware is "perfect" because understanding the low level is more an implementation concern.
  A must-read article is What every computer scientist should know about floating point arithmetic (paywalled). A nice edited reprint is available as HTML (and PDF if you Google it - but Oracle seems to have it in HTML). This is one of those leaky abstractions - the hardware doing floating point is not precise at all and if you're not careful, you can lose significant figures very easily (you may think you're keeping 4 sigfigs throughout, but one error and you can be reduced to 1 due to approximations even when you really should have 4).
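The loss of significant figures mentioned above is easy to provoke. A small Python sketch, assuming IEEE 754 doubles and a reasonably accurate math library; the function names `naive` and `stable` are made up for this illustration. Both compute (1 - cos x) / x^2, whose limit for small x is 0.5.

```python
import math

def naive(x):
    # Subtracting two nearly equal numbers cancels the leading digits.
    return (1.0 - math.cos(x)) / (x * x)

def stable(x):
    # Same quantity via the identity 1 - cos(x) = 2*sin(x/2)**2, no subtraction.
    s = math.sin(x / 2.0)
    return 2.0 * s * s / (x * x)

x = 1e-8
print(naive(x), stable(x))   # the mathematical limit as x -> 0 is 0.5
```

The naive form should come out far from 0.5 because cos(x) is indistinguishable from 1 at that size, while the rearranged form keeps its digits.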
double versus long double by Barbarian · 2013-07-28 01:41 · Score: 2

The x86 architecture, since the 8081, has double precision 64 bit floats, and a special 80 bit float--some compilers call this long double and use 128 bits to store this. How does this compare to other architectures?
1. Re:double versus long double by Anonymous Coward · 2013-07-28 03:34 · Score: 0
  
  Not just architecture, but OS. Linux x86 32-bit defaults to 80-bit FP. x86_64 defaults to 64-bit FP.
  Some google searches show the 32-bit x86 *BSD default to 64-bit FP.
2. Re:double versus long double by sstamps · 2013-07-28 04:07 · Score: 1
  
  1) There never was any such thing as an 8081.
  2) The earliest Intel math coprocessor was the 8087, for the 8086. The 80-bit float was a special temporary-precision representation which could be stored in memory, but was otherwise unique to the Intel MCP architecture.
  
  --
  -SS "Teach the ignorant, care for the dumb, and punish the stupid."
3. Re:double versus long double by Anonymous Coward · 2013-07-28 04:26 · Score: 1
  
  2) The earliest Intel math coprocessor was the 8087, for the 8086. The 80-bit float was a special temporary-precision representation which could be stored in memory, but was otherwise unique to the Intel MCP architecture.
  Other FP implementations have 80-bit as well.
  http://en.wikipedia.org/wiki/Motorola_68881
  (Maybe) unique to x87 is the stack architecture.
4. Re:double versus long double by raftpeople · 2013-07-28 06:51 · Score: 1
  
  IBM 360 and 370 mainframes have had 128 bit floating point since the 60's
5. Re:double versus long double by gnasher719 · 2013-07-28 09:18 · Score: 1
  
  The x86 architecture, since the 8081, has double precision 64 bit floats, and a special 80 bit float--some compilers call this long double and use 128 bits to store this. How does this compare to other architectures?
  The 80 bit format is not in any way "special", it is the standard extended precision format. Unfortunately, PowerPC didn't support it :-) Compilers tend to use 128 bits to store it, the hardware actually reads and writes 80 bits. In practice, long double isn't used very much.
  
  The real difference is elsewhere: 1. A C or C++ compiler can decide which precision to use for intermediate results. 2. A C or C++ compiler can decide whether fused multiply-add is allowed. 3. Java doesn't allow extended precision but allows in some cases extended exponent range. 4. Fortran allows replacing arithmetic operations with mathematically equivalent operation.
  
  If a numerical problem is so sensitive to the details of the arithmetic used that it produces different results with different compiler options or different hardware then that means that the problem is hard and neither solution can really be trusted.
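Point 2 (fused multiply-add) can be imitated without special hardware. The sketch below is an emulation, not a real FMA instruction: it assumes IEEE 754 doubles and uses exact rational arithmetic to get the "round only once" result. The values of `a`, `b` and `c` are chosen for illustration so that the product lands exactly halfway between two doubles.

```python
from fractions import Fraction

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Two roundings: the product is rounded to a double, then the sum is rounded.
unfused = a * b + c

# One rounding: compute a*b + c exactly with rationals, then round once,
# which is what a hardware fused multiply-add does.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused, fused)
```

The same source expression gives zero one way and a small nonzero value the other, so whether the compiler is allowed to fuse changes the output.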
Time to revoke some "scientist" licences. by Anonymous Coward · 2013-07-28 01:46 · Score: 0

The people writing this code ought to've known better.
1. Re:Time to revoke some "scientist" licences. by Anonymous Coward · 2013-07-28 01:52 · Score: 0
  
  Who said they didn't know better?
  "which is the change of the standard deviation relative to the value itself, remains nearly zero with time."
  Sounds to me like the 'problem' takes care of itself.
2. Re:Time to revoke some "scientist" licences. by Anonymous Coward · 2013-07-28 02:10 · Score: 0
  
  That reads to me like the system estimates its own calculations to be pretty accurate, even though rounding has clearly introduced a large amount of uncertainty to the result. But TFA is paywalled, and I can't find any other significant uses of the term "fractional tendency" on Google, so who knows what they mean.
Chaos by pcjunky · 2013-07-28 01:46 · Score: 5, Interesting

This very effect was noted in weather simulations back in the 1960's. Read Chaos - The making of a new science, by Jmaes Gleick.
1. Re:Chaos by Trepidity · 2013-07-28 05:08 · Score: 1
  
  Was noted in actual weather systems as well (at least as far as we understand them), which is part of what makes it particularly tricky to avoid in simulations. It's not only that our hurricane track models, for example, are sensitively dependent on parameters, but also that real hurricane trajectories appear to be sensitively dependent on surrounding conditions.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
2. Re:Chaos by Common+Joe · 2013-07-28 08:11 · Score: 1
  
  by Jmaes Gleick.
  Perfect example of the butterfly effect and floating point errors in weather. Over time, it can even change a person's name who wrote a book on weather simulations in the 60's. I bet no one predicted that!
3. Re:Chaos by Anonymous Coward · 2013-07-28 08:35 · Score: 0
  
  In Chaos theory this is called sensitive dependence on initial conditions. So sensitive that you can never know enough about such systems to make accurate long term predictions. The farther into the future you go, the less accurate your predictions become.
4. Re:Chaos by jamesh · 2013-07-28 12:31 · Score: 1
  
  by Jmaes Gleick.
  Perfect example of the butterfly effect and floating point errors in weather. Over time, it can even change a person's name who wrote a book on weather simulations in the 60's. I bet no one predicted that!
  I did, but nobody listened to me until it was too late.
Doesn't matter much by Hentes · 2013-07-28 01:53 · Score: 0

Rounding errors are orders of magnitude smaller than measurement errors, they are not the precision bottleneck.
1. Re:Doesn't matter much by Anonymous Coward · 2013-07-28 02:09 · Score: 0
  
  But that's the problem. Those tiny rounding errors are causing different forecasts. That means a difference of input by 0.0001% will give a completely different output.
  How accurate and reliable can these forecasts be in reality then? Just their measurement devices being off by more than they can possibly be calibrated can change your 10 day forecast. Sounds like the entire thing is junk.
  Side note: Anyone who has programmed with money and had to deal with half pennies and get it correct 100% of the time knows the tricks to deal with this. Bank interest calculations, done correctly, will not come out different on different hardware.
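One common version of the money trick is decimal arithmetic with an explicit rounding rule. A minimal Python sketch; the `interest` helper, the amounts and the choice of half-up rounding are assumptions for illustration, not any bank's actual rule.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def interest(principal, rate):
    # The decimal product is exact here; the only rounding is the explicit
    # half-up step to whole cents.
    return (principal * rate).quantize(CENT, rounding=ROUND_HALF_UP)

print(interest(Decimal("100.50"), Decimal("0.05")))   # 5.0250 rounds half-up to 5.03
```

Because the rounding rule is part of the program rather than a property of the FPU, the half penny goes the same way on every machine.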
2. Re:Doesn't matter much by Goaway · 2013-07-28 02:15 · Score: 1
  
  Those tiny rounding errors are causing different forecasts.
  So are the measurement errors, and to a much higher degree. The roundoff errors just don't matter.
  
  How accurate and reliable can these forecasts be in reality then?
  Once they reach the point where errors have accumulated to this degree, not at all. Everybody knows that.
3. Re:Doesn't matter much by meza · 2013-07-28 02:26 · Score: 1
  
  At first I agreed with you and thought the GP wasn't aware of the concept of chaos (small errors in input give large errors in output). However, that's not what he wrote. He correctly pointed out that the rounding error is much smaller than the error from the initial measurement. Logically it should be the dominant error that first leads to chaotic behavior. The problem then seems to be over-belief in the forecast due to not accounting correctly for the measurement error. Long before any rounding errors start to play a role one should have stopped the simulation as it didn't predict anything useful anyway.
4. Re:Doesn't matter much by Anonymous Coward · 2013-07-28 02:30 · Score: 0
  
  yain, in that order.
  Rounding errors may be small, but if their effect is not understood and contained, there will be not much hope for reproducible science
  (specifically, it would not help tuning of the models themselves for fixable non implementation specific flaws).
5. Re:Doesn't matter much by AchilleTalon · 2013-07-28 02:32 · Score: 4, Informative
  
  Measurement errors are involved once at boundary conditions. Precision errors propagate in the computations. So, even if a single precision error is orders of magnitude smaller than measurement errors, they can have an impact on the result depending on the computations involved while solving the problem.
  
  --
  Achille Talon
  Hop!
6. Re:Doesn't matter much by CrimsonAvenger · 2013-07-28 03:02 · Score: 1
  
  However, that's not what he wrote. He correctly pointed out that the rounding error is much smaller than the error from the initial measurement. Logically it should be the dominant error that first leads to chaotic behavior.
  Alas, TFA is about a situation where they take the SAME inputs (initial measurements), run the program on ten different sets of hardware, and get ten different results.
  I fail to see how the same program + same inputs == "differences in inputs cause most of the error"....
  
  --
  
  "I do not agree with what you say, but I will defend to the death your right to say it"
7. Re:Doesn't matter much by kasperd · 2013-07-28 04:38 · Score: 1
  
  I fail to see how the same program + same inputs == "differences in inputs cause most of the error"....
  Inaccuracies in the input most likely did cause most of the error. Maybe nobody noticed because that error was the same in all the calculations. Eventually a difference between the calculations started to build up because of differences in rounding between the different runs. This variation was noticed, but it would still be small compared to the differences caused by inaccuracies in the input. In short, this means that by the time you notice the difference between two runs, both of them are already way off compared to the real value, because both have been working on the same inaccurate input.
  
  If you want to do better, then do calculations with a representation that keeps track of uncertainty. Even in those cases where you cannot do a floating point operation and get an exact result, you can still do the calculation and know the possible range of the error. So each number is represented by a minimum and a maximum (or a mean and an error margin). As you do calculations the minimum and maximum values will be going further and further apart. Once they get too far apart, you know the results are no longer useful.
  
  When you start the simulation, you initialize the numbers with an error margin corresponding to the accuracy of the measurements. Different runs on different platforms may not build up errors at the same rate, and that is something you can actually look at. If the ranges from two different runs no longer overlap, then you know there is a bug somewhere. If one simulation says the air temperature is going to be between -10 and +30 and the other simulation says it is going to be between 0 and +20, then they can both be right, but neither simulation result is particularly useful. If one simulation says it is going to be between -10 and 0 and the other says it is going to be between +20 and +30, then you know at least one of them has a bug.
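The min/max representation described above is usually called interval arithmetic. A rough Python sketch, assuming Python 3.9+ for `math.nextafter`; the `Interval` class and its methods are hypothetical, and widening by one ulp is a simple stand-in for proper directed rounding.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # Widen outward by one ulp so the true sum stays inside the bounds.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    def width(self):
        return self.hi - self.lo

    def overlaps(self, other):
        return self.lo <= other.hi and other.lo <= self.hi

# The temperature ranges from the comment above.
print(Interval(-10.0, 30.0).overlaps(Interval(0.0, 20.0)))    # both can be right
print(Interval(-10.0, 0.0).overlaps(Interval(20.0, 30.0)))    # at least one run has a bug
```

The width only ever grows as operations are chained, which is the "further and further apart" behaviour that tells you when the result has stopped being useful.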
  
  --
  
  Do you care about the security of your wireless mouse?
8. Re:Doesn't matter much by Anonymous Coward · 2013-07-28 05:38 · Score: 0
  
  But orders of magnitude more difference in the initial conditions in such a system results in such a fast growth of error that it would take a massive amount of accumulated errors to catch up. This isn't a system where 10 rounding errors a tenth the size of the measurement errors would produce the same error level in the output; you would need a massive amount of such precision errors to catch up.
9. Re:Doesn't matter much by siride · 2013-07-28 05:54 · Score: 1
  
  Again, the article says that they used the same input. This can be verified with a simple diff. Same input leading to different results means that some other input (that is, the circuitry of the CPU or software libraries) has to be at fault, unless you want to start to argue that computer hardware is non-deterministic. Then you've opened an entirely different can of worms that your error margin system will do little or nothing to address.
10. Re:Doesn't matter much by bidule · 2013-07-28 06:29 · Score: 1
  
  Measurement errors are involved once at boundary conditions. Precision errors propagate in the computations.
  If measurement errors are less than precision errors and precision errors are sufficient to bring out chaos, changing the initial state by epsilon would also bring chaos.
  Getting different results using different architectures is a good thing: it lets you see how chaotic the initial conditions are and evaluate the reliability of the result.
  
  --
  ID: the nose did not occur naturally, how would we wear glasses otherwise? (apologies to Voltaire)
11. Re:Doesn't matter much by achbed · 2013-07-28 07:03 · Score: 1
  
  You're contradicting yourself.
  
  Inaccuracies in the input most likely did cause most of the error.
  is the exact logical opposite of
  
  Eventually a difference between the calculations started to build up because of differences in rounding between the different runs.
  It's the differences in rounding based on the same input data that the paper is talking about. Not the inaccuracies in input data (testing for which would involve, by definition, different sets of input data varying by a known quantity). If the rounding was behaving the same, we would expect the same output given the same program and input. If a system produces different output every time it's run with the same input, then we have a useless system as we cannot have any way of verifying that what is produced is correct. If you can't unit test the system, then you have a religion, not a scientific simulation.
12. Re:Doesn't matter much by kasperd · 2013-07-28 10:24 · Score: 1
  
  You're contradicting yourself.
  No. You are assuming if both calculations produce the same result, then that result is correct. In reality, you can run the same calculation twice and get the same error.
  
  If the rounding was behaving the same, we would expect the same output given the same program and input.
  If you take the same source and compile it for two different systems, is it the same program? What the compiled program does is probably within the specs of the language.
  
  If a system produces different output every time it's run with the same input, then we have a useless system
  That depends very much on what the purpose of the program is. I have worked with cryptography, and for most usages in that field, a program which produces the same output twice is unusable. A program which does floating point operations needs to be written in a way that lets you figure out how large an error you get. Knowing the accuracy is more important than getting the same result twice. If you do get two different results from the same calculation, you can check if the difference is within the accuracy you were supposed to get.
  
  as we cannot have any way of verifying that what is produced is correct. If you can't unit test the system, then you have a religion, not a scientific simulation.
  The complete program is not one unit. You unit test individual units. And unit tests can deal perfectly well with units where the spec allows for more than one possible output. The unit test just needs to verify that the output is within spec. Testing for one specific output value is usable in some cases, but not always.
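A test that checks "within spec" rather than one exact value might look like the sketch below. It assumes the spec is a relative error bound; the helper names and the 1e-9 tolerance are invented for this example, and `math.fsum` serves as the reference value.

```python
import math

def harmonic_forward(n):
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / k
    return total

def harmonic_backward(n):
    total = 0.0
    for k in range(n, 0, -1):
        total += 1.0 / k
    return total

def within_spec(actual, reference, rel_tol=1e-9):
    # Accept any output inside the stated error bound, not one exact bit pattern.
    return math.isclose(actual, reference, rel_tol=rel_tol)

n = 10_000
reference = math.fsum(1.0 / k for k in range(1, n + 1))
print(within_spec(harmonic_forward(n), reference))
print(within_spec(harmonic_backward(n), reference))
```

Both summation orders should pass even if they disagree in the last few bits, which is the situation a bit-exact comparison would wrongly flag as a failure.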
  
  --
  
  Do you care about the security of your wireless mouse?
13. Re:Doesn't matter much by kasperd · 2013-07-28 10:28 · Score: 1
  
  unless you want to start to argue that computer hardware is non-deterministic.
  Distributed systems are inherently non-deterministic. Moreover it says right there in the title that the different results were produced on different computers.
  
  --
  
  Do you care about the security of your wireless mouse?
14. Re:Doesn't matter much by budgenator · 2013-07-28 10:32 · Score: 1
  
  Once they reach the point where errors have accumulated to this degree, not at all. Everybody knows that.
  Climatologists either don't or are in denial of that fact.
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
15. Re:Doesn't matter much by siride · 2013-07-28 10:45 · Score: 1
  
  I thought that was my point? It seemed that you were trying to argue that the input was actually different.
16. Re:Doesn't matter much by Anonymous Coward · 2013-07-28 15:12 · Score: 0
  
  Do you honestly think that climatology models are simply short term forecast models run for a very long simulation time?
  The idea that because predicting the exact temperature and weather conditions at a given time in the near future is very hard, predicting global (or even regional) climate over much longer periods of time must be as hard (or impossible) is absurd. Believing that proves nothing other than that you are woefully ignorant of science and mathematics. It is akin to suggesting that because one cannot feasibly model with perfect accuracy the movement of every particle comprising a ball which is in flight, one therefore cannot possibly model the trajectory the ball will take.
17. Re:Doesn't matter much by kasperd · 2013-07-28 17:53 · Score: 1
  
  It seemed that you were trying to argue that the input was actually different.
  No, I was arguing that the input was not actually accurate enough to do the calculation in the first place. Floating point numbers can handle much higher accuracy than the measurements used as input. By the time you notice the difference between two runs you are already way past the point where the output could be useful.
  
  So there are two sources of errors: inaccurate input data, which leads to reproducible bad output, and rounding errors during calculation, which are smaller and thus only become significant later. The inaccuracy of the output due to inaccurate input data cannot be seen by running the calculation twice with the same input data. But by comparing to the real world, it can be observed that it diverges from the calculation. That divergence can be caused by inaccurate input data, flaws in the algorithm, or simply by the real world having much higher granularity than the discrete datapoints used in the algorithm.
  
  Divergence between two runs of the same algorithm on the same input data can be caused by a number of other factors. Such factors include different rounding due to differences in the platform being used (different hardware and/or software), or non-determinism due to timing in a distributed system. For example, if a node receives three floating point numbers and adds them, the sum can depend on the order in which the three numbers were received.
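The three-number case can be tried by brute force. A short Python sketch assuming IEEE 754 doubles; the values 0.1, 0.2 and 0.3 are just a convenient example, and the six permutations stand in for the possible arrival orders.

```python
from itertools import permutations

received = (0.1, 0.2, 0.3)

# Add the same three numbers in every possible arrival order.
sums = {}
for order in permutations(received):
    total = (order[0] + order[1]) + order[2]
    sums.setdefault(total, []).append(order)

for total, orders in sums.items():
    print(repr(total), orders)
```

More than one distinct total shows up, so a reduction whose order depends on message timing is not bit-reproducible even on identical hardware.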
  
  The differences due to rounding errors are however not of much practical interest. By the time they are large enough to notice, the errors due to inaccurate input are already too large for the output to be of practical interest.
  
  --
  
  Do you care about the security of your wireless mouse?
18. Re:Doesn't matter much by Anonymous Coward · 2013-07-28 20:15 · Score: 0
  
  The input is the same, but the suite that processes this input is not. Different systems have different FPU-s which leads to the different (hardware) rounding of the numbers, then different systems use different system libraries and compilers which also leads to different (software) rounding. The second implies that even though the source code of the numerical model is identical, its object code is not. So you have 2 sources of rounding differences - FPU and the object code.
  Systems are deterministic in a way that you can always get same result on the same system with the same set of input fields, but it depends how you define system equivalence (FORTRAN line-by-line equivalent or object code byte-by-byte equivalent, or object+CPU+FPU+GPU+whatever_hardware_that_might_round_some_float equivalent).
  These kinds of differences are quite small (in the field of weather forecasting, usually in the 4th or 5th decimal place after 10 days of integration), but they also tend to creep up and accumulate, and after enough time the influence of this kind of error starts to dominate, so you end up with a (multi-)seasonal forecast that is dominated not by initial conditions but by rounding error.
19. Re:Doesn't matter much by budgenator · 2013-07-29 11:33 · Score: 1
  
  I'll see your trajectory of a ball and raise you a Pioneer Anomaly
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
20. Re:Doesn't matter much by Goaway · 2013-07-30 04:57 · Score: 1
  
  Do you also believe it is impossible to predict that a flipped coin will come up heads 50% of the time, because it is impossible to predict what it will come up as on a single flip?
21. Re:Doesn't matter much by budgenator · 2013-07-31 11:44 · Score: 1
  
  Do you also believe it is impossible to predict that a flipped coin will come up heads 50% of the time, because it is impossible to predict what it will come up as on a single flip?
  No, but that's a nice strawman. What's pi in base 2?
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
22. Re:Doesn't matter much by Goaway · 2013-07-31 20:53 · Score: 1
  
  It is not. That is the exact kind of argument you were making.
Yes, the Butterfly Effect, as others have said by Impy+the+Impiuos+Imp · 2013-07-28 02:05 · Score: 5, Interesting

This problem has been known since at least the 1970s, and it was weather simulation that discovered it. It led to the field of chaos theory.
With an early simulation, they ran their program and got a result. They saved their initial variables and then ran it the next day and got a completely different result.
Looking into it, they found out that when they saved their initial values, they only saved the first 5 digits or so of their numbers. It was the tiny bit at the end that made the results completely different.
This was terribly shocking. Everybody felt that tiny differences would melt away into some averaging process, and never be an influence. Instead, it multiplied up to dominate the entire result.
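The effect of dropping trailing digits can be reproduced with a toy system. The sketch below uses the logistic map as a stand-in, not the original weather model, and the starting value is arbitrary; it restarts the same iteration from a value rounded to five decimals and prints how far the two runs drift apart.

```python
def logistic_orbit(x0, steps, r=4.0):
    # Iterate x -> r*x*(1-x), a standard toy example of chaotic dynamics.
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

x0 = 0.123456789
full = logistic_orbit(x0, 60)
truncated = logistic_orbit(round(x0, 5), 60)   # keep only five decimals

for step in (0, 5, 10, 20, 40, 60):
    print(step, abs(full[step] - truncated[step]))
```

The gap starts in the sixth decimal place and roughly doubles per step on average until the two runs have nothing to do with each other.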
To give yourself a feel for what's going on, imagine knocking a billiard ball on a table that's miles wide. How accurate must your initial angle be to knock it into a pocket on the other side? Now imagine a normal table with balls bouncing around for half an hour. Each time a ball hits another, the angle deviation multiplies. In short order with two different (very minor differences) angles, some balls are completely missing other balls. There's your entire butterfly effect.
Now imagine the other famous realm of the butterfly effect -- "time travel". You go back and make the slightest deviation in one single particle, one single quantum of energy, and in short order atmospheric molecules are bouncing around differently, this multiplies up to different weather, people are having sex at different times, different eggs are being fertilized by different sperm, and in not very long an entirely different generation starts getting born. (I read once that even if you took a temperature, pressure, wind direction, humidity measurement every cubic foot, you could only predict the weather accurately to about a month. The tiniest molecular deviation would probably get you another few days on top of that if you were lucky.)
Even if the current people in these parallel worlds lived more or less the same, their kids would be completely different. That's why all these "parallel world" stories are such a joke. You would literally need a Q-like being tracking multiple worlds, forcing things to stay more or less along similar paths.
Here's the funnest part -- if quantum "wave collapse" is truly random, then even a god setting up identical initial conditions wouldn't produce identical results in parallel worlds. (Interestingly, the mechanism on the "other side" doing the "randomization" could be deterministic, but that would not save Einstein's concept of Reality vs. Locality. It was particles that were Real, not the meta-particles running the "simulation" of them.)

--
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
1. Re:Yes, the Butterfly Effect, as others have said by Anonymous Coward · 2013-07-28 07:14 · Score: 0
  
  If that god guy controls randomness, he controls everytin'.
  Welcome to the intersection of computer science, cryptography, physics and religion. When I visited London quite a few years ago, I saw the GCHQ boss in some Anglican church garment in the hotel room literature. Maybe they did some advanced analysis of physical randomness....
Been there, done that by fatmar · 2013-07-28 02:05 · Score: 1

This is a good time to review some problems in met codes. The first real problem is that the science is poorly understood. If the model is poorly constructed, conditioning is one of the least of your problems. By and large, the push for V&V came from the met world. The second thing is that the spatial resolution is 'way too big. And, long before IEEE 754, it was anecdotally known that you lose a digit whenever you change systems (hardware or software).

--
D. E. (Steve) Stevenson, Ph.D. Emeritus Associate Professor,School of Computing,Clemson University.
Headline disagrees with summary. by Anonymous Coward · 2013-07-28 02:07 · Score: 0

The summary says, "There exist differences in the results for different compilers, parallel libraries, and optimization levels,". That doesn't mean different computers, although different computers were used. It means that they weren't running the same code path and same order of operations so differences should have been expected.
Unfortunately, any information regarding whether the differences are significant for local or even regional weather prediction is behind the paywall.
Global Climate Change by Anonymous Coward · 2013-07-28 02:15 · Score: 0, Troll

Certainly not a problem for climate "scientists" all over the planet and their crazy predictions out 10 or 20 years.
1. Re:Global Climate Change by Anonymous Coward · 2013-07-28 04:09 · Score: 0
  
  Glad to see this has been marked -1.
  It's very important to stop anyone thinking that there could be anything wrong with the Global Warming scare.
  97% of all scientists can't be wrong!!!
2. Re:Global Climate Change by PPH · 2013-07-28 08:24 · Score: 0
  
  I think that 97% is a result of a rounding error.
  
  --
  Have gnu, will travel.
3. Re:Global Climate Change by Anonymous Coward · 2013-07-29 02:24 · Score: 0
  
  I think that 97% is a result of a rounding error.
  Actually they just asked 10 scientists, 9 said "sure, whatever, now get out of my lab and let me work" and the last responded with "that's a tough one, hold on while I send out my grad students to get some temperature readings."
Translation by Chris+Mattern · 2013-07-28 02:18 · Score: 1

The system dependency, which is the standard deviation of the 500-hPa geopotential height averaged over the globe, increases with time. However, its fractional tendency, which is the change of the standard deviation relative to the value itself, remains nearly zero with time.
In other words, they all gave different answers, but each one was equally certain that *it* was right.
1. Re:Translation by Anne+Thwacks · 2013-07-28 03:17 · Score: 1
  
  they all gave different answers, but each one was equally certain that *it* was right.
  Perhaps that is where politicians got the idea from?
  
  --
  Sent from my ASR33 using ASCII
Just needs a little adjustment by Anonymous Coward · 2013-07-28 02:23 · Score: 2, Funny

They really need to standardize on what butterflies to use.
Re:Global Warming Predictions by Anonymous Coward · 2013-07-28 02:32 · Score: 0

So, predictions of global warming and increasing weather variability are really just artifacts of round-off errors.
My Slashdot tagline for today:
1 + 1 = 3, for large values of 1.
Building reproducible HPC software is already hard by Anonymous Coward · 2013-07-28 02:35 · Score: 0

If you don't believe that statement, look at this:
http://hpcugent.github.io/easybuild/
and then the diagram in here:
http://hpcbios.readthedocs.org/en/latest/HPCBIOS_2012-92.html
Put the equivalent of that diagram into scattered wiki instructions and ask any 2 people to come up with the same build;
how tough would that be? Only the people who have tried it, know really well what it means...
btw.
In modern HPC systems it is common to provide 3 MPI stacks (IntelMPI, OpenMPI, MVAPICH) and a bunch of compilers;
ah, and that's just the first two layers from the bottom of that diagram! Are you surprised you have fireworks on the top?
molecular dynamics on GPU by inflamed · 2013-07-28 02:41 · Score: 0

In molecular dynamics simulations, kinetics are known to be approximate and states at a given time are not considered directly correlated with that time point; we only hope to get statistically correct distribution of states across ensembles. Consequently, differences in rounding between wildly different compiler/hardware architectures are expected. However, deterministic behavior of the system is achieved by employing higher precisions for accumulation steps, which ensures that averages over a sufficiently long time (big enough sample) are the same no matter what hardware is employed. Consequently a tremendous speed-up is possible running CUDA code on consumer grade nvidia cards which have far fewer double precision execution units than single float precision units. So, we have deterministic trajectories but nobody expects these to match real-world processes on a time-function basis :-)
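The benefit of a wider accumulator can be shown in a few lines. This is a Python emulation, not GPU code: single precision is imitated by rounding through `struct`, and the helper names are made up for the example. It sums the same single-precision inputs once with a single-precision accumulator and once with a double-precision one.

```python
import struct

def to_f32(x):
    # Round a Python float (a double) to the nearest single-precision value.
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_single(values):
    # Accumulator rounded to single precision after every addition.
    acc = 0.0
    for v in values:
        acc = to_f32(acc + v)
    return acc

def sum_mixed(values):
    # Single-precision inputs, double-precision accumulator.
    acc = 0.0
    for v in values:
        acc += v
    return acc

values = [to_f32(0.1)] * 100_000
print(sum_single(values), sum_mixed(values))
```

The narrow accumulator drifts visibly away from the expected total while the mixed-precision sum stays close, even though the inputs are identical.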
Comical by Anonymous Coward · 2013-07-28 02:49 · Score: 0

Beyond about 3 days (based on the Meteorology classes I took in college) most forecasts are just a shitty guess. Looking at a 10-day forecast is like calling your local psychic hotline. Sometimes they're right, but it's just a lucky guess.
1. Re:Comical by Anonymous Coward · 2013-07-28 10:15 · Score: 0
  
  The duration of the "predictability window" itself is variable. For example, in spring this year there was a humungous buggerbastard of an anticyclone between Scotland and Norway. The forecasts said the weather would be shite everywhere along the Southern periphery of that system for about two weeks, and it was. Well, perhaps not exactly: it was abso fucking lutely shite.
  You can run a metamodel (model of models) by running several sims with slightly different inputs[1]. The divergence between the runs gives a rough estimate of the reliability of the results.
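The "several sims with slightly different inputs" idea can be sketched with a toy model. Everything here is an assumption for illustration: the logistic map stands in for the forecast model, and the member count, perturbation size and function names are invented.

```python
import statistics

def step(x, r=4.0):
    # One step of the logistic map, standing in for a real forecast model.
    return r * x * (1.0 - x)

def ensemble_spread(x0, members=11, perturbation=1e-6, steps=40):
    # Start each member from a slightly different initial condition.
    state = [x0 + perturbation * (i - members // 2) for i in range(members)]
    spread = [statistics.pstdev(state)]
    for _ in range(steps):
        state = [step(x) for x in state]
        spread.append(statistics.pstdev(state))
    return spread

spread = ensemble_spread(0.3)
for t in (0, 10, 20, 30, 40):
    print(t, spread[t])
```

While the spread stays small the members agree and the forecast is usable; once it grows to the size of the values themselves, the run is past its predictability window.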
  [1] Germans and aspies: That's not what this story is talking about. It's orthogonal to it (same inputs giving different results). I know that, anyone who can read knows that, so you can STFU already.
Re:Building reproducible HPC software is already h by Anonymous Coward · 2013-07-28 02:49 · Score: 0

btw. "Consistency of Floating-Point Results using the Intel® Compiler or Why doesn’t my application always give the same answer?"
ref. http://software.intel.com/sites/default/files/m/4/4/6/9/4/39386-FP_Consistency_102511.pdf
Hey, at it least it ran all the way. by 140Mandak262Jamuna · 2013-07-28 02:58 · Score: 3, Interesting

These numerical simulation codes can sometimes do funny things when you port from one architecture to another. One of the most frustrating debugging sessions I had was when I ported my code to Linux. One of my tree class's comparison operators evaluates the key and compares the calculated key with the value stored in the instance. It was crapping out in Linux and not in Windows. I eventually discovered Linux was using 80 bit registers for floating point computation but the stored value in the instance was truncated to 64 bits.
Basically they should be happy their code ported to two different architectures and ran all the way. Expecting the same results for processes behaving chaotically is asking for too much.
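A rough sketch of that kind of bug, for anyone who has not hit it. Python has no portable 80-bit type, so this uses 64-bit versus 32-bit as a stand-in for 80-bit versus 64-bit. The `Node` class and `compute_key` function are hypothetical, not the code described above.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def compute_key(a: float, b: float) -> float:
    """Hypothetical key function, evaluated at full (64-bit) precision."""
    return a / b


class Node:
    """Hypothetical tree node that stores its key at reduced precision."""

    def __init__(self, a: float, b: float) -> None:
        self.a, self.b = a, b
        # Storing narrows the key, like spilling a wide register to a narrower slot.
        self.stored_key = to_float32(compute_key(a, b))

    def key_matches_exact(self) -> bool:
        # Buggy comparison: wide recomputed value against narrow stored value.
        return compute_key(self.a, self.b) == self.stored_key

    def key_matches_narrowed(self) -> bool:
        # One possible fix: narrow the recomputed value the same way first.
        return to_float32(compute_key(self.a, self.b)) == self.stored_key


node = Node(1.0, 3.0)
print(node.key_matches_exact())     # False: 1/3 differs at 32 and 64 bits
print(node.key_matches_narrowed())  # True
```

Whether the real fix was narrowing before the compare or forcing stores everywhere is not stated above; the second method is just one plausible approach.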

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
1. Re:Hey, at it least it ran all the way. by budgenator · 2013-07-28 10:44 · Score: 1
  
  The study used FORTRAN, which is expected to be highly portable.
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
2. Re:Hey, at it least it ran all the way. by 140Mandak262Jamuna · 2013-07-29 02:19 · Score: 1
  
  It is not your father's Fortran, buddy! The highly optimized procedures, oops sorry subroutines, are in FORTRAN, but the simulation runs on huge high performance clusters with a mix of GPU, CPU and FPU. You might get a different result if you replace all the 3 meter ethernet cables in the cluster with 10 meter cables.
  
  --
  sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
3. Re:Hey, at it least it ran all the way. by budgenator · 2013-07-29 11:45 · Score: 1
  
  I still have my program decks of hollerith cards from school for Fortran, RPG II and COBOL; now get off my lawn you whipper-snapper. Ever see a CPU made out of wirewrapped NAND gates?
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
CompSci 101 by Anonymous Coward · 2013-07-28 03:08 · Score: 0

I'm no programming expert, but isn't this basically Computer Science 101 stuff?
1. Re:CompSci 101 by kasperd · 2013-07-28 04:59 · Score: 2
  
  I'm no programming expert, but isn't this basically Computer Science 101 stuff?
  All I was taught about floating point at that level was how wrong results we could get, and that we should avoid it. Several years later on a more advanced course, I learned about how to do floating point calculations, if you really need to.
  
  --
  
  Do you care about the security of your wireless mouse?
Don't let mathematicians write production code by AlejoHausner · 2013-07-28 03:10 · Score: 0

I once saw a piece of code written by a mathematician which said "pow(x, -1)". Ugh. I wonder if meteorologists know better.
1. Re:Don't let mathematicians write production code by Anonymous Coward · 2013-07-28 03:48 · Score: 1
  
  I once saw a piece of code written by a mathematician which said "pow(x, -1)". Ugh. I wonder if meteorologists know better.
  It might be written that way to get a well-defined behavior depending upon the value of x. E.g. what if x is +0?
  http://linux.die.net/man/3/pow
  Maybe they do know better.
2. Re:Don't let mathematicians write production code by angel'o'sphere · 2013-07-28 05:55 · Score: 1
  
  And which point do you like to make? pow(x,-1) is equivalent to 1/x. So pretty valid.
  
  --
  Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
3. Re:Don't let mathematicians write production code by Anonymous Coward · 2013-07-28 08:22 · Score: 0
  
  When you start seeing "pow(x, +1)", then you can start to worry.
4. Re:Don't let mathematicians write production code by Anonymous Coward · 2013-07-28 09:35 · Score: 1
  
  And which point do you like to make? pow(x,-1) is equivalent to 1/x. So pretty valid.
  Sure, for a mathematician, 1/x = x^{-1}.
  But pow(a, b) is implemented as exp(b * log(a)), and both exp() and log() are probably implemented as high-degree Chebyshev polynomial approximations, so pow(x,-1) involves a f**k of a lot more computation than the 1/x which the mathematician intended. Not to mention lots more possibility for numerical error.
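To put a number on the "more numerical error" point, here is a quick sketch comparing a naive exp(b * log(a)) against plain 1/x. `naive_pow` is a made-up helper for illustration; real libm pow() implementations are generally more careful than this, so read it as the worst case described above, not as how any particular library works.

```python
import math


def naive_pow(a: float, b: float) -> float:
    """Textbook identity a**b = exp(b * log(a)); hypothetical, valid for a > 0 only."""
    return math.exp(b * math.log(a))


def ulps_apart(u: float, v: float) -> float:
    """Distance between u and v in units in the last place of v."""
    return abs(u - v) / math.ulp(v)


xs = [float(i) for i in range(2, 1001)]
mismatches = [x for x in xs if naive_pow(x, -1.0) != 1.0 / x]
worst = max(ulps_apart(naive_pow(x, -1.0), 1.0 / x) for x in xs)

print(f"{len(mismatches)} of {len(xs)} inputs differ from 1/x")
print(f"worst case is {worst:.1f} ulps off")
```

The rounding error in log(x) gets magnified by exp(), which is why the naive route drifts while a single division is correctly rounded.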
5. Re:Don't let mathematicians write production code by Anonymous Coward · 2013-07-28 09:46 · Score: 0
  
  And which point do you like to make? pow(x,-1) is equivalent to 1/x. So pretty valid.
  Is it really entirely equivalent? What if x=0? 1/x will give you a divide by zero error in pretty much any language, I'm not sure what pow(x,-1) will do since I don't know what language the original poster referred to, but I can't help thinking that the mathematician may have used pow for exactly that reason, and is actually more clever than the original poster thought ;)
This is a DENIER propaganda story! by Anonymous Coward · 2013-07-28 03:17 · Score: 0

It doesn't mean anything. You must not listen to it. Global Warming is still happening, and the models are all correct and agree with each other.
97% of all scientists agree that we should stop generating CO2 NOW, otherwise humanity will be responsible for the greatest environmental catastrophe ever to hit the Earth. There is no need for any further examination of the science - what we need is ACTION.
Slashdot should not be supporting denier propaganda in this way. This story should be removed immediately.
Lies, damned lies, and statistics... by Anonymous Coward · 2013-07-28 03:17 · Score: 0

QED - quod erat demonstrandum! Or to paraphrase - the proof is in the pudding... :-)
problem solved decades ago by Gravis+Zero · 2013-07-28 03:19 · Score: 1, Interesting

it's called Binary Coded Decimal (BCD) and it works well. plenty of banks still use it because it's reliable and works. it's plenty slower but it's accurate regardless of the processor it's used on.

--
Anons need not reply. Questions end with a question mark.
1. Re:problem solved decades ago by HornWumpus · 2013-07-28 04:01 · Score: 3, Informative
  
  A little knowledge is a dangerous thing.
  Get back to us when you've recompiled the simulation using BCD and then realize that there is still rounding. .01 being a repeating fraction in binary floating point is another issue.
  
  --
  John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
2. Re:problem solved decades ago by Anonymous Coward · 2013-07-28 04:59 · Score: 0
  
  Nope. If you are suggesting base 10 (ten) would help, that's silly. There is nothing special about ten.
  Now if you are suggesting fixed point would help, that's also wrong. There is still rounding there.
  Thirdly, if you are suggesting using BCD to implement arbitrary precision to avoid all rounding: that is impossible due to repeating decimals (1/3 for example), and if you are doing that, there is no reason to use BCD! Base 2 is fine for extending to arbitrary precision.
  So I can't see how BCD could possibly help. Its one purpose is when you want to represent numbers in base 10 so that you don't accumulate errors from base conversions of numbers that are displayed to humans in base 10. The weather simulator does not convert bases, so that's irrelevant.
  Lastly, BCD isn't really an alternative to doubles. BCD is a way to represent base 10 digits. You would have to specify more detail as to how it would be used to represent non whole numbers (but as mentioned above, using it for fixed point does not help, nor does using it for the significand or exponent in floats). And don't even propose rationals: that has nothing to do with BCD.
3. Re:problem solved decades ago by wiredlogic · 2013-07-28 05:28 · Score: 2
  
  BCD is no better than fixed point binary in this instance. The banking industry relies on it because we use decimalized currency and it eliminates some types of errors to carry out all computations in decimal. For simulation inputs you're no better off than if you use a plain binary encoded number.
  
  --
  I am becoming gerund, destroyer of verbs.
4. Re:problem solved decades ago by Anonymous Coward · 2013-07-28 06:42 · Score: 0
  
  When dealing with real things (natural and measured, not artificial like accounting systems), one tends to use real numbers. Real numbers, as in all of them taken together, are inherently impossible for a computer to represent exactly. The complexity of computers is bounded more or less by the number of integers, which is proven to be less than the number of reals (which is to say there are different levels of infinity, and they are not equal). Look up Cantor diagonalization. Therefore, no matter what you do in computation, at best you are merely approximating real conditions. As the real scenario becomes more complex, an initial error can get propagated through calculations such that the final error is exponentially larger. For example, consider a 2% underestimate of a value that gets reused 10 times in a sequence of products. Unless you know that you underestimated and are taking steps to correct it (likely specific to the problem instance), you can expect an estimate of the final answer to be (1-0.02)^10=0.98^10 ~=0.81707 ~= 81.71% of the true answer. A rounding error is typically much smaller than 2% to start with, but there are also many more calculations than just 10. This is true regardless of number representation.
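The arithmetic above is easy to check. A small sketch follows, with the loud assumption that every error points the same way; real rounding errors partly cancel, so this is a pessimistic bound and not a prediction. `compounded_ratio` is a name made up here.

```python
import math


def compounded_ratio(relative_error: float, uses: int) -> float:
    """Estimate/truth ratio when a factor off by relative_error is multiplied in `uses` times."""
    # exp/log1p keeps precision when relative_error is tiny.
    return math.exp(uses * math.log1p(relative_error))


print(round(compounded_ratio(-0.02, 10), 5))   # 0.81707, matching the figure above

# Same idea with a rounding-sized error (about 1.1e-16 for doubles), worst case only.
drift = [-math.expm1(n * math.log1p(-1.1e-16)) for n in (10, 10**6, 10**12)]
for n, d in zip((10, 10**6, 10**12), drift):
    print(f"{n:>14} multiplications: relative drift up to {d:.2e}")
```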
5. Re:problem solved decades ago by blueg3 · 2013-07-28 13:00 · Score: 1
  
  Problem discovered decades ago. Called "chaos theory". Turns out that for iterated feedback systems, even arbitrarily-large stored numbers cause round-off errors eventually. Usually more quickly than people anticipate.
  People continue not to understand this admittedly subtle point, proceed to suggest known-bad solutions.
6. Re:problem solved decades ago by Anonymous Coward · 2013-07-28 21:40 · Score: 0
  
  Or more concise: reality is not decimal.
  Corollary: banking is not real.
so what is Pi in BCD? by Anonymous Coward · 2013-07-28 03:34 · Score: 0

oh wait, you cannot compute Pi on any machine because its transcendental
INB4 Climate Model Shitstorm by PPH · 2013-07-28 03:35 · Score: 0

n/a

--
Have gnu, will travel.
1. Re:INB4 Climate Model Shitstorm by Anonymous Coward · 2013-07-28 03:49 · Score: 0
  
  paper title already appears on at least 2 global-warming-skeptic blogs already
we code it to drop the parts of pennies into our by Anonymous Coward · 2013-07-28 03:49 · Score: 0

our bank account
Welcome to Chaotic Systems 101 ;-) by Technomancer · 2013-07-28 03:53 · Score: 2

Pretty much most iterative simulation systems like weather simulation will behave this way. When the result of one step of the simulation is the input for another step any rounding error will possibly get amplified.
Also see Butterfly Effect https://en.wikipedia.org/wiki/Butterfly_effect (not the movie!).
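A toy version of the effect, as a sketch: the logistic map stands in for the weather model (it is a standard chaotic example, nothing to do with GRIMs), and rounding the state to 32 bits after each step stands in for "a different machine". All names here are made up for illustration.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def step(x: float) -> float:
    """One step of the logistic map, a toy chaotic system."""
    return 3.9 * x * (1.0 - x)


def run(x0: float, steps: int, narrow: bool) -> list[float]:
    """Feed each step's output into the next, optionally rounding the state to 32 bits."""
    xs = [x0]
    for _ in range(steps):
        x = step(xs[-1])
        xs.append(to_float32(x) if narrow else x)
    return xs


wide = run(0.4, 100, narrow=False)     # "machine A": 64-bit state
rounded = run(0.4, 100, narrow=True)   # "machine B": same code, coarser rounding
gaps = [abs(a - b) for a, b in zip(wide, rounded)]

for n in (1, 5, 20, 40, 60, 100):
    print(f"step {n:3d}: |A - B| = {gaps[n]:.3e}")
```

The gap starts around the size of one rounding error and grows until the two runs have nothing to do with each other.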
Unfortunately not... by Anonymous Coward · 2013-07-28 03:55 · Score: 0

BCD solves the problem of binary not being able to unambiguously represent certain decimal fractions. BCD solves little for scientific computing. You still need to round and in parallel programs you still gather & round in non-deterministic order. The non-determinism of the particular program won't go away if rewritten to use BCD.
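A sketch of that point using Python's `decimal` module as a stand-in for a BCD-style format (an assumption; no real BCD hardware is modelled). The precision of 16 digits and the values are picked purely to make the effect visible.

```python
from decimal import Context, Decimal

# Decimal arithmetic limited to 16 significant digits.
ctx = Context(prec=16)


def decimal_sum(values):
    """Left-to-right sum, rounding to 16 digits after every addition."""
    total = Decimal(0)
    for v in values:
        total = ctx.add(total, v)
    return total


big, small = Decimal("1E15"), Decimal("0.4")

order_a = [big, small, small, -big]   # small terms absorbed one at a time
order_b = [small, small, big, -big]   # small terms combined first

print(decimal_sum(order_a))
print(decimal_sum(order_b))
```

Same four numbers, two gather orders, two answers (0 and 1, when the exact sum is 0.8). Decimal digits change which values are awkward, not whether order matters.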
Utterly Unsurprising by Anonymous Coward · 2013-07-28 04:02 · Score: 2, Insightful

Floating Point arithmetic is not associative.
Everyone who reads Stack Overflow knows this, because everyone who doesn't know this posts to Stack Overflow asking why they get weird results.
Everyone who does numerical simulation or scientific programming work knows this because they've torn their hair out at least once wondering if they have a subtle bug or if it's just round-off error.
Everyone who does cross-platform work knows this because different platforms implement compilers (and IEEE-754) in slightly different ways.
Everyone who does parallel programming knows this because holy shit will you see round-off differences when you throw many minutes of TFlops at a problem and it sequences differently every time.
Everyone who looks at the standards knows this because for Chrissakes, Fused-Multiply-Add is standards compliant but _optional_.
Why is this remotely news?
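For anyone who has not seen it first hand, a minimal demonstration. `naive_sum` is a hand-rolled loop written for this example, used so the result does not depend on how the built-in `sum()` happens to be implemented.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation, one rounding per addition."""
    total = 0.0
    for v in values:
        total += v
    return total


print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False

terms = [1e16, 1.0, 1.0, -1e16]
print(naive_sum(terms))                      # 0.0: each 1.0 is absorbed by 1e16
print(naive_sum([1.0, 1.0, 1e16, -1e16]))    # 2.0: same terms, different order
print(math.fsum(terms))                      # 2.0: fsum tracks the lost low-order bits
```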
1. Re:Utterly Unsurprising by Anonymous Coward · 2013-07-29 05:11 · Score: 0
  
  Why is this remotely news?
  You might not care, but:
  
  However, its fractional tendency, which is the change of the standard deviation relative to the value itself, remains nearly zero with time. In a seasonal prediction framework, the ensemble spread due to the differences in software system is comparable to the ensemble spread due to the differences in initial conditions that is used for the traditional ensemble forecasting.
  That's the news.
Lorenz, the Butterfly Effect and Chaos Theory by alanw · 2013-07-28 04:03 · Score: 3, Informative

Edward Lorenz discovered that floating point truncation causes weather simulations to diverge massively back in 1961.
This was the foundation of Chaos Theory and it was Lorenz who created the term "Butterfly Effect"
http://www.ganssle.com/articles/achaos.htm
1. Re:Lorenz, the Butterfly Effect and Chaos Theory by Anonymous Coward · 2013-07-28 05:53 · Score: 0
  
  thank you for pointing this out. I'm amazed these idiots thought this was even worthy of mention or considered "news" or a "discovery" in any sense of the word. Get off slashdot and go read some math and science books you knuckle-dragging neckbeards
2. Re:Lorenz, the Butterfly Effect and Chaos Theory by alanw · 2013-07-28 06:50 · Score: 1
  
  another link: http://www.aps.org/publications/apsnews/200301/history.cfm
  
  Instead of starting the whole run over, he started midway through, typing the numbers straight from the earlier printout to give the machine its initial conditions. Then he walked down the hall for a cup of coffee, and when he returned an hour later, he found an unexpected result. Instead of exactly duplicating the earlier run, the new printout showed the virtual weather diverging so rapidly from the previous pattern that, within just a few virtual "months", all resemblance between the two had disappeared.
3. Re:Lorenz, the Butterfly Effect and Chaos Theory by Anonymous Coward · 2013-07-28 07:57 · Score: 0
  
  Why can't they just settle on which species of butterfly to use in the simulations?
The butterfly that changed the weather for the wor by Anonymous Coward · 2013-07-28 04:07 · Score: 0

The butterfly that changed the weather for the world was not in Texas. It was rather at the end of a floating point word.
This is why regression testing matters by Anonymous Coward · 2013-07-28 04:25 · Score: 0

This is a common problem with all serious scientific codes. If it's important, you rerun test cases any time you change compiler flags or system software and compare results to make sure the changes are within an acceptable tolerance. They're never the same, so if the change is larger than the threshold, human examination and judgement are required to determine if the change is acceptable. It's not uncommon to discover latent bugs that didn't appear until the machine actually did what the programmer wrote.
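A bare-bones sketch of such a check. The tolerances and the sample numbers are placeholders invented for the example; a real suite would pick thresholds per variable.

```python
import math


def compare_runs(baseline, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Return (index, baseline, candidate) for every value outside tolerance."""
    if len(baseline) != len(candidate):
        raise ValueError("runs have different lengths")
    return [
        (i, b, c)
        for i, (b, c) in enumerate(zip(baseline, candidate))
        if not math.isclose(b, c, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical outputs of one test case before and after a compiler-flag change.
baseline = [5520.125, 5534.75, 5498.0]
candidate = [5520.125000001, 5534.75, 5499.5]

for index, old, new in compare_runs(baseline, candidate):
    print(f"value {index}: {old} -> {new} needs a human look")
```

Bitwise equality is deliberately not the pass criterion here; only differences beyond the threshold get escalated.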
Anyone who thinks that this is a solved problem, or ever will be a solved problem doesn't understand the many issues involved which range from algorithm choice to order of execution and intermediate result truncation.
FWIW So far as I know, no x86 systems provide 128 bit floating point. Power, Sparc and Z series are the only options for that I'm aware of. I spent a good bit of time investigating this when code of mine had convergence issues.
It also demonstrates the problem with modeling climate change. But of course, if you already know what the answer you want is, you can just modify things until you get the answer you want.
So... by Anonymous Coward · 2013-07-28 04:39 · Score: 0

So... of all the hardware tested, which one more accurately predicted the actual weather?
That's what I want to know!
Sh*t For Brains by Anonymous Coward · 2013-07-28 04:40 · Score: 0

Endianness, floating-point representation, long-short INT, roundoff, machine error have been known from the 1970's as posted above.
Trouble now is that a 'new' kid is in town: the Geographer (Geo-groper).
Geographer + computer + stolen (borrowed) code (the Geographer does not even understand INT32) = Shitty output.
The UN IPCC 'Reports' are ripe with shit that they call 'science.'
And the Politicos and Warmer-boys just eat it up faster than it dribbles out.
But who is to complain? The National Science Foundation (lots of Geo-gropers) throw money at this shit like the Treasury can't print enough money fast enough.
Like I wrote: SFB.
Splitting atoms, wx forecasts & zen by Anonymous Coward · 2013-07-28 04:58 · Score: 0

Excuse my completely uneducated, non-scientific response but, this is, in essence about weather forecasting, right? It would seem to a layman like myself you are a group of highly trained scientists of different genres looking to be as accurate as possible. There is one variable that I'm sure none of you wants to admit. I highly respect, appreciate, & admire what you do for the common good. With all the super computers, past data, software and modeling, there's a fly in the soup. You guys, at the end of the day, still have to have a little luck to be correct! Mother Nature has no part in anything to do with science. Chaos is by definition impossible to predict! It's just a thought I wanted to throw out there. Anyone who hunts or fishes passionately knows what I'm alluding to. Everything is in "perfect" condition for the hunt & the game is nowhere to be found. I'm not being critical, but you can't be 100% accurate with anything to do with nature. Thanks for all you do!
doubles by Anonymous Coward · 2013-07-28 05:13 · Score: 0

just use doubles.
may be slightly slower, but you won't have this problem.
Pentium 4 by ISoldat53 · 2013-07-28 05:37 · Score: 1

I didn't know anyone was still using the old Pentiums anymore.
1. Re:Pentium 4 by samwichse · 2013-07-30 03:34 · Score: 1
  
  -1 wrong processor to you, sir.
Chaos by goodmanj · 2013-07-28 06:01 · Score: 2

This is what chaotic systems do. Not to worry, it doesn't change the accuracy of the forecast.
A better article by slew · 2013-07-28 06:18 · Score: 2

A better article...
From what I can gather, although the code was well scrubbed so that the single processor, threaded and message passing (MPI) versions produce the same binary result indicating no vectorization errors, machine rounding differences caused problems.
Since all the platforms were IEEE754 compliant and the code was mostly written in Fortran 90, I'm assuming that one of the main contributors to this rounding is the evaluation order of terms and perhaps the way that double Fourier series and spherical harmonics were written.
Both SPH and DFS operations use sine/cosine evaluation, which varies a great deal from platform to platform (since generally they only round within 1ulp, not within 1/2ulp of an infinitely precise result).
I remember many moons ago, when I was working on fixed-point FFT accelerators, we were lazy and generated sine/cosine tables using the host platform (x86) and neglected to worry about the fact that using different compilers and different optimization levels on the same platform we got twiddle-factor tables that were different (off-by-one).
With one bug report, we eventually tracked it down to different intrinsics (x87 FSIN w/ math or FSINCOS) being used, and sometimes libraries were used. Ack... In later library releases we compiled in a whole bunch of pregenerated tables to avoid this problem.
Of course putting in a table or designing your own FSIN function for a spherical harmonic or Fourier series numerical library solver might be a bit out of scope (not to mention tank the performance), so I'm sure that's why they didn't bother to make the code platform independent w/ respect to transcendental functions, although with Fortran 90, it seems like they could of fixed the evaluation order issues (with appropriate parentheses to force a certain evaluation order, something you can't do in C).
1. Re: A better article by trick-knee · 2013-07-30 03:30 · Score: 1
  
  oh, man. this was way better than mainstream erotica, but just as I was about to climax I read "could of" and could not maintain state.
  in future editions, please use "could have". thanks so much.
The paper is paywalled by Anonymous Coward · 2013-07-28 07:22 · Score: 0

"The paper is paywalled" - then don't link to it! Link only to open, accessible content. If someone wants to brick themselves up in a paywall ghetto, don't give them publicity.
Video transcoding by DigiShaman · 2013-07-28 07:30 · Score: 1

Handbrake transcodes video as a multi-threaded application. I have yet to try it, but if I re-encoded the same video multiple times from the same source, would I get a different file size based on an MD5 or SHA1 checksum?

--
Life is not for the lazy.
1. Re:Video transcoding by Anonymous Coward · 2013-07-28 08:34 · Score: 0
  
  If the results of each thread are always combined in the same order, it could give the same result (excluding timestamps, etc.)
2. Re:Video transcoding by Anonymous Coward · 2013-07-28 23:38 · Score: 0
  
  OMG. Don't you know Handbrake is written by a well-known Apple Fanboi??? It is therefore the model of absolute perfection and is incapable of making a math error no matter what data is fed to it.
Calculators get 0.5 - 0.4 - 0.1 wrong ... by perpenso · 2013-07-28 07:35 · Score: 2

Here's a simple example. Try 0.5 - 0.4 - 0.1 on a calculator or calculator app. If it is using the FPU it will probably get a non-zero result. This is why calculators, including ours, are normally implemented using decimal arithmetic rather than the FPU.

All IEEE754 would do is ensure that each FPU based calculator would yield the same non-zero result.
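The same experiment in Python, for anyone without a calculator handy. Python's `float` is the FPU's binary double; the `decimal` module is used here as a stand-in for a calculator's decimal arithmetic (an assumption about how such calculators work internally).

```python
from decimal import Decimal

binary_result = 0.5 - 0.4 - 0.1
decimal_result = Decimal("0.5") - Decimal("0.4") - Decimal("0.1")

print(binary_result)    # a tiny non-zero value
print(decimal_result)   # 0.0
print(binary_result == 0, decimal_result == 0)   # False True
```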
1. Re:Calculators get 0.5 - 0.4 - 0.1 wrong ... by Anonymous Coward · 2013-07-28 12:49 · Score: 1
  
  And that doesn't help if you are trying to do operations that produce repeating numbers in base 10. You're just trading one set of problem numbers for a different set of problem numbers. Even if you used a format that did math with exact rational numbers (or even real numbers...), that would still not solve some of the larger problems inherent in weather prediction models and the fact there are much larger errors in the original measurements. You could spend 100x the computation effort for no gains if it is still buried in the noise due to initial conditions.
I have run into this by LF11 · 2013-07-28 07:43 · Score: 2

It is surprising how quickly certain rounding errors can add up. I've had the dubious pleasure of writing an insurance rating algorithm based on multiplying tables of factors. The difference between half-up and banker's round at 6 decimal places makes for rating errors totalling > 50% of the expected premium in a surprisingly small number of calculations. It's one thing to know about error propagation from a theoretical standpoint, but it's quite another to see it happen in real life.
I sympathize with the weather forecasters.
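A contrived sketch of how the two rounding modes pull apart. The base value and factors are invented so that products land exactly on ties; a real rating table would be nothing like this tidy, and `rate` is a hypothetical helper.

```python
from decimal import ROUND_HALF_EVEN, ROUND_HALF_UP, Decimal

SIX_PLACES = Decimal("0.000001")


def rate(base, factors, rounding):
    """Multiply base by each factor, rounding to 6 decimal places after every step."""
    value = base
    for factor in factors:
        value = (value * factor).quantize(SIX_PLACES, rounding=rounding)
    return value


base = Decimal("0.000001")
factors = [Decimal("2.5")] * 3

half_up = rate(base, factors, ROUND_HALF_UP)
bankers = rate(base, factors, ROUND_HALF_EVEN)
exact = base * Decimal("2.5") ** 3

print(half_up, bankers, exact)
```

Three multiplications are enough here for the two modes to land well apart, with the unrounded product sitting between them.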
Re: by Anonymous Coward · 2013-07-28 08:00 · Score: 1

(x + y + z) != (z + y + x)
That's correct. Addition is not associative with floating point numbers. So, 1,000,000 + 1,000 + 1,000 is not necessarily the same as 1,000 + 1,000 + 1,000,000.
Also, x * x + x != x * (x + 1) but many compilers make this substitution to reduce code size or increase FPU throughput.
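A quick way to see the second claim is to search a range of inputs, sketched below. The range is arbitrary, and which inputs differ (and how many) is left for the script to report instead of being asserted here.

```python
def direct(x: float) -> float:
    """x*x + x as written."""
    return x * x + x


def factored(x: float) -> float:
    """The algebraically equal form a compiler might substitute."""
    return x * (x + 1.0)


xs = [i / 10 for i in range(1, 1001)]
differing = [x for x in xs if direct(x) != factored(x)]

print(f"{len(differing)} of {len(xs)} inputs give different results")
if differing:
    x = differing[0]
    print(x, direct(x), factored(x))
```

Small whole-number inputs do not show the effect, because every intermediate value is exactly representable.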
So the weatherman isn't always right? by Nyder · 2013-07-28 08:10 · Score: 1

Didn't we know this? Take forecasts with a grain of salt because they could be wrong?

--
Be seeing you...
Not going away by sjames · 2013-07-28 09:15 · Score: 1

This problem is not going to go away unless/until computers start doing their math rationally and symbolically. That is, with fractional results stored as fractions with no rounding. Where irrational constants are used in calculations, they'll have to be carried through in symbolic form as you would using pencil and paper. That is, the computer actually stores a representation of 1/2pi, NOT 1.570796327.
Of course, that leaves the 'minor matter' of performance.
1. Re:Not going away by blueg3 · 2013-07-28 13:02 · Score: 1
  
  These are non-algebraic simulations. Even symbolic math libraries -- which there are no shortage of -- cannot do better.
2. Re:Not going away by sjames · 2013-07-28 15:15 · Score: 1
  
  Symbolic math can also cover calculus. What there is a shortage of is symbolic math libraries that have performance even close to IEEE floating point.
3. Re:Not going away by blueg3 · 2013-07-29 00:04 · Score: 1
  
  That's true, because it's much harder. I know symbolic math covers calculus -- I've used plenty of such packages.
  By non-algebraic I mean numerical-only. Unable to be computed using symbolic mathematics.
4. Re:Not going away by sjames · 2013-07-29 04:01 · Score: 1
  
  Then you don't quite get my meaning. I mean that in the program when you set a as 1/4pi, it actually stores the representation of 4 and the representation of pi as a ratio in the variable. It might be a return value from atan() for example. It would continue to hold pi as a discrete entity in the variable (since it cannot be perfectly represented in decimal or fractional form). It would only go away if a gets divided by some other value containing pi so that the pi-s cancel or you call a function that forces it to a rounded numeric.
  There is no such thing as numeric computation that can't be accomplished symbolically, numbers are symbols.
5. Re:Not going away by blueg3 · 2013-07-29 05:00 · Score: 1
  
  No, I understand you just fine. But you're falling victim to a common line of thinking -- that any mathematical problem can necessarily be done with symbolic algebra. It's true for simple problems, but most complex models (and lots of other classes of problems) cannot be done symbolically at all. They absolutely require numerical approximations.
  Don't try to be pedantic about "numbers are symbols". There are two general approaches to mathematical computation, and they're usually referred to as "numerical" and "symbolic".
6. Re:Not going away by sjames · 2013-07-29 06:05 · Score: 1
  
  No, you are really missing a fundamental truth of mathematics. The so-called numerical approach is a limitation of the computer, not a limitation of the mathematics. 2pi *IS* in every way a number. Just like 3+23i is a number. Just like 1/3 is a number. We can reduce and approximate 2pi as 6.28 if we like, but at the cost of precision. We can't actually reduce 3+23i so we invented a complex data type to handle it. 1/3 (one third) is most certainly a number that we can express (more or less) as 0.3333333 (etc) or we can maintain as a compound value with numerator=1 and denominator=3. We can multiply it by i yielding i/3. We can divide it by pi yielding i/3pi. We can multiply that by the circumference of the unit circle yielding 2i/3.
  The fact that there exists no data type that can store 2pi as two times the value of pi is a limit of computation, not a limit of computability.
  Please, do show me here, some problem that cannot be expressed and solved using symbols. I'll get the popcorn.
7. Re:Not going away by sjames · 2013-07-29 06:16 · Score: 1
  
  Sorry to double reply but I believe I saw where you have gone into the weeds. I do indeed understand iterative models where you cannot solve the equation (for example, there is no equation for the n body problem where N>2). That does NOT mean you cannot perform the iterative computations in terms of ratios and symbolic values. There is no need to round the value of pi, just carry pi through the computations and reduce at the very end if you must (and can). Given enough ram there is no need to round at all. Store rational numbers as numerator and denominator and irrational numbers in terms of their symbol (invent a new one if you must).
  For example, there is no representation of an imaginary number that is not in terms of i, so we carry i through the variables where necessary (using the complex data type). We can likewise carry pi or a messy arbitrary fraction.
8. Re:Not going away by blueg3 · 2013-07-29 09:29 · Score: 1
  
  There is an equation for the N-body problem. There's not a closed-form solution. That's hardly unique, though, there aren't closed-form solutions for huge swaths of physical problems. There are even bigger concerns than a lack of closed-form solutions: like systems where you're solving a continuum function but have to use an iterative solver that quantizes space. Hence techniques like finite-element modeling. But, I digress.
  Here's a good example of the problem you're missing. There is a whole class of problems that are only solvable through iterative improvement. Newton's method is an excellent example of an iterative-approximation technique. (It's by no means a particularly complicated iterative approximation technique, so it's a good example.) You start with an arbitrary (read: well-chosen) guess and iteratively converge on the answer. Here, being able to exactly model irrational constants like pi is not really helpful.
  Now, you seem to think that maintaining a rational-number notation will help you. But that's not really any different from using arbitrarily-large floating-point binary numbers. In fact, you can see that a floating-point binary number is a rational number: it's K / 2^N for some K and N. Given two floating-point numbers and some basic operation on those numbers, you can store the result of the operation exactly in roughly a number of bits equal to the sum of the number of bits in the operands. (So, an operation on two 80-bit numbers would yield a 160-bit number.) This is roughly the same expense as doing "symbolic" algebra: given some way of storing a rational number, an operation on two of those rational numbers will take up the storage space of the pair. Unless some parts of the ratio cancel out, which for sufficiently complicated systems, will not occur to any meaningful extent.
  So, it's strictly true that you can start performing numerical approximation methods (which are sometimes the only way of solving a problem) using symbolic algebra. If you had arbitrary amounts of memory, you could even continue doing it. But this is no better than just using very large (high-precision) floating-point numbers. Even then, that doesn't mean that you can obtain an exact answer, because a) sometimes answers are irrational and b) it's not necessarily the case that an irrational answer will be representable as an algebraic expression containing only rational numbers and known irrational constants so c) no finite number of iterations of your approximation methods, storing the result in a finite amount of memory, could possibly exactly find the irrational answer. Even if the solutions are all rational, though, you still can't reasonably use a symbolic approach because there's just not enough memory. As above, each iteration very roughly can be expected to double the number of bits you need to exactly represent the solution. It's nothing to expect hundreds of thousands of iterations in order to arrive at a solution. (Far, far more than that for any interesting problem.) And there's probably on the order of no more than 2^300 bits available, ever.
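The storage blow-up is easy to watch. A sketch using Newton's method for sqrt(2) with Python's exact `Fraction` type; the choice of problem and starting guess is mine, picked only because it is small.

```python
from fractions import Fraction


def newton_sqrt2(iterations: int) -> list[Fraction]:
    """Newton's method for x*x = 2 in exact rational arithmetic, starting from 1."""
    xs = [Fraction(1)]
    for _ in range(iterations):
        x = xs[-1]
        xs.append((x + 2 / x) / 2)
    return xs


def storage_bits(x: Fraction) -> int:
    """Bits needed to hold numerator and denominator exactly."""
    return x.numerator.bit_length() + x.denominator.bit_length()


iterates = newton_sqrt2(8)
for n, x in enumerate(iterates):
    error = abs(float(x * x - 2))
    print(f"iteration {n}: {storage_bits(x)} bits, |x*x - 2| = {error:.1e}")
```

The bit count roughly doubles every iteration and the answer is still never exactly sqrt(2), since that is irrational.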
Re:Global Warming Predictions by Anonymous Coward · 2013-07-28 09:18 · Score: 0

Actually, it's worse.
many predictions - most, nowadays, are out-and-out fraud.
But you mustn't say this.....
Well, if the system dynamics are governed entirely by random perturbations, then fraud is of no consequence, just as legitimate prediction is of no consequence.
Numerical Analysis... by Anonymous Coward · 2013-07-28 09:35 · Score: 0

While cretins dribble on about the importance of using 64-bit, 80-bit, 128-bit, or one million bits of floating point precision, there used to be this little mathematical discipline known as 'Numerical Analysis' that has a little bit to say about the issue. For god's sake, does no-one in IT actually get a proper education these days?
1) Weather prediction algorithms are SUPPOSED to have minor inaccuracies introduced into the data set. This is the whole idea. Run the calculation, say, three or four times with minor noise values added to the input values (from your weather station collection points). If the predictions from each run vary greatly from one another, this is indicative that the prediction is essentially junk. If each prediction is fairly close to the others, this is indicative that the computer program MAY be giving a fairly accurate weather forecast.
2) The above method is actually used to decide the accuracy of longer term weather forecasts. Forecasts close to the present time are expected to be highly accurate. It is the fall-off in accuracy as the prediction time increases that is of interest to the meteorologists.
3) Weather prediction software should NOT be vulnerable to issues of precision, rounding or whatever. The software should have been written by people with a proper understanding of the mathematical theories of numerical analysis. To make this clear to those of you too thick to get it, here's a neat example:
MPEG1 and MPEG2 used floating point methods in video compression/decompression, and as a consequence compression was inefficient, frequently incorrect against desired targets, and had video decoders that would produce different results given the same data streams to display. Then proper mathematicians got involved. They dropped the cretinous "always use doubles" method of junk programming. They examined the mathematical 'space' the algorithms needed to operate in, and created mathematically correct INTEGER methods to handle compression/decompression. Unlike with MPEG2, every MPEG4 (H264) decoder produces the same result.
There are no rules in maths that say floating point is better/more correct than integer, or that doubles are better than singles. Indeed, a lack of understanding of the principles of numerical analysis means that thick-headed programmers can make all kinds of dreadful mistakes by the simple order in which the calculations are carried out (even if said order would be OK if each value carried infinite precision). Too many programmers are proud to be crap at maths. These crap programmers are the ones that ALWAYS use doubles, and will go to even greater 'precision' if their code doesn't work as they expect.
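The perturbed-run idea in points 1) and 2) can be sketched with a toy chaotic system. This is only an illustration: the logistic map stands in for a forecast model, and the names `logistic_step` and `ensemble_spread`, the noise size, and the member count are all assumptions, not anything a weather service actually uses.

```python
import random
import statistics

def logistic_step(x, r=4.0):
    """One step of the logistic map, a toy stand-in for a chaotic forecast model."""
    return r * x * (1.0 - x)

def ensemble_spread(x0, steps, members=20, noise=1e-6, seed=0):
    """Run perturbed copies of the model and return the spread (std dev) per step."""
    rng = random.Random(seed)
    # Each ensemble member starts from the same state plus a tiny random nudge.
    states = [x0 + rng.uniform(-noise, noise) for _ in range(members)]
    spread = []
    for _ in range(steps):
        states = [logistic_step(x) for x in states]
        spread.append(statistics.pstdev(states))
    return spread

spread = ensemble_spread(x0=0.3, steps=60)
print(f"spread after  5 steps: {spread[4]:.2e}")
print(f"spread after 60 steps: {spread[59]:.2e}")
```

A small spread at short lead times and a large one at long lead times is the fall-off in confidence described above.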
Re: by Anonymous Coward · 2013-07-28 10:38 · Score: 0

As long as you don't exceed the capacity of the fraction bits, floating point operations on pure integers represented as floating point numbers are actually exact, so that's not a good example. 1002000, 1001000, 1000000, 2000 and 1000 are all exactly representable as IEEE754 floating point numbers, so the order really doesn't matter in this case.
Microsoft Access by Tablizer · 2013-07-28 10:50 · Score: 1

I've seen Microsoft Access do the same thing. Apparently Person-B had loaded a slightly different OS date-handler DLL because they found a bug for date patterns of a specific country they happened to be interested in once. A specific spot on a report that calculated date difference thus produced slightly different answers than if run on the PC of Person-A, making the final totals not add up the same.

--
Table-ized A.I.
Global warming anyone? by Anonymous Coward · 2013-07-28 12:29 · Score: 0

So how much confidence should we have in calculations that purport to predict average global temperatures and sea levels 50 or 100 years from now?
1. Re:Global warming anyone? by blueg3 · 2013-07-28 13:03 · Score: 1
  
  That's the simulation of climate rather than weather, which is a substantially different problem. It's a problem that's still hard and is still plagued by chaos-theory effects on numerical modeling. Not to worry, though: scientists have understood this problem and its implications for about 7 orders of magnitude longer than you've heard about it.
Repeatings digits can be expressed as fractions by perpenso · 2013-07-28 13:39 · Score: 1

And that doesn't help if you are trying to do operations that produce repeating numbers in base 10. You're just trading one set of problem numbers for a different set of problem numbers.
Yes and no. You get rounding in either base when you have insufficient significant digits. However, by not doing a conversion from one base to another you avoid a second opportunity for rounding errors.

Also, numbers with repeating digits can be expressed as a fraction. In our calculator a fraction is a basic data type. If an operation includes a fraction we will try to produce a result that is a fraction. This can sometimes avoid a rounding error.
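The three behaviours described here can be seen side by side with Python's standard `decimal` and `fractions` modules. This is only a sketch using those modules as stand-ins; it says nothing about how the calculator mentioned above is actually implemented.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point: 0.1, 0.2 and 0.3 are each rounded when converted to base 2.
binary_ok = (0.1 + 0.2 == 0.3)

# Base-10 arithmetic: no base conversion, so these short decimals stay exact.
decimal_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# But base 10 still rounds when the digits repeat: 1/3 is cut off at the
# context precision (28 digits by default).
decimal_third_ok = (Decimal(1) / Decimal(3) * 3 == 1)

# A fraction type keeps 1/3 exact, so multiplying back by 3 recovers 1.
fraction_third_ok = (Fraction(1, 3) * 3 == 1)

print(binary_ok, decimal_ok, decimal_third_ok, fraction_third_ok)
# False True False True
```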

... would still not solve some of the larger problems inherent in weather prediction ...
I'm not suggesting a solution to these problems. I am just providing a simple example of how an FPU or IEEE754 can get things wrong.
1. Re:Repeatings digits can be expressed as fractions by muridae · 2013-07-28 16:02 · Score: 1
  
  Also, numbers with repeating digits can be expressed as a fraction.
  Not always. Please express e or pi or tau or the square root of 2 as fractions. Not as an infinite Taylor series of fractions for pi, but just pi = x/y.
2. Re:Repeatings digits can be expressed as fractions by mhotchin · 2013-07-28 18:16 · Score: 1
  
  "Repeating". None of the numbers you indicated are repeating (sometimes called periodic) decimal (or any other integer base) expansions.
3. Re:Repeatings digits can be expressed as fractions by Anonymous Coward · 2013-07-28 23:37 · Score: 0
  
  I think 'repeating digits' is meant to be something like 0.1232323... non-rational numbers don't have this property in any base.
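For a number like 0.1232323..., the conversion to a fraction is mechanical. The sketch below uses the standard schoolbook rule; the function name `repeating_to_fraction` and its argument layout are made up for illustration.

```python
from fractions import Fraction

def repeating_to_fraction(integer_part, prefix, cycle):
    """Convert integer_part.prefix(cycle)(cycle)... to an exact fraction.

    prefix is the non-repeating digit string after the point (may be empty),
    cycle is the repeating digit string (must not be empty).
    """
    if not cycle:
        raise ValueError("cycle must contain at least one digit")
    n, k = len(prefix), len(cycle)
    with_cycle = int(prefix + cycle)
    without_cycle = int(prefix) if prefix else 0
    # Standard rule: (digits through one cycle - non-repeating digits)
    # divided by k nines followed by n zeros.
    return integer_part + Fraction(with_cycle - without_cycle, (10**k - 1) * 10**n)

print(repeating_to_fraction(0, "1", "23"))   # 61/495
print(repeating_to_fraction(0, "", "3"))     # 1/3
```

No such rule exists for pi, e, or the square root of 2, because their expansions never settle into a repeating cycle.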
4. Re:Repeatings digits can be expressed as fractions by perpenso · 2013-07-29 06:09 · Score: 1
  
  Also, numbers with repeating digits can be expressed as a fraction.
  Not always. Please express e or pi or tau or the square root of 2 as fractions. Not as an infinite Taylor series of fractions for pi, but just pi = x/y.
  Pi does not have a permanent repeating pattern.
Nothing new here. by Anonymous Coward · 2013-07-28 14:33 · Score: 0

This has been known for at least 20 years.
1. Re:Nothing new here. by Pino+Grigio · 2013-07-28 23:55 · Score: 0
  
  Surprisingly not well known in the literature amongst climate scientists. This is just another uncertainty to throw into the mix. This is all notwithstanding the inability to accurately specify the start conditions, the model parameters and a whole host of other things. This is why model runs over the longer term diverge from reality. If they ever looked like they were accurate in the past, remember that they're "tuned" on past empirical data. That is to say, a curve-fitting exercise is done backwards, and then the simulation is run forwards. So they always look like they were right, even if they can't take the state "today" and run it backwards and get an accurate yesterday.
2. Re:Nothing new here. by Pino+Grigio · 2013-07-30 01:20 · Score: 1
  
  Some massive twat marked my comment down from 1? Really? It's almost as if they're political activists who don't like any criticism whatsoever of their "post-modern" scientific methods.
I realize that they use navier stokes equations by Anonymous Coward · 2013-07-28 16:57 · Score: 0

I realize that they use Navier-Stokes equations and samples of current atmospheric conditions, and then propagate based on models and probability to estimate the next hour's forecast, and then the next, and so on. In the end though, you can approximate their accuracy by saying '95% accurate today, 95% of 95% tomorrow, 95% of 95% of 95% the next day', and so on. After two weeks, you are at 50/50. The thing is: they aren't 95% accurate, they are only about 90%. Also, because of the nature of the calculations, they don't use high precision (hundreds/thousands/billions of digits) for calculations. You are stuck using the limits of the bit width of the registers in the CPU/ALU (64 bits on a 64-bit processor). Some might yelp about 'double precision', but if you start arbitrarily expanding the number of bits, you may as well use either high or arbitrary precision.
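The compounding rule of thumb above is easy to check. This sketch only reproduces the arithmetic of the comment (accuracy on day n taken as the daily figure raised to the n-th power); it is not a real forecast-skill measure, and `days_until_coin_flip` is a made-up name.

```python
def days_until_coin_flip(daily_accuracy, threshold=0.5):
    """Count the days until compounded accuracy drops to the threshold or below."""
    accuracy = 1.0
    day = 0
    while accuracy > threshold:
        accuracy *= daily_accuracy   # day n has accuracy daily_accuracy ** n
        day += 1
    return day

print(days_until_coin_flip(0.95))   # 14
print(days_until_coin_flip(0.90))   # 7
```

At 95% per day the 50/50 point arrives after two weeks, as stated; at 90% per day it arrives after one.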
It affects my tax returns! by eionmac · 2013-07-29 04:34 · Score: 1

1. I calculate my personal and corporation tax using four-digit decimals (with a two-digit Pound/Penny (or Dollar/Cent) system this seems OK).
2. Her Majesty's Revenue & Customs sometimes calculates to two decimals and sometimes does not in the same tax calculation.
3. Hence having paid my estimate of due tax, I have got demands for one penny to four pennies sent by post (cost say 15 pound to Revenue to issue pay or punishment warning letter) or face punishment and fines. I duly pay one penny at local Post Office in cash, who charge HMRC 4 pounds or so to transmit the penny due. They (the Post Office staffers) laugh and say this is a very regular occurrence. Thus penny rounding error costs HMRC say 19 or 20 pounds of spend to collect. It would appear they cannot just 'not balance the books and not collect' [computer instruction in calculation to disregard sums due of less than xx pennies] due to stringency of reporting to parliament they have done all possible to collect due tax. (A tick box syndrome on HMRC officials reporting to government)
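A penny discrepancy of this kind is easy to reproduce. In the sketch below the rate, the amounts, and the function name `total_due` are all hypothetical, picked so that the effect shows up; the only point is that rounding each line to two places gives a different total than carrying four places and rounding once at the end.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE = Decimal("0.175")                           # hypothetical rate, for illustration
AMOUNTS = [Decimal("100.20"), Decimal("200.60")]  # hypothetical taxable amounts

def total_due(amounts, working_places):
    """Round each item's tax to `working_places` decimals, sum the items,
    then round the total to whole pennies."""
    step = Decimal(1).scaleb(-working_places)     # e.g. 0.0001 for 4 places
    subtotal = sum((a * RATE).quantize(step, rounding=ROUND_HALF_UP) for a in amounts)
    return subtotal.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

four_place = total_due(AMOUNTS, 4)
two_place = total_due(AMOUNTS, 2)
print(four_place, two_place, two_place - four_place)   # 52.64 52.65 0.01
```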

--
Regards Eion MacDonald
Should have wrote "repeating decimal" by perpenso · 2013-07-29 06:22 · Score: 1

I should have written "repeating decimal", not "repeating digits".
http://en.wikipedia.org/wiki/Repeating_decimal
Doesn't anyone read the classics? by khb · 2013-07-29 10:34 · Score: 1

http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html "What Every Computer Scientist Should Know About Floating-Point Arithmetic", by David Goldberg, published in the March, 1991 issue of Computing Surveys
I don't know why anyone thought this was surprising (it would have been surprising if they didn't get different results, given that some use GPUs, some don't, etc.). What does tend to get "amusing" is that even with the same processor folks get different results (sometimes due to software issues, chip rev issues, or actual hardware bugs that go undetected ... but are minor enough to remain so unless someone gets really careful and whips out the old logic analyzer).
Inconsistent constants too by Anonymous Coward · 2013-08-09 12:35 · Score: 0

Once I was challenged to resolve a mismatch in the 19th digit in a customer's CFD code.
He had a constant in the deck (FORTRAN): "PI=3.14".
These codes are full of such cruft. Some have been pressed into global warming climate modeling use. Half of the community cries foul. The other half wants more budget.
The code needs to be opened up.....!
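To give a sense of how much a hard-coded `PI=3.14` costs, here is a small sketch comparing it against the double-precision value from Python's `math` module. The variable names and the circle-area example are assumptions for illustration; the customer's actual code is not shown.

```python
import math

PI_TRUNCATED = 3.14   # the hard-coded constant described in the comment

# Relative error of the truncated constant against the double-precision value.
relative_error = abs(PI_TRUNCATED - math.pi) / math.pi
print(f"relative error of 3.14: {relative_error:.2e}")

# Any quantity proportional to pi inherits the same relative error.
radius = 2.0
area_truncated = PI_TRUNCATED * radius**2
area_full = math.pi * radius**2
print(area_truncated, area_full)
```

The error is on the order of five parts in ten thousand, which shows up in the fourth significant digit of anything computed from the constant.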