Slashdot Mirror


Ask Slashdot: How Reproducible Is Arithmetic In the Cloud?

goodminton writes "I'm researching the long-term consistency and reproducibility of math results in the cloud and have questions about floating point calculations. For example, say I create a virtual OS instance on a cloud provider (doesn't matter which one) and install Mathematica to run a precise calculation. Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time. In the cloud, hardware, firmware and hypervisors are invisible to the users but could still impact the implementation/operation of floating point math. Say I archive the virtual instance and in 5 or 10 years I fire it up on another cloud provider and run the same calculation. What's the likelihood that the results would be the same? What can be done to adjust for this? Currently, I know people who 'archive' hardware just for the purpose of ensuring reproducibility and I'm wondering how this translates to the world of cloud and virtualization across multiple hardware types."

3 of 226 comments

  1. Re:Fixed-point arithmetic by Anonymous Coward · Score: 1, Interesting

    Or simply don't use the broken "cloud computing" model. If you have some calculations to do, and care the least about the results, how about buying a computer that does those calculations for you?

  2. Re:Fixed-point arithmetic by raddan · Score: 4, Interesting

    Experiments can vary wildly with even small differences in floating-point precision. I recently had a bug in a machine learning algorithm that produced completely different results because I was off by one trillionth! I was being foolish, of course, because I hadn't used an epsilon when comparing FP values, but you get the idea.
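
To illustrate the epsilon point above, here is a minimal Python sketch. The helper name `nearly_equal` and the tolerance values are assumptions chosen for illustration, not anything from the original post; suitable tolerances depend on the scale of the data.

```python
import math

def nearly_equal(a: float, b: float,
                 rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Tolerance-based comparison (hypothetical helper).

    The default tolerances are illustrative assumptions only.
    """
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)

# Exact equality fails because 0.1 and 0.2 have no exact binary representation.
print(0.1 + 0.2 == 0.3)                  # False
print(nearly_equal(0.1 + 0.2, 0.3))      # True

# A difference of one trillionth is absorbed by the tolerance.
print(nearly_equal(1.0, 1.0 + 1e-12))    # True
```

Note that a tolerance only hides small discrepancies in a single comparison; it does not make results identical across machines.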

    But it turns out that even if you're a good engineer and you are careful with your floating point numbers, the fact is: floating point is approximate computation. And for many kinds of mathematical problems, like dynamical systems, this approximation changes the result. One of the founders of chaos theory, Edward Lorenz, of Lorenz attractor fame, discovered the problem by truncating the precision of FP numbers from a printout when he was re-entering them into a simulation. The simulation behaved completely differently despite the difference in precision being in the thousandths. That was a weather simulation. See where I'm going with this?
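
The sensitivity described above can be sketched with a much simpler chaotic system. The example below uses the logistic map as a stand-in; it is not Lorenz's weather model. The starting values 0.506127 and 0.506 are the figures commonly cited for his truncation, and the function name `logistic_orbit` is made up for this illustration.

```python
def logistic_orbit(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate x -> r * x * (1 - x), a simple chaotic map (stand-in system)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

full = logistic_orbit(0.506127, 50)      # "full precision" starting value
truncated = logistic_orbit(0.506, 50)    # same value truncated to three decimals

# Gap between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(full, truncated)]
for step in (0, 1, 5, 10, 20, 50):
    print(f"step {step:2d}: gap = {gaps[step]:.6f}")
```

The two runs start about 0.0001 apart and stay close for the first few steps, after which the gap grows to the same order as the values themselves.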

  3. Re:You need to know some numerical analysis by Red Jesus · Score: 4, Interesting

    While that's true in many cases, there are some situations in which we need . Read Shewchuk's excellent paper on the subject.

    When disaster strikes and a real RAM-correct algorithm implemented in floating-point arithmetic fails to produce a meaningful result, it is often because the algorithm has performed tests whose results are mutually contradictory.

    The easiest way to think about it is with a made-up problem about sorting. Let's say that you have a list of mathematical expressions like sin(pi*e^2), sqrt(14*pi*ln(8)), tan(10/13), etc., and you want to sort them, but some numbers in the list are so close to each other that they might compare differently on different computers that round differently (e.g., one computer says that sin(-10) is greater than ln(100)-ln(58) and the other says it's less).

    Imagine now that this list has billions of elements and you're trying to sort the items using some sort of distributed algorithm. For the sorting to work properly, you *need* to be sure that a < b implies that b > a. There are situations (often in computational geometry) where it's OK if you get the wrong answer for borderline cases (e.g. it doesn't matter whether you can tell whether sin(-10) is bigger than ln(100)-ln(58) because they're close enough for graphics purposes) as long as you get the wrong answer consistently, so the next algorithm out (sorting in my example, or triangulation in Shewchuk's) doesn't get stuck in infinite loops.
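
One way to get tests that never contradict each other is to evaluate the predicate exactly. The sketch below does this for a 2D orientation test, the kind of predicate a triangulation relies on, using Python's `fractions.Fraction`. This is plain exact rational arithmetic, assumed here for simplicity; it is not Shewchuk's adaptive-precision method, and it is far slower. The function name `orient2d_exact` is made up for this example.

```python
from fractions import Fraction

def orient2d_exact(a, b, c) -> int:
    """Orientation of the triangle a, b, c, computed without rounding.

    Returns +1 for counterclockwise, -1 for clockwise, 0 for collinear.
    Ints and floats convert to Fraction exactly, so the sign of the
    determinant is the true sign for the given inputs.
    """
    ax, ay = map(Fraction, a)
    bx, by = map(Fraction, b)
    cx, cy = map(Fraction, c)
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (det > 0) - (det < 0)

a, b = (0.5, 0.5), (12.0, 12.0)
c = (24.0, 24.0 + 2.0 ** -40)    # barely above the line through a and b

print(orient2d_exact(a, b, c))                    # 1
print(orient2d_exact(a, c, b))                    # -1
print(orient2d_exact((0, 0), (1, 1), (2, 2)))     # 0
```

Because the result depends only on the input bits and not on how the hardware rounds intermediate values, every machine gives the same answer, and swapping two points always flips the sign.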