Ask Slashdot: How Reproducible Is Arithmetic In the Cloud?
goodminton writes "I'm research the long-term consistency and reproducibility of math results in the cloud and have questions about floating point calculations. For example, say I create a virtual OS instance on a cloud provider (doesn't matter which one) and install Mathematica to run a precise calculation. Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time. In the cloud, hardware, firmware and hypervisors are invisible to the users but could still impact the implementation/operation of floating point math. Say I archive the virutal instance and in 5 or 10 years I fire it up on another cloud provider and run the same calculation. What's the likelihood that the results would be the same? What can be done to adjust for this? Currently, I know people who 'archive' hardware just for the purpose of ensuring reproducibility and I'm wondering how this tranlates to the world of cloud and virtualization across multiple hardware types."
Use Fixed-point arithmetic.
In Mathematica make sure to specify your precision.
Look at 'Arbitrary-Precision Numbers' and 'Machine-Precision Numbers' for more information on how Mathematica does this.
What the title says - e.g. bignum for Python etc. It will be significantly slower, but the result is going to be stable at least for a given library version, and that is far easier to archive.
Mathematica in particular uses adaptive precision; if you ask it to compute some quantity to fifty decimal places, it will do so.
In general, if you want bit-for-bit reproducible calculations to arbitrary precision, the MPFR library may be right for you. It computes correctly-rounded special functions to arbitrary accuracy. If you write a program that calls MPFR routines, then even if your own approximations are not correctly-rounded, they will at least be reproducible.
If you want to do your calculations to machine precision, you can probably rely on C to behave reproducibly if you do two things: use a compiler flag like -mpc64 on GCC to force the elementary floating point operations (addition, subtraction, multiplication, division, and square root) to behave predictably, and use a correctly-rounded floating point library like crlibm (Sun also released a version of this at one point) to make the transcendental functions behave predictably.
This assumption by the OP:
Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time.
... is entirely wrong. One of the defining features of Mathematica is symbolic expression rewriting and arbitrary-precision computation to avoid all of those specific issues. For example, the expression:
N[Sin[1], 50]
Will always evaluate to exactly:
0.84147098480789650665250232163029899962256306079837
And, as expected, evaluating to 51 digits yields:
0.841470984807896506652502321630298999622563060798371
Notice how the last digit in the first case remains unchanged, as expected.
This is explained at length in the documentation, and also in numerous Wolfram blog articles that go on about the details of the algorithms used to achieve this on a range of processors and operating systems. The (rare) exceptions are marked as such in the help and usually have (slower) arbitrary-precision or symbolic variants. For research purposes, Mathematica comes with an entire bag of tools that can be used to implement numerical algorithms to any precision reliably.
Conclusion: The author of the post didn't even bother to flip through the manual, despite having strict requirements spanning decades. He does however have the spare time to post on Slashdot and waste everybody else's time.