Theoretical Breakthrough Made In Random Number Generation (threatpost.com)
msm1267 quotes a report from Threatpost: Two University of Texas academics have made what some experts believe is a breakthrough in random number generation that could have longstanding implications for cryptography and computer security. David Zuckerman, a computer science professor, and Eshan Chattopadhyay, a graduate student, published a paper in March that will be presented in June at the Symposium on Theory of Computing. The paper describes how the academics devised a method for the generation of high quality random numbers. The work is theoretical, but Zuckerman said down the road it could lead to a number of practical advances in cryptography, scientific polling, and the study of other complex environments such as the climate. "We show that if you have two low-quality random sources -- lower quality sources are much easier to come by -- two sources that are independent and have no correlations between them, you can combine them in a way to produce a high-quality random number," Zuckerman said. "People have been trying to do this for quite some time. Previous methods required the low-quality sources to be not that low, but more moderately high quality. We improved it dramatically." The technical details are described in the academics' paper "Explicit Two-Source Extractors and Resilient Functions."
Hardware random number generation based on stochastic random processes (e.g. Johnson noise across a resistor) is a real thing.
There is the minor difficulty that it requires that you actually add the hardware to do the sampling - sure HP didn't mind dropping a /dev/hwrng into my new $3500 laptop, but most people don't buy top-of-the-line hardware, and nobody likes hardware dongles.
The REAL difficulty is that you're now subject to all the usual threats to analog processes. We like to just *assume* that all the noise sources in our analog devices are uncorrelated because that makes our analyses tractable, but that's not the case. In other words, you want to believe that the 20th bit of the noise-sampling ADC is independently flapping in the breeze and providing 1/0 randomly and in equal measure, but that doesn't make it so. This is especially the case inside a box packed with high-speed digital circuitry, which will be coupling nonrandom, nonwhite signals into anything it can. Worst of all, these signals will tend to be correlated with the ADC: Its conversion clock, like almost every other clock signal in a modern computer, is probably either phase-locked to or derived directly from the same crystal in order to provide reliable, synchronous timing between circuits.
Mersenne Twister is pretty much the standard for simulating a uniform distribution in a lot of scientific computing. These depend not only upon unpredictability (useful for avoiding biases, and clearly important in the security realm), but also upon properties of the uniform distribution.
But when we test it out, we find it's still not as great as we'd like: look at a histogram of outputs, and you'll see that until you get really large numbers of function calls, the histogram isn't particularly uniform. (In particular, numbers near the bottom and top of the range don't get called quite as often.) This means that simulation properties that rely upon uniform distributions over both long and short time periods may be thrown off, and short- and mid-time simulation results may well stem from the MT rather than from the mathematical model. Moreover, low-probability events may have artificially smaller probabilities in the simulations (because of the non-uniformity of the distributions near the bottom and top ends of the range).
Over very short numbers of function calls (a few hundred to a few thousand), the outputs can even tend to cluster in a small neighborhood. So suppose that you are simulating a tissue with many cells, and calling MT as your uniform distribution to decide if the cells divide or stay dormant (each with independent probability p, so each cell divides if PRNG/max(PRNG) < p). The math says that for a uniform distribution, you don't need to worry about what order you evaluate your decision across all the cells. But if the PRNG outputs cluster over several sequential calls, then a neighborhood of cells may simultaneously divide if they are all evaluated close to one another sequentially. In analyzing the spatial behavior of such a simulation, you may draw incorrect conclusions in smaller spatial structures that, again, derive from non-uniformity of the PRNG, rather than problems with predictability. (And then you may accept/reject a particular hypothesis or mathematical model pre-maturely.)
So, there's definitely more to it than just unpredictability, depending upon where the code is being used.
OpenSource.MathCancer.org: open source comp bio
Multiplication and other arithmetic operators are indeed secret as far as the average American high schooler is concerned.
Well, a Mathematician was interrogated because some woman on a plane saw him........
doing math, which in today's America is a sign of trrsm.
The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
Mersenne Twister is pretty much the standard for simulating a uniform distribution in a lot of scientific computing.
No. Permuted Congruential Generators are. There's no contest. MT is slower and doesn't pass TestU01.
MT is used for reasons of inertia.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
Mersenne Twister is pretty much the standard for simulating a uniform distribution in a lot of scientific computing
kind of is, for the reason as you say of:
MT is used for reasons of inertia.
MT19937 is pretty good. Much better than the LCGs that were popular before it. Those have some really serious pitfalls and you can hit them awfully easily. MT19937 is generally pretty good, though not perfect. It has far fewer pitfalls than LCGs and you're actually pretty unlikely to hit them (it doesn't fail TestU01, it passes almost all of the tests and fails a few). It's also fast enough that you have to try quite hard to make the RNG the limiting factor (it is possible though).
You can go wrong with MT19937, but it's hard to. That makes it pretty decent. Prior to C++11, xorshift used to be my go-to generator that didn't suck because it was a 5 line paste. These days, it's MT19937 simply because it's in the standard library and good enough for almost all tasks.
SJW n. One who posts facts.