Ultra-low-cost True Randomness
Cryptocrat writes "Today I blogged about a new method for secure random sequence generation that is based on physical properties of hardware, but requires only hardware found on most computer systems: from standard PCs to RFID tags." Basically he's powercycling memory and looking at the default state of the bits, which surprisingly (to me anyway) is able to both to fingerprint systems, as well as generate a true random number. There also is a PDF Paper on the subject if you're interested in the concept.
A slightly more expensive but somehow even more random method is to seed the generator against the words and phrases that come out of the mouth of South Carolina's Miss Teen USA.
:)
But in all seriousness, I wonder how this compares to the Mersenne Twister (Java implementation & PDF)that I use at home? I am almost sure this new proposed method is more efficient and faster, when will there be (I know, I'm lazy) a universal implementation of it?
Also, this may be a stupid question, but I wonder how one measures the 'randomness' of a generator? Is there a unit that represents randomness? I mean, it would be seemingly impossible to do it using observation of the output so I guess all you can do is discuss how dependent it is on particular prior events and what they are, theoretically. Can you really say that this is 'more random' than another one because you have to know so much more before hand about the particular machine & its fingerprint in order to predict its generated number?
My work here is dung.
I wouldn't assume that these fingerprints are as unique or pattern-less as one might hope (a fact discussed in the pdf). All of the RAM chips from a given wafer or given mask may share tendencies toward some patterns of the probability of a 0 or 1. These patterns may appear as correlations between rows and columns of a given chip. Location on the wafer (in the context of nonuniformities of exposure to fab chemicals) might also systematically affect the aggregate probabilities of 0 or 1 or the repeatability of the fingerprint. The quality of these fingerprints to be consistent or random might change from run to run and from manufacturer to manufacturer. Finally, I'd bet that the probabilities vary systematically with temperature -- e.g., the probability of a 1 increases for all bits as the chip's temperature increases.
This is a very interesting phenomenon, but a lot more data is needed to show that it provides consistent behavior.
Two wrongs don't make a right, but three lefts do.
the true RNG properties rely on the fact that:
a: Many of the bits are sorta random, but physically random. So very biased coins, but true randomness.
b: With the right reduction function, you can turn a LOT (eg, 512 Kb) of cruddy random data to a small amount (128b-512b) of very high quality, well distributed random.
And the fingerprinting relies on the fact that:
a: Many other of the bits are physically random, but VERY VERY biased. So map where those are and record them and it is a very good fingerprint. And since it is all silicon process randomness going into that, it is pretty much a physically unclonable function.
Kevin Fu has some SMART grad students.
Test your net with Netalyzr
Also, this may be a stupid question, but I wonder how one measures the 'randomness' of a generator?
Read James Gleick's Chaos.
There is a method in that book that describes how they extracted attractors from a stream of data. Here's how it works.
A team of researchers had a bathtub with a dripping faucet. They tuned the dripping until the drips fell at random intervals. Nothing rhythmic about it. As the drop broke away from the faucet, it was setting up a vibration in the remaining water that would jiggle until the next drop fell. It was highly nonlinear.
They constructed a phase space where you would look at the time between any two drops. On the other axis was the time between the one previous to that. So on one axis you have the time bewteen drops 1 and 2, and on the other axis between drops 2 and 3.
It turns out that an attractor would emerge. The times did not scatter around the page randomly, they grouped in clusters. There was an underlying order that this method would expose.
So - to answer your question, what you could do would be to take your stream of numbers, and examine them in phase space looking at the differences between each data point. If nothing shows up in a two dimensional plot, go for three. Use n1-n2, n2-n3 and n3-n4 on your axis. Add dimensions if you need to beyond that. See what it takes to make your data cluster, if it ever does. The more complex your data is, the more dimensions it will take to visualize that.
Weaselmancer
rediculous.
The original paper is much better than CmdrTaco's quick conclusions.
The described method is ONLY for SRAM (statical RAM), no DRAM, no SDRAM. You can find this on RFID chips and in a CPU'S cache, not in RAM. As there is no way to access a CPU's cache uninitialized, I can't see why this should be useful.
If you have to modify a CPU first, to allow access to it's unitialized caches (think about all the unwanted implications), it's much cheaper to just give it a thermal diode and register to poll (as most modern CPU's already have).
After all the described method is just another way of collecting thermal noise. As RFID's are custom designs most of the time, also there it would be cheaper to just use a thermal diode.
The only application for this would be if you had to develop strong crypto for legacy RFID chips.
Slashdot stories get worse by the day.
I've had this bite me, and exploited it.
It bit me when booting into Windows CE - you'd power cycle the thing, and the OS would boot with the old RAM disk you had - we'd gotten to the point where we'd have the bootloader wipe the kernel memory so the data structures were all corrupted by the time the OS was trying to decide between mounting the RAM disk (object store) and starting fresh. It turns out that the longer an image is unchanged in RAM, the more likely the cells woudl be biased such that if you cycle the power on them, they're more likely to lean towards the way they were before power was cut.
The time I exploited it, I didn't have any way of logging. Logging to serial port caused issues (timing-sensitive code), so I logged to memory (and no, I had no filesystem running, so I couldn't log to file). My trick was to simply log to a circular RAM buffer. When it crashed, I would just power cycle and dump the RAM buffer. Even though the data was fresh, it was enough to make out what my debug message was trying to say (almost always perfect). This was readable after a brief power cycle, and was still readable after turning power off for nearly a minute. The characters got corrupted, but since it was regular ASCII, you could still make out the words.