Slashdot Mirror


Linux RNG May Be Insecure After All

Okian Warrior writes "As a followup to Linus's opinion about people skeptical of the Linux random number generator, a new paper analyzes the robustness of /dev/urandom and /dev/random . From the paper: 'From a practical side, we also give a precise assessment of the security of the two Linux PRNGs, /dev/random and /dev/urandom. In particular, we show several attacks proving that these PRNGs are not robust according to our definition, and do not accumulate entropy properly. These attacks are due to the vulnerabilities of the entropy estimator and the internal mixing function of the Linux PRNGs. These attacks against the Linux PRNG show that it does not satisfy the "robustness" notion of security, but it remains unclear if these attacks lead to actual exploitable vulnerabilities in practice.'" Of course, you might not even be able to trust hardware RNGs. Rather than simply proving that the Linux PRNGs are not robust thanks to their run-time entropy estimator, the authors provide a new property for proving the robustness of the entropy accumulation stage of a PRNG, and offer an alternative PRNG model and proof that is both robust and more efficient than the current Linux PRNGs.

13 of 240 comments (clear)

  1. Random number generators are hard by moonwatcher2001 · · Score: 2, Insightful

    Linus always says Linux is perfect. Linus can be wrong.

    1. Re: Random number generators are hard by Anonymous Coward · · Score: 5, Insightful

      No, RNGs are easy. Super easy. Just take a trustworthy source of noise

      Therein lies the tricky part. Getting a trustworthy source of noise is harder than you may think. Especially when you're writing software with no control over the hardware it runs on.

    2. Re: Random number generators are hard by Vintermann · · Score: 5, Insightful

      The nice thing about randomness though, is that it adds up. If you xor one stream of hopefully random bits with another stream of hopefully random bits, you get a result that is at least as random as the best of the two streams, quite possibly better than either. It's a rare and precious thing in cryptography: something you can't make worse by messing up. At worst you make no difference.

      So if you're paranoid, come up personally with a ridiculously long phrase (you don't need to remember it), feed it through a key derivation function, and use it in a stream cipher with proven security guarantees (in particular one that passes the next-bit test for polynomial time). Instead of using this directly, xor it together with a source of hopefully random stuff.

      If you write to /dev/random this is more or less what happens. Write to it to your heart's content - it can only make it better, not worse. (This is as I recall, please check with an independent source before you try).

      Voila, no matter what NSA has done to your HRNG chip, this door is secured. Your time is better spent focusing on the other doors, or the windows.

      (But you should be very careful in using HRNG output directly. I am very surprised to read that some open source OSes disable the stream cipher if a HRNG is present - this is a very bad idea!)

      --
      xkcd is not in the sudoers file. This incident will be reported.
  2. At what scope of time or size of output data? by fishnuts · · Score: 4, Insightful

    At what scope/scale of time or range of values does it really matter if a PRNG is robust?
    A PRNG seeded by a computer's interrupt count, process activity, and sampled I/O traffic (such as audio input, fan speed sensors, keyboard/mouse input, which I believe is a common seeding method) is determined to be sufficiently robust if only polled once a second, or for only 8 bits of resolution, exactly how much less robust does it get if you poll the PRNG say, 1 million times per second, or in a tight loop? Does it get more or less robust when the PRNG is asked for a larger or smaller bit field?

    Unless I'm mistaken, the point is moot when the only cost of having a sufficiently robust PRNG is to wait for more entropy to be provided by its seeds or to use a larger modulus for its output, both rather trivial in the practical world of computing.

    1. Re:At what scope of time or size of output data? by jythie · · Score: 4, Insightful

      The headline is somewhat sensational. There is a pretty wide gulf between an abstract and rather arbitrary metric and a practical vulnerability. This is kinda the security equivalent of pixel peeping, a fun mathematical exercise at best and pissing contest at worst, but ultimately not all that important.

    2. Re:At what scope of time or size of output data? by jhol13 · · Score: 5, Insightful

      Your attitude is exactly what is wrong with security. Quite a few still use MD5 because "it is not that broken". Linus really should take a look in this new provably better method and adapt it ASAP and not wait until it bites hard.

    3. Re:At what scope of time or size of output data? by FireFury03 · · Score: 4, Insightful

      It probably has a radio with signals of varying strengths and packet losses.

      This is true. Although I suspect a lot of the time the CPU isn't going to get to see a lot of that stuff (although with most wireless chipsets being softmac devices, maybe it does?)

      It also probably has a multitude of routed, nonrouted, and broadcast packets on its various network interfaces.

      In a dark-start situation (coming up from a wide area power outage), possibly not. Imagine an ADSL router whereby all the clients are connected via wireless. The clients can't talk to the router until it has managed to initialise the wireless encryption subsystems, which requires entropy. Even on a wired network, you may well only see a few DHCP requests from workstations. Obviously rebooting a router onto a large active network is different to rebooting the entire network at the same time, as would happen in a power outage.

      And on top of that, it's connected to a global network of nodes with varying packet delivery times, and where at any time it can ask for a multitude of continuously changing and a least partially stochastic metrics (e.g. exchange rates, 4th word of news headlines, youtube +1 counts, etc, etc).

      A global network that it may well be unable to access until it has enough entropy to make a cryptographic handshake with the upstream peer.

    4. Re:At what scope of time or size of output data? by Xest · · Score: 3, Insightful

      Exactly. If the recent leaks have taught us anything it's that the NSA has managed to produce real working exploits where previously such issues have just been discarded as nothing to worry about because they were "only theory".

      At this point it's stupid to assume that just because you can't come up with a working exploit that someone with the resources the NSA has hasn't already.

      It's of course not even just the NSA people should worry about, it seems naive to think the Russians, Chinese et. al. haven't put similar resources into this sort of thing, the difference is they just haven't had their Snowden incidents yet. I'd imagine the Chinese and Russians have exploits for things the NSA hasn't managed to break as well as the NSA having exploits for things the Chinese and Russians haven't managed to break. Then there's the Israelis, the French, the British and many others.

      It's meaningless to separate theory and practice at this point. If there's a theoretical exploit then it should be fixed, because whilst it may just be theoretical to one person, it may not to a group of others.

    5. Re:At what scope of time or size of output data? by Anonymous Coward · · Score: 2, Insightful

      Lots of people still use MD5, because it's widely available, fast, and good enough for what they need. Just like CRC32 is good enough for certain tasks. MD5 may be broken for cryptographic purposes, but it's still fine to use for things like key-value stores.

    6. Re:At what scope of time or size of output data? by Rockoon · · Score: 4, Insightful

      Its trivial to show that typical PRNG's are frequently not good enough for the monte-carlo simulations and testing that they are thrown at. An N-bit PRNG can only produce at most half of all N+1 bit sequences, a quarter of all N+2 bit sequences, an eighth of all N+3 bit sequences, and so on....

      Given this, obviously the larger N is, the better. Of course most standard libraries use a 32-bit (or less) generator and most programmers are lazy or uneducated in the matter... so only half of all 8-billion to one shots are even possible with that 32-bit generator and then each can only be sampled if the generator just happens to be in the perfect 4-billion to one shot state....

      As the saying goes...Random numbers are too important to be left to chance.

      --
      "His name was James Damore."
  3. Re:Whatever happened to by mark_reh · · Score: 4, Insightful

    avalanche diodes conduct bursts of current at random times. A true random number generator simply measures time between those bursts of current then scales that value to whatever numerical range you need.

    You can also time the clicks produced by a geiger-mueller tube detecting beta radiation from a radioactive source, but that requires a lot more difficult-to-integrate hardware.

    Even if you base the final random number on a truly random source you have to ensure that the scaling routine doesn't introduce any sort of bias into the final value. THAT is the tricky part.

  4. Re:Dilbert RNG by jmhobrien · · Score: 5, Insightful

    I think you need to re-assess your attitude. Perhaps some people have not seen those links? Did you consider the possibility that the comment was not for you? Like rhetorical questions?

    Lighten up or Fuck off. You are taking this way too seriously.

    --
    Where is moderation: -1 False?
  5. Re:Hear me out: Locally Generated Entropy Pool by Anonymous Coward · · Score: 2, Insightful

    RC4 is cracked, and a 120MHz radio signal can bias your generator.

    (That wiped the smile off your face, didn't it!)

    Two unstable Os, at different frequencies, work well, with an appropriate whitener and shielding. You need to determine when measurable entropy falls below a certain level from raw and cease. Use that to seed a pure CSPRNG - the Intel generator uses this construct with the DRBG_AES_CTR (which seems to be OK, if you trust AES, unlike the hairbrained and obviously backdoored DRBG_Dual_EC). You also need a mechanism for software to read the pre-whitened seed so it can estimate, or replace your CSPRNG with another (like Fortuna).