Slashdot Mirror


Entropy Problems For Linux In the Cloud

CalTrumpet writes "Our research group recently spoke at Black Hat USA on the topic of cloud computing security. One of the interesting outcomes of our research was the discovery that the combination of virtualization technologies and public system images results in a problem for random number generation on guest operating systems. This is especially true for Linux, since its PRNG uses only a small set of entropy-gathering events, and virtual Linux images often generate SSH host keys within seconds of their initial boot. The slides are available; the PRNG vulnerability material begins at slide 63."

41 of 179 comments

  1. Another advantage for TPM chips... by Anonymous Coward · · Score: 4, Informative

    TPM chips have their bad things, but one thing they do offer is a cryptographically secure RNG. It's completely understandable not to trust it 100%, but you can use the random number stream it puts out as a good addition to the /dev/random number pool.

    1. Re:Another advantage for TPM chips... by iYk6 · · Score: 4, Insightful

      Or you could plug in a microphone.

    2. Re:Another advantage for TPM chips... by profplump · · Score: 3, Informative

      Most of the RNG chips publish pretty good specifications on the design of their entropy source, the amount of real entropy it provides, and the circumstances in which that entropy level might be reduced. There could be implementation or production errors of course, just like there could be runtime or compiler errors with software, but the design is available for perusal and has been analyzed.

      For example, the Intel 82802:
      http://www.cryptography.com/resources/whitepapers/IntelRNG.pdf

    3. Re:Another advantage for TPM chips... by evanbd · · Score: 4, Informative

      You don't need a mic. The resistor noise on the sound card inputs is present and of secure quantum origin, regardless of whether a microphone is plugged in. The microphone noise is louder, but it's much harder to determine how much secure entropy is present. Why trust it when you don't have to? There's plenty available for most purposes without it. The Turbid program does this in an efficient and secure manner (and they have a paper discussing the details, along with the relevant proofs, for the curious).

    4. Re:Another advantage for TPM chips... by profplump · · Score: 4, Insightful

      First, real-world images are not very random just by virtue of being part of the real world; random things also need to happen. This is particularly true of mostly-static images like you'd see in 24/7 web cams -- there is not much entropy available.

      Second, most of the reason we want random data for seeding purposes is because the seed needs to be something an attacker cannot derive. The output of a truly random number generator cannot be predicted by a remote attacker, but publicly available video streams most certainly can, so any source that sends the same data to more than one person is not suitable for things like cryptography. Frankly that's the whole point of the article; if there are many VMs on the same host, or many real hosts on the same hardware and network, started at the same time, and using the same source for entropy, they will all generate the same "random" number.

      Finally, this is a well-solved problem. Many CPUs and motherboards include a hardware RNG that is perfectly sufficient both in terms of randomness and speed for typical PRNG seeding needs. VIA has had one directly in all their CPUs for a long time, Intel includes one in their firmware hubs, and I'm sure there are similar options on most other architectures. Using that on-board RNG to individually seed each VM/host would solve the problem described in the article. There's no reason to try to invent ways to get random data unless you have very specific requirements not met by the existing solutions, as you're quite likely to come up with something inferior either in design or implementation.

    5. Re:Another advantage for TPM chips... by muckracer · · Score: 2, Insightful

      > There's no reason the host can't export that same /dev/random to the guest;
      > certainly to ensure there is sufficient entropy on startup.

      Wouldn't the low-budget solution to this entire issue be the simple deferral of SSH key creation and the like for a few minutes past the initial boot-up?
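
      A minimal sketch of that deferral idea, assuming the init script is allowed to poll the kernel's entropy estimate before it generates host keys. The procfs path is the standard Linux one; the function names, threshold, and polling interval are hypothetical choices for illustration. As another comment in this thread notes, the estimate itself can be too optimistic inside a VM, so this only delays the problem rather than proving the pool is unpredictable.

```python
import time

# Standard Linux procfs file holding the kernel's entropy estimate (in bits).
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate in bits."""
    with open(path) as f:
        return int(f.read().strip())


def wait_for_entropy(threshold_bits, reader=read_entropy_avail,
                     poll_seconds=1.0, max_polls=300, sleep=time.sleep):
    """Poll until the estimate reaches threshold_bits.

    Returns True once the threshold is met, False if max_polls is exhausted.
    The reader and sleep parameters are injectable so the logic can be
    exercised without touching /proc (an assumption made for testability).
    """
    for _ in range(max_polls):
        if reader() >= threshold_bits:
            return True
        sleep(poll_seconds)
    return False
```

      A boot script could call `wait_for_entropy(256)` and only then invoke key generation, falling back to a warning if it returns False.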

  2. Getting creative by Brian+Gordon · · Score: 2, Interesting

    How about getting signed entropy from a trusted server on the network/internet? How about putting that microsecond-accurate system clock to use?
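
    A rough sketch of why the system clock alone is a weak seed, assuming an attacker can estimate the boot time to within some window. Python's general-purpose `random.Random` stands in for whatever generator derives the key; the function names and the window size are hypothetical. The point is only that a microsecond clock leaves a search space small enough to enumerate.

```python
import random


def key_from_clock(seed_us: int) -> int:
    """Derive 128 bits of 'key material' from a microsecond timestamp seed."""
    return random.Random(seed_us).getrandbits(128)


def recover_seed(observed_key: int, approx_us: int, window_us: int):
    """Try every timestamp within +/- window_us of the attacker's estimate.

    Returns the seed that reproduces observed_key, or None if it is
    outside the searched window.
    """
    for candidate in range(approx_us - window_us, approx_us + window_us + 1):
        if key_from_clock(candidate) == observed_key:
            return candidate
    return None
```

    Knowing the boot time to within a millisecond means at most about two thousand candidates here, which is trivial to search.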

    1. Re:Getting creative by JWSmythe · · Score: 3, Funny

          Well, clearly that "Linux" thing is a toxic gas weapon being used by the reds. Ya, I'd worry about them blowing up a chemical weapon in the clouds. They obviously got the technology from the Nazis (no, not a candidate for Godwin's law).

          I don't know about you, but I'm grabbing my M1 Garand and heading down to the shelter under the house. Once that Linux stuff clears, they'd better have thought twice about attackin' my good ol US of A.

          Well, you asked what they would have thought 50 years ago, didn't you? :)

         

      --
      Serious? Seriousness is well above my pay grade.
    2. Re:Getting creative by Brian+Gordon · · Score: 4, Interesting

      I think of some primitive post-human civilization struggling to industrialize amid the ruins of the heat-dead universe.

      There's little solid matter left. Nobody really knows why; the legends tell of ancient, sprawling empires releasing great monsters that consume worlds and deliver energy to fuel their eons-old wars in the cold between the stars. Several human colonies survived the Last Scourge. One even knew something of their people's history. This colony of merchant-scholars thrived in an old space-borne city drifting about a great lightyears-long dust cloud inexplicably left untouched by the wars. The city was old, very old, built by a generation of master engineers who etched their likenesses in the great canvases of the city's impervious white construction. Quiet machinery lurked untouched in the mysterious depths of the undercity, seen only by outcasts wandering alone through those vast echoing chambers.

      The city provided everything the civilization needed. Somehow (so much seemed like magic to them that even the usually-curious humans grew bored of speculation) their reservoirs filled with water, their air recycled, and their waste disappeared down bottomless shafts. All of their needs were filled, but they craved expansion and exploration. They were able to harvest some limited chemical energy from the food supplied by the city, and build using scrap. Still, entropy was a problem in the dust cloud of Linux. ....

    3. Re:Getting creative by Brian+Gordon · · Score: 2, Interesting

      From my fingers 3 hours ago, you insensitive clod. But if you liked it, maybe I should write similar teasers for some story ideas I've had sitting in a text file for a few months.....

    4. Re:Getting creative by the_womble · · Score: 2, Interesting

      You should. It is well written and has good ideas in it.

  3. Why is this done in software at all? by BadAnalogyGuy · · Score: 3, Interesting

    Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

    1. Re:Why is this done in software at all? by bradkittenbrink · · Score: 2, Insightful

      That's like asking "Why can't they add a DWIM opcode to the instruction set?"

    2. Re:Why is this done in software at all? by ShadowRangerRIT · · Score: 4, Informative

      First, the cost of computing truly random numbers is way too high for that, unless you are performing an iterative approach to random number generation (and then you have the problem of predictability). It could be done, but you'd be pumping a lot of hardware into computing values that would be thrown away 99.9%+ of the time.

      Secondarily, if your PRNG algorithm is broken, you're stuck replacing the hardware. At least a bad software PRNG can be replaced.

      That said, hardware PRNG is provided in many modern systems by a TPM. It lacks the performance problems associated with your solution, since it only generates random numbers on demand. It still has the problem of a potential exploit being discovered leading to expensive hardware upgrades, but to my knowledge that has not been a problem to date.

      --
      $_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
    3. Re:Why is this done in software at all? by Timothy+Brownawell · · Score: 3, Interesting

      Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

      Some do have something like that, although it's only about 800kbps instead of 4 bytes per cycle.

    4. Re:Why is this done in software at all? by Timothy+Brownawell · · Score: 3, Insightful

      Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

      First, the cost of computing truly random numbers is way too high for that

      Computers are deterministic. Truly random numbers cannot be computed, they can only be provided by special hardware (something that can measure shot noise or thermal noise, a camera pointed at a lava lamp, a movement detector in Schrodinger's cat's box).

      Secondarily, if your PRNG algorithm is broken, you're stuck replacing the hardware.

      That's why you don't do pseudo-random numbers, but real randomness from thermal noise or shot noise or some other quantum effect (cats and lava lamps don't fit on ICs).

      That said, hardware PRNG is provided in many modern systems by a TPM.

      And at some level, the randomness generator on the TPM almost certainly has an interface of "read this special register every X clock cycles" (because how else would you interface with your special hardware?).

      It lacks the performance problems associated with your solution, since it only generates random numbers on demand.

      If it's implemented in hardware (as it must be, to get true randomness), it's always running and there is no "on demand".

      It still has the problem of a potential exploit being discovered leading to expensive hardware upgrades, but to my knowledge that has not been a problem to date.

      That would be because it's a RNG instead of a PRNG.

    5. Re:Why is this done in software at all? by bradkittenbrink · · Score: 4, Insightful

      So, I was mostly just giving him shit because of his name. If you want a more serious debate, here's my best shot: for the instructions you described, it is relatively easy to define a generally useful specification. My main point was that every application has different standards for the randomness it requires. Do you need real quantum-mechanical randomness, or just a CSPRNG? How many bits of random data do you need, and how frequently? I'm assuming that the request is for real quantum-mechanical randomness. I find it hard to imagine defining a good spec for such a hardware component, especially since the vast majority of applications don't actually require quantum-mechanical randomness, and the ones that do are likely to have very specific requirements.

      Besides the fact that it's tough to come up with good requirements for such a feature, I bet it's really tough to implement as well. I know just barely enough about hardware implementations to be dangerous, so someone who knows for real, please correct me if I'm wrong. Circuits that exhibit quantum-mechanical randomness are, as far as I know, essentially the same as circuits that cause metastability in transistors. Because of the need to control for such problems, implementing such circuits on the same die as a normal digital circuit would likely be very expensive in terms of both die area and yield.

  4. Surely Not. by lobiusmoop · · Score: 2, Insightful

    Generating SSH keys involves interaction via at least keyboard and possibly mouse at a terminal. Surely that basic premise is enough to provide enough entropy for the pseudo-random generator. Also, the date and time (as sources of randomness) can't be virtualized of course.

    --
    "I bless every day that I continue to live, for every day is pure profit."
  5. Much ado about nothing. by Facegarden · · Score: 5, Funny

    All this complaining over random numbers is silly. All you really have to do is use 5. It's just as random as any other number, and it's easy to generate a 5.
    -Taylor

    --
    Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
    1. Re:Much ado about nothing. by BobisOnlyBob · · Score: 5, Funny

      This only proves how easy it is to generate a (5, Funny).

    2. Re:Much ado about nothing. by hannson · · Score: 2, Funny

      int getRandomNumber()
      {
          return 4; // chosen by fair dice roll.
                    // guaranteed to be random.
      }

    3. Re:Much ado about nothing. by jamesh · · Score: 2, Insightful

      Interesting that Dilbert (years ago) and xkcd (more recently) both contain a comic with a similar joke...

  6. Not surely by Kaseijin · · Score: 3, Interesting

    Generating SSH keys involves interaction via at least keyboard and possibly mouse at a terminal.

    SSH host keys are often generated automatically when the init script notices there aren't any.

  7. Big problem, but addressable by lamber45 · · Score: 3, Interesting

    The nice thing about Linux is that you can develop whatever entropy-producing process you want and write its output to /dev/urandom to add more entropy to the pool. For instance, a boot script could issue an HTTP request to a website backed by a hardware random-number generator (access-control to only machines in the cloud by IP range). It is something to be worried about, though.
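
    A minimal sketch of that boot-script step, assuming the extra bytes have already been obtained from somewhere trustworthy. The function name is hypothetical and the device path is a parameter so the logic can be tried against an ordinary file. One caveat worth stating: a plain write mixes the data into the pool but, as far as the random(4) man page describes it, does not raise the kernel's entropy count; crediting entropy needs the privileged RNDADDENTROPY ioctl, which is not shown here.

```python
import os


def mix_into_pool(data: bytes, device: str = "/dev/urandom") -> int:
    """Write bytes to the random device so the kernel mixes them into its pool.

    Returns the number of bytes written. This does not credit any entropy;
    it only stirs the supplied bytes into the existing pool state.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)
```

    Usage would look like `mix_into_pool(extra_bytes)` early in boot, before anything that generates keys.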

    Java code that does cryptography or generates UUIDs (in the hope that they will be a truly universal key for something) operates under similar problems. JavaScript is even worse; all it has is the time, perhaps the user's window-size (not very random if maximised) and mouse-movements, and the built-in random() method, which is not expected to be of cryptographic quality.
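
    The same split between a general-purpose generator and a cryptographic one exists in other runtimes. As an analogy only, here is how it looks in Python's standard library: `random` is documented as unsuitable for security purposes, while `secrets` and `uuid.uuid4()` draw from the operating system's generator. The helper names are made up for illustration, and the OS-backed calls are of course only as good as the kernel pool underneath them.

```python
import random
import secrets
import uuid


def predictable_token(seed: int) -> str:
    """Token from the general-purpose PRNG: anyone who knows the seed can reproduce it."""
    return f"{random.Random(seed).getrandbits(128):032x}"


def unpredictable_token() -> str:
    """Token from the operating system's CSPRNG via the secrets module."""
    return secrets.token_hex(16)


def random_uuid() -> uuid.UUID:
    """Version-4 (random) UUID, generated from OS-provided random bytes."""
    return uuid.uuid4()
```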

  8. Re:Doesn't SSH use OpenSSL? by morgan_greywolf · · Score: 5, Informative

    OpenSSL has a cryptographically secure random number generator. I know not everything uses it but doesn't (Open)SSH?

    No. By default, OpenSSH will use the system's pseudo-random number generator, but you can also make it use prngd or EGD (the Entropy Gathering Daemon) instead. Whether either is more "secure" than the kernel's built-in RNG I am not qualified to say.

  9. Linux has a paravirtual entropy driver by Anthony+Liguori · · Score: 5, Insightful

    CONFIG_HW_RANDOM_VIRTIO enables it. It's been there for quite a while.

    We could easily support it in KVM but I've held back on it because to really solve the problem, you would need to use /dev/random as an entropy source. I've always been a bit concerned that one VM could starve another by aggressively consuming entropy.

    lguest does support this backend device though.
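
    A small sketch of how a guest might check whether a hardware (or paravirtual) RNG backend is actually present, assuming the kernel's hw_random framework exposes its usual sysfs files `rng_current` and `rng_available`. The function names are hypothetical, and the `virtio_rng` name prefix and the `none` placeholder are assumptions about how the driver reports itself.

```python
from pathlib import Path

# Usual sysfs location of the hw_random framework on Linux.
HW_RANDOM_SYSFS = Path("/sys/class/misc/hw_random")


def current_hwrng(sysfs_dir=HW_RANDOM_SYSFS):
    """Return the name of the active hardware RNG backend, or None if there is none."""
    try:
        name = (Path(sysfs_dir) / "rng_current").read_text().strip()
    except OSError:
        return None
    # Assumption: an empty string or "none" means no backend is selected.
    return name if name and name != "none" else None


def has_virtio_rng(sysfs_dir=HW_RANDOM_SYSFS) -> bool:
    """Report whether any listed backend looks like the virtio entropy driver."""
    try:
        available = (Path(sysfs_dir) / "rng_available").read_text().split()
    except OSError:
        return False
    return any(name.startswith("virtio_rng") for name in available)
```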

    1. Re:Linux has a paravirtual entropy driver by plasmacutter · · Score: 2, Funny

      I heard the aliens from zeta reticuli utilize paravirtual entropy drivers to get to earth.

      --
      VLC FOR MAC IS DYING! IF YOU DEVELOP, PLEASE SAVE IT!!
  10. evidence that cloud is a fad? by noric · · Score: 2, Interesting

    I'd like some evidence that cloud computing is a fad. Tens of thousands of companies, in dozens of industries, do not list "computing hardware, availability, and capacity management" as a core competency, making them prime cloud customers.

    1. Re:evidence that cloud is a fad? by jellomizer · · Score: 2, Insightful

      It is a tool in the bucket. That's what it is. There will be a huge growth spurt, then they realize that it won't solve everything. Then they will cut back and still use it until they find something better.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
  11. Support via "Guest Additions"? by Just+Brew+It! · · Score: 2, Insightful

    Seems to me this could be solved via the "Guest Additions" module that most virtualization packages recommend you install in the guest OS. Use the GA to inject some entropy from the host system into the guest system's entropy pool. The host CPU's TSC register would probably be an excellent source.

  12. Eh? by ledow · · Score: 3, Interesting

    If you "need" cloud computing, then you're bright enough to install an entropy daemon on one of the machines and maybe even slap a hardware-based RNG on it (probably worth sourcing a VIA or similar just for this purpose, to be honest). It's not hard.

    Anything else, your "randomness" really doesn't matter and the standard entropy will be just fine.

    1. Re:Eh? by ledow · · Score: 2, Insightful

      A bold assertion. I assume you're thinking of TCP sequence numbers or similar. Otherwise, I call bullshit on the "ANY".

      And the entropy provided by being connected to a network in any way, shape or form is enough for that purpose.

      Even in general, unless you're generating LOTS of SSH/SSL keys on some kind of automated process schedule, you're fine, and that's the sort of task that should be pushed out to a dedicated entropy machine.

      Otherwise, every ADSL router etc. in the WORLD would be worthless - no keyboard, no mouse, no disk interrupts, etc. and yet they run full TCP stacks that hold the majority of the world's home connections. The fact is that it's just not as big an issue as you think it is.

  13. No one wants to make money off the Interwebs! by zullnero · · Score: 2, Insightful

    "The term cloud computing is useless" said Stamos. "It's way overused. It's mostly about gathering venture capital or selling your products."

    Yes. Because no one on the Internet has any use for gathering venture capital or selling products.

    It IS an overused term, but you're not testing some product or how people are using it, you're really just testing the security models of various operating systems to determine which are more ready to support those concepts that people grouped together and called "cloud computing". There were a lot of various concepts that were grouped together that comprised the "Net 2.0" concept too...and that cliche was just as derided for being overused. And yet, websites that aren't all ajaxed up or don't use css seem pretty old-fashioned these days.

    That said, the question I have is how ready for those "cloud computing" concepts is Windows, really? How much of that security model is using the proper approach to securing a transaction instead of just shutting down that path altogether?

  14. Definition of Cloud skewed by smueller · · Score: 2, Informative

    This is not a "cloud" problem. This is a virtual server and image problem. Clouds have nothing to do with virtual servers. If you use a service like NewServers.com, you can get dedicated physical servers for your cloud, on-demand and at hourly prices.

    1. Re:Definition of Cloud skewed by julesh · · Score: 2, Informative

      This is not a "cloud" problem. This is a virtual server and image problem. Clouds have nothing to do with virtual servers. If you use a service like NewServers.com, you can get dedicated physical servers for your cloud, on-demand and at hourly prices.

      Expanding on the other answer you've received, here's the basic problem:

      I can take a virtual server, install an image with a well-known PRNG seed in it, and use it for a little while. While it's used the PRNG is updated by entropy in an unpredictable way, resulting eventually in a virtual server image that produces effectively random numbers. When I shut it down the entropy pool is stored in its disk image, and reread when I start it up again. There is a small problem, but it goes away after a little while.

      That isn't the usage model for "cloud" servers, however. In a cloud environment, e.g. Amazon EC2, the servers are quite likely to run for only a few hours at a time (because you start them up when you need extra capacity, and stop them when you no longer need that capacity), so the image has no time to accumulate much entropy, and worst of all when you shut it down _the data on the OS image, including the entropy pool, is lost_. The basic model is that you have many servers, all sharing a read-only base disk image. The problem occurs each time you start up a new host, which can be quite frequently.

      Now, you could modify your images to stick their entropy pools in permanent storage (e.g. Amazon S3), but then you'll need some mechanism to prevent two servers from starting up with the same entropy pool, which is a non-trivial problem to solve, and I'll bet that very few EC2 users have thought to do it (I certainly didn't when I trialled EC2 a few months ago).
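
      The shared-image problem described above can be illustrated with a toy model. This is a sketch only: `ToyPool` is a hypothetical SHA-256 construction, not the kernel's algorithm, and the seed and instance IDs are made-up values. It shows that two instances booted from the same stored pool produce identical output, and that mixing in a per-instance value makes them diverge. Note that an instance ID is not secret, so an attacker who knows the image seed and the ID could still reproduce the output; real per-instance entropy is still needed.

```python
import hashlib


class ToyPool:
    """Toy entropy pool built on SHA-256; for illustration, not for real use."""

    def __init__(self, seed: bytes):
        # State restored from the seed file baked into the disk image.
        self.state = hashlib.sha256(seed).digest()

    def mix(self, data: bytes) -> None:
        """Fold new input into the pool state."""
        self.state = hashlib.sha256(self.state + data).digest()

    def read(self, n: int = 16) -> bytes:
        """Return up to 32 output bytes, then advance the state."""
        out = hashlib.sha256(b"out" + self.state).digest()[:n]
        self.state = hashlib.sha256(b"next" + self.state).digest()
        return out


IMAGE_SEED = b"seed file shipped in the public image"  # hypothetical value

# Two instances cloned from the same image start in the same state.
vm_a, vm_b = ToyPool(IMAGE_SEED), ToyPool(IMAGE_SEED)
same_output = vm_a.read() == vm_b.read()

# Mixing in something unique per instance makes the outputs differ.
vm_c, vm_d = ToyPool(IMAGE_SEED), ToyPool(IMAGE_SEED)
vm_c.mix(b"instance-0001")
vm_d.mix(b"instance-0002")
different_output = vm_c.read() != vm_d.read()
```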

  15. Re:Big problem, but addressable by CalTrumpet · · Score: 5, Informative

    Actually, /dev/random and /dev/urandom have their own, separate secondary pools that are fed off of a main pool when entropy is "depleted" in the second level pools. This is an area of research for us as well, since Linux's entropy estimation algorithm fails in situations where the timing deltas of entropy gathering events (IRQs and disk IOs) are actually predictable, so it's possible that the second level pools are not being refreshed at appropriate times.
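
    To make the estimation concern concrete, here is a simplified sketch loosely modeled on a delta-based timing heuristic: take the first, second, and third-order differences of event timestamps and credit bits according to the smallest one. The function name, the cap of 11 bits, and the exact arithmetic are assumptions for illustration and may not match any particular kernel version. What it shows is that perfectly regular events earn no credit, while a fully deterministic but irregular pattern is credited on every event.

```python
def credit_per_event(timestamps, cap=11):
    """Return the entropy credit (in bits) assigned to each event timestamp."""
    last_time = last_delta = last_delta2 = 0
    credits = []
    for t in timestamps:
        delta = t - last_time
        delta2 = delta - last_delta
        delta3 = delta2 - last_delta2
        last_time, last_delta, last_delta2 = t, delta, delta2
        smallest = min(abs(delta), abs(delta2), abs(delta3))
        # Roughly log2 of the smallest delta, capped at `cap` bits.
        credits.append(min((smallest >> 1).bit_length(), cap))
    return credits


# Perfectly periodic events: only the first event (warm-up) gets any credit.
periodic = [1000 * i for i in range(1, 21)]

# Deterministic repeating intervals: predictable, yet every event is credited.
patterned, now = [], 0
for i in range(20):
    now += [100, 900, 300, 700][i % 4]
    patterned.append(now)
```

    Comparing `sum(credit_per_event(periodic)[1:])` with `sum(credit_per_event(patterned))` shows the gap between "looks irregular" and "is unpredictable to an attacker".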

    If you write to /dev/urandom, it goes into the primary pool by tradition. This is what the rc scripts do on bootup with the random seed file on disk.

    BTW, it's absolutely the wrong solution to get entropy from another source on the network (for many reasons, but one is that you can't do a secure HTTPS handshake without, you guessed it, unguessable random numbers). The whole point here is that we are looking for a way for 500 Linux instances on EC2 to have different entropy pools before the kernel completes boot. The only possible solution is for the hypervisor (Xen for Amazon) to provide a simulated HW RNG that pulls entropy from a real HW RNG or from an entropy daemon in the hypervisor.

    The best way to learn about Linux RNG basics is Gutterman et al., Analysis of the Linux Random Number Generator. Several of the issues they describe have been addressed, such as their PFS concerns, but their description of the entropy pools is still accurate.

  16. FTA... by NotBorg · · Score: 4, Funny

    "This falls somewhere between a very big deal and irrelevant," says Wagner.

    I'm glad he cleared that up for me.

    --
    I want this account deleted.
  17. The generation of random numbers... by ameline · · Score: 5, Funny

    As has been so often said, the generation of random numbers is too important to be left to chance. :-)

    --
    Ian Ameline
  18. Old hat? by GiMP · · Score: 3, Informative

    Disclaimer: I work for a hosting company doing VPS/cloud hosting.

    This is pretty old-hat. First, the host-keys issue inside pre-generated images is a very obvious one, although I'm not too surprised that companies aren't considering it. RNG issues aren't quite as obvious, but they're not super-secret either, anyone with any amount of background in security has been aware of this for a while.

    In fact, questions regarding RNGs have even surfaced in the ##xen IRC channel (freenode.org) because it is a very important issue to some. In particular, those with the need for hardware RNG solutions have come seeking assistance.

    I'm certainly not minimizing the issue, just noting that it isn't really a new one at all. More than anything, the problem is that the average systems administrator has been slow to realize this, and developers even more so.

  19. problem has been solved by flok · · Score: 2, Informative

    This problem has been solved: use EntropyBroker: a physical machine gathers entropy data and distributes this to the virtual machines. If I remember correctly KVM has a special driver for feeding the VM with entropy data from the host system.

    --

    www.vanheusden.com - home of Multitail, HTTPing, CoffeeSaint, EntropyBroker, rsstail, bsod, listener, nagcon, nagi
  20. Use Linux-VServer/OpenVZ or LXC by Lennie · · Score: 2, Informative

    Those just use process-namespaces and the same kernel and you are done with it.

    --
    New things are always on the horizon