Slashdot Mirror


Hyper-Threading, Linus Torvalds vs. Colin Percival

OutsideIn writes "The recent Hyper-Threading vulnerability announcement has generated a fair amount of discussion since it was released. KernelTrap has an interesting article quoting Linux creator Linus Torvalds who recently compared the vulnerability to similar issues with early SMP and direct-mapped caches suggesting, "it doesn't seem all that worrying in real life." Colin Percival, who published a recent paper on the vulnerability, strongly disagreed with Linus' assessment saying, "it is at times like this that Linux really suffers from having a single dictator in charge; when Linus doesn't understand a problem, he won't fix it, even if all the cryptographers in the world are standing against him.""

34 of 396 comments

  1. He won't fix it? by Morgahastu · · Score: 5, Insightful

    Then somebody else will.

    1. Re:He won't fix it? by untouchable · · Score: 5, Funny
      Fix what?

      If I remember correctly, there hasn't been a shown exploit for this yet. It's better to wait and see before fixing something that may not matter later.

      --
      As Seen On TV's? Come back!!!
    2. Re:He won't fix it? by Vo0k · · Score: 5, Interesting

      Actually, my bet is it will be fixed in the new CPU revision, by Intel. And eventually a kernel fix will be dug into the config somewhere next to other "bugfix/support" entries, with a note like "Early multithreading Intel Pentium 4 CPUs have a vulnerability that allows overriding the privileges of a process. This entry includes a patch for this bug at the cost of increasing the kernel size by 32K and slightly slowing it down. If you have an early Pentium 4 processor and run a multi-user system, say Y. If you don't or aren't sure, say N."

      --
      Anagram("United States of America") == "Dine out, taste a Mac, fries"
    3. Re:He won't fix it? by CaymanIslandCarpedie · · Score: 5, Insightful

      Oh come on man, don't be that guy ;-)

      So MS$ shouldn't fix problems in IE until an exploit has been shown for it?

      It's better to wait and see before fixing something that may not matter later.

      It's better to just fix it and be safe than wait and see if something happens later. It may not be top priority, but remember this "wait and see" approach to security is exactly what got MS$ into so much trouble with users. We don't need the same for Linux.

      --
      "reality has a well-known liberal bias" - Steven Colbert
    4. Re:He won't fix it? by Jugalator · · Score: 4, Interesting

      Why wouldn't he?

      He doesn't say "I don't want a fix for this anywhere in the kernel" anywhere. Just that he doesn't think it's a very critical issue.

      If someone else does the patch for him, why wouldn't he accept it?

      --
      Beware: In C++, your friends can see your privates!
    5. Re:He won't fix it? by Threni · · Score: 5, Funny

      That reminds me of the joke about programmers being in a car, steaming downhill with failed brakes, narrowly avoiding death, then once the car has come to a standstill suggesting that instead of seeing what went wrong they just get back in the car and `see if it happens again`.

    6. Re:He won't fix it? by /ASCII · · Score: 5, Interesting

      I'm not too sure about that. Linus says this is a library issue and I agree. The kernel should not try to fix library bugs.

      What this bug amounts to is this: When a program is performing calculations using secret data like an RSA key, it is important that the data access patterns do not depend on the secret data, since these patterns can be analyzed by an attacker.

      An example of a classical vulnerability of this sort is using the C function strcmp to compare the real and the supplied password. By timing multiple runs you can get a decent estimate of how long the strcmp function took, which means you can guess which character was the first differing character in the password.
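
      A small sketch of why an early-exit comparison leaks, assuming the number of byte comparisons stands in for the running time an attacker would measure. The function name and the sample password are made up for illustration; this models strcmp-style behaviour rather than calling strcmp itself.

```python
def early_exit_compare(secret: bytes, guess: bytes):
    """Compare like a naive strcmp: stop at the first differing byte.

    Returns (equal, steps), where steps is the number of byte comparisons
    performed. steps is a stand-in for measured running time.
    """
    steps = 0
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps
    return len(secret) == len(guess), steps


if __name__ == "__main__":
    secret = b"hunter2"  # hypothetical password
    for guess in (b"xxxxxxx", b"hxxxxxx", b"huntxxx", b"hunter2"):
        equal, steps = early_exit_compare(secret, guess)
        print(guess, equal, steps)
```

      The step count grows with the length of the matching prefix, which is the signal an attacker would try to average out of noisy timings.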

      The security flaw in HT is that a process running on a HT CPU can get quite a lot of information about the data access patterns of the process on the other virtual CPU on the same chip. In other words, the severity of any library bugs which cause different access patterns on different secret data has been severely increased.

      --
      Try out fish, the friendly interactive shell.
    7. Re:He won't fix it? by antifoidulus · · Score: 4, Insightful

      Heh, this is more than just your average buffer overflow exploit. This fix would have to modify how the OS handles the cache. It's going to probably take more than a quick fix to get rid of the exploit, and the patch could have far-reaching repercussions. All that to fix a security hole that may not even be exploitable in practice....

    8. Re:He won't fix it? by Vellmont · · Score: 4, Interesting

      That sounds like it has some non-trivial costs associated with it. That would mean losing performance in many instances where two high-CPU processes want to be executed on a single physical processor at the same time.

      The best solution is to just fix the crypto libraries as a short term solution, and for Intel to fix the chip in future iterations as a long term solution. Mucking about in the kernel and having other unknown effects seems like a bad approach when the problem can be fixed elsewhere.

      --
      AccountKiller
    9. Re:He won't fix it? by jrockway · · Score: 4, Insightful

      His comment isn't from a CS perspective, it's from a code monkey perspective. CS people use mathematics to prove their code correct, application programmers write stuff and are happy it works.

      --
      My other car is first.
    10. Re:He won't fix it? by null+etc. · · Score: 5, Insightful
      By timing multiple runs you can get a decent estimate of how long the strcmp function took, which means you can guess which character was the first differing character in the password.

      Can I buy some pot from you?

      Maybe that would work with a ONE MEGAHERTZ PROCESSOR. But do you have any idea how fast processors are these days? And how likely any deviance in the cache state, IO controller state, page faults, multi-user latency, or power management will throw your precious timings right out the window?!?!?!

      I mean, c'mon, think about things before you say them. Even REAL TIME SYSTEMS AT NASA don't run with enough consistency to be able to tell WHICH CHARACTER IN A STRCMP OPERATION fails.

    11. Re:He won't fix it? by hey! · · Score: 4, Interesting

      I dunno if I agree, for two reasons. First, you'd be surprised what a sophisticated, determined and deep pocketed adversary can do (e.g., the smart card vulnerabilities that were uncovered a few years ago by analyzing the power consumption).

      Secondly, just because the clock is ticking fast doesn't mean you can't count the ticks as they go by. The same process run at 1 MHz and 1 GHz is going to look substantially the same, because you're monitoring it faster too. If anything, you're less likely to get interrupted by I/O during the process.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    12. Re:He won't fix it? by Austerity+Empowers · · Score: 4, Interesting

      Faster processors also mean that you don't need to be as precise. You can search more possibilities faster, but every byte you can chip away helps narrow down the search range.

      It's true that there are various "noisy" and uncontrollable aspects of modern CPUs, but if you take enough measurements and can establish a control, you can get valid information. It will not give you "the answer", it may help you to eliminate a set of data that is "not the answer". That is extremely valuable.

      Some people actually monitor chip power supply variations to try to get this information. I won't pretend to understand how they get meaningful data from this, but it involves knowing the system under attack very well, and wanting the data really, really badly. Hence very high security systems shield their processors magnetically and inject "noisy" (bogus, unrelated) operations to cover this. It sounds like a pipe dream, but I've visited places where these systems are built (think Blue), and read some of the papers describing these exploits.

      His point is valid, ANY information an attacker can get significantly reduces the integrity of the encryption. I just suspect Linus' point is that most (I said most, don't flame me) linux server deployments are geared towards not letting people on the server in the first place. This vulnerability seems to be valuable mostly to people who already have access to the system.

    13. Re:He won't fix it? by IIH · · Score: 4, Informative
      I mean, c'mon, think about things before you say them. Even REAL TIME SYSTEMS AT NASA don't run with enough consistency to be able to tell WHICH CHARACTER IN A STRCMP OPERATION fails.

      Maybe the original poster was referring to the DEC-10 page fault password insecurity that was based on strcmp returning as soon as it encountered one wrong character, based roughly on the following idea:

      1) Place the password so that the 1st and 2nd characters straddle a page boundary.
      2) Clear cache
      3) Call Password check.
      4) If no page fault occurred, then the 1st char must be wrong; change it and goto 3
      5) If a page fault occurred, the 1st character is correct (as the 2nd char was checked); move the password so the 2nd and 3rd characters straddle the page boundary and repeat.

      In this way, you can reduce the attack by a huge amount: for an n-length password the brute force attack needed goes down from 256^n to 256*n.
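
      The 256^n versus 256*n arithmetic can be checked with a toy simulation. This is only a sketch under assumptions: the oracle below is a hypothetical stand-in for the page-fault signal (it reports whether the checker got past position i), and the password is invented for the example.

```python
def make_oracle(password: bytes):
    """Hypothetical stand-in for the page-fault signal.

    oracle(guess, i) returns True when the checker got past position i,
    i.e. when guess[:i + 1] matches the real password.
    """
    def oracle(guess: bytes, i: int) -> bool:
        return guess[: i + 1] == password[: i + 1]
    return oracle


def recover(oracle, length: int):
    """Recover the password one position at a time, counting oracle calls."""
    known = bytearray(length)
    calls = 0
    for i in range(length):
        for candidate in range(256):
            known[i] = candidate
            calls += 1
            if oracle(bytes(known), i):
                break  # position i is confirmed; move to the next one
    return bytes(known), calls


if __name__ == "__main__":
    password = b"s3cret!"  # made-up example
    recovered, calls = recover(make_oracle(password), len(password))
    print(recovered, calls)
    print("per-position bound:", 256 * len(password))
    print("full brute force:  ", 256 ** len(password))
```

      Each position costs at most 256 guesses, so the total is bounded by 256*n rather than 256^n.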

      So, yes, attacks based on which character in strcmp fails have worked in the past, so it is valid to try and not make the same type of mistake again!

      --
      Exigo spamos et dona ferentes
  2. Dictator? by BBrown · · Score: 5, Insightful

    A dictator who has made his domain open-source, thereby giving everybody free rein to change and make distinct copies of it?

    Come on.

  3. OK Colin, Well done by Timesprout · · Score: 4, Insightful

    you found an obscure and difficult to exploit vulnerability. Now quit trying to make out the world is doomed and trolling on Linus to keep the spotlight on yourself.

    --
    Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
    What truth?
    There is no dupe
    1. Re:OK Colin, Well done by TuringTest · · Score: 4, Insightful

      If this was about Microsoft and Bill refused to fix the vulnerability, nobody else could write a patch for the sources to solve the problem. See the difference?

      --
      Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
  4. Re:Dictator? by Tjebbe · · Score: 5, Funny

    or, to put it in Pratchett's words:

    He doesn't administer a reign of terror, just the occasional light shower.

  5. but... but... but... by databyss · · Score: 4, Funny

    The all powerful Dvorak said linux had no leaders...

    --
    Hmmm witty sig or funny sig? Maybe elitest techy sig!
  6. Re:Dictator? by Megor1 · · Score: 4, Funny

    Now that I think of it I've never seen Castro and Linus in the same room....and Linus always seems to be smoking fine cigars...and open source software is practically communism anyway...it all makes sense now!

    --
    Everyone that disagrees with me is a paid shill
  7. Fixing is easier said than done by Xpilot · · Score: 5, Insightful

    The kernel developers don't seem to agree on the right way to fix this, whether at the kernel level or in userspace. However, it may affect the performance of the kernel if it's done in kernelspace, and it is impractical to have everyone rewrite their userland software, as someone else pointed out. The "patch" which is available for FreeBSD to fix this problem only disables hyperthreading and does not provide a real fix.

    --
    "Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
  8. This sort of attitude is pretty common by Raleel · · Score: 5, Insightful

    It's along the same lines of the "if all you got is a hammer" problem. If you've spent a lot of time working on something, it's obviously important to you. That doesn't mean that it's important to everyone else. This may well be a significant flaw from the cryptographer's perspective, but then again, they study crypto a lot and have a vested interest in it.

    As someone pointed out, yay for linux being free. As one or two above pointed out, someone who does care with the knowledge will write a patch. It'll get implemented as an option in the code, and if shown to be unobtrusive enough, may even get turned on by default.

    --
    -- Who is the bigger fool? The fool or the fool who follows him? --
    1. Re:This sort of attitude is pretty common by Otto · · Score: 5, Insightful

      Hence, this is an issue that effects me and my customers, and I seriously hope that a fix finds itself into either apache mod_ssl or the mainline Linux kernel PDQ.

      That's really what's up for debate here. Whether the patch should be in kernel-land or in the user-space code (mod_ssl, for your example).

      The only realistic patch you could do in kernel-land is to simply disable HyperThreading. This works, but seems like a poor way to go. Any other form of patch in kernel-land just makes the attack harder and thus doesn't really work or it degrades performance way too much to be practical.

      But fixing it in userspace is somewhat easier to do, albeit you'd have to fix *every* user-space program that's susceptible to this sort of thing.

      Let's talk about the problem in general terms. When a program is doing some kind of computational stuff on something you want to remain secret, then it has to make some assumptions. Assumptions like the hardware is secure, or that it's not running on a virtual machine that's recording everything it does.. That sort of thing. You can come up with all kinds of ways to crack it like an egg if you work outside the box a bit and have total control of the machine it runs on.

      This problem is attacking one of those assumptions, namely that another process can't time the secret computations accurately enough to perform a timing attack. With HT, you have two things running on the same core, and so it is somewhat easier to do this sort of attack.

      So userspace programs that do secure computations have had one of their assumptions broken by HT. To remedy it, they need to rethink their assumptions. They need to ensure that they take the same time regardless of the computations being done, and so on. This is not particularly simple, but it's probably not particularly hard either.

      Of course, the attack is still largely theoretical. All that's been shown is that it's "possible", not that it's "easy" or even that it is indeed "doable". For one thing, without having some kind of clue as to the algorithm involved or some idea of what to look for, all you get are a bunch of timings. You still need to do some things to trigger it at the right time and in the right way as to be able to derive information from this channel.

      But crypto guys are paranoid like nobody else, and so they're naturally worried about this sort of thing. Mainly it's worrying to them because it's not a mathematical attack, which they're more used to. Modern crypto works based on theory and algorithms and such, and the idea that the algorithm being correct (for a given value of "correct") isn't enough to protect the security of the data is extremely worrying. A real world implementation of these algorithms now has to take some more real world facts into account, and this bothers them, of course.

      Linus is basically right here. The kernel is simply the wrong place to fix this. It doesn't ensure that processes cannot spy on other processes via subchannels like this, nor should it. If you're paranoid enough to think this is a real thing to guard against, then your secure code should take it into account. Existing code doesn't do that, and would need to be changed *even* if the kernel was patched. Because how do you know that your kernel has been patched? How do you know that you're not running on an HT processor? You can't know for sure, so you simply assume you are and take steps to make timing attacks fail. Because if you don't, you can't reasonably say that you've attempted to secure the code in this way.

      --
      - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
  9. Re:Dictator? by squiggleslash · · Score: 5, Informative
    The guy was referring to the oft-quoted observation that Linus is a "benevolent dictator", or rather that Linux's development model is one of benevolent dictatorship. It wasn't an insult aimed at Torvalds. It's a comment about the development model used by many FOSS projects. See also Larry Wall and Perl, or Guido van Rossum and Python. In all these cases contributors to the projects defer to a project figurehead who makes the final decisions as to what goes into the official version of the project, and where that project goes.

    The most common alternative model is community development, where a - usually but not always elected - committee of developer 'elders' steer the project. Apache and Mozilla would be good examples of the latter.

    I appreciate some people have heard about this comment first today, people are joining the Free Software and Open Source communities all the time, but it kind of surprises me that so many are criticising Colin for this without anyone explaining this.

    --
    You are not alone. This is not normal. None of this is normal.
  10. Another Fairy Tale... by ausoleil · · Score: 4, Insightful
    In layman's terms, this debate is:

    Scene: A wispy cloud scuds across the sunny blue sky. Not much happening, and the cloud is hardly even black.

    Chicken Little: The sky is falling! The Sky is falling!

    The Penguin Dictator: No, not really. It might fall, but it's very, very unlikely. So calm down!

    Chicken Little: I strongly disagree. The sky is falling! And because you do not understand the problem we're all going to die!

    The Penguin Dictator: Listen here. It's almost certainly not going to fall, and I need to worry about real problems!

    Chicken Little: (Runs screaming to the nearest coffeehouse with free wireless, where he types incessantly:) The sky is falling! The Sky is falling! Tell Slashdot! Tell Tom's Hardware! Tell Cnet! Tell Linux Business News!

    The Penguin Dictator: Sigh. (And then he gets back to work. He looks up at the audience) They just do not get it, do they?

    The Windows Dark Lord: (Rubs hands together) Excellent, MOST excellent. (Yelling) Bring me my marketing minion!

    Marketing Minion: (being dragged in by a bald guy yelling at him) Yes, O Master!?

    The Windows Dark Lord: Tell all the peasants that the sky is raining huge chunks of fire and dung! Tell everyone, tell them now! And have our independent consultants work on this day and night, night and day! Make sure that they independently tell everyone that they can easily avoid flaming chunks of sky dung if they stand behind our Windows! And RAISE the price!

    Some Guy At Some House In Some City Somewhere: "Wow, that was easy. Let me send this up to the Penguin Dictator. No sky ever fell, and that cloud is easily blown away. Nothing happening here, move along."

    The Penguin Dictator: "Well that was easy. Include this patch in the next day's weather update!"

    Marketing Minion: Press Release!!! Millions killed by falling flaming sky chunks of burning dung with brain eating worms who eat children!!! Run for your lives!!!!

    Laura DiDio, munching a do-nut: "If you would hide behind Windows, the sky would stop falling! Your children would be safer and the world a better place." (looks at stock ticker, says to self) 'Excellent. MOST excellent. Bring me a donut!'

    The Penguin Dictator: "Sigh. Why didn't I just keep Sky 0.7a for myself? Why the bother, why the bother?"

    EPILOGUE: No one was ever hurt by the piece of sky that never fell, and Chicken Little kept looking upward for another cloud to rant about.

    The End.

  11. Re:At least Linus.... by mattgreen · · Score: 4, Insightful

    Nice ad hominem attack. Attack the argument, not the person.

  12. Re:Linus and RMS by daigu · · Score: 4, Interesting

    RMS is more like the tribal elder reminding you of your ideals - especially during those times when you consider putting them aside because they seem impossible to live up to.

  13. Re:bad tactics from Colin Percival by jsonn · · Score: 4, Insightful
    Get the facts. Colin showed that you can retrieve ~30% of an RSA key by running a program in parallel. This can most likely be improved if you have the chance to do it more than once. It is also important to keep in mind that you can't entirely avoid an unbalanced memory access pattern without also taking a huge performance penalty.

    The point of this debate is that the Intel implementation sucks: it allows you to spy a lot on processes running on the virtual CPU. Sure, there are better alternatives than disabling HTT, like Colin's suggestion to only schedule threads of the same program on the virtual CPU. Actually, that is something you want to do anyway, or otherwise you can seriously lose speed and drop under the performance of a processor with HTT disabled.

    Speaking of paranoia, it is often not a bad thing to have, many big security problems can be avoided. Oh, I forget to patch the Linux box next door.

  14. If security matters, don't do crypto in Linux by swillden · · Score: 5, Insightful

    ... or in any other general-purpose operating system on a general-purpose computer. PCs are fundamentally insecure. There are a dozen ways to spy on cryptographic operations done in them, ranging from trojans, to hardware side-channel attacks, and dozens more to get copies of keys that they store. This is just one particular attack that may permit an attacker who can't get a trojan running with sufficient privileges to spy on operations directly to obtain some key bits. But if the attacker can't do that, there are lots of other ways to get the keys. General-purpose computers are simply not trustworthy.

    If security is important, you do your crypto in a secure crypto module, like the FIPS 140-2 Level 4 IBM 4758 or the Level 3 Luna SA. Or, you use a general-purpose computer with special-purpose, very simple software and then provide strict physical access control to the machine and very limited network access -- often through a serial link using a custom protocol rather than via a real network. Or you could theoretically use a general-purpose machine with a TCPA chip with a regular, general-purpose operating system that has been modified to make use of the TCPA chip and with keys tightly bound to a well-defined system software configuration. But only if you have good physical security. In many situations it's still better to use a FIPS 140-2 Level 3 or Level 4 device.

    IMO, the existence of weaknesses like this in Linux, and the fact that they're widely known, is a *good* thing, because it helps convince people not to trust that which is inherently untrustworthy. We need more publicity of similar problems in Windows (and there are lots of them).

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  15. Fix the applications by Tom7 · · Score: 5, Informative

    Why should this be fixed in the Kernel?

    This appears to be an application bug, not a kernel one. The kernel never claims to completely isolate processes from one another; though there are memory protections, there are loads of ways that processes can observe each other's actions. This is just a particularly high-resolution one.

    The real "bug" here, IMO, is that OpenSSL believes that no other process can observe anything about its secret computations. Timing attacks against RSA have been known for some time, particularly with regard to modular exponentiation.

    It wouldn't be too hard to make RSA encryption take the same amount of time no matter what code path is used, and to make its memory access patterns uncorrelated with the keys (perhaps by using randomization during allocation). They should do this--the fact that their application leaks information has nothing to do with the processor it's running on; it's just that HT makes it particularly easy to measure that information. This would have a performance penalty, and I think the OpenBSD folks are too obsessed with performance, and that's why they've not done this. The performance obsession is a serious problem in the Unix world, and software systems in general.
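
    One way to picture "the same amount of time no matter what code path is used" is a square-and-multiply-always loop. This is a sketch under assumptions, not what OpenSSL does: the function name and the fixed `bits` parameter are invented here, and Python's big integers are not constant-time themselves, so this only shows a key-independent sequence of operations. It does nothing about memory access patterns.

```python
def modexp_fixed_schedule(base: int, exponent: int, modulus: int, bits: int) -> int:
    """Left-to-right modular exponentiation with a fixed operation schedule.

    Every iteration performs one squaring and one multiplication, whatever
    the exponent bit is. Assumes 0 <= exponent < 2**bits and modulus >= 1.
    """
    result = 1
    base %= modulus
    for i in reversed(range(bits)):
        result = (result * result) % modulus
        product = (result * base) % modulus  # always computed, even if unused
        bit = (exponent >> i) & 1
        # Arithmetic select instead of a branch on the secret bit:
        # keeps product when bit is 1, keeps result when bit is 0.
        result = bit * product + (1 - bit) * result
    return result


if __name__ == "__main__":
    print(modexp_fixed_schedule(7, 0b101101, 1009, bits=8))
    print(pow(7, 0b101101, 1009))  # reference value from the built-in
```

    The loop always runs `bits` times, so an exponent with many zero bits takes the same number of multiplications as one with many one bits, at the cost of the extra work the comment above mentions.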

    If implementing OpenSSL effectively means adding special kernel support for things like constant-length timeslices or cache invalidation between context switches, that's fine. But this is not a bug in the kernel unless the kernel purports to enforce total separation between processes, which it certainly does not.

  16. Re:strcmp vulnerability. by Carewolf · · Score: 4, Interesting

    No, just compare every character in the input every time, rather than short-cutting when you know it won't match.
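
    A minimal sketch of that idea, assuming byte strings of known length. The hand-written function name is made up for illustration; `hmac.compare_digest` is the standard-library routine intended for this job, and the manual loop is only there to show the shape of the technique (pure Python gives no hard timing guarantees).

```python
import hmac


def compare_all_bytes(a: bytes, b: bytes) -> bool:
    """Compare without stopping at the first mismatch.

    Differences are accumulated with OR, so the loop always visits every
    byte. The length check can still reveal the length of the secret.
    """
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def compare_stdlib(a: bytes, b: bytes) -> bool:
    """Same check using the standard library's constant-time comparison."""
    return hmac.compare_digest(a, b)


if __name__ == "__main__":
    print(compare_all_bytes(b"hunter2", b"hunter2"))
    print(compare_all_bytes(b"hunter2", b"huntxxx"))
    print(compare_stdlib(b"hunter2", b"hunter2"))
```

    In real code the library routine is the safer choice, since it is implemented to avoid content-dependent early exits.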

  17. Re:Great by cperciva · · Score: 4, Informative

    He ran the bloody "exploit" well over 1000 times to retrieve 30% of an RSA key.

    Did you read a paper other than the one I read? I ran the exploit once, taking under one second, and I retrieved enough information to factor the RSA modulus N.

  18. A possible solution by jesup · · Score: 4, Informative

    This issue exists with AES implementations as well (search for AES SBOX cache timing on google). The AES vulnerability is, if anything, more worrying in a way. It can be made to be constant-time, but at a serious cost in performance.

    Here's another option: since this vulnerability depends on using the L1/L2 cache states to ferret out information, remove that from consideration. When processing an RSA (etc) key, have the code temporarily lock out context switches, AND have it take a fixed period of time to compute the result (or portion of the result), followed by flushing the cache. No data is left in the cache to analyze. The execution time of the code is fixed, so no data leaks there. You don't have to make the algorithm a constant-time calculation if you can pad it at the end to the maximum (or close enough to the theoretical maximum that there's no true information leakage in practice). This helps avoid the potentially very slow algorithms that are truly constant-time. And flushing the cache at the end of each (portion of the) calculation removes that as a leak. (You need to flush it before waiting for the allocated time to elapse, note, and include max flush time in your calculation, and if possible factor into your calculation possible interrupt effects.)
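
    The padding part of this can be sketched as follows. This is only an illustration under assumptions: `run_padded` and its `budget_seconds` parameter are invented names, the budget must be chosen above the worst-case running time, and the lockout of context switches and the cache flush need OS or hardware support that user-level Python does not have, so they appear only as a comment.

```python
import time


def run_padded(operation, budget_seconds: float):
    """Run operation() and then wait until a fixed time budget has elapsed.

    If the operation overruns the budget, no padding is added and the
    overrun itself becomes observable, so the budget has to cover the
    worst case.
    """
    start = time.monotonic()
    result = operation()
    # A real implementation would flush the cache here, before waiting,
    # and would count the maximum flush time against the budget.
    remaining = budget_seconds - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return result


if __name__ == "__main__":
    started = time.monotonic()
    value = run_padded(lambda: pow(3, 65537, 1000003), budget_seconds=0.05)
    print(value, round(time.monotonic() - started, 3))
```

    From the outside, a fast and a slow key operation then take roughly the same wall-clock time, without the core algorithm having to be constant-time.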

    A pain? yes. Requires extensive mods to crypto algorithm implementation? Yes (though perhaps not to the core calculation.) Requires OS support? Almost certainly. Requires HW support? Would be helpful but probably not required. Loss of performance? Yes, though far lower than disabling HT I imagine in normal cases (when not decrypting/encrypting).

    Also, some of the restrictions above can probably be eased if the crypto algorithm is carefully designed and matched to the hardware.

    Disclaimer: I am NOT a crypto geek! I have worked in processor and cache design, though.

  19. Bzzt! Back the paradigm up here. by Paradox · · Score: 5, Insightful
    From what I've learned in software writing, is that it's preferrable to wait and see how much and how bad your software runs or has problems before you start charging into the situation to fix it.
    Wait. Wait wait wait. Who taught you this? This isn't XP. This isn't sound software practice. Maybe you're thinking of the infamous quote by C.A.R. Hoare, "Premature optimization is the root of all evil," perhaps?

    Potential performance problems are things you should defer on until proper profiling can be done (unless they're total show stoppers). Security and correctness are things you cannot ignore except in extreme cases. Security is particularly important to nail down, because it can result in your customers losing data (even data not pertaining to your app), which is the first no-no of software.

    Application software has four priorities, in this order:

    1. Safety (shouldn't destroy data)
    2. Correctness (do what it says it does)
    3. Security (don't do anything else)
    4. Performance (do it fast)
    YMMV, of course, sometimes correctness falls below security, and occasionally performance goes above correctness in some mathematical functions (if doing it correctly would take a decade and doing a close approximation would take a day, obviously you want the approximation and then a heuristic).
    Especially something as low level as this, which could have unseen side effects. Especially since this (to me, at least) seems to be more of a hardware problem than software, per se. (But, of course, I could be wrong.)
    In this case, I'd say the proper fix is to disable hyperthreading by default, and make sure the user is aware of the hardware bug/consequence of using HT when they decide to turn it on. You need to let the user decide if they're willing to accept the security risk or not.

    The Linux Kernel Developers may decide otherwise, but that's how I'd call it if it was in my shop. It's a hardware problem and the software fix is not obvious.

    --
    Slashdot. It's Not For Common Sense