Slashdot Mirror


Linus Torvalds Calls Intel Patches 'Complete and Utter Garbage' (lkml.org)

An anonymous reader writes: On the Linux Kernel Mailing List, Linus Torvalds ended up responding to a long-time kernel developer (and former Intel engineer) who'd been describing a new microcode feature addressing Indirect Branch Restricted Speculation "where a future CPU will advertise 'I am able to be not broken' and then you have to set the IBRS bit once at boot time to *ask* it not to be broken."

Linus calls it "very much part of the whole 'this is complete garbage' issue. The whole IBRS_ALL feature to me very clearly says 'Intel is not serious about this, we'll have a ugly hack that will be so expensive that we don't want to enable it by default, because that would look bad in benchmarks'. So instead they try to push the garbage down to us. And they are doing it entirely wrong, even from a technical standpoint. I'm sure there is some lawyer there who says 'we'll have to go through motions to protect against a lawsuit'. But legal reasons do not make for good technology, or good patches that I should apply."

Later Linus says forcefully that these "complete and utter garbage" patches are being pushed by someone "for unclear reasons" -- and adds another criticism: "The whole point of having cpuid and flags from the microarchitecture is that we can use those to make decisions. But since we already know that the IBRS overhead is huge on existing hardware, all those hardware capability bits are just complete and utter garbage. Nobody sane will use them, since the cost is too damn high. So you end up having to look at 'which CPU stepping is this' anyway. I think we need something better than this garbage."

32 of 507 comments

  1. Is there any other option, Linus? by aglider · · Score: 5, Interesting

    You are right, Linus, as usual.

    But I'd prefer the Linux kernel development team to put a complete proposal on the table.
    Like totally ditching support for Intel CPUs, starting with the releases from next March 1st (or better, April?).

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
    1. Re:Is there any other option, Linus? by mwvdlee · · Score: 5, Informative

      I went in expecting the usual Linus ranting, and although he doesn't disappoint in that department, he also has a valid point.

      As I understand it, Intel proposes to build a switch into future CPUs which tells the CPU to stop being insecure. The switch is going to be off by default and must be switched on by the kernel during boot. Intel proposes to let all future CPUs be insecure by default.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    2. Re:Is there any other option, Linus? by tlhIngan · · Score: 5, Insightful

      and how exactly does that do anything at all to improve the situation? or are you suggesting Open source hardware would somehow be magically design flaw free?

      And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.

      I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.

      The only way you can guarantee the designs are error free is to abandon everything that makes modern processors fast - OOO, speculation, branch prediction, and plenty more, including potentially pipelining (the fundamental technology everyone is trying to speed up by avoiding pipeline stalls). Go back to the old fetch-and-execute cycle, where memory operands are fully decoded and retrieved prior to even considering fetching the next instruction.

      Everyone will hate it, because now your 4GHz processor will be as fast as a 500MHz one.

    3. Re: Is there any other option, Linus? by Anonymous Coward · · Score: 5, Insightful

      While that is completely true, saying it doesn't solve the problem.

      It is no more Linus' problem to solve than it is your or mine.

      It is entirely up to Intel to do this and do it properly.

      Thank goodness Linus is saying this, because it will force Intel to solve it.

    4. Re:Is there any other option, Linus? by serviscope_minor · · Score: 5, Informative

      The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.

      And because sensors have got better, giving us much larger datasets. You know, sound, video and audio are the common ones.

      I challenge you to find the bloat in FFMPEG, for example. Now try transcoding a HD video on a modern 4GHz desktop versus a 1st gen Raspberry Pi.

      --
      SJW n. One who posts facts.
    5. Re: Is there any other option, Linus? by Opportunist · · Score: 5, Informative

      AMD has one problem in common with Intel: Spectre. Meltdown is Intel's problem alone.

      Meltdown is fairly easy to exploit and quite serious. Spectre could be as serious, but so far nobody has shown conclusively that it is actually exploitable in a real life situation. Intel spun it to make people think they're the same, so everyone thinks Intel and AMD have the same problem. They don't. Intel has a serious, potentially crippling security hole and a potentially serious but most likely not usable security hole. AMD only has the latter.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    6. Re:Is there any other option, Linus? by Anonymous Coward · · Score: 5, Insightful

      Your comment is marked Insightful, 4 -- but it is overall totally wrong. You are presenting a false dichotomy!

      The issue with Spectre is not that there is a fundamental problem with branch prediction (or BP misses, or pipelining, or speculation, or any combination of these). The issue is that some processors don't actually clean up after themselves when branch mis-prediction occurs. They roll back instruction execution in some cases (great). In other cases they may simply abandon execution (okay). But they don't ever do so much rollback work as to invalidate cache lines (bad!).

      There is such a thing as provably reversible computation and storage, and it can be done correctly. But you have to limit the length of instructions over which you will continue to speculate to something you can reverse; and you **have** to flush cached information that should have never been available in the first place.

    7. Re:Is there any other option, Linus? by thegarbz · · Score: 5, Insightful

      The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.

      What an ignorant and useless comment. Software bloat is a very minor portion of what our computers do. Almost none of what makes up bloat in software ever comes close to pegging the CPU, and removing it would give you a few percent speed increase at the most.

      What we do depend on CPUs for now is raw computing power. I was fine with a 500MHz PC back when my digital camera had a floppy disc in the back; now that it generates a 50mpxl 14bpp file it's not going to cut it, and that has nothing to do with the relative bloat of the image viewer.

      Likewise for web browsers. It's not bloat to blame for the fact I expect a browser to be able to stream a 4K movie in surround sound. It's not bloat to blame for the 40 tabs I run concurrently. It's not bloat that I have 10 office apps open alongside video conferencing software along with those browser windows.

      And among all that my CPU is sitting at 30% utilisation, 25% of which is being taken up by a full system virus scan (fuck monday mornings on my work machine).

      Spare us your "we could make this less bloated and we would all be happy with 500MHz" garbage.

    8. Re:Is there any other option, Linus? by mwvdlee · · Score: 5, Insightful

      IMHO, Intel is basically operating in PR-nightmare-cleanup-mode right now.
      They fucked up badly and are trying to lie, cheat and manipulate their way out of it.
      They are desperately trying to make it look like this is a generic problem (it's not; the AMD and ARM variants of these bugs are much less evil) and they are trying to save face and shift blame however they can.
      Plain and simple truth is that Intel has knowingly made malicious choices and now they've been caught.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    9. Re:Is there any other option, Linus? by hcs_$reboot · · Score: 5, Interesting

      The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.

      I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.

      No. The fix is to not read from memory into the CPU cache during speculative execution when that block of data is not there already. Changing this in the CPU core would solve both Spectre and Meltdown at a reasonable cost (it would not defeat many current optimizations).

      --
      Slashdot, fix the reply notifications... You won't get away with it...
    10. Re:Is there any other option, Linus? by smallfries · · Score: 5, Informative

      Somebody with mod-points should mod up the AC parent. They are completely correct.

      The design flaw is not in using speculation or branch prediction. It is allowing the side-effects of instructions in those streams to be visible in the machine before the branches are retired. This is really basic stuff - I remember a discussion about this very issue in a processor design course back in '00 - '01.

      Intel gambled that the state of the cache was not visible to the programmer. Flush+reload showed that they were wrong.
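
As a rough illustration of that point, here is a toy Python model (not real hardware, and not an exploit). It assumes a "cache" that is just a set of probe-line indices and a speculative load whose register result is thrown away. The class and function names (`ToyCPU`, `flush_reload`) and the `fill_cache_on_speculation` switch are made up for this sketch. The switch stands in for the fix proposed elsewhere in this thread: do not fill the cache during speculation.

```python
# Toy model only: the cache is a set of probe-line indices, and a
# "speculative" load is one whose architectural result is discarded.

class ToyCPU:
    def __init__(self, fill_cache_on_speculation=True):
        self.cache = set()  # indices of probe lines currently cached
        self.fill_cache_on_speculation = fill_cache_on_speculation

    def flush(self):
        """Flush step: evict every probe line."""
        self.cache.clear()

    def speculative_load(self, line):
        """Load on a mispredicted path. Nothing is returned (the register
        result is squashed), but the cache fill may survive."""
        if self.fill_cache_on_speculation:
            self.cache.add(line)

    def is_fast(self, line):
        """Reload step: 'cached' stands in for a fast timed access."""
        return line in self.cache


def flush_reload(cpu, secret_byte):
    cpu.flush()
    # Victim code runs speculatively and touches probe[secret_byte].
    cpu.speculative_load(secret_byte)
    # Attacker reloads all 256 probe lines and notes which were fast.
    return [line for line in range(256) if cpu.is_fast(line)]


leaky = flush_reload(ToyCPU(fill_cache_on_speculation=True), secret_byte=0x2A)
fixed = flush_reload(ToyCPU(fill_cache_on_speculation=False), secret_byte=0x2A)
print(leaky)  # [42]
print(fixed)  # []
```

In the leaky configuration the only fast probe line is the one indexed by the secret, even though the load itself never retired.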

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    11. Re: Is there any other option, Linus? by LordKronos · · Score: 5, Interesting

      Even invalidating the loaded cache pages isn't necessarily sufficient. Because the act of loading one page means the flushing of another page, it may be possible to then do Spectre in the opposite direction: preload the cache, and if any preloaded pages become slower to access then you can determine that the branch predictor caused them to be flushed. At least in theory. In practice that becomes more difficult in a multiprocess environment where other processes could be responsible for the flush, but I certainly wouldn't want to predict it isn't possible.

      So the full solution may need to be more complex. Just like the CPU includes more registers than the architecture specifies so it can do scrap work in these extra registers and then roll it back without affecting the real registers, the CPU may need extra cache pages so that it can load a page and then flush it without having lost any of the previously loaded pages.
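
The eviction argument above can be sketched with a toy direct-mapped cache in Python. Everything here is an assumption made for illustration: the four-set geometry, the addresses, and the `shadow_buffer` option that stands in for the "extra cache pages" idea. It shows that invalidating only the speculatively loaded line still leaves the eviction visible, while a separate buffer does not.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: an address maps to set (addr % num_sets)."""

    def __init__(self, num_sets=4, shadow_buffer=False):
        self.num_sets = num_sets
        self.sets = [None] * num_sets   # one resident address per set
        self.shadow_buffer = shadow_buffer

    def load(self, addr):
        """Committed load: fills the set, evicting whatever was there."""
        self.sets[addr % self.num_sets] = addr

    def speculative_load_then_squash(self, addr):
        """Speculative load that is later rolled back."""
        if self.shadow_buffer:
            return                      # held in a side buffer, then dropped
        idx = addr % self.num_sets
        self.sets[idx] = addr           # the fill evicts the previous occupant
        self.sets[idx] = None           # rollback invalidates the new line only

    def is_cached(self, addr):
        return self.sets[addr % self.num_sets] == addr


def probe_evicted_sets(cache, victim_addr):
    primed = [100, 101, 102, 103]       # attacker primes every set
    for addr in primed:
        cache.load(addr)
    cache.speculative_load_then_squash(victim_addr)
    # Any primed line that is gone reveals which set the victim touched.
    return [a % cache.num_sets for a in primed if not cache.is_cached(a)]


print(probe_evicted_sets(DirectMappedCache(), victim_addr=6))                    # [2]
print(probe_evicted_sets(DirectMappedCache(shadow_buffer=True), victim_addr=6))  # []
```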

      Or alternatively, approach the problem from the opposite perspective. The problem is caused not just by speculative execution but also because (for performance reasons) the OS leaves all process memory mapped into every process's address space and then uses permissions to try to make that memory unavailable. The other fix is to find a way to redesign virtual memory so that other processes' memory is NOT mapped into each other's memory space and is thus truly inaccessible. But that may be an even more difficult solution to implement.

    12. Re:Is there any other option, Linus? by Anne+Thwacks · · Score: 5, Insightful

      The fix is to not read from memory into the CPU cache during speculative execution when that block of data is not there already. Changing this in the CPU core would solve both Spectre and Meltdown at a reasonable cost (it would not defeat many current optimizations).

      This is the correct answer. Proceed to Go and collect your $200.

      You may want to set aside more silicon for caching and less for handling the speculation.

      You may also want more cores rather than greater complexity. This would not have been a good choice ten years ago, but now that people are learning how to use the extra cores, it will probably sell well.

      Alternatively, you could set a flag to say whether your application cares about the risk (if the entire machine is dedicated to a single offline task, you probably don't).

      --
      Sent from my ASR33 using ASCII
    13. Re:Is there any other option, Linus? by AmiMoJo · · Score: 5, Insightful

      It's not a normal design compromise. AMD isn't affected by Meltdown because they did it right, Intel cut corners to get a small performance boost that they didn't need.

      Worse still, Intel's botched microcode fix can brick systems. Apparently 7 months wasn't enough to properly test it.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    14. Re:Is there any other option, Linus? by drinkypoo · · Score: 5, Informative

      And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because

      Is because it doesn't. AMD is not vulnerable to MELTDOWN and is less vulnerable to SPECTRE because they are more scrupulous and responsible than Intel, FULL STOP. There is no other reasonable way to regard the situation.

      Every speculative processor has some of the same issues, to some degree, but that is not every processor, and you are still using Intel's bullshit excuse FUD language when you say that all processors are vulnerable to these attacks. That is a lie as stated.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    15. Re:Is there any other option, Linus? by Zaiff+Urgulbunger · · Score: 5, Insightful

      I understand the sentiment, it's just not a professional way of handling the situation.

      Linus always tells it like it is, which you can either view as professional or not. But from an engineering perspective, it seems better to do that than just say something polite so you don't upset people.

      It appears to me he's directing his displeasure at Intel management/legal/marketing making decisions where really they shouldn't.

      And how does excluding 80-90% of the installed user base help Linux exactly?

      I very much doubt he's going to do anything of the sort. I would suggest the exact opposite in fact; he wants the best solution for all and is complaining that Intel's patches are constructed for their own benefit (legal/ass-covering), rather than that of their customers.

    16. Re:Is there any other option, Linus? by mwvdlee · · Score: 5, Interesting

      Linus seems to (begrudgingly) accept the need for a temporary fix, and there is already a temporary fix that works for current CPUs.
      The problem is Intel calling it a permanent fix and implying that every future CPU will be insecure by default unless the OS flips a switch.
      That way Intel can blame any performance issues on the OS and still pretend their CPU is fast, even though it isn't when running in the secure mode that no sane person would ever use.

      How about a car analogy:

      Imagine all cars have two bugs in the gearbox that trigger on putting it in reverse certain ways.

      Bug 1 makes a dashboard light blink one time.
      All car manufacturers have this bug, and they all fixed it when found.

      Bug 2 makes your car explode.
      AMD and ARM knew about this and fixed it. It made their cars a bit slower, but at least they wouldn't explode.
      Intel knew about it too, but they chose to ignore it. Their cars are a bit faster because of this.
      Intel fixed this by sending out a widget that stops the car from exploding, this widget does make Intel cars go slower.
      The widget doesn't fix it automatically, though! The driver has to switch the widget on every time he starts the car. If the driver doesn't switch the widget on, putting the car in reverse will still make it explode.
      Intel also says that this is how all future cars will be prevented from exploding; by adding this widget to every future car and requiring the driver to switch it on; it'll always be in "explode-on-reverse" mode by default.
      Intel does get to claim their car is faster by default though. Just don't put it in reverse.

      As a bonus analogy; Intel claims both bugs are the same because they are both triggered by the same action, so therefore all car manufacturers are vulnerable to the exploding car bug.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
  2. Linus Haiku by Moblaster · · Score: 5, Funny

    Linus proclaims thus:
    This patch is a piece of shit.
    So what else is new?

    1. Re:Linus Haiku by sacrilicious · · Score: 5, Insightful

      Linus proclaims thus: This patch is a piece of shit. So what else is new?

      If you mean "useful, straight communication from Linus as usual", then I'm with ya.

      But if you're trying to imply that Linus indiscriminately calls *everything* a piece of shit, then you're so offbase that I'll wonder if you're astroturfing on behalf of Intel. When Linus criticizes stuff, he's spot on. This patch is indeed a piece of shit.

      --
      - First they ignore you, then they laugh at you, then ???, then profit.
  3. Don't forget guys by Anonymous Coward · · Score: 5, Informative

    Don't forget, guys: Intel are the biggest contributor of code to the Linux kernel, and it was they who wrote the code that would have crippled AMD as well as Intel CPUs against their own flaw. Luckily AMD picked up on it and submitted an "elseif" statement to Intel's code so AMD users wouldn't be needlessly affected by Intel's CPU flaw.

    1. Re:Don't forget guys by PhunkySchtuff · · Score: 5, Informative

      Yep, I'm sure it was just a simple oversight that Intel's patch that hurt performance on Intel and AMD, and wasn't necessary on AMD, was applied by default to CPUs from both vendors. You know, Intel has only known about this for 6-7 months, so they were really rushed to get a working patch out in time. /sarcasm.

    2. Re:Don't forget guys by Rockoon · · Score: 5, Informative

      This is the same Intel that put "cripple AMD CPUs" code generation in their compiler.

      Here is CPU optimization expert Agner Fog's blog on the subject: Intel's "cripple AMD" function

      --
      "His name was James Damore."
  4. Re:and your solution is? by Anonymous Coward · · Score: 5, Insightful

    I think his response to all of this was a verbal kick to the scrotum of Intel in a very public way.

    I am glad of it too; had it not been for this thread I would not have known about the issues with these 'patches', which now seem more like last-minute, frantically cobbled-together garbage.

    Because of Linus' efforts, I and many other lurkers here on Slashdot will be VERY wary of any 'updates' involving Intel CPUs.

    Linus does not need to fix this, the community does not need to fix this, Intel needs to fix this. Let's be realistic.

  5. Re:What is going on here...? by Anonymous Coward · · Score: 5, Insightful

    Linus is pointing out that the patches as submitted do things that should not be necessary. For example, the Linux kernel now uses this code technique called “retpoline” to avoid one of the Spectre bug variants. But this set of new patches also includes a performance-hurting workaround for the same Spectre variant that was already worked around. Why would that be necessary? It suggests that maybe Intel isn’t fully disclosing everything that they know, and that maybe the “retpoline” workaround is insufficient for reasons that Intel is keeping secret.

  6. Re:What is going on here...? by ledow · · Score: 5, Informative

    He's saying that you shouldn't have to "opt-in" to the security that everybody expects when you boot up your processor.

    At the moment, the processor just says "Hey, if you flip some magic bits when I boot I'll slow myself down and try to apply a fix".

    The processor should instead say "Hey, I'm one of the fixed models, don't bother trying to fix me again".

    It's a marketing / legal tactic so they can say the processor runs at such-a-speed (but insecurely), whereas anyone who actually cares about using the processor has to - every boot - flip lots of magic bits to make it secure and kill its performance. If you forget, insecure. If you do it wrong, insecure. If your OS doesn't support it, insecure.

    What Linus wants, and I can't disagree with, is a flag to say "this processor isn't vulnerable, so you don't need to do anything". If it's not present, the kernel knows that it has to apply as many protections as it can, but can say "Hey, you have an insecure processor, we'll do our best" in the syslogs.
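
The difference between the two schemes can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not kernel code: the flag names `not_vulnerable` and `opt_in_fix` are hypothetical stand-ins rather than real CPUID bits, and the returned strings are illustrative log messages only.

```python
def secure_under_opt_in(cpu_flags, os_flipped_bits):
    """Opt-in scheme: the CPU is only secure if it offers the fix AND
    the OS remembered to switch it on at boot."""
    return "opt_in_fix" in cpu_flags and os_flipped_bits


def plan_boot_mitigation(cpu_flags):
    """'Fixed model' scheme: the CPU says whether it needs help at all.
    Returns (action, syslog_message)."""
    if "not_vulnerable" in cpu_flags:
        return ("none", "CPU reports it is not vulnerable; nothing to do")
    return ("apply_all_protections",
            "insecure processor detected; applying workarounds, doing our best")


# Opt-in: forgetting the bit, or an OS that does not know about it, is insecure.
print(secure_under_opt_in({"opt_in_fix"}, os_flipped_bits=False))  # False
print(secure_under_opt_in({"opt_in_fix"}, os_flipped_bits=True))   # True

# Fixed-model flag: secure by default, and old CPUs get a clear log message.
print(plan_boot_mitigation({"not_vulnerable"})[0])  # none
print(plan_boot_mitigation(set())[0])               # apply_all_protections
```

The design point is that in the second scheme an OS that does nothing is still safe on a fixed CPU, whereas in the first scheme doing nothing is the insecure case.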

  7. Re:and your solution is? by gweihir · · Score: 5, Insightful

    My reading also. Intel did some shady things they likely did know were shady in order to have the best performance. Now that they have been caught and the shady things actually turn out to be really bad, they still do not want to fix them, because they do not want to admit how much they padded the performance of their chips and they still do not want to compete with an actually good design because everybody will see how they have been screwed over by Intel.

    Linus is just calling that out. I mean, a functionality that fixes a very critical security bug and it is _off_ by default? That is insane!

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  8. Re:ARM guys will probably do it right by gweihir · · Score: 5, Insightful

    ARM does not have to fix anything for the issue under discussion and neither does AMD. Meltdown is Intel only. They did it to get more performance while everybody else was careful and did not do it. Intel was warned by numerous research papers that this could go badly. Now they are lying about it and are trying to a) confuse the issue and b) have the fix (which exposes their real performance when running securely) not active at startup. a) is dishonorable and b) is insane. Linus is just calling them out here.

    Spectre is something else, and hits almost everybody. While fortunately it is much, much harder to exploit (Meltdown is easy), Spectre will also be much harder to fix. It is possible that we will see an arms race for a while with Spectre and that, in the end, it will need to be a compiler-level fix that finally fixes things. Not good, but apparently the penalty for an actual hardware fix at this time would be a performance loss of 5x...20x.

    But to re-iterate: The only reason why Intel tries to lump Meltdown and Spectre together is so that they do not look as grossly incompetent and dismissive of their customers' security as they are with Meltdown.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  9. Re:Obligatory: Intel CPU Backdoor Report (Jan 1 20 by Opportunist · · Score: 5, Funny

    Is there a tl;dr version of that tl;dr version?

    --
    We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  10. Don't Bet On Malice When Stupidity Will Do? by ytene · · Score: 5, Interesting

    You make some really interesting points around retpoline, but I wonder if this latest from Intel fails to account for this because they are being disingenuous, or because they continue to be a bunch of idiots?

    We're seeing similar problems to this with other very-long-established technologies, such as Windows [with Windows 10]. Things that have worked for decades up until W10 are breaking, or they are breaking in new and frustrating ways.

    For example, I have a triple-screen setup and using removable SSDs via a caddy unit, I can boot my computer into 2 different W10 instances, as well as multiple Linux builds. The 2 W10 instances behave in completely different ways, despite being set up, by me, with EXACTLY the same approach [scripted]. On one of them the Task Bar keeps relocating itself around the desktop, on the other it remains static. I've been back-and-forth with Microsoft and they don't know why...

    At the root of the problem I suspect they have changed something in W10, written by someone no longer at the company, possibly poorly documented and possibly with unknown consequences.

    Maybe Intel are having similar issues... A decision was made a very long time ago to do something insecure and stupid with speculative execution, but the person who made that decision is no longer with the company, so a new team is trying to fix it and simply doesn't know what it's doing...

    I honestly don't know what the source is, but I do know that I am seeing "existing" functionality break with much greater frequency on core platforms like this. It just smacks of carelessness...

  11. Re:Lazy Intel? by drinkypoo · · Score: 5, Informative

    But i have to say, even as i prefer AMD, AMD does not have spectre resistant CPUs either.

    Yes, they do. SPECTRE attacks are more difficult to carry out against AMD than against Intel. In fact, they are only vulnerable to one out of two of the classes of SPECTRE attack. Please don't lie.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  12. Re:and your solution is? by drinkypoo · · Score: 5, Interesting

    we must fix things with what is possible, no matter how ugly.

    Intel went straight to ugly, and did not satisfactorily explore the realm of the possible. Linus perceived this, and announced it to the world. The ball is now in Intel's court. They can be responsible and competent, or the whole world can know that they are the fuckups that they are. It's their call.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  13. Fundamental problems : Yes, but... by DrYak · · Score: 5, Informative

    And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.
    I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.

    Well, there are different levels in the whole Spectre/Meltdown debacle.
    Not all CPUs are affected the same.

    (And nitpicking: only CPUs doing speculative execution are affected. Lots of RISC cores don't; only some recent ones, like 64-bit ARM cores, do. And there are still CISC cores that don't, even in modern days, like older Atoms and the Xeon Phi on Intel MIC boards.)

    Speculative execution, from the moment it was presented (around the era of the Intel Pentium Pro) as a new technology, was criticised as potentially executing past important checks if the speculation goes wrong. But it was dismissed back then.
    - because in the end, nothing is committed to memory/registers, but is instead discarded. There is no (direct) permanent effect of the wrong speculation.
    - nobody paid much attention to the indirect, less significant effects that could still be measured (like bringing memory into cache).
    - ...because back in those days (in the era of RC4 and 3DES encryption, MD4 and MD5 checksums, etc.), attacking cryptography was still done by breaking imperfect algorithms and brute-forcing small search spaces. Timing side channels were something of academic interest only. Known to exist, but in practice there were simpler ways to attack encryption.
    (so nobody in the early-to-mid 90s would have thought that the cache could lead to a usable, exploitable attack).

    "Spectre" is just some researcher figuring out a way to exploit this "known from the beginning" knowledge by putting it into the light of how crypto is attacked nowadays (side-channels, timing, etc.)

    That's the thing which every single CPU is affected by and which is still speculative execution working as it should (and normally should still be contained to data that could be accessed by the application anyway).

    But then, there are the causes of the "Spectre Variants / Meltdown" - due to "excessive" optimisations, suddenly the CPU no longer only accesses things that the application could already access anyway. It usually boils down to the CPU (and its designers) trying to be way too smart.

    Meltdown exclusively affects Intel CPUs. On Intel CPUs, to speed things up, memory protection checks are postponed. If something happens to be already available in the cache, speculative execution might pick it up even if it violates memory protection.
    This runs contrary to how memory protection is supposed to work and is undocumented (unlike the basic characteristics of speculative execution).
    (AMD CPUs, as a counterexample, are guaranteed by AMD not to be affected, because they do the expensive checks at the beginning of the pipeline and never let speculation through if it reads from an unauthorised memory location. There everything works according to the docs.)
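
A toy Python model of the two orderings described above might look like the following. It is only a sketch under loud assumptions: the memory contents, the addresses, and the `run_load` function are invented, and the "cache footprint" is a plain set standing in for a probe line whose index depends on the loaded value.

```python
MEMORY = {0x10: 7, 0x20: 99}   # address -> byte (hypothetical contents)
USER_READABLE = {0x10}         # 0x20 is protected (kernel-only)


def run_load(addr, check_first):
    """Model one user-mode load followed by a dependent probe access.
    Returns (faulted, cache_footprint)."""
    footprint = set()
    if check_first:
        # Check-first ordering: permission is verified before any data
        # is forwarded to dependent instructions.
        if addr not in USER_READABLE:
            return True, footprint
        footprint.add(MEMORY[addr])
        return False, footprint
    # Deferred-check ordering: dependent work runs speculatively with
    # the loaded value, touching a probe line indexed by that value...
    footprint.add(MEMORY[addr])
    # ...and the fault is only raised at retirement, after the cache
    # has already been touched.
    faulted = addr not in USER_READABLE
    return faulted, footprint


print(run_load(0x20, check_first=True))   # (True, set())
print(run_load(0x20, check_first=False))  # (True, {99})
```

Both orderings fault on the protected address, so the architectural behaviour matches the documentation; only the deferred-check ordering leaves a footprint that depends on the protected byte.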

    Spectre variants affect Intel CPUs: to speed things up, even if the destination of a jump is unknown (because it depends on a memory location that isn't even known yet, e.g. a not-yet-computed index into a jump table), an Intel CPU will try to speculate where execution would go (by keeping a list to remember where this position usually ends up jumping to). Due to the specific way Intel CPUs work internally and keep this list of "possible destinations of a jump", it can get confused and jump to an impossible location: the speculative execution will jump to an address that is not even in the jump table, and will execute at a position that could never be reached under normal circumstances.
    (AMD cannot exclude if their CPUs are affected. They definitely do not work the same wa
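
One way such confusion could arise is sketched below as a toy Python model. It assumes, purely for illustration, that the list of remembered jump destinations is indexed by only the low bits of the branch address, so two different branches can share a slot. The real internal structure is not described in this thread, and every address and name here is hypothetical.

```python
class ToyBTB:
    """Toy branch target buffer indexed by the low bits of the branch address."""

    def __init__(self, index_bits=8):
        self.mask = (1 << index_bits) - 1
        self.targets = {}

    def train(self, branch_addr, target):
        """Record where the branch at branch_addr last jumped."""
        self.targets[branch_addr & self.mask] = target

    def predict(self, branch_addr):
        """Predicted target, or None if this slot has no history."""
        return self.targets.get(branch_addr & self.mask)


JUMP_TABLE = {0x5000, 0x5100, 0x5200}  # legitimate targets of the victim branch
VICTIM_BRANCH = 0x401234
OTHER_BRANCH = 0x991234                # different address, same low 8 bits
GADGET = 0x7777                        # not in the victim's jump table

btb = ToyBTB()
btb.train(VICTIM_BRANCH, 0x5100)       # normal history
print(hex(btb.predict(VICTIM_BRANCH)))  # 0x5100

btb.train(OTHER_BRANCH, GADGET)        # an aliasing branch overwrites the slot
predicted = btb.predict(VICTIM_BRANCH)
print(hex(predicted), predicted in JUMP_TABLE)  # 0x7777 False
```

In this model the victim branch is speculatively steered to an address that never appears in its jump table, which is the "impossible" destination described above.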

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]