Slashdot Mirror


Google Says CPU Patches Cause 'Negligible Impact On Performance' With New 'Retpoline' Technique (theverge.com)

In a post on Google's Online Security Blog, two engineers described a novel chip-level patch that has been deployed across the company's entire infrastructure, resulting in only minor declines in performance in most cases. "The company has also posted details of the new technique, called Retpoline, in the hopes that other companies will be able to follow the same technique," reports The Verge. "If the claims hold, it would mean Intel and others have avoided the catastrophic slowdowns that many had predicted." From the report: "There has been speculation that the deployment of KPTI causes significant performance slowdowns," the post reads, referring to the company's "Kernel Page Table Isolation" technique. "Performance can vary, as the impact of the KPTI mitigations depends on the rate of system calls made by an application. On most of our workloads, including our cloud infrastructure, we see negligible impact on performance." "Of course, Google recommends thorough testing in your environment before deployment," the post continues. "We cannot guarantee any particular performance or operational impact."

Notably, the new technique only applies to one of the three variants involved in the new attacks. However, it's the variant that is arguably the most difficult to address. The other two vulnerabilities -- "bounds check bypass" and "rogue data cache load" -- would be addressed at the program and operating system level, respectively, and are unlikely to result in the same system-wide slowdowns.
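
The quoted claim that KPTI's impact "depends on the rate of system calls made by an application" can be pictured with a back-of-the-envelope model. This is only a sketch: the function name `kpti_overhead_fraction`, the workloads, and the 0.5 microsecond extra cost per system call are all assumptions for illustration, not figures from Google's post.

```python
def kpti_overhead_fraction(syscalls_per_second, extra_seconds_per_syscall):
    """Estimate the share of one core's time spent on the added per-syscall cost.

    Toy linear model: overhead = syscall rate * extra cost per syscall,
    capped at 1.0 (a core cannot spend more than all of its time).
    """
    if syscalls_per_second < 0 or extra_seconds_per_syscall < 0:
        raise ValueError("inputs must be non-negative")
    return min(1.0, syscalls_per_second * extra_seconds_per_syscall)


# Assumed extra cost per kernel entry/exit; a made-up number, not a measurement.
EXTRA_COST = 0.5e-6

# Hypothetical workloads: one mostly computing, one making many system calls.
compute_bound = kpti_overhead_fraction(1_000, EXTRA_COST)
syscall_heavy = kpti_overhead_fraction(200_000, EXTRA_COST)

print(f"compute-bound overhead: {compute_bound:.4%}")
print(f"syscall-heavy overhead: {syscall_heavy:.4%}")
```

Under these assumed numbers the overhead scales directly with the syscall rate, which is why the same patch could look negligible on one workload and significant on another.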

13 of 120 comments

  1. time flies by mapkinase · · Score: 4, Funny

    Pentium 4.99989 disaster seems like yesterday.

    --
    I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
  2. Or just Buy AMD & get no slow down with more p by Joe_Dragon · · Score: 5, Informative

    Or just buy AMD & get no slowdown, with more PCIe lanes.

  3. Re:You can't "patch" hardware by supremebob · · Score: 5, Informative

    Geez... You make it sound like this is the first time ever that someone has had to write a software patch to bypass a hardware flaw. Driver developers have had to come up with clever workarounds to hardware defects since the dawn of computing.

    These Intel firmware fixes are just going to become part of yet another security update that will be required to keep systems secure.

  4. Google's technique requires patching binaries/code by JoeyRox · · Score: 4, Interesting

    Google's technique is to patch binaries so that branches/calls don't use the branch prediction mechanism of the CPU, which has a small performance hit, but a much smaller one than KPTI. I suppose the presumption is that harmful code which uses the technique would have to have it compiled into its binary, since most OSes prevent the self-modification of code segments/TLB entries once they've been placed into memory by the OS loader. But what about code segments generated entirely at runtime, including from interpreters and libraries like libjit?

  5. Re:More lies by 110010001000 · · Score: 4, Interesting

    It isn't "chip level". The Intel PR spin is out in full effect. Meltdown is a major flaw that can only be fixed by removing the flawed Intel processor and replacing it with a processor that doesn't contain the flaw. If you don't do that, the best you can do is mitigate the effects. There is no microcode fix either. What Google is doing is recompiling everything, which is fine, but hackers aren't going to do that.

  6. Re:Idiotic Moderation by 110010001000 · · Score: 5, Insightful

    Because it doesn't make sense: Intel has a KNOWN UNFIXABLE FLAW in Meltdown. It cannot be fixed. You are saying "don't switch to AMD because they might have a major flaw too at some point". Meltdown is a much larger problem than Spectre is.

  7. Re: amd needs desktop level server chips / ipmi bo by 110010001000 · · Score: 5, Informative

    More Intel spin. Spectre and Meltdown are different flaws. Meltdown is severe and unfixable and only affects Intel.

  8. Summary not very helpful, here's my attempt. by PhrostyMcByte · · Score: 5, Informative

    Google has created "retpoline", a technique which allows an indirect branch (e.g. a vtable call) to occur in a way that effectively disables speculative execution by isolating branch target prediction into a safe effectless loop. This addresses Variant 2 (branch target injection, one of the two Spectre variants).

    Retpoline does not depend on or assist a CPU or an OS patch: it is done purely at the software level, per-app, by a compiler. There is no simple OS-wide patch.

    Google says a retpoline call has performance "within cycles" of a regular old mispredicted branch. The zero-cost predictions we're used to are a thing of the past, because retpoline effectively forces a misprediction on every such call. I'd be curious to see a benchmark of an indirection-heavy platform like .NET.

    This does not help address or optimize Variant 3, which is what the big kernel patches for Page Table Isolation are needed for. So, your I/O-dependent apps like databases are still going to take a big performance hit. Nor does it address Variant 1.
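
The point above about losing "zero-cost predictions" can be put into rough arithmetic. This is a toy cost model, not a retpoline implementation (the real thing is a compiler-generated instruction sequence). The function names, the cycle counts, and the 95% prediction hit rate are all assumed values chosen for illustration.

```python
def expected_call_cycles(hit_rate, predicted_cycles, mispredicted_cycles):
    """Average cost of an ordinary indirect call, weighted by predictor hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * predicted_cycles + (1.0 - hit_rate) * mispredicted_cycles


def retpoline_call_cycles(mispredicted_cycles, extra_cycles=0):
    """Cost of a retpoline call, modeled as a misprediction plus a few cycles."""
    return mispredicted_cycles + extra_cycles


# Assumed figures only; real costs depend on the CPU and the code.
PREDICTED = 1      # cycles for a correctly predicted indirect call
MISPREDICTED = 20  # cycles for a mispredicted indirect call

baseline = expected_call_cycles(0.95, PREDICTED, MISPREDICTED)
with_retpoline = retpoline_call_cycles(MISPREDICTED, extra_cycles=2)

print(f"baseline: {baseline:.2f} cycles per call")
print(f"retpoline: {with_retpoline} cycles per call")
print(f"per-call slowdown: {with_retpoline / baseline:.1f}x")
```

In this model the per-call slowdown grows with the predictor's hit rate, so the total impact would depend on how much of a program's time goes to indirect calls, which is why an indirection-heavy runtime would be the interesting benchmark.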

  9. Google is connected to Intel at the hip by bongey · · Score: 4, Insightful

    Google is dependent on Intel CPUs at the moment and has a vested interest in not saying "well, our cloud just got 5-30% slower."

  10. Re:Google's technique requires patching binaries/c by PhrostyMcByte · · Score: 5, Insightful

    Google's technique ... has a small performance hit but much smaller than KPTI.

    Keep in mind Google's technique (retpoline) is not an alternative to KPTI. Retpoline addresses Variant 2. KPTI addresses Variant 3. Both are required.

  11. Re: Idiotic Moderation by Anonymous Coward · · Score: 4, Interesting

    I take it you didn't read AMD's press release explaining exactly what you say you want to hear.

    It's true that all processors have errata and can have bugs/flaws/security weaknesses... but, the Meltdown flaw which does not affect AMD is a specific kind which can't affect AMD because of architecture differences. Specifically, AMD checks to make sure user land code doesn't try to access kernel data without the correct permissions before executing predictive branches on it. Intel doesn't -- it goes ahead and runs the illegal code before flagging an exception to dump the branch after the fact. So, for a short time, there's data in cache on an Intel chip that should NOT be there because it should never have been accessed by the system to begin with.... and a specially crafted program can read it before it's flushed. This is because Intel (and ARM and others) chose a certain optimization for their speculative engine while AMD chose a different, more secure architecture.

    https://www.pcgamesn.com/intel...

    AMD's fix is -- no fix needed b/c we weren't stupid enough to let even speculative code run without checking its permissions first.

    Per AMD for the initial Linux kernel patch:

    AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.

    AMD is definitely vulnerable to lesser exploits -- some of which are patched, others mitigated... and some are obfuscated because they are processor-generation specific. But, by design, they are not vulnerable to Meltdown or any variant like it.

    Now remember... the fix for Meltdown is to flush the cache -- all levels -- when switching from user mode to kernel mode or vice versa.... every single time. That's a heck of a hit for some use cases. I believe Intel has found some ways to mitigate it with their 8th gen core series and will likely tinker with a better patch in the future.

    It is absolutely a great idea to purchase an AMD processor if it suits the needs of one's business for those use cases where it will perform better than an Intel chip that is crippled by this horrendous bug -- all things being equal. Obviously, businesses have contracts with 3rd party suppliers and don't necessarily get to pick and choose every aspect of hardware, nor is AMD a savior necessarily if their total cost of ownership is higher because of servicing more varieties of equipment, dealing with more motherboard types and vendors, electricity / Air conditioning costs, etc.

    One doesn't have to be a shill for AMD to notice it's obvious that Intel has a serious hardware flaw that AMD lacks and while any CPU can have errata, most can be patched with negligible effects. Intel having to flush caches between modes is a serious flaw if one runs programs that switch modes constantly. For average users and even gamers, there's not a huge impact. I'm running the patch right now for Windows and I can tell it affects Virtual Machines and a bit of file serving, but not enough for me to be too upset about it. If I had a high-end cluster for databases, a 20% hit to that would definitely make me want to check out AMD as an alternative... b/c even IF AMD has a bug that needs patching, it's unlikely to ever affect performance like this one does by requiring cache flushes to avoid having processes of user and kernel modes running at the same time for fear of one stealing data from the other.

  12. Re: Idiotic Moderation by jezwel · · Score: 4, Insightful

    Is there a compelling reason to believe that AMD processors are less likely to be vulnerable in the future than Intel processors?

    Right now only Intel is massively exposed on one security issue where other manufacturers are not. So yes - this makes it appear that AMD design philosophy values security over performance. Whether that is proved out remains to be seen.

    If one manufacturer is cutting corners with the engineering and the other isn't, then there's a logical reason.

    Intel seems to be the one cutting corners - for decades. You do remember the FDIV and F00F bugs in early Pentiums? I don't recall other manufacturers having such severe problems (sure, mainly PR with FDIV) that a recall was required.

    Otherwise, there isn't a logical basis for using that as a reason to change your behaviour in the future.

    Intel cannot provide CPUs to retail without this flaw for another 18 months or so. That should most certainly influence short-term future behaviour IF the fix causes significant performance issues with your workload.

    It's also entirely possible that, faced with backlash and distrust, the manufacturer might take additional steps to ensure that no such similar issues occur in the future. If there was demonstrable evidence of this, it might be a good reason not to switch.

    Sounds strange to not switch to a vendor that doesn't suffer from this vulnerability, in the hope that Intel will fix its processes to ensure this doesn't happen again. Right now though, there's no good reason to specify Intel for your CPUs.

    The important question is whether there is any reason to believe Intel processors will be more vulnerable in the future.

    Why is that important? All manufacturers will have problems. You make plans with known data today. Intel messed up big time, and until the problem is fixed they should absolutely have this issue in the 'known problems' pile when consideration of CPU choice is done.

  13. Re:Idiotic Moderation by Anonymous Coward · · Score: 5, Informative

    Correction, they speculated that they were able to get AMD chips to do that. Their toy attack (within process) succeeded, showing AMD chips will do speculative ordering. No actual security risk there, because processes can read their own memory.

    BUT, they didn't know for a fact why they didn't succeed in attacking the kernel.

    We've now had statements from AMD (after the paper was released) - namely, that permission bits are checked BEFORE issuing instructions so kernel memory isn't readable, even speculatively.

    So... yeah, remember the paper is only what they think could be happening.