Google Researchers Say Software Alone Can't Mitigate Spectre Chip Flaws (siliconrepublic.com)

Posted by msmash on Thursday February 21, 2019 @04:45AM from the closer-look dept.

A group of researchers say that it will be difficult to avoid Spectre bugs in the future unless CPUs are dramatically overhauled. From a report: Google researchers say that software alone is not enough to prevent the exploitation of the Spectre flaws present in a variety of CPUs. The team of researchers -- including Ross McIlroy, Jaroslav Sevcik, Tobias Tebbi, Ben L Titzer and Toon Verwaest -- work on Chrome's V8 JavaScript engine. The researchers presented their findings in a paper distributed through ArXiv and came to the conclusion that all processors that perform speculative execution will always remain susceptible to various side-channel attacks, despite mitigations that may be discovered in future.

Just always apply hardware access controls. by KirbyCombat · 2019-02-21 04:56 · Score: 4, Insightful

Is my understanding not correct? I thought that these vulnerabilities were due to processors not applying memory access controls during speculative execution. For me personally, I was very surprised to find out that memory access controls could be bypassed at all. Isn't it just a matter of always applying memory access controls? Isn't that why the access control is in the hardware?
1. Re:Just always apply hardware access controls. by kurkosdr · 2019-02-21 05:03 · Score: 2
  
  Well the hardware implementation was buggy, so you can bypass it after all. And since you cannot patch hardware (unless the patch is disabling speculative execution altogether and making all the affected hardware much slower) people try to find all kinds of workarounds to mitigate the issue.
2. Re:Just always apply hardware access controls. by msauve · 2019-02-21 05:10 · Score: 4, Interesting
  
  But the summary claims "...all processors that perform speculative execution will always remain susceptible...". That's a blanket statement which covers all processors, existing or future.
  
  --
  "National Security is the chief cause of national insecurity." - Celine's First Law
3. Re:Just always apply hardware access controls. by suutar · 2019-02-21 05:11 · Score: 2
  
  As I recall, the problem is that speculative execution alters the state of the CPU and its cache, and there are ways to determine information about that state afterwards that don't involve violating memory access restrictions, like "hey, loading that address didn't take as long as expected, something must've pulled it into the cache already" - that sort of thing. So the question is how much can they make the state act like the speculative execution never happened without actually reaching the point that speculative execution isn't beneficial.
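To make the "that load came back too fast" idea concrete, here is a rough toy simulation. It is not an exploit and does not touch real hardware: the cache is just a Python set, and the latencies, the `ToyCache` class, and the helper names are all made up for illustration.

```python
# Toy model only: the "cache" is a set of addresses and the latencies are invented numbers.
HIT_CYCLES = 4      # assumed latency of a load that hits the cache
MISS_CYCLES = 200   # assumed latency of a load that goes out to RAM


class ToyCache:
    def __init__(self):
        self.lines = set()

    def load(self, address):
        """Return the simulated latency of a load and leave the line cached."""
        latency = HIT_CYCLES if address in self.lines else MISS_CYCLES
        self.lines.add(address)
        return latency

    def flush(self):
        self.lines.clear()


def victim_speculates(cache, secret, probe_base, stride):
    # On a real CPU the result of a mis-speculated load is discarded,
    # but the cache line it touched stays resident. Only that side effect is modelled.
    cache.load(probe_base + secret * stride)


def attacker_recovers(cache, probe_base, stride, candidates=256):
    # Time one load per candidate value; the fast one was touched earlier.
    timings = {v: cache.load(probe_base + v * stride) for v in range(candidates)}
    return min(timings, key=timings.get)


cache = ToyCache()
cache.flush()
victim_speculates(cache, secret=42, probe_base=0x10000, stride=4096)
recovered = attacker_recovers(cache, probe_base=0x10000, stride=4096)
print(recovered)  # 42
```

The attacker never reads the secret directly; it only compares load times, which is why memory access checks alone do not close the channel.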
4. Re:Just always apply hardware access controls. by radarskiy · 2019-02-21 09:35 · Score: 2
  
  You have assumed a multicore CPU that does not share the last level cache between cores, which is not necessarily true.
  While I have not seen a description of a Spectre exploit that specifically involves a multilevel cache, I have not seen that case specifically rejected either.
5. Re:Just always apply hardware access controls. by Miamicanes · 2019-02-21 10:03 · Score: 4, Informative
  
  It's hard to explain without getting EXTREMELY technical, but here's a SOMEWHAT technical explanation:
  Back in "the old days" (6502, 68000, 8086, etc), a specific machine language instruction took a precise, deterministic amount of time to execute... 1 cycle, 2 cycles, 3 cycles, whatever. Always, and without exception.
  Sometime around the AMD K5 (late 90s), we got to a point where the combination of cache and execution-time optimizations used by processors (speculative & out of order execution, cache, etc) made it SEEM like the days of deterministic execution timing were over. You could predict best-case and worst-case execution times for a given block of code, but ACTUAL runtime execution times had become seemingly random when you tried to measure them on an operating system like Windows or Linux.
  It turns out, we were wrong. Execution times were as deterministic as ever... it's just that making sense of their timing had become too complex for humans to understand, so it SEEMED random. Then, "big data" and "machine learning" became common, and people discovered that execution timings weren't nearly as random as humans had come to believe they were.
  Problem #2: due to the state various performance optimizations leave the CPU and its cache in, the amount of time it takes an attempt to do something prohibited to fail varies in subtle ways depending upon the values being protected.
  So... taking advantage of analytics, machine learning, and lots of brute-force hammering & observation, it's possible for attackers to gradually discover the likely values of protected ram and registers. They can't necessarily do it with a single hit... but if they hammer away at something a million times and discover that a particular bit's value seems to be '0' ~70% of the time, and '1' the other 30% of the time... well, the bit's value is probably 0.
  Here's another roughly analogous example: suppose you're attempting to discover the combination to a safe. Suppose the lock is designed to frustrate attempts to listen for pins falling into place by ensuring that EVERY dial position results in the lowering of one pin and the rising of another. HOWEVER... someone discovers that certain unsuccessful combinations produce a slightly different sound than others. Using deep learning techniques, the algorithm predicts that sound #1 indicates a combination that's almost right, while sound #2 indicates a combination that's completely wrong. By rapidly performing a few million experiments with different combinations, the algorithm is able to eliminate 99% of possible combinations, and focus on the 1% it believes are likely to work... and as it continues to experiment, it discovers a THIRD sound variant that appears to exist whenever the third number is equal to 17. By successively setting aside unlikely combinations, it eventually stumbles upon the correct combination to open the safe.
  In security, this kind of problem is well known, and has a solution that generally works well: limit the rate at which an attacker can attempt different combinations. The problem is, that solution is completely at odds with everything modern CPUs have attempted to accomplish for the past 50 years: achieving better performance.
  Ultimately, Intel and others are probably going to start making CPUs with a security gradient:
  * The high-performance no-security portion that is designed to maximize performance, but makes no guarantees about security. Basically, "gamer" oriented CPUs will dedicate most of their silicon to this portion.
  * High-performance with minimal security... designed to avoid blatantly leaking data to other processes, but still totally vulnerable to Spectre-type attacks. CPUs targeted to enterprise users will probably dedicate most of their silicon to THIS mode.
  * The slow, separate, secure fortress. Totally separate, with no cache or optimizations at all, designed to guarantee absolutely deterministic execution times from the perspective of an outside observer. Basically, an 80386
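The "hammer away a million times and take the likely value" step above can be sketched as a simple majority vote. This is a simulation under stated assumptions, not a real measurement: `noisy_probe` stands in for one timing observation, and the 70% accuracy figure is borrowed from the example in the comment.

```python
import random


def noisy_probe(true_bit, accuracy, rng):
    """One simulated observation: reports the true bit with probability `accuracy`."""
    return true_bit if rng.random() < accuracy else 1 - true_bit


def guess_bit(true_bit, trials, accuracy, rng):
    """Repeat the noisy observation and return the majority answer."""
    ones = sum(noisy_probe(true_bit, accuracy, rng) for _ in range(trials))
    return 1 if ones * 2 > trials else 0


def recover_byte(secret, trials=2001, accuracy=0.7, seed=1):
    """Recover an 8-bit value one bit at a time from unreliable observations."""
    rng = random.Random(seed)
    bits = [(secret >> i) & 1 for i in range(8)]
    guessed = [guess_bit(b, trials, accuracy, rng) for b in bits]
    return sum(bit << i for i, bit in enumerate(guessed))


print(hex(recover_byte(0xA5)))
```

A single probe is wrong about 30% of the time in this model, but a couple of thousand repetitions per bit make a wrong majority vanishingly unlikely, which is why rate limiting is the usual defence.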
gimme that old 6502 by fat+man's+underwear · 2019-02-21 05:09 · Score: 2

maybe make it run at a few 100 MHz, or upgrade to the 65816 architecture.... I think I'd rather have that.
Re:It's not a problem if you don't run unsigned co by bigpat · 2019-02-21 05:40 · Score: 2

We need to get away from this unsigned, unreviewed, wild code (like javascript) running on your machines.
Lock it down and stuff like this won't be a problem.
Systems for whitelisting apps and websites can help. But then the problem just shifts to how much do you trust whichever app stores or website whitelists you are using which are basically the same thing as a signing system. I mean I try to be careful about which apps I download, but if you want your computer to be a general purpose computer then you have to have some flexibility to run unsigned code. As a developer that often means my own code. Otherwise it is an appliance.
Re:It's not a problem if you don't run unsigned co by Rockoon · 2019-02-21 06:05 · Score: 2

We need to get away from this unsigned, unreviewed, wild code
As a representative of programmers everywhere, can you kindly take your idea and go fuck yourself?

--
"His name was James Damore."
Bad time for Intel CPUs by bangular · 2019-02-21 06:22 · Score: 4, Interesting

I'll be the first to admit this isn't my area of expertise. But after following these developments peripherally, I've been holding off buying a new desktop for a while.

It seems like Intel has bumbled this at every step. They've put out a lot of misinformation causing a lot of consumer confusion. It seems like every time they exclaim "it's fixed!" researchers say that's not the case. I'm assuming at this point we're probably at least a couple of CPU generations away from Intel fixing this properly.

On top of that, they've also been fighting the 10nm battle. More empty promises and missed deadlines on that front as well.

When I compare my current aging Intel system to single thread performance of the latest generation, it just doesn't justify the cost. AMD claims Zen 2 will fix all their problems. If they deliver, I will probably switch back to AMD. Intel burned a lot of goodwill in the past few years.
1. Re:Bad time for Intel CPUs by Fly+Swatter · 2019-02-21 08:24 · Score: 2
  
  So far, it has only been 'successfully' exploited in tightly controlled lab conditions. If you ask me, and no one did, it has been a way overblown risk by researchers; then fixes were hurriedly attempted with way too little testing; now researchers are saying 'it is still not fixed' and 'it can't be fixed', once again blowing the actual risk for us simple consumers way out of proportion.
  
  In the server and cloud world there might be a legitimate risk at some point; once we actually see an exploit in the wild, then it would be time to panic. Until then, those of us using computers for single-user tasks are burdened with all this FUD and slower performance unless we actively turn off the (according to researchers) still broken mitigation.
  
  If you can wait to buy until you need instead of want, that is always prudent for computer upgrades.
2. Re:Bad time for Intel CPUs by thegarbz · 2019-02-21 10:10 · Score: 2
  
  Let me start by defending Intel before turning this around. Speculative execution attacks are extremely difficult to execute in any constructive way without detailed targeted knowledge and access to the machine. As yet there's no known case of it actually being used and with very good reason: Pretty much every other security exploit is easier and more effective to execute with the exception being attacking a VM from another VM. Spectre / Meltdown really shouldn't come into consideration when buying a CPU unless you're running a cloud VM service or securing your country against a foreign state.
  Now on the flip side there already is no reason to go Intel, not when cost plays any part. The performance differences between Intel and AMD do not justify the costs. Where Intel leads in single-thread performance you need an extreme edge case (e.g. pro gaming) to justify the cost. Where Intel leads in multi-core there pretty much is zero justification to choose Intel over a Ryzen or a Threadripper for cost, again edge cases such as a requirement to process AVX-heavy loads notwithstanding.
  On the high end Threadripper even offers far more in terms of I/O including on core RAID and enough PCIe lanes to actually make a difference, something that Intel will offer on Xeons only if you spend even more money on additional frigging dongles.
3. Re:Bad time for Intel CPUs by lkcl · 2019-02-21 10:58 · Score: 3, Insightful
  
  I'm assuming at this point we're probably at least a couple of CPU generations away from Intel fixing this properly.
  unfortunately, it's much worse than the press is making out. i've had to investigate this in-depth as part of the design of the libre-riscv soc, because we critically rely on out-of-order execution for the vectorisation. i was shocked to learn that even in-order systems are potentially vulnerable to timing attacks.
  the first thing that people need to get absolutely clear: spectre was *just the first* in a *class* of timing attacks that opened researchers' and hardware designers' eyes to a blind-spot in computing architectures.
  the definition of a timing attack is as follows: one instruction may affect the completion time of past OR future instructions through resource starvation / contention, OR through state not being reset after use to a known uniform initial state.
  the FIRST spectre attacks were against memory and caches, on speculative designs.
  however it is perfectly possible, for example, for a multi-issue IN-ORDER system to have insufficient numbers of register ports, such that a certain unique combination of instructions may be arranged by an attacker to starve future instructions of the ability to complete instructions in a uniform time... and REQUIRE that they stall.
  by forcing instructions to stall, that is the very DEFINITION of a timing attack.
  against an *IN-ORDER* design.
  now, it is possible to put in place certain speculation mitigation barriers in hardware, however these barriers *ONLY* occur at interrupts, exceptions and, at a software / OS level, on "context switches". hence the reason why this paper says that no matter what hardware designers try to do, *intra-process* attacks simply CANNOT be mitigated without moving to an *INTER*-process software security model.
  FastCGI is toast, basically.
  there is a solution, and it's going to require a massive world-wide campaign to introduce a concept to the entire computing software world: the creation of intra-process speculation barriers. if we wish to keep using FastCGI, and if we wish to keep using Firefox and python-gevent (the single-process paradigm), we *need* a hardware instruction that "quiesces" internal state *AS IF* the hardware had just made a context-switch, terminating all speculative execution, resetting all internal state and so on.
  one way in which that may be possible to do in an out-of-order system that does not have such hardware-assisted in-process speculation barrier instructions is to issue about a hundred NOPs. the back-lash against doing so will be extreme, however it's not like there's much of a choice, here.
  bottom line is: this has been a major, major oversight by the entire computing industry for over 25 years. it's a problem *across the entire industry*, not just Intel, not just AMD, it's *everybody*. it's not going to be fixed in a couple of hardware revisions by one company.
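The register-port starvation case described above can be illustrated with a very rough toy model of an in-order, dual-issue machine. Everything here is an assumption for illustration: the function name, the four read ports, the issue width of two, and the per-instruction port counts do not describe any real core.

```python
def issue_cycle_of_last(port_needs, read_ports=4, issue_width=2):
    """Cycle (counting from 1) in which the last instruction of an in-order
    stream issues. `port_needs` lists how many register read ports each
    instruction needs, in program order."""
    cycle = 1
    ports_left = read_ports
    slots_left = issue_width
    for need in port_needs:
        if need > read_ports:
            raise ValueError("instruction needs more ports than the machine has")
        # In-order issue: if this instruction cannot go now, everything waits.
        if need > ports_left or slots_left == 0:
            cycle += 1
            ports_left = read_ports
            slots_left = issue_width
        ports_left -= need
        slots_left -= 1
    return cycle


# The same final instruction (2 ports) behind two different instruction mixes.
light = issue_cycle_of_last([1, 1, 1, 2])
heavy = issue_cycle_of_last([3, 3, 3, 2])
print(light, heavy)  # 2 4
```

No speculation is involved in this model; the last instruction is simply delayed by whatever ran before it, and that delay is the observable signal.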
Re:I for one am.... by Anonymous Coward · 2019-02-21 07:10 · Score: 2, Funny

... actually quite surprised.
Software engineers admitting something cannot be just fixed in software? Astounding.
Re:Oooh breaking news by AuMatar · 2019-02-21 07:30 · Score: 3, Informative

Infinite cores wouldn't make speculative execution unnecessary. Speculative execution exists because you're trying to do things sequentially. Putting another core in there won't speed it up in any fashion: to avoid speculation you have to wait until the branch is determined. You could remove speculative execution and the optimization it provides, but you'll end up with a FAR slower processor, as your core will be idle from the beginning of a branch statement until it's done, putting a lot of no-ops into the pipeline. The deeper the pipeline in the architecture, the worse it will be.
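A back-of-the-envelope version of that cost, as a sketch: assume an idealised pipeline that otherwise retires one instruction per cycle, and that fetch stalls until each branch resolves. The function name, the one-branch-in-five mix, and the resolve depths are illustrative assumptions, not measurements.

```python
def cycles_without_speculation(instructions, branch_every, resolve_depth):
    """Cycles for an idealised one-instruction-per-cycle pipeline that inserts
    bubbles after every branch until the branch resolves `resolve_depth`
    stages into the pipeline. Pipeline fill and cache misses are ignored."""
    branches = instructions // branch_every
    bubbles = branches * (resolve_depth - 1)
    return instructions + bubbles


shallow = cycles_without_speculation(1000, branch_every=5, resolve_depth=5)
deep = cycles_without_speculation(1000, branch_every=5, resolve_depth=15)
print(shallow, deep)  # 1800 3800
```

In this model the same 1000 instructions take 1.8x the ideal cycle count on the shallow pipeline and 3.8x on the deep one, which matches the point that deeper pipelines pay more for every unresolved branch.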

--
I still have more fans than freaks. WTF is wrong with you people?
The irony by OneHundredAndTen · 2019-02-21 07:52 · Score: 3, Interesting

And the much-maligned, all-but-dead Itanium is immune to Spectre. Fancy that.
Re:I for one am.... by Darinbob · 2019-02-21 10:45 · Score: 2

I am. Though maybe it's just bad technology reporting. When they say "all processors that perform speculative execution will always remain susceptible" there's something wrong being reported. They should add in that this does not mean ALL processors (past, present, future, and from any vendor), and should end with "unless processors are redesigned."
Reading the summary as-is literally, it is disagreeing with itself.