Another Day, Another Intel CPU Security Hole: Lazy State (zdnet.com)
Steven J. Vaughan-Nichols, writing for ZDNet: The latest Intel revelation, Lazy FP state restore, can theoretically pull data from your programs, including encryption software, from your computer regardless of your operating system. Like its forebears, this is a speculative execution vulnerability. In an interview, Red Hat Computer Architect Jon Masters explained: "It affects Intel designs similar to variant 3-a of the previous stuff, but it's NOT Meltdown." Still, "it allows the floating point registers to be leaked from another process, but alas that means the same registers as used for crypto, etc." Lazy State does not affect AMD processors.
This vulnerability exists because modern CPUs include many registers (internal memory) that represent the state of each running application. Saving and restoring this state when switching from one application to another takes time. As a performance optimization, this may be done "lazily" (i.e., when needed) and that is where the problem hides. This vulnerability exploits "lazy state restore" by allowing an attacker to obtain information about the activity of other applications, including encryption operations. Further reading: Twitter thread by security researcher Colin Percival, BleepingComputer, and HotHardware.
This just keeps getting better and better. "Hey, I got an idea! I'm going to switch to Intel for this build," I thought to myself. I wasn't even smoking any good weed when I did that, or bad weed. Nope, my dumbassery was my own fault.
I read at +2. If your post doesn't reach that level I will not see or respond to it.
...For some operating systems, the fix is already in. Red Hat Enterprise Linux (RHEL) 7 automatically defaults to (safe) "eager" floating point restore on all recent x86-64 microprocessors (approximately 2012 and later) implementing the "XSAVEOPT" extension. Therefore, most RHEL 7 users won't need to take any corrective action.
Other operating systems believed to be safe are any Linux version using 2016's Linux 4.9 or a newer kernel. The Linux kernel developers are patching older kernels. Most versions of Windows, including Server 2016 and Windows 10, are believed to be safe. If you're still using Windows Server 2008, however, you will need a patch. The latest editions of OpenBSD and DragonflyBSD are immune, and there's a fix available for FreeBSD. ...
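The two Linux criteria quoted above (the "xsaveopt" CPU flag and a 4.9-or-newer kernel) can be checked roughly like this. This is only a sketch: the helper names `cpu_has_flag` and `kernel_at_least` are made up for illustration, and neither check proves that a given system is actually patched or running eager FP restore.

```python
import re


def cpu_has_flag(cpuinfo_text, flag):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that "xsave" does not match "xsaveopt".
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


def kernel_at_least(release, minimum=(4, 9)):
    """Compare the leading 'major.minor' of a release string like '4.14.0-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

On a Linux box you would feed these the contents of /proc/cpuinfo and the string from `platform.release()`. Note that RHEL 7 ships a 3.10-based kernel, so it fails the version check even though the article says it defaults to eager restore on XSAVEOPT-capable CPUs; distro backports make the version number a weak signal.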
AMD should be dumping money right now. Making all this as public as possible and pushing its own CPUs. Go AMD go!
For some types of chips and applications, perhaps having real security means not being able to do fancy optimizations that degrade security.
I wonder how well typical PC operating systems would work if they were re-compiled to not take advantage of optimizations and run on a completely-de-optimized architecture-compatible CPU with buses, memory, chipsets, etc. that were similarly "de-optimized" and had other things in them like less-tightly-packed circuits to prevent certain side-channel attacks (e.g. rowhammer).
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Actually, describing them as FP is misleading; they should be called the vector registers. They are wide (128, 256 or 512 bits), and while all FP instructions (except legacy x87) only work on these registers, there are plenty of instructions which use them as fixed-size arrays of 8, 16, 32, and even 64 bit integers.
Of course you could perform the same tasks on the regular 64 (formerly 32) bit registers but this would be much slower. Using the vector register file also frees the integer registers to be used as counters, pointers and so on. This was especially true on 32 bit, where out of the 8 registers, the stack pointer (ESP) cannot be used for anything else, the frame pointer (EBP) may be needed for debugging/dynamic allocation, and EBX is used for addressing in position-independent code (all libraries). In many cases, the compiler was (32 bit is obsolete) left with 5 or 6 registers to play with.
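To make the "same register, different views" point concrete, here is a small Python sketch that treats 16 bytes as a stand-in for one 128-bit XMM register and reinterprets them as lanes of different widths. It only mimics the bit layout (little-endian, as on x86); it does not execute any SIMD instructions, and the `lanes` helper is a name invented for this example.

```python
import struct

# 16 bytes standing in for one 128-bit register, loaded with four 32-bit floats.
register = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)


def lanes(raw, fmt):
    """Reinterpret the same 16 bytes as little-endian lanes of the given struct format."""
    return struct.unpack("<" + fmt, raw)


as_f32 = lanes(register, "4f")   # four single-precision floats
as_u8 = lanes(register, "16B")   # sixteen 8-bit integers
as_u16 = lanes(register, "8H")   # eight 16-bit integers
as_u32 = lanes(register, "4I")   # four 32-bit integers
as_u64 = lanes(register, "2Q")   # two 64-bit integers

print([hex(x) for x in as_u32])
```

The bits never change, only the interpretation does, which is why leaking an "FP" register can just as well leak integer data such as key material.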
I thought AMD and ARM cpus were also susceptible to exploits. Is that wrong?
Indeed, lots of different CPU manufacturers could be producing CPUs susceptible to Spectre vulnerabilities.
But not all CPUs are created equal.
There are some key differences :
- not all CPUs actually do speculative execution. Only a couple of ARM cores actually do. The huge remaining number don't, and thus can't in any way be subject to Spectre-class vulnerabilities.
(Even some of Intel's own cores, like some older Atoms, or like Xeon Phi GPGPUs, don't do speculative execution)
Intel has a different safety vs optimization threshold than most others:
- with most other CPU manufacturers, Spectre vulnerabilities boil down to "accessing a memory region to which the process already had read access anyway" (see v1 and v4), so they can already be addressed by safe practices (v1: don't put third-party JIT code and crucial information in the same process, e.g.: a browser's JIT engine running a webpage's scripts and the password manager should not be in the same process) (v4: always clean up your stack before bailing out if it could contain critical data, or better, keep all the critical data in some specific mapped pages), etc.
- Intel tends to push performance first to the detriment of anything else : some security tests come in too late.
AMD and most ARM cores won't speculate past memory protection. If a memory region is blocked from access for the process (generally : kernel memory), AMD will check the memory protection and never attempt to access the restricted region to begin with, whereas Intel CPUs will speculatively access the restricted region and only do the check much later, by which point the usual Spectre cache-loading side-channel leakage could have happened.
(There are a select few ARM cores which are susceptible to Spectre v3a. Basically the same concept, but regarding system-reserved registers - this being a RISC architecture, with tons of registers)
AMD and ARM will honor the FPU trap (on x86, the device-not-available exception, #NM). In today's vulnerability Intel tries to speculatively execute past the point at which the system should context switch the FPU registers (which include stuff like the SSE and AVX registers, i.e.: an attacker could be speculatively peeking into what another process did with these - SIMD operations with SSE/AVX are used in encryption/decryption, so an attacker could occasionally spot what other processes are decrypting/encrypting and with which keys)
So you end up with vulnerabilities v3 and today's which are Intel exclusive.
Also Intel tends to be a tiny bit more aggressive, taking some shortcuts and/or having way deeper pipelines and longer speculation, in order to shave a few cycles off.
End results :
v2 (Indirect Branch prediction) is currently successfully exploitable on Intel. Though in theory some AMD CPUs could do speculative indirect branching, there are no currently usable exploits in the wild.
v1 actually works on Intel CPUs without even activating the JIT - the speculation is so deep that a bytecode interpreter could speculatively access stuff
v4 works much more easily on Intel (deeper pipeline, higher chance of managing to read something that wasn't erased from memory yet)
etc.
TL;DR: due to technical choices prioritizing performance, Intel CPUs tend to be even more vulnerable.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Where's the tablet-optimised website? This one's just a few tweets, it doesn't even have a logo! Where's the superbowl ad?
In an era where every single vulnerability needs to get a catchy name and a well-designed responsive website (almost a Super Bowl ad?), even before confirming whether there's a viable exploit, it's nice to see the big hats of security reacting (cpercival - the daddy of scrypt) and taking the time to write an actual exploit to test, even if the communication is done over a channel as unglamorous as Twitter.
Floating point registers are used for encryption?
This doesn't make any sense to me, unless they're actually shared registers.
Exactly :
the FP registers are shared with integer SIMD registers
(x87 and MMX are exactly the same register file under a different name; modern CPUs tend to use SSE/AVX for their floating point computation and use the same registers for integer SIMD too, etc.)
Basically (as cpercival explained in his Twitter thread), the only registers that always get switched are the basic CPU registers (rax, rbx, etc.)
Everything else (e.g.: SSE's xmm0, xmm1, etc.) will only be switched when needed (through a trap, the device-not-available exception, #NM). But just like with Meltdown Intel CPUs didn't give a fuck about memory protection, in this Spectre vulnerability Intel CPUs don't give a fuck about context switching and will happily speculatively execute and process old left-over data in these registers.
The problem is that the most efficient crypto implementations are likely to be implemented using MMX, SSE or AVX (including the AES-NI hardware), thus critical data is likely to be hanging around in these registers when a process that handles encryption is interrupted and the CPU is task-switched to the attacker's process.
On any other CPU, if the attacker's process attempts to access these registers, the process will immediately be interrupted, and the kernel will switch the FP/MMX/SSE/AVX context (the process will only see its own content of XMMn).
On Intel hardware, the CPU will happily try to speculatively continue executing based on the old stale content of the XMMn registers (which could contain the data that the encrypting process was manipulating), making it possible to leak it through the usual Spectre cache side-channel, until the trap is served, the correct context is loaded and the speculative attempt is thrown away.
Sounds stupid, but it enables Intel CPUs to show up a couple of % faster on benchmarks.
e.g.: benchmarks that do encryption are less likely to take a performance hit from multi-tasking :
in the case of switching from an encryption process -> normal CPU process -> back to the encryption process, a normal CPU (like AMD) will lose time forcing a reload of the SSE/AVX context when returning to the encryption.
An Intel CPU will happily continue the encryption immediately with the content of the XMMn registers, and the speculative execution will eventually turn out to be right, as the left-over data in the XMMn registers is the encryption's own content from right before it got interrupted by the CPU-only thread (which left it untouched).
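The lazy vs eager trade-off described in this comment can be sketched with a toy model. Everything here is an assumption made for illustration: the `ToyCPU` class, its single `xmm0` register and the `speculative_view` value are invented, `trap_on_fp` merely stands in for the x86 CR0.TS bit, and real hardware and kernels are far more involved. The model only shows why deferring the swap saves work and why stale data is still physically in the register when another task first touches it.

```python
SECRET = 0xDEADBEEF


class ToyCPU:
    """Toy core: one shared vector register plus a 'trap on first FP use' flag."""

    def __init__(self, lazy):
        self.lazy = lazy
        self.xmm0 = 0             # the only vector register in this model
        self.fpu_owner = None     # task whose data is physically in xmm0
        self.current = None       # task currently scheduled
        self.trap_on_fp = False   # stands in for CR0.TS
        self.saved = {}           # per-task saved copies of xmm0
        self.swaps = 0            # how many save/restore operations happened

    def _swap_fpu(self, task):
        """Save the old owner's register and load the new task's copy."""
        if self.fpu_owner is not None:
            self.saved[self.fpu_owner] = self.xmm0
        self.xmm0 = self.saved.get(task, 0)
        self.fpu_owner = task
        self.swaps += 1

    def switch_to(self, task):
        self.current = task
        if self.lazy:
            # Lazy: do nothing now, just arm the trap if xmm0 belongs to someone else.
            self.trap_on_fp = self.fpu_owner != task
        elif self.fpu_owner != task:
            # Eager: swap the vector state at every context switch.
            self._swap_fpu(task)

    def _handle_trap(self):
        if self.trap_on_fp:
            self._swap_fpu(self.current)
            self.trap_on_fp = False

    def write_xmm0(self, value):
        self._handle_trap()
        self.xmm0 = value

    def read_xmm0(self):
        """Return (speculative_view, architectural_value)."""
        speculative_view = self.xmm0  # what is physically there before the trap runs
        self._handle_trap()
        return speculative_view, self.xmm0


def attacker_view(lazy):
    cpu = ToyCPU(lazy)
    cpu.switch_to("crypto")
    cpu.write_xmm0(SECRET)
    cpu.switch_to("attacker")
    return cpu.read_xmm0()


def round_trip_swaps(lazy):
    cpu = ToyCPU(lazy)
    cpu.switch_to("crypto")
    cpu.write_xmm0(SECRET)
    cpu.switch_to("integer_only")  # never touches xmm0
    cpu.switch_to("crypto")
    cpu.read_xmm0()
    return cpu.swaps


print(attacker_view(lazy=True), attacker_view(lazy=False))
print(round_trip_swaps(lazy=True), round_trip_swaps(lazy=False))
```

In this model the attacker's architectural read is always its own zeroed register, but in lazy mode the value sitting in the register before the trap is handled is the other task's secret. The crypto -> integer-only -> crypto round trip costs one swap in lazy mode versus three in eager mode, which is the performance argument made above.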
I wonder how well typical PC operating systems would work {...}
Well, lots of smartphones and nearly all single-board computers (including the Pi) use ARM cores that don't do any speculative execution.
(Only very few high-performance smartphones use ARM cores with speculative execution, and are thus potentially Spectre-vulnerable)
They are still not that bad at common browsing tasks.