Hyperthreading Considered Harmful
cperciva writes "Hyper-Threading, as currently implemented on Intel Pentium Extreme Edition,
Pentium 4, Mobile Pentium 4, and Xeon processors, suffers from a serious
security flaw. This flaw permits local information disclosure, including
allowing an unprivileged user to steal an RSA private key being used on the
same machine. Administrators of multi-user systems are strongly advised
to take action to disable Hyper-Threading immediately.
I will be presenting this attack at
BSDCan 2005 at 10:00 AM EDT on May 13th, and at the conclusion of my talk
I will also be releasing a paper describing the attack and possible mitigation
strategies."
I read about this last night here at KernelTrap. They offer more info, evidently having talked to Colin...
Unlike SMP, with HT you're interleaving two threads on the same physical execution unit. That means that there is data from another thread in registers at the same time that you're executing, without having enough instructions execute during a context switch to flush the pipeline. It also means that the other process's page table is in the MMU while you're executing. Even if their proof-of-concept attack doesn't work on some other operating systems, everyone needs to look over their code to make sure this isn't just an accidental effect that could change with increasing pipeline depths, different context switch logic, etc.
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
Actually, Intel CPUs contain patchable microcode ROMs. You can see the option to enable it when you configure a Linux kernel.
-mkb
I'd be willing to bet he's right. He is currently awaiting a doctorate from the University of Oxford, which is commonly held to be the finest academic institution in the world.
(I'm not biased by having spent the past 7 years there)
The Ars Technica page on hyperthreading with the Xeon might provide some clues. It lists which parts of the CPU are replicated, partitioned and shared.
...
...
One final bit of information that should be included in a discussion of partitioned resources is the fact that when the Xeon is executing only one thread, all of its partitioned resources can be combined so that the single thread can use them for maximum performance. When the Xeon is operating in single-threaded mode, the dynamically partitioned queues stop enforcing any limits on the number of entries that can belong to one thread, and the statically partitioned queues stop enforcing their boundaries as well.
The same can be said for the register file, another crucial shared resource. The Xeon's 128 microarchitectural general purpose registers (GPRs) and 128 microarchitectural floating-point registers (FPRs) have no idea that the data they're holding belongs to more than one thread--it's all just data to them, and they, like the execution units, remain unchanged from previous iterations of the Xeon core.
For a simultaneously multithreaded processor, the cache coherency problems associated with SMP all but disappear. Both logical processors on an SMT system share the same caches as well as the data in those caches. So if a thread from logical processor 0 wants to read some data that's cached by logical processor 1, it can grab that data directly from the cache without having to snoop another cache located some distance away in order to ensure that it has the most current copy.
You might think since the Xeon's two logical processors share a single cache, this means that the cache size is effectively halved for each logical processor. If you thought this, though, you'd be wrong: it's both much better and much worse. Let me explain.
Each of the Xeon's caches--the trace cache, L1, L2, and L3--is SMT-unaware, and each treats all loads and stores the same regardless of which logical processor issued the request. So none of the caches know the difference between one logical processor and another, or between code from one thread or another.
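To make the "much better and much worse" point concrete, here is a minimal sketch of a shared cache that has no idea which logical processor issued an access. It is a toy direct-mapped model, not the Xeon's real cache organization: the class name, set count, line size, and addresses are all assumptions chosen for illustration.

```python
class SharedCache:
    """Toy direct-mapped cache that does not track which logical
    processor issued an access (sizes are illustrative assumptions)."""

    def __init__(self, num_sets=8, line_size=64):
        self.num_sets = num_sets
        self.line_size = line_size
        self.tags = [None] * num_sets  # one cached line per set

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = address // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


cache = SharedCache()

# "Much better": thread 0 loads a line, and thread 1 then hits on the
# same line without any snooping, because the cache is shared.
first = cache.access(0x0000)           # thread 0: miss, line is filled
second = cache.access(0x0000)          # thread 1: hit on thread 0's data

# "Much worse": thread 1 touches a different address that maps to the
# same set, which evicts the line thread 0 was relying on.
cache.access(0x0000 + 8 * 64)          # thread 1: same set, different tag
third = cache.access(0x0000)           # thread 0: miss again

print(first, second, third)
```

The same eviction behaviour that hurts performance is also what lets one thread notice what the other has been touching.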
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Other People's Cache - HyperAttacks with HyperThreading - Dag Arne Osvik, Norway
Regular desktop apps tend to have lots of threads, but the issue is not how many threads exist but rather how many of them attempt to use CPU at the same time.
For instance, your web browser might spawn a thread to do a DNS lookup, since you wouldn't want the GUI to block during DNS. That thread hardly uses any CPU though. When your Web browser does real work, like rendering, it will usually be confined to a single thread.
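A rough sketch of that pattern, assuming a made-up slow_lookup function that just sleeps in place of a real DNS query (the hostname and returned address are placeholders, not real lookups):

```python
import threading
import time


def slow_lookup(hostname, results):
    """Hypothetical stand-in for a blocking DNS lookup: it spends its
    time waiting and uses almost no CPU."""
    time.sleep(0.2)                      # simulated network wait
    results[hostname] = "192.0.2.1"      # placeholder documentation address


results = {}
worker = threading.Thread(target=slow_lookup, args=("example.com", results))
worker.start()

# The "GUI" thread keeps running while the lookup is pending.
ticks = 0
while worker.is_alive():
    ticks += 1           # pretend to handle a UI event
    time.sleep(0.01)

worker.join()
print(results["example.com"], "after", ticks, "UI ticks")
```

Two threads exist here, but only one of them ever wants the CPU at a given moment, so a second logical processor would have little to do.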
I have seen the future, and it is inconvenient.
Hyper Transport has nothing to do with Hyper Threading. Hyper Threading means processor support for several (usually two) execution threads at once. Hyper Transport is a bus technology to interconnect processors, RAM, motherboard chips, PCI bus and the like.
AMD's Hyper Transport is similar to Intel's Hyper Threading, but in my books, superior.
That's like saying that the computers from Apple Computers are similar but superior to the computers from Apple Records. Notice how Apple Records makes no computers? Just because they start with the same word does not mean two things are the same.
I just watched his talk, and you are on the right track. Your workaround is one he suggested too. It's actually a timing attack: a spy thread watches its own cache misses to try to recover the RSA private key. The interesting thing is that this isn't Hyper-Threading only - it's possible on normal procs too if they don't flush the cache between context switches. It's just that with HT, context switches can be far more common.
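For anyone who wants to see the general idea, here is a toy simulation of a spy inferring secret bits from cache evictions. It is only a sketch under heavy assumptions and not the method from the paper: the cache is simulated, hits and misses are read directly instead of being timed, and the set numbers and the victim's square-and-multiply loop are made up for illustration.

```python
NUM_SETS = 8
SQUARE_SET = 2      # hypothetical set touched by every squaring step
MULTIPLY_SET = 5    # hypothetical set touched only by the multiply step


class ToyCache:
    """Direct-mapped cache shared by both threads; one tag per set."""

    def __init__(self):
        self.tags = [None] * NUM_SETS

    def access(self, tag, index):
        """Return True on a hit, False on a miss (the line is then filled)."""
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


def victim_step(cache, bit):
    """One square-and-multiply iteration for a single secret exponent bit."""
    cache.access("victim", SQUARE_SET)          # always square
    if bit:
        cache.access("victim", MULTIPLY_SET)    # multiply only when bit is 1


def spy_recover(cache, secret_bits):
    """Guess each bit from which of the spy's own lines were evicted."""
    recovered = []
    for bit in secret_bits:
        # Prime: fill every set with the spy's own lines.
        for index in range(NUM_SETS):
            cache.access("spy", index)
        # The victim runs one iteration on the other logical processor.
        victim_step(cache, bit)
        # Probe: a miss means the victim evicted that line. On real
        # hardware the miss would only be visible as a slower load.
        misses = [i for i in range(NUM_SETS) if not cache.access("spy", i)]
        recovered.append(1 if MULTIPLY_SET in misses else 0)
    return recovered


secret = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = spy_recover(ToyCache(), secret)
print(recovered == secret)
```

The spy never reads the victim's memory; it only learns which of its own lines went missing, which is why a shared cache is enough to leak the access pattern.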
Random is the New Order.
My paper is available here.
Have fun reading, I'm going back to the conference.
Tarsnap: Online backups for the truly paranoid
Why wasn't Intel notified over the past SEVEN MONTHS?
They were. I've clarified the page somewhat now, but "Other security teams" includes Intel.
Why notify FreeBSD and then wait 2 or 3 months before notifying other possibly affected vendors (at least other BSDs)?
Two reasons. First, because I'm part of the FreeBSD Security team -- I'm required to notify them about potential issues.
Second, because if I contacted lots of security teams with what I had on December 31st, they wouldn't have listened: "Umm, hey guys, there's a problem with hyperthreading. I've convinced myself that it is real, but I don't really have any evidence to give you, so you'll just have to believe me..."
He alerted SCO to a flaw in their OS?
Actually, I posted to vendor-sec. I was rather surprised when I got an email back from SCO -- I didn't think that they'd be on vendor-sec.
Well, I just read the paper, and I applaud Colin on several levels. First off, the theory of the attack is rock-solid and well-written. Secondly, he describes very implementable OS work-arounds, crypto library fixes, and finally chip design corrections which will totally eliminate the security hole.
This is one of the best thought out, best written papers of its kind that I have read in my over thirty years of work in the engineering field.
About the word "if": If bullfrogs had wings, they wouldn't bounce around on their little green butts.