Hyperthreading Considered Harmful

Posted by CowboyNeal on Friday May 13, 2005 @12:07AM from the not-just-for-performance dept.

cperciva writes "Hyper-Threading, as currently implemented on Intel Pentium Extreme Edition, Pentium 4, Mobile Pentium 4, and Xeon processors, suffers from a serious security flaw. This flaw permits local information disclosure, including allowing an unprivileged user to steal an RSA private key being used on the same machine. Administrators of multi-user systems are strongly advised to take action to disable Hyper-Threading immediately. I will be presenting this attack at BSDCan 2005 at 10:00 AM EDT on May 13th, and at the conclusion of my talk I will also be releasing a paper describing the attack and possible mitigation strategies."

30 of 392 comments

more info at KernelTrap by Anonymous Coward · 2005-05-13 00:15 · Score: 5, Informative

I read about this last night here at KernelTrap. They offer more info, evidently having talked to Colin...
Access to CPU/memory Calls by tronicum · 2005-05-13 00:31 · Score: 1, Informative

The point is that most server systems don't allow you to execute system calls which you could exploit.

You need at least root/administrator privileges to get stuff from the OS memory.

So before you can exploit the system you must have access to the system itself.

If the claim is true and you can read the memory of other processes, it is a "local" kind of "root exploit".
Re:This ought to be interesting by Chris+Snook · 2005-05-13 00:34 · Score: 4, Informative

Unlike SMP, with HT you're interleaving two threads on the same physical execution unit. That means that there is data from another thread in registers at the same time that you're executing, without having enough instructions execute during a context switch to flush the pipeline. It also means that the other process's page table is in the MMU while you're executing. Even if their proof-of-concept attack doesn't work on some other operating systems, everyone needs to look over their code to make sure this isn't just an accidental effect that could change with increasing pipeline depths, different context switch logic, etc.

--
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
Re:Whoosh!!! by mmkkbb · 2005-05-13 00:38 · Score: 5, Informative

Actually, Intel CPUs contain patchable microcode ROMs. You can see the option to enable it when you configure a Linux kernel.

--
-mkb
On the other hand by MountainMan101 · 2005-05-13 00:59 · Score: 4, Informative

I'd be willing to bet he's right. He is currently awaiting a doctorate from the University of Oxford, which is commonly held as the finest academic institution in the world.

(I'm not biased by having spent the past 7 years there)
1. Re:On the other hand by babbage · 2005-05-13 02:18 · Score: 5, Informative
  
  And this isn't the first time he has come up with some interesting research that has been mentioned on Slashdot before. Sure, he seems to be a little arrogant, but with his record so far, I think he's earned the benefit of the doubt here...
  
  --
  DO NOT LEAVE IT IS NOT REAL
2. Re:On the other hand by Drakonian · 2005-05-13 03:18 · Score: 2, Informative
  
  This guy is a smart cookie. I just saw his talk. He doesn't come across as arrogant at all. I think his exploit is plausible. It's a timing attack but could allow you to discover a 1024-bit private key in under 5 mins or so if you know what you are doing.
  
  --
  Random is the New Order.
Re:How to exploit by mikael · 2005-05-13 01:01 · Score: 4, Informative

The Ars Technica page on hyperthreading with the Xeon might provide some clues. It lists which parts of the CPU are replicated, partitioned and shared.

One final bit of information that should be included in a discussion of partitioned resources is the fact that when the Xeon is executing only one thread, all of its partitioned resources can be combined so that the single thread can use them for maximum performance. When the Xeon is operating in single-threaded mode, the dynamically partitioned queues stop enforcing any limits on the number of entries that can belong to one thread, and the statically partitioned queues stop enforcing their boundaries as well.

...

The same can be said for the register file, another crucial shared resource. The Xeon's 128 microarchitectural general purpose registers (GPRs) and 128 microarchitectural floating-point registers (FPRs) have no idea that the data they're holding belongs to more than one thread--it's all just data to them, and they, like the execution units, remain unchanged from previous iterations of the Xeon core.

For a simultaneously multithreaded processor, the cache coherency problems associated with SMP all but disappear. Both logical processors on an SMT system share the same caches as well as the data in those caches. So if a thread from logical processor 0 wants to read some data that's cached by logical processor 1, it can grab that data directly from the cache without having to snoop another cache located some distance away in order to ensure that it has the most current copy.

...

You might think since the Xeon's two logical processors share a single cache, this means that the cache size is effectively halved for each logical processor. If you thought this, though, you'd be wrong: it's both much better and much worse. Let me explain.

Each of the Xeon's caches--the trace cache, L1, L2, and L3--is SMT-unaware, and each treats all loads and stores the same regardless of which logical processor issued the request. So none of the caches know the difference between one logical processor and another, or between code from one thread or another.

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Simplest Solution by Loualbano2 · 2005-05-13 01:05 · Score: 2, Informative

Most machines let you disable it in the BIOS, which would have to be the simplest way of turning it off possible.
Re:This ought to be interesting by Anonymous Coward · 2005-05-13 01:21 · Score: 1, Informative

According to XP Task Manager:

Firefox: 9
Visual Studio: 10
Outlook: 8
Gaim: 2
Explorer: 8
Re:Probably a Timing-Based Attack by AtrN · 2005-05-13 01:32 · Score: 5, Informative

This got mentioned in comp.arch and Dan Bernstein pointed out others have mentioned similar things previously. The abstract mentioned reads,
Other People's Cache - HyperAttacks with HyperThreading - Dag Arne Osvik, Norway

We have investigated the use of memory caches of modern processors as side-channels for timing attacks against software implementations of cryptographic algorithms. In particular, we have successfully performed a new kind of attack where the attacker has no privileges other than being able to run on the same processor as the victim. That is, the attacker has no access to plaintext or ciphertext, and is not allowed by the operating system to communicate with the victim. In this scenario we have recovered 45 out of 128 key bits from AES encryption of English text in just one minute on an Intel processor with HyperThreading. Moreover, with regular known plaintext attacks we have achieved full key recovery.
Absolutely... by Kjella · 2005-05-13 01:45 · Score: 2, Informative

...RSA is vulnerable to timing attacks (which is why we have blinding in software). It's a wonder no one has thought about this earlier though; I remember reading about the military considering virtual machines (i.e. one physical machine could be on both classified/unclassified systems). One of the reasons they didn't was the ability to tap/signal through spinlocks and other timing data. I always thought this was a "well-known but too unlikely to be interesting" weakness, but I guess not. Maybe I should have published a paper myself.
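For readers who haven't seen blinding, here is a minimal sketch of the idea with a toy key. The parameters and the function name `decrypt_blinded` are made up for illustration, and the key is far too small for real use. Blinding randomizes the value being exponentiated so that timing does not track the attacker's chosen input; whether that alone helps against a cache-based attack like the one in the story is a separate question.

```python
import secrets
from math import gcd

# Toy RSA parameters (illustration only, not secure).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

def decrypt_blinded(c: int) -> int:
    """Decrypt c while exponentiating a randomized value instead of c itself."""
    while True:
        r = secrets.randbelow(N - 2) + 2      # random r in [2, N-1]
        if gcd(r, N) == 1:
            break
    blinded = (c * pow(r, E, N)) % N          # c * r^e mod N
    m_blinded = pow(blinded, D, N)            # (c * r^e)^d = m * r mod N
    return (m_blinded * pow(r, -1, N)) % N    # divide the blinding factor out

message = 65
ciphertext = pow(message, E, N)
print(decrypt_blinded(ciphertext))  # 65
```

The private-key operation only ever sees `blinded`, which changes on every call even for the same ciphertext.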

--
Live today, because you never know what tomorrow brings
Re:This ought to be interesting by timster · 2005-05-13 01:49 · Score: 4, Informative

Regular desktop apps tend to have lots of threads, but the issue is not how many threads exist but rather how many of them attempt to use CPU at the same time.

For instance, your web browser might spawn a thread to do a DNS lookup, since you wouldn't want the GUI to block during DNS. That thread hardly uses any CPU though. When your Web browser does real work, like rendering, it will usually be confined to a single thread.

--
I have seen the future, and it is inconvenient.
Hyperthreading performance with numerical models by Orp · 2005-05-13 02:01 · Score: 2, Informative

This is only tangentially related to the security issue, but I found that disabling hyperthreading on a cluster of dual Xeons running Linux greatly improved performance with a distributed memory (MPI) numerical model. Short summary: even if you only run your model on physical CPUs, hyperthreading will apparently bounce jobs around in a somewhat random way. Not sure if it's a hardware issue or a software (Linux) issue.

Here is a link which goes into detail
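One way to check whether the slowdown comes from jobs landing on sibling logical CPUs is to pin each process to one logical CPU per physical core. The sketch below is not what was done on the cluster described above. It assumes a current Linux kernel that exposes `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`, and the helper names and example data are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,2" or "0-1" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)

def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU from each group of siblings."""
    return sorted({parse_cpu_list(text)[0] for text in sibling_lists})

# Hypothetical thread_siblings_list contents for four logical CPUs
# on a dual-processor machine with Hyper-Threading enabled.
example = ["0,2", "1,3", "0,2", "1,3"]
print(one_cpu_per_core(example))  # [0, 1]
```

On Linux the resulting list could be passed to `os.sched_setaffinity(0, cpus)` to restrict the current process to those CPUs.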

--
A squid eating dough in a polyethylene bag is fast and bulbous, got me?
Re:Whoosh!!! by Anonymous Coward · 2005-05-13 02:04 · Score: 2, Informative

But since Intel has not yet responded to this, it's unknown whether the problem can be fixed that way. We'll just have to see after the official announcement.
Re:Where's the details? by archen · 2005-05-13 02:10 · Score: 2, Informative

I'm not sure what is really involved in this, but a FreeBSD security bulletin was released today addressing this topic (including a kernel patch and workaround), so I highly doubt this is simply a stunt.
Re:So does anyone know... by anno1602 · 2005-05-13 02:47 · Score: 4, Informative

Hyper Transport has nothing to do with Hyper Threading. Hyper Threading means processor support for several (usually two) execution threads at once. Hyper Transport is a bus technology to interconnect processors, RAM, motherboard chips, PCI bus and the like.

AMD's Hyper Transport is similar to Intel's Hyper Threading, but in my books, superior.

That's like saying that the computers from Apple Computers are similar but superior to the computers from Apple Records. Notice how Apple Records makes no computers? Just because they start with the same word does not mean two things are the same.
Re:This ought to be interesting by Drakonian · 2005-05-13 03:21 · Score: 4, Informative

I just watched his talk, and you are on the right track. Your workaround is one he suggested too. It's actually a timing-based attack: a spy thread watches cache misses to try to recover the RSA private key. The interesting thing is that this isn't Hyper-Threading only - it's also possible on normal procs that don't flush the cache between context switches. It's just that with HT, context switches can be far more common.

--
Random is the New Order.
Re:This ought to be interesting by fitten · 2005-05-13 03:31 · Score: 2, Informative

Yes, but each context MUST have its own register set or it makes no sense at all. Perhaps the attack comes through the rename registers or somesuch. Each SMT (or HyperThread) context has its own set of registers and they don't share.
Paper by cperciva · 2005-05-13 03:32 · Score: 5, Informative

My paper is available here.

Have fun reading, I'm going back to the conference.

--
Tarsnap: Online backups for the truly paranoid
Re:Where's the details? by Drakonian · 2005-05-13 03:34 · Score: 3, Informative

Sure. It's a timing-based attack based on watching cache misses. If you have a Spy thread running on an HT processor that is also running OpenSSL, for example, you can get a picture of the frequency of cache usage and from that reverse engineer the exponents and multipliers used in the RSA exponentiation. Note: You'd definitely need some cryptographic experience for this. From this, you can get about 310 bits of the 512-bit exponent and brute force the rest, which can be done in polynomial time.
The reason HT is vulnerable is that both threads share the cache and context switches can happen at any time. It could happen on normal non-HT procs too, but there the context switches are more likely to flush the cache or not happen as often.
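To see why the order of operations gives away the exponent, here is a sketch using plain left-to-right square-and-multiply. This is a simplification: real libraries typically use windowed exponentiation, where the observed sequence gives only partial information, which fits the partial recovery described above. The function names are made up for illustration.

```python
def modexp_with_trace(base, exponent, modulus):
    """Binary exponentiation that also records each operation performed."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")                 # one squaring for every bit
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")             # extra multiply only for 1 bits
    return result, "".join(trace)

def bits_from_trace(trace):
    """Recover the exponent bits from the S/M sequence alone."""
    bits = []
    i = 0
    while i < len(trace):
        # trace[i] is always "S"; an "M" right after it marks a 1 bit
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return "".join(bits)

value, trace = modexp_with_trace(7, 0b101101, 1009)
print(trace)                    # SMSSMSMSSM
print(bits_from_trace(trace))   # 101101
```

An observer who can tell squarings from multiplications, for instance by their cache footprint, needs nothing else to read the exponent in this simple variant.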

--
Random is the New Order.
Re:Timeline - WTH? by cperciva · 2005-05-13 03:50 · Score: 4, Informative

Why wasn't Intel notified over the past SEVEN MONTHS ?

They were. I've clarified the page somewhat now, but "Other security teams" includes Intel.

--
Tarsnap: Online backups for the truly paranoid
Re:2 or 3 months before notifying other vendors? by cperciva · 2005-05-13 03:54 · Score: 4, Informative

Why notify FreeBSD and then wait 2 or 3 months before notifying other possibly affected vendors (at least other BSDs)?

Two reasons. First, because I'm part of the FreeBSD Security team -- I'm required to notify them about potential issues.

Second, because if I contacted lots of security teams with what I had on December 31st, they wouldn't have listened: "Umm, hey guys, there's a problem with hyperthreading. I've convinced myself that it is real, but I don't really have any evidence to give you, so you'll just have to believe me..."

--
Tarsnap: Online backups for the truly paranoid
Re:SCO.... by cperciva · 2005-05-13 03:56 · Score: 4, Informative

He alerted SCO to a flaw in their OS?

Actually, I posted to vendor-sec. I was rather surprised when I got an email back from SCO -- I didn't think that they'd be on vendor-sec.

--
Tarsnap: Online backups for the truly paranoid
The paper is now available by Anonymous Coward · 2005-05-13 04:29 · Score: 1, Informative

The paper is now available at:
http://www.daemonology.net/papers/htt.pdf
HT needs 1M L2 cache to avoid suckage by jhantin · 2005-05-13 05:26 · Score: 2, Informative

I've tried HT on both the 3.0c [Northwood, 512KB L2] and 2.8e [Prescott, 1MB L2] P4 models, both with identical hardware otherwise [1GB dual channel DDR400, 875P chipset, nvidia fx5200, 120GB 7200RPM ATA133 WD disc]. It's really nice on the 2.8e, but you fall in the cache miss tar pit on the 3.0c. With HT turned on the 2.8e actually feels faster than the 3.0c ever did, especially under heavy load, and is nearly impossible to bring to its knees whatever I throw at it.

Back on topic: This attack doesn't really shock me that much; covert channels are a fact of life in any multi-user machine, and anything that needs bulletproof security should be on isolated hardware. Attacking an RSA implementation by analyzing cache performance is a truly sweet hack though... my propeller-beanie spins in admiration. :)

--
...when you're writing a game...tweak the difficulty of "Easy" to something [your mother] can cope with. -- onion2k
Details now given. by BrakesForElves · 2005-05-13 07:10 · Score: 4, Informative

Well, I just read the paper, and I applaud Colin on several levels. First off, the theory of the attack is rock-solid and well-written. Secondly, he describes very implementable OS work-arounds, crypto library fixes, and finally chip design corrections which will totally eliminate the security hole.

This is one of the best thought out, best written papers of its kind that I have read in my over thirty years of work in the engineering field.

--
About the word "if": If bullfrogs had wings, they wouldn't bounce around on their little green butts.
Re:This ought to be interesting by imgod2u · 2005-05-13 07:59 · Score: 2, Informative

It's about shared cache and timing cache hits and misses. One thread can time its own cache accesses (a miss takes longer than a hit) to see which of its lines another thread evicted, and from that infer how that thread operated. This is as much of a problem on dual-core (with shared cache) as any SMT implementation. As noted in the paper, it's even a problem on normal systems that use paging.
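A toy simulation makes the mechanism concrete. Everything here is an assumption for illustration: a direct-mapped cache with 8 sets, made-up hit and miss costs, and invented names (`SharedCache`, `spy_on`). It models the idea only, not any real processor or the code from the paper.

```python
NUM_SETS = 8
HIT_COST, MISS_COST = 1, 10     # made-up latencies in arbitrary "cycles"

class SharedCache:
    """Direct-mapped cache that does not know which thread is asking."""
    def __init__(self):
        self.lines = [None] * NUM_SETS

    def access(self, address):
        """Touch an address and return the simulated access time."""
        index = address % NUM_SETS
        if self.lines[index] == address:
            return HIT_COST
        self.lines[index] = address       # evict whatever was there
        return MISS_COST

def spy_on(cache, victim_addresses):
    """Return the cache sets the victim touched, judged by timing alone."""
    spy_base = 1000 * NUM_SETS            # spy data: one address per set
    for s in range(NUM_SETS):             # prime: fill every set
        cache.access(spy_base + s)
    for address in victim_addresses:      # the victim runs for a while
        cache.access(address)
    # probe: a slow reload means the victim evicted the spy's line
    return [s for s in range(NUM_SETS)
            if cache.access(spy_base + s) == MISS_COST]

print(spy_on(SharedCache(), [3, 13, 22]))  # [3, 5, 6]
```

The spy never reads the victim's data. It learns only which sets were disturbed, and that is enough when the victim's access pattern depends on a secret.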
Re:Where's the details? by cperciva · 2005-05-13 09:16 · Score: 3, Informative

How about just not allow different UIDs on the core at the same time?

That would be the ideal solution (assuming that you also check for setuid/setgid programs). Unfortunately, it's really hard to do that correctly due to problems of kernel data locking.
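The rule itself is simple to state, as the sketch below shows. The hard part mentioned above, enforcing it inside a real scheduler without breaking kernel locking, is not modeled. `Task`, `has_elevated_ids` and `may_share_core` are hypothetical names, not anything from FreeBSD.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    uid: int
    has_elevated_ids: bool = False   # True for setuid/setgid programs

def may_share_core(a, b):
    """Allow two tasks on sibling logical CPUs only if they have the same
    owner and neither runs with setuid/setgid credentials."""
    if a.has_elevated_ids or b.has_elevated_ids:
        return False
    return a.uid == b.uid

print(may_share_core(Task("make", 1001), Task("cc", 1001)))             # True
print(may_share_core(Task("spy", 1002), Task("openssl", 1001)))         # False
print(may_share_core(Task("shell", 1001), Task("passwd", 1001, True)))  # False
```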

FreeBSD's policy on security fixes is that they must never ever break anything -- so if necessary (as in this case) a simple but suboptimal fix will be used instead of a complicated fix which might have the inadvertent side-effect of causing machines to crash.

--
Tarsnap: Online backups for the truly paranoid
Re:Also announced by Adi Shamir in February by cperciva · 2005-05-14 00:52 · Score: 2, Informative

During the Cryptographer's Panel at the RSA conference, Adi Shamir made a short reference to this vulnerability.

Yes, we seem to have discovered the problem independently. (Until today I wasn't sure if we had discovered the same problem -- Adi Shamir didn't reply to an email I sent him about this -- but I got an email from Eran Tromer after my paper went online.)

...a presentation would be forthcoming at the Eurocrypt 2005 rump session next week in Denmark.

I don't want to pre-release their results, but Shamir, Tromer, and Osvik decided to demonstrate the attack in a somewhat different way. I think it demonstrates how dangerous this attack is that two people independently discovered the attack and came up with different entirely practical targets for it.

--
Tarsnap: Online backups for the truly paranoid