Intel Hyperthreading In Reality
A reader writes: "Looks like GamePC has got the first look at Intel's new Xeon processor, which has the new super-fantastico Hyperthreading technology, which tricks your OS into thinking one CPU is two CPUs, two CPUs is four. Looks neat in theory, benchies included."
Hyperthreading is a pretty cool idea, especially for those of us who would like to see SMP move more into the mainstream.
According to this article, though (posted on 2cpu.com), the Windows 2000 scheduler doesn't know how to take advantage of hyperthreading, since it doesn't know how to take advantage of virtual processors. (I suppose Windows XP does?) Go figure. Anyway, this looks like it's probably worth checking into. I'm sure Linux will support it!
---Have you crashed Windows XP with a simple printf recently? Try it!
Make sure you use the Printer Friendly view; that way you don't get 12 pages of slashdotted hell! Look here.
-- Dan
Basically what they're doing is simply taking unused processor resources and allocating them to another thread. You can now have multiple _threads_ of execution simultaneously... truly simultaneously.
Thread X is using registers B and C.
Thread Y is able to use registers A and D.
These threads can be executed together without a context switch... and the processor will hunt out these relationships in hardware. That's what "the big deal" is.
Until now, when a processor "multitasks", it's simply switching from one thread of execution to the next... it allocates separate time to two different threads. Now it can allocate the exact same timeslice to multiple threads as long as there isn't a resource dependency.
If your program can be architected to take advantage of this (or your OS can schedule tasks like this), you'll get a huge benefit (read: if it works on SMP systems, it'll get some benefit on this as well).
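To make the "same timeslice, no resource dependency" idea concrete, here's a toy sketch in Python. It takes the register example above at face value: each instruction is just the set of resources it touches, and two instructions share a cycle only when those sets are disjoint. The names (thread_x, thread_y, cycles_time_sliced, cycles_co_issued) are made up for illustration, and real hardware is more subtle (each logical processor keeps its own architectural registers, and the contention is mostly over execution resources), so treat this as a cartoon, not a model of the Xeon.

```python
# Toy model: an "instruction" is the set of resources it touches.
# All names here are hypothetical, chosen only for this illustration.

def cycles_time_sliced(thread_x, thread_y):
    """Classic multitasking: one instruction per cycle, threads take turns."""
    return len(thread_x) + len(thread_y)

def cycles_co_issued(thread_x, thread_y):
    """Issue one instruction from each thread in the same cycle whenever
    the two instructions touch disjoint resources."""
    i = j = cycles = 0
    while i < len(thread_x) or j < len(thread_y):
        cycles += 1
        if i < len(thread_x) and j < len(thread_y):
            if thread_x[i].isdisjoint(thread_y[j]):
                i += 1
                j += 1      # no conflict: both issue this cycle
            else:
                i += 1      # conflict: only thread X issues, Y waits
        elif i < len(thread_x):
            i += 1          # only X has work left
        else:
            j += 1          # only Y has work left
    return cycles

thread_x = [{"B", "C"}, {"B"}, {"C"}, {"B", "C"}]
thread_y = [{"A", "D"}, {"A"}, {"C"}, {"D"}]

print(cycles_time_sliced(thread_x, thread_y))  # 8
print(cycles_co_issued(thread_x, thread_y))    # 6
```

The two conflicting cycles (both threads want C) are why the answer is 6 rather than 4: the win depends entirely on how often the threads want different things.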
WHY? I mean, come on... If you want two processors, shouldn't you have 2 processors in the system???
Maybe because SMT makes the die 5% bigger, while 2 processors is upwards of 100% bigger? This is where a thing called "cost" comes in.
SMT essentially allows for the CPUs to be used more efficiently. A lot of the time an ALU will sit idle while the FPUs work, and with SMT both can work at the same time on different threads.
Modern CPUs have many different execution units. Depending on the code running, not all of them may have work scheduled. Future work may depend on previous results; obviously you can't do this in parallel. The idea of "HyperThreading" is to run more than one thread of execution at a time with the multiple execution units - so more work gets done per clock cycle.
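If it helps, the idle-execution-unit argument can be sketched in a few lines of Python. This is a deliberately crude model with assumptions of my own: one integer unit ("alu") and one floating-point unit ("fpu"), each thread issuing in order, and every instruction depending on the one before it in its own thread, so a thread never issues more than one instruction per cycle. The function name run and the example instruction mixes are invented for the illustration.

```python
from collections import deque

UNITS = ("alu", "fpu")  # assumed: one integer unit, one floating-point unit

def run(threads):
    """Each cycle, every unit can accept one instruction. Each thread issues
    in order, at most one instruction per cycle. Threads earlier in the list
    get first pick of the units. Returns (cycles, busy_unit_slots)."""
    queues = [deque(t) for t in threads]
    cycles = busy = 0
    while any(queues):
        cycles += 1
        free = set(UNITS)
        for q in queues:
            if q and q[0] in free:
                free.remove(q.popleft())  # this unit is taken for the cycle
                busy += 1
    return cycles, busy

a = ["alu", "alu", "fpu", "alu"]
b = ["fpu", "alu", "alu", "fpu"]

print(run([a]))     # (4, 4)
print(run([b]))     # (4, 4)
print(run([a, b]))  # (6, 8)
```

Run back to back, the two threads take 8 cycles and keep the two units busy half the time (8 of 16 slots). Run together, they finish in 6 cycles with 8 of 12 slots busy. It's not a doubling, because they still collide on the ALU.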
A quick Google search turned up an article here. At one point I read a really excellent article on single-processor multithreading (discussing a future Alpha processor) but I can't find it anymore. Hopefully AMD will do something like this as well for a future Hammer processor.
A number of people have posted asking what the point was of making a single processor act like two processors. It's actually explained in the article linked to above.
Apparently, the big deal is that a single processor can only handle one thread at a time--multitasking works by breaking programs down into threads, and working on one thread for a little while, then another, then another, then back to the first. But at any given time, only one thread is being actively executed. Hyperthreading changes this--a single processor can work on two threads truly simultaneously. This makes multitasking a hell of a lot more efficient.
The original Howling Frog is a fictional character and has no UID.
And it may hurt. A downside of "hyperthreading" is that the threads contend for cache space, so if the threads are executing very different code, the cache miss rate will rise. Of course, this happens in ordinary threading on each context switch, but with "hyperthreading", there's a context switch of sorts on every instruction cycle. If this effect shows up, it will show in L1 cache miss rates.
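Here's a rough way to see the cache effect, as a Python sketch under loud assumptions: a tiny direct-mapped cache with 8 lines, shared by two threads whose working sets map onto exactly the same lines. That is a contrived worst case (real L1 caches are set-associative and much bigger), and the function name miss_rate and the traces are made up for the example. The quantum parameter is how many accesses a thread makes before the other one gets a turn, so quantum=1 stands in for the "context switch on every cycle" case.

```python
def miss_rate(trace_a, trace_b, quantum, num_lines=8):
    """Feed two address traces through one shared direct-mapped cache,
    switching threads every `quantum` accesses. Returns the miss rate."""
    cache = [None] * num_lines       # cache[i] = address currently in line i
    traces = [trace_a, trace_b]
    pos = [0, 0]                     # next access index for each thread
    misses = total = 0
    current = 0
    while pos[0] < len(trace_a) or pos[1] < len(trace_b):
        trace = traces[current]
        end = min(pos[current] + quantum, len(trace))
        for addr in trace[pos[current]:end]:
            line = addr % num_lines
            if cache[line] != addr:  # miss: fetch and evict whatever was there
                misses += 1
                cache[line] = addr
            total += 1
        pos[current] = end
        current = 1 - current        # switch to the other thread
    return misses / total if total else 0.0

# Each thread loops four times over its own 8 addresses; the two
# working sets collide on every cache line.
trace_a = list(range(0, 8)) * 4
trace_b = list(range(8, 16)) * 4

print(miss_rate(trace_a, trace_b, quantum=32))  # 0.25
print(miss_rate(trace_a, trace_b, quantum=16))  # 0.5
print(miss_rate(trace_a, trace_b, quantum=1))   # 1.0
```

With long timeslices each thread only pays for warming the cache after a switch; interleaved access by access, the two threads evict each other constantly. How close a real workload gets to that depends on how much the working sets actually overlap.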
This isn't a totally new idea, either. The first step in this direction was the peripheral processor for the CDC 6600, in the 1960s, which appeared as ten peripheral processors to the programmer. Internally, it was ten sets of registers and one ALU, doing one instruction for each machine state in turn. Basic/4, a forgotten minicomputer manufacturer, tried a similar idea in the 1970s.
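The "ten sets of registers and one ALU" arrangement is easy to sketch. The following Python toy is only an illustration of the round-robin idea: the two-instruction "add"/"mul" language, the Barrel class, and the single accumulator per virtual processor are all invented here and have nothing to do with the actual CDC 6600 instruction set.

```python
class Barrel:
    """Toy barrel: several register sets share one ALU, served round-robin."""

    def __init__(self, programs):
        self.programs = programs            # one instruction list per virtual processor
        self.acc = [0] * len(programs)      # one accumulator per virtual processor
        self.pc = [0] * len(programs)       # one program counter per virtual processor

    def step(self, slot):
        """Run one instruction for the virtual processor in `slot`, if any remain."""
        prog = self.programs[slot]
        if self.pc[slot] < len(prog):
            op, value = prog[self.pc[slot]]
            if op == "add":
                self.acc[slot] += value
            elif op == "mul":
                self.acc[slot] *= value
            self.pc[slot] += 1

    def run(self):
        """Rotate through the slots until every program has finished."""
        while any(pc < len(p) for pc, p in zip(self.pc, self.programs)):
            for slot in range(len(self.programs)):
                self.step(slot)             # the single shared ALU serves this slot
        return self.acc

# Ten virtual processors, each computing (0 + n) * 2 on its own registers.
programs = [[("add", n), ("mul", 2)] for n in range(10)]
print(Barrel(programs).run())  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

The programmer sees ten independent machines; the hardware only ever does one operation at a time. The difference with SMT is that a superscalar core can issue from more than one thread in the same cycle.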
On the other hand, this apparently isn't that tough a feature to add to an already-superscalar CPU, so why not?
Some other complaints about this "invented at Intel" terminology can be found at The Register.
Also Toronto has a nice slide show (pdf) on the topic.
For the record I contributed a little tiny bit to this stuff when I was at Intel (I found what I think was the first multi-processor bug for SMT.)
As Nietzsche famously said, "If you stare too long into the Abyss, 1d4 Tanar'ri of random type will attack you."
Simultaneous Multithreading (SMT) is not a new idea, although no one to my knowledge has implemented it yet. Intel just calls it "Hyperthreading"...it is essentially SMT.
And yes, this is a very good idea. A modern superscalar out-of-order processor, like the Athlon and Pentium Pro (and later), can issue and retire multiple instructions per clock cycle. However, it can *only* do this if there is enough instruction-level parallelism (ILP). Turns out, there is not enough ILP in current programs to take full advantage of the chip's processing capabilities. Issue slots and function units go unused due to dependencies in the program and cache misses that stall the processing. A typical processor can only look at about 32 instructions at a time. This is not a large enough window to execute future instructions out-of-order when such a stall occurs.
However, 2 threads of execution will likely fill all of the issue slots. They are also independent threads of execution, so dependencies don't exist between them. This means that when the pipeline stalls due to a cache miss, the other thread can keep on retiring instructions.
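A small Python sketch of the issue-slot argument, with every number assumed for the sake of the example: a 4-wide machine (width), at most 2 independent instructions available per thread per cycle (ilp), and a 3-cycle stall after a cache miss (miss_penalty). The simulate function and the "op"/"miss" token format are hypothetical; this is nowhere near a real pipeline model.

```python
def simulate(threads, width=4, ilp=2, miss_penalty=3):
    """threads: lists of 'op' / 'miss' tokens.
    Per cycle, up to `width` instructions issue in total and at most `ilp`
    from any one thread. A 'miss' issues, then stalls its own thread for
    `miss_penalty` cycles. Returns (cycles, instructions_issued)."""
    pos = [0] * len(threads)
    stalled_until = [0] * len(threads)
    cycle = issued = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        slots = width
        for k, t in enumerate(threads):
            if cycle < stalled_until[k]:
                continue                      # this thread is waiting on memory
            n = 0
            while n < ilp and slots > 0 and pos[k] < len(t):
                tok = t[pos[k]]
                pos[k] += 1
                n += 1
                slots -= 1
                issued += 1
                if tok == "miss":
                    stalled_until[k] = cycle + 1 + miss_penalty
                    break                     # nothing more from this thread for now
        cycle += 1
    return cycle, issued

t1 = ["op", "op", "miss", "op", "op", "op"]
t2 = ["op", "op", "op", "op", "miss", "op"]

print(simulate([t1]))      # (7, 6)
print(simulate([t2]))      # (7, 6)
print(simulate([t1, t2]))  # (7, 12)
```

Alone, each thread needs 7 cycles and fills 6 of 28 issue slots. Together, they finish in the same 7 cycles with 12 slots filled, because one thread issues while the other is limited by its own dependencies or waiting on its miss. When both miss at about the same time, the slots still go empty.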
To all those saying that this is dumb, I suggest you study some modern architecture (I'm not talking about your undergrad architecture course either). A paper I read recently studied the effects of SMT on a simulated Alpha processor. The results were astounding with very few changes to the processor core. I heard that the next Alpha was slated to include SMT before Intel killed it.