Slashdot Mirror


Hyper-Threading Speeds Linux

developerWorks writes "The Intel Xeon processor introduces a new technology called Hyper-Threading (HT) that makes a single processor behave like two logical processors. The technology allows the processor to execute multiple threads simultaneously, which can yield significant performance improvement. But, exactly how much improvement can you expect to see? This article gives the results of the investigation into the effects of Hyper-Threading (HT) on the Linux SMP kernel. It compares the performance of a Linux SMP kernel that was aware of Hyper-Threading to one that was not." Ah, the joys of high performance.

18 of 239 comments

  1. Fundamental mistake by cbcbcb · · Score: 5, Insightful
    >It compares the performance of a Linux SMP kernel that was aware of Hyper-Threading to one that was not.

    But if you aren't going to use hyper threading you would use a UP (non-SMP) kernel, which would gain you considerable performance. The benefits are not so clear-cut: many of the benchmarks show limited benefit from hyperthreading and would perform faster on a uniprocessor kernel.

  2. good stuff by The+Evil+Couch · · Score: 5, Insightful

    The results on Linux kernel 2.4.19 show Hyper-Threading technology could improve multithreaded applications by 30%. Current work on Linux kernel 2.5.32 may provide performance speed-up as much as 51%.

    while it may not be very useful for a single-user box (it actually looks like it would be a detriment), integrating it into client-server situations would give us some nice boosts in performance. web servers ought to see some real gains with this.

    1. Re:good stuff by windex · · Score: 5, Insightful

      You aren't looking at this logically. It's not that "you need that much CPU for a webserver", it's that "look at how many more customers you can squeeze in per server".

      This lowers cost for providers, and eventually lowers costs for consumers.

      Yee haw.
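
To put rough numbers on the "more customers per server" point, here is a small sketch. It assumes the 30% and 51% figures quoted above carry over directly to request throughput, which the article does not promise, and the baseline of 200 customers per server and the 10,000-customer total are made up for illustration.

```python
# Hypothetical baseline: customers one server handles without Hyper-Threading.
BASELINE_CUSTOMERS = 200


def customers_with_speedup(baseline: int, speedup_percent: int) -> int:
    """Customers per server if throughput grows by the given percentage."""
    return baseline * (100 + speedup_percent) // 100


def servers_needed(total_customers: int, per_server: int) -> int:
    """Ceiling division: servers required to host total_customers."""
    return -(-total_customers // per_server)


for pct in (0, 30, 51):
    per_server = customers_with_speedup(BASELINE_CUSTOMERS, pct)
    print(pct, per_server, servers_needed(10_000, per_server))
```

Under those assumptions the same customer base fits on noticeably fewer boxes, which is where the cost saving would come from.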

  3. Re:But the real question... by stratjakt · · Score: 2, Insightful

    >> Does SMP support automatically allow benefits from Hyperthreading

    Yes

    HT essentially partitions out the CPU's pipeline into two pipelines executing concurrently: That is, two CPUs on the same die.

    --
    I don't need no instructions to know how to rock!!!!
  4. Only Threads ? by makapuf · · Score: 2, Insightful

    I know, there might be many places where it has been discussed before, but could someone please tell me if HT is only for threading or can it be used for processes, too?
    And I know, they are essentially the same syscall under linux, and might be faster, b/c of synchronization issues wrt memory access IIRC ...
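
For what it's worth, HT appears to the OS as extra logical processors, and the scheduler can place any runnable task on them, whether it is a thread or a separate process. A minimal Python sketch of how to see what the OS reports; the helper names are made up, and `os.sched_getaffinity` is only available on some platforms such as Linux, hence the guard:

```python
import os


def logical_cpus() -> int:
    """Logical processors the OS reports; HT siblings are counted."""
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


def schedulable_cpus() -> int:
    """Logical CPUs the current process is allowed to run on."""
    # sched_getaffinity does not exist on every platform, so fall back.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return logical_cpus()


print("logical CPUs:", logical_cpus())
print("usable by this process:", schedulable_cpus())
```

On an HT-aware SMP kernel with HT enabled, the first number would be expected to be twice the number of physical CPUs.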

  5. Application dependent by PaschalNee · · Score: 5, Insightful

    The pretty detailed (for me anyway) article on Ars Technica concludes that performance on a HyperThreaded CPU will be very much dependent on the application mix. While research like this is useful it will probably always be a try-and-see scenario.

  6. Useful for development? by NixterAg · · Score: 5, Insightful

    Like most development shops, we do a great deal of development for multiprocessor machines so we write a lot of multithreaded code. Multithreaded code creates a whole host of new debugging pitfalls that don't show up if the developer is debugging on a single processor workstation. As John Robbins says in his terrific Debugging Applications book, if you are developing a multithreaded application, you better be certain you are doing your debugging in a multiprocessor environment.

    From a development standpoint, will a hyperthreaded chip duplicate the behavior of a multi-processor PC well enough that shops can buy cheaper, one-CPU machines for development and still be confident in their results? I'm guessing nothing will replace the real thing but I'd be interested in any commentary.
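
A minimal sketch of the kind of pitfall being described: an unsynchronized read-modify-write on shared state. The class and function names are made up for illustration, and Python stands in for whatever language the shop actually uses. The unlocked version can lose updates depending on how the threads interleave; the locked version cannot.

```python
import threading


class Counter:
    """Shared counter; the lock is what makes increment() safe."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self) -> None:
        # Read-modify-write with no lock: two threads can read the same
        # old value, and one of the two updates is then lost. Whether it
        # happens depends on timing, so it may never show up in testing.
        current = self.value
        self.value = current + 1

    def increment(self) -> None:
        with self._lock:
            self.value += 1


def run(method_name: str, threads: int = 4, per_thread: int = 10_000) -> int:
    """Call the named Counter method from several threads; return the total."""
    counter = Counter()
    bump = getattr(counter, method_name)

    def work() -> None:
        for _ in range(per_thread):
            bump()

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


print("locked:", run("increment"))            # always threads * per_thread
print("unlocked:", run("unsafe_increment"))   # may come up short
```

Bugs of the second kind are the ones that tend to hide until two threads really do run at the same time.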

  7. In other news... by dirvish · · Score: 3, Insightful

    Hyper-Threading Speeds Windows

  8. Re:Underwhelmed by stratjakt · · Score: 2, Insightful

    It's not so great; if you need SMP you still can't beat two or more physical CPUs.

    In this scheme, the pipeline is split into two and two concurrent threads run in it. Which is pretty neat, but hurts performance in some situations.

    - Cache latency is basically doubled, as two VCPUs now fight over access to the cache

    - Pipeline depth is shortened for either given VCPU, which hurts code that was optimized for the longer pipelines (lots of matrix math, MMX stuff).

    It's a cool development in CPU design, but it has a ways to go, and the OS needs to be aware of it. You should be able to 'shut it off' in code on the fly, if you want to dedicate 100% real CPU to a given task.

    --
    I don't need no instructions to know how to rock!!!!
  9. Re:But... by stratjakt · · Score: 2, Insightful

    It doesn't 'emulate' SMP, it actually performs two operations at the same time by splitting the instruction pipeline in half (well not in half, it varies as to how much pipeline each 'cpu' gets). It's not as good as SMP for various reasons, mostly boiling down to the two threads sharing the rest of the chip.

    It does 'hurt' sometimes, but it's usually negligible, and you have to pretty much go out of your way to design code that would run slower - such code can 'hurt' traditional SMP systems as well.

    I'm sure there will be plenty of cooked benchmarks for fanboys to rant about in the future, just like there are between 3DNow! and MMX/SSE/SSE2.

    It is a cool development, and *can* be shut off if it's only hindering your system (i.e., you're running Windows 98 or a Linux kernel with no HT support - and thus wasting pipeline on a 'CPU' that isn't used)

    --
    I don't need no instructions to know how to rock!!!!
  10. It's just you by Royster · · Score: 5, Insightful

    What you've conveniently snipped out in your trollish post is all of the application benchmarks showing improvements. If you're not going to run any application code, you might as well shut the machine off and save the marginal stress on the environment.

    Most of us have our computers do work and those applications, running on an OS which has *barely* slowed, will be able to do more work in the same amount of time under the HT-aware OS than under one which does not utilize the second, virtual processor.

    --
    I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
  11. Technical Summary by 0x69 · · Score: 5, Insightful

    If you're running code that's efficient on a P4 (few mis-predicted branches, low cache miss rate, good parallelism, etc.) then HT is pretty much useless.

    If you're running code that's inefficient on a P4 (which pays for its high GHz with long pipelines, large latencies, a slow decode stage, and several other drawbacks), then HT can usually paper over a fair percentage of these problems. But remember that HT requires OS support, may require application support, and "your mileage will vary".

    --
    It's easy to make up & spread cool- and credible-sounding stuff. Finding & checking hard facts is hard work.
  12. Expensive HT or cheap real SMP? by ponos · · Score: 5, Insightful


    In Europe P4 3.0 with HT costs ~745 euro (+tax)
    An Asus A7M for dual Athlon costs ~260 euro (+tax)
    Two Athlon XP 2200+ cost ~340 euro (+tax).
    Alternatively you can get two Athlon MP 2000+ for
    roughly the same money (if you don't trust the
    XPs).

    Now, please explain to me why would someone
    with real SMP needs in mind (and NOT games)
    consider the P4 with HT.

    P.

    P.S. I understand that the prices in the US are
    different, but still, it is VERY expensive.
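
Adding up the figures quoted in this post, as a quick sketch. The prices are the approximate pre-tax euro amounts given above, and the variable names are made up.

```python
# Approximate prices in euro before tax, as quoted in the post.
P4_HT = 745            # P4 3.0 with HT
DUAL_BOARD = 260       # Asus A7M dual-Athlon board
TWO_ATHLON_XP = 340    # the pair of Athlon XP 2200+ chips

dual_athlon_total = DUAL_BOARD + TWO_ATHLON_XP
difference = P4_HT - dual_athlon_total

print(dual_athlon_total)  # 600
print(difference)         # 145
```

If the 745 euro figure is for the CPU alone, the gap would widen once a P4 motherboard is added.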

    1. Re:Expensive HT or cheap real SMP? by Anonymous Coward · · Score: 1, Insightful

      Excellent Point.

      HT works best where you can do better with dual CPUs.

      Why not use dual CPUs?

  13. Summary by swillden · · Score: 3, Insightful

    So, in a nutshell, what MS says is: Windows 2000 counts processors in a broken way and requires you to buy licenses for every logical processor, even though you won't get nearly as much processing power as you would if you really had that many physical processors. But rather than fix this bug, we're going to solve the problem by making you buy .NET, which counts processors correctly. So either way, if you're going to use hyperthreading, expect to send us more money.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  14. Re:Also the Pentium 4 - 3 Ghz is hyperthreaded. by Henry+V+.009 · · Score: 3, Insightful

    Yes, they do something almost exactly like it. Simply buy two processors and a multi-processor motherboard. That defeats the purpose of this technology, of course, but it nearly accomplishes the same thing.

    Other than that, well, I'm--still--waiting for Hammer. AMD is dropping a long ways behind Intel. Price is all they've got, and AMD isn't even competing on price-performance real well at the moment. My guess is that Intel hyperthreaded systems will probably be better price-performance wise than AMD before long--if they aren't already.

  15. Re:Underwhelmed by dcmeserve · · Score: 2, Insightful

    > Cache latency is basically doubled, as two VCPUs now fight over access to the cache

    I'm pretty sure this is wrong -- cache latency isn't doubled; the SIZE is HALVED. The two threads access two different virtual caches. Trying to get them to contend for a single cache would be an architectural nightmare.

    Though I believe it's still one physical cache -- which means that the latency is going to be higher than what you'd expect for a cache of its apparent size.

    > Pipeline depth is shortened for either given VCPU, which hurts code that was optimized for the longer pipelines (lots of matrix math, MMX stuff).

    I don't actually know about the pipeline, but I suspect this is wrong too: shortening the pipeline (reducing the number of stages) is a fundamental change in the architecture; a pipeline isn't something you can cut in half and give the front end to one process and the back end to another. Each stage is quite unique.

    Now if you mean that the latest Pentiums have a shorter pipeline than previous incarnations, then maybe that's right (though I'd doubt it -- they're always *lengthening* the pipeline to get those higher GHz numbers). But that would have nothing to do with Hyperthreading.

    --
    "Orthodoxy is unconsciousness" - Orwell
  16. Re:HT is not single-chip SMP by SpinyNorman · · Score: 3, Insightful

    I don't believe that's correct.

    As I understand it HT can indeed speed up pure integer code (or more generally code that's competing for a single CPU resource). HT will allow another thread to execute if the current one is waiting on anything from pipeline results to memory access. I believe that modern CPU/memory speed disparity was one of the driving forces behind it - if one thread gets a cache miss then another may be able to continue executing rather than having to sit idle waiting for main memory.
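
A toy model of that argument, not a description of how the P4 actually schedules work: assume each thread is stalled (say, on a cache miss) some fraction of the time, independently of the other, and that any thread that is ready can use the whole core. All names and numbers below are illustrative assumptions.

```python
def utilization(stall_fraction: float, threads: int) -> float:
    """Toy model: the core is busy whenever at least one thread is ready.

    Assumes each thread is stalled independently with probability
    stall_fraction (which must be below 1.0).
    """
    return 1.0 - stall_fraction ** threads


def ht_speedup(stall_fraction: float) -> float:
    """Throughput with two hardware threads relative to one."""
    return utilization(stall_fraction, 2) / utilization(stall_fraction, 1)


for s in (0.05, 0.25, 0.50):
    print(f"stalled {s:.0%} of the time -> {ht_speedup(s):.2f}x")
```

In this model the gain works out to 1 + stall_fraction, so code that rarely stalls gets little from the second thread while code that stalls often gets more. Real hardware also shares caches and execution units between the threads, which this sketch ignores.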