Hyperthreading Hurts Server Performance?

← Back to Stories (view on slashdot.org)

Hyperthreading Hurts Server Performance?

Posted by CowboyNeal on Saturday November 19, 2005 @02:16AM from the opposite-effects dept.

sebFlyte writes "ZDNet is reporting that enabling Intel's new Hyperthreading Technology on your servers could lead to markedly decreased performance, according to some developers who have been looking into problems that have been occurring since HT has been shipping automatically activated. One MS developer from the SQL server team put it simply: 'Our customers observed very interesting behaviour on high-end HT-enabled hardware. They noticed that in some cases when high load is applied SQL Server CPU usage increases significantly but SQL Server performance degrades.' Another developer, this time from Citrix, was just as blunt. 'It's ironic. Intel had sold hyperthreading as something that gave performance gains to heavily threaded software. SQL Server is very thread-intensive, but it suffers. In fact, I've never seen performance improvement on server software with hyperthreading enabled. We recommend customers disable it.'"

8 of 255 comments (clear)

Min score:

Reason:

Sort:

Re:The code wasn't changed by springbox · 2005-11-19 02:27 · Score: 4, Insightful

That's lame. It seems like an exteremely BAD idea to get programs to worry about the total cache usage on the CPU. If this is the case, then no wonder performance is suffering. There should be no reason for any programmer to write a threaded application so it's "hyperthreading optimized," especially since HT was seemingly created as a transparent mechanism to increase performance.
Re:The code wasn't changed by drerwk · 2005-11-19 02:35 · Score: 4, Insightful

It seems like an exteremely BAD idea to get programs to worry about the total cache usage on the CPU.
If you want to maximize performance then you want the compiler to know as much as possible about the architecture. If you have no cache then loop unrolling is a good thing, if you have a small cache then loop unrolling can bust the cache. If you are doing large matrix manipulations, how you choose to stride the matrix, and possibly pad it is exactly dependent on the size of the cache. Now, it may be that having the applications programmer worry about it is too much to ask, but the compiler most certainly needs to worry about such detail.
Re:Poor mans dual-core by dsci · 2005-11-19 02:47 · Score: 5, Insightful

What is the performance gap between dual CPU vs Dual-core?

It's the usual answer: it depends.

We have to get rid of the notion that there is one overall system architecture that is "right" for all computing needs.

For general, every-day desktop use, there should be little difference between a dual CPU SMP box and a dual core box.

I have a small cluster consisting of AMD 64 X2 nodes, and the nodes use the FC4 SMP kernel just fine. All scheduling between CPU's is handled by the OS, and MPI/PVM apps run just as expected when using the configurations suggested for SMP nodes.

In fact, with the dual channel memory model, dual core AMD systems might be a little better than generic dual CPU, since each processor has it's "own" memory.

--
Computational Chemistry products and services.
Time to Buy AMD? by olddotter · 2005-11-19 02:55 · Score: 4, Insightful

Sounds like it might be time to buy more AMD stock.
I second the person that said programmers shouldn't be writing code to the cache size on a processor. How well your code fits in cache is not something you can control at run time. Different releases of the CPU often have different cache sizes. And frankly developers should always try to achieve tight efficent code, not develope to a particular cache size.

--
Think Deeply. ...
Re:The code wasn't changed by ochnap2 · 2005-11-19 03:00 · Score: 5, Insightful

That's nonsense. Compilers routinely do loads of optimisations to better suit the underlying hardware. That's why any linux distro that ships binary packages has many flavors of each important or performance sensitive package (specially the kernel, in Debian you'll find images optimised for 386, 586, 686, k6, k7, etc). Is one of the reasons of the existence of Gentoo, also.

So MS had to make a choise: ship a binary optimized for every possible mix of hw (being the processor the most important factor, but not the only one), which is impossible, or ship images compatible with any recent x86 processor/hw... without being specially optimised for any. That's why hyperthreading performance suffers.

This is an important problem on Windows because most of the time you cannot simply recompile the un-optimised software to suit your hardware, as you can in Linux, etc.

(sorry for my bad english)
Hyperthreading works best with "bad" code by cimetmc · 2005-11-19 03:20 · Score: 4, Insightful

Beside the cachae considerations which were discussed by numerous people here, there is one aspect that hasn't been mentioned.
The reason why hyperthreading was introduced in first place was to reduce the "idle" time of the processor. The Pentium 4 class processors have an extremely long pipeline and this often leads to pipeline stalls. E.g. the processing of an instruction cannot proceed because it depends on the result of a previous instruction. The idea of hyperthreading is that whenever there is a potential pipeline stall, the processor switches to the other thread which hopefully can continue its executon because it isn't stalled by some dependency. Now most pipeline stalls occur when the code being executed isn't optimized for Pentium 4 class processors. However the better Pentium 4 optimized your code is, the less pipeline stalls you have and the better your CPU utilisation is with a single thread.

Marcel
Re:The code wasn't changed by springbox · 2005-11-19 03:29 · Score: 4, Insightful

I wasn't thinking of compilers. I was mostly talking about the people who have to write the software. Assuming there's no compiler that knows about HT, I stand by my assertion that it would generally be a bad pratice to get people to worry about it. Especailly these days. Another point that I was trying to make is that even if there were compilers who knew about the HT issues, I still think it's exceedingly stupid that Intel went ahead with HT despite the glaring problems that were mentioned. If people want multiple of threads of execution on the same processor then they should get one with two cores.
Lots of programs are designed with the multiple thread model in mind. Programs should not be designed with the multiple thread model plus cache limitations in mind.
Re:The code wasn't changed by Tim+Browse · 2005-11-19 04:13 · Score: 4, Insightful

It seems like an exteremely BAD idea to get programs to worry about the total cache usage on the CPU.

For an application like SQL Server, I'd have to disagree. Are you saying there's no one on the MSSQL team who looks at cache usage? I'd hope there were a lot of resources devoted to some fairly in-depth analysis of how the code performs on different CPUs. After all, after correctness, performance is how SQL Server is going to be judged (and criticised).
Given that a while back I watched a PDC presentation by Raymond Chen on how to avoid page faults etc in your Windows application (improving start-up times, etc), I'd say that Microsoft are no strangers to performance monitoring and analysis.
For your average Windows desktop app, then yes, worrying about cache usage on HT CPUs is way over the top. For something like SQL Server? Hell, no.