Why Faster CPUs? What About SMP?
Codeine asks: "As we press harder and harder against the physical limitations of speed, why do CPU manufacturers continue with the costly faster single processor model, instead of focussing on multi-processor designs? The new IBM Blue Gene seems to be acknowledging that more/simpler processors is the way to go (very like non-AI, millions of neurons). Why aren't we seeing commoditisation of SMP?"
There are two main reasons SMP isn't more pervasive:
The first of these is pretty self-explanatory. I'll try to expand a little on the second.
Multiprocessor (MP) hardware is a lot more complex than uniprocessor (UP) hardware, with extra latency in the memory subsystem to deal with potential cache issues - even if no sharing is occurring at that particular moment. Code running on multiple processors needs to do locking, and the locking itself can be pretty costly (especially since it uses bus-saturating interlocked memory instructions). This is why running an MP kernel on a single processor is even slower than running a UP kernel. Lastly, not all code parallelizes well; much of it contains major sequential dependencies. In the end, all of the extra work that's done to make MP behave correctly may end up costing more than it's worth even for small numbers of processors.
As the number of processors increases, all of these effects increase exponentially. The memory system starts to get pretty hideously expensive, cache warming and memory locality issues become more complex as efforts are made to reduce the strain on the memory system, and all the while it becomes harder and harder to keep all of the CPUs busy enough to make the whole thing worthwhile...and this is even for a mere couple of dozen processors.
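To put rough numbers on the scaling argument, here is a minimal sketch in Python. It assumes Amdahl's law for the sequential part plus a coordination cost that grows linearly with each extra CPU; the function name `speedup`, the 10% serial fraction, and the 1% per-CPU overhead are all made up for illustration, not measurements of any real machine.

```python
def speedup(n_cpus, serial_fraction, overhead_per_cpu=0.0):
    """Amdahl-style speedup with an assumed linear coordination cost."""
    # Runtime on a single CPU is normalized to 1.0.
    parallel_fraction = 1.0 - serial_fraction
    # Hypothetical stand-in for locking and cache traffic: every CPU
    # beyond the first adds a fixed slice of extra time.
    overhead = overhead_per_cpu * (n_cpus - 1)
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus + overhead)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 24):
        print(n, round(speedup(n, serial_fraction=0.10, overhead_per_cpu=0.01), 2))
```

With these made-up inputs the speedup climbs through 8 CPUs and is lower again at 24, which is the "harder and harder to make it worthwhile" effect in miniature. A real memory system will not follow such a tidy curve.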
When you're looking at something like Blue Gene, look not at the amount of CPU power involved but at the incredible memory/communications bandwidth - multiple communicating processors on a single chip, multiple chips on a board, boards arranged into modules, etc. The key to Blue Gene is that they have this phenomenal bandwidth coupled with a specialized application which is almost uniquely able to take advantage of how the memory/communications system is structured.
Slashdot - News for Herds. Stuff that Splatters.
True. I've been using an SMP machine for three years now, and it's painful to go back to a single-processor machine. However, I think that there are several reasons for hardware vendors (in particular, x86 vendors) not releasing SMP machines:
1. Drivers
SMP causes a lot of badly-written drivers to fail, although they might work reasonably well under a single CPU.
2. Cost
SMP on x86 requires more expensive motherboards, a larger-capacity power supply, and overall better quality of components, all of which costs more (not to mention the cost of the second CPU itself).
3. Competition
x86 vendors have to keep their prices down in order to be competitive, and with the current "MHz = Better speed" idea firmly implanted in the minds of most people, it's going to be harder selling a dual-CPU 700MHz system (for example) if there are 800MHz single-CPU systems available.
4. Lack of OS support
Like it or not, the majority of users are still stuck on Win95/98, neither of which supports SMP. WinNT/2000 does, but how many computers for home use are sold with those installed?
5. Bad architecture
The x86 platform's SMP, quite frankly, sucks. A lousy bus/cache architecture means that you won't get 2x the performance you would from a single CPU for any application which hits main memory a lot.
6. Difficulty of programming for SMP
If you want to get the benefits of SMP from within a single application, you basically have to use threads, which are a real pain to debug properly.
That's all I can think of off the top of my head...
FreeBSD can run ACPI because its SMP is poor. FreeBSD (note that 5.x will probably change this) uses the big giant lock method of getting at the hardware. Thus when you access hardware on one CPU, the other CPU is stopped. Generally this is bad, but it means that ACPI works - the system looks like a single processor to ACPI.
I love FreeBSD, and have run it in SMP since the pre-3.x days.
First of all, to some extent SMP is being commoditized -- Apple, for instance, is now selling SMP as being a simple one-step upgrade from UP in their PowerMac G4s. Apple is also the computer vendor that brought us widespread use of USB, the focus on industrial design as a consideration when buying computers, etc. Expect other vendors to follow that lead, insofar as they can load operating systems that can take advantage of SMP.
Microsoft should probably be credited with holding systems back to single processors with Win9x/ME, and yes, even WinNT. With NT, IIRC, processes, not threads, were spread across processors -- so you saw very little benefit running a single, multi-threaded app on an SMP system. I would hope W2K does something more reasonable, as in something that virtually every other SMP implementation does (notably, except MacOS pre-X), and spreads threads across processors.
Finally, in the x86 arena, only Intel can support SMP currently -- and AMD has been providing a much better price/performance ratio for some time, and is even generally ahead in performance right now. That makes it more difficult to justify going with lower-performing, more expensive processors to increase performance, although of course the difference between dual 800MHz P3s and a single 1.1GHz Athlon should be quite noticeable if you're running a well-threaded application (or lots and lots of processes).
All that is for PC systems (including Macs as Personal Computers, if not Wintel PeeCees :). For other architectures (alpha, sparc/ultrasparc, MIPS, PA-RISC for instance), SMP is alive and well. SGI's highest-end workstations-that-could-be-servers, Octanes and Octane2s, support two processors, and their servers support a lot of processors. Sun has SMP workstations and ridiculously SMP servers as well; I've seen a lot of SMP alpha motherboards, but since Alphas are almost as commodity as PCs I haven't checked out what sorts of systems [c|o|m|p|a|q] sells. Hewlett-Packard also sells SMP workstations and servers, but my experience with them is with the old HP 9000/7xx series that are largely, if not completely, uniprocessor.
--Matthew
SMP is not always faster. If you are running two completely independent CPU-bound programs, then SMP is faster, but then why not have two computers? As soon as your threads need to interact, SMP slows down. Depending on your algorithm this might or might not be a big deal.
Or to put it another way, the best SMP code on a 2-CPU system will, in the general case, be slower than the same program on one processor that is twice as fast (i.e. an SMP program for two P3-500s will run slower than a single-processor-only program for one P3-1000), because of cache coherency issues and the like. Of course, two P3-500s might be cheaper by enough to make it worthwhile.
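The two-slow-versus-one-fast comparison can be checked with a little arithmetic. This is only a sketch under idealized assumptions: the function name `runtime` and the 20% serial fraction are invented for the example, and it charges nothing at all for locking or cache coherency, so it is the best case for the dual-CPU box.

```python
def runtime(serial_fraction, n_cpus, clock):
    """Time for one unit of work; clock=1.0 stands for the fast CPU."""
    parallel_fraction = 1.0 - serial_fraction
    # Serial work runs on one CPU; parallel work is split evenly.
    critical_path = serial_fraction + parallel_fraction / n_cpus
    return critical_path / clock


# Hypothetical workload that is 20% serial.
one_fast = runtime(0.2, n_cpus=1, clock=1.0)  # stands in for one P3-1000
two_slow = runtime(0.2, n_cpus=2, clock=0.5)  # stands in for two P3-500s

if __name__ == "__main__":
    print("one fast CPU: ", round(one_fast, 3))
    print("two slow CPUs:", round(two_slow, 3))
```

In this model the dual half-speed machine takes (1 + s) times as long as the single fast CPU, where s is the serial fraction, so it only breaks even when the program has no serial part at all. Any real synchronization cost makes the gap wider.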
Massive multi-CPU machines like the IBM Blue Gene you reference are never SMP. SMP machines generally have multiple CPUs sharing bus, RAM and I/O, as well as everything else a uniprocessor machine has, and therefore encounter all kinds of inefficiencies as you scale up. In practice, 4 CPUs seems to be the "sweet spot" for SMP - after this you start running into the law of diminishing returns hard. You can, to an extent, code around this, by doing things like making the locks in your kernel more and more fine-grained, but this adds a lot of unnecessary overhead for machines with small numbers of CPUs and also makes the code orders of magnitude more complex (and hence unmanageable). "Supercomputers" generally are built with large numbers of nodes (generally with 1-4 CPUs each) that could (in theory) operate independently, with a very high speed, low latency interconnect lashing them together (really they are glorified clusters). This seems to be the future for high-end Unix machines: both Compaq's Wildfire (now shipping) and IBM's Regatta (coming soon) systems will feature a "cluster in a single box" type of architecture.
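For anyone who hasn't seen the lock-granularity trade-off spelled out, here is a toy sketch in Python. The class names, the `hammer` helper, and the slot counts are all hypothetical; this is not how any real kernel does it, and standard CPython serializes bytecode anyway, so it illustrates the structure and the added complexity, not the performance.

```python
import threading


class GiantLockCounters:
    """One lock guards everything, so updates never overlap."""

    def __init__(self, n_slots):
        self._lock = threading.Lock()
        self._slots = [0] * n_slots

    def increment(self, slot):
        with self._lock:
            self._slots[slot] += 1

    def total(self):
        with self._lock:
            return sum(self._slots)


class FineGrainedCounters:
    """One lock per slot: updates to different slots may overlap."""

    def __init__(self, n_slots):
        self._locks = [threading.Lock() for _ in range(n_slots)]
        self._slots = [0] * n_slots

    def increment(self, slot):
        with self._locks[slot]:
            self._slots[slot] += 1

    def total(self):
        # A consistent total needs every lock, always taken in the same
        # order to avoid deadlock. This is the extra complexity.
        for lock in self._locks:
            lock.acquire()
        try:
            return sum(self._slots)
        finally:
            for lock in reversed(self._locks):
                lock.release()


def hammer(counters, n_threads=4, n_updates=1000, n_slots=4):
    """Run n_threads workers, each bumping one slot n_updates times."""

    def worker(slot):
        for _ in range(n_updates):
            counters.increment(slot)

    threads = [
        threading.Thread(target=worker, args=(i % n_slots,))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counters.total()
```

Both versions give the same answer, e.g. `hammer(GiantLockCounters(4))` and `hammer(FineGrainedCounters(4))`. The fine-grained one lets unrelated updates proceed in parallel on real SMP hardware, but even in this tiny example it needs more locks, more acquisitions, and a lock-ordering rule that the giant-lock version never has to think about.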
--
http://gammatron.weblogger.com