G5 vs. x86 and Mac OS X vs. Linux
demonbug writes "Anandtech has an article up comparing performance of dual G5s to AMD Opteron and Intel Xeon workstations. The article also takes a look at performance under Mac OS X versus Linux. It provides an interesting look at some of the strengths and weaknesses of the different CPUs." From the article: "This article is written solely from the frustration that I could not get a clear picture on what the G5 and Mac OS X are capable of. So, be warned; this is not an all-round review. It is definitely the worst buyer's guide that you can imagine. This article cares about speed, performance, and nothing else! No comments on how well designed the internals are, no elaborate discussions about user friendliness, out-of-the-box experience and other subjective subjects. But we think that you should have a decent insight to where the G5/Mac OS X combination positions itself when compared to the Intel & AMD world at the end of this article."
Darwin vs. Linux on PPC!! This is a more useful comparison, IMHO, than Linux on a P4 and/or AMD vs. Darwin. Then you can better gauge the kernel latencies, etc. Are there any differences in Mac OS X Server's kernel? The article concludes Mac OS X is ok for Desktop use, but not for server use. I found this article disappointing.
I'm not a Mac Zealot, let's start with that.
But they are running a test and identifying thread creation as really slow on the Mac, and claiming that this is the cause of the Mac's slow performance on the MySQL test.
Come now, if you are running software that is slow because you are creating threads all the time then you need to change software.
Use some kind of threadpool and *kaping*, problem is gone.
This is more revealing for MySQL than it is about Mac OS X.
The Internet is full. Go Away!!!
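For what it's worth, the thread-pool idea above can be sketched in a few lines. This is only an illustration in Python, not anything taken from MySQL or the article: handle_query and run_with_pool are made-up names, and the pool size of 4 is an arbitrary assumption. The point is that a fixed set of worker threads is created once and reused for every request, so the cost of creating a thread is paid a handful of times rather than once per query.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_query(query_id):
    # Hypothetical stand-in for real per-connection work.
    # Returns the id of the worker thread that served the request.
    return query_id, threading.get_ident()


def run_with_pool(num_queries, pool_size=4):
    # The workers are created at most pool_size times and then reused,
    # no matter how many queries are submitted.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle_query, range(num_queries)))
    distinct_threads = {thread_id for _, thread_id in results}
    return len(results), len(distinct_threads)


if __name__ == "__main__":
    handled, threads_used = run_with_pool(1000)
    print(f"{handled} queries served by {threads_used} threads")
```

Whether this helps a given server depends on how its connection handling is structured; it only shows why a pool takes thread creation out of the hot path.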
MySQL runs just fine on the BSDs, Linux, and even Windows. Every project on the face of the planet that uses threads has to be re-written for the sake of Darwin/OS X?
I will point out that this is hardly relevant for a desktop OS, and that I am more than happy with my dual G5/1.8GHz. Getting things done faster and neater due to elegant interaction design is much more important to me than being able to spawn threads quickly ;)
Except that they used Apache 1.3 and MySQL, two of the worst possible choices. If they'd gone for Apache 2.x (which actually uses threading, instead of processes) and PostgreSQL, things would've looked much nicer.
Wouldn't it have been better to use compilers that are tuned for each platform? Say, Intel's compilers for the x86 systems, and IBM's compilers for the PPC systems. These compilers could perform better prefetching, for example, and you might get a more accurate idea of what the systems could do with binaries that are tuned for that system.
Most of the benchmark data is bottlenecked by gcc, as the review mentions. That's fair, because that's what so many of us use to compile on these kinds of platforms. But I do think that Apple would do well to throw some of their programmers at the GCC project, at least adding their expertise to some of the Altivec modules. It would show off their platform, and return some value to the gcc project surely used extensively by Apple.
--
make install -not war
"This means that applications use slower user-level threads like in FreeBSD and not fast kernel threads like in Linux. It seems that FreeBSD 5.x has somewhat solved the performance problems that were typical for user-level threads, but we are not sure if Mac OS X has been able to take advantage of this.
In order to maintain binary compatibility, Apple might not have been able to implement some of the performance improvements found in the newer BSD kernels."
Yes, server performance with the Xserve seems terrible right now, but I think that will be solved in the future, as Apple will incorporate the enhancements from FreeBSD 5 and, more importantly, 6. FreeBSD and Apple seem to be cooperating on many issues.
What a crock of shit. Guess what, buddy -- he's measuring the performance of both at the same time! *gasp*
He takes the most advanced Apple system, a dual G5 2.7 GHz, and compares it with 2 recent AMD / Intel machines, an Opteron 250 and a Xeon DP 3.6 GHz. The Apple system gets their most recent OS release, Tiger 10.4.1. The Intel and AMD systems get a SUSE release running Linux 2.6.5.
Using these machines to run various benchmarks reveals how these modern, currently-available platforms compare to each other. It's an obvious test to undertake. Notably, their benchmarks show that OS X's threading performance, especially with MySQL and Apache, doesn't compare favorably with performance on the Intel and AMD systems. That's good information to have at one's disposal.
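To make "threading performance" a bit more concrete, here is a rough sketch of a thread-creation microbenchmark. It is not the benchmark the article ran: Python's threading module is used purely as a stand-in, and mean_thread_create_seconds is a made-up name. The absolute numbers depend heavily on the OS and the interpreter, so treat it only as an illustration of what is being measured, namely the average cost of starting and joining a thread that does no work.

```python
import threading
import time


def do_nothing():
    # Empty thread body, so the timing is dominated by create/join overhead.
    pass


def mean_thread_create_seconds(iterations=200):
    # Start and join one thread at a time, then average the elapsed time.
    start = time.perf_counter()
    for _ in range(iterations):
        worker = threading.Thread(target=do_nothing)
        worker.start()
        worker.join()
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    per_thread = mean_thread_create_seconds()
    print(f"about {per_thread * 1e6:.1f} microseconds per create+join")
```

Running the same script on two systems gives a crude relative figure, nothing more.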
Which would probably even things up, if anything. Remember that GCC's largest user base is probably x86, and most of its developers are probably working on x86 PCs. So it stands to reason that a lot of work has gone into the x86 optimisations in GCC over the years. But they're very different CPUs (translated-CISC vs kinda-RISC), so different things have to be done to optimise for each processor family. Maybe it's easier to optimise for PPC. Maybe it's easier for Apple to create optimisations for the few PPC CPUs it uses (603, G3, G4, G5), while it takes an army of volunteers to create optimisations for the plethora of x86 CPUs (386, 486, p5, p6, pentium 3, pentium 4, amd386, amd486, k5, k6, k7, k8, and all the ones from Cyrix and now Via).
I want to hear about the techniques used by "the major database vendors" to deal with the thread blocking issue. Maybe programs like MySQL can take advantage of these enhancements, too.
This doesn't appear flattering for Apple, but it's apparent that they have been scrambling to get the user experience right in OSX at the expense of kernel development, which has been left sub-optimal. Hopefully they will be able to refocus on the kernel and the compiler and get the performance up to what Linux people expect. Thread blocking will become much more of an issue as multi-core CPUs become mainstream.
Linux is a good example of what can happen here. They got crummy benchmarks, the kernel guys identified the bottlenecks, experiments were written to overcome the bottlenecks, and eventually the fixes made it into the kernel and everyone benefits. Notice how Microsoft doesn't brag about performance any more?
Sunshine is the best disinfectant. - Tip O'Neill
The odds are pretty good that you'll need to do some CLI sorcery to get an X-Server to run under OSX.
Double-click on your hard disk.
Double-click on Applications.
Double-click on Utilities.
Double-click on X11.
Compare a machine running OS 8 or OS 9 to the same machine running OSX: it will be discernibly slower when running OSX.
The interesting question is, why?
Here's what I've found:
Compare a machine running NeXTSTEP with a comparable machine running OS 8 (say, the Performa 475 vs the NeXTstation Mono). The NeXT, running the same basic kernel as OS X, is about as responsive for pure GUI interactions and WAY more responsive running multiple applications or when disk I/O is involved.
Compare a machine running BeOS and Sheepshaver with the same machine running OS 8 natively. Under BeOS, the machine is again more responsive, and again disk I/O is much better.
Compare a machine running OS 9 applications under Classic and the same machine running OS 9. Not a lot of difference. Slightly slower screen, better disk I/O, and much more responsive than OS X applications.
OS 8 vs OS 9? Not that big a deal. OS 9 does multitasking a bit better, it seems, but at the same time it's a bigger system.
OS X should be faster than OS 9, then, even with the "Microkernel overhead", because of the improved multitasking and disk I/O. But you can see that it isn't, just by using it on the same machine.
The big difference is that OS X allocates a separate raster map for each window, and composites them without involving the app. Scrolling panes in windows can end up using a raster map the size of the scrolling region. This means at least tens of megabytes of extra storage just on scrolling, and at least one and sometimes two additional copies (depending on translucency) before any pixel makes it to the screen.
This is why QE and QE2d are such big wins on the Mac. They move one of the copies out of the way.
Meanwhile on OS 9, you usually have zero copies... the app calls Quickdraw and Quickdraw renders just what's visible, and may completely bypass the CPU to do it. Just like just about every other windowing system I know of, including NeXTstep.
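To put a rough number on the "tens of megabytes" claim, here is a back-of-the-envelope sketch. The 4 bytes per pixel and the example dimensions are assumptions chosen for illustration, not measurements of Quartz, and backing_store_bytes is a made-up name.

```python
def backing_store_bytes(width_px, height_px, bytes_per_pixel=4):
    # One full raster map for a region: every pixel is stored,
    # whether or not it is currently visible on screen.
    return width_px * height_px * bytes_per_pixel


def to_mib(num_bytes):
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Assumed sizes: a 1024x768 window, and a scrolling region
    # 1024 pixels wide and 10000 pixels tall.
    window = backing_store_bytes(1024, 768)
    scroll_region = backing_store_bytes(1024, 10000)
    print(f"window: {to_mib(window):.1f} MiB")
    print(f"scrolling region: {to_mib(scroll_region):.1f} MiB")
```

Under those assumptions a single window costs about 3 MiB and one long scrolling region about 39 MiB, before counting the extra copies made during compositing.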
Lamborghini? Did you read the article? They found that Linux was ten times faster for high-end server apps that make lots of system calls. That's more like comparing that old Charger to a shiny new bicycle. I love OSX's GUI too, but is it worth an order of magnitude speed penalty? On a server system? Hell no.
(I similarly dislike Linux and like OSX, so this article disappointed me. I do think they made some mistakes in their testing. However, the underlying problems causing the performance issues are certainly real.)
Mach's multitasking _performance_ still blows.
Compared to OS 9? Have you used classic Mac OS? The classic Mac OS multitasking charade (I won't call it a kernel) was appalling. It had no real scheduler: applications ran for a while, gave up the CPU voluntarily, and went on. There was no way to get smooth interapplication concurrency because the API was built around operations that weren't even thread-safe, let alone safe for separate independent applications to use concurrently.
That's what I'm comparing Mac OS X with, not other real multitasking operating systems, but a hideous shambling wreck that was so bad it made a 240 MHz PowerPC running Mac OS 9 feel less responsive than a 30 MHz 68040 running Mach.
Later on, I ran both OS 9 and OS X on the same hardware. OS X was smoother and more responsive in the face of even heavy competition for the CPU and disk than OS 9 just sharing files. I had an upgraded 7600 which I was going to use as a file server and occasional console, until I started trying to use it that way. Any time it started sharing files it got slow, unresponsive, and jerky. I wanted to use it for music, but iTunes would chop and skip on just about any file access. Upgrading it to a 240 MHz CPU and giving it a second SCSI card just for file sharing didn't help.
Bad as Mach is, it's so much better than what Apple was using before that if they had just stuck to using Quickdraw and Display Postscript OS X would have knocked the doors off OS 9 on the same hardware. I had a copy of Rhapsody DR1 for Intel at one point, and it was easily the equal of BeOS (another OS I've found has an inflated reputation) on the same test box.
It's not the Mach kernel that makes OS X slower than OS 9, it's the Quartz graphics.
OK, I don't know whether this "between 2 and 5(!) times slower" stat is true, but even if it is, it should hardly matter. The time it takes to create a thread should be insignificant compared to the time it takes for the thread to do its work (at least, in a good program). Not to mention that in the case of a multithreaded server (like Apache 2), thread pools are used so that less time is spent creating and destroying threads.
The bits on the bus go on and off... on and off... on and off...
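The "should hardly matter" argument above is easy to check with some arithmetic. The sketch below uses made-up timings (50 microseconds to create a thread, 10 milliseconds of work per thread) purely as assumptions; only the 5x factor comes from the quoted stat. total_slowdown is a hypothetical helper name.

```python
def total_slowdown(create_us, work_us, creation_factor):
    # Ratio of total per-thread time when creation is creation_factor
    # times slower, relative to the baseline.
    baseline = create_us + work_us
    slowed = create_us * creation_factor + work_us
    return slowed / baseline


if __name__ == "__main__":
    # Long-lived work: creation cost is a tiny share of the total.
    print(f"{total_slowdown(50, 10_000, 5):.3f}x")
    # Very short work: creation cost dominates, so the factor shows through.
    print(f"{total_slowdown(50, 50, 5):.3f}x")
```

With those assumed numbers, 5x slower creation costs about 2% overall when each thread does 10 ms of work, but triples the total when the work is as short as the creation itself. So the stat only matters for programs that spawn a thread per tiny task.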