FYI, there are many PPC compilers which easily outperform gcc. Your assertion that PPC compilers need to "catch up" with x86 compilers is unfounded. Regardless, it was not relevant for the purposes of Apple's benchmark. GCC was chosen for reasons which are quite clear if you read the report.
That is to say, no matter what system you run it on, it should give you the same answer. Whatever gives you the answer the fastest is the highest performer. This is all the user cares about.
I think you missed the point of Apple's test. Read the opening page of the report. Apple could easily have rigged this test in such a way as to get MUCH better performance than they did. They could have used RAM disks instead of serial ATA drives. They could have used an OS specifically designed for the tests, but which was useless in a production environment. Similarly with a compiler. (It looks like they may have done so actually with the specialized heap they used.) In the end they would have had an impressive number that was completely irrelevant to the end user experience.
The idea behind the tests was to, as much as possible, compare apples to apples (pun not really intended). The idea was to compare OS X + PPC 970 vs. Linux + a top of the line Dell, controlling as many of the other factors as possible. This meant trying to keep the other elements as similar as possible.
Ultimately, a SPEC benchmark on a system you can never get or would never use does not do you a lot of good. This test is fairly exemplary because (for the most part... Apple did cheat a bit) these systems seem fairly representative of what people can expect to be using in the real world.
Well, I'm not saying the PowerPC backend is better or worse than the x86 back end, but I am saying that it's ridiculous to imply that Apple was able to "soup up" gcc for their benchmark, while the Intel compiler sucked. The Intel crew have had more than their fair shot at optimising the compiler's output.
Your description of the difficulties with hyperthreading isn't entirely accurate. Even with hyperthreading turned off, both threads will be competing for the same resources. The problems mostly come from the kernel not doing a smart job of scheduling threads. So, for example, if you have only 2 threads that are runnable, and the kernel sees 4 "virtual" processors (because it has no understanding of hyperthreading), it's entirely possible it will schedule the 2 threads to run on the 2 "virtual" processors that happen to be the same physical processor. This is of course a complete waste of resources. You also have issues with intelligent use of memory (if you have 2 processes with 2 threads each, it is often better to have threads in the same process executing on the same processor). Of course, it's unlikely (but possible) that turning on hyperthreading for the non-multiprocessing benchmark would do anything but harm performance (certainly that's been my experience to date).
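To make the scheduling point concrete, here's a toy sketch in Python. It is not real kernel code, and the CPU layout (logical CPUs 0 and 1 on physical processor 0, logical CPUs 2 and 3 on physical processor 1) is just an assumption for illustration; real systems enumerate their logical CPUs in various ways. The function names are made up.

```python
# Toy model of thread placement on a 2-processor hyperthreaded box.
# Assumed (hypothetical) layout: each entry is (logical id, physical id).
LOGICAL_CPUS = [(0, 0), (1, 0), (2, 1), (3, 1)]


def naive_placement(num_threads, cpus=LOGICAL_CPUS):
    """Treat every logical CPU as a full processor: take the first idle ones."""
    return list(cpus[:num_threads])


def ht_aware_placement(num_threads, cpus=LOGICAL_CPUS):
    """Use one logical CPU per physical processor before doubling up."""
    chosen = []
    remaining = list(cpus)
    while len(chosen) < num_threads and remaining:
        used_this_round = set()
        leftover = []
        for cpu in remaining:
            physical = cpu[1]
            if len(chosen) < num_threads and physical not in used_this_round:
                chosen.append(cpu)
                used_this_round.add(physical)
            else:
                leftover.append(cpu)
        remaining = leftover
    return chosen


def physical_processors_used(placement):
    """Count how many distinct physical processors a placement touches."""
    return len({physical for _, physical in placement})


if __name__ == "__main__":
    for place in (naive_placement, ht_aware_placement):
        chosen = place(2)
        print(place.__name__, chosen, physical_processors_used(chosen))
```

With two runnable threads, the naive version puts both on physical processor 0 and leaves the other processor idle, while the aware version spreads them across both.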
In general, it's worth noting that most of the SPEC benchmarks which have been submitted to SPEC appear to have hyperthreading turned off.
As for the performance of the Athlon 3200+ you are quoting, that score was obtained on a very different system. For starters it was running Windows, it was using a different compiler, and it was using a specialized heap (although I notice the 970 got a special heap for its part of the test ;-), amongst other things. Consequently it's hard to draw conclusions from the isolated score. If you look around at similar benchmarks for the 3GHz P4, you'll see that it achieves similar scores. That would suggest, if anything, that the two chips perform comparably on SPEC benchmarks.
So far, all the head-to-head comparisons I've seen have been inconclusive at best. That doesn't mean what you are saying is wrong, but it's far from clear that this is the case.
Actually, if you look at the details of the benchmarks, the fp benchmarks where the x86 really performed poorly were the Fortran benchmarks. The x86 actually won one of the fp benchmarks that was in C.
Intel and various other parties have spent a lot of time trying to optimize gcc's performance for x86. Those efforts completely dwarf the efforts made by Apple and IBM to optimize PowerPC performance.
Actually, it's entirely likely that hyperthreading would have made the systems perform more slowly, particularly since the stock kernel they were using was not hyperthreading aware. That being said, it was interesting that they turned on hyperthreading for the single CPU tests.
I haven't seen any compelling evidence yet as to whether Athlons or Opterons score better on SPEC benchmarks. It'll be interesting to run these benchmarks again once the G5s are actually out.
You are comparing apples to oranges. Apple only quoted "Specfp_base", and those FP benchmarks are for "Specfp". One would expect those kinds of jumps in performance.
Actually, if you look at the benchmarks, the GCC scores are pretty close (for the FP benchmarks). Where the Intel side really fails is with the Fortran compiler, which is generally considered to be a pretty good Fortran compiler.
Given that if you are comparing Linux vs. OS X, the vast majority of your code will have been compiled with gcc, and the number of man years spent optimizing gcc's x86 performance, I think this is actually a pretty fair benchmark.
"I mean, it should meet the open group's standards, right? My concern is apple might not think it will meet TOG's standards and they'd rather not risk it."
Actually, I would expect that they do not meet The Open Group's specifications. OS X is different enough that they are bound to have problems, and having to solve the technical problems is almost certainly prohibitively expensive.
Hey, if you work with her, any chance you can convince her to do some broader analysis investigating the possibility that the code in question could be obtained from public sources, such as the BSD code base or books?
Frankly, I wasn't surprised that the code looked strikingly similar. It's to be expected really, given that both code bases have a lot of contributions from the "commons". I would think a more useful analysis would try to trace the code samples back to their sources, both on the Linux side and on the Unixware side.
Good point - but what I don't understand is why they don't put a limit on the Linux side of things? Such as - keep this secret until a judge says this is over, owtte.
Because once you publish this, the claim that they have any "trade secrets" basically goes away. Imagine if the court ruled in SCO's favor, and then you published anyway (and legally too). The "trade secret" status would be lost.
This is the pain with trade secret cases. In order to prove your case, you have to present the secrets, but once you do, they are no longer secrets, making the court case pointless. The notion of sharing these secrets with anyone prior to trial is pretty ridiculous. At least in the courtroom there are restrictions which prevent the secrets from being shared except as directed by the judgement. Outside of that it's far more complicated. :-(
The whole NDA thing is a pretty ridiculous farce. The fundamental principle is that SCO does not want it revealed what code in Linux is in question. Since that code is already public knowledge, if they let you publish information "obtained from other sources" you can basically publish the relevant code.
SCO also doesn't want to have to turn over everything they've got. They basically want to just throw down their Unix trade secrets and Linux source code, and have people draw their own conclusions. A contract with more flexibility could open them up to having to share a lot of other things related to the case.
As for the state of Utah clause, it's pretty typical for a contract to have some state governing its enforcement, and typically it's the home state of the company drafting the contract. Sadly, they aren't a Delaware corp. :-(
I think SCO is as evil as the next guy, and I think the NDA thing is a red herring, but I have to say I can't see how else they could have written this NDA without compromising secrets that they obviously feel they need to protect.
Thank you so much for this article. Tech support for family members is a source of great stress in my life; however, none of them have sent me a power bar in the mail. I never realized how easy I had it. ;-)
The native file system for OS/2 was HPFS. JFS was originally on AIX, and was later ported to OS/2. Admittedly, a lot of the work on JFS on OS/2 was used for JFS on Linux, but that still does not change the fact that JFS is indeed from AIX.
I realized it's only supporting the WebDAV protocol for synching with Exchange 2000 or later. It still doesn't do the MAPI-over-RPC native protocol of Exchange.
I didn't know that KDE had anything for synching with Exchange. Is this really just the ability to synch with an IMAP server, or will I be able to synch KOrganizer with Exchange as well?
Honestly, I've been getting the sival every time I've used sigwaitinfo(). I've been doing this since 2.4. I agree that without the sival you lose a fair bit of the benefit of asynch I/O with signals for notification. This was totally working for me with SGI's AIO interface. I believe it was also working with the clunky glibc implementation.
3. Solaris doesn't have 64 bit memory access. Its like 38 or 48 bit. Check their UltraSparc docs.
Solaris does have 64-bit memory access. Sure, the average UltraSparc can't address nearly that much memory (just like every other CPU out there), but the operating system certainly supports it.
That being said, Linux can address up to 64GB of memory on Xeons thanks to PAE. Of course, you are stuck with only 4GB per process.
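For anyone wondering where those two numbers come from, it's just address-width arithmetic: PAE widens physical addresses to 36 bits, while each process still sees 32-bit virtual addresses. A quick sketch (the function name is made up for illustration):

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_gib(address_bits):
    """Size of an address space with the given number of address bits, in GiB."""
    return 2 ** address_bits // GIB


physical_limit = addressable_gib(36)     # PAE: 36-bit physical addresses
per_process_limit = addressable_gib(32)  # virtual addresses stay 32 bits wide
print(physical_limit, per_process_limit)
```

In practice a process gets somewhat less than the full 4GB, since the kernel reserves part of each address space for itself.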
SGI's asynch I/O mechanism supported the standard POSIX notification mechanisms (except perhaps the SIGEV_THREAD option; I can't remember). The new asynch I/O implementation that will hopefully make 2.6 should allow you to implement the entire POSIX spec no problem.
I have been using realtime signal queues on Linux for ages. I'm not sure what you are looking for that is missing there.
My old comp sci professor used to talk about the difference between "speed" and "efficiency". A quicksort program written in BASIC, running on a 386 machine, can, for sufficiently large datasets, outperform an insertion sort program written in assembler, running on a Cray.
Most really big performance wins (like the 100x example) don't come from using a lower-level language or fancy compiler optimisations. They mostly come from a more efficient design.
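Here's a rough sketch of that point in Python. It counts comparisons rather than measuring wall-clock time, so it says nothing about a 386 versus a Cray; it only shows how quickly the algorithmic gap grows. The input (a reverse-sorted list) and the middle-element pivot are my own choices for the illustration.

```python
def insertion_sort(items):
    """Return (sorted list, number of element comparisons)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break
            data[j + 1] = data[j]  # shift the larger element right
            j -= 1
        data[j + 1] = key
    return data, comparisons


def quicksort(items):
    """Return (sorted list, number of pivot comparisons), middle-element pivot."""
    comparisons = 0

    def sort(data):
        nonlocal comparisons
        if len(data) <= 1:
            return data
        pivot = data[len(data) // 2]
        less, equal, greater = [], [], []
        for x in data:
            comparisons += 1  # one three-way comparison against the pivot
            if x < pivot:
                less.append(x)
            elif x > pivot:
                greater.append(x)
            else:
                equal.append(x)
        return sort(less) + equal + sort(greater)

    return sort(list(items)), comparisons


if __name__ == "__main__":
    for n in (100, 1000):
        worst_case = list(range(n, 0, -1))  # reverse-sorted input
        _, slow = insertion_sort(worst_case)
        _, fast = quicksort(worst_case)
        print(n, slow, fast)
```

On reverse-sorted input the insertion sort does n(n-1)/2 comparisons, so every tenfold increase in data costs it roughly a hundredfold more work, while the quicksort's count grows only a little faster than the data itself. A constant-factor advantage from hardware or assembler can't keep up with that forever.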
In general, this is a very informative post. I'd like to point out a couple of things:
1) The IMAP protocol, HTTP protocol, and for that matter many other protocols support the notion of simultaneous processing of multiple requests. You may not be able to submit requests exactly simultaneously, but you don't have to wait for the response from one before you submit the next request. The same is true of X11. So, the only "atomic" part of X11 is the sending and parsing of X messages, and this is only if they are sharing the same connection.
2) There is nothing which prevents a multi-threaded application from establishing multiple connections with an X server. This would allow for completely multithreaded operations (well, maybe the graphics card driver would impose some synchronization points, but who knows).
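The pipelining idea in point 1 can be sketched with a toy line-based protocol in Python. This is not IMAP, HTTP, or X11; the "reply to" server and both function names are invented for the illustration. The only thing it shows is a client writing several requests on one connection before reading any response.

```python
import socket
import threading


def serve(conn):
    """Toy server: answer each request line with one reply line, in order."""
    with conn, conn.makefile("rwb") as stream:
        for line in stream:
            stream.write(b"reply to " + line)
            stream.flush()


def pipelined_requests(requests):
    """Send every request up front, then collect the replies afterwards."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server,))
    worker.start()
    with client:
        # All requests go out before we read a single response.
        client.sendall(b"".join(request + b"\n" for request in requests))
        client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
        with client.makefile("rb") as stream:
            replies = [line.rstrip(b"\n") for line in stream]
    worker.join()
    return replies


if __name__ == "__main__":
    print(pipelined_requests([b"first", b"second", b"third"]))
```

The requests here are tiny, so everything fits in the socket buffers; a real client pushing large requests would need to interleave reads and writes to avoid stalling.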
Win95 users paid for that OS, and more importantly it was clear that Win95 was a separate product. Apple has now essentially found a way to get people to pay an additional $50 on top of the $130-odd they get for OS upgrades. Add on .Mac and the price of the Mac OS X 10.2 upgrade I bought last year essentially doubled.
What if Apple started selling upgrades to Finder? Or better yet the disk repair utility? ;-)
The assumption that software's value is in its feature set is completely false. What was the value of Microsoft Office 97 the day before Office 2000 came out? What was the value the day after? Shouldn't they be the same, or at least roughly the same? Why was the price of Office 2000 the same as the price of Office 97 the day before, even though it has more features?
Since he wasn't a lawyer, but rather was in the employ of a lawyer, is it possible for him to violate attorney-client privilege (I honestly don't know)?
Note that Apple quoted "base" test scores. You are quoting non-base scores. One would expect them to be much higher.
Yup. My current employer uses an outside detective agency to do background checks. They basically do a credit report and verify parts of your resume. If it all meets preset guidelines, all my employer gets is a "he checked out" message. Also, on the day of the job interview, they gave me the waiver that I needed to sign, and it included a detachable sheet with contact info on the detective agency if I had any concerns or wanted to see the report.
That seems fair enough to me.
It is my understanding that this has been done for a while, and was actually called a "Canadian Cross Compiler".
Actually, there was such a thing as HotJava long ago... it definitely was EOL'd a long time ago.