Apple Hardware VP Defends Benchmarks
He said Veritest used gcc for both platforms, instead of Intel's compiler, simply because the benchmarks measure two things at the same time: compiler, and hardware. To test the hardware alone, you must normalize the compiler out of the equation -- using the same version and similar settings -- and, if anything, Joswiak said, gcc has been available on the Intel platform for a lot longer and is more optimized for Intel than for PowerPC.
He conceded readily that the Dell numbers would be higher with the Intel compiler, but that the Apple numbers could be higher with a different compiler too.
Joswiak added that in the Intel modifications for the tests, they chose the option that provided higher scores for the Intel machine, not lower. The scores were higher under Linux than under Windows, and in the rate test, the scores were higher with hyperthreading disabled than enabled. He also said they would be happy to do the tests on Windows and with hyperthreading enabled, if people wanted it, as it would only make the G5 look better.
The G5 modifications were made because shipping systems will have those options available. For example, memory read bypass was turned on: even though it is not on by default in the tested prototypes, it will be on by default in the shipping systems. Software-based prefetching was turned off and a high-performance malloc was used because those options will be available on the shipping systems (Joswiak did not know whether this malloc, which is faster but less memory efficient, will be the default in the shipping systems).
As to not using SSE2, Joswiak said they enabled the correct flags for it, as documented on the gcc web site, so that SSE2 was enabled (the Veritest report lists the options used for each test, which appears to include the appropriate flags).
http://www.osnews.com/story.php?news_id=3877
the article is analyzing if the recent announcements from Apple were innovation or simple catch up.
Eh, we do this sometimes, when it is appropriate. In this case, I have a PR contact at Apple who asked me last week if I wanted to talk to someone about WWDC, and we set up a call last weekend, for this afternoon. It just happened to coincide with the benchmark discussion, which Greg was eager to set straight (he had read the arguments and already compiled his responses :-). We also talked a bit about some other topics, but nothing of interest that you haven't read elsewhere.
Slashdot has a huge readership of IT professionals, both those in charge of purchases and the target market themselves.
The comparisons should be done on the fastest results available, and not be based on some arbitrary factoring out of the compiler capability.
It isn't arbitrary.
In the end, SPEC is about measuring how fast something can be done in the real world
No, it really isn't. It is about raw performance, not real-world use.
I'm a science guy, and for the calculations and simulations done here at the physics dept. where I work, the IBM power4 kills just about everything else. And when I saw the powermac calculate fractals with mathematica faster than the xeon box by more than a factor of 2, I was very excited (although a little cautiously) to see we will soon get power4 performance for well under $20,000
Apple just got rid of Project Builder; this is their new IDE: http://www.apple.com/macosx/panther/xcode.html
"Reality is just a convenient measure of complexity" -Alvy Ray Smith
Metrowerks is much more tuned, but it is owned by Motorola, so Apple could do zero tuning for the G5. Unless IBM writes a Mac compiler, Apple uses GCC because they do most programming in Obj C, which is only supported by GCC, not Metrowerks. Apple chose GCC because it is what they use. Hopefully either IBM writes a compiler or Apple severely overhauls GCC. Give Apple some time and they will get more performance out of GCC, but they simply have not had the time yet. It's like when the P4 came out: it was slow until Intel made a good compiler for it. The truth is Apple is offering similar performance for less money and the package is higher quality. Whether it is slightly faster or slower is really irrelevant right now.
First it's "They're too slow and too expensive."
Now it's "They're blazingly fast, but still too expensive"? Have you SEEN the $799 G4 eMacs?
this is the funniest claim i've seen in a while. not only does apple do this, so does dell, and so does virtually every consumer-oriented company on the planet. gas companies shave a TENTH of a PENNY off gas prices to make them seem cheaper.
a department store (was it macy's?) started this practice. the funny part is that the aim wasn't really to fool consumers into thinking it was less expensive. alas, the real purpose was to force cashiers to open the register, since the customer was almost always going to be due some change.
woof!
SPECfp: The Power4+ at 1.7 Ghz has the highest SPECfp score (1699 @ 1.7Ghz); higher than Itanium (1431 @ 1Ghz), the most recent Alpha (1482 @ 1.15Ghz), and the Pentium 4 (1229 @ 3.0Ghz).
SPECint: As far as SPECint, the Power4 is not in the lead (1113 @ 1.7Ghz), but is still respectable when compared to Pentium4's (1200 @ 3.0Ghz).
The G5/970 should do similarly or better than the Power4+ (since the G5/970 is running at 2.0Ghz vs the Power4+ at 1.7Ghz). One caveat is that the G5/970 has a smaller on-chip second-level cache (512kB vs 1.5MB), which will hurt its performance on some codes.
Certainly Apple's test uses a drastically different compiler than the reported SPEC results. This results in absolute numbers that are lower, but Apple's relative comparison is still reasonable, IMHO. I think it is safe to claim that Apple has really closed the gap in processor speed and now has processors with comparable performance to the fastest chips money can buy. About damn time. :)
SPEC is very much a real-world application benchmark of CPU intensive tasks that people actually run and use, and not an arbitrary synthetic benchmark to measure CPU hardware performance only. The faster these benchmark programs run, the better scores you get, and the quicker you can go home after finishing your work at the office. SPEC's goal doesn't care how you get the job done, as long as it gets done. You could optimize the memory, compiler, ALU, or whatever. It seems Apple got themselves into a trap when they claimed fastest desktop based on SPEC results, as they clearly didn't understand the systems level benchmark objectives of SPEC CPU2000.
In the end, with a score of 1200, the Pentium 4 gets the job done much faster than PowerPC 970, with a score of 800. I'm not sure what Apple was thinking when they published the lower scores of the Pentium 4, as clearly the Pentium 4 could do better. The ultimate claim that Apple has the fastest desktop system is therefore incorrect.
From www.spec.org:
Q9: What source code is provided? What exactly makes up these suites?
A9: CINT2000 and CFP2000 are based on compute-intensive applications provided as source code. CINT2000 contains 11 applications written in C and one in C++ (252.eon) that are used as benchmarks:
Name          Brief Description
164.gzip      Data compression utility
175.vpr       FPGA circuit placement and routing
176.gcc       C compiler
181.mcf       Minimum cost network flow solver
186.crafty    Chess program
197.parser    Natural language processing
252.eon       Ray tracing
253.perlbmk   Perl
254.gap       Computational group theory
255.vortex    Object-oriented database
256.bzip2     Data compression utility
300.twolf     Place and route simulator
CFP2000 contains 14 applications (six FORTRAN77, four FORTRAN90 and four C) that are used as benchmarks:
Name          Brief Description
168.wupwise   Quantum chromodynamics
171.swim      Shallow water modeling
172.mgrid     Multi-grid solver in 3D potential field
173.applu     Parabolic/elliptic partial differential equations
177.mesa      3D graphics library
178.galgel    Fluid dynamics: analysis of oscillatory instability
179.art       Neural network simulation: adaptive resonance theory
183.equake    Finite element simulation: earthquake modeling
187.facerec   Computer vision: recognizes faces
188.ammp      Computational chemistry
189.lucas     Number theory: primality testing
191.fma3d     Finite-element crash simulation
200.sixtrack  Particle accelerator model
301.apsi      Solves problems regarding temperature, wind, distribution of pollutants
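For context on how the individual programs listed above turn into one number: as far as I understand the SPEC CPU2000 rules, each benchmark gets a ratio of a fixed reference time to the measured run time (scaled so the reference machine scores 100), and the composite score is the geometric mean of those ratios. The symbols below (T^ref_i, T^run_i, r_i) are my own notation, not SPEC's.

```latex
r_i = 100 \cdot \frac{T^{\mathrm{ref}}_i}{T^{\mathrm{run}}_i},
\qquad
\mathrm{SPECint2000} = \Bigl(\prod_{i=1}^{12} r_i\Bigr)^{1/12},
\qquad
\mathrm{SPECfp2000} = \Bigl(\prod_{i=1}^{14} r_i\Bigr)^{1/14}
```

Since only run time enters the formula, anything that shortens it (compiler, memory system, or CPU) raises the score, which is why the choice of compiler matters so much in this argument.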
Moore's law says that the transistor count will double every 18 months, doesn't it? Nothing to do with clock speed...
In any case, that seems to match (roughly) with Intel's roadmap from their P4 introduction...
Month 1 - 1.5
Month 2 - 1.7
Month 4 - 1.8
Month 6 - 2.0
Month 10 - 2.1
Month 13 - 2.4
Month 14 - 2.5
Month 18 - 2.8
Month 21 - 3.0
Month 30 - 3.2
The truth is whether a company brings out SPEC marks made under fair configurations or faked configurations, there will always be those who will accept the figures at face value, those who will contest them no matter what and those who really couldn't care less. I am in the third category, if you're curious ;)
Everyone buys a piece of hardware for a different reason: some for design, some for brand, some out of faith, some because they have the money and even some because of an application. If you are choosing for the last reason then the question should be whether it is fast enough for you, and does it in the way you want.
I would recommend everyone to buy the computer that meets their usage requirements and not for some theoretical and utopic bunch of values that don't really mean much in the real world, unless you are only wanting to gloat over something totally subjective.
As a final word, sometimes the slowest factor in getting a job done is not the computer but the user taking their time, because the application has been so badly implemented that it is difficult to use and understand.
Computers have the potential to make the most complicated of applications accessible to a layman of the subject.
Jumpstart the tartan drive.
you know this and I know this but many trolls don't know this. I think Apple just got tired of hearing how PCs are faster and what not. Personally I was blown away by the keynote. Also, for anyone wondering I'm using the developer preview now and if the release of Panther is anything like the preview, holy crap. It is nice. There are a ton of tiny improvements here and there that really make it nice, even nicer than Jaguar. These are little things that weren't mentioned in the keynote.
-
What do you mean, $100 upgrade "every few months"? When did Jaguar debut? More than a few months ago. Did they charge for 10.1? No. You probably don't even own a Mac or have never even paid for an OS upgrade, either.
Yet another facetious and inaccurate accusation.
it was to prevent cashiers from pocketing money.
woof!
That is the silliest thing I've heard in a while. What, Moore's Law says doubling every 18 months, right? Keep in mind that is an exponential growth curve. So after 3 years, you will have 4x the performance. So for Apple to keep up with Moore's law (which has been degrading anyway, I think today people say 24 months), Apple would have to introduce 3.2Ghz machines next year. Now, don'tcha think Steve Jobs would have sounded a little funny saying "3.2 GHz in one year"? I think that is unnecessarily precise. Plus it gives him a chance to underpromise and overdeliver. If he is willing to make that prediction, that has to be the lowest possible speed they can envision in 12 months. Maybe they're expecting 3.5GHz or even 4 GHz in a year, but to say that would actually cut into sales.
Also, remember that IBM said the PPC 970 chip would top out at 1.8 GHz initially? Looks like they surpassed that.
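The 3.2 GHz figure above can be checked with a quick calculation. This is only a sketch under the poster's own assumption that clock speed follows the 18-month doubling curve (strictly, Moore's law is about transistor counts); f_0 and T are my labels for the starting clock (2.0 GHz) and the doubling period.

```latex
f(t) = f_0 \cdot 2^{\,t/T}, \qquad f_0 = 2.0\ \text{GHz}, \quad T = 18\ \text{months}
\\[4pt]
f(12) = 2.0 \cdot 2^{12/18} = 2.0 \cdot 2^{2/3} \approx 2.0 \cdot 1.587 \approx 3.17\ \text{GHz}
\\[4pt]
f(36) = f_0 \cdot 2^{36/18} = 4 f_0
```

So "3 GHz in a year" lands a little under the 18-month curve, and with a 24-month doubling period the same formula gives about 2.83 GHz.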
--
The internet is the greatest source of biased information in the history of mankind.
Actually, I just watched the video again. He actually said:
--
The internet is the greatest source of biased information in the history of mankind.
Overpriced is not the right word. More like Underpriced.
I urge anyone to compare the featureset of Final Cut Pro 4 ($899) vs. similar solutions in the PC world. Avid Xpress DV doesn't even stack up, and with all the plugins and tools, you'll end up spending far more to equal twice the price of the Apple G5 hardware.
It really amuses me when people talk about 10.x updates as if they are service packs. Someone yesterday mentioned this saying "Microsoft doesn't charge us for SP.x upgrades", which was really comedic. Windows ServicePacks just fix broken stuff, and sometimes even break more. With OSX 10.x updates you get brand new features all the time.
I wish people really understood how this shit worked.
Man are you way behind the times. I can do that even with my dual 1Ghz G4.
It is called "market lock-in"
It doesn't matter if it is a better product--someone will ask their friends "what will work for me?" their friends say "I use this, it works for me" and that prompts said person to go out and buy X.
Most people I talk to I can sway to buying a Mac--if I get to them first and let them get their hands on one.
Integrate Keynote and LaTeX
The Apple Quake 3 benchmarks disabled hardware acceleration. They were solely testing the CPU, or trying to at least. The guy from... oh, damn, what's that guy's name, the guy who did the OpenGL demo yesterday. He did the same thing. They did all the rendering they could in the CPU.
That's why the numbers were low compared to other tests that used accelerated graphics.
for ( int i = 0; i < n; i++ ) array[ i ]++;
into new code that would load the array in blocks of 4 and use a single SSE2 instruction to increment all items in that block at a time (adjusting its code appropriately for the case when n is not divisible by 4). There is a preprocessor for PPC called VAST/AltiVec that will do this sort of thing for PPC, but I really doubt that Apple wasn't using it.
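To make the transformation concrete, here is a rough sketch of the loop shape a vectorizer would produce: a main loop over blocks of 4 plus a leftover loop for when n is not divisible by 4. This is plain portable C, not real SSE2 or AltiVec code; the four increments in the main loop only stand in for the single packed add, and the name increment_blocked is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Increment every element of the array, working in blocks of 4.
   The four statements in the main loop stand in for what a vectorizing
   compiler would emit as one packed SIMD add; this version only shows
   the loop structure, not the instruction itself. */
void increment_blocked(int *array, size_t n) {
    assert(array != NULL || n == 0);

    size_t i = 0;
    size_t blocked_end = n - (n % 4);  /* largest multiple of 4 that is <= n */

    /* Main loop: one "vector" of 4 elements per iteration. */
    for (; i < blocked_end; i += 4) {
        array[i]++;
        array[i + 1]++;
        array[i + 2]++;
        array[i + 3]++;
    }

    /* Remainder loop: the 0 to 3 elements left over. */
    for (; i < n; i++) {
        array[i]++;
    }
}
```

Calling increment_blocked(a, 7) on a 7-element array runs the main loop once (elements 0 to 3) and the remainder loop three times (elements 4 to 6).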
my other lambda is a Y
I've got a Powerbook G4 800MHz, and it's perfectly suited for development. Compile times will of course be better with the G5, but I don't mind waiting for something to compile... makes me take a break from heavy coding. So if you need portability, go for the powerbook. Although you may want to wait a little longer just to see if the 15" gets updated like it should. TiBooks have an annoying problem keeping their paint on and I'm getting tired of the bluetooth adapter sticking out the back.
The G5 would have the same problem if it was working on a dataset that was 1.5x the size of its physical memory.
MJC
The 30th International Symposium on Computer Architecture had an interesting panel discussion on benchmarking in industry and academia, with people like John Hennessy, Dave Patterson and Gurinder Sohi on stage. The conclusions: most benchmarking in industry, especially SPEC, is a pack of lies. And benchmark results published by academic researchers aren't much better. So, not really much point in losing a lot of sleep at least over their SPEC numbers.
One big advantage with SPEC is that there are rather detailed rules for the tools and setup you can use to report benchmark numbers.
One of those rules is that you can use a prerelease compiler, but it has to ship on your operating system within 6 months (or shorter, if they have updated it since last year...)
I second that motion. My main machine is a Powerbook G4 at 667Mhz from about 1.5 years ago. It works wonderfully for development work. I do heavy C++ and Java development on it and it performs beautifully.
I also second the paint problem.
There is also a bit of a 802.11b problem. For some dumb reason they ran the antennae horizontally in the base rather than up the side of the display like a self-respecting laptop would. This decreases the distance you can be from your WAP and still have a good connection.
I don't have a big problem with this as I am in a small apartment, but even through just 2 walls or so over a distance of 20 feet the quality will drop to under half.
Fabulous laptop though. Perfect for development work. Jaguar comes pre-installed with C, C++, Objective-C, Java, Perl, Python, Ruby...
Justin Dubs
The 970 (G5) being 64-bit just means it can handle larger integers. That's it. You can address >4 GB of RAM and you can express integers >4.3 billion. In general, 64-bit isn't faster than 32-bit unless you're specifically doing 64-bit math (which would have to be emulated on a 32-bit processor). In fact, it's often slower. If you're using 64-bit integers and you don't really need them, you're sucking up twice the memory bandwidth for no reason.
Many people have this idea that 64-bit processing is some kind of SIMD (like MMX, SSE, or AltiVec). It isn't. The 970 can't process two 32-bit integers with one instruction (unless you're using AltiVec, but we're talking about its 64-bit capabilities here). There is no reason to expect a 64-bit chip to be intrinsically faster than a 32-bit chip.
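For anyone wondering what "emulated on a 32-bit processor" looks like, here is a minimal sketch of a 64-bit add done with two 32-bit halves and a carry. The type u64_pair and the function names are invented for this example; split64 and join64 use the native 64-bit type only so the result can be checked.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit unsigned value held as two 32-bit halves, the way a CPU
   with only 32-bit integer registers would have to store it. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Add two 64-bit values using only 32-bit arithmetic:
   two adds plus a carry, where a 64-bit CPU needs one add. */
u64_pair add64_emulated(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;              /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);  /* wrapped if and only if the sum got smaller */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Helpers for checking only: convert to and from the native type. */
u64_pair split64(uint64_t v) {
    u64_pair p;
    p.lo = (uint32_t)(v & 0xFFFFFFFFu);
    p.hi = (uint32_t)(v >> 32);
    return p;
}

uint64_t join64(u64_pair p) {
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Small self-check: the carry out of the low half must reach the high half. */
void add64_selfcheck(void) {
    u64_pair r = add64_emulated(split64(0xFFFFFFFFu), split64(1u));
    assert(r.lo == 0 && r.hi == 1);
}
```

The extra add and carry step is the cost a 32-bit chip pays for 64-bit math; for ordinary 32-bit values there is nothing to gain.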
I just got a 450MHz G4 Cube (pre-owned, obviously).
I have used high-end workstation-class machines, both RISC and CISC, multi-GHz Intel machines, and Macs back to System 6. This Cube is without a doubt the best computer I have ever owned or used.
That having been said, I have seen Apple make some pretty serious hardware and customer service mistakes. I would buy another Mac in a heartbeat, but I would wait for these systems to ship for at least six months before buying one of them. Wait until you can check Mac help forums. Find out what the problems are, if any. You don't want to spend $3000 on a computer, and have the paint chip off.
If you fall off a building, go real limp, because maybe you'll look like a dummy and people will be like hey, free dummy
And while you're busy mulling that over in your mind, I agree, let's wait 'til a third party sees these, and can compare side by side.
"Only two things are infinite, the universe and human stupidity, and I'm not sure about the former."
As a computer science student one of the things they teach us is to evaluate performance (mostly of algorithms) in terms of n. If you have n items, and one algorithm takes n seconds and the other takes 2xn seconds to process them, both results are on the order of n or O(n). In order for there to be some significance to the difference in results, there must be some other factor of n (like log(n), n squared, n cubed, etc.).
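To spell out the claim above with the usual textbook definition (the constants c and n_0 are the standard ones, nothing specific to this thread):

```latex
f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0
\\[4pt]
2n \le 2 \cdot n \ \text{for all}\ n \ge 1 \ \Rightarrow\ 2n \in O(n) \quad (c = 2,\ n_0 = 1)
```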
So, for all intents and purposes the benchmarks, though interesting, are not really significant EXCEPT that it shows that:
Look, I love macs and have been an Apple fan since my //e, but as a computer professional, even I have to be dispassionate about these things.
Either that, or I'm just pissed because I can't afford a $3,000 computer when I just bought an iBook with my student loan.
Nitewing '98
Everything works...in theory.
That's not entirely true. I had an ADC student membership all through 2000, and I got OS X DP4. I didn't get DP1, DP2, or DP3 though. The student members get some of the seedings, just not most of them.
Well then, those results would be even better for Apple, since it is so "buggy and crappy". Care to elaborate?
Jeff
I hate it when people ask silly questions without reading the first thing about the story. Here is the quote to save you from scrolling back to the beginning: "Joswiak added that in the Intel modifications for the tests, they chose the option that provided higher scores for the Intel machine, not lower. The scores were higher under Linux than under Windows, and in the rate test, the scores were higher with hyperthreading disabled than enabled. He also said they would be happy to do the tests on Windows and with hyperthreading enabled, if people wanted it, as it would only make the G5 look better."
Interesting, when I took a course in Optimizing Compilers last year the consensus was that GCC is pretty awful when it comes to optimizations, even general non-architecture-dependent optimizations. The lecturer's reasoning behind it was twofold.
First, most research on compilers is being done at big corporations, IBM being the single largest as I understood it. Naturally they put their optimizations in their own compilers first; the rest of the world has to implement them from their papers (if they are lucky and the algorithms are not patented).
Second, if you were to put a good optimization in GCC it wouldn't take long before all other compilers had that optimization as well. GCC is OSS after all.
We did comparisons between GCC and SunCC on UltraSPARC. SunCC minimal optimizations (O1) beat GCC with maximum optimizations (O4).
I've just finished a course on vectorizing/parallelising compilers. There the situation is that even the best commercial compilers are pretty much equivalent to junk. Implementing the vector algorithms is a lot harder though. Even compared to complex SSA-form optimizations.
Get one of the physics guys to take some code and compile it on both platforms. We'll run the machines in a native mode. Use whatever compiler you want (although a standard compiler like gcc would be best) with all the optimization turned on for effect. Then crunch some big multi-gigabyte raw data files, like those generated by modern particle accelerators. Finally, feed the data into visualization utilities and display it.
Unfortunately, I'm no longer at a nuclear physics facility with access to this kind of data; otherwise, I'd do it myself. My 400Mhz P2 (linux box) used to take ~23 hours to make a first pass on a 2GB "raw" data file (which only represented 90 minutes of data btw). This will give you a real world feel for raw compute power and visualization power. If there is a significant difference, it should be obvious.
On the research level gcc is not as bleeding edge as other compilers. So if you run example code that shows the merits of a particular optimization, gcc may look not so good. But in practice, it's quite good.
My experiences with UltraSPARC are also a few years old, but gcc was faster and produced better code than Sun CC back then. You have to make sure to set -march=ultrasparc, of course. And I'm not sure about UltraSPARC but normally gcc -O4 does not do more than -O3, which basically is -O2 with function inlining. You can also get some boost with profile based optimization with gcc.
In summary, gcc produces very good code, but you might have to use some little known options for it. For example, gcc on Athlon XP and Pentium >= 3 may gain significant floating point performance with -mfpmath=sse,387 (I got >10% speed-up on lame, gcc's code was even faster than icc's with vectorizer). Another option worth knowing is -malign-double and the regparm attribute.
Another thing you have to keep in mind is: recent optimization advances normally are not big breakthroughs but small incremental advances. Many of them only help in a handful of special cases. gcc 3 has many more optimizations than gcc 2.95.3 and they were so proud of it that they said "much faster code on x86", and then there was whining and gnashing of teeth when most software was unaffected or even slower.
The only platform where I really would prefer the vendor cc is HP-UX on PA-RISC. The HP CC consistently produced 10-30% faster code than gcc (although that may have changed, I haven't used gcc > 2.7 on HP-UX).
Admittedly, this was when the PowerPC was pretty new, and the choices were the IBM/AIX compiler, which was robust and produced fast code but required an AIX box in addition to a Power Mac, or the nascent Metrowerks CodeWarrior compiler, which ran natively on the Power Mac but generated poorly optimized code.
If I recall my history timeline correctly, after CodeWarrior came
- the Apple MPW "MrC" compiler (better code than CodeWarrior 1.0, but with a wacky command-line "IDE"), then
- gcc for PowerPC (cruddy code back then), then
- the Motorola PowerPC compiler (better code than Apple's compiler, with NO IDE - it plugged into the CodeWarrior or MPW IDE).
- Then Motorola inexplicably stopped selling their compiler.
- Later Motorola bought Metrowerks.
- Somewhere along the line, gcc learned to generate better PowerPC code.
- Eventually, Apple pretty much shelved their "MrC" compiler, and settled on using gcc for Mac OS X
- Monday, Apple released their "Xcode" environment -- still using gcc, I believe.
Apple's MPW tools are still available (free) here for Mac OS 7/8/9. The new Mac OS X tools including Xcode are available here. As a side note, it's really nice to see Apple giving away a full development suite for free, and continuing to put development time and effort into improving it.
-Mark
Apple has CodeWarrior, which is better than their native GCC by about a magnitude of 20. Had they used that, their code would have been as good as, if not faster than, the VC++ stuff.
He's actually right about the compiler hurting them. =p
That's right, I said Dell
http://www.theregister.co.uk/content/39/31416.html
According to Joswiak, HT was disabled in the SPECint and SPECfp base tests because it yielded higher scores than when HT was enabled. VeriTest did keep HT switched on when it performed its SPECint and SPECfp rate tests.
Indeed, a number of Register readers have pointed out a report on Dell's web site that supports Joswiak's claim. Essentially, it says HT is good for server applications, but less well suited to compute-intensive apps. It uses SPEC CPU 2000 as an example of such an application, and found a "system performance decreased 6-9 per cent on the CPU 2000 speed tests and decreased 27-37 per cent on the CPU 2000 throughput tests" with HT enabled.
DELL's own comments on SPEC benchmarks and turning off hyperthreading for best results:
http://www.dell.com/us/en/biz/topics/power_ps3q02-khalid.htm
I'm at WWDC right now and posting this comment from Safari running on a G5. I don't care what any of the benchmarks say -- this machine screams from a user's point-of-view.
No matter what I throw at it, I can't get either one of the CPUs above 50%.
http://arstechnica.infopop.net/OpenTopic/page?a
...but what does it say when a new IBM 2GHz chip meets or exceeds the execution speed and power of the Intel top of the line 3GHz chip? What happens when the G5 hits 3GHz next year?
I don't know Intel's roadmap, but they gotta be sweating a bit. Is there any doubt of the benchmark outcome when the GHz are equal?