ohmygod2 wrote to us with a story from SF Gate that Apple, unsurprisingly, is going to be one of the purchasers of IBM's PowerPC 970. At this time, though, it's unclear where Apple is going to actually *use* said chip.
Update: 10/14 15:53 GMT by H: Follow-up to Tim's story.
This is an interesting story:
:)
The 970 is a derivative of the Power4 chip (with what I assume to be the AltiVec extensions). These run in the 1.6-2.0 GHz range, as a RISC chip working in 64-bit chunks.
Granted, I am unsure as of yet whether Darwin runs 64-bit natively, but when it does, imagine a dual-processor machine built on these (with, of course, Quartz Extreme pushing all of the video over to the graphics processor).
Maybe I am getting my hopes up, but this is what I have been waiting for. New Macintosh, here I come.
Blah Blah Blah.
http://www.eet.com/semi/news/OEG20021014S0059
Essentially a derivative of the company's Power4 microprocessor, IBM's PowerPC 970 adds 64-bit PowerPC compatibility, an implementation of the Altivec multimedia instruction-set extensions and a fast processor bus supporting up to 16-way symmetric multiprocessing.
I hope they use a memory controller that does at least DDR 333.
The law is a weapon of the government, not a protection for the likes of you. Surely you understand that.
Keep in mind most of these articles are coming from the BusinessWeek article, or an IBM press release. In the IBM release, *nothing* about a real shipping date was stated. What was stated was "Second Half of 2003".
As for the GHz issue, the chip does more per-clock than the P4. This means that it can still be competitive. Just wait another day for the MPF, and maybe we'll be able to see some initial SPEC numbers.
I think you'll be pleasantly surprised.
Blocklevel: Practical Information Architecture
When Apple started selling FireWire-based Macs, Intel immediately tried to marginalize it by saying that the technology only appealed to a niche of consumers, and oh-by-the-way here's our specs for ATA/66 and USB 2.0 (for which the detailed specs hadn't been finalized, and which didn't start hitting mainstream systems until some 2 years later).
Intel takes seriously Andy Grove's words about only the paranoid surviving.
From the article:
"Critics -- notably Intel -- argue that most desktop users have no need for 64-bit processing. In fact, Microsoft Corp. has yet to release a 64-bit version of Windows that will run on AMD's Hammer chips."
Is it any wonder, given they just lost their defense against Intergraph's patent lawsuit which may result in them not being able to release the Itanium series?
Hey, Intel, last I checked, no one had a use for 32-bit processing or 640K of RAM on the desktop, either.</sarcasm>
the P4 executes 1 instruction per cycle. the G4 does 3 (the basis of apple's "megahertz myth" myth), so this is a huge step up.
as for the laptop part, hell yeah. my tibook by the end of 2003 should be nearing the end of its "useful lifespan" - whatever that is, and i'll probably sell it for half of what i bought it for then and buy the latest, greatest "G5" laptop once it's available. that's the plan, at least. i'm in college after all.... and apple has a tendency to take forever to release a new laptop based on a new processor design.
moox. for a new generation.
I agree with you, but I hope you're not confusing instructions per cycle with length of the pipeline.
The P4 processes instructions in a pipeline. The pipeline can contain 20 instructions at any one time, but each instruction is only finished once it exits the pipeline.
Same goes for the 970, I'd imagine.
To truly increase instructions per cycle, you have to add extra pipelines (and a lot of extra circuits to prevent instructions from stepping on each other)
If pipelines were always full, and all instructions were equivalent, the P4 would beat the pants off of the 970. But the pipeline is not always full because instructions often depend on the results of other instructions, and not all operations are equal in their requirements.
So shorter pipelines often handle instruction dependencies better, resulting in better performance, while (for other reasons) longer pipelines are easier to design for higher GHz.
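For what it's worth, the trade-off can be put into a toy model. This is only a rough sketch, not a simulation of the P4 or the 970: the function names, the `stall_fraction` knob, and the 20-stage/3.0 GHz and 8-stage/1.8 GHz figures are all made-up assumptions for illustration.

```python
def cycles_needed(n_instructions, depth, dependent_fraction, stall_fraction=0.5):
    """Cycles for a toy single-issue, in-order pipeline.

    Assumption (not taken from any real chip): an instruction that depends
    on the previous one stalls for stall_fraction * depth cycles.
    """
    fill = depth - 1  # cycles before the first instruction exits the pipeline
    stall_per_dependency = stall_fraction * depth
    stalls = n_instructions * dependent_fraction * stall_per_dependency
    return fill + n_instructions + stalls


def runtime_ns(n_instructions, depth, clock_ghz, dependent_fraction):
    """Wall-clock time in nanoseconds (a 1 GHz clock gives 1 cycle per ns)."""
    return cycles_needed(n_instructions, depth, dependent_fraction) / clock_ghz


# Hypothetical designs: a deep, fast-clocked pipeline and a shallow, slower one.
DEEP = {"depth": 20, "clock_ghz": 3.0}
SHALLOW = {"depth": 8, "clock_ghz": 1.8}

if __name__ == "__main__":
    for dependent_fraction in (0.0, 0.3):
        deep = runtime_ns(1000, dependent_fraction=dependent_fraction, **DEEP)
        shallow = runtime_ns(1000, dependent_fraction=dependent_fraction, **SHALLOW)
        print(f"dependent={dependent_fraction:.1f}  "
              f"deep={deep:.0f} ns  shallow={shallow:.0f} ns")
```

With no dependencies the deep, high-clock design finishes first. With 30% dependent instructions, under these invented numbers, the shallow one pulls ahead.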
Also, notice that according to the press release, the 970 will be the single-core version of the Power4, so you should look at the green box closer to Sun's suckers, not at the orange one. The press release also mentions an "economy version" of the Power4, so it may be even slower.
MSDOS: 20+ years without remote hole in the default install
Ahem. 128MB L3 cache (on the POWER4 in the benchmark)? Daaaamn. I'm not saying that a fat L3 cache has anything to do with SPEC benchmarks (I'm guessing it doesn't), I'm just making an observation: that's a lot of cache! And it's probably bloody expensive to get 128MB of cache-speed memory. HP's comments allude to that but it also has 64GB of RAM so it's sort of a straw man ("let's overconfigure a system and then make fun of how overpriced it is").
I think it's quite silly of HP to say that "IBM's Power4 architecture is outclassed in performance". Really? A 10% difference qualifies as outclassed? I don't agree. And the POWER4's SPECint score is better. "Outclassed?" Hah.
Of course the proof is in the pudding. Let's see what actually hits the streets. Apple has now been "just around the corner from really kicking Wintel's butt" performance-wise for about 8 or 9 years, but it has yet to happen. We were all led to believe that the PPC would blow away x86 and that never happened. With luck, IBM will actually deliver a really kick-ass CPU at clock speeds close to the x86 family, and the superior per-clock performance will actually make it faster. But there would still be the question of price/performance. If Joe PC Buyer can buy a faster PC for the price of a Mac, it doesn't matter that the Mac runs cooler, or at a lower clock speed, or in 64-bit mode. Joe will just say "my $500 PC is faster than your $1500 Mac, end of story". And he will have a good point. Until that changes, the only people who care are the people who are willing to pay a premium for OS X and the Mac experience, and people who need something faster than the fastest desktop PC but still want a user-friendly mainstream desktop OS. The folks who use Office and Outlook all day won't be able to justify the extra $1000 or whatever it would cost to get a Mac that performs similarly.
I'd also like to remind everybody that benchmarks don't necessarily reflect real-world performance. This is a very synthetic benchmark that is great for telling you what the best-case raw CPU performance of these CPUs can be, but it doesn't prove that $REAL_APP will see those performance gains over older CPUs.
In particular it's not clear what the performance cost would be of using code compiled for a PPC604 vs. using code compiled with the very best compiler for the POWER4. I'm sure that Steve Jobs will crow about another highly-optimized Photoshop benchmark that we can all wish represented overall system performance, but it doesn't. That said, I imagine that the really important professional creative apps (you know, the ones that cost thousands of dollars per seat and really beat the @%$%@$% out of the CPU) will be quickly updated for the new CPUs because their customers will demand it. (To be fair, the same is true for the Itanium.)
And IBM said no one needed the power of the 80386. Then Compaq released their 386 monster and IBM stopped mattering in the PC world.
The difference is that we have had plenty of 64 bit processors aimed at the lower end and they just don't work. It is too expensive to bring in 64 bits from RAM to cache when the average variable has less than 8 significant bits. Hence the packed words of VLIW Itanium.
Back when my job description included developing code for the Alpha and the Pentium, just paging in the larger 64 bit code killed the speed advantage of the Alpha chip.
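To put a rough number on the memory-traffic point, here is a small sketch. It assumes the worst case where every value really is widened to 64 bits (a real compiler need not do that), and the 64-byte cache line and the helper name `cache_lines_needed` are assumptions for illustration only.

```python
import struct

CACHE_LINE_BYTES = 64  # assumed cache line size, purely illustrative


def cache_lines_needed(n_values, format_char):
    """Cache lines touched when n_values are stored back to back.

    format_char is a struct format code, e.g. 'b' (8-bit) or 'q' (64-bit).
    The '<' prefix selects struct's fixed standard sizes.
    """
    size = struct.calcsize("<" + format_char)
    total_bytes = n_values * size
    return -(-total_bytes // CACHE_LINE_BYTES)  # ceiling division


if __name__ == "__main__":
    for label, code in (("8-bit", "b"), ("32-bit", "i"), ("64-bit", "q")):
        lines = cache_lines_needed(1000, code)
        print(f"1000 values stored as {label}: {lines} cache lines")
```

The same 1000 small values take eight times as many cache lines when each one sits in a 64-bit slot instead of an 8-bit one, which is the cost being described here.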
By "in flight" I'm assuming you mean "in some stage of the processing pipeline at any given moment" - I believe the P4 has something like a 20-stage pipeline; the G3/G4, I believe, is more along the lines of an 8-stage pipeline, if memory serves.
Part of what's at stake here is how many instructions are decoded/dispatched each clock cycle and then other factors like branch-prediction and such muddy the waters a bit more. In the end, the 'instructions per cycle' is really more of an average than anything else, as not every instruction will be a candidate for sending through the parallel functional units, etc. Taking into account the efficiency of the branch-prediction unit is important, too, since you could take a wrong turn and have to clear out all your functional units, at every stage of the pipeline and start over again, in certain circumstances. The fewer times this happens, the more effective your CPU will be at pushing the bits around.
Bottom line: modern processor mechanics are far more sophisticated than can be easily summarized by any one number or neat phrase. Just ask AMD about that one.
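The "more of an average" point about instructions per cycle can be sketched with the usual textbook approximation, where each mispredicted branch costs a pipeline flush. The function name and every number below are hypothetical, chosen only to show the shape of the effect, not measured from any real CPU.

```python
def effective_ipc(peak_ipc, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average instructions per cycle once branch mispredictions are counted.

    Textbook approximation:
        CPI = 1/peak_ipc + branch_fraction * mispredict_rate * flush_penalty
    and IPC is the reciprocal of CPI.
    """
    base_cpi = 1.0 / peak_ipc
    penalty_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return 1.0 / (base_cpi + penalty_cpi)


if __name__ == "__main__":
    # Hypothetical workload: 20% branches, peak of 3 instructions per cycle.
    for penalty in (8, 20):  # flush cost roughly tracks pipeline depth
        for miss in (0.0, 0.05, 0.10):
            ipc = effective_ipc(3, 0.2, miss, penalty)
            print(f"penalty={penalty:2d} cycles  mispredict={miss:.2f}  IPC={ipc:.2f}")
```

Even a small mispredict rate pulls the average well below the peak figure, and the drop is steeper when the flush penalty (loosely, the pipeline depth) is larger.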