chip+guy · Slashdot Mirror

Re:Coming to a computer near you... on Intel Cuts Back on 820 Chipset Manufacturing · 1999-09-21 17:06 · Score: 1

DRDRAM for servers? Even Intel looked into this and rejected it in favour of DRAM and eventually DDR (check out the 460GX chipset for Merced :). Server vendors are most interested in two things: cost per gigabyte and, to a lesser extent, latency. The higher device bandwidth of DRDRAM is a non-issue for server guys - they will build as wide a memory or interleave as much as necessary to get the bandwidth they want. Servers today often have 512 or 1024 bit wide DRAM arrays. DRDRAM is cursed with finicky PWB layout and parametric requirements, and a single channel can only handle 32 devices. Server size memories need either multiple memory controller ASICs, each handling 2 or 4 DRDRAM channels, or a hierarchical memory design with fan-out repeater chips, which add to the already miserable Rambus latency. Both of these approaches are logistical nightmares, with complex PWB routing issues, cooling, increased physical board area consumed, and greater time of flight from long signal traces. Heck, mainframe guys wish the world had stayed with EDO :)

Re:The cost will come down on Intel Cuts Back on 820 Chipset Manufacturing · 1999-09-21 16:38 · Score: 3

Moore's law isn't the issue. The high cost problems for DRDRAM are fivefold. First, the Rambus access cell (RAC) stuck onto a normal DRAM core, plus the necessary changes to the DRAM core itself, adds about 25% to die area. What's more, the RAC doesn't scale down with the rest of the DRAM when going to a smaller feature size process. Second, even in 0.22 um DRAM processes the AC functional yield of DRDRAM is about 30%. This means that out of 100 DRDRAM parts made that have all bits functional, only 30 can run at 800 Mbps. The others have to be binned to 600 or 700 Mbps speed grade parts, which no system house will touch with a ten foot pole (lower performance than PC100 SDRAM). Third, these parts need *very* expensive production testers AND these testers can only test up to 16 parts at a time compared to 64 for an SDRAM tester. Fourth, DRDRAM compatible motherboards and memory cards must be made with more expensive impedance controlled PWB technology. Finally, every DRDRAM and DRDRAM compatible chipset sold pays a small but significant royalty to Rambus Inc. Some of these factors will lessen over time. But the 64 Mbit question is why would anyone pay 10 or 20% more (let alone the 50 to 100% seen now) for memory devices with significantly longer latency, thermal management problems, and PWB design headaches that *YET* offer little or no system level performance advantage over PC100 SDRAM (and evidence exists to show that for many apps DRDRAM is actually *slower* than PC100).
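
To put rough numbers on the first three factors, here is a minimal Python sketch that uses only the figures quoted above (25% die area, 30% yield at 800 Mbps, 16 vs 64 parts per tester). The function names are made up for illustration, and the way the factors are combined is an assumption: the last function treats the slower speed bins as unsellable, which is a worst case.

```python
# Back-of-the-envelope DRDRAM cost multipliers. The input figures come from the
# comment above; how they are combined is an assumption of this sketch.

DIE_AREA_OVERHEAD = 0.25    # RAC plus core changes add about 25% die area
AC_YIELD_800 = 0.30         # share of fully functional parts that bin at 800 Mbps
DRDRAM_TESTER_SITES = 16    # parts tested at a time on a DRDRAM tester
SDRAM_TESTER_SITES = 64     # parts tested at a time on an SDRAM tester


def die_area_factor(overhead: float = DIE_AREA_OVERHEAD) -> float:
    """Silicon area per part relative to a plain DRAM of the same capacity."""
    return 1.0 + overhead


def parts_per_top_bin_part(ac_yield: float = AC_YIELD_800) -> float:
    """Functional parts that must be built, on average, per 800 Mbps part."""
    return 1.0 / ac_yield


def tester_passes_factor(drdram_sites: int = DRDRAM_TESTER_SITES,
                         sdram_sites: int = SDRAM_TESTER_SITES) -> float:
    """Tester passes needed per part relative to SDRAM."""
    return sdram_sites / drdram_sites


def silicon_per_top_bin_part() -> float:
    """Relative silicon per 800 Mbps part, ASSUMING slower bins are worthless."""
    return die_area_factor() * parts_per_top_bin_part()


if __name__ == "__main__":
    print(f"die area factor:           {die_area_factor():.2f}x")
    print(f"parts built per 800 part:  {parts_per_top_bin_part():.2f}")
    print(f"tester passes vs SDRAM:    {tester_passes_factor():.1f}x")
    print(f"silicon per 800 Mbps part: {silicon_per_top_bin_part():.2f}x")
```

Tester price, board cost, and royalties are left out because the comment gives no figures for them.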

Re:Hidden instructions on Zilog (re-)introduces the Z80 · 1999-09-20 04:19 · Score: 1

The hidden instructions would work only if Zilog kept the *exact* same microarchitecture and hardwired the instruction decode logic and state machine implementations the same way. I would guess they would clean up the design and do more things in parallel. After all, double pumping a 4 bit ALU to implement byte wide instructions is hardly required in a Z80 core implemented in 0.35 um CMOS or whatever.

Re:If only... on Motorola G5 - 2Ghz 64bit · 1999-09-19 18:13 · Score: 1

Well, Motorola will certainly need every minute of those two years to achieve their goal. A 2 GHz G5 represents a factor of four increase in clock rate for the PPC family in 24 months. Sorry, but semiconductor technology *might* provide a factor of two on its own. This means that the G5 will have to be a very deeply pipelined design. This is a completely radical departure for PPC designers and I would anticipate both massively slipped schedules and disappointing results. Even if Motorola had experience with the tools and techniques for this style of design, I would expect the curse of complex CPU design to apply like it did for the 21264, Merced, and UltraSPARC-III. Don't hold your breath for either the delivery date or the 2 GHz clock rate. The other thing is I doubt Mot would invest the huge resources in the G5 unless it was also well suited for high end embedded control (would you bet a few hundred million bucks on Steve Jobs being consistent? Remember when Apple was the blue chip customer for the StrongARM? :) This means the G5 is unlikely to be as bleeding edge in complexity, bandwidth, and parallel execution resources as IA-64 and Alphas of the same era and will do less work per clock cycle.
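
The clock rate arithmetic can be spelled out in a few lines of Python. This is only a sketch: the 2 GHz target, the factor of four, and the factor of two from process are the figures quoted above, while the pipeline estimate assumes clock rate scales linearly with stage count and ignores latch overhead, so it is a lower bound at best. The five stage baseline in the usage lines is the PPC pipe depth mentioned elsewhere in this thread.

```python
import math

TARGET_CLOCK_MHZ = 2000.0    # the announced 2 GHz G5
CLAIMED_CLOCK_FACTOR = 4.0   # increase over today's PPC parts in 24 months
PROCESS_CLOCK_FACTOR = 2.0   # what process scaling *might* deliver on its own


def implied_current_clock_mhz() -> float:
    """Clock rate today's parts must run at for the factor of four to hold."""
    return TARGET_CLOCK_MHZ / CLAIMED_CLOCK_FACTOR


def design_factor_needed() -> float:
    """Clock rate gain that has to come from the design, not the process."""
    return CLAIMED_CLOCK_FACTOR / PROCESS_CLOCK_FACTOR


def min_pipeline_stages(base_stages: int) -> int:
    """Stage count needed, ASSUMING clock scales linearly with pipeline depth."""
    return math.ceil(base_stages * design_factor_needed())


if __name__ == "__main__":
    print(implied_current_clock_mhz())  # implied clock of today's parts, MHz
    print(design_factor_needed())       # factor left for the microarchitecture
    print(min_pipeline_stages(5))       # rough depth starting from a 5 stage pipe
```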

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 15:44 · Score: 1

Hobbled? AMD is shipping production quantities of 700 MHz K7's to distributors as we speak. The G4 is built in a better process (0.22 um CMOS+Cu vs 0.25 um CMOS+Al) yet is unlikely to go beyond 500 MHz without a shrink. And the K7/650 yields 63% higher SPECint95 and 24% higher SPECfp95 numbers than the G4/400 despite its smaller L2 cache and lack of an Intel style compiler tweaking tiger team. Not bad for inferior silicon technology and the massive deadweight of x86 ;-)

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 06:08 · Score: 1

Sorry but the facts are against you. The 604 could issue four instructions per cycle. The 603 could issue two instructions per cycle. The G3 and G4 can issue three instructions per cycle but only if one of the three is a branch. Thus the G3 and G4 stick with the same 64 bit, two instruction dispatch bus from the fetch/branch unit to the rest of the execution units as the 603 rather than the 128 bit, four instruction wide bus found in the 604. Thus the G3 and G4 are based much more closely on the 603 than the 604 (as far as these things go).

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 03:04 · Score: 1

Please distinguish between ISA and chip design. The x86 instruction set may be antediluvian but the microarchitecture of the K7 is state of the art (could we expect less from Dirk Meyer and Co.?). And as I said before, the G3 and G4 are fine embedded control MPUs and this wasn't serendipity at work either. When Apple twisted the knife in the Mac clone makers (and screwed Mot's own ambitions in this area at great cost $) both IBM and Motorola decided that PPC had to have a more stable market than Apple and that it was embedded control. PPC is going gangbusters in embedded control and I wouldn't be surprised if they come in third behind MIPS and ARM in unit volume for 1999. They rule the VME RISC world and both the regular PPC parts and the integrated 860 style devices are hot in telecom. Altivec is powerful and great for DSP work and other hand-coded computational kernels but show me a production C or FORTRAN compiler that can effectively use it.

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 02:42 · Score: 1

First of all, the Alpha EV6 has a 7 stage pipeline, not 15. Secondly, it has a branch misprediction penalty of 7 or 8 cycles, not 200. Thirdly, you obviously don't have a clue what "superpipelining" is. And the EV6 can issue four instructions per clock (including up to four integer) while the G3 and G4 can only issue three (and one must be a branch) so you tell me which is more superscalar. Finally, am I a buddy of Grove's? Nope. In fact the only reason I wouldn't p*ss on Andy Grove if I had the chance would be if he was on fire. BTW, what the heck is a 20164?

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 02:25 · Score: 1

Well the 21264 actually has 7 pipe stages for non-FP instructions. BTW, the G4 uses 4 pipe stages for integer instructions.

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 00:35 · Score: 1

You Apple supporters just don't get it, do you? Saying that PPC beats x86 etc at the same speed is a meaningless statement. The five stage pipe used by PPC has to stuff a lot more work between clock edges so it will never catch up in clock rate to a modern microarchitecture desktop CPU. The G4 uses a nice'n'sexy 0.22 um copper process just to go 400 to 500 MHz. The Alpha EV6 can run up to 600 MHz in a 0.35 um aluminum process and over 800 MHz in a 0.25 um aluminum process. Yet the EV6 has the most complex microarchitecture ever shipped (can run with up to 80 instructions simultaneously in flight in its out of order execution core). It is not just the CPU either. Steve Jobs can blather on and on about Gflops all he wants but Apple builds mediocre chipsets to go with their reasonably nice processors. Look at McCalpin's STREAM benchmark scores for effective system memory bandwidth. The best Apple score is a G3/266 system with a STREAM SCALE score of 128 Mbyte/sec. A 440BX system with a Pentium II/350 gets 279 Mbyte/sec (and an Alpha EV6/500 XP1000 gets 971 Mbyte/sec). If Jobs wants a G4 "supercomputer" he had better hire some chipset designers away from Intel. BTW, I won't exactly hold my breath waiting for an Apple G4 system to show up on Jack Dongarra's Linpack list with 1000+ MFLOP/s in the 100x100 column. ;-)
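
For scale, the three STREAM SCALE scores quoted above can be turned into ratios against the Apple system. A minimal Python sketch follows; the scores are the ones in the comment, and the dictionary keys and function name are just illustrative labels.

```python
# STREAM SCALE scores in Mbyte/sec, as quoted in the comment above.
STREAM_SCALE_MBYTE_PER_SEC = {
    "Apple G3/266": 128.0,
    "Pentium II/350 (440BX)": 279.0,
    "Alpha EV6/500 (XP1000)": 971.0,
}


def bandwidth_ratios(scores: dict, baseline: str) -> dict:
    """Each system's memory bandwidth as a multiple of the baseline system's."""
    base = scores[baseline]
    return {name: score / base for name, score in scores.items()}


if __name__ == "__main__":
    ratios = bandwidth_ratios(STREAM_SCALE_MBYTE_PER_SEC, "Apple G3/266")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

That works out to roughly 2.2x for the 440BX box and about 7.6x for the XP1000.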

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-15 00:02 · Score: 1

The G4 may very well have targeted the 604 but I am talking about what is under the hood. The G4 and G3 internally resemble the 603e more than the 604. Motorola rates the G4 at 18/18 SPEC95 at 400 MHz with a honking large L2. The K7 gets 27/22 at 600 MHz. Please no comments about "per MHz performance". The G4 uses an ancient five stage pipe and will always have a substantial clock rate penalty in a given technology compared to more modern designs like P6, K7, and Alpha. Altivec is a well done SIMD extension but like all SIMD extensions, it will have little usefulness for most applications. You are correct about the relatively low power consumption of PPCs. But this is an issue only for laptops etc. The K7 is circa 50 Watts which is not difficult to deal with in a properly designed system. Current EV6 desktop systems dissipate up to 109 Watts per CPU without heroic effort and even the Exponential PPC based desktop that Apple canned years ago handled a 70+ Watt CPU.

Re:Somebody buy those nice people at Motorola a be on PowerPC Processor Roadmap · 1999-09-14 22:37 · Score: 3

The PowerPC family hasn't been maintained as a general purpose desktop and server RISC ever since the Somerset joint design center blew it keeping the 604 on the performance curve and whiffed on the 620. The PPC 750 (G3, or Arthur) is a glorified 603e tweaked to run Mac code better. The G4 is nothing more than a G3 with slightly improved FPU and the Altivec extensions. The G4 is a slick chip for high end embedded control and digital signal processing. But PowerPC hasn't kept pace with microarchitecture developments in the x86 world let alone with its RISC brethren. It has ridiculously short pipelines and rather modest out of order execution resources. The roadmap shows the G4 hitting 1 GHz in a 0.15 um copper process. I should bloody well hope so - the Alpha EV68 and AMD K7 will likely exceed 1.5 GHz in a 0.18 um aluminum process. The writing is on the wall for Apple. The PowerPC has gone embedded control. Neither Mot nor IBM want to pretend to compete with Intel for the desktop any more (although IBM is doing some interesting 64 bit Power chips such as Northstar and Pulsar that compete against Xeon in the server market). Mot didn't build Altivec into G4 for Steve Jobs' ego. It is there to win sockets in future generations of base stations for wireless services. As the differences between the bleeding edge desktop market and where PowerPC is heading become more and more evident, Apple either has to start building products that look less like desktop PCs and more like internet appliances and PDAs etc or choose a new processor family.

Re:can anyone say "about fr*gging time?!?" on Socket Athlons by early next year? · 1999-09-06 20:09 · Score: 1

Sorry but 8 Mbyte is too much on chip cache to be credible. The 1.5 Mbyte cache in the 0.25 um PA-8500 takes up over 300 mm2. Even in 0.18 um an 8 Mbyte cache would take up ~900 mm2 or two times bigger than can fit on a stepper. The source might have meant an 8 Mbit L2 cache. Adding that size L2 to the K7 cache and shrinking it to 0.18 um would still produce a big chip for mainstream PCs but at least it is in the realm of possibility.
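
The area estimate above can be reproduced with a short Python sketch. It takes the PA-8500 figures quoted in the comment (1.5 Mbyte in about 300 mm2 at 0.25 um) and assumes an ideal quadratic shrink with feature size, which is optimistic for real SRAM arrays. The function name and parameters are made up for illustration.

```python
def scaled_cache_area_mm2(ref_area_mm2: float, ref_mbytes: float,
                          ref_feature_um: float, target_mbytes: float,
                          target_feature_um: float) -> float:
    """Estimate cache area, ASSUMING area scales with feature size squared."""
    area_per_mbyte = ref_area_mm2 / ref_mbytes
    shrink = (target_feature_um / ref_feature_um) ** 2
    return area_per_mbyte * target_mbytes * shrink


if __name__ == "__main__":
    # Reference point from the comment: PA-8500, 1.5 Mbyte, ~300 mm2, 0.25 um.
    eight_mbyte = scaled_cache_area_mm2(300.0, 1.5, 0.25, 8.0, 0.18)
    eight_mbit = scaled_cache_area_mm2(300.0, 1.5, 0.25, 1.0, 0.18)
    print(f"8 Mbyte at 0.18 um: {eight_mbyte:.0f} mm2")
    print(f"8 Mbit  at 0.18 um: {eight_mbit:.0f} mm2")
```

Under that assumption 8 Mbyte lands a bit above 800 mm2, in line with the ~900 mm2 figure once you allow for "over 300 mm2" as the starting point, while 8 Mbit comes out near 100 mm2.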

Comments · 38