HP Shows Off PA-8800 SMP-On-A-Chip CPU Plans
Eric^2 writes: "At last week's MicroProcessor Forum, HP's David J. C. Johnson unveiled the details of HP's latest RISC processor destined to redefine performance in Server-Class processors. Following a relatively simple strategy, the PA-8800 processor combines two PA-8700 cores on a single chip to enable symmetric multiprocessing (SMP) on a single processor. Aside from bumping the core speed up to an initial 1 GHz, enhancements include the addition of combined 35 MB L1+L2 cache. The article contains the full text. AMD, please steal an idea..."
Why hasn't someone else done something like this? I would pay whatever it cost to get even an 8MB L1 & L2 Cache. Anyone want to make me one?
Um, this is my sig.
It wouldn't be stealing an idea. This idea has been around for a long time in academia. Maybe the poster forgot this, but the POWER4 from IBM does this, and comes with 32Mb of L3 cache, plus an ondie shared L2 cache. The idea isn't new, it's known as CMP (Chip-level Multi-Processing). Really, "SMP on a chip" is merely called CMP.
:)
Also, though Sun has decided not to use the MAJC architecture for anything (they were hoping to try to get it to become a video-accelerator, but that's not even going to happen, most likely), that too was fully spec'ed out to have multiple cores on a chip...it's really nothign new
The longstanding rumour is that AMD will be coming out with a dual-hammer processor (ie, CMP). In academia, the idea has been used frequently as well.
The idea of using CMP isn't even that big a deal to most consumers. While it would be nice for AMD to come out with a chip that does multithreading (merely because it increases real-world throughput quite a bit, depending upon the type of multithreading), the average PC running windows 9x/ME/ XP Consumer won't be able to multithread anyway. The only reason for AMD to multithread is for the server-space, which is what they're aiming for with the hammer series...but I digress.
The most interesting parallel architecture I heard about at the MPF was Siroyan's OneDSP architecture. This is a clustered VLIW machine that can execute up to 64 instructions each cycle! See the EE times article and their MPF paper
Earlier steps in the multi-CPU direction included the 8-way DEC Alpha (killed in the merger with HP?) and a little National Semiconductor product for embedded systems with two very modest CPUs on a chip.
Risc or cisc architecture primarily affect the complexity of the fetch and decode stages of the CPU.
The famous Intel-pipeline is in the execute stage (ALU).
Pipelining is a strategy which is equally valid for both risc as in cisc architectures, and a risc architecture do not offer any complexity advantage in the execute stage. After all a multiplier is a multiplier regardless of overlaying architecture.
Nowdays we don't really see much diffrence in performance between risc and cisc architecures for upscale processors. This is because the savings in fetch and decode logic are dwarfed by other costs like prefetch, reordering and brach prediction (which are used for both architectures).
It doesn't seem too practical to me. Most apps don't benefit greatly from SMP anyway.
They don't? What kind of server do you run? Most all pieces of production-class server software that I know of benefit from multiple processes. Look at Apache, forking off five, ten, or even more processes to handle requests. MySQL, I believe, uses threads. PostgreSQL forks off a new backend for each connection. Shoot, even your telnet, ftp, ssh, and mail daemons will fork off for each connection, allowing you to take advantage of more than one CPU.
If you're sitting at home working on a spreadsheet, you're right, SMP isn't for you - and this machine isn't targetted at you. When you're running a server that may have tens, hundreds, or thousands of SIMULTANEOUS processes fighting for CPU time, every processor counts.
And, to make things even better, even if you're only running a single, non-threaded process, having two processors still makes the machine much more "responsive", as the second CPU can handle kernel code for file IO, network code, interrupt handling, writing to logs, and a lot of other tasks. Ever seen how much CPU time even syslog can chew up?
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
Bullshit. No we haven't. We're still right here.
Interesting, but the PA-8800 is a PA-RISC processor, not an IA-64/IPF processor. It doesn't have predicated execution or instruction bundles. What you said is true, just for a different instruction set architecture and processor family.