The CPU Redefined: AMD Torrenza and Intel CSI

Re:huh? by Chrisq · 2007-03-05 01:09 · Score: 3, Interesting

I think it has to do with the number of configuration options. Even if technology were able to fabricate one super chip, a chip with the best possible GPU and sound processor might be great for some people, but others would be better off with extra general purpose cores, cache, etc. The flexibility of "mix and match" probably outweighs the advantages of having the separate components on a single chip.

Re:huh? by MrFlibbs · 2007-03-05 01:18 · Score: 4, Interesting

The CPUs will still be multi-core. They will also integrate as many features as makes sense. However, there are limits on how big the die can be and remain feasible for high volume manufacturing. Using an external co-processor is both more flexible and more powerful.

The interesting thing about this whole co-processor approach is that the same interface used to connect multiple CPUs to each other is being opened up for other processing devices. This makes it possible to mix and match cores as desired. For example, you could build a mesh of multi-core CPUs in a more "normal" configuration, or you could mate each CPU with a DSP-like number cruncher and make a special purpose "supercomputer". It will be interesting to see what types of compute beasts will emerge from this.

Interesting by Aladrin · 2007-03-05 01:20 · Score: 2, Interesting

I find the idea of multiple Processing Unit slots on the motherboard that can each take different types of chips to be very interesting. I'm not sure how well it will work, though. The article mentions 5 types that already exist: CPU, GPU, APU, PPU and AIPU. (Okay, the last doesn't exist yet, but a company is working on it.) There's only 4 slots on that motherboard that's shown. I definitely do NOT want to see a situation where the common user is considering ripping out his AIPU for a while and using a PPU, then switching back later. I can only imagine the tech support nightmares that will cause.

So the options are to have more slots, or make something I like to call an 'interface card'. See, there'll be these slots on the motherboard that cards fit into... wait, don't we have this already?

And more slots isn't really an option because the computer would end up being massive with all the cooling fans and memory slots. (Which are apparently separate for each PU.)

I kind of hope I get proven wrong on this one, but I don't think this is such a great idea. Just very interesting. Having 16 slots and being able to say you want 4 AIPUs, an APU, 4 GPUs, 3 PPUs, and 4 CPUs on my gaming rig and 1 GPU, 1 APU, and 14 CPUs on my work rig would be awesome.

--
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM

Re:Interesting by eddy · 2007-03-05 01:51 · Score: 2, Interesting

Maybe if a motherboard featured a very large generic socket to which was attached one cooling solution, it'd work out better. Processing Units, which would be smaller so as to fit as many as possible, would be able to go anywhere in this socket (in a grid-aligned fashion). Easiest solution: the socket is an X*X square grid, and all PUs must be, say, X/2 (or hopefully X/4) squares which can be arranged in any fashion. Plunk them in, reattach cooling over all of them, boot and enjoy that 4CPU, 2GPU, 2FPU configuration.

Separate sockets with separate cooling, which I assume is what we're about to see [more of], is going to get messy. And loud.

Maybe in the future some day we'll have "Tetris Computing" where you have to puzzle to fit the PUs optimally in the socket. "Oh, I'd really like that nVAMD GPU eXTReMe 2010, but it's an L-piece, and I really need an S-piece for 'tetris' in my bottom half of the socket..." :-)

--
Belief is the currency of delusion.

Re:Interesting by Overzeetop · 2007-03-05 02:18 · Score: 5, Interesting

You are correct - sockets are just a reincarnation of slots, but less flexible because you're limited to what you can put on a single chip instead of an entire card.

Perhaps the better thing to do would be better slot designs (not that we need more with all the PCI flavors floating around right now) with integrated, defined cooling channels. If you were to make the card spec with a box design rather than a flat card, you could have a non-connector end mate with a cooling trunk and use a squirrel cage (higher volume, quieter, more efficient) fan to ventilate the cards.

--
Is it just my observation, or are there way too many stupid people in the world?

Re:huh? by Mr2cents · 2007-03-05 01:23 · Score: 5, Interesting

Adapting another quote: "If you want to create a better computer, you'll end up with an Amiga". It's more or less what they're describing here. Amiga made heavy use of coprocessors back in the day. It could do some quite heavy stuff (well, at the time), while the CPU usage stayed below 10%.

One cool thing I discovered while I was learning to program was that you could make one of the coprocessors interrupt when the electron beam of the monitor was at a certain position. Pretty nifty.

BTW, for those who are too young/old to remember, those were the days of DOS, and friends of mine were bragging about their 16 color EGA cards. Amiga had 4096 colors at the time.

--
"It's too bad that stupidity isn't painful." - Anton LaVey

Re:huh? by TheThiefMaster · 2007-03-05 01:26 · Score: 4, Interesting

It's a cost and feasibility thing. The original FPUs were separate because they were expensive, not everyone needed them, and it was impractical to integrate them into the cpu because it would make the die too large and result in large numbers of failed chips. They became part of the chip later once the design was refined and scaled down.
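The link between die size and failed chips can be sketched with the simple Poisson yield model, a common textbook simplification rather than anything a fab would rely on by itself. The function names and the defect density used below are hypothetical, picked only for illustration:

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies expected to have zero defects (Poisson model)."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def relative_cost_per_good_die(die_area_cm2, defects_per_cm2):
    """Silicon area spent per working die, in cm^2.

    Bigger dies cost more area each AND fail more often,
    so the cost grows faster than the area does.
    """
    return die_area_cm2 / poisson_yield(die_area_cm2, defects_per_cm2)


if __name__ == "__main__":
    d0 = 0.5  # hypothetical defect density, defects per cm^2
    for area in (0.5, 1.0, 2.0, 4.0):
        print(f"{area:4.1f} cm^2: yield {poisson_yield(area, d0):.2f}, "
              f"area per good die {relative_cost_per_good_die(area, d0):.2f} cm^2")
```

Under this model, doubling the die area more than doubles the cost per working chip, which is the effect that kept early FPUs off the CPU die.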

The same applies to trying to integrate GPUs into the CPU: at the moment a top-end GPU is too large and expensive to integrate, and not everyone needs one. The move to having a GPU in a CPU socket should cut a lot of cost because the GPU manufacturers won't have to create an add-in card to go with the GPU; they can just design the chip to plug straight into a standardised socket.

At the same time, low-end GPUs are small and cheap enough that they are being integrated into motherboards. Integrating a basic GPU into the CPU seems like a good next move, and the major CPU manufacturers seem to agree. IIRC Via's smallest boards integrate a basic CPU, northbridge and GPU into one chip? AMD are definitely planning it with their aptly named "Fusion". *Checks wikipedia* Yeah, Via's is called "CoreFusion".

Still, you are right, all-in-one cpus are the future, we're just not quite there yet.

Re:huh? by walt-sjc · 2007-03-05 01:52 · Score: 5, Interesting

Ahh - the Amiga. My favorite machine during that era. I got my A1000 the first day it was available. Modern OS's could still learn a lot from that 20 year old OS. Why oh why are we still using "DOS Compatible" hardware????

Amiga had 4096 colors at the time.

Better put "4096" with a "*" qualifier. You couldn't assign each pixel an exact color - the scheme got you more colors by being able to set bits that said that this pixel modifies the previous pixel's color by "x". In this way, they could get more colors using less memory than traditional X bits per color per pixel schemes (Amiga was a bitplane architecture.)
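A rough sketch of how that hold-and-modify scheme decodes, assuming the 6-bit HAM layout (two control bits plus four data bits per pixel). The function name, the grey-ramp palette, and the starting color are made up for the example:

```python
def decode_ham6(pixels, palette, start=(0, 0, 0)):
    """Decode one scanline of 6-bit HAM values into (r, g, b) tuples.

    Each channel is 4 bits (0-15). The top two bits of every pixel
    value select what to do with the low four bits:
      00 -> load a full color from the 16-entry palette
      01 -> keep the previous color, replace only blue
      10 -> keep the previous color, replace only red
      11 -> keep the previous color, replace only green
    """
    r, g, b = start  # assumed color "held" before the first pixel
    decoded = []
    for value in pixels:
        control = (value >> 4) & 0b11
        data = value & 0b1111
        if control == 0b00:
            r, g, b = palette[data]
        elif control == 0b01:
            b = data
        elif control == 0b10:
            r = data
        else:
            g = data
        decoded.append((r, g, b))
    return decoded


if __name__ == "__main__":
    grey_ramp = [(i, i, i) for i in range(16)]  # hypothetical palette
    line = [0b000011, 0b101111, 0b010000, 0b110111]
    print(decode_ham6(line, grey_ramp))
```

Since only one channel can change per pixel, going from one arbitrary color to another can take up to three pixels, which is where the "*" comes from.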

Anyway, back on topic, I wish that the CPU manufacturers could finally come up with a "generational" standard socket. A well-designed module socket should last as long as an expansion slot standard (ISA, PCI, PCIe) and not change for damn near every model of chip. I should be able to go out and get a 1, 2, 4, or 8 socket motherboard, and stick any CPU / GPU / DSP module into it I want. Can we please finally shitcan the 1980's motherboard designs?

Cell Clusters by Doc+Ruby · 2007-03-05 02:43 · Score: 3, Interesting

How about the Cell uP (first appearing in Playstation3), which embeds a Power core on silicon with a 1.6Tbps token ring connecting up to 8 (more later) "FPUs", extremely fast DSPs. IBM's got 4 of them on a single chip, connected by their "transparent, coherent" bus, a ring of token rings. One Cell can master a slave Cell, and IBM is already debugging 1024 DSP versions, transparently scalable by the compiler or the Power "ringmaster" at runtime.

These little bastards are inherently distributed computing: a microLAN of parallel processors, linkable in a microInternet.

Imagine a Beowulf cluster of those! No, really: a Beowulf cluster of Cells.

--
make install -not war

Re:huh? by evilviper · 2007-03-05 03:18 · Score: 3, Interesting

"If you want to create a better computer, you'll end up with an Amiga". It's more or less what they're describing here.

That's what he's describing, but I don't believe for a second that's what it's going to be...

I don't believe for a second practically ANYONE is going to buy an expensive, multi-socket motherboard, just so they can have higher-speed access to their soundcard... Ditto for a "physics" unit.

This exists solely because CPUs are terrible at the same kinds of calculations ASICs/FPGAs are incredible at. That will be the only killer app here.

Video cards are a good example on their own. CPUs are so bad, and GPUs are so good, that transferring huge amounts of raw data over a slow bus (AGP/PCIe) still puts you far ahead of trying to get the CPU to process it directly. And it works so well, the video card companies are making it easier to write programs to run on the GPU.
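The trade-off can be put as back-of-envelope arithmetic: offloading wins whenever the transfer time plus the device's compute time is still less than the CPU's compute time. A minimal sketch, where the function names and all of the figures are hypothetical and not measurements of any real bus or card:

```python
def offload_time(data_bytes, bus_bytes_per_s, device_seconds):
    """Total seconds to ship the data over the bus and process it on the device."""
    return data_bytes / bus_bytes_per_s + device_seconds


def offload_wins(data_bytes, bus_bytes_per_s, device_seconds, cpu_seconds):
    """True if bus transfer plus device compute beats doing the work on the CPU."""
    return offload_time(data_bytes, bus_bytes_per_s, device_seconds) < cpu_seconds


if __name__ == "__main__":
    data = 2**28        # 256 MiB of raw data (made-up workload)
    bus = 2**31         # 2 GiB/s bus (made-up figure)
    device = 0.25       # seconds on the dedicated chip (made-up)
    cpu = 2.0           # seconds on the CPU (made-up)
    print("offload total:", offload_time(data, bus, device), "s")
    print("worth offloading:", offload_wins(data, bus, device, cpu))
```

The slower the bus, the bigger the gap between CPU and dedicated chip has to be before the trip is worth it.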

And GPUs aren't remotely the only case of this. MPEG capture/compression cards, Crypto cards, etc. have been popular for a very long time, because ASICs are extremely fast with those operations, which are extremely slow on CPUs.

The situation is much more like x87 math co-processors of years past, than it is like the Amiga, with independent processors for everything.

It is likely that, in time, integrating a popular subset of ASIC functions into the CPU will become practical, and then our high-end video cards will be simple $10 boards, just grabbing the already-processed data sent by the chip, and outputting it to whatever display.

Then maybe AMD and Intel will finally focus on the problem of interrupts...

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Re:huh? by rbanffy · 2007-03-05 04:19 · Score: 2, Interesting

OK... Let's rephrase that:

Folks with 16-bit PCs were bragging about their 16 out of 64 color EGA cards and single-tasking OSs when even the simplest Amigas had 32-bit processors, 32 out of 4096 colors, PCM audio and a fully multi-tasking OS coupled with a GUI.

As for the "processor socket", there are people selling computers that go into passive backplanes. If you put the CPU and memory in a card, there is little reason why you would have to upgrade the rest of the computer when you change the CPU (you would have to scrap the card, anyway, but processors are intimately related to chipsets, so it is to be expected).

There are some SoC (system on a chip) solutions out there too. Those incorporate the chipset (or most of it) into the CPU, so it would be easier to build a trans-generational socket.

--
http://www.dieblinkenlights.com

You can buy Torrenza today by LordMyren · 2007-03-05 04:19 · Score: 2, Interesting

To the best of my knowledge, Torrenza is already implemented. The HTX port on many Opteron motherboards is a HyperTransport connection. You can already buy FPGA dev kits from U. of Mannheim that plug into this HyperTransport slot and interface with the rest of your system. Torrenza may continue to advance the HyperTransport / Coprocessor war, but as far as I'm concerned, Torrenza is already here.

Re:huh? by WorseThanNormal · 2007-03-05 04:29 · Score: 2, Interesting

But by making specialized chips, you can limit and optimize the instruction set to allow for many more instructions per second. The performance gains of this strategy (as well as using this as a means of heat distribution) could outstrip the latency gains of putting everything on one chip.

looks like a revamped AMD 4x4 by ZirbMonkey · 2007-03-05 06:35 · Score: 2, Interesting

If I was reading the picture on the second page correctly, it looks like AMD plans to use a "4x4" type motherboard architecture, but with the second CPU spot made for a dedicated GPU chip instead of another redundant CPU. The CPU and GPU wouldn't be on the same die in this case.

I think this would make sense to me. Right now when I upgrade my video card, I throw out the RAM, GPU, and integrated circuitry of the entire package to replace everything with the new video card upgrade (which happens every 6 months for me). What if I could buy the GPU and DDR3 separately and not throw out everything each upgrade? Not only would the infrastructure be faster, but the upgrades could be cheaper since you don't need to buy the whole package every time.

This obviously only matters to the enthusiast trying to keep on the edge of Moore's Law. I like the idea, but we'll see how things turn out in reality of 2010.

Re:huh? by default+luser · 2007-03-05 07:22 · Score: 3, Interesting

Isn't this called "passive backplane" or something? If it already exists for some systems, why not desktop computers?

Early high-end computer systems started out like this, utilizing backplanes like VME. They've been phased out, because ultimately that modularity was too expensive, and because the shared-bus architecture hurt performance. Hardware devices that used to require multiple cards can now fit on a single chip, and have their own PCIe drop to increase performance. Memory upgrades that used to require multiple cards just to reach 1MB are now eclipsed by 8 and 16-chip configurations on a single DIMM (a specialized expansion slot), and have their own bus to improve performance.

Let's say they went with the Single-board computer design (CPU+memory+bus controller) - now your costs go up, because you have to build multiple "processor cards" for all the different backplanes you want to plug into. ISA backplane - 1 model. PCI backplane - 1 model. PCI + ISA backplane - 1 more model, and it also requires a new specification: the new bus designs have to play nice with the limited I/O space at the back of the card, so you end up either making the bus connector larger, or you end up making certain bus combinations impossible.

With the motherboard and attached bus design, your costs go down because you can provide a mixture of the busses that are the most popular. Thus, you only have one product to design and electrically verify, and only one manufacturing line to test.

Also, when you move to point-to-point architectures like PCI-Express, with a separate backplane you really limit yourself to the slot configurations you can offer. Unlike with a shared bus, with P2P interconnects you have to make sure the backplane layout matches the connector layout exactly. This means you either standardize on ONE configuration (boring), or you put the ports on the processor card (what we are doing).

The only places that still use modular bus designs today are embedded developers, and that's because they still need the expandability and modularity that end-users do not. They also need the backward-compatibility afforded by these old bus specifications (VME especially). They pay for it, in terms of performance - most of them bypass the slow backplane of VME or CompactPCI with faster interconnects like Gig/10GigE, Fibre Channel, RapidIO or Infiniband.

--

Man is the animal that laughs.
And occasionally whores for Karma.

Re:huh? by real+gumby · 2007-03-05 07:38 · Score: 2, Interesting

Weren't the first co-processors FPUs?

Actually there was an evolution of processor design into single, monolithic processing units; until well into the '70s it was hardly uncommon that computers would have all sorts of processing units (remember the "CPU" is the "Central Processing Unit.") Of course in this case I'm primarily talking about mainframes; one of the distinctions of the minicomputer (and later microcomputer) was that "everything was together" in the CPU. But even then the systems didn't really start out monolithic: it was not uncommon to find minicomputers with separate FPUs, writable microcode and the like.

Another element of mainframes which is reappearing is I/O processors (AKA "channel controllers"); most mainframes from the 50s on had programmable I/O processors. The ARPAnet interfaces from the very earliest days were computers in their own right.

Finally, consider that there's more to the interconnect issues than low latency between processing units. For example, if you can load a coprogram into a coprocessor (be it FPU or parallel unit, graphics unit or the like) then it can crunch away and do its own DMA (perhaps to separate banks) without (for example) cache contention or the like. You can get better performance than having all these features on-chip.

By the way the unified philosophy of the microprocessor influenced Unix and the C language, which featured a monolithic kernel (and lots of stuff in userspace) in the former case and "weird" artifacts like reduced syntactical diversity and the like (remember the line in the intro to K&R: " 'You mean you have to call a function to do I/O?' " -- I found this hilarious since I was a Lisp programmer at the time and was well used to this approach). But look back at how people were thinking back then: these were small systems for small (e.g. research, instrument control, etc) applications. Nobody, or hardly anybody anyway, back then was imagining minicomputers replacing "real" computers. In a funny way they were right, since today's microcomputers look a lot like the mainframes of yore.
