Multicore Chips As 'Mini-Internets'
An anonymous reader writes "Today, a typical chip might have six or eight cores, all communicating with each other over a single bundle of wires, called a bus. With a bus, only one pair of cores can talk at a time, which would be a serious limitation in chips with hundreds or even thousands of cores. Researchers at MIT say cores should instead communicate the same way computers hooked to the Internet do: by bundling the information they transmit into 'packets.' Each core would have its own router, which could send a packet down any of several paths, depending on the condition of the network as a whole."
This technology that networks different cores can also serve another purpose, to prevent damage from core failure, and diagnose such failures. If the cores are connected to other cores, the same data can be processed by bypassing a damaged core, making over heating or manufacturing problems important, but almost treatable. Who knows, cores might even get replaceable.
I guess MIT has forgotten about the Transputer....
I started reading an immediately had flashbacks to the Transputer
Patent litigation: A doctrine of Mutually Assured Destruction... in which everyone seems willing to push the button
Ah, you're clever; but it's internets all the way down.
Errr... the internal "bus" between cores on modern x86 chips already is either a ring of point to point links or a star with a massive crossbar at the center.
ccNUMA?
Only the State obtains its revenue by coercion. - Murray Rothbard
AMD uses HT and Intel has its ring bus, both of which use point-to-point links. Buses have serious trouble with the impedance jumps at the taps and clock skew between the lines, that's why nobody is using them in high speed applications any more. Even the venerable SCSI and ATA buses went the way of the dodo. The only bus I can see in my system is DDR3 (and I think that will go away with DDR4 due do the same problems.)
thegodmovie.com - watch it
That's just plain inefficient use of silicon area. They wish to waste some of that limited space on additional logic that isn't strictly necessary. And it will cause a significant bottleneck to be created. Did they forget about DMA controllers or something? You already need a DMA controller no mater what and it's perfectly capable of accessing the necessary memories as it is. Adding some extra capabilities to the DMA controller would be far more efficient in logic area size and most likely lead to a better performance compared to this bad idea.
I admit that despite being a technical user, I was not aware that only 2 chips are allowed to "talk" at a given time. I had (erroneously, it would seem) assumed that in order for a 3+-core chip to be fully useful, such a switch/router would have to already be in place.
So, have Intel, AMD, and others simply tricked us into thinking that a 3+-core chip can actively use all its cores at once (as is the natural assumption), or am I misinterpreting something? If they have, why on earth didn't they include a "router" in the original designs? It seems entirely too obvious for the eggheads in R&D to have missed (or so one would think, anyway). I'm sure there are technical hurdles to overcome, but unless that can be managed, what is really the point of many-core CPUs that can't have all cores acting at once?
I still think switches on tiny low traffic networks is a silly notion, though now that cost of switches are insignificant(and when was the last time you saw a hub for sale) I just go with the flow.
Back in the day we had a client who dumped their hubs in each branch for much more expensive at the time switches, then whined that there was no advantage. I replied you insisted on putting your 2 386's and a dot matrix printer on it, and even threatened to take your biz elsewhere, you what you wanted, enjoy
Yeah, great idea. Take the very fastest communication that we have on the entire planet, and replace it with the absolute slowest communication we have on the planet. Great idea. And with it, more complexity, more caches, more lookup tables, and more things to go wrong.
The best part is that it's totally unbalanced. Internet protocols are based on a network that's ever-changing and totally unreliable. The bus, on the other hand, is best on total reliability and static.
I'd have thought that a pool concept, or a mailbox metaphor, or a message board analog would have been more appropriate. Something where streams are naturally quantized and sending is unpaired from receiving. Where a recipient can operate at it's own rate uncommon to the sender.
You know, like typical linux interactive sockets, for example. But what do I know.
As mentioned in other comments, this has been done before. The method of message passing isn't as fundamental as one key point - that it is all explicit message passing.
Intel and AMD x86/x64 CPUs use coherent cache between cores to make sure that a thread running on CPU 1 sees the same RAM as a thread running on CPU 3. This leads to horrible bottlenecks and huge amounts of die tied up in trying to coordinate the writes, maintain coherency between N cores (N-1 ^2 connections!), and it all just goes to hell pretty fast. Intel has this super new transactional memory rollback thing, but it's turd polishing.
The next step is pretty obvious (see Barrelfish) and easy: no shared coherency. Everything is done with message passing. If two threads or processes (it doesn't really matter at that point) want to communicate they need to do it with messages. It's much cleaner than dealing with shared memory synchronization, and makes program flow much more obvious (to me at least - I use message queues even on x86/x64). If you need to share BIG MEMORY between threads, which is reasonable for something like image processing, you at least use messages to explicitly coordinate access to shared memory and the cores don't have to worry about coherency.
This scales extremely well for at least a couple thousand CPUs, which is where the 'local internet' becomes useful.
Where it becomes not easy is that almost all programs written for x86/x64 assume threads can share memory at will. They'd need to be rewritten for this model or would suddenly run a whole lot slower since you'd have to lock them to one core or somehow do the coordination behind their back. It'd be worth it for me!
I can't seem to find the old story or my comment on it, but when Google acquired a 'stealth' startup a year or so ago the most interesting thing about it was that the primary investigator had a few patents for packet-switched CPU's.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
You'd think someone with a 7-digit UID wouldn't be so arrogant.
Support the EFF and Creative Commons. The war is coming, and they're supporting you...