Sun Unveils Direct Chip-to-Chip Interconnect
mfago writes "On Tuesday September 23, Sun researchers R. Drost, R. Hopkins and I. Sutherland will present the paper "Proximity Communication" at the CICC conference in San Jose. According to an article published in the NYTimes, this breakthrough may eventually allow chips arranged in a checkerboard pattern to communicate directly with each other at over a Terabit per second using arrays of capacitively coupled transmitters and recievers located on the chip edges. Perhaps the beginning of a solution to the lag between memory and interconnect speed versus cpu frequency?"
Therefore the speed increase will be unnoticeable.
I wonder if this release might have been pressed forward a bit to squelch some of the talk about Sun losing their will to innovate after Bill Joy left.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
This sounds like a sweet technology. Hopefully we'll see this in a real product in the near future.
Though the way people talk about Sun, we're more likely to see it licensed to some other company...
Yes Francis, the world has gone crazy.
Does "terrabit" mean that it will be made of pieces of the earth?
Via Google
This could prove very interesting, as the speed usually drops when "leaving the chip" to do communications. There has been a lot of research to develop protocols to ease on-chip communication when several ICs are combined on a single chip. If Sun's technology can stand the test, NoC/SoC products could reduce their time-to-market dramatically...smaller and faster devices for everyone!
BTW: I didn't RTFA since it requires (free) reg.
Or maybe Rambus is already fixing to sue them.
-Libertarian secular transhumanist
That is the nature of the beast.
Remember how excited you were to get your hands
on a 386 machine?
The thrill of your first encounter with a 286 screamer?
Upgrading to 16k from 4k on your TRS-80?
Your first disk drive for your Apple 2?
It's all relative.
So enjoy
Whatever, I think this will end up being the SUV of chip-to-chip connections. ;)
New Sun Microsystems Chip May Unseat the Circuit Board
This might be the obvious question but, why hasn't anyone done this before?
It seems obvious: the end of the chip has pins. The chip it will eventually connect to has pins. Instead of having 20 trace lines to the next chip, why not redesign them so the outputs and inputs of both line up, to reduce the complexity of the design?
Anyone wanna fill in my mental gap for me?
You mean those silicon chips with ceramic packaging? They sure will!
of these, well that's kind of the point actually :-)
...in bed
I wonder if this hardware computing model could provide the first real base for Neural Network computing? As far as I know, any neural network is currently emulated on linear processing machines.
Someone gets it. As an Electrical Engineer-in-training, I was always frustrated with people who got these big bad processors and wondered why their improvement was minimal.
They never quite grasped that the biggest bottleneck is between the processor and memory.
My EE instructor always said that they could improve performance by doing one simple thing: make the interconnects on the motherboard between the motherboard and RAM rounded instead of cornered. You could then increase bus speed as you wouldn't have magnetic loss at the corners like you do now.
You fix that, and you can see a SUBSTANTIAL improvement in performance. The only thing that can be done beyond that is to get a Platypus drive (Solid state "Hard Drive" made from Quikdata made from DDR RAM). Then you reduce your access time to your hard drive from milliseconds to nano/microseconds.
Sounds a lot like the ol' Transputer (was from INMOS), of course faster. One could also think of AMD's HyperTransport. So, again, except maybe for the speed, I don't see much innovation here.
If only people could remember that "terra" has something to do with earth, "tera" is the unit...
Placing large numbers of chips adjacent to one another has obvious problems with heat and power, in particular when running at those speeds. That, rather than interconnect technology, is probably the main reason we still package up chips in large packages.
This might be useful for placing a small number of chips close together, in particular chips that may require different manufacturing processes.
Most importantly, will I still need my ThinkGeek 'I am teh Chip Haxx0R' bib?
And they call that news? Most of the stuff around me is made of bits of the earth. Some bits are more processed than others, but they are still b.o.t.e.
Transputer background
Anybody remember the viruses which could travel from floppy to floppy back in the C64 days? You would put an infected floppy next to a clean floppy, and the virus would just hop over! Don't know about the speed though...
;)
(No kidding, there were people back then who told and believed this nonsense.)
Great potential for grid computing, just keep adding chips.
Arrgh! That's twice. I'm gonna have to start reading at the bottom of the page so I can catch the references to other articles.
Public use of any portable music system is a virtually guaranteed indicator of sociopathic tendencies. -- Zoso
Schematics here
The article is a bit vague as to what the innovation really is.
The article immediately made me think of multi-chip modules. Multi-chip modules are an idea which never really caught on in the industry (except at IBM), and I'm not sure how Sun's innovation isn't just a take-off on that idea. Multi-chip modules have failed due to cost, since a lot has to go right to get a multi-chip module that works.
Any practical chip-to-chip connectivity scheme had better have a good rework scheme. If it doesn't, it's just boutique technology that will not affect the industry overall.
Having worked on chips with multi-gigabit pins, I can say a huge problem is resynchronizing the signals. Creating a receiver to align one pin's data with 15 neighbors at 3GHz takes a whole lot more logic space on the die than a small driver (or receiver). The auxiliary logic basically makes shrinking the final driver FET almost meaningless.
Modern chip design is a constant trade-off between features and cost. And what's cheap is what everyone has been doing for years (or is an evolution of that).
Isn't that called a trace? Or another fancy name would be a lead? I think that there are people with prior art...
Do not look into laser with remaining eye.
It's terabit and receivers, Hemos. Not terrabit or recievers.
and here's my proof
Shall we call it Prime Intellect?
(actually, by the story naming convention, it would be closer to intellect 1, but oh well)
funny munging
By comparison, an Intel Pentium 4 processor, the fastest desktop chip, can transmit about 50 billion bits a second. But when the technology is used in complete products, the researchers say, they expect to reach speeds in excess of a trillion bits a second, which would be about 100 times the limits of today's technology.
If a P4 is already doing 50 Gbps (as they say), and this uber-technology will allow 1Tbps (which is 20x a P4's 50Gbps), then how is that "100x the limits of today's technology" ?
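For what it's worth, the arithmetic is easy to check. A quick sketch in Python, using only the two figures quoted from the article; the variable names are mine, not anything from Sun or the NYT:

```python
# Figures as quoted from the article; names are illustrative only.
P4_BITS_PER_SEC = 50e9          # "about 50 billion bits a second"
PROXIMITY_BITS_PER_SEC = 1e12   # "in excess of a trillion bits a second"

# Ratio between the two quoted figures.
speedup = PROXIMITY_BITS_PER_SEC / P4_BITS_PER_SEC
print(f"speedup over the quoted P4 figure: {speedup:.0f}x")  # 20x

# Baseline that would be needed for "about 100 times" to hold.
implied_baseline = PROXIMITY_BITS_PER_SEC / 100
print(f"baseline implied by 100x: {implied_baseline / 1e9:.0f} Gbit/s")  # 10 Gbit/s
```

So the quoted numbers give 20x, and the "100 times" claim only works against a baseline of roughly 10 Gbit/s, which the article does not name.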
<shakes head>
There's nothing you can do anymore.
Normally I don't pimp Sun, but here's something that makes me think they still have a finger on the pulse of things:
;-)
Read about plans for Sun's "Niagara" core
I understand they hope to create blade systems using high densities of these multiscalar cores for incredible throughput.
There's your parallel/grid computing.
the Transputer. It had 4 available hardware connections and the description of the way the different processors communicate is very similar to what is described by the article.
Of course to take maximum effect of this communication speed in general parallel applications, main memory access would have to be improved. I'd guess these things will have huge on-chip caches.
IANAEE either, but this made a little more sense to me after I read this InfoWorld article, which talks about two other aspects of Sun's DARPA-funded project: clockless "asynchronous logic", and building processors with interchangeable and upgradable modules. They absolutely need these busless "proximity" interconnects for the processor modules to communicate at close to on-chip speeds, and the clockless architecture lets them get rid of the bus. Or vice versa... or something like that.
Working prototype computer about six years away, according to the article.
"This is not a sig." -- R.
As usual with a lot of Computer Science, this appears to be just an old idea reinvented...the Transputer...and about time too!
IBM has laid out similar plans for modular storage blocks (http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=ibm+storage+bricks) connected by pads on the surfaces. Good luck patenting that, unless the application was made a couple of years ago.
Now there's something we EE's know about. (Or not...) We got it wrong in the upper North East with the huge black out.
Looks like it's even used in the tiny chip to chip communications. Basically, to overcome the impotence caused by the little bit of impedance between the chips, we'll add some capacitance (CAPs). Adding the cap's to ground provides reactive power.
There is nothing new under the Sun. This concept, along with several others like it have been around for at least 15 years.
Ralston Purina will sue for copyright infringement.
"[T]he single essential element on which all discoveries will be dependent is human freedom." -- Barry Goldwater
"Perhaps the beginning of a solution to the lag between memory and interconnect speed versus cpu frequency?"
You mean the problem that everyone outside the PC world already solved? Please people, for your own sake go learn about the alpha architecture. Where all the CPUs connect to other cpus via north, south, east and west. They can all communicate that way, even routing around failed cpus. Then you can start crying when you realize crapaq threw it away.
You feel strongly enough about registering at the NYT to mention it in your interesting post, but you registered for Slashdot where registration isn't required for reading articles or even posting. Just curious - what's the difference?
Respectfully,
"Mignon"
You Java programmers are one step above VB programmers!
They're actually made of bits of exploded stars.
I bet they could bathe the board in a dielectric liquid that would increase the capacitive coupling, while also removing heat more efficiently. Not sure what this liquid would be, but those guys at Sun are smart, they can figure it out.
I can't help but wonder just how bad the heat problems could eventually get on a system designed like this. I mean, the Northbridge on a typical PC these days can burn your hand...
for those of us ignorantly curious enuff to click on the above link... pray for us...
Nothing new here. No?
I remember seeing the first Transputers on my very first CeBIT visit sometime in the early 90s. The Transputer workstations would crunch full screen fractal graphics in seconds, which was an amazing feat back then. Just plain *everybody* was convinced they would put the then ruling Amiga to rest or - also a popular theory back then - would be adopted by Commodore. There is this Transputing PL Occam that, as far as I can tell, makes Java, C# and all the rest look like kiddiecrap. Everyone who I know who knows Occam says it rules and usually also has the skills to prove it.
The overall concept - very much like the one Sun is talking about now - was to stick in a CPU, or 2 or 10 and make the box faster with nearly no decline in performance/processor ratio. It actually did work that way.
Transputers never made it though: too expensive, and the required software development was too esoteric back then. It would be really nice to see this concept rise again. Maybe now they actually would be affordable.
We suffer more in our imagination than in reality. - Seneca
If you look at a modern evaluation board with gigabit SERDES or SERialization-DESerialization (e.g. the 3.125Gbit/s differential signal pair per channel), the trace routes are typically rounded, with no square corners. This is done to reduce the effective impedance along the line, which needs to be carefully controlled. They also run in parallel closely-routed pairs because it's typically a differential signal. Actually looks a bit like a set of miniature train tracks without the railroad ties.
In fact, multichannel SERDES is the next real interconnect technology. It's used in Infiniband, HyperTransport, PCI Express, Rambus RDRAM and in 10 Gb/s Ethernet (usually as 4x3.125Gbit/s channels as a XAUI interface between optical module and switch fabric silicon with 8b/10b conversion). There are even variants, such as LSI Logic's HyperPHY, that are deployed specifically for numerous high-bandwidth chip-to-chip interconnections. The problem that is cropping up is that the traditional laminate PCBs are becoming the limiting factor in increasing per-channel connectivity, to the extent that 10Gbit/s per channel speeds are next to impossible on these boards due to the lack of signal integrity. There has been some experimentation for very short hops on regular boards, as well as using PTFE resins to manufacture the boards themselves, but it's precarious at best.
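The XAUI numbers above work out as follows. A small Python sketch, assuming the usual 8b/10b overhead of 10 line bits per 8 payload bits; the function name is hypothetical:

```python
def payload_gbps(lanes: int, line_rate_gbps: float,
                 data_bits: int = 8, coded_bits: int = 10) -> float:
    """Usable payload rate in Gbit/s after line-coding overhead."""
    return lanes * line_rate_gbps * data_bits / coded_bits

# Four lanes at 3.125 Gbit/s each, 8b/10b coded.
xaui = payload_gbps(lanes=4, line_rate_gbps=3.125)
print(xaui)  # 10.0
```

That is where the odd-looking 3.125 comes from: four lanes carry 12.5 Gbit/s on the wire, and the 20% coding overhead leaves 10 Gbit/s of payload.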
As for Sun's technology, it's interesting but I don't know how much it will catch on or how feasible it will be. It creates packaging issues and requires good thermal modelling and 3-D field modelling to account for expansion and contraction through the operating temperature range and the presence of nearby signals, which could affect the integrity of the signals.
Seems to me that not only would such a design cause RF interference, but it would be susceptible to strong RF fields as well. I doubt it could pass part 15 compliance.
--fatboy
...did anyone else remember the MicroJ-11, a PDP11-on-TWO-chips implementation in a single DIP? Two chips wired together on one carrier. (IIRC the floating-point unit was one chip and everything else was on the other.) It got used in cluster storage controllers (HSC70/90) and all sorts of interesting gadgets.
As any EE will tell you, time varying signals can be transferred across a capacitor. This U.S. patent describes an isolation circuit using capacitive coupling to couple signals between two IC chips within the same IC package. Sun seems to be using this type of technology to communicate between IC chips in separate packages.
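To put a rough number on that: an ideal capacitor has impedance magnitude 1/(2*pi*f*C), so it blocks DC and passes fast edges. A minimal Python sketch; the 10 fF coupling capacitance is a made-up illustrative value, not a figure from Sun's paper or from the patent:

```python
import math

def capacitor_impedance_ohms(freq_hz: float, cap_farads: float) -> float:
    """Magnitude of an ideal capacitor's impedance, |Z| = 1 / (2*pi*f*C)."""
    if freq_hz == 0:
        return math.inf  # an ideal capacitor blocks DC entirely
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farads)

ASSUMED_COUPLING_CAP = 10e-15  # 10 fF, hypothetical value for illustration

# Impedance falls as frequency rises, so faster signals couple more easily.
for freq in (0.0, 1e6, 1e9):
    z = capacitor_impedance_ohms(freq, ASSUMED_COUPLING_CAP)
    print(f"{freq:>12.0f} Hz -> {z:.3g} ohms")
```

The impedance drops by a factor of 1000 between 1 MHz and 1 GHz, which is why this kind of link only makes sense for high-speed signalling.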
Oh, BTW this Sun stuff is nothing like transputers and their links. It's chip-level interconnect to avoid pads.
Maybe it's inside a function with parameters, which recursively calls itself, unconditionally.
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
Ivan E. Sutherland has always been a great thinker. An article about asynchronous computers fascinated me last year. You can find more details here. And you can count on him for real products to come.
You C programmers are one step above the end of the buffer.
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
Great! I am so happy to see that there are some real programmers who see the truth. I have seen our Sun E3500 with 8 CPUs feel like a Pentium Pro with Java shit running on it. But it was management's vision... what can we do? I just procured the servers and pretended that I was doing social work by giving Sun more money.
- People who believe other people have no right to live, got no right to live ...
If that's a magnifying glass, it's not a very good one. Their hands look like the size of... their hands!
This is very, very interesting.
I/O limitations of traditional chip architectures prevent us from building truly large-scale hardware neural network systems. To achieve the connectivity required to model a net as complex as the human brain's, it's not enough to link up an array of small neural chips, because you hit a bandwidth bottleneck as soon as you try to go off-chip. This limits neural architectures to simple, regular block-structured models.
These chips of Sun's only meet up at the edges, but (assuming advances in reduced power usage and heat dissipation technologies) imagine if this was extended to provide connectivity on all exterior surfaces of the package? You could build neural networks of arbitrary size that weren't I/O bound.
This would enable truly "brute force" approaches to connectionist AI, and quite possibly something capable of human-level intelligence in real time.
compliance with the US Patriot Act
and "Pentagon John" Poindexter's
TIA (Total Information Awareness).
WTF, who would even need a "wiretap"
when a loop antenna and a LNA will do?
there's no "heap". learn some assembly. (heap is only a concept)
and the next new SUN Micro innovation will
be a wireless mind-machine brain implant?
You need more training. Or less ego.
Look at a recent P4 motherboard for 45 degree traces. Look at any previous motherboard with RAMBUS (even a Nintendo 64 from November 1999) for curved traces.
It's not so much a question of knowing about something as it is of implementing it. If it isn't affordable, it isn't worth it. Because if it isn't affordable, you might be able to buy two affordable ones for the same price. And you're going to have trouble beating the performance of two systems with one.
Finally, to make a hard drive from RAM is to totally lose track of the idea of what a hard drive is. Hard drives are supposed to be slower, but they make up for it with lower cost per megabyte. Instead of a RAM drive, just put more RAM in your machine; it will use it as a disk cache/backing store and get you all the performance you want.
Also, at 120us per command register access, you really cannot initiate any transfer over ATA in under 0.75ms.
www.xanoptix.com: making Hybrid ICs with optical interconnects. (Their proof-of-concept is a hybridized parallel optics transceiver that runs 72 channels at 3 1/8 gb/s.) There had been a handful of other competitors -- check out lightreading.com for some of them -- but most lost funding and/or momentum, while Xanoptix has been going pretty much at-speed.
With all this talk of Sun Chips, is anyone else hungry? I wonder if they'll produce a ranch version.
You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.
Chuck Moore's 25x MISC stack machine technology.
See www.colorforth.com, and www.ultratechnology.com for more information on this overlooked, and underrated stuff.
I'm concerned about new applications in surveillance.
While I'm sure remote gap-dropping is a long ways away, what if someone were able to place a sensing device on top of the gap between the chips? Who needs to decrypt someone's hard drive if they can just log memory transfers?
It's not like similar techniques aren't already in place. For example, I'm certain I heard about a case where someone put a keylogger on a mobster's computer. I know someone's used keyloggers at my college campus.
What's this Submit thingy do?
Is that...
1,000,000,000,000 bits per second
or
2^40 bits per second?
There's a whole bunch of bits per second difference there...
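The gap between the two readings is easy to compute. A quick Python sketch; the constant names are mine. As far as I know, data rates are conventionally quoted with decimal (SI) prefixes, so the first reading is the likely one:

```python
SI_TERA = 10**12      # decimal tera: 1,000,000,000,000
BINARY_TERA = 2**40   # binary "tera" (tebi): 1,099,511,627,776

# Absolute and relative gap between the two readings.
difference = BINARY_TERA - SI_TERA
print(f"{difference:,} bits per second")             # 99,511,627,776 bits per second
print(f"{difference / SI_TERA:.2%} above 10^12")     # 9.95% above 10^12
```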
"Perhaps the beginning of a solution to the lag between memory and interconnect speed versus cpu frequency?"
Thank God! This was something that used to keep me up at all hours of the night. A solution to this problem will change the world. It may even stop the RIAA from suing young girls and stop global warming. Thank you for letting me sleep at night once again.
WTF?
--ken
Bitcoin pyramid: Join here: http://www.bitcoinpyramid.com/r/1427 it's FREE!
...in particular chips that may require different manufacturing processes.
Or at least portions of more complex circuits where part of the circuit may not warrant the added cost of SOI, 90nm, or strained silicon.
But then, those divisions are already made. AMD, for one, is working on recombining those parts. As an example, consider AMD's putting the memory controller on the CPU die.
I am curious, however, as to whether you could have more than one silicon die in the same ceramic casing. This would let you combine different parts made using different techniques. The CPU and L1/L2 caches could be on a high-cost process, with the L3 cache and memory controller being built with cheaper processes.
Placing more emphasis on cost savings than on performance, you could build the L1 instruction cache with a slower process than the L1 data cache. Or you could leave all of L1 on the same process and split L2 into instruction and data caches, with different processes.
If you wanted to ramp up CPU speed without a major hit to your performance, you could reduce the feature size of the core and L1 caches, and put L2 on a slower, more reliable process. That way, you could ramp up core speeds without having as much worry about yield loss from cache failure.
Helluva way to increase yield, and it gives the designer a LOT of options.
What's this Submit thingy do?
... speed doesn't matter anymore.
I wonder why these dummies in these corporations spend so much money in high performance computing research, man if only they would listen to Carlos.
After all, it's not like this will make Mine Sweeper or the Windows Calculator program any faster. Speed doesn't matter people, give it up. Let's just dedicate ourselves to farming.
- sigs are for wimps.
YOU GET +5 FOR STEALING MY COMMENT FROM AN OLD ARTICLE!!!
2 967
(thanks rei)
http://slashdot.org/comments.pl?sid=78540&cid=696
How many other +5 informatives did you steal, asswipe?
Fuck Beta. Fuck Dice
Sounds like I should finish my Object Occam language, ready for Sun's work to be on general release :-)
The lab next door to me has an old Evans & Sutherland ESV graphics processor. It's the size of a small dorm fridge and it was built in '90 or '91, and is still working...as an end table, after being retired in '98. But I understand that a few are still doing graphics work today, despite (I think) not being Y2K compliant. Anyway, I like knowing that at the age of 65 Dr. Sutherland's still out there working, since back in the day apparently he did a lot of cool work.
n./t.
n over t
A possibly naive question: Would it make any sense to include some substantial amount of main [non-cache] memory on the CPU die or in the same package? If it were say 128MB to 256MB or so, that would likely be enough to cover the working set of a typical consumer desktop machine. A NUMA-aware OS could be smart enough to migrate data structures to this faster 'on-chip memory' from external DIMMs.
In other words, if it's too difficult or expensive to improve latency of all memory, is it cost-effective to do so for the system's most frequently or LRU data?
As I figure it, this causes more problems than it solves. I mean, you can arrange chips in a checkerboard pattern or star, or whatever, and they can capacitively couple to each other, but you have no control over which chip capacitively couples to which. I guess this would work for a synchronous system-wide bus where everyone only has to know about what's on the data bus, but most current architectures have more than one bus. How do you direct who reads what bus if the bus is just 'there'?
[n/t]
Christ, are you mods on crack?
Who the frick modded it up AFTER I posted the links that prove he plagiarized that comment word for word from an article I replied to a week ago THAT HAS LITTLE TO DO WITH THIS ARTICLE?
Mary, Mother and Joseph are you dumbfucks retarded?!?
Fuck Beta. Fuck Dice
Which has already been done to death. I myself am doing this at the circuit level for my thesis (at the circuit level it has been done for 10 years). It's called neuMOS.
No end to the line of excuses for writing bad code...