HyperTransport 3.0 Ratified
Hack Jandy writes "The HyperTransport consortium just released the 3.0 specification of HyperTransport. The new specification allows for external HyperTransport interconnects, basically meaning you might plug your next generation Opteron into the equivalent of a USB port at the back of your computer. Among other things, the new specification also includes hot swap, on-the-fly reconfigurable HT links and also a hefty increase in bandwidth."
I can only imagine what that could do for us cheap bastards who have small clusters of older PCs sitting in a second bedroom or closet.
"Hum... I can't quite afford a whole new system or even a motherboard and two new procs... I'll just add a new one to the back of an existing one"
At last! The day of easily upgrading to a multi-proc system may soon be at hand! (Assuming they also have some sort of... external hub device.)
Help Brendan pay off his student loans
Somehow I doubt this will become available on hyper-transport 3...
I really can't see it being that kind of socket!
For now, why don't you just stick with your 'Current Solution' and stop dreaming that you need all that extra 'Bandwidth'?
Maybe they should integrate the RAM into the CPU or something.
So, x86 processors are finally getting on par with other processors from, like, 15 years ago?
Who is John Galt?
Hrm... Need a temporary boost in your folding at home project? Plug in an FPGA module!
This can only be a good thing.
"you might plug your next generation Opteron into the equivalent of a USB port at the back of your computer"
Is this a serial connection?
Or will you need a foot wide port with 700 or so contacts on it?
I know serial connections are very fast nowadays, but I don't know if you can get the entire memory bandwidth of a cpu without spreading the bandwidth in parallel connections.
So, you take the external interconnects, a large SMP box, and a transfer rate unachievable by anything except channel-bonded Myri/Infiniband/Quadrics, and you've suddenly commoditized (is that a word?) the Origin 2K architecture. Unfortunately, there will be that inevitable gap between "announced" and "benchmarkable", but this should lead to interesting system design.
Computing might just become fun again. Small systems passing information around to form a display wall, or big systems chained together to become huge systems.
the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
--
So who is hotter? Ali or Ali's sister?
Oh wait.
I can see an interesting situation where you could have a traditional CPU, to which you could plug in additional external processor modules as your needs expand. (assuming the OS could handle sharing out multithreaded apps over a variety of different multi-CPU configurations.)
Dave has a processor intensive project this week? He gets the big stack plugged into his machine until someone else in the office needs it.
Server getting bogged down? Add another couple modules to the system.
I like the idea.
m-
You catch enchiladas by picking them up behind the head and holding them underwater until they don't kick anymore -VeGas
Why would I want to plug a CPU into a USB slot?
I really can't see it being that kind of socket!
Oh I dunno, take it out to dinner, buy it a few drinks, you never know what could happen.
What happened? Did the /. eds go on strike for 4 hours?
A fast replacement for MIDI!
There are 0x40000000 types of people: those who understand 32-bit IEEE 754 floating point, and those who don't.
Mother Nature knew it all along.
HT 3.0 increases the bandwidth to 41.6 GB/s, which is 86% more than 2.0. It's also expected to be backwards compatible with current motherboards using 2.0: a new processor capable of 3.0 speeds will fall back to 2.0 speeds on a 2.0 motherboard. The new Rev. F AMD CPUs are expected to have HT 3.0. It should help with multi-processor systems, where the high-bandwidth link connects each CPU.
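The 41.6 GB/s and 86% figures can be checked with a little arithmetic. The sketch below assumes the commonly quoted peak parameters, none of which are stated in the post: a 32-bit link, double data rate signalling, both directions counted, and maximum clocks of 2.6 GHz for HT 3.0 and 1.4 GHz for HT 2.0. The function name is made up for illustration.

```python
def ht_peak_gb_per_s(clock_ghz, link_width_bits=32):
    """Peak aggregate bandwidth of one HyperTransport link in GB/s.

    Assumes double data rate (2 transfers per clock) and counts both
    directions of the link, which is how headline figures are usually quoted.
    """
    transfers_per_clock = 2                    # DDR signalling (assumption)
    directions = 2                             # send and receive added together
    bytes_per_transfer = link_width_bits / 8   # 32 bits -> 4 bytes
    return clock_ghz * transfers_per_clock * bytes_per_transfer * directions


ht2 = ht_peak_gb_per_s(1.4)   # assumed HT 2.0 maximum clock
ht3 = ht_peak_gb_per_s(2.6)   # assumed HT 3.0 maximum clock
increase_pct = (ht3 / ht2 - 1) * 100

print(f"HT 2.0: {ht2:.1f} GB/s, HT 3.0: {ht3:.1f} GB/s, +{increase_pct:.0f}%")
```

Under those assumptions the numbers come out at 22.4 GB/s and 41.6 GB/s, which matches the 86% quoted above. A narrower link or a one-direction figure would be correspondingly lower.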
Whoever submitted the article doesn't understand what the external HT links are for. They are _NOT_ a replacement for USB or any other similar technology. External HT is used to link multiple chassis together to form a large SMP box. This is similar to InfiniBand, etc. This is NOT designed to be a way to just plug in a CPU to an external port. Read the PDF:
http://www.hypertransport.org/docs/tech/ht30pres.pdf
I am a viral sig. Please help me spread.
The frontside bus just crashed! Seriously though, I'm curious to see what Intel's development will be in memory interfacing.
Most of the HyperTransport updates look to be good (and, frankly, about time) but I am highly concerned that if certain manufacturers (such as Broadcom) haven't even bothered to do better than a fragmentary 1.x and have ignored 2.x entirely, there is little hope that they'll do much with 3.x.
And that's the big problem. If AMD are the only ones who ever implement the specification in full, correctly, then it doesn't offer any significant advantage. It isn't universal enough to be useful. That is the killer that has murdered so many excellent technologies. Being good - even being the best - isn't enough. If a rival is more widely adopted, then it'll be the rival that wins. The marketplace doesn't reward quality, it rewards popularity. Quality achieves nothing.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Now half my brain will be trying to design a 939 connector USB cable in the background....
hehe external CPU, someone got a better batch of something than i did.....
HT enabled FPGA boasts 300x performance gains in some computations
Processor on a stick. Cool idea. Now we only need to update the USB spec to supply devices with 100W of power! While you're at it don't forget that we'll also want a couple hubs in the path.
Lurking at the bottom of the gravity well, getting old
The fact that you were modded flamebait makes me wonder which fool computerfucker got points today.
Slashdot - where whining about luck is the new way to make the world you want.
We'll be able to go from New York to Tokyo in less than three hours?
What?
Why are MacBook Pros so much faster than Powerbooks?
:/
The MacBook Pro sports a 667 MHz DDR FSB, while the Powerbook sports a 133 MHz FSB. It doesn't matter how fast your processor is if you don't have a fast enough way to feed it (much like a V-12 will not do well with a single-barrel carb from a lawnmower engine).
The Von Neumann bottleneck is the significant limiting factor in all machines once your working set of data exceeds your L1/L2 cache. Suddenly your 1.5 GHz G4 is 266 MHz.
Faster HyperTransport means happier users of AMD machines. My AMD64 beats the pants off my Sempron 2500 because its 800 MHz HT bus allows it to do context switches in less than 1/3rd the time of the Sempron!
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Hey Intel, how's the FSB? And, for that matter, how's that DRM-soaked Viiv product going?
"Sure there's porn and piracy on the Web but there's probably a downside too."
You've gotta see my dedicated Hypercard stack co-processor running on top of my custom Hypertransport stack.
It's smokin!
Bah. Why bother.
/silly off.
I'd rather have an external motherboard. Keep the CPU in the case, and everything else outside.
WhiteWolf666 an exBush supporter. All you new-school,compassionate,save the children Republicans can rot in hell
Just make all the components (memory, CPU, disks, interfaces) like Legos, and you'll be set. Need more RAM? Just add another block. Suzy needs some extra CPU for a big project, let her borrow your block for the day.
The bonus feature would be collecting enough hardware to make the Millennium Falcon out of your PC.
Err ... your AMD64 is good because it's got a low latency on-die memory controller. It doesn't even have to think about the slow FSB bottleneck.
The fact that the link to the chipset is also fast is just a bonus.
Too bad Apple isn't making new products with Hypertransport anymore, now that they're using Intel instead of the G5 or AMD. It would be interesting to have a rack of XServe machines that just do plug-and-play clustering via a Hypertransport port. Unless they go with AMD in the XServe (which actually wouldn't make much sense for a 1U single/dual processor unit), then I don't think we'll see anything like this.
Nope, "ultra slow bus" it is! Version 1.1 for extra slowness. The nice thing is you can just keep stacking USB hubs and keep plugging in processors. Good thing these are USB powered or we'd need lots of plugs! The nice thing is I'm sure they'll come in all sorts of clear cases with LEDs. Which HD are you speaking of? High Definition? Hard Drive? Or Harley Davidson? Either way, I'm sure the possibilities will be extreme. Maybe even Xtreme!
Can we use the external port for a video card box?
That will cut down on heat in your case by having the CPUs and RAM in one box and the video cards in another.
Problem is, the current crop of FPGA chips isn't fast enough to replace a 'real' CPU.
I'm a great fan of FPGAs and they are cool, but I also know what their place is, and replacing comparably cheap CPUs (on a relative cost/performance curve) isn't it.
---- Booth was a patriot ----
Perhaps it's because your Sempron 2500 is a socket 754 chip, so cannot use dual-channel memory. The AMD64 has a faster FSB, and it's dual-channel.
Many people (including yourself it seems) misunderstand HT. It isn't the FSB, an Athlon 64 has no FSB. HT is only used to communicate non-memory I/O and to synchronize caches between processors when doing memory I/O. So it's rather unlikely that HT could make your context switches 3X faster. Best thing for that would be a bigger cache, which your AMD64 probably has also.
http://lkml.org/lkml/2005/8/20/95
... we'll have custom HyperTransport socket FPGA chips to boost Opteron systems coming out of http://www.drccomputer.com/ real soon.
My AMD64 is a Socket 754, and my Sempron is Socket 462. It's on a much, much slower bus connection to its RAM. The Sempron has 180ns latency to RAM, while my AMD64 has 60 ns (worst case).
The AMD64 average context switch latency is a few microseconds; 15ns average. Sempron is 10ns best, 70ns average. I can send you a PDF with a few hundred graphs I did with lmbench on several platforms for a research project recently, if you don't believe me.
So, if my kernel is doing a context switch HZ times a second, I'm getting way better interactive performance on my AMD64 machine -- which is a socket 754 single-channel memory device. The FSB dominates.
The bus connection between my CPU and the RAM is, indeed, the Hypertransport. Northbridge, CPU, and RAM are all connected by it. Perhaps you missed all the AMD documentation on this, or the entry in Wikipedia:
"Front-Side Bus Replacement
The primary use for HyperTransport is to replace the front-side bus, which is currently different for every machine (or some set of them). For instance, a Pentium cannot be plugged into a PCI bus. In order to expand the system the front-side bus must connect through adaptors for the various standard buses, like AGP or PCI. These are typically included in a controller called the northbridge."
And, yes, I am taking into account caches as well. I do appreciate the healthy skepticism.
All I need is an external processor on my desk.... It already drives me nuts enough when my cat gets too close to my LCD....
BAD KITTY STOP BATTING MY EXTERNAL PROCESSOR AROUND..... THAT'S A BAD KITTY.
actually I am happy to see you, however that is in fact a banana in my pocket.
I'm sure lots of people have thought of this before me, and probably even in this topic, but I have to ask: is there any reason they can't implement a HyperTransport link straight to the RAM? Has that already happened? I'm only a layman when it comes to processor/motherboard architecture, but it seems to me that with all that available bandwidth we should be throwing the kitchen sink into it.
You have a strange manner of speaking, sir.
Instead of harping on the implementation (which was done in a slapdash, amateurish fashion by SiByte in order to make a quick buck - and screw the customer), you should blast Broadcom for basically dropping support for this CPU. Broadcom has done almost nothing whatsoever to improve the CPU. In fact, they go far out of their way to avoid the needed improvements. Witness the completely bogus (and nearly useless) JTAG support for the 1250.
They used to have GDB support for it for free. That's all gone; and in fact no longer works with the new Rev C 1250's. Instead, you have nearly useless third-party support from Corelis and Greenhills.
Forget source code debugging if you have a ClearCase SCM, unless you want to go through a bit of pain and hackery.
And, hells bells, let's not talk about the memory controller, which is the worst one I've ever seen. If there were ever anything which needed improvement, it is that.
In short, if you chose the BCM1250, you were an idiot and deserve what you got. No sane embedded person would do so. A clueless architect might, but not a real embedded engineer.
I once had to inherit this mess; and I'm delighted to be done with it.
So just avoid Broadcom altogether. They have an established track record of leaving you high and dry should you make the mistake of depending on them. And they just don't give a damn about their customers.
The best way to predict the future is to create it. - Peter Drucker.
I see what the original replier meant. I was correct for Intel, but I'd forgotten a few details of how AMD changed things with Athlon64. Certainly, HyperTransport's important for filling RAM, but once RAM is full, it's straight to the CPU.
Thanks for the reminder.
And don't forget the patents on processes to make fiber!
At least that was what one paper was claiming would help the adoption of fiber to the desktop. The combination of some important patents expiring and the increased cost of making better copper cabling would cause fiber to become a better choice in a few years for just about everything. Might be delayed until power over fiber starts working.
I run folding@home 24/7 you insensitive clod!
We've only been plugging extra computing power into our BBC Micros for 25 years.
The idea was that manufacturers could lash together a working platform very cheaply with just the processor, some RAM, a tiny bootstrap EPROM, one I/O chip and no interrupts. The BBC Micro then talked to all the peripherals on behalf of the processor. This is how the first ARM1 processors were mounted - and two decades on, a 64 MHz ARM7TDMI board is available.
Is this intended to be used for peripherals as well? For example, I might have a handheld device that I can plug in to a desktop to use its CPU to do processor-intensive stuff on the handheld that it would not normally be used for when on its own.
Or is that completely wrong?
"And the meaning of words; when they cease to function; when will it start worrying you?"
That's exactly what a cache is. It's very high speed memory, often SRAM, attached on a very wide bus. Instead of letting the programmer or the OS decide which parts of software to put in the high-speed ram, and what to leave in the low-speed ram, the cache controller does, essentially letting all the data have a place in the high-speed ram, but occasionally replacing it.
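To make the "cache controller decides" point concrete, here is a toy direct-mapped cache. It is only a sketch: direct mapping is one simple placement policy among several, and the class name, the four-line size and the 64-byte line are arbitrary values picked for illustration, not a description of any real CPU.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each memory block may live in exactly one slot."""

    def __init__(self, num_lines=4, line_bytes=64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines   # which memory block each slot holds
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit, False on a miss (after loading the block)."""
        block = address // self.line_bytes   # which line-sized block of RAM
        slot = block % self.num_lines        # the only slot that block may use
        if self.tags[slot] == block:
            self.hits += 1
            return True
        # Miss: fetch from slow RAM and replace whatever was in the slot.
        self.tags[slot] = block
        self.misses += 1
        return False


cache = DirectMappedCache()
# Walk a 256-byte array twice, 8 bytes at a time. It fits in the cache,
# so only the first touch of each 64-byte block misses.
for _ in range(2):
    for addr in range(0, 256, 8):
        cache.access(addr)

print(f"hits={cache.hits} misses={cache.misses}")
```

The program never says what goes in fast memory; the address alone decides the slot, and a new block silently replaces the old one. Walk an array larger than the cache and the hit rate drops, which is the "occasionally replacing it" part.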
What you describe doesn't really solve any real problems. Graphics cards benefit from fancy memories like GDDR3 because they are bandwidth starved. We can see from the lack of performance increase in DDR2 systems that Opterons and Pentiums are not bandwidth starved when it comes to memory; they are latency bound. Super-fast memory designs like XDR don't really help the latency problem, they only increase bandwidth, which is why you see them used for bandwidth-starved micros like Cell or the Cray vector systems. SRAM would help the latency issue, but it's so expensive that you can't throw a quarter gig in a system. Even a couple of megs is really expensive, so it's better to use that as cache rather than direct-address memory.
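A back-of-the-envelope model shows why extra bandwidth does little for a latency-bound CPU. The sketch assumes that fetching one cache line costs a fixed latency plus the transfer time. The 60 ns latency echoes a figure quoted elsewhere in this thread; the 6.4 GB/s bandwidth, the 64-byte line and the function name are assumptions chosen for illustration.

```python
def line_fetch_ns(latency_ns, bandwidth_gb_s, line_bytes=64):
    """Time to fetch one cache line: fixed latency plus transfer time.

    bytes / (GB/s) conveniently comes out in nanoseconds.
    """
    transfer_ns = line_bytes / bandwidth_gb_s
    return latency_ns + transfer_ns


baseline = line_fetch_ns(60, 6.4)       # assumed starting point
double_bw = line_fetch_ns(60, 12.8)     # twice the bandwidth, same latency
half_latency = line_fetch_ns(30, 6.4)   # same bandwidth, half the latency

print(f"baseline:     {baseline:.0f} ns")
print(f"2x bandwidth: {double_bw:.0f} ns")
print(f"1/2 latency:  {half_latency:.0f} ns")
```

With these made-up numbers, doubling bandwidth trims a single line fetch from about 70 ns to about 65 ns, while halving latency brings it to about 40 ns. A workload that streams large blocks, as a graphics card does, spends most of its time in the transfer term instead, which is where the extra bandwidth pays off.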
Furthermore, you're not saving all that much money. Some of the cost of expensive graphics ram is the memory, but a lot of it is also the elaborate memory controller, and all of the bus pins. What you're proposing still requires expensive CPU designs, CPU sockets, many-layer motherboard layouts, and memory package designs. You still have most of the cost of high-end server designs, but only a fraction of your memory is fast. It doesn't seem worth it.
This sounds very much like the interlink that SGI/Cray uses to turn 2- and 4-processor "bricks" into multi-way supercomputers. I guess it's just testimony as to how advanced those Cray-turned-SGI guys were (we're talking early 1990s)...
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom