Slashdot Mirror


HyperTransport 3.0 Ratified

Hack Jandy writes "The HyperTransport consortium just released the 3.0 specification of HyperTransport. The new specification allows for external HyperTransport interconnects, basically meaning you might plug your next generation Opteron into the equivalent of a USB port at the back of your computer. Among other things, the new specification also includes hot swap, on-the-fly reconfigurable HT links and also a hefty increase in bandwidth."

179 comments

  1. External HyperTransport? by DaHat · · Score: 3, Funny

    I can only imagine what that could do to us cheap bastards who have small clusters of older PC's sitting in a second bedroom or closet.

    "Hum... I can't quite afford a whole new system or even a motherboard and two new procs... I'll just add a new one to the back of an existing one"

    At last! The day of easily upgrading to a multi-proc system may soon be at hand! (assuming they also have some sort of... external hub device).

    1. Re:External HyperTransport? by merreborn · · Score: 4, Insightful

      "Hum... I can't quite afford a whole new system or even a motherboard and two new procs... I'll just add a new one to the back of an existing one" ...Except you'd need a hypertransport 3.0 motherboard to begin with, and enough appropriately clocked RAM to make use of the processor. The whole "External CPU" idea was just speculation anyway; it's not mentioned anywhere in the article.

      Point being, you'll never be able to plug a new opteron into _anything_ that's sitting in your closet right now.

    2. Re:External HyperTransport? by Anonymous Coward · · Score: 0

      That's nonsense. I guarantee you there would be "PCI HyperTransport Cards" available on the market - no matter how pointless such a card would be. As for RAM, it can be included on the card or as part of the external CPU enclosure. Then you're only limited by the PCI bus speed.

    3. Re:External HyperTransport? by merreborn · · Score: 1

      That's nonsense. I guarantee you there would be "PCI HyperTransport Cards" available on the market - no matter how pointless such a card would be. As for RAM, it can be included on the card or as part of the external CPU enclosure. Then you're only limited by the PCI bus speed.

      That's really, really limited -- the PCI bus isn't even considered suitable for 3D graphics cards anymore, much less CPUs.

      You're proposing putting an entire PC on a PCI card (since it's now gotta have RAM -- which requires a new BIOS and its own bus, which in turn means you've got most of a motherboard on there) -- why not just build an entire PC from the ground up, and ditch the old system entirely?

    4. Re:External HyperTransport? by InThane · · Score: 1

      They'll still make them.

      I seem to remember somebody selling "system extenders on a card" back in the late '80s or mid '90s - that fad may have come 'n gone more than once. I also seem to remember the benchmarking showing that the pathetically sad system speed was due to memory latency issues...

      --
      InThane
    5. Re:External HyperTransport? by sleigher · · Score: 1
      --
      All points of time and space are connected.
    6. Re:External HyperTransport? by Anonymous Coward · · Score: 0

      You make it sound like I was advocating for such a card. On the contrary, I originally implied there would be little point to the card. Perhaps I should have quoted "only", but it seemed unnecessary given the context.

      The sibling poster understood the point. Even though such cards would be pointlessly stupid, they could and would still be made, thus enabling the OP to use a new Opteron with existing older hardware.

    7. Re:External HyperTransport? by somersault · · Score: 1

      Didn't see anyone mention PCI buses, and I don't think a USB connection would have to go through it. The biggest problem IMO would be the latency from signals having to travel all the way from your USB port to communicate with the other processor/whatever master controller there is.

      You're right that we'll have faster architectures in the future that will make this obsolete/limited, but if it gives more raw processor cycles in the meantime, then some people will find it useful - though likely not mainstream gamers, more professionals doing intensive calculations (who can probably afford to get a new machine anyway, but if you have 16 more processors connected to your machine in a USB chain, why not?)

      --
      which is totally what she said
    8. Re:External HyperTransport? by Thundercleets · · Score: 0

      You can and will only connect what they want you to, as they control what can be connected when and where using TCPA/TPM. Then Billg gets his turn telling you what to do with it, and even if he will let you use his Windows on it, using DRM/etc.

  2. Re:f*** by jamieswith · · Score: 2, Funny

    Somehow I doubt this will become available on hyper-transport 3...

    I really can't see it being that kind of socket!

    For now why don't you just stick with your 'Current Solution' and stop dreaming that you need all that extra 'Bandwidth'

  3. So the CPU will still be waiting for RAM? by Anonymous Coward · · Score: 2, Interesting

    Maybe they should integrate the RAM in to the CPU or something.

    1. Re:So the CPU will still be waiting for RAM? by DaHat · · Score: 2, Interesting

      Good point... but do you really want to dedicate a large chunk of ram to a specific processor in such a manor?

      Sure, with it there would be a possibility of cache coherency issues while without there would be a performance hit whenever something hit the bus...

      I guess it'd depend on the cost of ram when building such a device... I'm guessing that a whopping 64-128 meg cache aught to be enough for sometime.

    2. Re:So the CPU will still be waiting for RAM? by robthebob · · Score: 5, Funny

      Not to be pedantic, but while I might not want to dedicate a large chunk of ram to a specific processor in such a manor, I might want to live in that manor, and maybe have my serfs carry out the computations for me.

    3. Re:So the CPU will still be waiting for RAM? by stinerman · · Score: 2, Informative

      In a design class I took, our professor talked about something called "processor-in-RAM". The idea is that you'd have a few processors all with their dedicated RAM. The program you are running would be copied in each processors's RAM. When a branch was ready to be taken, half the processors would go one way and the other half the other. The processors that guessed right would let the other processors know they were wrong and update them with the new information. This way there is no penalty hit as all branches are correctly predicted.

      I'm guessing that a whopping 64-128 meg cache aught to be enough for sometime.

      Yeah, it'd provide some huge performance gain, but the sheer cost of that much cache would easily be on the order of tens if not hundreds of thousands of dollars. Cache requires a few gates for each bit stored, while RAM uses gates to control capacitors (one capacitor for each bit).

    4. Re:So the CPU will still be waiting for RAM? by Loconut1389 · · Score: 1

      ccNUMA/NUMA architecture uses processor local ram whenever possible and transfers data from other memory when required. See SGI.

    5. Re:So the CPU will still be waiting for RAM? by stinerman · · Score: 1

      It was awhile ago when he explained it. Perhaps I didn't do his lecture justice for reasons of memory or he didn't do a good job of explaining it.

    6. Re:So the CPU will still be waiting for RAM? by Metabolife · · Score: 1

      First they're integrating the PCI-E bus into the cpu. And with AMD recently licensing Z-Ram tech, they should be able to fit more cache on chip with no performance penalties.

    7. Re:So the CPU will still be waiting for RAM? by Anonymous Coward · · Score: 3, Funny

      I second that pedantry, and have aught but praise for the parent poster.

    8. Re:So the CPU will still be waiting for RAM? by SpinJaunt · · Score: 1

      google search for AMD + Z-RAM ;) - certainly close.

      --
      /. is good for you.
    9. Re:So the CPU will still be waiting for RAM? by Loconut1389 · · Score: 1

      No, I think he was right, but I wasn't careful enough in how I tried to add on a thought and wasn't careful enough in reading your post ;o)

      I think what you're talking about is some sort of decentralized branch prediction, and that sounds like something I heard in one of my classes too. What I was trying to add was that in common use is something like ccNUMA where an application is run on a specific cpu/node and the memory for that thread is on that node board/cpu cache, but when something needs to come from memory elsewhere, it traverses the interconnect.

    10. Re:So the CPU will still be waiting for RAM? by Anonymous Coward · · Score: 0

      "The processors that guessed right would let the other processors know they were wrong and update them with the new information. This way there is no penalty hit as all branches are correctly predicted."

      Yeah but the processors that branched the wrong way would have still wasted cycles on incorrect calculations and they would still have to clear their pipelines before the correct data could be accepted. I really don't see that concept as being all that useful.

    11. Re:So the CPU will still be waiting for RAM? by slashjames · · Score: 1

      Take a variant of the older Slot 1 processor design. Have the processor on one side, and 4 memory slots on the other. It makes the traces between memory and processor almost negligible. Without the traces and slots for memory on the motherboard, they should get smaller and hopefully cost less.

    12. Re:So the CPU will still be waiting for RAM? by smallfries · · Score: 3, Informative

      You're mixing up a few pieces of technology here. Processors with their own dedicated memory have been invented many times by different people. Modern loosely coupled clusters fit this bill, but further back there were the transputer systems, in which each processor had memory on board. Systems like this are more difficult to program than single-image systems (even with a CSP derivative as the language) but they produce higher performance.

      The other thing that you are describing is multiway branch prediction. A processor like the Pentium guesses which way a branch goes and dispatches instructions down that path to the pipeline. When it is wrong there is a hit as the pipeline stalls and all of those cycles are lost. In multiway branching both outcomes of the branch are dispatched to the pipeline. The cost is that half the instructions being executed will be thrown away. If you go 2 branches deep then it is 75%. The advantage is that latency is minimised as the pipeline is always full.
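The 50% and 75% figures above can be reproduced with a rough sketch. It assumes an idealized machine where every unresolved branch forks execution into two equally sized paths and exactly one path turns out to be real; the function name is made up for the illustration.

```python
def wasted_fraction(depth: int) -> float:
    """Fraction of dispatched work thrown away when both sides of `depth`
    nested, unresolved branches are executed (idealized assumption)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    paths = 2 ** depth          # each unresolved branch doubles the live paths
    return 1.0 - 1.0 / paths    # only one of those paths is the real one


for depth in range(4):
    print(depth, wasted_fraction(depth))
```

Under that assumption the waste approaches 100% quickly as depth grows, which is presumably why real designs only follow both paths for a branch or two.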

      The last thing is processor-in-RAM, or smart memory. In this system a miniature processor is embedded on the DRAM die. The small processor is capable of computing striding patterns in arrays. As the program executes on the main processor, the smaller processor predicts which memory locations are going to be accessed and presends the data to the host processor, reducing latency.
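To make the "striding patterns" idea concrete, here is a toy stride detector. It is only a sketch of the general technique, not how any particular smart-memory part works; the class name, the rule of waiting for the same stride twice, and the sample addresses are all assumptions for the example.

```python
from typing import List, Optional


class StridePredictor:
    """Toy stride detector: predicts the next address once the same
    non-zero stride has been seen twice in a row (hypothetical design)."""

    def __init__(self) -> None:
        self.last_addr: Optional[int] = None
        self.last_stride: Optional[int] = None

    def observe(self, addr: int) -> Optional[int]:
        """Record one access; return a predicted next address, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride   # stride confirmed, fetch ahead
            self.last_stride = stride
        self.last_addr = addr
        return prediction


predictor = StridePredictor()
trace = [0x1000, 0x1040, 0x1080, 0x10C0, 0x2000]   # made-up array walk, then a jump
predictions: List[Optional[int]] = [predictor.observe(a) for a in trace]
print([hex(p) if p is not None else None for p in predictions])
```

The predictor stays quiet until it has seen two equal strides, and it drops its guess as soon as the access pattern jumps somewhere else.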

      Good luck on your class. Architecture is one of the more interesting courses in a CS degree.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    13. Re:So the CPU will still be waiting for RAM? by ArbitraryConstant · · Score: 1

      I've often thought that. They already have support for NUMA in OSes like Windows, Linux, etc, and DRAM cells take 1/6th as many transistors as SRAM cells (used in cache). You could still have RAM external to the CPU, it would just be recognized as being non-local.
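For a sense of scale on the 64-128 meg cache idea, here is a back-of-the-envelope transistor count. It assumes a classic six-transistor SRAM cell and a one-transistor, one-capacitor DRAM cell, and it ignores tags, decoders and sense amplifiers, so treat it as a lower bound rather than a real die estimate.

```python
SRAM_TRANSISTORS_PER_BIT = 6   # assumed classic 6T cell
DRAM_TRANSISTORS_PER_BIT = 1   # assumed 1T1C cell: one transistor plus one capacitor


def cell_transistors(megabytes: int, per_bit: int) -> int:
    """Transistors in the storage cells alone for `megabytes` of memory."""
    bits = megabytes * 1024 * 1024 * 8
    return bits * per_bit


for mb in (64, 128):
    sram = cell_transistors(mb, SRAM_TRANSISTORS_PER_BIT)
    dram = cell_transistors(mb, DRAM_TRANSISTORS_PER_BIT)
    print(f"{mb} MB: SRAM {sram:,} vs DRAM {dram:,} transistors")
```

On those assumptions a 128 MB SRAM cache needs over six billion transistors for the cells alone, which gives some idea why nobody was shipping one.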

      --
      I rarely criticize things I don't care about.
    14. Re:So the CPU will still be waiting for RAM? by philthedrill · · Score: 1

      Maybe they should integrate the RAM in to the CPU or something.

      The problem with integrating DRAM is that capacitance is very sensitive to heat; cells won't be able to hold a charge (and will be useless functionally) if temperatures get too high.

    15. Re:So the CPU will still be waiting for RAM? by ScrewMaster · · Score: 1

      I have a gigantic penis. None of this means you wont see some notable improvements from multiple cores and HT3, but as you know, that's down to the software.

      I dunno, given that this will be useful for embedded applications I'd say it's your firmware that's in question.

      --
      The higher the technology, the sharper that two-edged sword.
    16. Re:So the CPU will still be waiting for RAM? by Anonymous Coward · · Score: 0

      The last thing you want your serfs to do is something like screw up the zabulon computations or point a lasgun directly at the shield thingy you are developing.

    17. Re:So the CPU will still be waiting for RAM? by 10Ghz · · Score: 2, Interesting

      Instead of that, how about having some REALLY fast RAM right next to the CPU? Take a look at a modern vid-card. Hi-end models have 256-512MB of uber-fast DDR3-RAM on 256bit bus. And the GPU's are usually bigger than CPU's are. And still, they can sell the entire package (GPU, card and RAM) for about $500. What if we did something similar with CPU's? Instead of selling CPU's as chips, sell them as modules (like SGI and Sun do). Attached to that module would be the CPU, and attached to the CPU would be 256-512MB of ~1GHz RAM on a 256 bit bus.

      And before you say that that is too little RAM.... Other CPU's in the system would have such RAM-setup as well. There could also be traditional memory-banks attached to the Northbridge as well. So each CPU in the system would have 256-512MB of VERY fast RAM attached directly to the CPU. In addition to that, they could also access the RAM on other CPU's (like AMD64-machines do today). AND in addition to that, there would also be traditional memory-banks attached to the northbridge, for memory-expansion. The Northbridge-RAM would be shared with all the CPU's in the system (naturally).

      Of course, such a system would cost a bit more than current systems do. But it would have a metric assload of bandwidth. Would such system make any sense at all? Considering that vid-card makers can sell such RAM attached to relatively large GPU for around 500 bucks, why couldn't CPU-makers sell a smaller CPU with similar RAM for about same price?

      --
      Lesbian Nazi Hookers Abducted by UFOs and Forced Into Weight Loss Programs - -all next week on Town Talk.
    18. Re:So the CPU will still be waiting for RAM? by Ruphuz · · Score: 1

      You aught to be more respectful with others' orthographic faults.

      --
      My other post is a First.
    19. Re:So the CPU will still be waiting for RAM? by Pusene · · Score: 1

      Oh, you're sooooo old-school.
      I would just outsource the whole computing-thing.

      --
      Error #13: No coffee. Operator halted. Please place boot device at bottom.
    20. Re:So the CPU will still be waiting for RAM? by be-fan · · Score: 1

      The RAM attached to video cards is fast, but it has terrible latency. How do you think they clock it at 1GHz+? It wouldn't do at all for the main memory in an AMD64 machine.

      --
      A deep unwavering belief is a sure sign you're missing something...
    21. Re:So the CPU will still be waiting for RAM? by 10Ghz · · Score: 1

      IIRC, individual memory-chips (like in vid-cards) can be clocked higher than DIMM-modules can. And besides, we currently have 800MHz (effective) RAM. How about leaving it at 800MHz, but doubling the bus? Latency would be reasonable, but bandwidth would be twice as big.

      --
      Lesbian Nazi Hookers Abducted by UFOs and Forced Into Weight Loss Programs - -all next week on Town Talk.
    22. Re:So the CPU will still be waiting for RAM? by flaming-opus · · Score: 1

      You do raise an interesting point about the memory expansion. Opterons are limited in the amount of local memory they can address. Each opteron includes 2 memory controllers. DDR memory controllers can't drive more than 3 modules per bus, limiting an opteron to 6 dimms per cpu. Currently that puts a ceiling of 12GB/cpu of memory until 4GB dimms become available in quantity. For most people, this is not a real hindrance; 24GB for a dual-proc sled is plenty enough. There are some cases, however, when your real problem is the size of addressable memory. I've wondered if someone out there might not put together some motherboards with some memory-extender controllers that would plug into hypertransport, and add additional remote memory capacity to the cpu's on the board. I suppose the people who need more than 12GB/CPU are pretty few, and you just tell them to buy more processors, even if you're not cpu bound.

      SGI does something like this on the altix, which allows you to use memory-only nodes on the ccNuma fabric.
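The ceiling figures in that post follow from simple multiplication. This sketch just restates the poster's numbers (2 controllers, 3 modules per bus); the 2 GB DIMM size is inferred from 12 GB across 6 slots, and the function name is invented for the example.

```python
def max_memory_gb(controllers: int, dimms_per_bus: int, dimm_gb: int, cpus: int = 1) -> int:
    """Memory ceiling, assuming every slot holds the largest available DIMM."""
    return controllers * dimms_per_bus * dimm_gb * cpus


per_cpu_today = max_memory_gb(controllers=2, dimms_per_bus=3, dimm_gb=2)
dual_proc_today = max_memory_gb(controllers=2, dimms_per_bus=3, dimm_gb=2, cpus=2)
per_cpu_with_4gb = max_memory_gb(controllers=2, dimms_per_bus=3, dimm_gb=4)
print(per_cpu_today, dual_proc_today, per_cpu_with_4gb)
```

So 4 GB DIMMs would only double the per-CPU ceiling; anything beyond that needs more sockets or the kind of remote memory extender described above.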

    23. Re:So the CPU will still be waiting for RAM? by be-fan · · Score: 1

      They can be clocked higher, but the timings must be relaxed for them to operate at that speed. It's a limitation of the DRAM core, not the interface. The bus could be widened, but that doesn't help latency. The big problem with modern machines is actually not bandwidth, but latency. The AMD64, for example, gains almost nothing with DDR2, even with an almost doubling of bandwidth, because latency is not reduced.
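A crude model shows why extra bandwidth does so little when latency dominates. The 50 ns latency, 64-byte line and the two bandwidth figures are made-up round numbers, not measurements of any real AMD64 system.

```python
def fetch_time_ns(latency_ns: float, line_bytes: int, bandwidth_gb_s: float) -> float:
    """Time to fetch one cache line: fixed latency plus transfer time."""
    # 1 GB/s is one byte per nanosecond, so bytes / (GB/s) gives nanoseconds.
    return latency_ns + line_bytes / bandwidth_gb_s


# Hypothetical numbers, chosen only to show the shape of the effect.
LATENCY_NS = 50.0
LINE_BYTES = 64

slow = fetch_time_ns(LATENCY_NS, LINE_BYTES, 6.4)    # baseline bandwidth
fast = fetch_time_ns(LATENCY_NS, LINE_BYTES, 12.8)   # bandwidth doubled, latency unchanged
speedup = slow / fast
print(f"{slow:.1f} ns -> {fast:.1f} ns, speedup {speedup:.2f}x")
```

With these assumed numbers, doubling the bandwidth speeds up a single cache-line fetch by under 10%, because the fixed latency is most of the wait.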

      --
      A deep unwavering belief is a sure sign you're missing something...
    24. Re:So the CPU will still be waiting for RAM? by Anonymous Coward · · Score: 0

      Memory latency is in fact reduced slightly in AM2. The K8 is simply not bandwidth-limited, and so the increased bandwidth goes unutilized. If all you have is a water pistol, it makes little difference that you happen to be shooting it down the Grand Canyon.

  4. x86 processors by bioglaze · · Score: 1, Insightful

    So, x86 processors are finally getting on par with other processors from, like, 15 years ago?

    --
    Who is John Galt?
    1. Re:x86 processors by peragrin · · Score: 1, Interesting

      Why not? Windows is finally (hopefully?) getting on par with features that Unix enforced 15 years ago.

      And before anyone goes to say NTFS has had those features for years , if that was really true then why can i easily delete files on any windows machine. Why is it that malware can hide in any system directory? because MSFT never enforced those standards.

      --
      i thought once I was found, but it was only a dream.
    2. Re:x86 processors by heinousjay · · Score: 1

      I've always pictured the color of OS zealotry as a sort of bright flamingo pinkish hue. Make sense to you?

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    3. Re:x86 processors by Andrzej+Sawicki · · Score: 2, Insightful

      OMG Ponies?

      But seriously, you got it wrong. It's puke green, of course.

    4. Re:x86 processors by Anonymous Coward · · Score: 0

      Seems to me that the system administrator is the one who should be enforcing access restrictions... NTFS allows you to easily restrict access to files and directories and I've had no problem using it...

    5. Re:x86 processors by fitten · · Score: 4, Interesting

      Yup... It has always been thus. The difference is that the high-end processors do exotic things and then Intel/AMD suck it in when it is ready for commoditization. The x86 has *always* been behind in those types of technologies (but usually pretty far ahead in tricks to make the x86 ISA fast) because those technologies are high-end. Eventually, it all trickles down to commoditization and then we get it in x86s.

    6. Re:x86 processors by Anonymous Coward · · Score: 0

      I think you are confusing the features of NTFS with the way Windows user accounts are setup by default. If you can go and delete any files on Linux when you are root, does it mean that the filesystem is flawed? Sure, in the case of Windows, it is stupid to require normal users administrator access just to run some programs, but it has nothing to do with design of the filesystem.

    7. Re:x86 processors by Anonymous Coward · · Score: 0

      Actually, high end processors were never reputed to do lots of crazy stuff to go fast; quite the contrary, performance processors usually had a less complex set of instructions, i.e. RISC, and usually higher clock speeds. Of course, one big reason RISC was so fast was that compilers didn't exist that would take advantage of all of x86's instructions, and also many of those instructions would take multiple cycles, and also RISC systems usually had higher memory bandwidth to the CPU.

    8. Re:x86 processors by Wdomburg · · Score: 4, Informative

      So fifteen years ago everyone else had 20GB/sec buses? Funny, Sun seems to think they were using MBus, which peaked at around 350-400MB/sec. And HP was dropping CPUs on a GSC bus running at ~ 250MB/sec. I'd look up what state of the art was for SGI and IBM, but it would be silly. AMD and Intel surpassed other chip vendors on a number of fronts years ago.

    9. Re:x86 processors by jacksonj04 · · Score: 3, Insightful

      Okay, let me explain about the difference between hardware and software. Processors and HyperTransport, and thus the subject of this discussion, are hardware related. Windows and Unix are software. Blabbering on about how Windows is the scourge of the world and we should all use vi/emacs/insert_editor_here when the parent was clearly talking about hardware with no association other than your own (Extremely weak, see other replies) point seems a bit... oh I don't know. OS Zealous?

      --
      How many people can read hex if only you and dead people can read hex?
    10. Re:x86 processors by hackstraw · · Score: 1

      The x86 has *always* been behind in those types of technologies

      The x86 came out when disco was still cool.

      usually pretty far ahead in tricks to make the x86 ISA fast

      going on 30 years of hacking and PhDing the chip will do something.

      I've been ready for x86 to die for years now, but then Apple joined the club.

    11. Re:x86 processors by Anonymous Coward · · Score: 0

      Why is it that malware can hide in any system directory?

      Because you run everything (probably including internet explorer) as administrator, then blame MS when it blows up in your face? Guess how effective Unix file permissions are at stopping malware running as root from hiding in system directories...

    12. Re:x86 processors by imsabbel · · Score: 1

      I guess you should tell that to Cray, so they can dig in their old junkyards for their good old tech so they don't have to use HyperTransport anymore...

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
    13. Re:x86 processors by nihaopaul · · Score: 1

      Why's it a hardware firewall when it's just a processor, RAM, and a storage-like device in a box, but runs Linux?

    14. Re:x86 processors by daniel23 · · Score: 1


      Ouch, that hurt, Andrzej.

      --
      605413? Yes, it's a prime.
    15. Re:x86 processors by Lucractius · · Score: 2, Informative

      guess what

      Cray use Hyper Transport now

      --
      XML - A clever joke would be here if /. didn't mangle tag brackets.
    16. Re:x86 processors by nogginthenog · · Score: 1

      if that was really true then why can i easily delete files on any windows machine
      Because you have the rights to do so. No different from Unix.

      Why is it that malware can hide in any system directory? because MSFT never enforced those standards.
      Because the user who executed it had the rights to do so. No different from Unix.

      The problem with Windows is that by default most users are administrators, and some software depends on this. Fundamentally, Windows filesystem security is no different from Unix.

    17. Re:x86 processors by Wdomburg · · Score: 1

      I think that was the point.

  5. External FGPA units? by SaDan · · Score: 3, Interesting

    Hrm... Need a temporary boost in your folding at home project? Plug in an FPGA module!

    This can only be a good thing.

    1. Re:External FGPA units? by bunions · · Score: 1

      Oh lord I have been waiting for this for so long. Here's a fun link: http://starbridgesystems.com/

      --
      there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.
    2. Re:External FGPA units? by Anonymous Coward · · Score: 0

      Seems pretty much like the SNES and the SuperFX on a cart to me.

  6. A port? by Anonymous Coward · · Score: 2, Funny

    "you might plug your next generation Opteron into the equivalent of a USB port at the back of your computer"

    Is this a serial connection?
    Or will you need a foot wide port with 700 or so contacts on it?

    I know serial connections are very fast nowadays, but I don't know if you can get the entire memory bandwidth of a cpu without spreading the bandwidth in parallel connections.

    1. Re:A port? by Loconut1389 · · Score: 2, Interesting

      Check out the SGI/CrayLink setup used for ccNUMA - the port is around 2.5 inches, but has quite a lot of pins (maybe 100?). I don't think foot-wide is really necessary.

      IMHO, fiber optics, though delicate, could offer higher bandwidth. I'd rather have my whole fiber go dark from a break and know it than have one strand of many go out and not know it and have all kinds of wacky/intermittent behavior.

      I still struggle to understand why fiber optics are so expensive- the lasers used are fairly cheap and the cables really aren't that complex either and are made in enough quantity.. but I guess since it's not mainstream, it's expensive.

    2. Re:A port? by heinousjay · · Score: 0, Troll

      I still struggle to understand why fiber optics are so expensive- the lasers used are fairly cheap and the cables really aren't that complex either and are made in enough quantity.. but I guess since it's not mainstream, it's expensive.

      Prices are typically set based on how much the customers are willing to pay. The component cost generally isn't influential aside from forming a price floor. That's the market economy at work. You don't have to struggle, or even understand. Just stand back and let it do its work.

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    3. Re:A port? by symbolset · · Score: 1
      Looking at the doc... Wires the same as version 2. Just the clock speed bump. 16 Command/address data links, 2 clocks, one control, two system (PWROK and RESET). Clocks to 2.6GHz for bandwidth of 20.8GB/S. It doesn't say how many ground wires in the wire. Another slide says up to 32 signal wires and up to 41.6GB/S.

      Here's the pdf from another post: http://www.hypertransport.org/docs/tech/ht30pres.pdf
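The 20.8GB/S and 41.6GB/S figures can be reproduced from the clock and link width. This sketch assumes double data rate signalling (two transfers per clock) and that the quoted number adds up both directions of the link; the function name is invented for the example.

```python
def ht_bandwidth_gb_s(clock_ghz: float, link_bits: int) -> float:
    """Aggregate link bandwidth in GB/s, counting both directions."""
    transfers_per_ns = clock_ghz * 2                      # assumed DDR: two transfers per clock
    per_direction = transfers_per_ns * link_bits / 8      # bytes per ns, i.e. GB/s
    return per_direction * 2                              # assumed: figure sums both directions


print(ht_bandwidth_gb_s(2.6, 16))
print(ht_bandwidth_gb_s(2.6, 32))
```

On those assumptions a 16-bit link at 2.6GHz carries 10.4GB/S each way, which matches the slide's aggregate numbers.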

      --
      Help stamp out iliturcy.
    4. Re:A port? by Anonymous Coward · · Score: 5, Informative

      The reason fiber optic (particularly glass core) is so expensive is due to the difficult and sensitive process required to manufacture that cable, though the materials used are extremely inexpensive. The diameter of the glass core must be matched exactly to the wavelength of light to travel over that fiber. In addition the composition and purity of the glass must meet certain standards to prevent reflection, signal attenuation, or signal skew, all of which would result in inconsistent or degraded performance. As far as the lasers being cheap, yes a laser can be cheap, but again the same demanding requirements apply to both versions of laser used in data communications, which again increases the manufacturing cost.

    5. Re:A port? by Anonymous Coward · · Score: 0

      Thanks.

      It looks like it has a length limit of one metre.
      For inside a computer, that's very good isn't it?

    6. Re:A port? by heinousjay · · Score: 1

      Wow, did I piss off a communist? Does someone want to keep the mechanics of the free market under wraps? The secret's out, comrade.

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    7. Re:A port? by Boone^ · · Score: 1

      If you designed your untrusted channels with some type of sliding window go-back-and-retry protocol, you'd know if you broke a wire or it's otherwise noisy.
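As a rough illustration of that idea, here is a toy Go-Back-N simulation. It is a sketch only: the link model (a frame is lost on its first transmission if its number is in `drop_first_try`) and all names are made up, and a real link layer would use timers and actual ACK frames rather than lockstep rounds.

```python
from typing import List, Set, Tuple


def go_back_n(num_frames: int, window: int, drop_first_try: Set[int]) -> Tuple[List[int], int]:
    """Simulate Go-Back-N over a lossy link.

    Returns (frames delivered in order, number of retransmitted frames).
    """
    delivered: List[int] = []
    sent_before: Set[int] = set()
    retransmissions = 0
    base = 0                                   # oldest unacknowledged frame
    while base < num_frames:
        expected = base                        # receiver only accepts in-order frames
        for seq in range(base, min(base + window, num_frames)):
            if seq in sent_before:
                retransmissions += 1
            lost = seq in drop_first_try and seq not in sent_before
            sent_before.add(seq)
            if not lost and seq == expected:
                delivered.append(seq)
                expected += 1
            # frames arriving after a gap are discarded by the receiver
        base = expected                        # cumulative ACK; on a gap, go back and retry
    return delivered, retransmissions


frames, retries = go_back_n(num_frames=6, window=3, drop_first_try={1})
print(frames, retries)
```

The retransmission counter is the part that answers the parent's worry: if it keeps climbing, the link is noisy or broken, and you know about it instead of getting silent corruption.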

    8. Re:A port? by Anonymous Coward · · Score: 0

      optics... not economics.

    9. Re:A port? by Anonymous Coward · · Score: 0

      Your posts are insipid. You should really stop shattering bottles while they're up your rectum.

    10. Re:A port? by alas_anon · · Score: 1

      > The reason fiber optic (particularly glass core) is so expensive
      > is due to the difficult and sensitive process required to manufacture
      > that cable, The cable is cheap, it is the manual labor involved in mounting the connector to the single mode fiber that makes it so expensive.

    11. Re:A port? by Evil-G · · Score: 1

      Perhaps, but isn't multimode fibre used over short distances, such as less than a few kilometres? Surely using fibre to connect devices which are all in the same room wouldn't use singlemode fibre anyway?

      The connectors for multimode fibre are cheap and easy to connect to the fibre - I've managed to do a few myself, which are still in use.

    12. Re:A port? by incabulos · · Score: 1

      SGI's CrayLink worked this way - plug two identical systems together via a thick cable, boot each, and you suddenly have a single-system-image box with twice the cpu/ram/IO of any individual component system. Granted no-one other than SGI used it, so as SGI's business fades ( assuming the downward trend of the last few years continues ), it will be a dead technology.

      Remaking the concept via an open standard is the first step to getting this sort of specialised technology to filter down to the commodity x86 world. The linux kernel itself is already able to utilise NUMA-workalike archs thanks to the experience with SGI, so getting an optimised and well performing OS on this new hardware should be relatively easy.

  7. Nice... by Frumious+Wombat · · Score: 2

    So, you take the external interconnects, a large SMP box, and a transfer rate unachievable by anything except channel-bonded Myri/Infiniband/Quadrics, and you've suddenly commoditized (is that a word?) the Origin 2K architecture. Unfortunately, there will be that inevitable gap between "announced" and "benchmarkable", but this should lead to interesting system design.

    Computing might just become fun again. Small systems passing information around to form a display wall, or big systems chained together to become huge systems.

    --
    the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
    1. Re:Nice... by questionlp · · Score: 3, Interesting

      HT 3.0 will be a very good step toward bringing the Opteron closer to the Origin architecture, but the Opteron still lacks, or does not have good implementations of, the cache coherency and other caching features of NUMAlink used in the Origin servers/clusters. The Horus chipset helps in some ways, but doesn't help scaling beyond 8P in a glueless fashion.

      Just my $0.01

    2. Re:Nice... by TopSpin · · Score: 1

      commoditized (is that a word?)

      Whatever. If you're in IT and you don't invent two words a year you're coasting. Try 'elaborisha': (obviously excessive complexity for the sake of questionable or obsolete tangibles.) Zero hits on Google. Verb it and you have elaborize. :)

      Computing might just become fun again.

      It's fun now. Over at Supermicro you have four socket motherboards designed for 1U hosts. Intel is planning 4 core CPUs (MP, blah blah) by Q1 '07. 16 cores in 1U. Meanwhile Sun has an 8 core CPU shipping...

      About all the 'fun' I can handle right there. Never mind CPUs dangling on the end of hypothetical cables burning melt marks in the desk. Good way to keep coffee warm I guess.

      --
      Lurking at the bottom of the gravity well, getting old
    3. Re:Nice... by Gothmolly · · Score: 1, Troll

      You're right. So because it doesn't do $SPECIFIC_BUZZWORD, we should shitcan the entire thing. Very +1, Insightful.

      --
      I want to delete my account but Slashdot doesn't allow it.
    4. Re:Nice... by civilizedINTENSITY · · Score: 2, Insightful

      Are you suggesting AMD buy SGI?

    5. Re:Nice... by hackstraw · · Score: 1

      the Opteron still lacks good implementations of the cache coherency and other caching features of NUMAlink used in the Origin servers/clusters.

      I have an SGI running Linux that does cache coherency over NUMAlink with stock Itanium CPUs. Is there some reason what SGI has done cannot be extended to provide cache coherency over HTX?

      I don't know, but hopefully someone does.

    6. Re:Nice... by Frumious+Wombat · · Score: 1

      True, but being as we've lost so many good technologies, such as the Cray-style SHMEM, and others are languishing, i.e. the Origin coupling to make NUMA systems, it's nice to see steps in the right direction. There's always the worry that the future was nothing but business desktops glued together with funky networking and software hacks, and it's good to see that features the HPC community can use directly may become available again. Sometimes you just need the bigger box, and the ability to link Origins together in a modular fashion, from the pair-wise O200 to the full-sized O2K, always seemed to be a neat trick. As I said, whether the first implementation is optimal, bringing that capability to the PC world is a step in the right direction. Having this capability should spur some design effort into implementing the cache coherency and other features of the O2000, and help move the Intel/AMD hardware beyond its business-oriented PowerPoint-running roots.

      Of course, I speak as a chemist, for whom the Origin was easy to code for, if somewhat lagging in performance per processor. I would like to see that architecture, modular, and affordably priced, become available again.

      --
      the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
    7. Re:Nice... by ArbitraryConstant · · Score: 4, Funny

      "Are you suggesting AMD buy SGI?"

      Hell, I've got some change left over from lunch, I'm thinking of buying SGI.

      --
      I rarely criticize things I don't care about.
  8. not USB by Gates82 · · Score: 1
    Please tell me that there will be a better interconnect for this kind of processing power than USB (ultra-slow bus). It would be awesome for a direct-attached NIC or an external RAID to have this amount of bus bandwidth available. The HD possibilities become quite extreme.

    --
    So who is hotter? Ali or Ali's sister?

    1. Re:not USB by sconeu · · Score: 1

      He's not saying it would be USB. He's using it as an example of the external bus. That is, you'd have a new CPU, which you could plug into an external HT port.

      Obviously we need BadAnalogyGuy to comment on this one.

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    2. Re:not USB by dgatwood · · Score: 1
      That's a nice idea and all, but it doesn't make a lot of sense architecturally, at least for general-purpose computing. HT is designed as a peripheral bus. Making a CPU be a peripheral to the main system... well, you could offload work onto it, I suppose, and it would have DMA access, but it would still be the ultimate third wheel---far enough out that memory accesses would be relatively slow, and it couldn't realistically share peripheral access, so all UI interaction and device access would pretty much have to be handled on the main CPU/GPU, so you end up bottlenecked by the main CPU for a lot of stuff anyway.

      There are some spaces where this could be really useful---places where clusters work well already. AFAIK, effectively, it becomes an asymmetric NUMA variant (not CCNUMA) architecture. The extra work to make it CCNUMA so that it could work with shared data without lots of extra programmer overhead would probably be excessive, and thus probably not worth it, both from a man hours POV and a relative performance POV. That said, such an architecture would be good for intensive audio processing and other similar tasks with relatively low interstitial dependency, and if it could be effectively pipelined from CPU to CPU without going back to main memory (i.e. multiple external CPUs each pulling data from a queue in the previous CPU's RAM), it would be absolutely wonderful.

      The place where this would be really interesting, though, would be the whole "one bus to rule them all" space. You could use this to cheaply add external PCI slots without the relatively expensive hardware needed to send PCI more than a couple of inches (though this can also be solved using PCIe as the interconnect). You could use this to eventually supersede low performance busses like USB. This could easily be used for (small) cluster computing at very high speeds. This could replace serial ATA easily, since there isn't a huge amount of infrastructure in place that requires external SATA yet. This would be GREAT for things like audio interfaces, since you would effectively get local bus performance externally---the only thing that comes close right now without massive overhead is FireWire. And so on.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    3. Re:not USB by EndlessNameless · · Score: 2, Interesting

      That's a nice idea and all, but it doesn't make a lot of sense architecturally, at least for general-purpose computing. HT is designed as a peripheral bus. Making a CPU be a peripheral to the main system... well, you could offload work onto it, I suppose, and it would have DMA access, but it would still be the ultimate third wheel---far enough out that memory accesses would be relatively slow, and it couldn't realistically share peripheral access, so all UI interaction and device access would pretty much have to be handled on the main CPU/GPU, so you end up bottlenecked by the main CPU for a lot of stuff anyway.

      Um... I hate to break this to you, but AMD-64 CPUs use Hypertransport links as their interconnect already. Which means the way you described it is exactly how it works. The 100-series Opterons have 1 HT link that goes to the system's peripheral devices and buses. The 200-series Opterons have 2 HT links: one connects it to the other CPU and the other connects to peripheral devices. I think you can guess how many links the 400- and 800-series Opterons have.

      The place where this would be really interesting, though, would be the whole "one bus to rule them all" space. You could use this to cheaply add external PCI slots without the relatively expensive hardware needed to send PCI more than a couple of inches (though this can also be solved using PCIe as the interconnect). You could use this to eventually supersede low performance busses like USB.

      This is how HT is used internally already. It connects the CPU to the other buses and system devices (the other end of the link is usually terminated by the southbridge ASIC). As far as clustering goes, a 1-meter link makes it somewhat doable, but remember that there are already high bandwidth external interconnects like Infiniband in use. I didn't see anything in the article that suggested HT is capable of blowing the established technologies out of the water.

      --

      ---
      According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
    4. Re:not USB by dgatwood · · Score: 1
      Um... I hate to break this to you, but AMD-64 CPUs use Hypertransport links as their interconnect already. Which means the way you described it is exactly how it works. The 100-series Opterons have 1 HT link that goes to the system's peripheral devices and buses. The 200-series Opterons have 2 HT links: one connects it to the other CPU and the other connects to peripheral devices. I think you can guess how many links the 400- and 800-series Opterons have.

      Ah. But that's a different HT bus than the one that is used for peripherals. The peripheral HT bus could reasonably be extended outside the case. I'm not convinced that a CPU HT bus could be; I seriously doubt that the CPUs would be happy with a couple of feet of latency on a bus intended to provide cache coherency, much less a meter or more.

      Light travels about a foot per nanosecond in vacuum, so I'd estimate about 8 inches per nanosecond for electricity in copper, give or take. A three foot interconnect would thus be between four and five nanoseconds. That's 8-10 clock cycles each way, assuming a modest 2GHz CPU. That's a LOT of latency for cache coherency.
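      For anyone who wants to rerun that estimate with a different cable length or clock, here is a minimal sketch. The 8 inches per nanosecond figure is only my rough guess from above, not a measured value, and the function names are made up for illustration.

```python
# Back-of-the-envelope wire delay using the figures from the paragraph above.
# The default 8 in/ns signal speed is a rough estimate, not a measurement.

def one_way_delay_ns(length_inches: float, speed_in_per_ns: float = 8.0) -> float:
    """Propagation delay in nanoseconds over a cable of the given length."""
    return length_inches / speed_in_per_ns


def delay_in_cycles(delay_ns: float, clock_ghz: float) -> float:
    """CPU clock cycles that elapse during delay_ns (1 GHz = 1 cycle per ns)."""
    return delay_ns * clock_ghz


if __name__ == "__main__":
    delay = one_way_delay_ns(36.0)       # three-foot interconnect
    print(delay)                         # 4.5
    print(delay_in_cycles(delay, 2.0))   # 9.0 cycles each way at 2 GHz
```

      That lands in the middle of the 8-10 cycle range; double it for the round trip.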

      If you use a lightweight cache coherency protocol that assumes it is safe to do something until told otherwise and rolls back the changes if needed, that means that if one CPU pulls something into its cache and immediately changes it while another CPU reads it, the losing CPU could have to roll back farther than the total number of execution pipeline stages in most CPUs by the time it realizes that anything has happened. Whether this is a problem or not depends on what the CPUs are doing. At the very least, you no longer have strict cache coherency with such an arrangement.

      If, instead, the CC protocol waits for a response from the other CPU every time it pulls something into the cache, you are safe against those sorts of problems. However, if you do that, the round trip latency for three feet would add 10+ns of additional effective memory latency in addition to any other cache coherency overhead and the increased actual memory latency. This would result in a very noticeable performance drop compared to traditional SMP, as memory latency would then increase by 20% over existing designs for the local CPU and 40% for the distant one. And if you have two CPUs hanging off of separate meter-long interconnects, you've just increased memory latency by 20% for the local CPU and a whopping 60% for the distant ones.

      I'd literally have to see it working to be convinced that the CCNUMA architecture used in the AMD interconnect could cope with such an extreme CC latency, and even if it could, I'm also not entirely convinced that it would make sense to do so. There are plenty of applications for outboard CPUs that don't require cache coherency; for those applications, such a setup would work quite well... like the audio pipelining I mentioned... but to try to do CCNUMA over that long a bus with a modern CPU would be... frankly... nuts.

      As far as clustering goes, a 1-meter link makes it somewhat doable, but remember that there are already high bandwidth external interconnects like Infiniband in use. I didn't see anything in the article that suggested HT is capable of blowing the established technologies out of the water.

      I wasn't saying that it would. However, being built into stock systems has advantages in terms of cost. For large clusters, Infiniband (max 30Gbps per link) is a great choice. For smaller clusters, though, particularly for folks strapped for cash, a HT-based cluster would kick Ethernet's backside (even with Myrinet) from here to Cleveland.

      BTW, since you can support multiple Infiniband busses off a single PCIe bus that hangs off a HT bus without decreasing their performance, it is safe to say that, at least for clusters not requiring switching, an external HT (assuming bandwidth comparable to that of the internal version) would kick Myrinet and Infiniband into next week. The new HT 3.0 is over ten times as fast. :-)
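      A rough sanity check of the "over ten times" figure, as a sketch only. It assumes the 41.6 GB/s aggregate number quoted for HT 3.0 elsewhere in this thread and the 30Gbps Infiniband per-link figure above; the two numbers are not measured the same way (aggregate versus per-link), so treat the ratio as a ballpark.

```python
# Ballpark comparison of peak HT 3.0 bandwidth against one Infiniband link.
# Assumptions: 41.6 GB/s is the aggregate HT 3.0 figure quoted in this thread,
# and 30 Gbps is the Infiniband per-link figure from the comment above.

def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert gigabytes per second to gigabits per second."""
    return gbytes_per_s * 8


def speed_ratio(ht_gbytes_per_s: float, ib_gbits_per_s: float) -> float:
    """How many times faster the HT figure is than the Infiniband figure."""
    return gbytes_to_gbits(ht_gbytes_per_s) / ib_gbits_per_s


if __name__ == "__main__":
    print(round(speed_ratio(41.6, 30.0), 1))  # 11.1
```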

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    5. Re:not USB by be-fan · · Score: 1

      That's pretty much how HT works already. Even with a few cm worth of traces, the latency hit is about 20% for the local CPU, and 60% for non-local ones. When your memory latency is at least 50ns to begin with (and that's cutting-edge --- the best FSB-based platforms are up at 80-100ns), an extra 5ns wire delay is really not a whole lot.
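      To put a number on "not a whole lot", here is a quick sketch that uses only the latencies mentioned above (50ns, 80-100ns, and 5ns of extra wire delay). The helper name is made up for illustration.

```python
# Relative cost of extra wire delay on top of a baseline memory latency.
# All inputs are the rough figures from the comment above, not measurements.

def added_latency_pct(base_ns: float, extra_ns: float) -> float:
    """Extra delay expressed as a percentage of the baseline latency."""
    return 100.0 * extra_ns / base_ns


if __name__ == "__main__":
    for base in (50.0, 80.0, 100.0):
        print(base, added_latency_pct(base, 5.0))
    # 50.0 10.0
    # 80.0 6.25
    # 100.0 5.0
```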

      --
      A deep unwavering belief is a sure sign you're missing something...
  9. If you don't like what you see just change the... by tapfu · · Score: 0

    Oh wait.

  10. Hmmmm. by ultramk · · Score: 4, Insightful

    I can see an interesting situation where you could have a traditional CPU, to which you could plug in additional external processor modules as your needs expand. (assuming the OS could handle sharing out multithreaded apps over a variety of different multi-CPU configurations.)

    Dave has a processor intensive project this week? He gets the big stack plugged into his machine until someone else in the office needs it.

    Server getting bogged down? Add another couple modules to the system.

    I like the idea.

    m-

    --
    You catch enchiladas by picking them up behind the head and holding them underwater until they don't kick anymore -VeGas
    1. Re:Hmmmm. by DaHat · · Score: 2, Interesting

      I was thinking something similar... there is one issue that no one here has thrown out yet. Heat.

      Lets say your company has a 4-way hub that can be plugged into the system of choice... imagine the cooling such a thing would require in order to keep from burning up in its enclosed plastic or (more likely) metal box.

      Not to mention the noise... oh good god the noise. My dual core 3800+ at home is quite loud... I can only imagine what a few of those bad boys sitting on your desk would sound like under full load.

      I suppose a good deal of issues could be eliminated if low power cpu's were to be used in such a manor... then you wont have as many issues drawing from the host PC (ie not necessarily having to have an external power supply).

    2. Re:Hmmmm. by Anonymous Coward · · Score: 0

      That's especially interesting with the article on the FPGA-for-Opteron-socket still on the front page. I just wish people would use that to develop cheap special-purpose CPUs for hypertransport, e.g. crypto-accelerators, physics accelerators etc...

    3. Re:Hmmmm. by pivo · · Score: 1

      I'm not exactly sure what you're saying, but if you're implying that having a loud 4-way would be hot, then I'd have to agree with you, though I think I'd prefer bad girls rather than boys. The only thing I don't understand is what this has to do with a manor.

    4. Re:Hmmmm. by greg_barton · · Score: 1

      My dual core 3800+ at home is quite loud...

      Really? I've got a 4200 with the stock cooler and it's whisper quiet. I had a shuttle box before and I was afraid the switch would be unpleasant. (I've had cpu coolers before that sounded like jets taking off. Not good...) But the cooler that came with the cpu was just fine, not much louder than the shuttle, and I run it with an open case.

    5. Re:Hmmmm. by masklinn · · Score: 3, Insightful

      My dual core 3800+ at home is quite loud...

      No it isn't you dummy, your cooling system is, now just get a knowledgeable friend to slap a Thermalright HR-01 and a Nexus 120mm fan (undervolted to 9V) on it and it'll be whisper-quiet.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    6. Re:Hmmmm. by Anonymous Coward · · Score: 0

      I have a dual core 3800+ as well (stock cooler) and it's probably the quietest running processor I've ever had. Even the ceiling fan in my room is louder than the computer!

    7. Re:Hmmmm. by ultramk · · Score: 1

      See, I was thinking more along the lines of sealed, liquid-cooled units.

      One of the things that people forget is that one of the biggest reasons that most PC cases are loud is that they have to be upgradable, and re-configurable. By default, the fans are full speed, and the air flow isn't designed for quiet operation.

      If something's a sealed module that will never need to be opened, the thermal profile is a known quantity that can be engineered around. Look at Apple's G5 series: whisper quiet, unless your ambient temp is 80+ degrees. Of course, they aren't the most expandable things in the world, but we're just talking about modules here.

      Of course, pick energy-efficient processors, and vent them properly, and you can use passive cooling. No fans at all. Quiet.

      m-

      --
      You catch enchiladas by picking them up behind the head and holding them underwater until they don't kick anymore -VeGas
    8. Re:Hmmmm. by Anonymous Coward · · Score: 0
      You said:
      I suppose a good deal of issues could be eliminated if low power cpu's were to be used in such a manor...

      Also, seeing your earlier post on this same story, we find:
      Good point... but do you really want to dedicate a large chunk of ram to a specific processor in such a manor?

      In both cases, the word you want is manner. A manor is a landed estate.

      Also, the verb you were looking for was ought, not aught.

      This is why your student loans are not paid off.
    9. Re:Hmmmm. by ScriptedReplay · · Score: 1

      I can see an interesting situation where you could have a traditional CPU, to which you could plug in additional external processor modules as your needs expand.

      Indeed. It sounds like a Cell-type Opteron configuration waiting to happen. If AMD manages to pull something like that off, Intel will have to eat dust for a while. For now, though, it's a fun speculation.

    10. Re:Hmmmm. by masklinn · · Score: 1

      One of the things that people forget is that one of the biggest reasons that most PC cases are loud is that they have to be upgradable, and re-configurable. By default, the fans are full speed, and the air flow isn't designed for quiet operation.

      No, the one thing people forget about is that they didn't care about noise in the first place, not until they started going deaf, and they didn't want to put $50 into their CPU cooling solution and stuck to the crappy bulldozer-engine-like 60mm fans because it was cheaper.

      It's fairly easy to have a quiet, upgradeable, reconfigurable computer with a "low" investment (low meaning above average fans, an above average case, good CPU/GPU cooling solutions and a good, quiet PSU). It doesn't really come cheap, but you can find them at very acceptable prices.

      Silent (or quiet enough that you can fully forget about it) is another, much harder to tackle issue. And it costs far more than merely quiet too. But it's also doable. It's all about knowledge and giving you the means to reach your ends.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    11. Re:Hmmmm. by ponos · · Score: 1
      Lets say your company has a 4-way hub that can be plugged into the system of choice... imagine the cooling such a thing would require in order to keep from burning up in its enclosed plastic or (more likely) metal box.

      The new socket AM2 dual core Athlon X2 3800+ will be available in both "normal" 89W versions and ALSO 65W and 35W (!!) versions. The 89W number is already lower than what the Athlon (original one, not XP) 1400 would require. Simply put, these processors are not power hungry. Furthermore, you can even enable Cool&Quiet power management. My Athlon 3200+ is running at 1000MHz, 1.1V with very low power consumption while I'm typing. Modern processors are expected to require less energy, not more.

      Not to mention the noise... oh good god the noise. My dual core 3800+ at home is quite loud... I can only imagine what a few of those bad boys sitting on your desk would sound like under full load.

      There are at least 3 major sources of noise: (a) cpu cooler, (b) graphics card cooler and (c) PSU.

      Many graphics cards (including most modern nvidia cards) allow you to turn their fan down during 2D operation using something like Riva Tuner or nvclock (for linux). I have currently turned off the fan of my Geforce 6600GT and the temperature is reasonable (less than 50 degrees celsius).

      The stock cpu cooler is a decent performer and does not produce much noise. If you think it causes too much noise, I would suggest going for the Zalman 7700Cu or 9500 coolers. They are remarkably quiet and also very efficient.

      A quiet PSU will cost you a lot ($90+), so it's a second option. Consider the Zalman 460W or other products from reputable companies (OCZ, Tagan, Antec). You can also install cheap Akasa Pax.Mate sound insulation.

      In conclusion, your system does NOT have to be noisy simply because it's a dual core 3800+. Even faster systems can be made to run quiet with little extra expense and careful planning. Top of the line systems may require watercooling but, if you are paying $3000+ for a computer, you can probably afford the extra ~$200.

      P.

    12. Re:Hmmmm. by Slime-dogg · · Score: 1

      I suppose a good deal of issues could be eliminated if low power cpu's were to be used in such a manor...

      I only do this because you have written this twice. The word is "manner," not "manor." A manner is a way of acting. A manor is a mansion.

      --
      You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.
    13. Re:Hmmmm. by PitaBred · · Score: 1

      Ok, you've done it at least twice in this story. THE WORD IS MANNER, NOT MANOR. Please... I know they're homonyms. It's not that hard, though. Show respect for yourself and the people reading your comments by not coming across as ignorant.

    14. Re:Hmmmm. by default+luser · · Score: 1

      Exactly, here is a real-world example of a mid high-end X2 3800+ system built to be quiet. I recently built the following:

      Athlon 64 X2 3800+ w/retail cooler (3600 RPM full speed, not exactly quiet).
      GeForce 7900 GT (with a LOUD stock cooler)
      Asus A8N5X (with an annoying 6000 RPM fan on the chipset).

      To make this system quiet, I used the proper components, and voided a couple warranties:

      * Antec Sonata II case w/450w powersupply. VERY quiet, stock ($100). Is priced competitively with other quality cases.

      * Zalman vf900-Cu video card cooler, to replace the "miniature vacuum" stock cooler on the 7900 GT ($50). Alternatively, I could have used the vf700-Cu, or an Arctic Cooling fan ($30).

      * Zalman passive northbridge heatsink, to replace the 6000 RPM screamer ($5).

      * Scythe 120MM 800 RPM liquid bearing fan, to be mounted on the front fan mount to cool the hard drives and the new passive heatsink ($15).

      * Used the Q-fan feature of the ASUS motherboard to throttle the retail CPU fan. Even at max load it doesn't top 2400 RPM, and at idle it is a whisper at 1100 RPM ($0).

      Total cost to make my system quiet (assuming you could have bought a crappy, loud case + powersupply instead for $50):

      $120

      Was it worth it? HELL YES.
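      For anyone checking my math on the total, here is a small sketch of how the $120 comes out. The prices are the ones listed above; the $50 baseline is my assumed cost of the cheap case and power supply I would otherwise have bought.

```python
# Extra money spent on quiet parts, using the prices from the list above.
# The $50 baseline for a cheap case + power supply is an assumption.

QUIET_PARTS = {
    "Antec Sonata II case w/450w powersupply": 100,
    "Zalman vf900-Cu video card cooler": 50,
    "Zalman passive northbridge heatsink": 5,
    "Scythe 120MM 800 RPM fan": 15,
    "Q-fan throttling of the retail CPU fan": 0,
}


def extra_cost(parts: dict, baseline: int = 50) -> int:
    """Total spent on quiet parts minus what a loud case + PSU would cost."""
    return sum(parts.values()) - baseline


if __name__ == "__main__":
    print(sum(QUIET_PARTS.values()))  # 170
    print(extra_cost(QUIET_PARTS))    # 120
```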

      --

      Man is the animal that laughs.
      And occasionally whores for Karma.

    15. Re:Hmmmm. by masklinn · · Score: 1

      The current best PSUs are probably the Seasonic S12 series: they're extremely quiet (not fanless, but they come close) and they have extremely good stability. They don't come that cheap (500W is around $100, 600W is about $120) but they're worth it in my opinion (I do own a Seasonic S12/600, which replaced a Tagan 480W. Makes less noise AND my PSU has stopped heating my case. I was impressed.)

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
  11. Maybe I'm confused... by __aaclcg7560 · · Score: 0, Troll

    Why would I want to plug a CPU into a USB slot?

    1. Re:Maybe I'm confused... by jamieswith · · Score: 1

      Yes, You're confused... the term USB port here is not literal, but rather used to illustrate a small discrete port built into the back of the machine

      Hence why it says 'the equivalent of a USB port at the back of your computer' - it's not literally a USB port, but rather just an illustration to give you the idea of something small tucked away at the back of the system...

      Can you imagine how many lines this thing will need to run a true hypertransport bus? even if they trim it down it's going to be a heck of a lot more than the 4 used for USB.

  12. Re:f*** by eln · · Score: 5, Funny

    I really can't see it being that kind of socket!

    Oh I dunno, take it out to dinner, buy it a few drinks, you never know what could happen.

  13. OK, then - I'll ask. by Anonymous Coward · · Score: 0

    What happened? Did the /. eds go on strike for 4 hours?

    1. Re:OK, then - I'll ask. by jamieswith · · Score: 1

      Maybe they're off trying to figure out how to plug an opteron into a USB port...?

    2. Re:OK, then - I'll ask. by kfg · · Score: 1

      Naaaaaaaaaah! They're just doing a bit of "load balancing" to make up for the last time they kept posting story after story for hours after commenting was broken.

      KFG

  14. Finally! by mypalmike · · Score: 1

    A fast replacement for MIDI!

    --
    There are 0x40000000 types of people: those who understand 32-bit IEEE 754 floating point, and those who don't.
    1. Re:Finally! by bathyscaaf · · Score: 1

      There already is one -- Open Sound Control.
      Unfortunately it has only been implemented in a smattering of hardware, though it is available in a fair amount of software, including Max/MSP, SuperCollider, most of the Native Instruments stuff (their KORE hardware/softSynth system will be using it). Also implemented for Java, Perl, PHP, Ruby, etc.

      I believe it runs over ethernet using UDP.

      OK, back to the discussion....

    2. Re:Finally! by Anonymous Coward · · Score: 0

      mLan is a much more widely used technology that works over a FireWire connection. It is a peer-to-peer system that, like MIDI, allows devices to be daisy-chained without requiring routers or other peripheral technologies. Unlike MIDI however, the base specification includes full digital audio and video streaming as well as device control data, so it can also be used (for example) to connect digital mixing consoles to various recording devices using a single cable.

      The downside to mLan is the (current) lack of a full Linux implementation. It is however well supported on both MacOS ("classic" and OS X) and Windows XP.

  15. Brains by umbrellasd · · Score: 1
    We've got a liquid cooled CPU in a separate enclosure that is connected to the body by a HyperTransport Interconnect, too! Soon vendors will come to market with processors in self-contained water cooling devices where you just take the cord and plug it into the computer.

    Mother Nature knew it all along.

    1. Re:Brains by dpiven · · Score: 1

      Just wait until it's time to spawn some child processes.

  16. Increased Bandwidth by Metabolife · · Score: 5, Informative

    HT 3.0 increases the bandwidth to 41.6 GB/s, that's 86% more than 2.0. It's also expected to be backwards compatible with current motherboards using 2.0. The new processor will run with 3.0 speeds while the motherboard will be stuck with 2.0. The new Rev. F AMD cpus are expected to have HT 3.0. It should help with multi-processor systems where the high bandwidth connects each cpu.
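    If you want to see where those numbers come from, here is a minimal sketch. The 41.6 GB/s and 86% figures are from above; the 2.6 GHz and 1.4 GHz peak link clocks and the 32-bit link width are the commonly cited peak figures for HT 3.0 and HT 2.0, and the function name is made up for illustration.

```python
# Peak aggregate HyperTransport bandwidth from link clock and link width.
# Assumptions: double data rate signaling, a 32-bit link in each direction,
# and peak link clocks of 2.6 GHz (HT 3.0) and 1.4 GHz (HT 2.0).

def ht_aggregate_gbytes_per_s(clock_ghz: float, link_width_bits: int = 32) -> float:
    """Peak bandwidth in GB/s, summed over both directions of the link."""
    transfers_per_ns = clock_ghz * 2                         # two transfers per clock
    per_direction = transfers_per_ns * link_width_bits / 8   # bits -> bytes
    return per_direction * 2                                 # both directions


if __name__ == "__main__":
    ht3 = ht_aggregate_gbytes_per_s(2.6)
    ht2 = ht_aggregate_gbytes_per_s(1.4)
    print(round(ht3, 1))                    # 41.6
    print(round(ht2, 1))                    # 22.4
    print(round((ht3 / ht2 - 1) * 100))     # 86 (percent increase)
```

    These are peak numbers for the widest link at the highest clock; narrower links scale down proportionally.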

  17. NOT anything like USB at all. by Visaris · · Score: 5, Insightful

    Whoever submitted the article doesn't understand what the external HT links are for. They are _NOT_ a replacement for USB or any other similar technology. External HT is used to link multiple chassis together to form a large SMP box. This is similar to infiniband, etc. This is NOT designed to be a way to just plug in a CPU to an external port. Read the pdf:

    http://www.hypertransport.org/docs/tech/ht30pres.pdf

    --

    I am a viral sig. Please help me spread.
    1. Re:NOT anything like USB at all. by I+Like+Pudding · · Score: 5, Funny

      External HT is used to link multiple chassis together to form a large SMP box. This is similar to infiniband, etc.

      Oh, so it's like USB

    2. Re:NOT anything like USB at all. by codemachine · · Score: 1

      Yes, it isn't intended for external CPUs, but that doesn't mean it wouldn't be possible. It would depend on what the OS and motherboard could support. The earlier article about the FPGA that plugged into the Opteron's motherboard slot is an example of what could now be done outside the case, if there was demand for it.

    3. Re:NOT anything like USB at all. by Visaris · · Score: 1

      Oh, so it's like USB

      Well, I suppose so... if you really want to make an SMP box out of low bandwidth, high latency USB links...

      --

      I am a viral sig. Please help me spread.
    4. Re:NOT anything like USB at all. by cyngus · · Score: 3, Insightful

      The similarity they were referring to is the plug-and-play nature of USB. The external link capability combined with 3.0's hot swapping would allow you this same kind of flexibility. You completely missed the point of the analogy.

    5. Re:NOT anything like USB at all. by Anonymous Coward · · Score: 0

      They are _NOT_ a replacement for USB or any other similar technology.

      Speak for yourself - I need it to support my new 2million-DPI ultra-violet laser mouse!

    6. Re:NOT anything like USB at all. by I+Like+Pudding · · Score: 1

      Well, I suppose so... if you really want to make an SMP box out of low bandwidth, high latency USB links...

      My post was a joke. Your sense of humor = low bandwidth, high latency

    7. Re:NOT anything like USB at all. by ink_13 · · Score: 1

      You completely missed the point of the joke.

    8. Re:NOT anything like USB at all. by EnderWiggin99 · · Score: 1

      You completely killed the joke. Are you happy?

    9. Re:NOT anything like USB at all. by Anonymous Coward · · Score: 0
      You sure it's that similar to infiniband?


      Infiniband is a lot more like PCI-express. This is a northbridge interconnect.

  18. Wait, what's that I hear? by IlliniECE · · Score: 0

    The frontside bus just crashed! Seriously though, I'm curious to see what Intel's development will be in memory interfacing.

  19. In the meantime... by jd · · Score: 4, Interesting
    Broadcom's BCM1250 MIPS processor implements a totally non-standard HyperTransport that blends several of the early 1.x specifications in a way that is unpredictable and a pain. Yes, folks, there are manufacturers out there who don't debug or maintain their product lines, who won't stick to published specs, and who can't be relied upon to publish their own specs. Sometimes, those of us who post on Slashdot slam Intel for decisions that are nothing short of insane, but there are actually far far worse offenders out there.


    Most of the HyperTransport updates look to be good (and, frankly, about time) but I am highly concerned that if certain manufacturers (such as Broadcom) haven't even bothered to do better than a fragmentary 1.x and have ignored 2.x entirely, there is little hope that they'll do much with 3.x.


    And that's the big problem. If AMD are the only ones who ever implement the specification in full, correctly, then it doesn't offer any significant advantage. It isn't universal enough to be useful. That is the killer that has murdered so many excellent technologies. Being good - even being the best - isn't enough. If a rival is more widely adopted, then it'll be the rival that wins. The marketplace doesn't reward quality, it rewards popularity. Quality achieves nothing.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:In the meantime... by Jeff+DeMaagd · · Score: 1

      This is one reason I would like AMD to get serious about making chipsets, although that might give them added incentive to do a Broadcom and make their own chips to be a non-standard implementation just to be evil.

      Most of the problems I get with a computer are with those that use chipsets of a different brand as the processor, generally the low end ones though, because I've had very good experiences with a system that had a Serverworks brand chipset.

    2. Re:In the meantime... by Trinn · · Score: 1

      Hey they couldn't do worse than some of VIA's offerings (nah, I kid, I am just fed up with having an "agp 4x" board that can only do 2x because someone forgot to measure the load-drop on a line to the agp port......not to mention other weird issues.)

  20. Thanks alot by hurfy · · Score: 2, Funny

    Now half my brain will be trying to design a 939 connector USB cable in the background....

    hehe external CPU, someone got a better batch of something than i did.....

    1. Re:Thanks alot by mambru · · Score: 1

      Opteron has 940 pins, Athlon64 has 939. HyperTransport is a packet protocol; its pin count is completely unrelated ;-)

    2. Re:Thanks alot by Anonymous Coward · · Score: 0

      Half your brain? Based on the above post, that isn't going to leave you much compute for essential OS functions (like breathing). Again based on the above post, that might not be a bad thing.

      It is indeed a pity that they don't have something like hypertransport for humans.

    3. Re:Thanks alot by Trogre · · Score: 1

      Clearly someone not familiar with blade servers.

      --
      "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
  21. Broadcom isn't the whole industry: by Visaris · · Score: 2, Interesting
    --

    I am a viral sig. Please help me spread.
    1. Re:Broadcom isn't the whole industry: by Locke2005 · · Score: 1

      Gee... how about a hint about which specific computations it speeds up? FFT? DCT? Wavelet compression? I can think of lots of audio/video codec applications that this would be a win for, but nothing in the way of general purpose computing.

      --
      I've abandoned my search for truth; now I'm just looking for some useful delusions.
    2. Re:Broadcom isn't the whole industry: by hackstraw · · Score: 1


      Thanks for the link.

      This upgrade to HTX is welcome, and "A good thing(TM)". I posted about it earlier today before hearing about 3.0 coming out here http://it.slashdot.org/comments.pl?sid=183891&cid=15191568

      It was ignored from moderation, but I thought it was a good post, if I say so myself :)

  22. HyperAmps by TopSpin · · Score: 1

    Processor on a stick. Cool idea. Now we only need to update the USB spec to supply devices with 100W of power! While you're at it don't forget that we'll also want a couple hubs in the path.

    --
    Lurking at the bottom of the gravity well, getting old
  23. Re:f*** by heinousjay · · Score: 2, Interesting

    The fact that you were modded flamebait makes me wonder which fool computerfucker got points today.

    --
    Slashdot - where whining about luck is the new way to make the world you want.
  24. So finally by iminplaya · · Score: 2, Funny

    We'll be able to go from New York to Tokyo in less than three hours?

    --
    What?
    1. Re:So finally by smithmc · · Score: 1

        We'll be able to go from New York to Tokyo in less than three hours?

      Ninety minutes from New York to Paris, well by '76 we'll be A-OK...

      --
      Downmodding is the refuge of the weak. Don't downmod, make a better argument!
  25. Hypertransport is the wave of the future. by Inoshiro · · Score: 4, Informative

    Why are MacBook Pros so much faster than Powerbooks?

    The MacBook Pro sports a 666Mhz DDR FSB, while the Powerbook sports a 133Mhz FSB. It doesn't matter how fast your processor is if you don't have a fast enough way to power it (much like a V-12 will not do well with a single-barrel carb used on a lawnmower engine).

    The Von Neumann bottleneck is the significant limiting factor in all machines, once your working set of data exceeds that of your L1/L2 cache. Suddenly your 1.5 Ghz G4 is 266 Mhz :/

    Faster hypertransport means happier users of AMD machines. My AMD64 beats the pants off my Sempron 2500 because its 800Mhz HT bus allows it to do context switches in less than 1/3rd the time of the Sempron!

    --
    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
    1. Re:Hypertransport is the wave of the future. by Anonymous Coward · · Score: 0

      Faster hypertransport means happier users of AMD machines.

      And Apple PowerMac and iMac G5s, which have from 600 to 1.35GHz HyperTransport buses. Oh, and Intel-based Macs with FSBs between 667MHz and 1066MHz.

      Ours go to eleven, etc.

    2. Re:Hypertransport is the wave of the future. by Lord+Ender · · Score: 1

      That 266MHz statement is only true when every other instruction is hitting a new address in RAM. In reality, you are likely to be hitting the disk a lot, too. Then you'll have less than 1MHz performance :-)

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    3. Re:Hypertransport is the wave of the future. by aminorex · · Score: 1

      That's informative? Only if misinformation is a flavor of information.

      --
      -I like my women like I like my tea: green-
  26. Said before and said again... by zaguar · · Score: 2, Insightful
    I've said it before and I'll say it again - Open standards lead to better products. Case in point - Hypertransport. That story about the possibilities of fluid simulations/path finding in the oil industry opened up by co-processors slotting into HTT links is just a case in point.

    Hey Intel, how's the FSB? And, for that matter, how's that DRM-soaked Viiv product going?

    --
    "Sure there's porn and piracy on the Web but there's probably a downside too."
  27. It runs Hypercard like nobody's business by Saint+Stephen · · Score: 1

    You've gotta see my dedicated Hypercard stack co-processor running on top of my custom Hypertransport stack.
    It's smokin!

    1. Re:It runs Hypercard like nobody's business by Vorondil28 · · Score: 1

      Pfft, I've got one too -- with a TURBO button!!!

      --
      This sig rocks the casbah.
  28. External CPU? by WhiteWolf666 · · Score: 1

    Bah. Why bother.

    I'd rather have an external motherboard. Keep the CPU in the case, and everything else outside. /silly off.

    --
    WhiteWolf666 an exBush supporter. All you new-school,compassionate,save the children Republicans can rot in hell
  29. Legos by Slayback · · Score: 2, Funny

    Just make all the components (memory, CPU, disks, interfaces) like Legos, and you'll be set. Need more RAM? Just add another block. Suzy needs some extra CPU for a big project, let her borrow your block for the day.

    The bonus feature would be collecting enough hardware to make the Millenium Falcon out of your PC.

    1. Re:Legos by Anonymous Coward · · Score: 0

      Cool!

      Then I can fit the fibre-optic case-mod!

  30. Wow. Just. Wow. by Anonymous Coward · · Score: 1, Informative

    Err ... your AMD64 is good because it's got a low latency on-die memory controller. It doesn't even have to think about the slow FSB bottleneck.

    The fact that the link to the chipset is also fast is just a bonus.

  31. Apple by codemachine · · Score: 1

    Too bad Apple isn't making new products with Hypertransport anymore, now that they're using Intel instead of the G5 or AMD. It would be interesting to have a rack of XServe machines that just do plug-and-play clustering via a Hypertransport port. Unless they go with AMD in the XServe (which actually wouldn't make much sense for a 1U single/dual processor unit), then I don't think we'll see anything like this.

    1. Re:Apple by ciroknight · · Score: 1

      How do you know they're not making new products with HyperTransport? They're still a signed member of the HyperTransport Consortium, and could be using HT elsewhere in the business. Just because their mainline products don't use it, isn't any reason to write it off.

      And who knows, maybe they'll convince Intel into using HT instead of the CSI bus they've been working on for so long. Intel's got to have an in-house implementation of HT up and running (it's an open standard, why not?), it's not all that far-fetched (after all, Intel DID implement x86-64...)

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
  32. USB 1.1 at that! by Anonymous Coward · · Score: 0

    Nope, "ultra slow bus" it is! Version 1.1 for extra slowness. The nice thing is you can just keep stacking USB hubs and keep plugging in processors. Good thing these are USB powered or we'd need lots of plugs! The nice thing is I'm sure they'll come in all sorts of clear cases with LEDs. Which HD are you speaking of? High Definition? Hard Drive? Or Harley Davidson? Either way, I'm sure the possibilities will be extreme. Maybe even Xtreme!

  33. video cards by Joe+The+Dragon · · Score: 1

    Can we use the External port for a video card box?
    That would cut down on heat in your case by having the CPUs and RAM in one box and the video cards in another one.

  34. Low boost by nurb432 · · Score: 1

    Problem is current crop of FPGA chips aren't fast enough to replace a 'real' cpu.

    I'm a great fan of FPGAs and think they are cool, but I also know what their place is, and replacing comparably (relative cost/performance curve) cheap CPUs isn't it.

    --
    ---- Booth was a patriot ----
  35. or... by YesIAmAScript · · Score: 2, Informative

    Perhaps it's because your Sempron 2500 is a socket 754 chip, so cannot use dual-channel memory. The AMD64 has a faster FSB, and it's dual-channel.

    Many people (including yourself it seems) misunderstand HT. It isn't the FSB, an Athlon 64 has no FSB. HT is only used to communicate non-memory I/O and to synchronize caches between processors when doing memory I/O. So it's rather unlikely that HT could make your context switches 3X faster. Best thing for that would be a bigger cache, which your AMD64 probably has also.

    --
    http://lkml.org/lkml/2005/8/20/95
    1. Re:or... by Anonymous Coward · · Score: 0

      "Many people (including yourself it seems) misunderstand HT. It isn't the FSB, an Athlon 64 has no FSB."

      Picking nits here, but that is not really true. An Athlon 64 has a FSB that connects the core (or cores) to the onboard northbridge. It is called the SRQ (system request queue).

      Also, the coherent HT links used to communicate with other CPUs on the Opteron line are effectively an FSB. They just look weird compared to traditional FSBs because they are packetized. In the future, all FSBs will be packetized; AMD is just ahead of the curve here.

      A suitably motivated designer could design a traditional northbridge for the opteron that talks coherent HT, letting current parts use FBDIMMS or whatever. Obviously that would lose the latency advantage of an on-die memory controller, though.

  36. HT+Opteron+FPGA even sooner by galdur · · Score: 1

    ... we'll have custom HyperTransport socket FPGA chips to boost Opteron systems coming out of http://www.drccomputer.com/ real soon.

  37. I understand quite well. by Inoshiro · · Score: 1, Informative

    My AMD64 is a Socket 754, and my Sempron is Socket 462. It's on a much, much slower bus connection to its RAM. The Sempron has 180ns latency to RAM, while my AMD64 has 60 ns (worst case).
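    To put those two RAM latencies in perspective, here is a rough average-memory-access-time sketch. Only the 180 ns and 60 ns figures come from the measurements above; the 1 ns cache hit time and 2% miss rate are made-up illustrative values, and the function name is hypothetical.

```python
def average_access_time_ns(hit_time_ns, miss_rate, miss_penalty_ns):
    """Classic AMAT model: every access pays the hit time,
    and a fraction of accesses also pays the trip to RAM."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Assumed values for illustration only (not measured):
HIT_TIME_NS = 1.0   # hypothetical cache hit time
MISS_RATE = 0.02    # hypothetical fraction of accesses that miss the cache

# RAM latencies quoted in the comment above.
sempron = average_access_time_ns(HIT_TIME_NS, MISS_RATE, 180.0)
amd64 = average_access_time_ns(HIT_TIME_NS, MISS_RATE, 60.0)

print(f"Sempron average access: {sempron:.2f} ns")
print(f"AMD64 average access:   {amd64:.2f} ns")
```

    Even with a low miss rate, the RAM latency term dominates the average in this model, so the gap between the two machines shows up clearly.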

    The AMD64 average context switch latency is a few microseconds; 15ns average. Sempron is 10ns best, 70ns average. I can send you a PDF with a few hundred graphs I did with lmbench on several platforms for a research project recently, if you don't believe me.

    So, if my kernel is doing a context switch HZ times a second, I'm getting way better interactive performance on my AMD64 machine -- which is a socket 754 single-channel memory device. The FSB dominates.

    The bus connection between my CPU and the RAM is, indeed, the Hypertransport. Northbridge, CPU, and RAM are all connected by it. Perhaps you missed all the AMD documentation on this, or the entry in Wikipedia:

    "Front-Side Bus Replacement

    The primary use for HyperTransport is to replace the front-side bus, which is currently different for every machine (or some set of them). For instance, a Pentium cannot be plugged into a PCI bus. In order to expand the system the front-side bus must connect through adaptors for the various standard buses, like AGP or PCI. These are typically included in a controller called the northbridge.
    "

    And, yes, I am taking into account caches as well. I do appreciate the healthy skepticism.

    --
    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
    1. Re:I understand quite well. by ArbitraryConstant · · Score: 3, Informative

      "The bus connection between my CPU and the RAM is, indeed, the Hypertransport. Northbridge, CPU, and RAM are all connected by it."

      This is wrong. Athlon64s have an on-die memory controller. They communicate with memory directly through the dual-DDR memory bus, no intermediaries. This is what gives Athlons their famously low memory latency.

      In Athlon64s, the northbridge as we know it does not exist because the memory is connected directly to the CPU itself. The CPU is connected to the chipset by way of a hypertransport bus, and memory I/O for other devices goes over this bus to the CPU's memory controller.

      --
      I rarely criticize things I don't care about.
    2. Re:I understand quite well. by Pulzar · · Score: 1

      The bus connection between my CPU and the RAM is, indeed, the Hypertransport. Northbridge, CPU, and RAM are all connected by it.

      Well, whoever marked you as informative was fooled by the same info that fooled you into thinking this. Hypertransport, as the poster you are replying to explained, is *only* used to access non-memory I/O in single-CPU systems. In those systems, like yours, it is used as a link between the CPU and the northbridge (as the wikipedia article indicates), but, unlike Intel systems, the RAM is *not* located at the northbridge. The AMD CPUs have a memory controller in the same package as the CPU core -- that's why you can achieve those low latencies you measured.

      On Intel systems, though, the FSB is the link that ultimately leads to RAM. Its bandwidth only needs to be higher than (or equal to) the bandwidth of the RAM for it not to be the bottleneck. Faster FSB doesn't mean lower latency (on most chipsets, anyway), but it does mean higher bandwidth.

      Oh, and one nitpick -- the Intel FSB (used on new Intel Macs) is quad-pumped, not double-pumped (i.e. not DDR). So, the clock speed is 166MHz, and data transfer is 4x166 = 666MHz, like you said.
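      The quad-pumped arithmetic can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the base clock is taken as 500/3 MHz (the "166MHz" above, unrounded), and the 64-bit (8-byte) data bus width is an assumption not stated in this thread.

```python
def fsb_transfer_rate_mts(base_clock_mhz, transfers_per_clock=4):
    """Effective transfer rate in millions of transfers per second.
    transfers_per_clock=4 models a quad-pumped bus, 2 would model DDR."""
    return base_clock_mhz * transfers_per_clock


def fsb_peak_bandwidth_gbs(base_clock_mhz, transfers_per_clock=4, bus_width_bytes=8):
    """Theoretical peak bandwidth in GB/s (assumed 8-byte-wide data bus)."""
    rate = fsb_transfer_rate_mts(base_clock_mhz, transfers_per_clock)
    return rate * bus_width_bytes / 1000.0


base_clock = 500.0 / 3.0  # roughly 166.67 MHz
print(f"Quad-pumped rate: {fsb_transfer_rate_mts(base_clock):.1f} MT/s")
print(f"Peak bandwidth:   {fsb_peak_bandwidth_gbs(base_clock):.2f} GB/s")
```

      The same base clock double-pumped would only give about 333 MT/s, which is why the DDR versus quad-pumped distinction matters for the headline number.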

      Nobody doubts your data. It's your understanding of the system architecture that needed some updating, that's all. It's a common mistake, anyway, AMD kinda turned things around with their on-die memory controller :).

      --
      Never underestimate the bandwidth of a 747 filled with CD-ROMs.
  38. great... one more thing on my desk by atarione · · Score: 1

    all i need is an external processor on my desk.... it already drives me knutz enough when my cat gets too close to my LCD....

    BAD KITTY STOP BATTING MY EXTERNAL PROCESSOR AROUND..... THAT'S A BAD KITTY.

    --
    actually I am happy to see you, however that is in fact a banana in my pocket.
  39. RAM over HT link? by Pranadevil2k · · Score: 1

    I'm sure lots of people have thought of this before me, and probably even in this topic, but I have to ask: is there any reason they can't implement a HyperTransport link straight to the RAM? Has that already happened? I'm only a layman when it comes to processor/motherboard architecture, but it seems to me that with all that available bandwidth we should be throwing the kitchen sink into it.

    1. Re:RAM over HT link? by Al+Dimond · · Score: 1

      AMD procs these days have an on-die memory controller; they're connected directly to their RAM. HyperTransport is used for connecting processors in a multiprocessor system. This is useful because in an Opteron system each processor connects to a different slice of physical memory, so the processors need some reasonably fast way to access each other's memory.

      Adding a layer like HyperTransport on top of the direct connections present in current AMD systems would just make things slower.

    2. Re:RAM over HT link? by Pranadevil2k · · Score: 1

      If that's the case, why don't they get more than 8GB/s out of DDR2-800 RAM? (according to Anandtech review of AM2) It should be running more along the lines of 10-11GB/s out of its max 12... If that isn't caused by not having the bandwidth available, what causes it?
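      For reference, the theoretical ceiling behind that question can be worked out as follows. This is a back-of-the-envelope sketch, assuming dual-channel operation with a 64-bit (8-byte) bus per channel; the function names are hypothetical, and the 8 GB/s figure is simply the one quoted above.

```python
def peak_bandwidth_gbs(mega_transfers_per_sec, bus_width_bytes=8, channels=2):
    """Theoretical peak memory bandwidth in GB/s.
    Assumes each channel is bus_width_bytes wide and fully utilised."""
    return mega_transfers_per_sec * bus_width_bytes * channels / 1000.0


def efficiency(measured_gbs, peak_gbs):
    """Fraction of the theoretical peak that was actually achieved."""
    return measured_gbs / peak_gbs


peak = peak_bandwidth_gbs(800)  # DDR2-800: 800 million transfers/s per channel
print(f"Theoretical peak: {peak:.1f} GB/s")
print(f"8 GB/s measured is {efficiency(8.0, peak):.1%} of peak")
```

      The calculation only gives the ceiling; it says nothing about why a real system lands below it.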

  40. The Rain in Spain Falls Mainly... by umbrellasd · · Score: 1

    You have a strange manor of speaking, sir.

  41. Just avoid Broadcom by btarval · · Score: 2, Interesting
    Far be it from me to defend Broadcom (as no one in their right mind should choose the BCM1250), but the 1250 is an old, nearly unmaintained CPU. It was done about 6 years ago, when the HT spec was hardly off the ground. So, yes, it implements a non-standard version of HT; but the HT spec was still evolving.

    Instead of harping on the implementation (which was done in a slapdash, amateurish fashion by SiByte in order to make a quick buck - and screw the customer), you should blast Broadcom for basically dropping support for this CPU. Broadcom has done almost nothing whatsoever to improve the CPU. In fact, they go far out of their way to avoid the needed improvements. Witness the completely bogus (and nearly useless) JTAG support for the 1250.

    They used to have GDB support for it for free. That's all gone; and in fact no longer works with the new Rev C 1250's. Instead, you have nearly useless third-party support from Corelis and Greenhills.

    Forget source code debugging if you have a ClearCase SCM, unless you want to go through a bit of pain and hackery.

    And, hells bells, let's not talk about the memory controller, which is the worst one I've ever seen. If there were ever anything which needed improvement, it is that.

    In short, if you chose the BCM1250, you were an idiot and deserve what you got. No sane embedded person would do so. A clueless architect might, but not a real embedded engineer.

    I once had to inherit this mess; and I'm delighted to be done with it.

    So just avoid Broadcom altogether. They have an established track record of leaving you high and dry should you make the mistake of depending on them. And they just don't give a damn about their customers.

    --
    The best way to predict the future is to create it. - Peter Drucker.
  42. After taking some time to review direct connect.. by Inoshiro · · Score: 1

    I see what the original replier meant. I was correct for Intel, but I'd forgotten a few details of how AMD changed things with Athlon64. Certainly, HyperTransport's important for filling RAM, but once RAM is full, it's straight to the CPU.

    Thanks for the reminder.

    --
    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
  43. Patents by Anonymous Coward · · Score: 0

    And don't forget the patents on processes to make fiber!
    At least that was what one paper was claiming would help the adoption of fiber to the desktop. The combination of some important patents expiring and the increased costs of making better copper cabling would cause fiber to become a better choice in a few years for just about everything. Might be delayed until power over fiber starts working.

  44. Temporary?! by Anonymous Coward · · Score: 0

    I run folding@home 24/7 you insensitive clod!

  45. External 2nd processors - nothing new. by Anonymous Coward · · Score: 0

    We've only been plugging extra computing power into our BBC Micros for 25 years.

    The idea was that manufacturers could lash together a working platform very cheaply with just the processor, some RAM, a tiny bootstrap EPROM, one I/O chip and no interrupts. The BBC Micro then talked to all the peripherals on behalf of the processor. This is how the first ARM1 processors were mounted - and two decades on, a 64 MHz ARM7TDMI board is available.

  46. Too Thick to RTFA, But A Question... by gilgongo · · Score: 1

    Is this intended to be used for peripherals as well? For example, I might have a handheld device that I can plug in to a desktop to use its CPU to do processor-intensive stuff on the handheld that it would not normally be used for when on its own.

    Or is that completely wrong?

    --
    "And the meaning of words; when they cease to function; when will it start worrying you?"
  47. What you are describing is cache by flaming-opus · · Score: 1

    That's exactly what a cache is. It's very high speed memory, often SRAM, attached on a very wide bus. Instead of letting the programmer or the OS decide which parts of software to put in the high-speed ram, and what to leave in the low-speed ram, the cache controller does, essentially letting all the data have a place in the high-speed ram, but occasionally replacing it.
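    As a toy illustration of the controller "occasionally replacing" data, here is a minimal direct-mapped cache model. It is a sketch under simplifying assumptions (one tag per line, no actual data storage, no write handling); the class name and parameters are invented for the example, and real cache controllers are considerably more elaborate.

```python
class DirectMappedCache:
    """Tracks which memory block currently occupies each cache line."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # tag stored per line, None = empty
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, replace whatever was in the line."""
        block = address // self.line_size   # which memory block is touched
        index = block % self.num_lines      # the one line that block may use
        tag = block // self.num_lines       # identifies the block within that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # evict the old block, keep the new one
        self.misses += 1
        return False


cache = DirectMappedCache(num_lines=4, line_size=16)
for addr in range(64):  # sequential scan over 64 bytes
    cache.access(addr)
print(f"hits={cache.hits} misses={cache.misses}")
```

    Neither the programmer nor the OS picks what stays in the fast memory here: the address alone decides which line is used, and a conflicting block silently evicts the previous one.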

    What you describe doesn't really solve any real problems. Graphics cards benefit from fancy memories like gddr3 because they are bandwidth starved. We can see from the lack of performance increase in DDR2 systems, that opterons and pentiums are not bandwidth starved when it comes to memory, they are latency bound. Super-fast memory designs like XDR don't really help the latency problem, they only increase bandwidth, which is why you see them used for bandwidth starved micros like cell or the cray vector systems. SRAM would help the latency issue, but it's so expensive, you can't throw a quarter gig in a system, even a couple megs is really expensive, so it's better to use that as cache, rather than direct-address memory.

    Furthermore, you're not saving all that much money. Some of the cost of expensive graphics ram is the memory, but a lot of it is also the elaborate memory controller, and all of the bus pins. What you're proposing still requires expensive CPU designs, CPU sockets, many-layer motherboard layouts, and memory package designs. You still have most of the cost of high-end server designs, but only a fraction of your memory is fast. It doesn't seem worth it.

  48. Craylink, anyone? by csoto · · Score: 1

    This sounds very much like the interlink that SGI/Cray uses to turn 2- and 4-processor "bricks" into multi-way supercomputers. I guess it's just testimony as to how advanced those Cray-turned-SGI guys were (we're talking early 1990s)...

    --
    There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom