Slashdot Mirror


A Three-Way AMD Opteron Server

Abdul tips a thin little review up at The Inquirer of the Themis Slice. "The Slice is a three socket Opteron machine with two PCIe slots and two Infiniband 4x ports... Why would you want three sockets rather than four? Easy, latency. Any CPU in a 3S system is one hop away from any other CPU. In a 4S system, you can be two hops away. This adds latency, and more importantly, you take a big hit on cache coherency latency. This kills performance."
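The hop-count claim in the summary can be checked with a small graph sketch. The two topologies below are assumptions made for illustration (three sockets wired as a triangle, four sockets wired as a ring with two CPU-to-CPU links each); they are not taken from the Themis documentation.

```python
from collections import deque


def max_hops(links):
    """Return the largest shortest-path hop count between any two sockets."""
    worst = 0
    for start in links:
        # Breadth-first search from this socket to every other socket.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in links[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


# Hypothetical layouts: every socket has exactly two CPU-to-CPU links.
three_socket = {0: [1, 2], 1: [0, 2], 2: [0, 1]}                  # triangle
four_socket_ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # square ring

print(max_hops(three_socket))      # 1
print(max_hops(four_socket_ring))  # 2
```

With two links per socket, the triangle keeps every pair one hop apart, while the four-socket ring forces diagonal pairs through an intermediate CPU.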

137 comments

  1. Weird by Major+Blud · · Score: 1

    That is one weird looking board. "you take a big hit on cache coherency latency" Isn't this only a problem with NUMA based systems (of which Opteron is one)? The article also mentions UltraSparc and PowerPC-64....

    --
    If you post as Anonymous Coward, don't expect a reply.
    1. Re:Weird by ItsLenny · · Score: 0, Flamebait

      Isn't this only a problem with NUMA based systems

      NUMA?
      --
      ----------
      Trying to fix or change something only guarantees and perpetuates its existence
    2. Re:Weird by Anonymous Coward · · Score: 5, Informative

      This is also a problem on FSB systems, as all CPUs need to snoop the bus for cache coherency information. On Intel's dual-bus systems, this information needs to go across buses. The Intel 4 FSB systems are even worse. AFAIK, Opteron is the only x86 chip that would support 6 cores (12 cores with Barcelona) with a single hop.

    3. Re:Weird by Kreff · · Score: 1

      I think you're right.

    4. Re:Weird by CastrTroy · · Score: 1

      Couldn't you build a 4-way, 8-way, or N-way board where there isn't such a latency problem? Where each processor is connected to each other processor. Sure, the circuit design would be pretty complex, but if it is such a speed increase that 3 processors give you more power than 4, then it might be worth it. It might be very difficult with 16+ processors, but 4 shouldn't be that difficult. If it is impossible, please explain why.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    5. Re:Weird by TheRaven64 · · Score: 5, Informative

      Yes, it's possible. The main problem in general is that cost scales roughly with the square of the number of nodes, since a full mesh of n nodes needs n(n-1)/2 links. The main problem in the specific case of Opterons is that each chip needs one HyperTransport controller per other CPU. Current Opterons come with up to three HT connections, and you need one for connecting to the PCIe bus and other peripherals, leaving two for CPU-to-CPU connections.
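The link budget described above can be written out as arithmetic. This is only a sketch: the function name is hypothetical, and it assumes every CPU reserves the same number of links for I/O, as the comment does.

```python
def max_single_hop_sockets(links_per_cpu, io_links_per_cpu):
    """Largest socket count where every CPU links directly to every other CPU.

    Assumes each CPU gives up io_links_per_cpu links for PCIe and peripherals.
    """
    cpu_links = links_per_cpu - io_links_per_cpu
    if cpu_links < 0:
        raise ValueError("more I/O links reserved than the CPU provides")
    # Each CPU can reach cpu_links peers directly, plus itself.
    return cpu_links + 1


# Figures from the comment: three HT links, one kept for I/O.
print(max_single_hop_sockets(3, 1))  # 3
```

Under those numbers, three sockets is the most that can be fully connected, which matches the three-socket design in the article.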

      --
      I am TheRaven on Soylent News
    6. Re:Weird by contrapunctus · · Score: 1

      You'd have to put the processors in a circle or something...

    7. Re:Weird by poopdeville · · Score: 2, Interesting

      I was under the impression that this latency issue was caused by the fact that there is no positive solution to the utility problem. Essentially, each core is connected directly to the other two, in a planar graph. There's no way to connect each of 4 cores to the other three without the connections intersecting, at least if the connections are made on anything topologically the same as a convex subset of the plane (that is, no planar graph exists).

      This can be solved directly by creating chips with multiple planes on which connections can be made, or indirectly by running messages through other cores, at the cost of latency. Then again, I have no idea if multi-layer chips are in production.

      --
      After all, I am strangely colored.
    8. Re:Weird by pla · · Score: 2, Interesting

      If it is impossible, please explain why.

      Problem 1)
      Draw four circles on a piece of paper.
      Now draw a line from every circle to every other circle without crossing any lines.

      Problem 2)
      Draw four circles on a piece of paper. Draw two "pins" on each.
      Now draw a minimal path between any two circles such that you can only start and stop at a pin, and only one connection can go to a single pin.



      You have the right idea for problem 1, that for low-N, you can just route connections through different layers of the board. But that only works for low-N and doesn't generalize (though in fairness, neither does the "3-CPU" solution).

      For problem #2, no real solution exists other than limiting the degree of connectedness to some low number of pins (2 gives the simplest case above single-CPU, a daisy-chain or ring topology), or having centralized signal switching (star topology).

    9. Re:Weird by poopdeville · · Score: 1

      Wouldn't work. See http://mathworld.wolfram.com/UtilityGraph.html What you really need is to allow interconnections to go over or under each other.

      --
      After all, I am strangely colored.
    10. Re:Weird by TheRaven64 · · Score: 2, Informative

      Not really, because modern circuit boards are not planes. A modern motherboard is typically 7 layers, with wires in one layer all running parallel to each other. Within a die the utility problem is much more of an issue, but this is largely due to constraints other than those under discussion.

      --
      I am TheRaven on Soylent News
    11. Re:Weird by conspirator57 · · Score: 1

      Structured ASICs typically have several (7-9 being common) metal routing layers which can and do cross without interconnect on a regular basis. Vias == pillars formed by layering dots in each layer. Crossing is accomplished with other materials which may or may not be removed depending on your expense/yield/dielectric properties needs. The last impacts EM coupling of traces and inter-symbol interference, and by extension speed.

      --
      "If still these truths be held to be
      Self evident."
      -Edna St. Vincent Millay
    12. Re:Weird by contrapunctus · · Score: 1

      It's a joke.

    13. Re:Weird by poopdeville · · Score: 1

      So a specific question: do modern dice have the ability to use multiple planes? I'm referring specifically to those in use by AMD and Intel for multi-core machines. Circuit boards as such aren't really relevant to the issue of interconnecting cores on dice.

      --
      After all, I am strangely colored.
    14. Re:Weird by HaloZero · · Score: 1

      That assumes a two-dimensional topology. PCBs do not suffer that same constraint (e.g. they have more than a single layer to work with). Any side of a six-sided cube is adjacent to any other side, assuming that you have the ability to transport a unit along the interior of the cube. If you took all six processors, and wired them with the same theory, no processor is more than one hop away from any other.

      --
      Informatus Technologicus
    15. Re:Weird by rrhal · · Score: 5, Insightful

                 x
                /|\
               / | \
              /  x  \
             / .   . \
            x---------x

      --
      All generalizations are false, including this one. Mark Twain
    16. Re:Weird by poopdeville · · Score: 1

      Heh, sorry to ruin it. The fact that there is no positive solution to the utility problem is not obvious, so I took your suggestion seriously.

      --
      After all, I am strangely colored.
    17. Re:Weird by Anonymous Coward · · Score: 0

      I wish I still had mod points from this morning. You, sir, would get one on 5 of your posts.

    18. Re:Weird by Anonymous Coward · · Score: 0

      Very pretty.

      And just how do you propose to connect to the memory and other devices? That currently takes the spot of one of your Xs.

    19. Re:Weird by knapkin · · Score: 1

      If you read the link you posted you would see you have misquoted or misinterpreted the utility problem.  Below is a diagram showing how to connect 4 nodes to each of the other 3 without intersection in one plane.  Posted as code because I can't seem to get it to work otherwise.

      X---X
      |\ /|
      | X |
      \ | /
        X

    20. Re:Weird by poopdeville · · Score: 1

      The system bus is a "utility" for the purposes of the problem as well. There are two ways to interpret this: first, as a utility problem, since each core needs to connect to four utilities. Or, after a counting argument, as the 5-node complete graph K5, which cannot be embedded in the plane.
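The planarity argument in this subthread can be checked against the standard edge bounds that follow from Euler's formula. A minimal sketch, with hypothetical function names; note that the bound is only a necessary condition, so passing it does not prove a graph is planar (K4's planarity is shown by the drawing posted earlier in the thread).

```python
def complete_graph_edges(n):
    """Number of edges in the complete graph on n vertices."""
    return n * (n - 1) // 2


def passes_planar_edge_bound(vertices, edges, bipartite=False):
    """Check the necessary edge bound for a simple planar graph.

    General graphs:   edges <= 3 * vertices - 6
    Bipartite graphs: edges <= 2 * vertices - 4
    Both bounds apply for vertices >= 3. Failing the bound proves the graph
    is not planar; passing it proves nothing on its own.
    """
    if vertices < 3:
        return True
    limit = 2 * vertices - 4 if bipartite else 3 * vertices - 6
    return edges <= limit


print(passes_planar_edge_bound(4, complete_graph_edges(4)))  # True  (K4)
print(passes_planar_edge_bound(5, complete_graph_edges(5)))  # False (K5)
print(passes_planar_edge_bound(6, 9, bipartite=True))        # False (K3,3, the utility graph)
```

K4 has 6 edges against a limit of 6, K5 has 10 against a limit of 9, and the utility graph has 9 against a bipartite limit of 8.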

      --
      After all, I am strangely colored.
    21. Re:Weird by knapkin · · Score: 1
      This is true, if you include the system bus, you have 5 nodes which each have to be connected to the other 4 and you get K5. I was simply responding to the statement that:

      There's no way to connect each of 4 cores to the other three without the connections intersecting, at least if the connections are made on anything topologically the same as a convex subset of the plane (that is, no planar graph exists).
    22. Re:Weird by Bob-taro · · Score: 1

      Mod parent up! A picture is worth a thousand words.

      --
      Prov 9:8 Do not rebuke mockers or they will hate you; rebuke the wise and they will love you.
    23. Re:Weird by Poltras · · Score: 1

      I think you're right. And I don't think so. What's your point?
    24. Re:Weird by poopdeville · · Score: 1

      Agreed, your counter-example was appropriate for the context I set up. I assumed that the system bus would be included among the interconnections necessary, but phrased it in a way that made that very non-obvious.

      --
      After all, I am strangely colored.
    25. Re:Weird by originalnih · · Score: 1, Funny

      The plural of 'die' is not 'dice'.

      Let's get the basics straightened out before you resume pretending you know anything.

    26. Re:Weird by networkBoy · · Score: 1

      interconnect planes, yes.
      Transistor planes, no.

      A typical CPU by AMD or Intel is about 9-12 layers, only one of which (the bottom doped Si layer) has transistors. Everything else is poly or metal.
      -nB

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    27. Re:Weird by mcpkaaos · · Score: 1

      Very nice. But, you didn't use circles.

      --
      It goes from God, to Jerry, to me.
    28. Re:Weird by poopdeville · · Score: 1

      Look it up. Both dies and dice are acceptable.

      --
      After all, I am strangely colored.
    29. Re:Weird by Anonymous Coward · · Score: 0

      Problem is that you left out a spot for a connection to the outside world, so to speak. Just like the Athlon 64, the Opteron uses an HT link for its connection to the chipset and pretty much anything other than memory or other processors. To connect up those useful things like Ethernet connections, HDs or even video cards, you need to break at least one of those HT links and connect a processor up to a chipset.

      A 3-way server would not only have single-hop access to any memory but would also have 3 HT links available for connections to chipsets, allowing for at least 3 full PCI-E x16 slots and other crazy configs like 6 Gb Ethernet ports and 24+ SATA ports.

    30. Re:Weird by LuSiDe · · Score: 1

      It's not my field, but AFAIK SGI solved this in MIPS with a 'cross' architecture. Imagine it like an X.

      --
      WE DON'T NEED NO BLOG CONTROL.
    31. Re:Weird by Short+Circuit · · Score: 1

      So they're all on a contiguous bus, or is there an arbitrating node at the center?

      Multi-tap busses have their own problems (ISTR hearing about FSB speed limits on Intel multi-processor and early multi-core machines.), but that mini-node would guarantee each CPU to be two hops from any other.

      (Not my field either...)

    32. Re:Weird by KinkyClown · · Score: 1

      Is it Christmas already?

    33. Re:Weird by mabhatter654 · · Score: 1

      You're confusing the issue. AMD Opteron processors ship with up to 3 HT buses on board plus dedicated RAM. That is the physical limit. Intel processors only have 1 bus, then they play with different ways of connecting that to RAM and other components. The board is an AMD-unique thing. HT sees processors or other components as "equals", not a "master-slave" relationship like in Intel-land. That makes the AMD available for trying out weird ways of connecting hardware and writing programs.

      On a side note, this arrangement would make a killer gaming rig! You can pull 2 16x PCI-E buses off each processor. Imagine the video power you could kick out! Or on a workgroup server you could pull out extra south bridge chips for adding user I/O or disks. The usefulness of AMD's HT is only starting to be tapped.

    34. Re:Weird by blackicye · · Score: 1

      Look it up. Both dies and dice are acceptable.


      In Casinos, yes.
    35. Re:Weird by Anonymous Coward · · Score: 0

      A die in the context of integrated circuits is a small block of semiconducting material, on which a given functional circuit is fabricated. Typically, integrated circuits are produced in large batches on a single wafer of electronic-grade silicon (EGS) through processes such as lithography. The wafer is cut into many pieces, each containing one copy of the circuit. Each of these pieces is called a die.

      There are three commonly used plural forms: dice, dies, and die.


      http://en.wikipedia.org/wiki/Die_(integrated_circuit)
      http://www.patentstorm.us/patents/6869826.html
      http://www.patentstorm.us/patents/6380729.html
      http://www.wipo.int/pctdb/en/wo.jsp?WO=1985/04385
      http://ieeexplore.ieee.org/iel1/16/149/00002490.pdf?arnumber=2490
      http://ieeexplore.ieee.org/iel5/8973/28473/01271591.pdf
      http://ieeexplore.ieee.org/iel5/157/3478/00122279.pdf?isnumber=3478&arnumber=122279
      http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=105148

      Done being wrong yet?
  2. nothing new by Exter-C · · Score: 3, Informative

    There is nothing new in this product at all, IBM have had this type of server platform (3 socket supported) for some time in the form factor of the x3755.

  3. IBM System x3755 by OS24Ever · · Score: 5, Informative

    Disclaimer, I work for IBM.

    The IBM System x3755 has offered this feature since it came out as well. Instead of the fourth processor card you install a pass through card and it turns it into a three way. We've done a few benchmarks (warning pdf) with the Pass Through card and what it could do between 3CPU and 4CPU operations.

    pretty cool ability for a few things.

    --

    As a rock-in-roll Physicist once said, No matter where you go, there you are.

    1. Re:IBM System x3755 by Anonymous Coward · · Score: 5, Funny

      OS24Ever wrote, "Disclaimer, I work for IBM."

      You don't say... : p

    2. Re:IBM System x3755 by afidel · · Score: 1

      Rerun the test with the HP having 15K disks and I might not dismiss the results. Oh, and I hate that SPECjbb2005 doesn't require financial disclosure; jobs per $ and jobs per watt are the only things that really matter.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    3. Re:IBM System x3755 by mr_mischief · · Score: 2, Interesting

      Actually, I've never worked for IBM, and I keep pricing eComStation. I'd kind of like to use that on a system or two. Warp 3 is getting a bit paunchy. I don't want to drop it, though, because then I'd be down to Linux, BSD, Windows, OS X, DOS, and AmigaOS.

      Visopsys, ReactOS, OpenSolaris, plan9, Minix, QNX, MMURTL, OpenVMS, Haiku, and some others could serve for utility and novelty in varying degrees, but I already have plenty of software for OS/2.

      Yes, I'm an avid system collector. If you have hardware or software that's old, obsolete, and quirky, I probably want it.

    4. Re:IBM System x3755 by Zak3056 · · Score: 1

      Rerun the test with the HP having 15K disks and I might not dismiss the results.

      They actually did address this in their benchmark document:

      Configuration Exception
      Due to backorder shipping delays from HP on the 144GB SAS 15K RPM hard drives the 72GB SAS drives
      were deemed an acceptable substitute. The SPECjbb2005 workload tool does nothing to exercise the hard
      drive and writes no data to it.
      As a result, this configuration exception was determined to be immaterial to the
      performance results addressed in this study.


      So while I would still take it with a grain of salt, I wouldn't dismiss the results out of hand... usually if someone is trying to game the numbers, they don't come out and address the problem so directly.

      --
      What part of "shall not be infringed" is so hard to understand?
    5. Re:IBM System x3755 by OS24Ever · · Score: 1

      For the record, I used OS/2 before I worked for IBM, and not after I worked for IBM. I got a Win95 machine. Though at the time we still had end users on OS/2. I think it got pushed out when they started Y2K-ing things as a desktop OS. I was a heavy 1.3/2.x user, even ran a BBS on it in the early 90s, Maximus was the name if I remembered right.

      --

      As a rock-in-roll Physicist once said, No matter where you go, there you are.

    6. Re:IBM System x3755 by networkBoy · · Score: 1

      Maximus kicked ass.
      I ran a BBS on it in the dying days of the BBS era. There are times I want to bring it back (ala, dialup to initiate a circuit connection, then DSL to DSLAM connection, but the TelCos won't allow that because they are asshats).
      -nB

      Time to fire up the old BBS server and serial console port into it for fun :)

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    7. Re:IBM System x3755 by OS24Ever · · Score: 1

      A friend of mine kept his BBS saved on a disk, every once in a while he runs some TCP/IP to Serial thing that lets me telnet into his WWIV based system. I think Win 2k3 broke the last one he had so I've not used it in a while. Was quite the retro trip especially since I had unread email from 1993.

      --

      As a rock-in-roll Physicist once said, No matter where you go, there you are.

    8. Re:IBM System x3755 by Short+Circuit · · Score: 1

      My family was the longest in a chain of owners of a BBS that ran from 1987 until January or February of this year. From about 1993 on, we used Worldgroup (The DOS version; The Windows version was useless for us.) as the accounting and auth package for a dial-up ISP.

      Up until about 2004, dial-up access was offered by way of a Galactibox packed with Galacticards connected to 33.6Kbps and 56Kbps modems. (At its peak, Cyberspace BBS had around forty serial modems connected through that tentacled monster...Too bad most of the connections were people scripting Tele-Arena.) Around 1997, we added a terminal server to get more lines and offer a purely-digital connection (on our side, though one or two customers did pay for ISDN service.).

      As a kid, I learned you could crash Vircom's TCP/IP stack by trying to play Quake across a couple 33.6 connections into the same Worldgroup box. Crutch and I ended up dialing directly into each other after that, until my parents got ISDN. (When was the last time someone called you a Low-Ping Bastard? :-) )

    9. Re:IBM System x3755 by OS24Ever · · Score: 1

      We didn't specifically call out the cost of the systems, but we listed the exact hardware used for each test with part numbers. We used it because SPECjbb2005 is CPU-centric and doesn't rely on I/O as much as, say, TPC-C would, where you can just attach 2400 hard drives to it to drive the number up.

      Believe it or not, when my team is tasked with coming up with studies like this, we try to be as fair as possible and don't try to stack the deck. We know the people evaluating purchasing our stuff aren't that stupid, and get really ticked off when you think they are and try to pass by some silly stacked against the other guy benchmark.

      We just wanted to show two features we have on our box that are hard to explain to the 'people that control the money' when they buy stuff versus the people that actually use the hardware and understand how they could apply either a 3-cpu configuration or how all 8 slots of memory run at 667MHz. That's what we tried to do with a third party.

      The third-party paper company actually purchased both servers from resellers, not from us. They just give us a bill for the hardware that we'll pay. It's the best way we can figure out at the moment to try and remain 'impartial', as 'impartial' as you can be by asking a company to run a benchmark we're fairly certain we'd win.

      Again as a disclaimer I do work at IBM, but I'm not speaking for them, just what I know goes through my own head when we're working on coming up with studies like this.

      --

      As a rock-in-roll Physicist once said, No matter where you go, there you are.

    10. Re:IBM System x3755 by PrescriptionWarning · · Score: 1

      Probably also worth noting that an x3755 takes up 4U of rack mount space. This Opteron Slice thing looks like it might be more dense than that if it's similar to a blade.

    11. Re:IBM System x3755 by OS24Ever · · Score: 1

      Good point, my intent was to point out that the 3-cpu idea wasn't nutty, it actually had some merit.

      --

      As a rock-in-roll Physicist once said, No matter where you go, there you are.

    12. Re:IBM System x3755 by networkBoy · · Score: 1

      "(When was the last time someone called you a Low-Ping Bastard? :-) )"
      Fairly recently.
      I can select from proxy servers across the world as exit points and my local site has OC48.
      Not telling where though.
      -nB

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
  4. A three way, huh? by Anonymous Coward · · Score: 0

    Can't post that one on Youtube.

  5. What is this article about? by WFFS · · Score: 2, Funny

    Sorry... I tuned out after 'A Three-Way'.

    1. Re:What is this article about? by camperx2k7 · · Score: 1

      Damn. I was gonna go for the Jerry Maguire-esque "You had me at 'three-way'".

  6. CoProcessors? by tji · · Score: 4, Interesting

    Wasn't AMD also talking about licenses or agreements with other companies to allow for different types of coprocessor chips to be used alongside their processors?

    There is some interesting potential in that realm.. Crypto accelerators for VPN, SAN, or other devices. Multimedia encode/decode accelerators (encode 1080P H.264 in real time?). Inevitable video game acceleration devices (physics co-processor, accelerated NIC chip, 3D GPU offload processor?).

    Those would be even more interesting in home-user oriented Athlon64 boards. Multi-socket opteron boards are out of my price range.

    1. Re:CoProcessors? by DigiShaman · · Score: 2, Insightful

      That's why we have buses to open up expansion possibilities.

      For example, we have NIC chips that offload TX checksum processing, audio accelerators (Creative X-Fi), 3D GPU cards (nVidia and ATI cards), and physics cards (ASUS brand AGEIA card). The only reason you want a dedicated socket is for extremely fast and wide IO to RAM. So far, only the GPU has come close to needing that, and it is hanging on just fine with the PCI Express interface.

      --
      Life is not for the lazy.
    2. Re:CoProcessors? by LWATCDR · · Score: 1

      Some devices are already available that plug into extra AMD sockets.
      FPGAs are very popular so that you can create custom co-processors.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    3. Re:CoProcessors? by Anonymous Coward · · Score: 0

      I'm sure you're right. You should call AMD and tell those stupid engineers to stop the project.

      http://enthusiast.hardocp.com/news.html?news=MTkyNTgsLCxoZW50aHVzaWFzdCwsLDE=

  7. What would you do... by Tackhead · · Score: 5, Funny
    ...with a million dollars?

    > Why would you want three sockets rather than four? Easy, latency. Any CPU in a 3S system is one hop away from any other CPU. In a 4S system, you can be two hops away. This adds latency, and more importantly, you take a big hit on cache coherency latency. This kills performance."

    Lawrence: Three chips at the same time, man.
    Peter: That's it? If you had a million dollars, you'd use three sockets at the same time?
    Lawrence: Damn straight. I always wanted to do that, man. And I think if I worked at AMD I could hook that up, too; 'cause I hate motherboard layouts with latency.
    Peter: Well, not all layouts.
    Lawrence: Well, the type of chips that'd triple up on a board like that would.
    Peter: Good point.
    Lawrence: Well, what about you now? What would you do?
    Peter: Besides three chips at the same time?
    Lawrence: Well, yeah.
    Peter: Idle.
    Lawrence: Idle, huh?
    Peter: I would relax... I would sit on my ass all day... I would idle.
    Lawrence: Well, you don't need a million dollars to idle, man. Take a look at that fourth chip: it's two hops away, don't do shit.

    1. Re:What would you do... by realdodgeman · · Score: 0, Offtopic

      You don't need a million dollars. I know Microsoft is a rip-off, but 1 million for the Xbox 360? Don't think so... But then again they use PowerPC CPUs.

    2. Re:What would you do... by SCHecklerX · · Score: 1

      methinks you missed the office space reference.

    3. Re:What would you do... by Creepy+Crawler · · Score: 1

      When you mention Lawrence and computers, I think of this Story.

      And they're not good thoughts.

      --
    4. Re:What would you do... by Anonymous Coward · · Score: 0

      Only on Slashdot would the expression "three way" be used in reference to CPUs and get guys excited too.

    5. Re:What would you do... by adamofgreyskull · · Score: 1

      You know what I would do if I had a million dollars? I would invest half of it in low risk mutual funds and then take the other half over to my friend Asadulah who works in securities.

  8. Mac OS X on this machine... by andrewd18 · · Score: 2, Funny

    Any CPU in a 3S system is one hop away from any other CPU.
    So... if I run Mac OS X on this box, can we call it an iHOP?
    1. Re:Mac OS X on this machine... by mrchaotica · · Score: 2, Funny

      Only if you use it to fry pancakes!

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    2. Re:Mac OS X on this machine... by revengebomber · · Score: 1

      Next year at QuakeCon: See the amazing PanPC, the only computer specially designed to fry pancakes with its heatsinks!

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  9. Where's the specs? by achbed · · Score: 2, Interesting

    There's no reference to this board/blade anywhere on the manufacturer's site. The only thing I can find is that this guy saw this board at a conference and took a shot and wrote a really short article about it. Ok, so a 3-way is a bit of a novelty, but good luck getting it to work. Isn't most microcode on the processors designed with 1, 2, or 4 way in mind? And isn't the cache coherency microcode embedded (at least in part) on the processors themselves? So setting up a 3-way using current processors would actually increase latency and error-checking, correct? IANAPD, but this seems like a dead end.

    1. Re:Where's the specs? by SQL+Error · · Score: 4, Funny

      No.

    2. Re:Where's the specs? by mistahkurtz · · Score: 1

      well... i called the mfr, and they don't sell this stuff to you and me. their gear is put on aircraft carriers and destroyers, where they have specialty applications written. take the article for what it truly is, an fyi on some interesting technology, and leave it at that.

      --
      not only is time travel possible, it's irrelevant.
  10. Threesome by macdaddy · · Score: 2, Funny

    So what kind of doe will this Opteron Threesome run me?

    1. Re:Threesome by garcia · · Score: 1
    2. Re:Threesome by rootofevil · · Score: 1

      a deer, a female deer.

      --
      turn up the jukebox and tell me a lie
    3. Re:Threesome by smurphmeister · · Score: 5, Funny

      So what kind of doe will this Opteron Threesome run me?

      Probably a couple of bucks at least!
    4. Re:Threesome by everphilski · · Score: 1

      rei, ayanami?

    5. Re:Threesome by Dystopian+Rebel · · Score: 1

      Too dear for me. Anyway, AMD is like Bambi in Intel's Core 2 Duo headlights right now.

      --
      Rich And Stupid is not so bad as Working For Rich And Stupid.
    6. Re:Threesome by mortonda · · Score: 1

      You owe me a new keyboard.... LOL

    7. Re:Threesome by Fulcrum+of+Evil · · Score: 1

      Mmm, deer...

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
  11. Same latency with 4 processors by Laxator2 · · Score: 4, Interesting

    The article states that with 3 processors one gets better performance, latency-wise, because in a triangle configuration any processor cache is just one hop away. You can have 4 processors in a tetrahedron configuration and still have any processor one hop away. Of course it will take 3 HyperTransport connections per processor just for the internal communications, so a 4th connection is needed for at least one processor to connect to the northbridge. The quad-core Opteron will have a maximum of 4 HyperTransport connections, is that right?

    1. Re:Same latency with 4 processors by pla · · Score: 1

      You can have 4 processors in a tetrahedron configuration and still have any processor one hop away

      Ignoring the physical trace-routing issues, you can have N fully connected nodes as long as every one has N-1 connections (i.e., a dedicated link to every other node), plus you need at least one bus-drop somewhere.
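The N-1 rule above also fixes the total number of point-to-point links on the board. A rough sketch of how both figures grow (the function name is hypothetical, and the bus-drop link is left out of the count):

```python
def full_mesh_requirements(n):
    """Links needed so that each of n nodes has a dedicated link to every other."""
    return {
        "links_per_node": n - 1,           # one dedicated link per peer
        "total_links": n * (n - 1) // 2,   # each link is shared by two nodes
    }


for n in (3, 4, 8, 16):
    print(n, full_mesh_requirements(n))
```

For 3, 4, 8 and 16 nodes this gives 3, 6, 28 and 120 total links, which illustrates why full connectivity stops being practical beyond a handful of processors.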

      In practice, all those connections need to physically connect somewhere, making more than a handful of fully-connected processors all but impossible.

    2. Re:Same latency with 4 processors by Chris+Burke · · Score: 1

      The quad-core Opteron will have a maximum of 4 hypertransport connections, is that right?

      Will have, yes, once both chip and socket support it. The current socket only supports 3 HT links.

      --

      The enemies of Democracy are
    3. Re:Same latency with 4 processors by default+luser · · Score: 5, Informative

      Yes, the quad-core chips will have the fourth link. In addition, the chips will be able to split their 16-bit HT links into dual 8-bit HT links, allowing for 8-way CPU configurations without hops (8 x 8-bit HT links per socket). In reality, this is the reason why AMD is pushing the new HyperTransport 3.0: so they can cut the bus lines to 8 without sacrificing too much bandwidth.

      Check it out here.

      --

      Man is the animal that laughs.
      And occasionally whores for Karma.

    4. Re:Same latency with 4 processors by Mac_D83 · · Score: 1

      Yeah you can have the same latency if you connect them like this:

      P---P
      | X |
      P---P

      The caveat is you need 3 connections per cpu instead of 2.

      The 2 connection setup would look like this:

          P
         / \
        P---P

      Cheers
      Michael Mc Donnell

    5. Re:Same latency with 4 processors by Jeff+DeMaagd · · Score: 1

      So far that I know, the AMD CPUs that have three external HT links are the 8xx series Opterons, which gives up to eight physical processors with a maximum of two hops. I haven't heard of one with four external HT links. The 8xx series Opterons are bloody expensive.

  12. 3 is a magic number! by wwmedia · · Score: 0, Flamebait

    so 3 is better than 4?

    is this AMD's way of saying "oh look we can't make a proper quad core system like intel so we just make 3 the magic number! and everyone will buy our marketing technobabble crap"

    1. Re:3 is a magic number! by Ngarrang · · Score: 1

      so 3 is better than 4?

      is this AMD's way of saying "oh look we can't make a proper quad core system like intel so we just make 3 the magic number! and everyone will buy our marketing technobabble crap"

      Anything is possible. The real question is, what is AMD capable of selling? Sure, they can add one more HyperTransport controller, as some of the other posters have mentioned, but what does that do to the cost of the chip? Sometimes you have to go slower to go faster. Or, in this case, you need fewer to do more.
      --
      Bearded Dragon
    2. Re:3 is a magic number! by WaXHeLL · · Score: 1

      Quad Core? We're talking about multiple CPUs, not multiple cores.

      --
      The troll with karma.
    3. Re:3 is a magic number! by kabloom · · Score: 1

      You wouldn't want a 16 processor computer?

      4 cores per chip (providing 3 unused HTs), by 4 chips.

    4. Re:3 is a magic number! by affinity · · Score: 1

      Actually, due to the bus differences... Intel is going to a HT-like bus instead of sticking with their current bus.

      --
      no sig yet
    5. Re:3 is a magic number! by Carewolf · · Score: 1

      You wouldn't want a 16 processor computer?
      Not with Intel's design, no thanks. On Intel all cores share the same bus, making it more and more congested with additional CPUs. The useful limit is somewhere between 4 and 8 cores with that design.

  13. 6-way systems by crow · · Score: 1

    This reminds me of some 6-way systems that I'm told Data General used to sell. They took two 4-way systems, and used one of the processor slots on each as a bridge between the two boards.

  14. 4 way? by Anonymous Coward · · Score: 0

    Yep 4-way lines don't fit on a 2-dimensional plane, without crossing each other. But who said, we have a single 2-dimensional plane?

    1. Re:4 way? by JamesRose · · Score: 0, Redundant

      x.........x
      ...........
      .....x.....
      ...........
      .....x.....

    2. Re:4 way? by obsolete1349 · · Score: 1

      OMG, you did it! You figured out how to make the flux capacitor... a 4-way Opteron server!!

    3. Re:4 way? by Anonymous Coward · · Score: 1, Informative

      As I understand it, this is more analogous to a chemistry problem than a topological one. You can consider each CPU as, say, an oxygen atom, with two available HT "bonds" (three minus the one required for PCIe/etc). You can't get four oxygen atoms to mutually bond with each other, no matter what geometry you try.

  15. ROOTER by Bob-taro · · Score: 1
    Opening sentences FTA:

    Themis Computer has developed a breakthrough in distributed computing for mission-critical systems. By functionally disaggregating commercial computing resources and housing them in a standardized footprint, purpose-built enclosure, the Themis Slice Architecture provides resilience with superior thermal and kinetic management. This open and modular design allows for spiral technology refresh, extending computing infrastructure investments for complete lifecycle management.

    I admit this article is probably just over my head technically, but did anyone else read this and think of ROOTER? I mean, what is "kinetic management" in a computer? Maybe they spin the CPUs through the air instead of blowing air over them. That might explain "spiral refresh technology" as well.
    --
    Prov 9:8 Do not rebuke mockers or they will hate you; rebuke the wise and they will love you.
  16. Workarounds by imgod2u · · Score: 1

    Isn't this only a problem if the OS doesn't manage the NUMA architecture well? Surely there is an OS out there smart enough to recognize separate processors with separate memory regions and assign physical addresses appropriately....

  17. hard to justify by aapold · · Score: 5, Funny

    I mean how to convince the wife that we need a three-way?

    --
    "Waste not one watt!" - CZ
    1. Re:hard to justify by swb · · Score: 2, Funny

      Especially when you haven't shown her the value in a two-way yet.

    2. Re:hard to justify by pimpimpim · · Score: 2, Insightful
      tell her it will mean less hops in general, and she might be fine with it.

      (sorry about this)

      --
      molmod.com - computing tips from a molecular modeling
    3. Re:hard to justify by Hoi+Polloi · · Score: 1
      --
      It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
    4. Re:hard to justify by Anonymous Coward · · Score: 0

      mod parent up! :)

    5. Re:hard to justify by aminorex · · Score: 1

      You could use the cache-affinity angle.

      --
      -I like my women like I like my tea: green-
  18. Multi core by jshriverWVU · · Score: 2, Interesting

    Curious if it can take multi-core CPUs. Having a 3-way system with dual-core Opterons sounds really nice.

  19. Too bad it was two other guys by spun · · Score: 1

    I guess you shouldn't have tuned out, now look what you're stuck with.

    Twice.

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  20. HT links by mieses · · Score: 1

    How are the hypertransport links arranged?

  21. think three-dimensional by Anonymous Coward · · Score: 1, Insightful

    Any CPU in a 3S system is one hop away from any other CPU. In a 4S system, you can be two hops away. This adds latency, ...

    How about a tetrahedron for four CPUs?
    1. Re:think three-dimensional by Anonymous Coward · · Score: 1, Informative

      They are talking specifically about the Opteron. Each CPU has two links. You'd need three links from each CPU to form a tetrahedron.

  22. Not as good as it sounds by sunderland56 · · Score: 1

    This architecture might be good for server applications - i.e. lots of instances of a single-CPU task.

    However, it doesn't work that well for large apps that get parallelized across multiple CPUs. It turns out that most code, and most compilers, are good at splitting tasks in two - or in powers of two - so having three CPUs is no faster than having two.

    1. Re:Not as good as it sounds by Namlak · · Score: 1

      However, it doesn't work that well for large apps that get parallelized across multiple CPUs. It turns out that most code, and most compilers, are good at splitting tasks in two - or in powers of two - so having three CPUs is no faster than having two.

      The third processor can run supporting thread(s) that control the "worker" threads. Let alone support processes such as network, I/O, or anything else in the OS - leaving the two CPUS (and their caches) wide(r) open for application crunching.

    2. Re:Not as good as it sounds by dlapine · · Score: 2, Informative
      Ok, so it's not for HPC systems. I'm betting that the number of servers/server farms out there may make this attractive for the non-HPC users, if the 3 way is significantly cheaper than a 4 way. If you can get this on a blade, you get a 50% increase in CPU power for non-parallel tasks.

      Hmmm, now that I think about it, a three way box might be really interesting for some HPC loads as well. The low latency is a really big issue for some codes, and the three way could be more scalable (with some hand coding and profiling) than a 4 socket box with non-uniform latencies. This would apply to MPI code written and optimized for specific tasks, not the simple parallelization that some compilers can do. There's a significant number of HPC users who are happy running non-parallel code on hundreds of dual socket systems who might be able to scale fairly easily to 3 way systems. Actually, the code is parallel, to the extent that it runs on both CPUs, but these particular users don't want the network latency for MPI code, even on fast networks. They could scale to three way with little loss of performance on one of these.

      Hmmm, a third thought occurs to me. A 3 socket system might also be really, really useful for codes that are I/O intensive: let the traditional MPI code run on the first two CPUs and let the third handle OS tasks, network operations and high performance filesystem operations. The latency is less of a value in this case, but simply keeping the OS from interrupting the 2 CPUs running MPI could be a big win as well. Call it 2N+1 computing.

      Ok, I admit it- I like options when it comes to designing systems to meet the needs of different users.

      --
      The Internet has no garbage collection
    3. Re:Not as good as it sounds by tomstdenis · · Score: 1

      This is so bullshit I don't know where to begin. GCC is a single threaded application, you can invoke parallel builds with ANY NUMBER of jobs, be it 1, 2, 3, 4, 5, ..., whatever.

      So with a 3-way box you'd just use something like -j3 or -j4 to distribute load. Unless they're dual cores, in which case -j6 or -j7 would do.

      Tom

      --
      Someday, I'll have a real sig.
    4. Re:Not as good as it sounds by Short+Circuit · · Score: 1

      I thought he was referring to compilers, not make. But I don't get how he figures compilers have anything to do with thread count; I'm only aware of a couple cases where that's controlled by anything but how the application is coded.

    5. Re:Not as good as it sounds by tomstdenis · · Score: 1

      Yeah, but the thing is nobody invokes "gcc" at the prompt unless testing something, you use make scripts to organize your builds properly. And most properly written make scripts can be run with multiple jobs.

      And even disregarding compilers, not all tasks warrant powers of twos. I don't know where he got that from. For instance, when you run Apache you can control the number of threads you want, doesn't have to be a power of two. The kernel will distribute them across available cores, just as easily with 3 as with 4.

      He may have been thinking of problems which are solved using some sort of divide and conquer approach (e.g. binary search). But even then, you can still use an odd number of cores, it'd just be slightly different to code up (actually, it wouldn't matter, you could just spawn a power-of-two number of threads and the kernel will just load all cores anyways).

      In short, I have no idea what the OP was talking about, but 3 [or 6 if dual core] cores would be just fine and better than 2 [4 if dc] when the load demands it.

      Tom
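As a rough illustration of the point that work need not be split in powers of two, here is a sketch that divides a job among three workers. The helper names are made up, and because CPython threads share one interpreter lock, this only demonstrates the partitioning, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, n_parts):
    """Split a list into n_parts contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def parallel_sum_of_squares(values, n_workers):
    """Sum v*v over values, with one chunk of the input handed to each worker."""
    chunks = split_evenly(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partials)


if __name__ == "__main__":
    data = list(range(10))
    for workers in (2, 3, 4):
        print(workers, "workers:", parallel_sum_of_squares(data, workers))
```

The answer is the same whether the work is cut into two, three or four pieces; only the chunk sizes change.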

      --
      Someday, I'll have a real sig.
  23. Is this new? by thatskinnyguy · · Score: 1

    I'm kinda new to enterprise servers. In the picture it looks as though each CPU has its own bank of memory. If so, is that efficient or not?

    --
    The game.
    1. Re:Is this new? by petermgreen · · Score: 1

      All AMD Opteron systems have a RAM bank per CPU.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  24. Wrong? by Ygorl · · Score: 1

    I'm probably missing something, but you can definitely have a fully-connected planar graph with four nodes. Make a triangle out of three, stick the fourth in the middle of the triangle and connect it out to the other three.

    1. Re:Wrong? by poopdeville · · Score: 1

      Yes, but you're not going to be able to connect the middle core to anything but the other three cores.

      On the other hand, I didn't mention that the system bus was a "utility" for the purposes of the problem, so your counter-example is right in context.

      --
      After all, I am strangely colored.
  25. It's interesting that by porkchop_d_clown · · Score: 1

    people are more surprised by the 3 CPU sockets than they are by the IB ports.

    I thought IB was dead - replaced by 10gigE?

    1. Re:It's interesting that by Ant+P. · · Score: 1

      I'm surprised that there's no comparisons to the X360's 3 core design.

    2. Re:It's interesting that by keeboo · · Score: 1

      I thought IB was dead - replaced by 10gigE?

      Doesn't IB have lower latency than that?

  26. BTTF by Anonymous Coward · · Score: 0

    Is that a Flux Capacitor?

  27. Very Carefully. by ciroknight · · Score: 1

    Or better yet, bond the memory to the cores like Intel and IBM are working on.

    --
    "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
  28. I've thought of this by Burpmaster · · Score: 1

    I thought a while ago that AMD, specifically, should create a 3-core processor. Why? Because they can call it the TriAthlon!

    1. Re:I've thought of this by pecosdave · · Score: 1

      You know, that's sorta lame, but I like it. Mod parent up!

      --
      The preceding post was not a Slashvertisement.
    2. Re:I've thought of this by rdebath · · Score: 1

      Aw, come on, you can do better than that, I want a PentAthlon.
      Though a DecAthlon would be even better, the DecQuadium (aka Sequent 10x486-50) I used to use was an awesome machine in its day.

  29. 940??? by 486Hawk · · Score: 1

    From the picture the sockets look to be of the 940 type. Why not make an L1 version of this so you can at least get DDR2 or Barcelona running.

  30. The article had me at 'three-way'. by FatSean · · Score: 1

    Something about weird non-standard systems gets me going. I think I want this system. Dunno what for or why, but I want it.

    --
    Blar.
  31. Could be marketed to China by wikinerd · · Score: 1

    A 3-way server could sell better than 4-way ones in China, as the number 4 in China is associated with death.

    1. Re:Could be marketed to China by Anonymous Coward · · Score: 0

      A 3-way server could sell better than 4-way ones in China, as the number 4 in China is associated with death.
      ... and the number 3 is associated with sex.
    2. Re:Could be marketed to China by Anonymous Coward · · Score: 0

      but 8 (== wealth) is even better ! And you do need lots of that to afford one.

  32. Silly idea by swordgeek · · Score: 1

    Does anyone know how the Opteron is designed? I'll give you a hint: Two cores/CPU, two CPUs/system is the optimum configuration. There is the ability to run signals across core cross-links, such that each core is only one step away from any other--in a four way system.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  33. reminds me of engines by BroadbandBradley · · Score: 1

    a 3 cylinder engine is smoother than a 4 cylinder, a 5 cylinder engine is smoother than a 6 (or an 8 for that matter). with an even number of cylinders, 1 is on a power stroke lined up with one on an intake stroke. with odd numbers, no 2 cylinders move at the same time.

  34. Tell it to a BMW or Jaguar driver by jkevin99 · · Score: 3, Informative

    Sorry, this just isn't true in practice. The Geos, Suzukis, VWs and Audis which used odd numbers of cylinders did so only for packaging considerations, not because the engineering (smoothness, etc.) made sense. They represented a cylinder added onto or removed from a 4 cylinder engine to meet displacement needs while still fitting in the car.

    The smoothest piston automotive engines are in-line 6 cylinder engines or V-12 engines, which provide a power pulse with every 30 degrees of crankshaft rotation.

    Anything else (3-, 4-, 5- cylinder in-line, V6, V8) has more widely-spaced power pulses and is less smooth. Most of these engines use a rotating counterweight (either an off-balanced flywheel or a separate rotating countershaft) in order to dampen these power pulses and increase smoothness. This works imperfectly and comes at the price of increased weight, rotating mass, and/or complexity.

    Yet another approach which should be very smooth is the boxer design, which is used by Subaru and Porsche: cylinders are horizontally opposed at 180 degrees; this works quite well for Porsche, somewhat less well for Subaru.

    Of course the smoothest automotive engine is the Wankel rotary currently used by Mazda - the "pistons" (rotors) rotate rather than reciprocate, and each power pulse lasts for 270 degrees.

    1. Re:Tell it to a BMW or Jaguar driver by Anonymous Coward · · Score: 0

      Definition: Wankel
      Noisy rotary engine,
      Usually owned by wankers