Slashdot Mirror


AMD Fusion To Add To x86 ISA

Giants2.0 writes "Ars Technica has a brief article detailing some of the prospects of AMD's attempt to fuse the CPU and GPU, including the fact that AMD's Fusion will modify the x86 ISA. From the article, 'To support CPU/GPU integration at either level of complexity (i.e. the modular core level or something deeper), AMD has already stated that they'll need to add a graphics-specific extension to the x86 ISA. Indeed, a future GPU-oriented ISA extension may form part of the reason for the company's recently announced "close to metal"TM (CTM) initiative.'"

19 of 270 comments

  1. Am I the only one? by man_of_mr_e · · Score: 4, Insightful

    Am I the only one who thinks this is a bad idea? Either I change video cards more often than CPUs or CPUs more often than video cards, but in either case I seldom want to upgrade both at the same time. Although I suppose I wouldn't mind a better GPU "for free" with my CPU, I suspect it won't be "for free".

    1. Re:Am I the only one? by r_jensen11 · · Score: 4, Informative

      I'm guessing that, as with integrated graphics, having (a) shared GPU/CPU(s) would allow having an additional video card. I seriously doubt they're going to remove the PCIe 16x slot from motherboards any time soon.

    2. Re:Am I the only one? by hawkbug · · Score: 4, Informative

      Yeah, I thought the same thing at first. However, I don't think we are the target market. I think laptops and OEMs will be the market for this. Just imagine a Mac mini type computer from Dell or somebody. Onboard video has been around for ages, but if the board could be smaller since the GPU is on the CPU, then you'd save space and power, so the machine could be smaller and theoretically cheaper.

    3. Re:Am I the only one? by cnettel · · Score: 4, Insightful
      On the other hand, the real payoff of low latency won't surface if every operation means going through a driver, which only then realizes "oh, I have a single instruction for this thing, let's head back to the caller". This means that game writers will either still need to batch up complex operations, that the driver will then translate into batches of suitable instructions, or that we'll see games/applications with radically different codepaths. Any attempt to benefit optimally from the integrated approach will perform badly on a separate card, while code tuned to a separate card won't come close to harnessing the good points of an underpowered, but lower latency, local graphics implementation.

      It's almost like they would add L3 in a non-transparent manner, that is, expecting the developers to write the code moving suitable data into the cache and addressing that data in a radically different manner, while still also supporting the normal style of memory access, where you of course need to care about the cache, but not so explicitly. (The Cell's explicit local RAM for each unit, and the whole design of that beast, comes to mind. At least ALL PS3s will have one, but the expected target market for Fusion-only adaptations is much less clear cut.)

      And, yeah, this is quite like the situation almost ten years ago, when 3D cards were hot and new. Writing a pipeline to feed those cards was quite different from rolling your own hacked-up software renderer. (And with T&L and shaders, the move has been even greater.)

      But maybe then I'm just speculating a bit too much here. It would make sense that AMD is designing these instructions to fit into the existing driver model (or at least the DX10 one), so that you can get pretty good performance by just doing the relevant translation there.

    4. Re:Am I the only one? by drinkypoo · · Score: 4, Informative
      All they're doing is shifting the GPU (as well as the cost) to the CPU core from the motherboard.

      They're also eliminating all of the components between the CPU core and the GPU. In theory they could have a HT chip that handled all of the I/O and didn't even present a traditional system bus, if they felt they didn't need expansion slots. Thus you could eliminate the PCI/PCI-E bus and all the things needed to support it; at minimum however you are eliminating the bus between the North Bridge and the GPU and all that entails... which is a lot.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    5. Re:Am I the only one? by mrchaotica · · Score: 4, Interesting
      I seriously doubt they're going to remove the PCIe 16x slot from motherboards any time soon.

      What I'd like to see is for AMD to put the CPU and GPU on separate chips, but make them pin-compatible and attach them both to the hypertransport bus. How cool would it be to have a 4-socket motherboard where you could plug in 4 CPUs, 4 GPUs*, or anything in between?

      *Obviously if it were all GPUs it wouldn't be a general-purpose PC, but it would make one hell of a DSP or cluster node!

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    6. Re:Am I the only one? by gripen40k · · Score: 4, Insightful

      From what I understand I think the parent is right. If you use OpenGL, you don't worry about pipelining or however else the computer actually 'makes' the graphics, you just code it. Buuuutt.... I guess you would need to compile two versions of the same thing and put it on the same game disk, or figure out some kind of neat system so that translations are done in real time with hardware (much faster than the soft approach).

      --
      Har?
    7. Re:Am I the only one? by ruiner13 · · Score: 4, Informative

      Dead on. Think of the power savings for laptops: no need to spend energy driving a PCIe slot with a graphics chip that only gets replaced when the laptop does. It would also allow for really slick interfaces on smaller devices, such as tablets, PDAs, etc. It would also have one hell of a bandwidth rate to the processor, including full speed access to the computer's RAM. I don't think they'd give it dedicated memory due to die size, but it sure would beat going over a PCIe bus like today's shared memory integrated chipsets.

      --

      today is spelling optional day.

    8. Re:Am I the only one? by rbanffy · · Score: 4, Insightful

      As added benefits:

      - With a public and standard ISA, you will have Linux-compatible drivers shortly

      - With a public and standard ISA, people will have a single standard to code against. Library support should be excellent.

      - While your über-FPUs/vector accelerators/stream processors (what GPUs are made of) are not GPU-ing something, they can accelerate SSL, physics processing and any other vector-friendly activity you may have. Playing Flash content, maybe.

      - GPUs are memory-hungry. The added memory bandwidth will benefit all software, not only graphics-intensive stuff.

      - There is nothing that precludes you from using a stand-alone GPU, provided you have the drivers. But your CPU will have a couple high performance units that can give it a hand. Think asymmetric SLI.

      We will see how well the idea performs by watching the Cell processor (a CPU with 8 "GPU"s attached) in the PS3. That's roughly the same idea.

      In the meantime, I bet it will work just fine.

    9. Re:Am I the only one? by mrchaotica · · Score: 4, Insightful
      Maybe I'm just projecting, but what I think that you want, and many others out there want is simply a better bus, bridge, magic, glue or whatever you want to call it between the major parts of a computer.

      Hmm... you might be right. But an equally important aspect is that I want a socketed GPU and graphics memory, so that it would be modular just like the CPU and system memory.

      A combo CPU/GPU either only targets the very low end generic computer or a specialized graphics type of computer. With the failure of the other addons to computers, I don't see the advantage. Graphics simply don't matter in the server market.

      You're thinking too small. Modern GPUs don't just do graphics; they are becoming able to do just about any kind of very-parallel computations.

      For example, imagine a server of some kind that processes every packet it sends in the same way. Let's say it has 1000 connections, and that each one is handled by a separate thread. Now, imagine that the thread can somehow be implemented as a shader. Then, instead of processing 1 or 2 packets at a time (as on a single- or dual-core CPU), you can suddenly process (tens? hundreds?) of packets simultaneously! Even if the thing is clocked lower than a CPU, I'll betcha it'll still have better overall performance -- assuming you can make it work.
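      A rough sketch of the arithmetic behind that packet example, assuming every packet goes through the same "kernel" and that the hardware can run a fixed number of them per round. The names `steps_needed` and `process_batch`, and the figure of 128 lanes, are made up for illustration; nothing here models a real GPU or driver.

```python
import math


def steps_needed(num_packets: int, parallel_lanes: int) -> int:
    """Sequential rounds needed when `parallel_lanes` packets run at once."""
    if parallel_lanes < 1:
        raise ValueError("need at least one lane")
    return math.ceil(num_packets / parallel_lanes)


def process_batch(packets, kernel, parallel_lanes):
    """Apply the same kernel to every packet, one lane-sized round at a time.

    This loop is sequential; on parallel hardware each round's packets
    would be processed simultaneously (an assumption of this sketch).
    """
    if parallel_lanes < 1:
        raise ValueError("need at least one lane")
    results = []
    for start in range(0, len(packets), parallel_lanes):
        current_round = packets[start:start + parallel_lanes]
        results.extend(kernel(p) for p in current_round)
    return results


if __name__ == "__main__":
    print(steps_needed(1000, 2))    # dual-core style: two packets per round
    print(steps_needed(1000, 128))  # hypothetical 128-lane part
```

      Fewer rounds only help if each round is not much slower than a CPU core handling one packet, which is the "assuming you can make it work" part.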

      Personally, I think there are quite a few problems like this, that could be parallel even though they aren't usually implemented that way now. I think it's a matter of how most programmers are used to traditional CPUs, and so don't think in a parallel way. I'm actually banking on this becoming a big deal in the future, which is why I'm taking a bunch of graphics, systems and HPC classes (I'm a CS student). And it had better do so, because otherwise computers are going to stop getting faster -- the whole reason everybody's so focused on multi core systems now is that they've hit a wall wrt. the laws of physics.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

  2. Yeah that's the future by Rosco+P.+Coltrane · · Score: 4, Funny

    ISA is definitely the future to interface a CPU and a GPU, but I keep hearing about this VLB technology that's even hotter!

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  3. How long until a physics extension? by User+956 · · Score: 4, Insightful

    'To support CPU/GPU integration at either level of complexity (i.e. the modular core level or something deeper), AMD has already stated that they'll need to add a graphics-specific extension to the x86 ISA.'

    x86 is a great multi-purpose architecture, but the reason we're seeing greater and greater offload onto the GPU is that the GPU is great at one specific task. So my question is, how long until we see widespread PPU (physics processing unit) usage, and beyond that, a physics extension to the x86 ISA? Or will we all just be computing on the grid at that point?

    --
    The theory of relativity doesn't work right in Arkansas.
  4. Re:One unanswered question? by Anonymous Coward · · Score: 5, Insightful

    Can it run Linux? OK JUST KIDDING!

    Why joke? It is an important question.

    All the current nvidia and ati graphics cards require proprietary, closed-source drivers.
    If the GPU is to be integrated into the CPU, either they will have to keep the new ISA a secret or we will finally start getting access to the information required to really write Free graphics drivers.

  5. Showing my age by WidescreenFreak · · Score: 5, Funny

    I guess I'm showing my age. As soon as I saw "ISA" I immediately thought, "Why the HELL are they thinking about bringing this back?"

    :(

    --
    The Overrated mod is for reversing inappropriate, positive mods, not for voicing disagreement with a post.
    1. Re:Showing my age by njchick · · Score: 4, Funny

      It's not your age. It's just a problem of the current TLA namespace. Another reason to switch to XTLA (extended three letter acronyms).

  6. Re:ISA? by MadEE · · Score: 4, Informative

    ISA = Instruction Set Architecture

  7. Fusion and CUDA lead the way. by ravyne · · Score: 4, Interesting

    I've been following GPGPU stuff for a while now, casually at first but much more closely now with the AMD/ATI merger and the release of NVIDIA's G80 architecture. Both of these represent the first big steps toward GPGPU technology (buzzword: stream computing) becoming reality.

    The initial approach I expect from the Fusion effort will basically be an R600-based, entry-level GPU tacked onto the CPU die. I'd imagine that this would have 4-8 quads (a quad being a GPU 4-wide SIMD functional unit) as standard. This would mostly be targeted at the IGP market for laptops and small and/or cheap desktops. It's likely that CTM will enable this additional horsepower to be used for general calculations, but its primary purpose will be to replace other IGP solutions.

    A little further out I see the new functional units being woven into the fabric of the CPU itself. This model closely resembles having many 128-bit-wide extended SSE units, likely with automatic scheduling of SIMD tasks (e.g., tell the CPU to multiply 2 large float arrays and the CPU balances the workload across the functional units automatically). A software driver will be able to utilize these units as a GPU, but the focus is now much more on computation. It functions as a GPU for low-end users, and supplements high-end users and gamers with discrete video cards by taking on additional calculations such as physics. Physics will benefit from being on the system bus (even though PCIe x16 is relatively fast) because the latency will be lower, and because the structures typically used to perform physics calculations reside in system memory.

    Even further out I see computers very much becoming small grid computers unto themselves, though software will take a long time to catch up to what the hardware will be capable of. I see NVIDIA's CUDA initiative as the first step in this direction: provide a "sea of processors" view to the machine and allow tight integration into standard code without placing the burden of balancing the workload onto the programmer (which NVIDIA's CUDA C compiler attempts to do). NVIDIA's G80 architecture goes one further by migrating away from the vector-based architecture in favor of a scalar one: rather than 32 4-wide vector ALUs, they provide 128 scalar ALUs. Threading takes care of translating those n-wide calls into n separate scalar calls. Most scientific code does not lend itself well to the vector model, though over the years it has been shoe-horned into vector-centric algorithms because that was necessary to get adequate performance. Even graphics shaders are becoming less and less vector-centric, as NVIDIA research shows, because many effects (or portions thereof) are better suited to scalar code.
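    To put numbers on the 32 4-wide versus 128 scalar comparison, here is a toy model that only counts how many ALU lanes do useful work for an operation of a given width. It ignores clock speed, scheduling, and memory entirely, and the function names are invented for this sketch.

```python
def lane_utilization(op_width: int, unit_width: int) -> float:
    """Fraction of lanes doing useful work for one operation.

    `op_width` is how many components the operation really has;
    `unit_width` is how many lanes one ALU issue occupies.
    """
    if op_width < 1 or unit_width < 1:
        raise ValueError("widths must be positive")
    # An operation wider than the unit is split across several issues.
    issues = -(-op_width // unit_width)  # ceiling division
    return op_width / (issues * unit_width)


def busy_lanes(num_units: int, unit_width: int, op_width: int) -> float:
    """Lanes doing useful work when every unit runs the same kind of op."""
    return num_units * unit_width * lane_utilization(op_width, unit_width)


if __name__ == "__main__":
    for width in (1, 3, 4):
        print(width, busy_lanes(32, 4, width), busy_lanes(128, 1, width))
```

    In this model both layouts have 128 lanes, but purely scalar work keeps only a quarter of the 4-wide design busy, while the scalar design stays fully occupied at every width.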

    Eventually, I think this model will grow such that the CPU will be replaced by, to coin a phrase, something called a CCU (Central Coordination Unit) whose only real responsibility is to route instructions to the correct execution units. Execution units will vary by type and number from system to system depending on what chips/boards you've plugged into your CCU expansion bus. The CCU will accept both scalar and broad-stroke (vector) instructions such as "multiply the elements of this array by that array and store the results in this other array", which will be broken down into individual elements and assigned to available execution units.
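    A minimal sketch of that "broad-stroke" multiply, assuming the coordinator simply cuts the arrays into near-equal contiguous slices, one per execution unit. `split_work` and `ccu_multiply` are hypothetical names, and the slices run one after another here rather than on separate hardware.

```python
def split_work(n: int, num_units: int):
    """Split indices 0..n-1 into contiguous near-equal slices, one per unit."""
    if num_units < 1:
        raise ValueError("need at least one execution unit")
    base, extra = divmod(n, num_units)
    slices, start = [], 0
    for unit in range(num_units):
        # The first `extra` units take one additional element each.
        size = base + (1 if unit < extra else 0)
        slices.append((start, start + size))
        start += size
    return slices


def ccu_multiply(a, b, num_units=4):
    """Elementwise a[i] * b[i], with the work divided into per-unit slices."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    out = [0.0] * len(a)
    for start, stop in split_work(len(a), num_units):
        # Each slice stands in for the work handed to one execution unit.
        for i in range(start, stop):
            out[i] = a[i] * b[i]
    return out


if __name__ == "__main__":
    print(split_work(10, 4))
    print(ccu_multiply([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], num_units=2))
```

    A real coordinator would also have to weigh units of different types and speeds; equal slices are just the simplest policy to show.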

    All of this IMHO of course.

  8. Re:A super-FPU by Anonymous Coward · · Score: 4, Informative

    Yes. I've pointed this out every time that Fusion has been mentioned here: a GPU is a parallel vector processor. The resources available for rendering games can just as easily be used to accelerate scientific applications, and integrating it into one die will reduce the power and cost requirements. Since GPUs are already becoming more general-purpose for more sophisticated shader programs, it makes a lot of sense to utilize those same resources for other applications without depending on incompatible shader architectures or PCI-Express add-on cards. It also gives AMD something to do with future die space besides creating 32-core processors that will be largely underutilized by software. People should think of this as AMD taking SSE out back and just saying, "to hell with the amateur hour, we're going to have some monster fp power." The end result is that you'll also be able to have superawesome graphics in games, as well as efficient scientific simulations.

  9. Re:I like your solution by mabinogi · · Score: 4, Insightful

    Why don't you ask AMD, as they've apparently already considered it, or they wouldn't be talking about putting both the CPU and the GPU in the same package.

    Without knowing anything about it, it would seem that if CPU+GPU in the same package is possible, then CPU + GPU in two separate CPU sized packages would be possible.

    --
    Advanced users are users too!