Slashdot Mirror


AMD Fusion To Add To x86 ISA

Giants2.0 writes "Ars Technica has a brief article detailing some of the prospects of AMD's attempt to fuse the CPU and GPU, including the fact that AMD's Fusion will modify the x86 ISA. From the article, 'To support CPU/GPU integration at either level of complexity (i.e. the modular core level or something deeper), AMD has already stated that they'll need to add a graphics-specific extension to the x86 ISA. Indeed, a future GPU-oriented ISA extension may form part of the reason for the company's recently announced "close to metal"™ (CTM) initiative.'"

2 of 270 comments

  1. Re:Am I the only one? by mrchaotica · · Score: 4, Interesting
    I seriously doubt they're going to remove the PCIe 16x slot from motherboards any time soon.

    What I'd like to see is for AMD to put the CPU and GPU on separate chips, but make them pin-compatible and attach them both to the HyperTransport bus. How cool would it be to have a 4-socket motherboard where you could plug in 4 CPUs, 4 GPUs*, or anything in between?

    *Obviously if it were all GPUs it wouldn't be a general-purpose PC, but it would make one hell of a DSP or cluster node!

    --

    "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

  2. Fusion and CUDA lead the way. by ravyne · · Score: 4, Interesting

    I've been following GPGPU stuff for a while now, casually at first but much more closely now with the AMD/ATI merger and the release of nVidia's G80 architecture. Both of these represent the first big steps toward GPGPU technology (buzzword: stream computing) becoming reality.

    I suspect the initial approach from the Fusion effort will basically be an R600-based, entry-level GPU tacked onto the CPU die. I'd imagine that this would have 4-8 quads (a quad being a 4-wide GPU SIMD functional unit) as standard. This would mostly be targeted at the IGP market for laptops and small and/or cheap desktops. It's likely that CTM will enable this additional horsepower to be used for general calculations, but its primary purpose will be to replace other IGP solutions.

    A little further out I see the new functional units being woven into the fabric of the CPU itself. This model closely resembles having many 128-bit-wide extended SSE units, likely with automatic scheduling of SIMD tasks (e.g., tell the CPU to multiply 2 large float arrays and the CPU balances the workload across the functional units automatically). A software driver will be able to utilize these units as a GPU, but the focus is now much more on computation. It functions as a GPU for low-end users, and supplements discrete video cards for high-end users and gamers by taking on additional calculations such as physics. Physics will benefit from being on the system bus (even though PCIe x16 is relatively fast) because the latency will be lower, and because the structures typically used to perform physics calculations reside in system memory.
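
    To make the "multiply 2 large float arrays" example concrete, here is a minimal software sketch of the idea, not anything AMD has described. The names balanced_multiply and multiply_chunk and the units parameter are made up for illustration, and a thread pool stands in for the functional units. It only shows how the work could be partitioned; Python threads will not deliver a hardware-style speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_chunk(a, b, start, stop):
    # One hypothetical "functional unit" multiplies its slice element by element.
    return [a[i] * b[i] for i in range(start, stop)]


def balanced_multiply(a, b, units=4):
    """Multiply two equal-length arrays, splitting the work across `units` workers."""
    if units < 1:
        raise ValueError("need at least one unit")
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    n = len(a)
    # Ceiling division so every element lands in exactly one chunk.
    chunk = max(1, -(-n // units))
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=units) as pool:
        parts = pool.map(lambda se: multiply_chunk(a, b, *se), bounds)
        # map() yields results in submission order, so element order is preserved.
        return [x for part in parts for x in part]
```

    The caller just asks for the product, e.g. balanced_multiply([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], units=2), and never sees how the elements were divided up, which is the point of automatic scheduling.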

    Even further out I see computers very much becoming small grid computers unto themselves, though software will take a long time to catch up to what the hardware will be capable of. I see nVidia's CUDA initiative as the first step in this direction: provide a "sea of processors" view of the machine and allow tight integration into standard code without placing the burden of balancing the workload on the programmer (nVidia's CUDA C compiler attempts to handle that balancing instead). nVidia's G80 architecture goes one further by migrating away from the vector-based architecture in favor of a scalar one: rather than 32 4-wide vector ALUs, they provide 128 scalar ALUs. Threading takes care of translating those n-wide calls into n separate scalar calls. Most scientific code does not lend itself well to the vector model, though over the years it has been shoe-horned into vector-centric algorithms because it was necessary to get adequate performance. Even graphics shaders are becoming less and less vector-centric, as nVidia research shows, because many effects (or portions thereof) are better suited to scalar code.
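
    A back-of-the-envelope model of why scalar units can win, using the 32 4-wide vs. 128 scalar figures above. This is a toy sketch with loud assumptions: every ALU retires one operation per cycle, both designs run at the same clock, scheduling is perfect, and a vector ALU cannot pack unrelated narrow operations together. The function names are made up for illustration.

```python
import math


def cycles_vector(op_widths, units=32, lanes=4):
    """Cycles on `units` vector ALUs when each op occupies a whole ALU for one cycle."""
    if any(w < 1 or w > lanes for w in op_widths):
        raise ValueError("op width must be between 1 and the lane count")
    # Unused lanes are simply wasted: a 1-wide op costs as much as a 4-wide one.
    return math.ceil(len(op_widths) / units)


def cycles_scalar(op_widths, units=128):
    """Cycles on `units` scalar ALUs when an n-wide op becomes n separate scalar ops."""
    scalar_ops = sum(op_widths)
    return math.ceil(scalar_ops / units)
```

    Under those assumptions, 128 purely scalar operations take cycles_vector([1] * 128) = 4 cycles on the vector design but cycles_scalar([1] * 128) = 1 cycle on the scalar one. With 128 full 4-wide operations both come out at 4 cycles, so the scalar layout only loses nothing on vector-friendly code and gains on everything narrower.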

    Eventually, I think this model will grow such that the CPU will be replaced by, to coin a phrase, something called a CCU (Central Coordination Unit) whose only real responsibility is to route instructions to the correct execution units. Execution units will vary by type and number from system to system depending on what chips/boards you've plugged into your CCU expansion bus. The CCU will accept both scalar and broad-stroke (vector) instructions such as "multiply the elements of this array by that array and store the results in this other array" which will be broken down into individual elements and assigned to available execution units.
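
    Since the CCU is purely speculative, the following is only a toy sketch of the routing idea. Everything in it is hypothetical: the ExecutionUnit and CCU classes, the "float"/"int" unit kinds, and the round-robin policy are all invented for illustration, and the dispatch here is sequential where real hardware would run the units in parallel.

```python
class ExecutionUnit:
    """Hypothetical plug-in unit that can run one kind of scalar operation."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # e.g. "float" or "int"; the categories are made up
        self.executed = 0     # how many scalar operations this unit has handled

    def run(self, op, x, y):
        self.executed += 1
        return op(x, y)


class CCU:
    """Toy 'Central Coordination Unit': routes work, computes nothing itself."""

    def __init__(self, units):
        self.units = list(units)
        self._next = {}       # per-kind round-robin cursor

    def _pick(self, kind):
        candidates = [u for u in self.units if u.kind == kind]
        if not candidates:
            raise LookupError(f"no execution unit installed for {kind!r}")
        i = self._next.get(kind, 0)
        self._next[kind] = (i + 1) % len(candidates)
        return candidates[i % len(candidates)]

    def scalar(self, kind, op, x, y):
        """Route a single scalar instruction to one matching unit."""
        return self._pick(kind).run(op, x, y)

    def vector(self, kind, op, xs, ys):
        """Break a broad-stroke instruction into per-element scalar instructions."""
        if len(xs) != len(ys):
            raise ValueError("arrays must have the same length")
        return [self.scalar(kind, op, x, y) for x, y in zip(xs, ys)]
```

    With two "float" units installed, a 4-element array multiply issued through vector() ends up as two scalar multiplies on each unit, and the calling code looks the same no matter how many units are plugged in.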

    All of this IMHO of course.