Slashdot Mirror


AMD Fusion System Architecture Detailed

Vigile writes "At the first AMD Fusion Developer Summit near Seattle this week, AMD revealed quite a bit of information about its next-generation GPU architecture and the eventual goals it has for the CPU/GPU combinations known as APUs. The company is finally moving away from a VLIW architecture and instead is integrating a vector+scalar design that allows for higher utilization of compute units and easier hardware scheduling. AMD laid out a 3-year plan to offer features like unified address space and fully coherent memory for the CPU and GPU that have the potential to dramatically alter current programming models. We will start seeing these features in GPUs released later in 2011."

83 of 121 comments

  1. Long time coming by Kuruk · · Score: 1

    What's wrong with hardware!

    Humans are too stupid to program it.

    Not sure what the fix is; hardware keeps exploding and we are stuck with Windows 7, lol 8 or (CAT), lol Lion.

    1. Re:Long time coming by noname444 · · Score: 3, Interesting

      Integrating CPU, GPU and unifying the memory address space will probably make things easier for programmers. So hopefully it'll help programmers utilize the hardware better.

    2. Re:Long time coming by Anonymous Coward · · Score: 1

      nvidia already offers this with cuda.

    3. Re:Long time coming by TheRaven64 · · Score: 4, Insightful

      It's not that difficult to write code that takes full advantage of modern hardware. The limitation is need. Every 18 months, we get a new generation of processors that can easily do everything that the previous generation could just about manage. Something like an IBM 1401 took a weekend to run all of the payroll calculations for a medium sized company in 1960, using heavily optimised FORTRAN (back when Fortran was written in all caps). Now, the same calculations written in interpreted VBA in a spreadsheet on a cheap laptop will run in under a second.

      It would be naive to say that computers are fast enough - that's been said every year for the last 30 or so, and been wrong every time - but the number of problems for which efficient use of computational resources is no longer important grows constantly. Look at the number of applications written in languages like Python and Ruby and then run in primitive AST interpreters. A decent compiler could run them 10-100x faster, but there's no need because they're already running much faster than required. I work on compiler optimisations, and it's slightly disheartening when you realise that the difference that your latest improvements make is not a change from infeasible to feasible, it's a change from using 10% of the CPU to using 5%.

      --
      I am TheRaven on Soylent News
    4. Re:Long time coming by eugene2k · · Score: 2

      No it doesn't. Like OpenCL, CUDA basically means you're sending instructions to the GPU by writing data to a mapped memory region. Sharing address space is not possible at that level. It's only possible to do at a CPU level.

      --
      Apple has "Mac vs PC", Microsoft has "Laptop Hunters", Linux has recession
    5. Re:Long time coming by noname444 · · Score: 2

      While I agree with you regarding application programming, need, etc. I must clarify that I was talking about graphics/game applications that require the full hardware potential.

      If you compare this new architecture with an arguably overcomplicated architecture like the PlayStation 3, I'd argue that writing software that utilizes the hardware to its full potential is indeed hard. And in this context, making a more elegant, integrated GPU/CPU will make the lives of us poor indie game programmers a bit easier.

    6. Re:Long time coming by Bert64 · · Score: 2

      The current trend seems to be towards more power efficient hardware and virtualization (and dynamic scaling etc), rather than ever faster hardware...
      So while your interpreted spreadsheet may be able to compute payroll calculations in a second, your hardware will consume more power doing it that way than using an optimized implementation... Also with sub optimal code, you won't be able to run as many instances on a single piece of hardware, and thus require more hardware.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    7. Re:Long time coming by TheRaven64 · · Score: 4, Interesting

      Not really. Now the CPU spends 95% of its time waiting for data from the network or disk instead of 90%, but the CPU is rarely the bottleneck these days.

      Around the time of the Pentium II, Intel did some simulations where they increased the (simulated) speed of the CPU running typical applications and measured performance. They found that, if the speed of other components didn't change, an infinitely fast CPU (i.e. all CPU operations took 0 simulation time) ran about twice as fast as the ones that they were shipping. It doesn't take much of an improvement in CPU speed before the CPU just isn't the bottleneck anymore, even in processor intensive tasks. RAM and disk bandwidth and latency quickly take over. This was one of the problems Apple had with the PowerPC G4 - the RAM wasn't fast enough to supply it with data as fast as it could process it, so it rarely came close to its theoretical maximum speed.
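
      The "infinitely fast CPU only doubles throughput" result is what Amdahl's law predicts if roughly half of the run time is spent waiting on something other than the CPU. A minimal sketch, assuming a made-up 50/50 split between CPU time and RAM/disk time (the split and the function name are illustrative, not figures from the Intel simulation):

```python
import math

def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl's law: only the CPU-bound fraction of the run time gets faster."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    other_fraction = 1.0 - cpu_fraction        # time spent waiting on RAM, disk, network
    new_cpu_time = cpu_fraction / cpu_speedup  # becomes 0.0 when cpu_speedup is math.inf
    total = other_fraction + new_cpu_time
    return math.inf if total == 0.0 else 1.0 / total

# Assumed split: half the time is CPU work, half is everything else.
for s in (2, 10, math.inf):
    print(s, round(overall_speedup(0.5, s), 2))
```

      With that assumed split, no CPU speedup can ever get past 2x overall; the other half of the time has to come from faster RAM and disk.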

      --
      I am TheRaven on Soylent News
    8. Re:Long time coming by pandrijeczko · · Score: 4, Interesting

      I think what is going to be really interesting is to see what this does to PC gaming from the perspective of non-Windows operating systems.

      APUs are clearly a step forward in the direction of putting powerful graphics processing on portable devices, an area where Microsoft and Windows have very little market share at the moment.

      Therefore, this surely must bring DirectX's domination in the PC gaming market into question - will this therefore result in more commercial games being developed around OpenGL, thus making cross-platform games much easier to develop?

      --
      Gentoo Linux - another day, another USE flag.
    9. Re:Long time coming by ElusiveJoe · · Score: 1

      First, this system has distributed memory which is much harder to program than shared memory in case of AMD.

      Second, MPI is not a language, it's a standard. But in terms of parallel languages it's more like an assembler: data locality and synchronization have to be managed by the programmer. So it cannot substitute for high-level languages.

      As to functional languages, I have no idea how information about the data locality could be extracted from the program. It's not that I doubt them, I really don't know, but would like to learn.

    10. Re:Long time coming by hairyfeet · · Score: 2

      Frankly I really don't see how much better GPUs can get picture wise myself. Hell my HD4850 which my GF got me for my BDay cranks the living hell out of the purty, so much I have to be careful not to be distracted by the purty and get my ass blown off! And maybe it is different with CUDA but the only thing I've seen come out for Streams is a video transcoder that frankly doesn't give you as good a result as a plain Jane CPU only transcode, and the time savings isn't worth the picture hit.

      So while I'm sure this will make programmers happy I really don't see how it will make much of a difference to Joe user. Hell even the sub $150 GPUs that are the biggest market have so much purty being thrown on the screen it is truly insane, I never thought I'd see the day that human faces and movements would get THAT realistic!

      And finally there is that bloated stinking dead elephant in the room no one mentions, I'm of course talking about the craptastic consoles that everyone is writing the games for. While I like the fact that the vast majority of games will run native resolution with lots of bling even on my 3 generations old HD4850, I'll be the first to admit PCs aren't the main target market anymore. Hell the new Nintendo is gonna have the HD4xxx series, which like mine is already three generations behind and it ain't come out yet!

      So I honestly don't see how all this extra goodness is gonna make much of a diff. The developers write to the consoles first, the consoles don't have these features, therefore nobody writes to them. Hell look at how few DX10 and DX11 games are out, simply because the consoles are DX9. If the other consoles follow Nintendo then we'll be seeing DX10 in late 2012, so maybe this cutting edge stuff will get used by the majority of games around 2022, when you can pick up these chips at a yard sale for $5. Depressing, but that is life.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    11. Re:Long time coming by Kjella · · Score: 2

      Actually, more a latency problem than a bandwidth problem. It's that the pyramid of L1-2-3 cache and system ram is quite a few cycles from the CPU. You can see with multi-core systems that they scale quite nicely as long as you are running well multi-threaded code.

      The other question is of course how much of the bottleneck is between chair and keyboard. Very often they'll complain if the computer takes 5 seconds to do some heavy processing while they happily goof off for 5 minutes. And it's not like computers practically lock up under load anymore, you can always do other things while it's working.

      --
      Live today, because you never know what tomorrow brings
    12. Re:Long time coming by Savantissimo · · Score: 1

      "hopefully it'll help programmers utilize the hardware better."

      Yes, this looks like a very nice architecture, which it should be possible to use to the max - if it weren't for AMD's plan to cripple its double-precision performance. NVIDIA already does this - if you don't shell out the extra $ for a Tesla or Quadro, they cut the 64-bit performance by half of what the chip can actually do. AMD's current chips are even slower than the crippled NVIDIA chips on 64-bit floating point, but the new AMD Fusion chips should match the NVIDIA performance ratio, 64-bit being half as fast as 32-bit FP. But unless you get a Fire brand card, AMD plans to cut the potential performance by half, and lower end models may see their chips spoiled to do 1/8 or less of the design potential.

      This is stupid. If AMD wants to gain market share, they need to provide better value than NVIDIA or Intel. They've got a design that will do that, with faster memory transfers and potentially far more RAM than NVIDIA's GPUs can access, potentially opening up new markets for large data sets and real-time simulation. Intentionally screwing up their chips is not an effective way to try to get that market share. AMD should differentiate models by clock speed and compute units, not by spoiling good hardware.

      --
      "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
  2. The first problem that comes to mind.. by Anonymous Coward · · Score: 1

    Is that the modular nature of current components allows for relatively easy upgrading and a comparatively low cost. Buying a new graphics card that has the price of a GPU and dedicated video RAM is reasonable. Having to buy a new CPU every time you want to upgrade your GPU could get unreasonably expensive fast.

    1. Re:The first problem that comes to mind.. by Rosco+P.+Coltrane · · Score: 4, Insightful

      I think only a small number of computer users upgrade components these days - gamers and power users. But the majority of people these days buy a beige box or a laptop and never ever open them. From a business point of view, combining the GPU and the CPU makes sense. Heck, nobody cried when separate math coprocessors disappeared.

      --
      "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
    2. Re:The first problem that comes to mind.. by kevinmenzel · · Score: 1

      That may be the case, but the boxes they buy benefit from the economy of scale offered by being able to separate those components. Every time I go to a computer store, I'd say that within the boxes people can buy, there's a wide variety of CPUs and GPUs in those boxes - in many combinations. This allows customers to buy what they need. For some, that's a moderate processor with moderate graphics, for others, it's a moderate processor with relatively decent graphics (to play blu-ray discs or 1080p flash videos), for gamers they want specific GPUs mixed with specific CPUs to give them the best performance in the games they care about. In professional workstations, you want a workstation GPU that's going to have similarities to consumer GPUs, but will focus on different tasks. For home recording enthusiasts, as they delve deeper into the field, they need to have control over those elements in order to avoid potential conflicts with audio hardware. Some people need to be able to support more than 1 monitor, but others only need 1. Some people need to be able to output to S-Video to connect to an old projector - but might not need that feature in a year, when the projector is set to be upgraded to HDMI, at which point the IT team will want to replace some graphics cards.

      Essentially, there are damned good reasons to have things separate, because computers, as much as they are general purpose machines - aren't actually that generalized to the point where one can say "You only need 3 different kinds of computer." If that were ACTUALLY true, Apple would be doing a lot better than they are in computer sales.

      But it's not true. So hopefully Intel WON'T push the same thing, because then pretty much every application that matters will have to still support the current model with all of its difficulties, but likewise all of its benefits.

    3. Re:The first problem that comes to mind.. by MareLooke · · Score: 1

      You forget the fact that the majority of PC games are created with console hardware in mind and as such use only a fraction of what a modern GPU is capable of.

      That said, the "good enough" argument does fly for desktop users. I expect graphics boards to become like sound cards, they will be useful for specific applications (musicians come to mind for sound cards) and the people that need them will buy them.

    4. Re:The first problem that comes to mind.. by Smirker · · Score: 1

      Heck, I can't even notice the difference these days to SCREEN 13.

    5. Re:The first problem that comes to mind.. by Targon · · Score: 1

      AMD will still make straight CPUs as well as GPUs, but Fusion makes sense for the low end of the market that was already going to use integrated graphics. You can also add a video card to a desktop, or possibly some laptops, that have a Fusion APU. As it stands now, Llano is still going to be using CPU cores that are based on current Athlon 2/Phenom 2 cores. Bulldozer is the next core design from AMD and will have both CPU-only implementations, and then later we will see new Fusion APUs that use that new processor core design.

      Think of it the way you do computers today, you have your low end with integrated graphics that NEVER gets updated, then you have your mid-range, and then you have the high end. For MOST users, there is virtually no need for a machine that is more powerful than what Llano offers, but there are still a good number who want or need more.

    6. Re:The first problem that comes to mind.. by TheThiefMaster · · Score: 2

      I would imagine that you'll likely still be able to upgrade by adding a discrete graphics card for quite some time.

    7. Re:The first problem that comes to mind.. by Targon · · Score: 3, Informative

      There will still be that same ability to get separate components, but the GPU element is being moved from the chipset onto the CPU(now called an APU).

      There really have been only three general configurations:
      1: CPU with integrated graphics on the motherboard
      2: CPU with integrated graphics on the motherboard PLUS a discrete video card/GPU.
      3: CPU without integrated graphics on the motherboard with ONLY one or more video cards.

      So, what this does is to update 1 and 2, since you can still add a discrete video card. Since the graphics portion of Fusion is better than what Intel offers, this isn't a bad setup. There will also be the option to swap the APU with a faster version that has both a faster CPU core as well as faster GPU core in most motherboards.

      Yes, there are certain advantages offered by the APU design, but it isn't an "all or nothing" offering. AMD will continue to offer straight CPUs (with Bulldozer being the next core design), and if you think about it, AMD may go to a tick-tock design like Intel has, but rather than it being based on core design and fab process technology going back and forth, we may see AMD going CPU core design, GPU design, and then APU to combine the latest CPU and GPU designs.

      Right now, many are waiting for AMD to release its first all new core design since 2003, since that will hopefully get AMD the better CPU core performance that many have been waiting for.

    8. Re:The first problem that comes to mind.. by TheRaven64 · · Score: 2

      Laptop sales passed desktop sales a couple of years ago. Anyone buying a desktop is now in the minority. With laptops, the constraints are different. Having the CPU and GPU in separate chips complicates the board design, which adds to the cost. With integrated CPU and GPU designs, you can have a simple board design and just pop a faster chip in the top of the line models.

      Upgrading your GPU separately? My first PC had a slot for installing an FPU. You could get one from Intel, but you could get faster ones from AMD. Then Intel integrated their inferior FPU into the die with the 486. How many people now complain about not being able to replace their FPU with a faster third-party one?

      --
      I am TheRaven on Soylent News
    9. Re:The first problem that comes to mind.. by Tapewolf · · Score: 2

      Since this design seems to be about using the APU for non-graphics things as well, you could probably stick an nVidia card in the PCI-E slot for better video and continue to use the Fusion APU for OpenCL (or whatever) at the same time.

    10. Re:The first problem that comes to mind.. by fuzzyfuzzyfungus · · Score: 2

      Also, nothing about AMD's new design precludes discrete GPUs more or less similar to today's models, it is just an effort to make the (economically inevitable) integrated GPU more useful by virtue of its close integration with the system, rather than simply cheaper as integrated GPUs are today.

      Expansion will be slightly trickier than today's Crossfire/SLI, because certain GPU elements(while comparatively few) will enjoy much faster access to the CPU and main memory, while the expansion GPU(s) will presumably have many more elements, and their own pool of RAM; but be a PCIe bus away from the CPU. I'm sure that the beta drivers and the edge cases will be pretty dire; but it will eventually be worked out.

    11. Re:The first problem that comes to mind.. by MrHanky · · Score: 5, Insightful

      One reason why laptop sales passed desktop sales is of course that desktops last longer, due to their upgradeability.

    12. Re:The first problem that comes to mind.. by cynyr · · Score: 1

      Just upgraded my desktop, and by that i mean bought a new one, and moved the server into the old one. At some point in the future, this new desktop will become the family computer, I'll have a new desktop, and the server will still be humming along.

      As for "bumping up the price" give me tools to use it for GPGPU while using the PCIe card for video and I'm sold.

      I think some business users will notice, I have a nvidia Quadro in my work laptop for a reason.

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    13. Re:The first problem that comes to mind.. by GreatBunzinni · · Score: 1

      The reason why nobody cried when separate math coprocessors disappeared was that not only did math coprocessors not disappear, but separate math coprocessors didn't disappear either.

      Back in those days, you needed a math coprocessor because more often than not the CPU didn't offer any support for basic features such as floating point arithmetic, which happens to be of fundamental importance. Yet, even when providing that support directly on the CPU and even providing vectorized versions of it became a standard feature, you still have a considerable number of people spending ungodly amounts of money on separate math coprocessors which are more commonly known as.... graphics cards. If there is any doubt that a graphics card is nothing more than a glorified math coprocessor then learning about OpenCL and CUDA should be enough to dispel it. And you know what? Nowadays people spend more money on those graphics co-processors than on CPUs.

      Now, I don't know how the majority of the people you know decide to lead their life, but what I know is that I do believe that if someone decides to take away their ability to switch graphics card or even in some cases install multiple graphics cards on a computer, then they will proverbially cry. Or at the very least be extremely pissed. Adding to this, from the consumer standpoint the ability to choose and the ability to upgrade their graphics cards is one of the things that still makes desktop computers as relevant as always. Just because nowadays we have practically disposable computers just lying around, such as cheap netbooks, and just because people purchase those portable computers to do other trivial stuff such as communicating and browsing simple sites, it doesn't mean that everyone suddenly stopped needing a graphics card which can be upgraded.

      --
      Slashdot, fix your code or at least hire someone who is competent at it to do it for you.
    14. Re:The first problem that comes to mind.. by drinkypoo · · Score: 1

      Who upgrades desktop machines? Most desktops go through their entire life without a single upgrade. Most users will pitch them and buy another computer if they develop a problem they don't know how to fix, let alone if the machine is too slow. Remember, we live in a disposable culture. It's interesting in that the Native Americans were big on throwing stuff into big piles too, but of course nothing they were working with was leaving a toxic debt.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    15. Re:The first problem that comes to mind.. by nonicknameavailable · · Score: 1

      I upgraded my Scaleo X with a new motherboard, hard drive, CPU, RAM and GPU.

      --
      Mendacem Memorem Esse Oportet
    16. Re:The first problem that comes to mind.. by RobbieThe1st · · Score: 1

      And they're less likely to fail due to less movement etc.
      There are *still* P4s in use, though they are finally being phased out -- and then only (likely) because the PSU's caps are failing. Same with other hardware from the same vintage, like screens.

    17. Re:The first problem that comes to mind.. by obarthelemy · · Score: 1

      Yes and no. Most customers (I'd guess 80%) actually don't care at all about performance (neither CPU nor GPU) because whatever's current nowadays is good enough for them. For those, an APU means cheaper prices and more hardware/software reliability.

      The rest will indeed need more CPU and/or GPU power, and neither Llano nor its successor will be for them, because the CPUs are lackluster, the GPU is OK but not great (equivalent to an entry-level discrete card), and, on top of that, CPU and GPU have to fight for RAM bandwidth, which becomes a major bottleneck.

      Again, the vast majority of the market, the ones looking at a Core i3 and a sub $75 vidcard, should look at Llano.

      --
      The Cloud - because you don't care if your apps and data are up in the air.
    18. Re:The first problem that comes to mind.. by gad_zuki! · · Score: 1, Flamebait

      Cheap people, gamers, power users, and businesses do. That's probably a good chunk of the desktop market right now.

      > Remember, we live in a disposable culture.

      Would you like some cheese with your whine?

    19. Re:The first problem that comes to mind.. by hedwards · · Score: 1

      Perhaps for most folks around here it's low end, but I recently got one, and I've been shocked at how well it performs. You're not going to be playing games that were made in the last few years, but it does a really good job at the sorts of things that people typically do. I needed something portable, durable and power efficient, and it does that quite well. I'm really curious to see what the new tool kits are going to be able to provide.

    20. Re:The first problem that comes to mind.. by hairyfeet · · Score: 1

      Uhhh...didn't read TFA, did you MR AC? It says plainly that in cases where there is a dedicated GPU the integrated will take over the physics and leave the dGPU to push the graphics. Think of it as the biggest baddest FPU ever created, sure it'll do graphics, and if all your friends want to do is play WoW it'll work fine for that, but it isn't gonna stomp some GDDR 5 800 stream processor beast.

      But the nice thing about this design is unlike today all those IGPs won't just be turned off if you have a real card, it'll be doing physics and other number crunching thus making your monster GPU even more insanely powerful because it isn't having to do both graphics and physics anymore. Frankly from the sounds of it it will be pretty sweet, just give it about 2 years for everything to get integrated which will be right about the time I'll be ready to move this AMD quad into the background and build me a new box. Go AMD!

      --
      ACs don't waste your time replying, your posts are never seen by me.
    21. Re:The first problem that comes to mind.. by Antisyzygy · · Score: 1

      The integrated GPU on the processor die won't make it impossible to buy and install an aftermarket graphics card. In fact, you could just use the integrated GPU for other things, like super fast matrix computations. A video game could in effect use the Fusion processor by allocating matrix computations to the GPU and scalar computations to the CPU, then leave an aftermarket graphics card only for rendering. A programmer would have to write the program to take advantage of this, but it's possible.

      --
      That brings me to an interesting point, / . is just "the ramblings of socially-inept, technology-literate news-mongers".
  3. Re:AMD's next strike against intel by loufoque · · Score: 1

    They already have Larrabee, which is pretty much the same thing but far better.

  4. Re:AMD's next strike against intel by chaboud · · Score: 2

    Dead. Project.

    Larrabee proved to have a few fundamental flaws, last I checked.

  5. I like the idea, but have concerns by Sycraft-fu · · Score: 5, Interesting

    One concern of mine is simply performance with unified memory. The reason is that memory bandwidth is a big factor in 3D performance. The kind of math you have to do just needs a shitload of memory access. This is why GPUs have such insane memory configurations. They have massively wide controllers, special high performance ram (GDDR5 is based on DDR3, but higher performance) and so on. That's wonderful, but also expensive.

    So it seems to me that you run into a situation where either you are talking about needing to have much more expensive memory for a computer, possibly with additional constraints (at high speeds memory on a stick isn't feasible, electrical issues are such that you have to solder it to the board) or a system where your performance suffers because it is starved for memory bandwidth. Please remember that it would also have to share memory with the CPU.

    Perhaps they've found a way to overcome this, but I'm skeptical.
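
    To put rough numbers on the bandwidth gap: peak theoretical bandwidth is just bus width times transfer rate. A back-of-the-envelope sketch, where the two configurations (dual-channel DDR3-1333 for system RAM, a 256-bit GDDR5 card at 4.8 GT/s per pin) are illustrative assumptions and not specs of any Fusion part:

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_sec):
    """Peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9

# Assumed example configurations, not measurements.
system_ram = peak_bandwidth_gb_s(bus_width_bits=128, transfers_per_sec=1333e6)   # 2 x 64-bit DDR3-1333
discrete_gpu = peak_bandwidth_gb_s(bus_width_bits=256, transfers_per_sec=4.8e9)  # 256-bit GDDR5

print(f"system RAM:   {system_ram:.1f} GB/s")
print(f"discrete GPU: {discrete_gpu:.1f} GB/s")
print(f"ratio:        {discrete_gpu / system_ram:.1f}x")
```

    Under those assumptions the discrete card has several times the peak bandwidth, and on an APU the CPU cores would be drawing from the smaller figure as well.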

    I also worry this could lead to fragmentation of the market. What I mean is right now we have a pretty nice unified situation from a developer perspective. AMD and Intel have all kinds of cross licensing agreements with regards to instruction sets. So the instructions for one are the instructions for the other. While there are special cases, like 3DNow that only AMD does, or AVX which Intel has and AMD has yet to implement, by and large you have no problems supporting both with a very similar, or dead identical, codebase.

    Likewise GPUs are unified from an app perspective. You talk to them with DirectX or OpenGL. The details of how AMD or nVidia do things aren't so important, that's handled. You use one interface to talk to whatever card the user has. Not saying there can't be issues, but by and large it is the same deal.

    Well this could change that. APUs might need a drastically different development structure. Ok fine, except AMD might be the only company that has them. Intel doesn't seem to be going down this road right now, and nVidia doesn't have a CPU division. So then as a developer you could have a problem where something that works well for traditional CPU/GPU doesn't work well, or maybe at all, for an APU.

    That could lead to a choice of three situations, none that good:

    1) You develop for traditional architectures. That's great for the majority of people, who are Intel owners (and people who own what is now current AMD stuff) but screws over this new, perhaps better, way of doing things.

    2) You develop for the APU. That is nice for the people who have it but it screws over the mass market.

    3) You develop two versions, one for each. Everyone is happy but your costs go way up from having more to maintain.

    Of course even if everything goes APU it could be problematic if AMD and Intel have very different ways of doing things. Their cross licensing does not extend to this sort of thing, and I could see them deciding to try and fight it out.

    So neat idea, but I'm not really sure it is a good one at this point.

    1. Re:I like the idea, but have concerns by Joce640k · · Score: 2

      This is why GPUs have such insane memory configurations. .... wonderful, but also expensive.

      Have you seen what sub-$100 graphics cards can do these days?

      This sort of integration could save enough money at the manufacturing end to make that level of graphics almost free to the end user, especially in laptops. It's a huge win.

      --
      No sig today...
    2. Re:I like the idea, but have concerns by YoopDaDum · · Score: 5, Interesting

      Unified memory is an implementation option, but not the only one. It definitely makes sense when price matters more than performance. But for a higher end part you could have separate memories. Look at AMD multi-core CPUs, it's already NUMA (light) from the start: each core has a directly attached bank with minimum latency, and can access the other cores' memory banks with a (small) additional latency. Extended here, the GPU could have a dedicated higher performance GDDR5 memory directly attached, but accessible from the CPU side (and similarly the GPU could access all the system memory). It's a NUMA extension for a hybrid architecture if you wish. It needs support from the OS/drivers to handle this in a transparent way, but NUMA is not new so existing know-how could be reused.
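
      A toy model of the NUMA idea, with made-up latencies: each processor sees its own bank as "near" and the other one as "far", and the average access cost depends on how much of the working set the OS/driver manages to place locally. All numbers and names here are hypothetical.

```python
def average_latency_ns(local_fraction, local_ns, remote_ns):
    """Expected latency of one access when local_fraction of accesses hit the near bank."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns

# Hypothetical figures: 60 ns to the directly attached bank, 100 ns to the other one.
LOCAL_NS, REMOTE_NS = 60.0, 100.0

for frac in (1.0, 0.9, 0.5):
    print(f"{frac:.0%} local -> {average_latency_ns(frac, LOCAL_NS, REMOTE_NS):.0f} ns")
```

      The point of the sketch is only that placement matters: the closer the local fraction gets to 1, the less the far bank's extra latency shows up.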

      Regarding performance, in principle an integrated solution can do better by offering tighter integration and more efficient exchanges between CPU and GPU than going through a lower speed / higher latency external bus as for a discrete GPU. We shouldn't judge the principle by today's implementations, as they target the low (bobcat based) and middle (llano) ends only, not yet the high end.
      The con of integration is that you lose the flexibility of choosing CPU and GPU separately, and upgrading separately, but as others have pointed out most people neither care about this nor use it in practice.

      As for fragmentation, it's the usual situation. You can hide the differences using things like OpenCL, but you'll sacrifice some performance initially compared to a targeted implementation. Most should target this when the tools become sufficiently mature. But if you want to extract all the juice you will have to be target dependent, and face this fragmentation indeed. Still, over time we can expect some convergence (the good ideas will become clearer, and be adopted). So with time the generic approach (OpenCL or the like) will become better and better, and fewer and fewer people will develop for a specific target as the decreasing performance advantage won't justify the cost. This process will not necessarily be fast ;) and we're just starting.

    3. Re:I like the idea, but have concerns by WaroDaBeast · · Score: 1

      Well, we could always have memory right on the motherboard, à la Sideport. Of course, more memory, such as 512 MB of GDDR5, would be better than today's Sideport memory's specifications (which is 1333 MHz DDR3, I think). But anyway, comparing HD 6xxx integrated GPUs to their non-integrated counterparts, I find the memory bandwidth not to be so bad.

      Any sub €60 graphics card I can buy comes with, at best, 1333-1400 MHz DDR3 memory anyway...

      --
      "The body may heal, but the mind is not always so resilient." -- Deus Ex: Human Revolution
    4. Re:I like the idea, but have concerns by drinkypoo · · Score: 1

      They have massively wide controllers, special high performance ram (GDDR5 is based on DDR3, but higher performance) and so on.

      I have a GT 240. It has 3/4 the functional units of the GTS 250, GDDR3 instead of GDDR5 (you can get a GDDR5 model now, but you couldn't when I bought it) and yet provides 3/4 the performance of the GTS. The memory bandwidth is clearly only an issue when you actually need that much bandwidth, which you don't if you're pushing slightly less polys etc. As long as the connection to memory is wide enough it won't be a problem for the low- to mid-range market they're aiming for.

      I also worry this could lead to fragmentation of the market. [...] Well this could change that. APUs might need a drastically different development structure.

      They might?

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    5. Re:I like the idea, but have concerns by Rockoon · · Score: 1

      Regarding performance, on principle an integrated solution can do better by offering tighter integration and more efficient exchanges between CPU and GPU than going through a lower speed / higher latency external bus as for a discrete GPU.

      This isn't quite right. On principle, a discrete solution doesn't have to compromise with the low-latency random access memory performance demands of the CPU, while an integrated solution does. For raw compute performance, the discrete solutions are starting out in a much better position.

      The latency savings only manifest as a win for small workloads, but small workloads ultimately don't matter (blink of an eye vs half-a-blink of an eye)

      --
      "His name was James Damore."
    6. Re:I like the idea, but have concerns by obarthelemy · · Score: 1

      I don't see why APUs need to be seen differently than discrete cards, from a software point of view. AMD has made abundantly clear that LLano is using a variant of their current Radeon architecture, all the hardware is and will remain abstracted anyway (through DirectX mainly).

      I'm sure there are specificities to an APU, and that they would benefit, possibly greatly benefit, from the apps addressing them in a more "native" way. But the same can surely be said of the discrete AMD and nVidia cards, and nobody is interested. Such is the dominance of DirectX anyway that graphics chip designers actually target DirectX support at the design stage of their chips. The same will go for APUs.

      --
      The Cloud - because you don't care if your apps and data are up in the air.
    7. Re:I like the idea, but have concerns by LWATCDR · · Score: 1

      They do address this but I am going to suspect that there will always be room for high end GPUs or at least there will be for a long time. APUs are going to target the good enough category first. If they are good enough for 1080p video and gaming they will be good enough for 90+% of the market. This will hopefully raise the bar on integrated graphics up to the usable level. For high end users the APU could be used for things like trans-coding, physics modeling, and other GPU friendly tasks while the graphics cards can be used for the display. In theory the APU will be good enough for even light CAD work and non-enthusiast gaming. There is always an option to add GDDRx to the system as well for the APU if more performance is needed.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    8. Re:I like the idea, but have concerns by LWATCDR · · Score: 1

      Yes
      http://www.tomshardware.com/reviews/best-graphics-card-game-performance-radeon-hd-6670,2935-2.html
      For $65 you can get a card with great 1680x1050 performance in most games.
      In other words good enough for most people.
      If they can get APUs up to that level which sounds possible it really will be great.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    9. Re:I like the idea, but have concerns by hairyfeet · · Score: 1

      Personally I'm hoping Nvidia buys out Via and then we could easily see two with this design (AMD and Nvidia) vs Intel going traditional CPU. According to TFA AMD plans to keep this design open and is happy to cross license, and I'm sure AMD and Nvidia have plenty of cross licensing deals already. Intel is doing its damnedest to cut off Nvidia from access to their arch and AMD really doesn't need Nvidia with the Radeon chips, so that leaves Nvidia the odd man out.

      So personally I'm hoping Nvidia just jumps in and buys out Via. The new Nano chips look pretty nice, especially with the built in crypto which would be great for servers when combined with Fermi, and Intel wouldn't say shit without risking antitrust smacking them down like it did MSFT. They have been hit with one fine too many by the EU so I have a feeling if they said shit they could get some seriously nasty antitrust dropped upon them.

      So to me the question is what is Nvidia gonna do. AMD and Intel adding APUs means the low end is gone to them, thanks to Intel their chipset business is DOA, and discrete GPUs will only get them so far. So in my mind there are really only two outcomes for them, either buy Via and give us a three way race, or slowly bleed until they end up bought by Intel. Let us just hope it's the former rather than the latter, as I'd hate to see Intel rewarded for their douchebaggery against Nvidia.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    10. Re:I like the idea, but have concerns by YoopDaDum · · Score: 1

      Regarding performance, on principle an integrated solution can do better by offering tighter integration and more efficient exchanges between CPU and GPU than going through a lower speed / higher latency external bus as for a discrete GPU.

      This isn't quite right. On principle, a discrete solution doesn't have to compromise with the low-latency random access memory performance demands of the CPU, while an integrated solution does. For raw compute performance, the discrete solutions are starting out in a much better position.

      Let's keep in mind that I'm talking about a possible high-end integrated solution that doesn't exist yet. This device would be NUMA-like, and have a high-speed memory on a wide bus optimized for the GPU, in addition to a classical memory optimized for the CPU. Still, CPU and GPU can access each other's memories with higher performance than in current discrete GPU solutions. Think about a multi-core Opteron memory organization, but instead of being symmetric (all memory ports identical) the ports are optimized for either CPU or GPU.

      In this context, there's no need to compromise on the GPU accesses to the GPU memory bank. So no loss compared to a discrete solution. But accesses to the CPU memory for CPU/GPU exchanges would still be better than with a current discrete solution.

      I agree it's all hypothetical as all current integrated solutions are low/middle end and with a single unified shared memory, but I just wanted to say that on principle an integrated solution can be a no compromise solution too, with such a dual bank / NUMA implementation, and still come on top for CPU/GPU exchanges.

      The latency savings only manifest as a win for small workloads, but small workloads ultimately don't matter (blink of an eye vs half-a-blink of an eye)

      There can be a self fulfilling element here: because the current discrete solutions are poor whenever there are significant exchanges required between CPU and GPU, people only use them for workloads mostly running on the GPU (big workloads). Indeed, with small workloads the exchange overhead would be too high. I agree that such big workloads would not benefit from a high-end integrated GPCPU.

      But a high-end / NUMA-like integrated implementation, by reducing the CPU/GPU overhead could make GPU acceleration more practical to smaller workloads than today. How common that is I can't say, but it would be an advantage for AMD to push in this direction as they are stronger than Intel on the GPU side and weaker on the CPU side (for now, waiting to see what bulldozer will be like ;). In other words, the high-end integrated GPCPU would not necessarily be much better for current workloads, but could make more workloads suited for GPCPU acceleration.
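
      A rough way to picture the argument above is a toy cost model: offloading pays a fixed CPU/GPU exchange cost, then runs each item faster. This is only a sketch. The function names and every number (microseconds per item, exchange overheads) are made-up assumptions for illustration, not measurements of any real part.

```python
# Toy cost model. All figures are illustrative assumptions, in microseconds.

def offload_time(work_items, per_item_gpu_us, exchange_overhead_us):
    """Time to run a job on the GPU, including the fixed CPU<->GPU exchange cost."""
    return exchange_overhead_us + work_items * per_item_gpu_us


def cpu_time(work_items, per_item_cpu_us):
    """Time to run the same job on the CPU alone."""
    return work_items * per_item_cpu_us


def break_even_items(per_item_cpu_us, per_item_gpu_us, exchange_overhead_us):
    """Smallest job size at which offloading is strictly faster than the CPU."""
    gain_per_item = per_item_cpu_us - per_item_gpu_us
    if gain_per_item <= 0:
        return None  # the GPU never wins in this model
    # Offloading wins once work_items * gain_per_item exceeds the overhead.
    return exchange_overhead_us // gain_per_item + 1


# Hypothetical parts: same compute speed, only the exchange overhead differs.
discrete = break_even_items(per_item_cpu_us=10, per_item_gpu_us=1,
                            exchange_overhead_us=900)
integrated = break_even_items(per_item_cpu_us=10, per_item_gpu_us=1,
                              exchange_overhead_us=90)
print(discrete, integrated)
```

      With these assumed numbers, cutting the exchange overhead by 10x shrinks the smallest job worth offloading by roughly the same factor, which is the "more workloads become suitable" effect described above.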

    11. Re:I like the idea, but have concerns by Agripa · · Score: 1

      Let's keep in mind that I'm talking about a possible high-end integrated solution that doesn't exist yet. This device would be NUMA-like, and have a high-speed memory on a wide bus optimized for the GPU, in addition to a classical memory optimized for the CPU. Still, CPU and GPU can access each other's memories with higher performance than in current discrete GPU solutions. Think about a multi-core Opteron memory organization, but instead of being symmetric (all memory ports identical) the ports are optimized for either CPU or GPU.

      I am not sure this is physically or economically feasible. Not only would the CPU require a lot more pins but additional area around the CPU for the soldered on memory would be needed. I had not noticed that graphics cards have a whole lot of extra room around their GPUs.

      GDDR5 chips are 32 bits wide so two are used per 64 bit channel. How many 64 bit GDDR5 channels would be required to make the performance improvement worth the area and pins needed? Would it be better just to add additional standard DIMM channels?
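
      For a feel of the numbers behind that question, peak channel bandwidth is just bus width times transfer rate. The sketch below is back-of-envelope arithmetic only: the 4 Gbit/s per pin GDDR5 rate and the 100 GB/s target are assumptions picked for illustration, and pin counts or board area are not modelled at all.

```python
import math


def channel_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Peak bandwidth of one memory channel in GB/s (1 GB = 10**9 bytes)."""
    return bus_width_bits / 8 * transfers_per_second / 1e9


def channels_needed(target_gb_s, per_channel_gb_s):
    """Number of identical channels required to reach a target peak bandwidth."""
    return math.ceil(target_gb_s / per_channel_gb_s)


# One 64-bit DDR3-1333 DIMM channel: 1333 million transfers per second.
ddr3_channel = channel_bandwidth_gb_s(64, 1333e6)

# One 64-bit GDDR5 channel (two 32-bit chips), ASSUMING 4 Gbit/s per pin.
gddr5_channel = channel_bandwidth_gb_s(64, 4e9)

target = 100  # hypothetical target in GB/s, chosen only for illustration
print(round(ddr3_channel, 2), gddr5_channel)
print(channels_needed(target, ddr3_channel), channels_needed(target, gddr5_channel))
```

      Under those assumptions a GDDR5 channel carries about three times what a DDR3-1333 channel does, so the trade-off is a few soldered-down wide channels against many more DIMM channels.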

    12. Re:I like the idea, but have concerns by Savantissimo · · Score: 1

      TFA says it can do 30GB/s memory transfers, while the CPU functions only need at most 12GB/s. 30GB/s isn't the fastest ever for a GPU, but it's quite respectable.

      Maybe I'm wrong, but it looks to me like it can cope with either CPU or GPU workloads without recompilation needed for either, and it can use most of the GPU silicon for parallelizable math computations with very little extra effort compared to most other GPUs.

      --
      "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
  6. Re:AMD's next strike against intel by myurr · · Score: 4, Insightful

    Except Larrabee failed because performance didn't live up to expectations and was a generation behind the best from AMD and nVidia. What this development from AMD allows is much more efficient interaction and sharing of data between a traditional CPU and an on-die GPU through updates to the memory architecture. These memory changes will also allow the parts to take advantage of the very fastest DDR3 memory that current CPUs struggle to fully utilise.

    The two most obvious scenarios for this technology are for accelerating traditional problems that take advantage of the existing vector units (SSE, etc.) by utilising the integrated GPU to massively accelerate these programs, and in gaming rigs where there is a discrete GPU the new architecture allows the integrated GPU to share some of the workload. The example given, and one that is increasingly relevant as all games now have physics engines, is for the discrete GPU to concentrate on pushing pixels to the screen and the integrated GPU to be used to accelerate the physics engine.

    Is it a game changer? Probably not in the first couple of generations, although it would be a very welcome boost to AMD's platform that could get them back in the game as the preferred CPU maker. But long term Intel will have to come up with an answer to this in some form as programmers get ever more adept at exploiting the GPU for general purpose computing, and changes like those AMD are incorporating into their designs make these techniques ever more powerful and relevant to wider ranges of problems. Adding more x86 cores won't necessarily be the answer.

  7. 1996 called ... by psergiu · · Score: 2

    ... and congratulated AMD for rediscovering SGI's O2 Unified Memory Architecture.

    PS: IBM PC jr. (1984) & Commodore Amiga (1985) were actually the first ones to use UMA. Could this mean we will have "Chip RAM" & "Fast RAM" again ? :)

    --
    1% APY, No fees, Online Bank https://captl1.co/2uIErYq Don't let your $$$ sit in a no-interest acct.
    1. Re:1996 called ... by BiggerIsBetter · · Score: 1

      Could this mean we will have "Chip RAM" & "Fast RAM" again ? :)

      That would actually make sense, given the current difference in graphics card RAM speed/cost vs system RAM speed/cost.

      --
      Forget thrust, drag, lift and weight. Airplanes fly because of money.
    2. Re:1996 called ... by Anonymous Coward · · Score: 1

      What? The BBC Micro (1981) had shared graphics memory as did many of its contemporaries (e.g. Vic 20, ZX81, Spectrum). I believe the Acorn Atom (1980) also did.

    3. Re:1996 called ... by bhtooefr · · Score: 1

      And the Apple II had it in 1977.

    4. Re:1996 called ... by psergiu · · Score: 1

      At least ZX81 & Spectrum had a fixed address for VideoRAM. With O2's UMA you could play a movie by just filling up the RAM with the uncompressed movie frames and moving the start address of the framebuffer at each vertical refresh. AFAIK, you could do the same on the Amiga (in ChipRAM) but there you were able to change the address and the resolution mid-frame - I have seen screens where the top part was high-res low-color and the bottom part low-res high-color.
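
      The movie trick is easy to sketch as a simulation: frames sit back to back in one block of RAM, and "playing" only moves the framebuffer start address. This is a toy model, not O2 or Amiga code. The tiny display size, the one-byte pixels and all function names are assumptions made up for the example.

```python
WIDTH, HEIGHT = 4, 2                 # tiny hypothetical display, 1 byte per pixel
FRAME_BYTES = WIDTH * HEIGHT


def load_movie(frames):
    """Lay uncompressed frames end to end in one block of 'unified' RAM."""
    ram = bytearray()
    for frame in frames:
        if len(frame) != FRAME_BYTES:
            raise ValueError("frame has the wrong size")
        ram += frame
    return ram


def scan_out(ram, fb_start):
    """Return what the display hardware would read for one refresh."""
    return bytes(ram[fb_start:fb_start + FRAME_BYTES])


def play(ram, frame_count):
    """Advance the framebuffer start address once per simulated vertical refresh."""
    shown = []
    fb_start = 0
    for _ in range(frame_count):
        shown.append(scan_out(ram, fb_start))
        fb_start += FRAME_BYTES      # no pixel data is copied, only the address moves
    return shown


movie = [bytes([shade]) * FRAME_BYTES for shade in range(3)]
ram = load_movie(movie)
print(play(ram, len(movie)) == movie)
```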

      --
      1% APY, No fees, Online Bank https://captl1.co/2uIErYq Don't let your $$$ sit in a no-interest acct.
  8. Re:AMD's next strike against intel by loufoque · · Score: 1

    Except Larrabee failed because performance didn't live up to expectations and was a generation behind the best from AMD and nVidia.

    The original plan was to release a 32-core Larrabee in 2009, with a maximum theoretical performance of 2 TFlops. That's more than the most powerful nvidia card available today.
    And unlike a GPU, you could actually reach that performance, since it's a real x86-compatible CPU you have full access to, with intrinsics similar to those of SSE (Larrabee is pretty much the ideal SIMD ISA -- much better than SSE or AVX) available on regular compilers.
    It also doesn't contain hardcoded fixed-function pipelines, which is a good thing.

    What this development from AMD allows is much more efficient interaction and sharing of data between a traditional CPU and an on-die GPU through updates to the memory architecture. These memory changes will also allow the parts to take advantage of the very fastest DDR3 memory that current CPUs struggle to fully utilise.

    Larrabee uses a high-bandwidth ring bus to communicate between cores, like the Cell architecture; that has been proven to be a very good design, and Intel adds cache-coherency hierarchy on top of it so that all cores see the same shared memory.

  9. WebGL support? by nikanth · · Score: 1

    Does it have WebGL support? i.e., address space protection and preemption support/kernel mode for shader programs?

  10. unified space by sam0737 · · Score: 1

    Maybe someone who read TFA could chime in. TFS mentioned unified address space, but not necessarily unified memory access, right? It could be just another virtual memory paging mechanism....

  11. Re:AMD's next strike against intel by Chris+Mattern · · Score: 2

    The original plan was to release a 32-core Larrabee in 2009, with a maximum theoretical performance of 2 TFlops.

    But since they couldn't do it, the original plan does mean much, now does it?

  12. Will it run Linux? by vigour · · Score: 4, Informative

    Will it run Linux?

    I'm not being facetious, I got stung by the lack of support by Nvidia for their Optimus graphics cards on my ASUS U30JC.

    Thankfully Martin Juhl has been working on a solution using VirtualGL, which gives us the use of our Nvidia cards under linux

    1. Re:Will it run Linux? by fuzzyfuzzyfungus · · Score: 4, Interesting

      I would (given ATI's historically somewhat weak driver team) be wholly unsurprised to see some rather messy teething pains; but (given AMD's historical friendliness, and the long-term trajectory of this plan), I suspect that it will actually be a boon to Linux and similar.

      The long term plan, it appears, would be to integrate the GPU sufficiently tightly with the CPU that it becomes, in effect, an instruction-set extension specialized for certain tasks, like SSE on steroids. If they reach that point, you'll basically have a CPU where running OpenGL "in software" is the same as running it "in hardware" on the embedded graphics board, because the embedded graphics board is simply the hardware implementation of some of the available CPU instructions, along with a few displayport interfaces and some monitor-management housekeeping.

      I'd be unsurprised, as with Optimus, to see some laptops released with an embedded/discrete GPU combination that is fucked in one way or another under anything that isn't the latest Windows, possibly making the discrete invisible, possibly forcing you to run the discrete all the time, or some other dysfunctional situation; but I'd tend to be optimistic about the long term: GPU driver support has always been a sore spot. Compiler support for CPU instructions, on the other hand, has generally been pretty good.

    2. Re:Will it run Linux? by vigour · · Score: 1

      ...but I'd tend to be optimistic about the long term: GPU driver support has always been a sore spot. Compiler support for CPU instructions, on the other hand, has generally been pretty good.

      Excellent point!

  13. Re:Great move for laptops by pandrijeczko · · Score: 1

    It's probably a little dangerous to make that assumption because whenever I've looked inside a laptop, the CPU is soldered to the motherboard, not plugged into a socket as in a desktop.

    Besides which, inside a laptop you have much less free space for heat dissipation and many of them already run reasonably hot - giving you the option of plugging in a faster CPU that generates more heat may end up frying some of the other internal components, which brings things like manufacturer warranties into question.

    APUs are a next logical step in portability and compactness. I like desktop PCs as much as the next guy but with APU technology, desktops are one step closer to their eventual demise.

    --
    Gentoo Linux - another day, another USE flag.
  14. Re:"Intel doesn't seem to be going down this road. by Rockoon · · Score: 1

    To quote AnandTech, "On average the A8-3850 [GPU] is 58% faster than the Core i5 2500K [GPU]. If we look at peak performance in games like Modern Warfare 2, Llano delivers over twice the frame rate of Sandy Bridge. This is what processor graphics should look like."

    This is comparing AMD's flagship APU @ $170 vs Intel's mid-range Sandy @ $220.

    The road Intel is going down is the same road it's always gone down. Delivering sub-par graphics performance to a crowd that isn't going to notice.

    --
    "His name was James Damore."
  15. Re:"Intel doesn't seem to be going down this road. by Targon · · Score: 1

    Intel GPU technology is so far behind AMD/ATI and NVIDIA, it makes sense that it has not drawn as much attention. The graphics side of Fusion is far more advanced than the integrated graphics we have seen on motherboards to this point as well.

  16. CPU, FPU, GPU, ALU, control unit, packaging by DragonHawk · · Score: 3, Interesting

    A "math coprocessor" is just the FPU (Floating Point Unit) of a particular era of microcomputers. The FPU implements machine instructions for floating point math. Before the microcomputer, when machines filled cabinets, you might have an FPU (on one or more circuit boards), you might not. Same with the early micros. Eventually they built the FPU into the same die as the CPU, so no need for a separate chip. The FPU is always tightly coupled to the CPU because it shares the same control unit as the CPU. (A CPU consists of a control unit plus an arithmetic/logic unit.) You can't change the design of one without changing the other.

    A GPU is different from an FPU. It doesn't process CPU instructions -- it has its own control unit. GPUs operate independently of the CPU.

    Building a GPU into the same die or IC package as the CPU won't prevent you from installing a discrete graphics card. No need to get all upset about it.

    Although the tech may eventually get to the point where you won't bother with a discrete graphics card. I suspect we'll eventually see a large package containing CPU, GPU and memory, for performance reasons. One will upgrade them all together.

    Before you panic about that: In the early days of minicomputers, CPUs were implemented as many boards containing lots of discrete logic and small scale integration. It was possible to do things like change how the adder was implemented, how memory was accessed, or add whole new machine instructions. You could "upgrade" at that level. That capability was lost with the move to (very) large scale integration. However, things are so much cheaper and faster with (V)LSI that it's worth it.

    So if $100 will bring you a new CPU, GPU, and RAM, running 10x faster than what you had before, then yah, I can see it happening, and being a win.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  17. Re:"Intel doesn't seem to be going down this road. by Bert64 · · Score: 1

    Intel GPUs only really target the low end, they are pretty weak compared to the offerings from ATI/AMD and nVidia...

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  18. Re:Pendulum swinging back-n-forth by marquis111 · · Score: 1

    You say that like it's a bad thing.

  19. well CAD and useing the GPU as a CPU is still ther by Joe_Dragon · · Score: 1

    Well, CAD and using the GPU as a CPU is still there. OpenCL makes the video card into a HIGH end FPU that can do stuff that the main CPU sucks at.

    Anyway, a video card still has faster RAM that is not shared with system RAM. On board video on some boards has a max of 2 displays (some boards force one to be analog). Now if ATI / AMD can have on board video with DP then you can do more. But I think if you need like 3-4+ screens an add-in video card may be better and save you the RAM hit.

  20. but better video at a lower cost is something to by Joe_Dragon · · Score: 1

    But better video at a lower cost is something to keep in mind.

    Apple better look out: a low end mini with i3 and on board video at $700+ will be a joke next to what AMD will have, plus it will have like 8-16 unused pci-e lanes. Apple better have a video chip in it on x8 pci-e and 2 TB ports on the other x8 pci-e.

  21. Lot's of laptops have CPU sockets not the apple on by Joe_Dragon · · Score: 1

    Lots of laptops have CPU sockets, not the Apple ones but lots of other ones.

  22. Re:but better video at a lower cost is something t by Rockoon · · Score: 1

    The Mac Mini uses an nVidia 320M, which benchmarks at about half of the AMD 6550 Llano.

    --
    "His name was James Damore."
  23. Larrabee by toastar · · Score: 1

    Believe it or not, Making a chip the size of a football field isn't really the best idea.

  24. Re:AMD's next strike against intel by Chris+Mattern · · Score: 1

    Argh. *doesn't* mean much.

  25. Re:AMD's next strike against intel by Dr.+Spork · · Score: 1

    I think about it like this: What are some computational problems which today justify a home user in buying an expensive machine rather than a cheap one? Not browsing or productivity or whatever else my mom does. It's media encoding, media processing, rendering and gaming. All of these could be radically sped up when programs effectively make use of the GPU as a supercharged vector unit extension of the CPU. Then there are computer functions like web hosting and compiling that won't benefit from this, but not that many computers do this. So this sort of thing will make a real difference to many real users.

  26. Re:AMD's next strike against intel by fast+turtle · · Score: 1

    Intel appears to be following a discrete core design while AMD with Fusion is following an all-in-one design. From looking at what AMD has released as to their roadmap, it appears that unlike Intel, the APU will become the math core (FPU) of the chip, with the CPU core becoming even smaller. This appears to be planned for either the 2nd or 3rd generation of the chips.

    Although we're seeing continual die shrinkage by Intel, I suspect that AMD's integration will result in far better energy savings than what Intel gains from die shrinkage. From a performance stance, the APU already beats Intel's GPU by a large margin and looking at the power consumption graphs from http://www.tomshardware.co.uk/a8-3500m-llano-apu,review-32207-22.html we're already seeing a more stable draw by the Fusion design compared to the i3. Yes the Intel design does drop into a far lower power stage but with proper emphasis on the rest of the other system chips, AMD should be able to cut power even further while retaining performance.

    --
    Mod me up/Mod me down: I wont frown as I've no crown
  27. Preemptive Multitasking? by Twinbee · · Score: 1

    Does this Fusion APU multitask so that it can run 2 or more kernels at once (with no worries of the watchdog kicking in and stopping >5 sec kernels) ?

    --
    Why OpalCalc is the best Windows calc
    1. Re:Preemptive Multitasking? by Twinbee · · Score: 1

      Are you saying we'll have to keep switching in and out of kernels to get a long task done? Obviously that makes coding a lot more bloated and unnecessarily painful.
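
      For what it's worth, the "switching in and out of kernels" workaround can be sketched as plain range splitting: cut the job into launches short enough to finish before the watchdog fires. This is only a host-side sketch. The throughput figure, the per-launch time limit and the function name are assumptions for illustration, and the actual kernel launch is left out.

```python
def split_into_launches(total_items, items_per_second, max_launch_seconds):
    """Yield (start, end) ranges so each launch takes at most max_launch_seconds."""
    max_items = int(items_per_second * max_launch_seconds)
    if max_items < 1:
        raise ValueError("even a single item would exceed the time limit")
    start = 0
    while start < total_items:
        end = min(start + max_items, total_items)
        yield start, end
        start = end


# Hypothetical job: 10 items at 1 item per second, keeping launches to 4 seconds
# so they stay below a 5 second watchdog.
launches = list(split_into_launches(total_items=10, items_per_second=1,
                                    max_launch_seconds=4))
print(launches)
```

      Each range would be handed to one short kernel launch, with any state carried in device memory between launches, which is exactly the bookkeeping being complained about.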

      --
      Why OpalCalc is the best Windows calc
    2. Re:Preemptive Multitasking? by import · · Score: 1

      From what I understand (I attended the AMD summit in question), Llano cannot multitask natively, although through the driver you should be able to do it and much more efficiently than in the past. I believe set up time for kernels has been drastically reduced with Llano, since there's no PCIE layer. Their future APUs will be introducing hardware scheduling so this will be better then...

  28. but now apple will be locked into intel video by Joe_Dragon · · Score: 1

    but now apple will be locked into intel video with the new intel cpus if they don't add in a video chip.

  29. WRT Intel by import · · Score: 1

    Anyone else notice the similarity between Llano's and Arrandale's memory controller configuration, i.e., that both put the MC on the GPU and have the CPU talk to the GPU via some protocol for data? Okay, in Llano's case there's the option of going directly to memory through WCs but still.

    And then, this FSA crap seems to be going in the direction of Sandy Bridge, i.e., a unified L3 cache... as much as I like AMD, they do seem like they're following in Intel's footsteps. This new architecture reminds me a little of Larrabee. Not that I know much about either, but IIRC in Demers' keynote he mentioned something like 24 CUs per chip... which seems way too low, I must have heard him wrong or there must be a factor of 40 or so I'm missing somewhere...

  30. Re:AMD's next strike against intel by chaboud · · Score: 1

    The academic use I was aware of. I'll stick to saying that products that have been relegated to non-commercial use are pretty much busted.

    That said, I'm still hopeful for some real-time global illumination. I'm doubtful that it will be Larrabee doing it, though, as the ring topology and memory transfer costs are just too wishful to work. Good first stab, but I'll wait for V2 (or V3).