AMD Fusion System Architecture Detailed
Vigile writes "At the first AMD Fusion Developer Summit near Seattle this week, AMD revealed quite a bit of information about its next-generation GPU architecture and the eventual goals it has for the CPU/GPU combinations known as APUs. The company is finally moving away from a VLIW architecture and instead is integrating a vector+scalar design that allows for higher utilization of compute units and easier hardware scheduling. AMD laid out a 3-year plan to offer features like unified address space and fully coherent memory for the CPU and GPU that have the potential to dramatically alter current programming models. We will start seeing these features in GPUs released later in 2011."
What's wrong with hardware!
Humans are too stupid to program it.
Not sure what the fix is; hardware keeps exploding and we are stuck with Windows 7, lol 8, or (CAT), lol Lion.
Is that the modular nature of current components allows for relatively easy upgrading and a comparatively low cost. Buying a new graphics card that has the price of a GPU and dedicated video RAM is reasonable. Having to buy a new CPU every time you want to upgrade your GPU could get unreasonably expensive fast.
They already have Larrabee, which is pretty much the same thing but far better.
Dead. Project.
Larrabee proved to have a few fundamental flaws, last I checked.
One concern of mine is simply performance with unified memory. The reason is that memory bandwidth is a big factor in 3D performance. The kind of math you have to do just needs a shitload of memory access. This is why GPUs have such insane memory configurations. They have massively wide controllers, special high performance ram (GDDR5 is based on DDR3, but higher performance) and so on. That's wonderful, but also expensive.
So it seems to me that you run into a situation where either you need much more expensive memory for a computer, possibly with additional constraints (at high speeds memory on a stick isn't feasible; electrical issues are such that you have to solder it to the board), or you have a system where performance suffers because it is starved for memory bandwidth. Remember that the GPU would also have to share that memory with the CPU.
Perhaps they've found a way to overcome this, but I'm skeptical.
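To put rough numbers on that bandwidth gap: theoretical peak bandwidth is just bus width times transfer rate. A back-of-the-envelope sketch, assuming dual-channel DDR3-1333 (128 bits total) on the system side and a 256-bit GDDR5 bus at 4 GT/s on a discrete card. Both configurations are illustrative assumptions of mine, not figures from AMD's presentation:

```python
def peak_bandwidth_gbs(bus_width_bits, transfers_per_sec):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9

# Assumed configurations, for illustration only.
system_ram = peak_bandwidth_gbs(bus_width_bits=128, transfers_per_sec=1333e6)
discrete_gpu = peak_bandwidth_gbs(bus_width_bits=256, transfers_per_sec=4000e6)

print(f"system RAM:   {system_ram:.1f} GB/s")
print(f"discrete GPU: {discrete_gpu:.1f} GB/s")
print(f"ratio:        {discrete_gpu / system_ram:.1f}x")
```

Under those assumptions the discrete card has roughly six times the peak bandwidth, and the integrated GPU would still have to split its smaller share with the CPU.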
I also worry this could lead to fragmentation of the market. What I mean is right now we have a pretty nice unified situation from a developer perspective. AMD and Intel have all kinds of cross licensing agreements with regards to instruction sets. So the instructions for one are the instructions for the other. While there are special cases, like 3DNow that only AMD does, or AVX which Intel has and AMD has yet to implement, by and large you have no problems supporting both with a very similar, or dead identical, codebase.
Likewise GPUs are unified from an app perspective. You talk to them with DirectX or OpenGL. The details of how AMD or nVidia do things aren't so important; that's handled. You use one interface to talk to whatever card the user has. Not saying there can't be issues, but by and large it is the same deal.
Well this could change that. APUs might need a drastically different development structure. Ok fine, except AMD might be the only company that has them. Intel doesn't seem to be going down this road right now, and nVidia doesn't have a CPU division. So then as a developer you could have a problem where something that works well for traditional CPU/GPU doesn't work well, or maybe at all, for an APU.
That could lead to a choice of three situations, none that good:
1) You develop for traditional architectures. That's great for the majority of people, who are Intel owners (and people who own what is now current AMD stuff) but screws over this new, perhaps better, way of doing things.
2) You develop for the APU. That is nice for the people who have it but it screws over the mass market.
3) You develop two versions, one for each. Everyone is happy but your costs go way up from having more to maintain.
Of course even if everything goes APU it could be problematic if AMD and Intel have very different ways of doing things. Their cross licensing does not extend to this sort of thing, and I could see them deciding to try and fight it out.
So neat idea, but I'm not really sure it is a good one at this point.
Except Larrabee failed because performance didn't live up to expectations and was a generation behind the best from AMD and nVidia. What this development from AMD allows is much more efficient interaction and sharing of data between a traditional CPU and an on-die GPU through updates to the memory architecture. These memory changes will also allow the parts to take advantage of the very fastest DDR3 memory that current CPUs struggle to fully utilise.
The two most obvious scenarios for this technology are for accelerating traditional problems that take advantage of the existing vector units (SSE, etc.) by utilising the integrated GPU to massively accelerate these programs, and in gaming rigs where there is a discrete GPU the new architecture allows the integrated GPU to share some of the workload. The example given, and one that is increasingly relevant as all games now have physics engines, is for the discrete GPU to concentrate on pushing pixels to the screen and the integrated GPU to be used to accelerate the physics engine.
Is it a game changer? Probably not in the first couple of generations, although it would be a very welcome boost to AMD's platform that could get them back in the game as the preferred CPU maker. But long term Intel will have to come up with an answer to this in some form as programmers get ever more adept at exploiting the GPU for general purpose computing, and changes like those AMD are incorporating into their designs make these techniques ever more powerful and relevant to wider ranges of problems. Adding more x86 cores won't necessarily be the answer.
... and congratulated AMD for rediscovering SGI's O2 Unified Memory Architecture...
PS: The IBM PCjr (1984) & Commodore Amiga (1985) were actually the first ones to use UMA. Could this mean we will have "Chip RAM" & "Fast RAM" again? :)
The original plan was to release a 32-core Larrabee in 2009, with a maximum theoretical performance of 2 TFlops. That's more than the most powerful nVidia card available today.
And unlike a GPU, you could actually reach that performance, since it's a real x86-compatible CPU you have full access to, with intrinsics similar to those of SSE (Larrabee is pretty much the ideal SIMD ISA -- much better than SSE or AVX) available on regular compilers.
It also doesn't contain hardcoded fixed-function pipelines, which is a good thing.
Larrabee uses a high-bandwidth ring bus to communicate between cores, like the Cell architecture; that has been proven to be a very good design, and Intel adds cache-coherency hierarchy on top of it so that all cores see the same shared memory.
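For what it's worth, a 2 TFlops peak falls out of simple arithmetic: cores times SIMD lanes times flops per lane per cycle times clock. A rough sketch, assuming 16 single-precision lanes per core (512-bit vectors), a fused multiply-add counted as 2 flops, and a 2 GHz clock. The clock speed in particular is my assumption, not a confirmed Intel spec:

```python
def peak_tflops(cores, simd_lanes, flops_per_lane_per_cycle, clock_hz):
    """Theoretical peak throughput in TFlops (10^12 flops per second)."""
    return cores * simd_lanes * flops_per_lane_per_cycle * clock_hz / 1e12

# Assumed figures: 32 cores, 16 lanes, FMA = 2 flops, 2 GHz clock.
larrabee_peak = peak_tflops(cores=32, simd_lanes=16,
                            flops_per_lane_per_cycle=2, clock_hz=2.0e9)

print(f"{larrabee_peak:.3f} TFlops")  # 2.048 TFlops
```

This is a ceiling only: it assumes every lane issues a multiply-add on every cycle and never waits on memory.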
Does it have WebGL support? i.e., address space protection and preemption support/kernel mode for shader programs?
Maybe someone who read TFA could chime in. TFS mentioned unified address space, but not necessarily unified memory access, right? It could be just another virtual memory paging mechanism....
But since they couldn't do it, the original plan does mean much, now does it?
Will it run Linux?
I'm not being facetious, I got stung by the lack of support by Nvidia for their Optimus graphics cards on my ASUS U30JC.
Thankfully Martin Juhl has been working on a solution using VirtualGL, which gives us the use of our Nvidia cards under Linux.
It's probably a little dangerous to make that assumption because whenever I've looked inside a laptop, the CPU is soldered to the motherboard, not plugged into a socket as in a desktop.
Besides which, inside a laptop you have much less free space for heat dissipation and many of them already run reasonably hot. Giving you the option of plugging in a faster CPU that generates more heat may end up frying some of the other internal components, which brings things like manufacturer warranties into question.
APUs are a next logical step in portability and compactness. I like desktop PCs as much as the next guy but with APU technology, desktops are one step closer to their eventual demise.
Gentoo Linux - another day, another USE flag.
To quote AnandTech, "On average the A8-3850 [GPU] is 58% faster than the Core i5 2500K [GPU]. If we look at peak performance in games like Modern Warfare 2, Llano delivers over twice the frame rate of Sandy Bridge. This is what processor graphics should look like."
This is comparing AMD's flagship APU @ $170 vs Intel's mid-range Sandy @ $220.
The road Intel is going down is the same road it's always gone down: delivering sub-par graphics performance to a crowd that isn't going to notice.
"His name was James Damore."
Intel GPU technology is so far behind AMD/ATI and NVIDIA, it makes sense that it has not drawn as much attention. The graphics side of Fusion is far more advanced than the integrated graphics we have seen on motherboards to this point as well.
A "math coprocessor" is just the FPU (Floating Point Unit) of a particular era of microcomputers. The FPU implements machine instructions for floating point math. Before the microcomputer, when machines filled cabinets, you might have an FPU (on one or more circuit boards), you might not. Same with the early micros. Eventually they built the FPU into the same die as the CPU, so no need for a separate chip. The FPU is always tightly coupled to the CPU because it shares the same control unit as the CPU. (A CPU consists of a control unit plus an arithmetic/logic unit.) You can't change the design of one without changing the other.
A GPU is different from an FPU. It doesn't process CPU instructions -- it has its own control unit. GPUs operate independently of the CPU.
Building a GPU into the same die or IC package as the CPU won't prevent you from installing a discrete graphics card. No need to get all upset about it.
Although the tech may eventually get to the point where you won't bother with a discrete graphics card. I suspect we'll eventually see a large package containing CPU, GPU and memory, for performance reasons. One will upgrade them all together.
Before you panic about that: In the early days of minicomputers, CPUs were implemented as many boards containing lots of discrete logic and small scale integration. It was possible to do things like change how the adder was implemented, how memory was accessed, or add whole new machine instructions. You could "upgrade" at that level. That capability was lost with the move to (very) large scale integration. However, things are so much cheaper and faster with (V)LSI that it's worth it.
So if $100 will bring you a new CPU, GPU, and RAM, running 10x faster than what you had before, then yah, I can see it happening, and being a win.
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Intel GPUs only really target the lowend, they are pretty weak compared to the offerings from ATI/AMD and nVidia...
You say that like it's a bad thing.
Well, CAD and using the GPU as a CPU are still there. OpenCL makes the video card into a HIGH end FPU that can do stuff that the main CPU sucks at.
Anyway, a video card still has faster RAM that is not shared with system RAM. Onboard video on some boards has a max of 2 displays (some boards force one to be analog). Now if ATI / AMD can have onboard video with DP then you can do more. But I think if you need like 3-4+ screens, an add-in video card may be better and save you the RAM hit.
But better video at a lower cost is something to keep in mind.
Apple better look out: a low-end Mini with an i3 and onboard video at $700+ will be a joke next to what AMD will have, plus it will have like 8-16 unused PCI-e lanes. Apple better have a video chip in it on x8 PCI-e and 2 TB ports on the other x8 PCI-e.
Lots of laptops have CPU sockets; not the Apple ones, but lots of other ones.
The Mac Mini uses an nVidia 320M, which benchmarks at about half of the AMD 6550 Llano.
Believe it or not, making a chip the size of a football field isn't really the best idea.
Argh. *doesn't* mean much.
I think about it like this: What are some computational problems which today justify a home user in buying an expensive machine rather than a cheap one? Not browsing or productivity or whatever else my mom does. It's media encoding, media processing, rendering and gaming. All of these could be radically sped up when programs effectively make use of the GPU as a supercharged vector unit extension of the CPU. Then there are computer functions like web hosting and compiling that won't benefit from this, but not that many computers do this. So this sort of thing will make a real difference to many real users.
Intel appears to be following a discrete core design while AMD with Fusion is following an all-in-one design. From looking at what AMD has released as to their roadmap, it appears that, unlike Intel's approach, the APU will become the math core (FPU) of the chip, with the CPU core becoming even smaller. This appears to be planned for either the 2nd or 3rd generation of the chips.
Although we're seeing continual die shrinkage by Intel, I suspect that AMD's integration will result in far better energy savings than what Intel gains from die shrinkage. From a performance stance, the APU already beats Intel's GPU by a large margin, and looking at the power consumption graphs from http://www.tomshardware.co.uk/a8-3500m-llano-apu,review-32207-22.html we're already seeing a more stable draw by the Fusion design compared to the i3. Yes, the Intel design does drop into a far lower power stage, but with proper emphasis on the rest of the system chips, AMD should be able to cut power even further while retaining performance.
Mod me up/Mod me down: I wont frown as I've no crown
Does this Fusion APU multitask so that it can run 2 or more kernels at once (with no worries of the watchdog kicking in and stopping >5 sec kernels) ?
But now Apple will be locked into Intel video with the new Intel CPUs if they don't add in a video chip.
Anyone else notice the similarity between Llano's and Arrandale's memory controller configuration, i.e., that both put the MC on the GPU and have the CPU talk to the GPU via some protocol for data? Okay, in Llano's case there's the option of going directly to memory through WCs but still.
And then, this FSA crap seems to be going in the direction of Sandy Bridge, i.e., a unified L3 cache... as much as I like AMD, they do seem like they're following in Intel's footsteps. This new architecture reminds me a little of Larrabee. Not that I know much about either, but IIRC in Demers' keynote he mentioned something like 24 CUs per chip... which seems way too low; I must have heard him wrong or there must be a factor of 40 or so I'm missing somewhere...
The academic use I was aware of. I'll stick to saying that products that have been relegated to non-commercial use are pretty much busted.
That said, I'm still hopeful for some real-time global illumination. I'm doubtful that it will be Larrabee doing it, though, as the ring topology and memory transfer costs are just too wishful to work. Good first stab, but I'll wait for V2 (or V3).