AMD A10 Kaveri APU Details Emerge, Combining Steamroller and Graphics Core Next
MojoKid writes "There's a great deal riding on the launch of AMD's next-generation Kaveri APU. The new chip will be the first processor from AMD to incorporate significant architectural changes to the Bulldozer core AMD launched two years ago and the first chip to use a graphics core derived from AMD's GCN (Graphics Core Next) architecture. A strong Kaveri launch could give AMD back some momentum in the enthusiast business. Details are emerging that point to a Kaveri APU that's coming in hot, possibly a little hotter than some of us anticipated. Kaveri's Steamroller CPU core separates some of the core functions that Bulldozer unified and should substantially improve the chip's front-end execution. Unlike Piledriver, which could only decode four instructions per module per cycle (and topped out at eight instructions for a quad-core APU), Steamroller can decode four instructions per core, or 16 instructions per quad-core APU. The A10-7850K will offer a 512-core GPU while the A10-7700K will be a 384-core part. Again, GPU clock speeds have come down, from 844MHz on the A10-6800K to 720MHz on the new A10-7850K, but that should be offset by the gains from moving to AMD's GCN architecture."
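The decode and GPU figures in the summary are simple products, so a rough sketch of the arithmetic may help. The function names are made up for illustration, and the 384-shader count for the A10-6800K is not stated in the summary; it is assumed here from that part's published specs. The shader-times-clock product ignores every architectural difference between the older GPU design and GCN, so treat it as a crude upper-level comparison only.

```python
def decode_per_cycle(decoder_width, decoder_count):
    """Peak instructions decoded per cycle: decoder width times decoder count."""
    return decoder_width * decoder_count


def shader_clock_product(shaders, clock_mhz):
    """Crude throughput proxy: shader count times clock (ignores architecture)."""
    return shaders * clock_mhz


# Piledriver: one 4-wide decoder per module, two modules in a quad-core APU.
piledriver_decode = decode_per_cycle(4, 2)

# Steamroller: one 4-wide decoder per core, four cores in a quad-core APU.
steamroller_decode = decode_per_cycle(4, 4)

# A10-7850K figures come from the summary.
a10_7850k = shader_clock_product(512, 720)

# ASSUMPTION: 384 shaders for the A10-6800K (not given in the summary).
a10_6800k = shader_clock_product(384, 844)

print(piledriver_decode, steamroller_decode)
print(a10_7850k, a10_6800k, round(a10_7850k / a10_6800k, 3))
```

On these numbers the extra shaders slightly more than make up for the lower clock, before any GCN efficiency gain is counted.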
Is there something about polish that excites AMD?
I think AMD should have used Phenom II instead of Bulldozer, as Phenom has already proven its significance over Bulldozer. Besides, implementing GCN into Phenom II would start Phenom II's production again.
Laptops? While I'd love to see a nice, low cost CPU/GPU combo that can hang with my (rather meager) Athlon X2 6000+ and GT 240, I'm still running pretty low end gear. If this is targeted at enthusiasts they're just going to replace it with a card...
Come on AMD, you can do it, and then we can party like it's 2009!
I do not know how they stay in business.
Dunno about you, but I ain't gonna be excited by AMD's offerings anymore, after what they dished out to us on their Bulldozer rollout.
For more than a year before Bulldozer came into being they told us that Bulldozer was gonna be revolutionary - they hyped Bulldozer so much that many forums were filled with people who just couldn't wait to get their hands on it.
But when the rubber hit the tarmac everything went flat - that Bulldozer was a dog.
No man, I just ain't gonna believe AMD anymore.
"first chip to use a graphics core derived from AMD's GCN (Graphics Core Next) architecture" is kinda inaccurate, given that their GPUs are "chips" and have been in production using GCN for quite a while now. The PS4 and Xbox One also used GCN, and were even APUs.
Will this new Kaveri have True Audio? I assume it will support Mantle. I wonder if they are going to offer a configuration that will work with GDDR5 or DDR4. There are some interesting things happening on the memory front these days!
Will this new architecture of AMD support OpenCL 2.0?
"enthusiasts" don't give a rat's tail about on-board graphics.. so strip that shit out and give us an unlocked processor for less coin. tyvm.
Kaveri looks good for a budget gaming PC, but I think they are being a bit optimistic about the "dual graphics" feature. This is where you pair the iGPU with a dGPU to get better performance. AMD has never been able to get this feature to work properly. All it does is create "runt" frames, which make the FPS look higher without giving any visual improvement.
http://www.tomshardware.com/reviews/dual-graphics-crossfire-benchmark,3583.html
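To make the "runt frame" complaint concrete, here is a toy calculation. Everything in it is an assumption for illustration: the 5 ms cutoff for calling a frame a runt, the function name, and the made-up frame times. It is not how any particular benchmarking tool defines or measures runts.

```python
def fps_report(frame_ms, runt_threshold_ms=5.0):
    """Return (raw_fps, effective_fps) for a list of on-screen frame times.

    ASSUMPTION: a frame shown for less than runt_threshold_ms is a "runt"
    that adds to the frame counter but not to what the viewer sees.
    """
    total_seconds = sum(frame_ms) / 1000.0
    full_frames = [t for t in frame_ms if t >= runt_threshold_ms]
    raw_fps = len(frame_ms) / total_seconds
    effective_fps = len(full_frames) / total_seconds
    return raw_fps, effective_fps


# Made-up trace: every full 30 ms frame is followed by a 3 ms runt.
frames = [30.0, 3.0] * 10
raw, effective = fps_report(frames)
print(round(raw, 1), round(effective, 1))
```

With this trace the counter reports twice the frame rate that is actually visible, which is the effect the comment describes.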
Kaveri should be properly compared to the chips in the PS4 and Xbone. As such, it can be said that Kaveri is significantly poorer than either.
-Kaveri is shader (graphics) weak compared to the Xbone, which itself is VERY weak compared to the PS4.
-Kaveri should be roughly CPU equivalent (multi-threaded) to the CPU power of either console.
-Kaveri is memory bandwidth weak compared to the Xbone, which itself is VERY bandwidth weak compared to the PS4
-Kaveri is a generation ahead of the Xbone in HSA/hUMA concepts, but the PS4 is a generation ahead of the Kaveri
There was ZERO reason for AMD to release a new APU that was significantly weaker than even the solution in the Xbox One, but sadly this is what the idiots at AMD are doing. The Kaveri integrated graphics are far too poor for anyone who really cares about PC gaming, reducing the worth of Kaveri to a power hungry, fairly decent 4-core CPU. The REAL Kaveri II (not the Kaveri which is coming around Xmas 2013 under the confusing name of Kaveri II) will be released around Xmas 2014 or later, and should be somewhat more powerful than the Xbox One. Kaveri II will also share the same sophisticated HSA/hUMA that is currently only found in the PS4.
And maybe by some miracle, AMD will grow a brain, and give Kaveri II a 256-bit memory bus to GDDR5 memory, not the stinking 128-bit bus to DDR3 that Kaveri has.
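For a sense of scale on the bus-width complaint, peak memory bandwidth is roughly bus width in bytes times transfers per second. The sketch below assumes DDR3-1866 on the 128-bit bus and 5500 MT/s GDDR5 on a 256-bit bus; both transfer rates are illustrative assumptions, not figures from this comment, and real sustained bandwidth is lower than the theoretical peak.

```python
def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s: bytes per transfer times MT/s."""
    return bus_bits / 8 * mega_transfers_per_s / 1000


# ASSUMPTION: DDR3-1866 on a 128-bit (dual-channel) bus.
ddr3_128bit = peak_bandwidth_gbs(128, 1866)

# ASSUMPTION: GDDR5 at 5500 MT/s on a 256-bit bus.
gddr5_256bit = peak_bandwidth_gbs(256, 5500)

print(round(ddr3_128bit, 1), round(gddr5_256bit, 1))
print(round(gddr5_256bit / ddr3_128bit, 1))
```

Under those assumptions the wide GDDR5 setup has close to six times the peak bandwidth, which is the gap the comment is complaining about.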
At least Kaveri has True Audio (like the PS4 and 260/290 GPUs), and hopefully AMD's Mantle will enable PC systems using Kaveri with discrete graphics cards to still benefit from doing some work on the integrated GPU cores as well.
This is the chip that unites the CPU and GPU into one programming model with unified memory addressing. Heterogeneous System Architecture (HSA) and Heterogeneous Uniform Memory Access (hUMA) are the nice buzzword acronyms that AMD came up with, but it basically removes the latency from accessing GPU resources and makes memory sharing between the CPU cores and GPU cores copy free. You can now dispatch instructions to the GPU cores almost as easily and as quickly as you do to the basic ALU/FPU/SSE units of the CPU.
Will software be written to take advantage of this though?
Will Intel eventually support it on their stuff?
Ars article on the new architecture.
Anandtech article on the Kaveri release.
- Single-thread performance matters much more than multi-thread performance, and Kaveri has almost twice the single-thread performance of the Xbone and PS4 chips.
- Memory bandwidth is expensive. You need either a wide and expensive bus, or expensive low-capacity graphics DRAM that has to be soldered, which limits you to 4 GiB of memory (with the highest-capacity GDDR chips out there) with zero possibility of upgrading it later, or both (and MAYBE get 8 GiB of soldered memory). There have been rumours, though, that Kaveri might support GDDR5 for configurations with only 4 GiB of soldered memory.
- And when you have that limited memory bandwidth, it does not make sense to waste die space on creating a monster GPU which is starved by the lack of bandwidth.
- ALL the mentioned chips are of the same generation. All support cache-coherent unified memory.
As a PC chip, Kaveri makes much more sense:
- Software that matters on PC cannot use 8 threads. Kaveri is much faster at most software.
- A weaker GPU side, the ability to use cheap DDR3, and a narrower memory bus make the Kaveri chip and Kaveri-based systems cheaper to manufacture.
- The CPU can be socketed and need not be soldered, and the memory can use DIMMs instead of being soldered to the motherboard. That gives the ability to upgrade something and lets system manufacturers easily create different configurations.
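The point about software that cannot use 8 threads can be illustrated with Amdahl's law, which the comment does not name but which is the standard way to reason about it. The 50% parallel fraction below is an arbitrary assumption chosen for illustration, not a measurement of any real PC software.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# ASSUMPTION: half of the program's run time can be spread across cores.
four_cores = amdahl_speedup(0.5, 4)
eight_cores = amdahl_speedup(0.5, 8)

print(round(four_cores, 2), round(eight_cores, 2))
```

With half the work serial, doubling from four to eight cores adds only about 11%, so a chip with fewer but faster cores can come out ahead on such a workload.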
I had a core2 E5300, and I replaced it with a new Q6600 from ebay dirt cheap, yes more cores, a bit hotter, but more cores is more flops.
I'll be looking for an even faster Q9550, as it's close to i7s, but way cheaper.
Yes, we can buy full PCs for $300+ that give you latest i7s running way faster.
But reusing old Qxxx's on older mbs is close enough when it costs less than 3 pizzas.
Exactly. That's why the big deal with Intel's Haswell was basically "consumes a lot less power", the rest was incremental and a few added instructions for the future. AMD seems to have the same tech analysts as Netcraft crying "The Desktop is dying, the desktop is dying!"
If you plan to own anything that is a desktop, then anything like this from AMD or Intel, which can be replaced with something that is TWICE as fast using the cheapest $50 dedicated video card, makes the advances absolutely meaningless.
In fact, the only thing this affects is that you might be able to get away from buying a $2000 gaming laptop and instead buy a $500 laptop that can marginally play most games. Congratulations for moving the mediocrity yard stick a bit further...
The current Richland APUs have a native memory controller that runs at 1866MHz, so if you put in 9-10-9 RAM of that speed and overclock it a hair, you get graphics performance that ranks at 6.9-7.0 in the WEI in Win7. Remember, you have to jack up the memory speed since the GPU inside the CPU is using system memory instead of GDDR5. That rating is medium speed for games. So that's around $139 for the top of the line chip and $75 for the RAM.
Now let's look at Intel's solution for a basic gaming or HD video playback style computer. Oh crap, that's still slightly inferior at a modern i3 but whatever, let's go with that for about $140. 1600MHz RAM would be about $60 (both 8GB btw) and now we need an Nvidia GT640 to come close in performance. There goes $80. Intel just got demolished. Anyone building a basic gaming PC for a kid or something or a DVR PC, that's a no brainer. (remember, video encoding + hyperthreading = bad idea).
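Adding up the prices quoted in the two paragraphs above gives the following totals. The prices are the commenter's own rough figures; the dictionary keys are just labels for this sketch.

```python
# Prices in USD as quoted in the comment above.
amd_build = {"top Richland APU": 139, "8GB 1866MHz RAM": 75}
intel_build = {"modern i3": 140, "8GB 1600MHz RAM": 60, "Nvidia GT640": 80}

amd_total = sum(amd_build.values())
intel_total = sum(intel_build.values())

print(amd_total, intel_total, intel_total - amd_total)
```

On those figures the AMD build comes out $66 cheaper for the parts that differ.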
But wait, there's more! Their 6-core non-APU chip blows away an i3 and some of their i5 processors while costing almost half. So I'd even put one into a high end gaming system with a dedicated graphics card. I really wouldn't go with Intel for anything other than multiple single-core-only processes. Why is anyone still buying Intel?
I could be wrong, but it had little to do with AMD and more to do with MS specifications.
The only difference between the graphics cores on the Xbox One and PS4 is that the PS4 uses newer GDDR5 memory, while the Xbox uses DDR3. Xbox tried to compensate for the slower memory by adding additional cache on die; however, this takes up physical real estate, which forced them to use a couple fewer cores (in exchange for faster memory handling). To simply say one is faster/better than the other is a bit misleading.
The reason for this was that MS speculated that the new GDDR5 memory would be in short supply and there would not be enough production to supply their manufacturing. Considering the PS4, if they both used it, they would have probably been correct.
So anyway, you are only looking at the GPU aspect, but it is integrated with the CPU, and that does include an increased amount of cache on the Xbox side. I think it is a bit early to tell what real life difference it makes. Early comparisons say little to none. Given time maybe; of course by then MS may adapt their design.
I am unbiased in this as I am not buying either, at least not for some time...
Having seen it go wrong, I cringe when I hear of companies that depend on a "home run" or Hail-Mary "immaculate reception". Personally I'd love it if AMD was wildly successful. Intel is coasting without significant competition.
Can some of these cores work on game AI whilst others handle graphics, or can they only work on one task all at once? Could they do game AI at all? And can programmers program for gpu cores cross-platform or is it still one brand at a time?
I was doing some reading on Mantle, and there's some interesting things I noted. One of the things about Mantle is you can create "task" queues. You register a queue with some consumer, be that the CPU or a GPU. Registering the queue is a system call, but the queue itself is in user land. Each task is a data structure that contains a few things, several of them were stuff that I was less interested in, but a few stood out. One was a pointer to a function and another was a pointer to some data.
The way this sounds like it works is your CPU can do some work, then package it up in a nice area in memory and enqueue a function pointer and data pointer into this queue. The GPU will then be notified without any system calls, then it will, at its leisure, look at the function pointer and start executing the code against the data pointer, which is probably your matrix of data to crunch.
Here comes another cool part. Once the GPU is done crunching this data, it can do the same thing back at the CPU because the GPU can have queues registered against the CPU. This means the CPU and GPU can ping-pong work back and forth with little effort.
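The ping-pong described above can be mocked up with two ordinary threads and two queues. To be clear, this is a toy model and not the Mantle API: the `Device` class, the function names, and the use of Python threads to stand in for a CPU and a GPU are all invented for illustration. Each task is just a function reference plus a data reference, mirroring the function pointer and data pointer mentioned in the comment.

```python
import queue
import threading


class Device:
    """Toy consumer that drains its own task queue on a worker thread."""

    def __init__(self, name):
        self.name = name
        self.tasks = queue.Queue()
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def enqueue(self, func, data):
        # A task is a (function, data) pair, standing in for two pointers.
        self.tasks.put((func, data))

    def _run(self):
        while True:
            task = self.tasks.get()
            if task is None:  # sentinel: shut the worker down
                break
            func, data = task
            func(data)

    def stop(self):
        self.tasks.put(None)
        self.thread.join()


def run_ping_pong(values):
    """The 'GPU' squares the values, then hands the result back to the 'CPU'."""
    cpu = Device("cpu")
    gpu = Device("gpu")
    log = []
    result = {}
    done = threading.Event()

    def cpu_finish(data):
        log.append("cpu: sum")
        result["total"] = sum(data)
        done.set()

    def gpu_square(data):
        log.append("gpu: square")
        squared = [x * x for x in data]
        cpu.enqueue(cpu_finish, squared)  # pong: work goes back to the CPU

    gpu.enqueue(gpu_square, values)  # ping: CPU side hands work to the GPU
    done.wait()
    gpu.stop()
    cpu.stop()
    return result["total"], log


total, log = run_ping_pong([1, 2, 3])
print(total, log)
```

The real appeal described in the comment is that the hand-off needs no system call; a Python queue obviously does not capture that part, only the shape of the producer-consumer flow.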
How does the GPU/CPU get notified? Well, it just so happens that these tasks are 64 bytes, the size of a cache line. This means the cache-coherency protocol could easily notify the device when a queue has work ready, effectively having a hardware accelerated event system. I'm sure there are other more traditional ways to do this for non IGPs.
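As a rough picture of what a 64-byte task could look like in memory, here is a sketch using ctypes. The layout is hypothetical: only the function pointer and data pointer come from the description above, the `other` field is padding standing in for the fields the comment does not describe, and the struct name is invented.

```python
import ctypes

CACHE_LINE_BYTES = 64
POINTER_BYTES = ctypes.sizeof(ctypes.c_void_p)


class TaskPacket(ctypes.Structure):
    """HYPOTHETICAL layout: two pointers plus padding up to one cache line."""

    _fields_ = [
        ("func", ctypes.c_void_p),  # pointer to the code to run
        ("data", ctypes.c_void_p),  # pointer to the data to run it on
        # Placeholder for the other, undescribed fields.
        ("other", ctypes.c_uint8 * (CACHE_LINE_BYTES - 2 * POINTER_BYTES)),
    ]


def packets_per_page(page_bytes=4096):
    """How many such packets fit in one memory page of the given size."""
    return page_bytes // ctypes.sizeof(TaskPacket)


print(ctypes.sizeof(TaskPacket), packets_per_page())
```

Because each packet fills exactly one cache line, writing a packet touches exactly one line, which is what would let a coherency mechanism act as the notification in the way the comment speculates.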
Since both the GPU and CPU use the same protected memory, there is no data copying the programmer needs to be aware of, it's all transparent. All pointers naturally work. Not only that, but the GPU can cause page faults, so data sets no longer must fit into GPU memory, but can actually be stored in system memory, or even better, swapped out. I'm not saying swapping is good, just that it's much easier to handle than a programmer manually doing memory management.
Even more good news. These GPUs are full C/C++ capable. No funny custom languages to use, good old C. Nothing says "I like to work with buffers of data and pointers" like C.
So the biggest reason Mantle will help is because it can completely bypass system calls and allow producer-consumer queues and use event notification for when work is ready. Mantle is supposed to be GPU independent, so Nvidia should also be able to implement it, but without tight GPU integration, I'm not sure it will be as efficient, but still better than system calls.
What can happen now is a network of task queues connecting the CPU, IGP, and any other GPUs. If you have more than one GPU, each GPU can have its own queue. You can actually register as many queues as you want, which means an 8 core CPU will probably have one queue for each core for each device. This could be a first great attempt at unifying GPUs and CPUs into one massive processing system.
DICE has some interesting stuff about how they can get BF4 efficiently using 90%-95% of an 8 core CPU while offloading lots of work to the GPU and IGP. Better use of multi-core CPUs, lower latency, higher throughput, what's not to like? The design looks good, the idea sounds awesome, now we wait for the implementation. No matter what happens, I see this eventually being the future, be it Mantle or some other API.
How does it compare in per-core performance to Intel chips? Everything else is just meaningless techno-babble.