AMD's Next-Gen Steamroller CPU Could Deliver Where Bulldozer Fell Short
MojoKid writes "Today at the Hot Chips Symposium, AMD CTO Mark Papermaster is taking the wraps off the company's upcoming CPU core, codenamed Steamroller. Steamroller is the third iteration of AMD's Bulldozer architecture and an extremely important part for AMD. Bulldozer, which launched just over a year ago, was a disappointment. The company's second-generation Bulldozer implementation, codenamed Piledriver, offered a number of key changes and was incorporated into the Trinity APU family that debuted last spring. Steamroller is the first refresh of Bulldozer's underlying architecture and may finally deliver the sort of performance and efficiency AMD was aiming for when it built Bulldozer in the first place. AMD has enhanced the fetch and decode architecture and improved scheduler efficiency and cache load latency, which combined could bring a claimed 15 percent gain in performance per watt. AMD expects to ship Steamroller sometime in 2013 but wouldn't offer timing detail beyond that."
They all sound like sexual positions.
Things like hitting the 1GHz mark first, and making a workable 64-bit chip that also speaks x86, only get you so far. AMD needs to come up with something cool, or else they're doomed to play catch-up.
I want to delete my account but Slashdot doesn't allow it.
AMD boards have better PCI-E lanes than Intel chips.
With Intel you need to go high end to get more than 16 lanes + DMI
I just got a Gigabyte board with dual PCI-e, 4 RAM slots (1833, and if you OC it a little, 2000), and all the latest buzzwords for like 70 bucks ... you need to shop some more
AMD may be getting its shit together with regard to chip design, but I'm still going Intel on my next PC because of their superior Linux drivers. At the moment I'm an unhappy owner of a laptop with an AMD graphics card that can't do anything because the drivers are useless. I'm looking forward to a new laptop with an Intel Ivy Bridge processor (I don't think I can wait for Haswell).
The thing is, low-end users don't care about masses of PCIe lanes. They run integrated video. High-end users want a fast CPU as well.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
I think AMD's work here will provide some great evolutionary speedups that will be significant to many people. Unfortunately for them, at the same time AMD is bringing out these small "free lunch" general improvements, Intel will be bringing out Haswell -- which in addition to such evolutionary improvements has some really fantastic, significant new features that'll provide remarkable performance boosts.
These are all pretty specialized features, yes, but they service some very high-profile benchmark areas: video processing and concurrency are always on the list, and AMD will get absolutely crushed when apps start taking advantage of them.
I'm a developer, a major optimization geek both micro- and macro-. I thrive on playing with instruction latencies, execution units, and cache usage until my code ekes out as much performance as possible. Of course we'll never know until the CPUs are released for everyone to play with, but right now my money is on Intel.
AMD is in serious trouble here. I hope I'm wrong.
I got my Asus SATA3/USB3/Firewire2 AM2+/AM3 board in 2010 for $85, so I have no fucking idea what you're talking about.
Sorry, but lots of PCIe lanes are just not the kind of thing that matters to anyone except high-end users or people who focus on stats rather than real-world performance. To even have a situation on a desktop board where it could theoretically matter, you have to have multiple graphics cards. The 1x slots hang off the southbridge and have their own bandwidth, separate from the lanes on the CPU for the video card. So if you stick in two GPUs then yes, you don't have enough to give them both 16 lanes.
However it turns out to not matter. We have more bandwidth than we need with PCIe, particularly now with 3.0. You test a card in 16x vs one in 8x and you find no difference in performance. So it just doesn't matter even if you have multiple GPUs.
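For anyone who wants the raw numbers behind that, here is a rough back-of-the-envelope sketch. It only computes theoretical per-direction link bandwidth from the published PCIe signaling rates and encoding overheads; the function name `pcie_gbs` is made up for this example, and it says nothing about measured frame rates.

```python
# Theoretical PCIe bandwidth per direction, from signaling rate and encoding.
# (generation: (gigatransfers per second per lane, encoding efficiency))
PCIE = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
}

def pcie_gbs(gen, lanes):
    """Peak one-way bandwidth in GB/s for a link of the given generation and width."""
    rate_gt, efficiency = PCIE[gen]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return rate_gt * efficiency / 8 * lanes

if __name__ == "__main__":
    for gen, lanes in [(2, 8), (2, 16), (3, 8), (3, 16)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_gbs(gen, lanes):.2f} GB/s")
```

On paper, a 3.0 x8 link lands within a couple of percent of a 2.0 x16 link, which fits the point that halving the lane count costs little on 3.0.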
Of course multiple GPUs are rather a high end proposition. Many people don't even bother with a GPU at all, they just use the onboard graphics which these days are surprisingly good (I've played with the integrated graphics on my new Ivy Bridge laptop since it has switchable graphics and for many games, you don't need anything more). Even those that do choose to have addon GPUs, most choose just one. I've got a GTX 680 in my desktop and there is just no need for anything more, it handles all games superbly. I'd have to move up to multiple surround monitors or something before I'd start needing more than one GPU.
So it is a situation where you only end up needing more lanes in a high end environment, thus I don't see the big deal in not having them. It is the kind of thing you won't notice.
+10 for being pedantic (the best kind of correct, technically correct), -1000 for knowing exactly what I was groping for, but choosing to be pedantic.
Just got back from a late-night concert, and my head hasn't stopped pounding yet (and there is some question of sobriety -> Jimmy Buffett with margaritas). Besides, and I am summoning my inner BOFH here, who teh f*ck would run OpenCL code on a CPU? I've tried, and the only thing I've succeeded in doing is giving my laptop a grand mal seizure.
And no one sane does video transcoding on a 7-year-old machine. No, no, just don't go there.
And ponying up an extra $500 on top of the regular CPU going rate ($200-300) for a new chip from Intel, when a $150 video card from four generations back could / would spank it, is a thought not even worth considering. But I digress: someone out there will decide to run a video transcoder on a non-upgradeable laptop (for which they will pay way too much money because of this chip), with Intel HD integrated graphics (does it even support OpenCL? Is it still a separate chipset, or has it been integrated on-die?), and will absolutely need this feature; they will also probably save the processed video onto a 4,000 RPM USB-1 portable value hard drive.
Comments subject to revision if / when I wake up tomorrow and shake off the last of this tequila. I think it's tequila.
I am John Hurt.
What kind of workload needs more than 16 PCIe lanes, but doesn't similarly need a higher-end processor?
DATABASE WOW WOW
Personally I thought the whole idea was retarded except for the mobile chips like Brazos; on the desktop the idea was completely stupid, and on the server even more so. For those who don't know, the original plan was to go "Full APU" and have the GPU take the place of the FP unit on chip, which would be a much simpler and weaker design than in years past, thus freeing up more TDP for more cores. Why is this dumb? Well, what if you want to use the GPU AND do some floating point heavy task? Or what if you don't want the integrated GPU because you can't OC worth a crap with the GPU built in?
All correct, but I could live with those aspects. I usually don't OC, and if I know I want the GPU AND do some floating point heavy task, I could get an additional discrete GPU. There is, however, a worse one:
Memory bandwidth congestion. A typical lower midrange graphics card with 128 bit data bus and GDDR3 is significantly slower than the same model with GDDR5. In an APU, the GPU part has to share the even lower bandwidth of the DDR3 main memory with the CPU part.
When Llano was new, Anandtech published a preview:
http://www.anandtech.com/show/4448/amd-llano-desktop-performance-preview
It shows some comparisons to discrete graphics cards, including the HD 5570 which represents the lower midrange graphics card w/128bit mentioned above.
In gaming frame rates, the HD 5570 beats Llano even when the APU runs on DDR3-1866 RAM, which was not a JEDEC standard at the time. With standard DDR3, the difference gets bigger. This clearly shows that Llano is limited by memory bandwidth and could really use four-channel memory, as in Intel's socket 2011.
With bigger and faster GPUs, the bandwidth demands will only grow, and for a Bulldozer APU with a matching GPU part even four-channel memory may be insufficient.
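To put rough numbers on the bandwidth argument, here is a small sketch that computes theoretical peak memory bandwidth as bus width times transfer rate. The configurations are illustrative assumptions, not figures from the Anandtech preview: in particular, the 4000 MT/s GDDR5 rate is just a plausible example value, and `peak_bandwidth_gbs` is a name made up for this post.

```python
def peak_bandwidth_gbs(bus_width_bits, megatransfers_per_sec):
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * megatransfers_per_sec * 1e6 / 1e9

# (label, total bus width in bits, transfer rate in MT/s); all values are
# illustrative assumptions chosen to match the discussion above.
CONFIGS = [
    ("Dual-channel DDR3-1333 (shared by CPU and GPU)", 128, 1333),
    ("Dual-channel DDR3-1866 (shared by CPU and GPU)", 128, 1866),
    ("Quad-channel DDR3-1866 (shared by CPU and GPU)", 256, 1866),
    ("128-bit GDDR5 at an assumed 4000 MT/s (GPU only)", 128, 4000),
]

if __name__ == "__main__":
    for label, width, rate in CONFIGS:
        print(f"{label}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

Under these assumptions, even quad-channel DDR3-1866 ends up slightly below what a 128-bit GDDR5 card has all to itself, and the APU still has to split its share with the CPU cores.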
C - the footgun of programming languages
Heads up: The x264 project's incorporating OpenCL support for certain parts of the encoder. Take a look over here - initial results are very promising.
+10 for being pedantic (the best kind of correct, technically correct), -1000 for knowing exactly what I was groping for, but choosing to be pedantic.
Welcome to Slashdot!
"Ignorance more frequently begets confidence than does knowledge"
- Charles Darwin