AMD Licenses 64-bit Processor Design From ARM
angry tapir writes "AMD has announced it will sell ARM-based server processors in 2014, ending its exclusive commitment to the x86 architecture and adding a new dimension to its decades-old battle with Intel. AMD will license a 64-bit processor design from ARM and combine it with the Freedom Fabric interconnect technology it acquired when it bought SeaMicro earlier this year."
x86-64 and 64-bit ARM on the same chip?
I can see this being a remarkable selling point for Windows devices if both ARM and x86 code can execute on the same device without emulation.
Viable Slashdot alternatives: https://pipedot.org/ and http://soylentnews.org/
x86/AMD64 is overkill for many server functions.
It will be interesting to see if chips appear optimized for different functions.
For example hardware sql accelerators or massive i/o for file serving.
Since many hardware RAID controllers are nothing but ARM cores anyway, it would be interesting to see multiple cores, some used as RAID controllers and some more advanced cores for the OS and file serving, with a 10Gb LAN controller all on one chip.
Add power, drives and RAM and have a killer file server.
If AMD can push their engineering into ARM quickly, they might not only stand a chance but they might dominate fairly quickly, I'd think. They're not on par with Intel on die size, but IIRC they're pretty close - that knowledge is certainly applicable.
Remember, they've got good GPUs already. A lot of what they tried to do with the Mobility and later generations was very "ARM-like" already, it just didn't exactly work due to x86 limitations. I'd think they've got a pretty good chance overall. (If anything, it's a big market. Tegra# are really pushing Nvidia along, after all...)
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
I'm sure this is just AMD hedging their bets against multiple processor ISAs. There are places where ARM is better than x86/x86-64, so it makes sense to try and dominate those niches. It falls in line perfectly with AMD being a less expensive alternative to Intel.
Given that Intel is trying to wind down its StrongARM line it inherited from DEC, AMD may see the ARM line as a place where it can finally be top dog. It has the expertise to give Broadcom, TI and Samsung a run for their money.
Taking a really big drink from the hypothetical Kool-Aid, I could see ARM64 processors being used as x86-64 replacements in palmtops and laptops. There are a couple of x86 to ARM translators on the market, which would solve the binary compatibility issue. I used FX!32 back during the NT4 and NT5beta days with my DEC workstation, and it made emulated binaries about 90% as fast as native. With advances in JITC translators and a cleanup of the x86-64 ISA to make it closer to meeting Popek and Goldberg virtualization requirements, I could see a good modern translator being 95+% as fast as native x86-64 code.
I've been expecting Apple to churn out a PowerBook with an ARM processor and a binary translator. They did it with m68K -> PPC and PPC -> x86, so I wouldn't be surprised in the least to see x86 -> ARM. Now imagine it with an AMD ARM64 SoC at the heart of it.
What's sad is how badly the former CEO fucked AMD by doing a total slash and burn on their engineering and R&D and pushing for cheaper automated layouts that simply don't cut it. The Athlon64 guys? GONE. The Cyrix guys? GONE. They pretty much have their backs against the wall because the former CEO burned the fucking company to get a short term bounce, which I'm sure he cashed out on.
And anybody who thinks ARM will save them might be interested in some magic beans I have for sale, as ARM frankly doesn't scale very well and from the early looks ARM64 isn't gonna be really any better for power than the CULV Intel chips while having a HELL of a lot worse IPC. Frankly, and this is coming from someone who has been building AMD systems exclusively for a while now and is still hanging onto AM3+ for all it's worth, the only real selling point they had was "bang for the buck", but by burning R&D and killing Thuban the former CEO left them holding the bag without shit besides Bulldozer, which we all know blows too much power, is too damned hot, and frankly their octocores get stomped by Intel quads on IPC while using a third of the power.
I have to agree with the engineer in that link, they should have done the same thing Intel did with Core: go back to their earlier K8 designs and start from there, just as Intel did with P3 mobile, but now they just don't have the money or the time. I truly hope the Athlon64/Apple A6 chip designer they hired back can come up with a design to save the company, because right now? Right now they really got nothing. Hell, the former CEO even pulled the plug on Krishna, which would have been a sub-20W quad core Bobcat, which is why all we're seeing now is minor speedbumps on a 3+ year old design. I swear they got fucked raw by bad management and I only hope they pull through. Maybe if they would have done this 4 years ago they could have the niche Nvidia now holds, but now? It's just not enough.
ACs don't waste your time replying, your posts are never seen by me.
An over-priced slow server, ARM will grow to dominate the market. The same way Intel's slow and over priced servers have become commonplace.
Well we'd try something else, but it turns out monkeys with notepads and crayons are even slower (and more expensive).
Biodegradable, though.
Your facts are off two ways. First, going up against one big monopolistic company is a lot harder than going up against a lot of small ones. (Do you think it's easier to fight an elephant or a bunch of guys who are also fighting each other?) Second, they've managed to survive in the x86 market for 30 years. I think that counts as competing.
AMD no longer has a fab of their own, as of two years ago(?). I believe they are currently using TSMC for most of their production.
It normally takes a day or so for replays to be posted, but it should show up on the AMD Investor Relations website (same site that hosted the live webcast).
Who wants to make bets on who is going to win this race? AMD has won all of the previous ones.
I assume you are joking, right? It's not a sprint, it's a marathon. Being first to market means nothing, it's winning the market. And Intel is crushing the 64-bit processor market right now.
Maybe the new direction is going to be heterogeneous computing. We're already seeing AMD and Intel combine x86 and a GPU on one die; maybe AMD will try to combine everything and have a couple of ARM cores for low-power tasks, a couple of Bulldozer modules for more intensive tasks, all combined with their GPU.
In fact AMD has an amazing technology portfolio. With graphics chips (the ATI division), HyperTransport technology and AMD64, we can expect some interesting developments.
ARM architectures are considered more energy-efficient for some workloads because they were originally designed for mobile phones and consume less power.
Fuck no. The ARM1 was released in 1987 as a coprocessor for Acorn's BBC Micro. They were designed for low power operation because the engineers were impressed with the 6502's efficiency. There weren't any significant mobile phone deployments until 18 years later in 2005.
Indeed. I am trying to grasp, somewhat desperately, the events that must have taken place inside AMD headquarters when the CPU design team said they wanted to do hyper-threading. Having seen how badly Intel got knocked around when they did it, and the fact that for the price of duplicating a fair amount of the CPU, you are still only occasionally eking out a slight performance gain...and sometimes, a performance loss, their strategy doesn't make sense. What was so hard about welding two Phenom II X6s together, using the HyperTransport links already present in the CPU design, and calling it a day? Knowing full well that Intel wouldn't be able to compete with that design (they've been core-averse compared to AMD), being happy that all of the cores were full cores (who'd complain?), and that they'd be a hot item for system builders everywhere. Sure, some of the gaming websites like to barf about how single-threaded performance still matters, on some games that no one cares about (the GPU, of course, mattering a lot more than the single-threaded performance of a CPU here), but to take the advantage of having 6 full cores, and trade it in for 8 half-cores...was this some idiotic attempt at market segmentation? Did some moron in a suit have a brain fart, and think "we can't have 12-core Phenom IIIs, it will cannibalize our Opteron server sales"? Fire his ass, and cut the strings on his golden parachute on the way out.
For the life of me, I just can't fathom how they turned a major market advantage, with the CPU design practically on the design table already, with a popular and critically acclaimed design, and decided that f*ck it, we're doing so well here, let's go for a lobotomy, and compete on Intel's turf with an unproven half-assed design. Let's go from a full-core design that everyone compliments, to some terrible half-core design that nearly killed Intel at some point. Seriously, who is commanding AMD such that they were in their nappies when the whole Intel hyper-threading business was going down (which every half-decent tech knows about), and how did they get boardroom approval?
The proper response, of course, was not the Business School of Failure's attempt at mandating some perverse product differentiation, which bears as much similarity to surgery as bludgeoning a person to death with a hammer, but through true, non-crippling differentiation. Phenom IIIs get 12-cores, and the latest SSE instructions + something that the boys down in the instruction lab cook up; Opterons get larger caches + more cores + special server instruction sets that mean something concrete, even if it means implementing hardware Apache threads; that's on top of the SSE3 stuff and so forth. Would companies buy Opterons over Phenoms if one had hardware accelerated support for web services over the other? I believe the survey would say hell yes.
As for the GPU stuff, the low-cost, low-power stuff is nice for chump change, but it's a fierce market with many competitors. What you want, what large companies no doubt want, is the ability to slam in GPU-daughter boards, to add 10 or 20 7970 GPUs on a single board (preferably with sockets, which drives up the cost a few cents, but also taps into the smaller markets, where you may buy 4 GPUs now, and 6 later), so that they can drive those large super-computing projects that already make use of these GPUs, but do so more efficiently.
As for gaming, the more stream processors, I imagine, the better. When in doubt, double them, as it will give Intel and Nvidia something to curse over.
I am John Hurt.
Indeed. The one order the CEO can give to save the company is this: "Magical turn-arounds for companies who have been f*cked only happen in textbooks and fairy tales; as such, all resources for CPU design will go into creating a Phenom III with 12 cores and PCI-Express 3.0 and an Opteron design which employs liquid cooling (for the short term), as we are going to give it a major MHz boost on top of the extra cores / cache we are going to staple on."
Getting involved in the already overgrown ARM market shows nothing but lack of vision. "We're going where everyone else is going, that'll be profitable!" You are going to be *that* guy who shows up late to the party, and wonders why all the booze is gone. Seriously, how do you mismanage stuff this badly? You're a CPU company, and you come up with the brilliant plan that despite being a major competitor in the x86 market, you're going to fix things by buying an oversubscribed design for a CPU in a market that...recursion error.
Think of it being like Ford, not using its own resources to think up a new car design, but paying Honda to license it the design for the Civic. Things are either absolutely atrocious, like AMD's stock should be worth a Haitian penny right now bad and we just haven't been told anything, or somebody doesn't know what he's doing. Go get the old guys your predecessor fired, and bring them back for more money. Find the DEC guys, and offer stock options if you have to to get them on board. Then follow their advice. After a year or two of punishment, AMD will be back on firm ground again.
Your argument doesn't stack up.
First you say they're bringing an 8 core chip to compete with a 4 core chip. Fine. Then you complain the cores cannot keep up 1:1. So you're expecting AMD's chips to be twice as good as Intel's to be able to compete.
That, of course, is rigging the test, and so is dishonest.
One could also say that with single cores not much worse than the competition, but double the number of cores, and a lower price to boot, you get better value. More so if you can make good use of the double number of cores.
And that's before considering that single-core benchmarks are entirely unrepresentative of multi-core performance, thanks to various tricks like turbo core and turbo boost that aren't 1:1 comparable. You'd have to run full, sustained benchmarks on all cores simultaneously to find out which chip delivers the most sustained instructions per second.
Meaning that AMD's offering takes more marketing footwork, but technically is not all bad. Not at all.
And considering what they'd been doing with Pink / Taligent in keeping a parallel universe of development of their codebase always going on the x86 architecture while publicly showing only PowerPC development, they've probably got a skunkworks team somewhere that's already been running ARM-based iOS or even ARM-based OS X for a year if not for years...
The Thubans were good, but everything based on Bulldozer just blows through power while having terrible IPC, thanks to having shared integer and floating point units. If they were to be honest the "modules" would be treated as single cores with hardware assisted hyperthreading, because the benches show that is a hell of a lot closer to what they are than to true cores.
Errrm, all of the integer units are dedicated and the shared floating point units still give each core as much floating-point resources as on the previous generation of AMD chips even if every single core is using floating point 100% of the time. If AMD hadn't screwed up on the engineering side, it'd be a really great design.
I don't have mod points but I am equally puzzled. AMD haven't had that many opportunities over the past few years (none at all really) but that was certainly one.
Sadly the systems I work on are all Intel because we do a great deal of report and post-processing on data and that requires CPU grunt and running as much as we can in parallel. Had AMD done this they would have been under consideration. Hyper-threading makes very little if any difference to us really, it's all about getting as many full cores on as possible.
It's been said before on this thread, but I'll say it again. AMD remaining solvent while competing against Intel for 30 years is a lot more impressive than most people realize, especially considering they competed using Intel's own ISA. It's too soon to tell now, but it's reasonable to expect that AMD (being in Intel's weight class) could plausibly compete with most of the current ARM manufacturers. I'd certainly expect their 64 bit server chip efforts to be a lot more interesting than what the cell phone chip makers have been putting out from a performance perspective.
It's a relatively small club. Note that both the headline and summary are wrong. AMD has not licensed a processor design, they have licensed the right to make their own implementation of the ARMv8 architecture (which isn't just a piece of paper, it includes access to ARM's rich set of regression tests and assistance from ARM engineers when requested on both the hardware design and the supporting software). I know of three other companies working on ARMv8 designs. For ARMv7, I think there is basically only ARM with the Cortex series and Qualcomm with the Snapdragon (which is a massively hacked-up Cortex A8, with a completely redesigned FPU, a better interconnect, and some other improvements, but not a complete independent implementation). Compare this with the ARMv4 and ARMv5 situation, where StrongARM and XScale were complete independent implementations. ARM has intentionally delayed producing their own ARMv8 design to give other companies a chance and promote more competition. This worked very well for x86 during the '90s, when Intel, AMD, Cyrix/IBM, IDT, and others were all pushing out compatible products at different market segments. In the ARM world, because they all have to go through the same set of conformance tests, compatibility should be even higher.
I am TheRaven on Soylent News
I am trying to grasp, somewhat desperately, the events that must have taken place inside AMD headquarters when the CPU design team said they wanted to do hyper-threading. Having seen how badly Intel got knocked around when they did it, and the fact that for the price of duplicating a fair amount of the CPU, you are still only occasionally eking out a slight performance gain...and sometimes, a performance loss, their strategy doesn't make sense
Perhaps they looked at IBM or Sun's implementation of SMT instead. Adding a second context to the POWER series added about 10% to the die area and gave around a 50% speedup. If you have multithreaded workloads (especially on a server) then it can significantly improve throughput for two very simple reasons. The first is that when one context has a cache miss, the CPU doesn't sit idle, it can let the other context work. The second is that it makes branch misprediction penalties lower, because if you're issuing instructions alternately from two contexts you can get the instruction that the branch depends on a lot closer to the end of the pipeline before you need to make the prediction. This also helps with various other hazards, so you don't need so much logic for out-of-order execution to get the same throughput.
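To put rough numbers on the cache-miss point, here is a toy sketch (not a model of POWER or any real pipeline): a single-issue core that issues one instruction per cycle from whichever context is ready, where every `miss_every`-th instruction of a context stalls that context for `miss_penalty` cycles. The function name, parameters and values are all made up for illustration.

```python
def cycles_to_finish(n_contexts, instrs_per_context, miss_every, miss_penalty):
    """Cycles a toy single-issue core needs to retire all instructions.

    Assumption (hypothetical model): each context misses the cache on every
    `miss_every`-th instruction and is then blocked for `miss_penalty` cycles.
    """
    remaining = [instrs_per_context] * n_contexts
    issued = [0] * n_contexts
    ready_at = [0] * n_contexts  # first cycle at which each context may issue again
    cycle = 0
    next_ctx = 0  # round-robin pointer
    while any(r > 0 for r in remaining):
        # Issue from the first ready context that still has work, round-robin.
        for k in range(n_contexts):
            ctx = (next_ctx + k) % n_contexts
            if remaining[ctx] > 0 and ready_at[ctx] <= cycle:
                remaining[ctx] -= 1
                issued[ctx] += 1
                if issued[ctx] % miss_every == 0:
                    ready_at[ctx] = cycle + 1 + miss_penalty  # cache miss: stall
                next_ctx = (ctx + 1) % n_contexts
                break
        cycle += 1  # if nothing was ready, this cycle was simply wasted
    return cycle


if __name__ == "__main__":
    one_after_another = 2 * cycles_to_finish(1, 100, 10, 10)
    two_contexts = cycles_to_finish(2, 100, 10, 10)
    print(one_after_another, two_contexts)
```

In this toy, two contexts sharing the core finish the same total work in fewer cycles than running the threads back to back, because one context issues while the other waits on its miss. How large the gain is depends entirely on the made-up miss rate and penalty, so it says nothing about the 50% figure above.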
Also, there is nothing about ARM that inherently makes it more powersaving @ the same performance level than other RISC CPUs, be it SPARC, POWER, MIPS and so on.
I can think of several things. For Thumb-2, there is instruction density. MIPS16 does about as well as Thumb-1, but it is a massive pain to work with. AArch64 doesn't (yet) have a Thumb-3 encoding, but one will almost certainly appear after ARM has done a lot of profiling of the kinds of instruction that compilers like to generate. Even in ARM mode, the big win over the other RISC architectures is that it has fairly complex addressing modes, so you can do things like structure and array offset calculations in one instruction on ARM or 3-4 on MIPS. For AArch32, you also have predicated instructions. These make a big difference on a very low power chip, because you don't need to have any branches for small conditionals. For AArch64, most of these are gone, but there is still a predicated move, which is a very powerful version of a select instruction and lets you do mostly the same things. With AArch32 you have store and load multiple instructions, which basically let you do all of your register spills and reloads in a single instruction (the instruction takes a mask of the registers to save, the register to use as the base, and whether to post- or pre- increment or decrement it as two flags). With AArch64, they replaced this with a store-pair instruction, which can store two registers, and has the advantage of being simpler to implement (fixed number of cycles to execute).
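For anyone who hasn't used store/load multiple, here is a rough Python sketch of the semantics described above: a register mask, a base address, and two flags (before/after, increment/decrement), with base writeback. It is a simplified illustration, not an emulator; the function names and the dict-as-memory are my own inventions, and it ignores special cases such as the base register appearing in the mask.

```python
def _layout(mask, base, before, increment):
    """Work out which registers are selected, the lowest address used,
    and the written-back base (4-byte registers, 16 of them)."""
    selected = [r for r in range(16) if mask & (1 << r)]
    size = 4 * len(selected)
    if increment:
        start = base + 4 if before else base
        new_base = base + size
    else:
        start = base - size if before else base - size + 4
        new_base = base - size
    return selected, start, new_base


def store_multiple(regs, mask, base, memory, before, increment):
    """Toy STM: lowest-numbered register goes to the lowest address."""
    selected, start, new_base = _layout(mask, base, before, increment)
    for i, r in enumerate(selected):
        memory[start + 4 * i] = regs[r]
    return new_base


def load_multiple(regs, mask, base, memory, before, increment):
    """Toy LDM: the mirror image of store_multiple."""
    selected, start, new_base = _layout(mask, base, before, increment)
    for i, r in enumerate(selected):
        regs[r] = memory[start + 4 * i]
    return new_base


if __name__ == "__main__":
    regs = list(range(100, 116))          # fake register contents
    memory = {}
    mask = (1 << 4) | (1 << 5) | (1 << 14)  # r4, r5, lr
    # "push": decrement-before, like a full-descending stack
    sp = store_multiple(regs, mask, 0x1000, memory, before=True, increment=False)
    # "pop": increment-after restores the registers and the stack pointer
    sp = load_multiple(regs, mask, sp, memory, before=False, increment=True)
    print(hex(sp), memory)
```

A whole function prologue or epilogue spill is one call here, which is the point being made: one instruction where a plain load/store ISA needs one per register plus a stack pointer adjustment.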
I guess you take the words of Intel fanboys literally. No, the Bulldozer architecture is not hyper-threading. No, it does not mean only a slight performance gain and especially not a performance loss. I recently made 3 microbenchmarks on an Opteron 6234 (Bulldozer too). I measured the negative effect of sharing some circuits in a Bulldozer core. This negative effect varies from insignificant to small (3%, 13%, 25%). I ran the same two threads on the two cores of a single Bulldozer unit vs two cores on separate units. Intel hyper-threading brings 30% more performance - in the best case. The Bulldozer core pair brings 75% more performance - in the worst case. How can you compare them? They are not in the same league.
The funniest benchmark was the floating point. The most frequent complaint against the Bulldozer architecture is that two cores share a single floating point unit. AMD should say a million times that yes, they share a single floating point unit, but that is a 256 bit wide unit, which can be split into two 128 bit parts. And what is the size of the usual floating point number? Not 256 bit, not 128 bit, but only 64. In reality I measured that the two cores in a single unit process floating point instructions almost at full speed. The negative effect of circuit sharing was only 3%, barely measurable. How ironic.
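The width argument is just arithmetic, sketched below under the simplifying assumption that the shared unit splits evenly between the cores that are using it. The function name is made up, and real scheduling inside the FPU is more involved than this.

```python
def lanes_per_core(unit_bits, cores_active, element_bits):
    """Elements each core can process per FP operation if the shared
    unit is split evenly among the active cores (simplifying assumption)."""
    return (unit_bits // cores_active) // element_bits


if __name__ == "__main__":
    # Both cores busy with 64-bit doubles: each still gets a 128-bit half.
    print(lanes_per_core(256, 2, 64))
    # One core alone can use the full 256-bit width.
    print(lanes_per_core(256, 1, 64))
```

So with both cores doing double-precision work, each half still holds two 64-bit values per operation. Under this assumption, sharing only costs width when a single core wants the full 256 bits at the same moment the other core needs the unit.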