ARM Unveils Next-Gen Processor, Claims 5x Speedup
unts writes "UK chip designer ARM [Note: check out this short history of ARM chips in mobile devices contributed by an anonymous reader] today released the first details of its latest project, codenamed 'Eagle.' It has branded the new design Cortex-A15, which ARM reckons demonstrates the jump in performance from its predecessors, the A8 and A9. ARM's new chip design can scale to 16 cores, clock up to 2.5GHz, and, the company claims, deliver a 5x performance increase over the A8: 'It's like taking a desktop and putting it in your pocket,' said [VP of processor marketing — Eric Schorn], and it was clear that he considers this new design to be a pretty major shot across the bows of Intel and AMD. In case we were in any doubt, he turned the knife further: 'The exciting place for software developer graduates to go and hunt for work is no longer the desktop.'"
I for one certainly hope that ARM gets a chance in the more mainstream market; the more competition for Intel, the better!
I hope they don't also want us to put a mains lead into our pockets to power that beast.
That would really shake up the Wintel alliance.
"I've got more toys than Teruhisa Kitahara."
The block diagram:
http://img.hexus.net/v2/channel/news/2010/sep/armeagle3-big.jpg
refers to a "snoop control unit" and "snoop filtering". Is this some kind of DRM?
I thought most of the interesting stuff took place on the server?
Well either way, I wish them luck. Having competition and diversity in the processor market is a very good thing and forces everyone to step up to the mark, benefiting everyone.
And if they've managed to keep the power envelope down then even better.
32-bit addressing was seriously impressive in 1987, compared to Acorn's then-current machine with 32KB, including video memory. But now even smartphones are starting to come with 512MB, 1GB of memory. Does ARM have a strategy for getting past 4GB?
Someone please post the specs for power consumption and heat dissipation of the ARM, so we
can direct this discussion toward how it compares to the leading and former CPU fabs?
I still have a nice Dual Alpha 1GHz system that I use for development purposes, and its
heat dissipation and rackmount footprint are simply superior to the modern equivalents,
so maybe a rackmount of ARM systems could retire this Alpha fanboi or maybe raise the Alpha
that much nearer to the Halls of Valhalla?
Inb4 Linuxgames.com
"It's like taking a desktop and putting it in your pocket," said Schorn.
That's gotta be one of the most uncomfortable marketing images ever.
"Is that an ARM in your pocket or are you just glad to see me?"
So it's six times the speed of the A8 then? (1x + 5x = 6x)
Literally FLY on the back of a giant eagle's arm? huh? Someone's got this confused, guys...
I don't know the heat dissipation figures, but I can safely say I have never yet seen an ARM processor with a heatsink. As for power consumption a quick google seems to show that an 800MHz OMAP3 draws around 750mW at full load. This new A15 core is supposedly going to have similar figures.
Right now my Samsung 5000 series LED tv runs an arm with busybox linux as the firmware. It is only a matter of time before TVs become fully internet capable and use usb 3 for storage. I also have seen demos of touch screen remotes that have qwerty capability for your TV. So the only thing missing is a simple cursor system and presto you have it all. Seeing that arm processors are becoming this powerful the market for all in one home entertainment devices is there. If Microsoft does not see this coming and continues to have mediocre support for arm based devices then embedded Linux will continue to dominate the living room. Three of my home entertainment devices are already based on the Linux kernel!
'The exciting place for software developer graduates to go and hunt for work is no longer the desktop.'
Why, actually, why??
I am really really looking forward to a desktop with low power footprint. There is no need here to run MS-crapware; no Crysis or other high-resource gaming.
Gimme a nice desktop, low-low power, that boots to Debian on ARM, and I throw mine out of the window. And I already have a 80+ PSU, single row of RAM, dual-core EE AMD. It still has a 45W TDP; plus AMD does not sell the Energy Efficient (EE) any longer except to OEMs; at least in this country.
Throw out the 24-pin plus 12 V power supply, let's do everything on 12 V, give it 6 USBs, Sata, HDMI/DVI, Ethernet and WiFi. A mini ARM.
And, yes, I want to be able to add a hard disk of my own, maybe a DVD- or BlueRay-Drive, so add some space.
Not with the idea of a standards based chargers but this "Wintel alliance," crap. There is no such thing. x86 chips are used for desktop computers because they are the only things that have been cheap, common, and powerful. MS has no special interest in pushing Intel. DOS, and thus earlier Windows versions, were tied to x86. When NT came out, they abstracted it and indeed you could get NT4 for x86, PowerPC, and Alpha. Let me give you a hint how well those other versions sold. As such, they were discontinued.
Also when it came to 64-bit for the desktop time, MS cast in with AMD. Intel was pushing Itanium, which MS does support on their server OSes, but AMD's 64-bit extensions, called amd64 internally by the Windows tools, were what was used for the desktop. So you can get Windows 7 in x86 and x64 variants, and Server 2008R2 in x64 and IA64 variants.
Now for Windows CE (also the basis for Windows Mobile), their mobile/embedded OS, well then that runs on all sorts of things. x86, MIPS, ARM, and SuperH. Again, more could be added, this is just what is supported as that is what there is currently a market for.
What it comes down to is they support the architectures that are used in the markets their OSes work in. There is no ARM version of Windows 7 because there are no ARM desktops that demand it. Porting an OS to a new architecture and maintaining it is not a zero effort task, so it isn't done unless it is worth it (unless it is NetBSD :D).
Also the reason x86/x64 continues so strong on the desktop is it works so well. It provides binary compatibility with all your old apps, and the CPUs that use it are fast and cheap. Thus far, I've seen nobody who can beat Intel and AMD in that market. Sure there are higher end CPUs that cost more and use tons more power, like Itanium and Power7. There are also chips that use less power and are cheaper, like ARM. However I've yet to see the chip that does better in their market, as in can do more operations with the same or less power and costs less.
So you want ARM desktops? Well first an ARM CPU that is competitive in that market has to come out. Competitive, please note, doesn't mean "Barely can compete with the low end." I'm talking something that makes you say "Wow, that is faster than my i5, and for less money." Then maybe there's interest. Should ARM desktops start to become popular, you can be pretty confident MS would move Windows over to them.
But please, stop pretending like there's some sinister conspiracy to keep alternate architectures down. There are only two reasons for the x86 dominance:
1) Compatibility. It is far nicer to have a chip that works with your old stuff. People will default to what's compatible unless given a good reason. I'm not going to pay the same amount for a CPU with the same performance that doesn't run my apps as for one that does. So whoever wants to break into the market has to offer a good reason. Less cost, more performance, etc. They'd probably still need to have good emulators to support older apps.
2) Intel is really, really, good. Everyone likes to hate on Intel because they are big and there's automatic underdog love on Slashdot, but they are good at what they do. They spend a ton on R&D and the result is they are almost always ahead in terms of fabs and their CPUs tend to offer great performance for the money. Yes, they've had problems, Netburst (P4) was an example, but currently it is impossible to touch the Core i series. They are fast, do a lot given their power budget, and have a good price.
As others have quoted, it delivers 5 times the performance of the A8 at a similar energy footprint.
You'd probably only require mains if it was a 16 core system in your pocket, as I doubt that the above performance/energy footprint is for 16 cores.
Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon
I know many CS graduates who have thought that the most interesting stuff to play with is in the pocket.
Well, it's not that simple, as ARM licenses the core to a CPU vendor, who then integrates one or several cores, along with a good deal of other stuff (DSP, GPU, video codecs, peripheral bus controllers, RAM, Flash, etc.), all at a process size of their choice, into an SoC; then you get power numbers for the whole thing. So any of the half-dozen or so OMAP 3xxx SoCs from TI, Tegra from nVidia, A4 from Apple, and a bunch of others are all single Cortex A8 core systems, but have varying capabilities and power consumption. Power also varies strongly with load and with what functional units are in use, down to around 1mW at idle (clock-stop). Somewhere around 1W at full load on the CPU and reasonable duty cycles on other units is typical for a 0.8-1 GHz SoC with a full complement of functional units -- less if you don't have (or just don't use) the GPU/DSP/video codec accelerator, as you likely wouldn't for development purposes. They royally kick Atom's ass, but I have no idea how they compare to Alpha.
Imagine a Beowulf cluster of these!
According to this, a typical Cortex-A9 core draws about 250mW. As this has a very similar architecture (still ARMv7), it should be somewhere in similar regions, maybe more, as they boosted the frequency. So I guess a 16 core version will draw something like 4W+, maybe more. Nonetheless, this is still an incredibly good figure for a web server type processor, though a little heat sink might appear.
I'm only guessing here though, based on previous figures. There is no practical data so far on the exact figures.
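To make the back-of-the-envelope guess above explicit, here is a minimal sketch of the arithmetic. The 250mW per core figure is the one quoted for the Cortex-A9; the `frequency_scale` multiplier is a purely hypothetical knob for "maybe more, as they boosted the frequency", not a published number.

```python
def estimated_power_watts(cores, milliwatts_per_core=250.0, frequency_scale=1.0):
    """Rough multi-core power guess: cores x per-core draw x a fudge factor.

    milliwatts_per_core: the ~250mW Cortex-A9 figure quoted above.
    frequency_scale: hypothetical multiplier for a higher clock (assumption).
    """
    return cores * milliwatts_per_core * frequency_scale / 1000.0
```

With the defaults, `estimated_power_watts(16)` gives 4.0, which is where the "something like 4W+" comes from. It ignores caches, interconnect and everything else on the SoC, so treat it as a lower bound at best.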
Linux Support for the ARM Architecture
They royally kick Atom's ass,
The Atom looks bad on work/watt, but still wins in raw performance.
but I have no idea how they compare to Alpha.
The alpha is a "floating point monster", or was anyway, and since ARM doesn't focus on floating point I doubt they compare. The Atom might keep up though.
1. Simple connection, especially optical
2. Includes more video bandwidth than you could ever need
3. Supports USB for keyboard/mouse
4. Supports audio out
I'm currently working with several concurrency development groups within the SUNY system; we are partnered with Oracle, Google, and IBM as well as a few others. Upon mention of ARM not a single co-worker has been able to resist going into rant mode about the lack of reasonably quick CAS and LL/SC implementations. Further, barriers and fences apparently take so long to establish that to fake a CAS you are looking at three to six hundred cycles compared to about a dozen for current generation i7's and SPARCs (optimistic CASing). Can anyone speak to the implementation of the features on this new chip?
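For anyone unfamiliar with the optimistic CAS pattern being discussed, here is a toy model in Python. It is only a sketch: the `AtomicCell` class is hypothetical, and a lock stands in for what real hardware does with a single CAS instruction or an LL/SC pair, so it says nothing about cycle counts on any chip.

```python
import threading


class AtomicCell:
    """Toy memory word with compare-and-swap (lock emulates the hardware)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def atomic_increment(cell):
    """Optimistic update: read, compute, try to CAS, retry on conflict."""
    while True:
        old = cell.load()
        if cell.compare_and_swap(old, old + 1):
            return old + 1


def run_demo(threads=4, per_thread=1000):
    """Hammer one cell from several threads and return the final value."""
    cell = AtomicCell(0)

    def worker():
        for _ in range(per_thread):
            atomic_increment(cell)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()
```

The retry loop in `atomic_increment` is the part whose cost depends on how quickly the hardware can do the compare-and-swap step, which is exactly what the question is about.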
Have run all of these, in anger, in production, at one point or another.
I still have an extremely soft spot for the RAQ2, 64 bit MIPS processor.
Image link - http://dev.gentoo.org/~vapier/pics/mipsel-raq2/inside-main-board.jpg
Nota Bene, NO HEAT-SINKS OF ANY KIND, and yet these puppies could saturate a 10 Mbit connection (of course this was the days before flash and stuff) and the whole mainboard used about 10 watts, most of which was the RAM, the biggest power eater was the IDE HD.
Downside was it was MIPS, which is a lot like the downside of the Acorn ARM based A series and Risc-PC series, eg not x86 compatible, ergo not mainstream.
Now that ARM is used in zillions of other devices, ARM is no longer the backwoods, everywhere except in "a computer" eg desktop or server.
Which means ARM on the desktop or ARM on the server won't suffer so badly for not being x86... it will still suffer, but not so badly.
RAQ3 went away from MIPS to x86, IMHO because of this accessibility and availability of x86 code, not because it was technically superior to MIPS... one RAQ3 wasn't more powerful than two RAQ2 in any sense except power consumption and thermal rejection.
In practical terms x86 has gone nearly as far as it can go, both in terms of light speed and die size, and thermal dissipation per cubic mm, so the alternatives are catching up, not so much because of sheer lifting power, but because of thermal dissipation per cubic mm they still have "development room" left to play around in.
The next 5 years or so are going to be interesting, as this "development room" is explored and used up, and especially so if anyone comes out with a robust cross architecture compiler / translator.
http://slashdot.org/~GuyFawkes/journal
Actually, it's not _that_ bad for most applications.
I have actually programmed assembly back in ye goode olde days of 16 bit CPUs and segment registers, and the reason it was evil was that you ran into that limit all the time. Even the most trivial operations had to juggle registers. You couldn't even process a 640x480 pixel image in 16 colours without running into segment maths. (Incidentally that aforementioned image would need about twice the memory you could address with 16 bits without segment maths.) Even addressing two pixels on the same row or column could mean needing to change the segment first.
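A quick sketch of the numbers behind that, assuming a packed 4 bits per pixel image in one linear buffer (real 16-colour VGA modes were planar, so this is an illustration rather than how the hardware laid it out) and 8086-style real-mode addressing where linear = segment * 16 + offset. The function names are made up for the example.

```python
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 4            # 16 colours
SEGMENT_SIZE = 1 << 16        # bytes reachable with a 16-bit offset


def image_bytes():
    """Total size of the packed image in bytes."""
    return WIDTH * HEIGHT * BITS_PER_PIXEL // 8


def pixel_byte(x, y):
    """Linear byte offset of the byte holding pixel (x, y)."""
    return (y * WIDTH + x) * BITS_PER_PIXEL // 8


def to_far_pointer(linear):
    """Split a linear address (< 1MB) into a (segment, offset) pair,
    choosing the segment on a 64KB boundary."""
    segment = (linear >> 16) << 12
    offset = linear & 0xFFFF
    return segment, offset


def to_linear(segment, offset):
    """Real-mode address calculation: segment * 16 + offset."""
    return (segment << 4) + offset
```

`image_bytes()` comes to 153600 bytes against a 65536 byte segment, so the image spans three segments. Row 204 is the first to straddle a boundary: `pixel_byte(0, 204)` is 65280 and `pixel_byte(639, 204)` is 65599, so the two ends of that one row need different segment values.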
By comparison 4 gigabytes is still a lot. There are precious few applications where you need more than 4 GB in a single array, which is when you'd actually need segment maths.
And frankly those are nonexistent in the normal desktop or even vanilla web page world, because they have to be able to run on machines which don't even have that much.
Just having over 4 GB total data is not the old hell. If each individual piece of data is smaller than 4 GB, you can just have the segment be part of the pointer, and only need to load it once. You don't need to do more segment maths just to get the 65537'th byte of that buffer.
Don't get me wrong, it's still more elegant to not have to worry about segments at all. But the alternative is not anywhere near the old hell.
A polar bear is a cartesian bear after a coordinate transform.
one impediment may be that ARM is (at least at present) a 32-bit architecture.
Not all desktop applications need dozens of GB of RAM. A 32bit architecture is more than enough for the browsing/mail/chatting crowd.
I think the main impediment is that ARM runs a different instructions set.
Which means that a big proportion of the users won't even be able to run their x86-only favourite OS on it. And even, in the unlikely case of Microsoft finally delivering the so-long promised ARM port of Windows 7 (that it is still currently failing to produce), the software to which said users are addicted isn't likely to be usable, except in case of very slow emulation or massive porting/recompiling efforts from the software producers.
For the MacOS-X crowd too, the problems are present, although less critical: there's already a lot of shared code between iOS and OS-X (much more than between Windows 7 and Windows CE) so porting could be achievable. And the software writers are already used to the multiple changes of architecture in the Mac world. It's only adding 1 more architecture to the hyper-fat binaries (PowerPC, x86, x86_64, now ARM) and Xcode is already doing it for them. Still, it requires some effort.
The only OSes for which the problems are rather minimal are the open-source ones. Because availability of the code means porting efforts don't rely on the original authors, and because lots of ports exist already (homebrewers are having a great time with BeagleBoard based experiments). And in fact, Linux is already seeing some market penetration in the NetBook world, especially the ARM based netbooks.
The only problems are the few pieces of proprietary software for which no good opensource equivalents exist.
(We're currently still dependent on Adobe providing a good Flash port. Although, with OSS projects like LightSpark and Gnash, that too could change soon).
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
..is that a desktop in your pocket or are you just pleased to see me....?!
Until then, could there be a 64b emulation with multi-core 32b ARM processors?
Unaccountable leaders are masters, and unrepresented people are slaves. How do US and EU fare?
Virtualization of server farms/Infrastructure
Unaccountable leaders are masters, and unrepresented people are slaves. How do US and EU fare?
While people love the idea of wireless, it just isn't going to happen for everything. In terms of power, it is basically impossible. You can do inductive charging, which is technically wireless, I suppose, but it doesn't really fix anything. Your device has to sit directly on the charger, which of course has a wire back to the outlet. It's been around forever, electric toothbrushes use it because having a waterproof system is important, but it just isn't that useful overall. Better to just use a wire, or have exposed connectors in a dock. Cheaper and more efficient.
You'll never see actual wireless, longer range, power until we discover some way of getting around that pesky inverse square law thing.
As for communications, well bandwidth is just an ironclad bitch, and one with no easy solution. The very best wireless technology can, in the best circumstances, compare favorably with old ass wired technology. Have a look at Wireless N as an example. If you have a good multi-antenna transmitter and receiver and you aren't too far away and there's no interference you can get 300Mbps raw data rate. That works out to 100Mbps of throughput. Oh yay. A whole 100Mbps, you know, what the cheapest of the cheap wired ethernet can handle.
The real problem starts with video. So HDMI needs 2.8Gbps to support 1920x1080 @ 60Hz. That is just the video, no audio. If we start to want things like higher resolutions, higher refresh rates, 3D, more than 8bpp and so on, it takes even more. Can't do that with any cheap wireless tech these days.
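For a rough check on that figure, here is the raw pixel arithmetic as a sketch. It assumes 24 bits per pixel and counts only visible pixels, ignoring blanking intervals and link encoding overhead, so it is an estimate of the payload rather than the actual HDMI line rate.

```python
def video_bitrate_bps(width, height, refresh_hz, bits_per_pixel=24):
    """Uncompressed video payload in bits per second (visible pixels only)."""
    return width * height * refresh_hz * bits_per_pixel


def links_needed(video_bps, link_throughput_bps):
    """How many links of a given real-world throughput the stream would fill."""
    return video_bps / link_throughput_bps
```

`video_bitrate_bps(1920, 1080, 60)` comes out at 2985984000, a shade under 3 gigabits per second and in the same ballpark as the roughly 2.8 gigabits per second quoted above. Against 100 megabits per second of real throughput that is about 30 links' worth.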
Also when trying to make ultra high bandwidth wireless you run in to the problem that is Shannon's Law. Bits per second is related to bandwidth and SNR. Well SNR is something you can't do much about with wireless. The noise level is what it is, so you have to increase bandwidth to increase throughput. That means increasing frequency. Here there's a problem, the higher the frequency, the less ideal the transmission characteristics. The high GHz stuff, what you need for big bandwidth links, gets rather directional, is quite short range (air even attenuates it) and doesn't pass through hardly any barriers, even walls. This is all aside from the general difficulties making stuff that signals cheap at those frequencies.
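The relationship being referred to is the Shannon-Hartley limit, C = B * log2(1 + SNR). A minimal sketch, with SNR given in dB and converted to a linear ratio; the function names are just for illustration.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Upper bound on error-free bit rate for a channel of the given
    bandwidth and signal-to-noise ratio: C = B * log2(1 + SNR)."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def bandwidth_needed_hz(target_bps, snr_db):
    """RF bandwidth required to reach a target bit rate at a fixed SNR."""
    snr_linear = 10 ** (snr_db / 10)
    return target_bps / math.log2(1 + snr_linear)
```

Capacity is linear in bandwidth but only logarithmic in SNR, which is why, with the noise floor fixed, the practical lever is more spectrum. This is the theoretical ceiling; real links get less.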
You also get the additional problem of needing even more bandwidth to avoid contention. With wires, there's no interference. I can HDMI to three displays side by side, and there's no problem. With wireless, each needs its own channel, which just further increases the amount of RF bandwidth you need to make things work.
Wireless is useful, don't get me wrong, but I don't see this "All wireless, all the time" future you do. You could spend a lot of money trying to do wireless video from your Blu-ray player to your TV, or you could just get a cheap cable. Given that both devices are going to be plugged in anyhow, is it really such an issue?
That means back to segmentation. That isn't a killer problem, but it is significant. In terms of how that works in modern computers, you can see it on Windows systems on Intel PAE processors. Basically the OS gets access to all the memory in the system, but it has to be divided up to be used. In the case of the Windows implementation, the kernel can get only 2GB and each application can get only 2GB. You can have multiple 2GB apps running, but they can't have more.
For an app to get more, it has to implement memory management internally. Basically it talks to Windows and gets a range of memory set up that will be paged, it then gets more RAM allocated and specifies how to page through it. Called AWE and used by a couple apps, like MSSQL. Of course that is complex on the part of the app and would be problematic if you had multiple ones running.
Also it makes task switching hit the system harder over all, because of the segmentation.
So I mean it works, don't get me wrong, I have seen servers doing it. However 64-bit is a much, much, cleaner solution both OS wise and software wise. The paged approach really is a hack when you get down to it.
I like current desktop CPUs, which have larger virtual address spaces than physical. You are right, 40-bits is fine for now. As far as I know the top end Intel CPUs only have 48-bits of address lines currently. No reason to implement all 64-bits, you wouldn't use it. However having a flat virtual memory space is something that is extremely useful. There's a reason everyone wanted to move to that with 32-bit CPUs as soon as it became feasible. We don't really want to go back to segmentation.
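To put the address widths being thrown around into perspective, here is a small sketch that just converts a number of address bits into a size. The helper names are invented for the example.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_space_bytes(bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << bits


def human(size):
    """Format a power-of-two byte count with binary units."""
    index = 0
    while size >= 1024 and index < len(UNITS) - 1:
        size //= 1024
        index += 1
    return f"{size} {UNITS[index]}"
```

So `human(address_space_bytes(32))` is "4 GiB", 36 bits (the PAE physical limit) is "64 GiB", 40 bits is "1 TiB", 48 bits is "256 TiB" and the full 64 bits would be "16 EiB".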
By far the greatest challenge is software: with no Windows or Mac support you'd be pushing Linux. A Linux with no option to run Windows software through WINE or VirtualBox for those occasional needs.
Linux on x86 runs Windows apps through Wine. Why couldn't a "Wine CE" be developed to run Windows CE apps on Linux on ARM?
I have a smart phone (HTC Eris) and it is SLOW when I go to make a call or receive a call. This puppy does everything, but it sucks at making phone calls. Sometimes, when I hang up, I press the home button, to take me to the home screen, and it literally takes 20 seconds for it to do that. When it comes to hanging up.... pressing the button makes the screen go black, so how do I know it hung up? I have to hit the button again to turn the screen on. Sometimes I end up dialing another phone number by accident due to the slowness of it. That is also very frustrating.
My POS Motorola CDMA phone's primary function is to make calls, and it works like it should. When it comes to texting it sucks (no qwerty - which is why I got the Eris).
I've heard reviews on other phones, even the Motorola Droid performing worse when it comes to phone calls than a standard clam-shell style cell phone.
A phone's primary purpose is to make a phone call, and it looks like that's the last thing they're good at nowadays.
*click*
ARM needs to get someone else to do their corporate videos. The talking heads make me cringe. I'm sure Eric Schorn's a great guy, but I wouldn't put him on a video that close-up. Really. Trust me on this.
This is why you should realize that "You" != "Most people"
Stop making wide-arching statements about what you think the rest of the world is doing when you are basing it solely on yourself.
This is what you should have done instead - namely actually do some research before talking out your ass:
Here's an old article that discusses how in Q4 2008, 25% of Vista sales were 64bit.
Also note that "Windows 7 is expected to be Microsoft's last native 32-bit version - Server 2008 R2 has already moved to 64-bit only".
Also, here we have stats indicating that 46% of Windows 7 PCs are 64bit.
Has it ever been the desktop??
At least since about 1985 almost all computers are embedded. Embedded systems became multi-tasking/multi-processor quickly, so we've even been able to put all our "operating system 101" college learning to good use. A lot of embedded systems have involved networking and databases as well. Not to mention signal processing, and on and on.
Desktops have been a small slice of the pie for a long time.
Of course some of us were born before embedded systems (ahem..), but back then the desktop only had a dumb terminal anyway...
I'm developing a wireless sensor network that's architected as distributed sensor/actuator nodes gatewayed to a hub, which preprocesses data (and encrypts/compresses it for W/WAN transmission), has local scenarios for some immediate responses, and reports to / is controlled by a server that is one of a few thousand nodes on a WAN (some 3G, some wired broadband Internet) with the real application orchestration at a central datacenter. We are upgrading dumb wired sensors on a PIC-based embedded server back to the central Java application server. The embedded server will need to perform something like an Atom/1.66GHz, and the sensors could benefit from a little more smarts than a PIC (but need years of battery life).
So I've been planning a Zigbee WLAN gatewayed to a PC. I see that Ember has a line of integrated Zigbee/ARM-C3 SoCs. I also see some embedded PCs with ARM C-9, in interesting configs (highly integrated/bundled HW for the rest of the system). ARM C-3 in the sensor/actuator nodes is probably a little bit overkill, but if the node can cost $20 (qty1000), we can find a use for the extra smarts. If I do discover the Ember parts are the right fit, I wonder whether their ARM C-3 offers any reason to favor an ARM C-9 (or better) in the embedded server. We'd run Linux on the embedded server, but if Android were suitable and stable I'd like to try it instead.
So does using ARM C-3 in the wireless nodes give me any reason to prefer the server they feed also run on an ARM CPU?
--
make install -not war
Who would have thought that the next instruction set revolution would come through a puny cell phone to the humble end user. Not to forget fun stuff like ia64 and other VLIW architectures but they don't have that big a market share outside specialty apps. Ok, Apple was there but with negligible market share.
I just read the argument that RISC requires more memory, and I would conclude that IBM/Intel was right to choose CISC in the late '70s for the IBM PC/8086 from that point of view. But in the mid '90s I was quite ready to buy 16MB of RAM for my PC, the same amount as the HP workstation (PA-RISC) I was using had.
It is amazing how sluggish the world can be to change (CISC->RISC), if the resulting improvement isn't blatantly obvious. I think power is the dimension the improvement happened in but I guess RISC is just one part of the story there.
Je me souviens.
ARM itself doesn't make any chips for consumers. They just design them and collect royalties from the chip manufacturers like TI and Marvell.
Another interesting thing is that JTAG/ICE(s) for ARM processors can be fairly cheap. $60 on the low end.
How much for a core i7 ICE? :D
-n
Well, yeah. Since he was talking about replacing a dual-processor system with "a rackmount of ARM systems", and wanting power consumption data, I bet work/watt was what he cared about -- use more CPUs to get more power, and compile in parallel.
Well, he said he was using it for "development" -- while I suppose GNU EMACS may need an awesome FPU, I was thinking more of the compiling end, which isn't a floating-point intensive operation.
So it looks like you have no clue what conversation you're jumping in the middle of, eh? But +3 informative for you; mods on crack and all.
Alpha's work/watt rate on integer workloads was never particularly stellar, so I doubt that.
Intel and AMD will not suffer as much as Microsoft and other companies who have missed this trend. The desktop is quickly losing importance and with it are Microsoft's biggest cash-cows.
Also, keep in mind that's the whole SoC. On the OMAP3 that includes RAM, DSP and 3D graphics (a PowerVR SGX). The actual draw of the CPU core is probably less.
Windows running on Alpha was a limited production run of aluminum framed cars. It was mostly a gesture, with the possibility of appealing to a niche enthusiast segment, with no guarantee the niche survived. Every sentient person involved knew the Alpha chips achieved their outstanding performance by indulging in expensive fabrication steps that don't scale to the mass consumer market (without exceedingly large budgets).
What protected the Wintel alliance as much as anything was the alternatives rallying around the false prophet of reduced complexity. Reduced complexity (along the axis presumed to correlate most strongly with penis length) automatically sounds good, but doesn't always survive close analysis.
At least half the complexity in 21st century processor design involves the memory bus: address translation, cache hierarchy, TLB, coherence, snoop interface, the transaction model, and pin interface. RISC offers no advantage here. In fact, at a certain point of sophistication, the RMW instruction format from x86 reduces work in memory ordering, since you have fewer addresses delivered to the memory order logic to check for referencing identical (or overlapping) regions.
Based on the cost of the integration logic (peripherals and memory hierarchy) even if you come up with a zero-cost RISC core design, you still have to do at least as much work as your well-funded CISC competitor to finish the greater half of the chip. This is just the design aspect and ignores fabrication expertise and economy of scale.
The known problems with x86 were the variable length instruction formatting, the floating point model, and too many distinct partial updates to the flag register.
The extra registers in AMD64 don't get you nearly as much in the general case as the original RISC rhetoric conditioned people to expect without thinking. High-performance loops that are modestly register starved see a 10% gain for having twice the register set. Specific register-hungry loops see a much bigger gain. But a lot of that code now runs on the much larger SIMD register set, if it hasn't been booted to the GPU where it sees a x10 performance increase (rather than 10% to 30%).
The problem with x86 was never performance as much as power consumption to achieve that performance. x86 could trim a large amount of power with a different instruction encoding with far more predictable instruction boundaries (to enhance parallel dispatch). Whether it increases overall performance is another question. Probably not much.
Power economy for achieved performance is what ARM got right from the outset, and the reason that ARM is finally achieving dominance in a market niche that RISC as a performance gambit never achieved. In the mobile market, power efficiency matters enough to drive consumer preference.
It's the same deal with alternate energy technologies. Not until the price of gas is high enough at the pumps will the failings of the current model lead to an alternative technology gaining a permanent foothold.
Intel had a twenty five year run before hitting the wall where power efficiency mattered as much as price/performance/compatibility. Normally any technology that scales by a factor of one million before faltering to the competition is hailed as a resounding success. The original designers of such a technology don't normally hang their heads in a hall of shame.
RISC bugged me from the beginning, because it turned smart people into idiots. People smart enough to surprise the socks off me on other subjects would turn into soccer hooligans on the RISC subject and run around the room half-naked draped in the flag of orthogonality.
With a large enough register set, orthogonality damages power efficiency. You can't have a bazillion ways to encode exactly the same program in exactly the same space and not have to pay the price for that in wasted silicon. What you want is enough orthogonality that choosing an optimal code sequence is less complicated than solving a four-dime
hard X-rays to get the bandwidth we need?