RISC Vs. CISC In Mobile Computing
eldavojohn writes "For the processor geeks here, Jon Stokes has a thoughtful article up at Ars Technica analyzing RISC vs. CISC in mobile phones (Wikipedia on Reduced Instruction Set Computers and Complex Instruction Set Computers). He wraps it up with two questions: 'How much is the legacy x86 code base really worth for mobile and ultramobile devices? The consensus seems to be "not much," and I vacillate on this question quite a bit. This question merits an entire article of its own, though,' and 'Will Intel retain its process leadership vs. foundries like TSMC, which are rapidly catching up to it in their timetables for process transitions? ARM, MIPS, and other players in the mobile device space that I haven't mentioned, like NVIDIA, AMD/ATI, VIA, and PowerVR, all depend on these foundries to get their chips to market, so being one process node behind hurts them. But if these RISC and mobile graphics products can compete with Intel's offerings on feature size, then that will neutralize Intel's considerable process advantage.'"
There are no CISC CPUs anymore. There are RISC CPUs with RISC instruction sets (e.g. ARM) and there are RISC CPUs with CISC instruction sets (e.g. x86). The cores are mostly the same, except that the chips with CISC instructions need to do a little more work in the decoder. It requires a bit extra transistors and a bit more power, but it's not a huge deal for PCs and servers. Of course, for embedded applications, it makes a difference and for those it makes sense to have more "specialised" architectures (from microcontrollers to DSPs, ARM and all kinds of hybrids).
Opus: the Swiss army knife of audio codec
RISC vs. CISC? What is this, the early 90's? There are no RISC chips anymore, except as product lines that were originally developed with the RISC methodology in mind. Similarly, true CISC doesn't exist either. Microcode has done wonders in turning complex instructions into a series of simpler instructions like one would find on a RISC processor.
The author's real point appears to be: x86 vs. Other Embedded Architectures. Without even looking at the article (which I did do), it's not hard to answer that one: There is no need for x86 code in a mobile platform. The hardware is going to be different than a PC, the interface is going to be different than a PC, and the usage of the device is going to be different than a PC. Providing x86 compatibility thus offers few, if any, real advantages over an ARM or other mobile chip.
If Intel's ATOM takes off, it will be on the merits of the processor and not on its x86 compatibility. Besides, x86 was a terrible architecture from the get-go. There's something mildly hilarious about the fact that it became the dominant instruction set in Desktop PCs across the world.
Javascript + Nintendo DSi = DSiCade
RISC vs CISC was the architecture flamewar of the late 1980s. Welcome to the 21th century, you'll like it here. It's a world when, since the late 90s, the ISA (instruction set architecture), is so abstracted away from the actual micro-architecture of microprocessor, as to make it completely pointless to make distinctions between the two. Modern processors are RISC, they are CISC, they are vector machines, they're everything you want them to be. Move on, the modern problems are now in multi-core architecture and their issues of memory coherence, cache sharing, memory bandwidth, interlocking mechanisms, uniform vs non-uniform, etc. The "pure RISC" standard bearers of yore have disappeared or have been expelled from the personnal computing sphere (remember Apple ditching PowerPC ? Alpha anyone ? Where are those shiny MIPS-based SGIs gone?). Even Intel couldn't impose a new ISA on its own (poor adoption of IA-64). The only RISC ISA that has any existence in the personnal computing arena, including mobile, is ARM, but precisely, they do only mobile. There's really no reason at all to build any device on which you plan to run generic OSes and rich computing experience on anything else than x86 or x86-64 machines.
Intel sucessfully killed the high end CPU manufacturers. However, recently they have had poor performance in the very low power arena. Their main offering (XScale, until they sold it) was poor compared ot the competitors. Compare the Intel PXA27x to the Philips LPC3180. The philips chip has about the same instruction rate for integer instructions (at half the clock rate), hardware floating point (so it's about 5x as fast at this) and draws about 1/5 of the power. I know which one I prefer...
Unlike the old RISC workstation manufacturers which relied on a small market of high margin machines, the current embedded CPU manufacturers operate in a huge, cut-throat world where they need to squeeze up the price/performance ratio as high as possible to maintain a lead. I think this market will be somewhat tougher to crack than the workstation market, since intel does not have what they had before: an advantage in volume shipped.
SJW n. One who posts facts.
I've had to work with the ARM ISA in the past (I was studying its implementation as a soft core on an FPGA), and I can tell you it doesn't follow the RISC philosophy well, if at all.
One very non-RISC thing ARM did was move the shift instructions into every arithmetic instruction. That's right: there are no dedicated shift instructions. When you need a shift instruction, you have to encode it as part of a move operation or an add. In effect, every add, and, or, sub, etc. is actually a an add+shift, and+shift, or+shift, etc. This is the opposite of the RISC philosophy, and it significantly complicates the hardware, since a variable shifter has to be on the ALU critical path.
Other non-RISC things ARM did include the Java instruction set extensions, the Thumb instruction set extensions (further reduce code size), vector & media instruction set instructions, etc.
I think calling ARM "RISC" is a marketing decision only, done for historical reasons. It doesn't have much to do with the technical reality, IMO. Jon Stokes would have done better to say ARM vs. x86, instead of RISC vs. CISC, which is an outdated idea back from the 80s & 90s.
They just aren't very important distinctions anymore.
Both refer to the instruction sets, not the internal workings. x86 was CISC in 1978 and it's still CISC in 2008. ARM was RISC in 1988 and still RISC in 2008. AMD64 is a border line case.
People get confused with the way current x86's break apart instructions into microops. That's doesn't make it RISC. That just make it microcoded. That's how most CISC processors work. RISC process rarely use anything like microcode and when they do, it is looked upon as very unRISCy.
Today, the internals of RISC and CISC processors are so complex that the almighty instruction set processing is barely a shim. There are still some advantages to RISC but they are dwarfed by out-of-order execution, vector extensions, branch prediction and other enormously complex features of modern processors.
The only real point in x86 is Windows compatability. Linux runs fine on ARM and many other architectures. There are probably more ARM Linux systems than x86-based Linux systems (all those Linux cellphones run ARM).
Apart from some very low level stuff, modern code tends to be very CPU agnostic.
Engineering is the art of compromise.
There's no distinction between the two any more, and hasn't been for a long time. The whole point of RISC was to simplify the instruction format and pipeline.
The problem these days is that it doesn't actually cost anything to have a complex instruction format. It's such a tiny, isolated piece of the chip that it doesn't count for anything, it doesn't even slow the chip down because the chip is decoding from a wide cache line (or multiple wide cache lines) anyway.
So what does that leave us with? A load-store instruction architecture verses a read-modify-write instruction architecture? Completely irrelevant now that all modern processors have write buffer pipelines. And, it turns out, you need to have a RMW style instruction anyway, even if you are RISC, if you want to have any hope of operating in a SMP environment. And regardless of the distinction cpu architectures already have to optimize across multiple instructions, so again the concept devolves into trivialities.
Power savings are certainly a function of the design principles used in creating the architecture, but it has nothing whatsoever to do with the high level concept of 'RISC' vs 'CISC'. Not any more.
So what does that leave us with? Nothing.
-Matt
As I read the previous posts, it seems like the focus is on RISC vs CISC but I think the real question is there value-add for designers to have an x86 compatible embedded microcontroller?
People (and not just us) should be asking would end customers find it useful to be able to run their PC apps on their mobile devices? Current mobile devices typically have PowerPoint and Word readers (with maybe some editing capabilities) but would users find it worthwhile being able to load apps onto their mobile devices from the same CDs/DVDs that were used to load the apps onto their PCs?
If end customers do find this attractive, would they be willing to pay the extra money for the chips (the Atom looks to require considerably more gates than a comparable ARM) as well as for the extra memory (Flash, RAM & Disk) that would be required to support PC OSes and apps? Even if end customers found this approach attractive, I think OEMs are going to have a long, hard think about whether or not they want to port their radio code to the x86 with Windows/Linux when they already have infrastructures built up with the processors and tools they are currently using.
The whole thing doesn't really make sense to me because if Intel wanted to be in the MCU business, then why did they spin it off as Marvell (which also included the mobile WiFi technology as well)?
The whole think seems like a significant risk that customers will want products built from this chip and the need for Intel and OEMs to recreate the infrastructure they have for existing chips (ie ARM) for the x86 Atom.
myke
Mimetics Inc. Twitter
Until recently, there were still speed advantages to using a four core multi-processor G5 for some operations over the 3.0GHz eight-core Xeon Mac Pros because of VMX.
It is somewhat ironic that the Core architecture chips now used by Apple in all but the Mac Pros are all below the 3GHz clock "wall" that was never overcome by the G5, but the Intel name seems to have gone a long way in assuaging consumer doubts about buying a Mac.
""
The problem these days is that it doesn't actually cost anything to have a complex instruction format. It's such a tiny, isolated piece of the chip that it doesn't count for anything, it doesn't even slow the chip down because the chip is decoding from a wide cache line (or multiple wide cache lines) anyway.
""
The problem with your assumption is that it's _wrong_.
It does cost something. The WHOLE ARTICLE explains in very good detail the type of overhead that goes into supporting x86 processors.
The whole point of ATOM is Intel's attempt to make the ancient CISC _instruction_set_ work on a embedded style processor with the performance to handle multimedia and limited gaming.
The overhead of CISC is the complex arrangement that takes the x86 ISA and translates it to the RISC-like chip that Intel uses to do the actual processing.
When your dealing with a huge chip like a Xeon or Core2Duo with a huge battery or connected directly to the wall then it doesn't matter. Your taking a chip that would use 80watts TPD and going to 96.
But with ARM platform you not only have to make it so small that it can fit in your pocket, but you have to make the battery last at least 8-10 _hours_.
This is a hell of a lot easier when you can deal with a instruction set that is designed specifically to be stuck in a tiny space.
If you don't understand this, you know NOTHING about hardware or processors.
Every 4+ comment has the same "RISC|CISC is dead" comment talking about how x86 chips break down that massive, warty ISA into a series of RISC-like micro-ops for internal consumption. And that this has been the case since at least the Pentium Pro.
Read the article. Jon Stokes makes that point: but he also makes the point that in embedded processors, it does matter, because the transistor budget is much, much smaller than for a modern desktop CPU. It may come to pass in a few generations of die feature shrinking that we arrive back at the current situation of ISAs becoming irrelevant, but for the moment in the embedded space it does matter that you need to give up a few million transistors to buffering, chopping up and reissuing instructions compared to just reading and running them.
Remember, this is Jon Stokes we're talking about: he's the guy that taught most Slashdotters what they know about CISC and RISC as it is.
Classical Liberalism: All your base are belong to you.
I understand the theory--you simplify instructions, do things to speed up the processor so it can run faster, then optimize the processor to run as fast as you can.
In other words, you are designing your instruction set to your hardware.
Now, assuming that you are going to have close to infinite investment into speeding up the CPU, it seems that if you are going to fix an instruction set across that development time, you want the instruction set that is the smallest and most powerful you could get it.
That way for the same cycle instead of executing one simple instruction you are executing one powerful one (that does, say 5x more than the simple one)
Now at first the more powerful one will take more time than the simple one, but as the silicon becomes more powerful, The hardware designers are going to come up with a way to make it only take 2x as long as the simple one. Then less.
I guess I mean that you will get more relative benefit tweaking the performance of a hard instruction than an easy one.
Also, at some point the Memory to CPU channel will be the limit.
I'd kinda like to see Intel take on an instruction set designed for the compiler rather than the CPU (like Java Bytecode). Bytecode tends to be MUCH smaller--and a quad-core system that directly executes bytecode, once fully optimized, should blow away anything we have now in terms of overall speed.
The distinction you seem to be trying to draw here is not very sound. Modern CPUs "translating instructions into hardware instructions" with a gate maze is essentially the same thing as pulling a wide microcode word from ROM whose bits directly control the logic units. In both cases you put some bits in to start the process off, and you get a larger number of bits as a wide bus of signals out, which are used to direct traffic inside the CPU. The picture only looks
Specifically, the different parts of each microcode instruction executed in parallel then, just as now, though out of order execution was much rarer (some DSPs had it IIRC). This was not because microcode as it was then conceived couldn't handle it, but that the in-CPU hardware to support it wasn't there. There's no point going through gymnastics to feed your ALU if you've only got one and it's an order of magnitude slower than the circuit that feeds it.
One of the biggest annoyances of staying in any one field for too long is having to watch some technology following the logical path from conception to fruition go through an endless series of renaming (AKA jargon upgrades) that add nothing but confusion and pomposity to the field.
--MarkusQ
While this was by far the most common sort of implementation, it wasn't what drove the definition. Many factors can effect how things ultimately get laid out on the silicon, and nobody ever said "well, we thought it was going to be micro coded but the ROM area wound up L shaped instead of rectangular, so I guess it isn't."
What drove the definition was what differentiated micro-coded architectures from their piers and predecessors--the explicit use of a systematic way to organize and sequence the control lines (and there was some overlap and blur around the edges--ad hoc systems with "meta-control lines," gates arrays, RAM, and even demultiplexors instead of ROMS, etc.) to permit the design of more complex instructions. Because they were systematic, such systems could be written down like code instead of being laid out like circuits (which the ultimately were) and thus the name.
Microcode is a way of designing (and thinking about) a CPU, not at the end of the day a way of implementing one. You could take a fully specified microcoded architecture and opportunistically replace some or all of the microstore with a gatemaze without effecting it's formal behaviour. Since the result would often be smaller, faster, and use less power this was commonly done.
--MarkusQ