Slashdot Mirror


Research Shows RISC vs. CISC Doesn't Matter

fsterman writes The power advantages brought by the RISC instruction sets used in Power and ARM chips is often pitted against the X86's efficiencies of scale. It's difficult to assess how much the difference between instruction sets matter because teasing out the theoretical efficiency of an ISA from the proficiency of a chip's design team, technical expertise of its manufacturer, and support for architecture-specific optimizations in compilers is nearly impossible . However, new research examining the performance of a variety of ARM, MIPS, and X86 processors gives weight to Intel's conclusion: the benefits of a given ISA to the power envelope of a chip are minute.

10 of 161 comments (clear)

  1. isn't x86 RISC by now? by alen · · Score: 5, Informative

    i've read the legacy x86 instructions were virtualized in the CPU a long time ago and modern intel processors are effectively RISC that translate to x86 in the CPU

    1. Re:isn't x86 RISC by now? by drinkypoo · · Score: 4, Interesting

      That is correct. Every time this comes up I like to spark a debate over what I perceive as the uselessness of referring to an "instruction set architecture" because that is a bullshit, meaningless term and has been ever since we started making CPUs whose external instructions are decomposed into RISC micro-ops. You could switch out the decoder, leave the internal core completely unchanged, and have a CPU which speaks a different instruction set. It is not an instruction set architecture. That's why the architectures themselves have names. For example, K5 and up can all run x86 code, but none of them actually have logic for each x86 instruction. All of them are internally RISCy. Are they x86-compatible? Obviously. Are they internally x86? No, nothing is any more.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:isn't x86 RISC by now? by Anonymous Coward · · Score: 4, Insightful

      Yes. As noted by the study (That by the way isn't very good.) "When every transistor counts, then every instruction, clock cycle, memory access, and cache level must be carefully budgeted, and the simple design tenets of RISC become advantageous once again."
      Essentially meaning that "If you want as few transistors as possible it doesn't help to have the CISC to RISC translation layer in x86"

      They also claim things like "The report notes that in certain, extremely specific cases where die sizes must be 1-2mm2 or power consumption is specced to sub-milliwatt levels, RISC microcontrollers can still have an advantage over their CISC brethren." which clearly indicates that their idea of "embedded" systems is limited to smartphones.
      The cases where you have a battery that can't be recharged on daily basis is hardly an extremely specific case. Not that any CPU they tested is suitable for those applications anyway. They have essentially limited themselves to applications where "not as bad as P4" is acceptable.

    3. Re:isn't x86 RISC by now? by RabidReindeer · · Score: 4, Interesting

      x86 instructions, are in fact, decoded to micro opcodes, so the distinction isn't as useful in this context.

      They're not the only ones. The IBM mainframes have long been VMs implemented on top of various microcode platforms. In fact, one of the original uses of the 8-inch floppy disk was to hold the VM that would be loaded up during the Initial Microprogram Load (IMPL), before the IPL (boot) of the actual OS. So in a sense, the Project Hercules mainframe emulator is just repeating history.

      Nor were they unusual. In school I worked with a minicomputer which not only was a VM on top of microcode, but you could extend the VM by programming to the microcode yourself.

      The main differences between RISC and CISC, as I recall were lots of registers and the simplicity of the instruction set. Both the Intel and zSeries CISC instruction sets have lots of registers, though. So the main difference between RISC and CISC would be that you could - in theory - optimize "between" the CISC instructions if you coded RISC instead.

      Presumably somebody tried this, but didn't get benefits worth shouting about.

      Incidentally, the CISC instruction set of the more recent IBM z machines includes entire C stdlib functions such as strcpy in a single machine-language instruction.

    4. Re:isn't x86 RISC by now? by cheesybagel · · Score: 5, Informative

      After AMD lost the license to manufacture Intel i486 processors, together with other people, they were forced to design their own chip from the ground up. So they basically used one of the 29k RISC processors and put an x86 frontend on it. Cyrix did more or less the same thing at the time also coming with their own design. Since the K5 had good performance per clock but could not clock very high and was expensive AMD was stuck and to get their next processor they bought a company called NexGen which designed the Nx586 processor which was Intel compatible. AMD then worked on the successor of Nx586 as a single chip which was the K6. The K7 Athlon was yet another design made by a team headed by Dirk Meyer who used to be a chip designer at Digital Equipment Incorporated i.e. DEC. He was one of the designers of the Alpha series of RISC CPUs and the Athlon resembles an Alpha chip internally a lot because of that.

    5. Re:isn't x86 RISC by now? by enriquevagu · · Score: 4, Insightful

      This is why we use the terms "Instruction Set Architecture" to define the interface to the (assembler) programmer, and "microarchitecture" to refer to the actual internal implementation. ISA is not bullshit, unless you confuse it with the internal microarchitecture.

  2. Re:It's a question that WAS relevant by TWX · · Score: 5, Funny

    You mean, my Github project on Ruby on Rails with node.js plugins isn't optimized to use one versus the other?

    --
    Do not look into laser with remaining eye.
  3. Re:so why is intel's 14nm haswell still at 3.5 wat by timeOday · · Score: 4, Insightful
    Here is your answer, the A20 is freakishly slow compared to anything Intel would put their name on.

    Granted, you can build a tablet to do specific tasks (like decoding video codecs) around a really slow processor and some special-purpose DSPs. But perhaps the companies in that business aren't making enough profit to interest Intel.

  4. This is a myth that is not true by hkultala · · Score: 5, Informative

    That is correct. Every time this comes up I like to spark a debate over what I perceive as the uselessness of referring to an "instruction set architecture" because that is a bullshit, meaningless term and has been ever since we started making CPUs whose external instructions are decomposed into RISC micro-ops. You could switch out the decoder, leave the internal core completely unchanged, and have a CPU which speaks a different instruction set. It is not an instruction set architecture. That's why the architectures themselves have names. For example, K5 and up can all run x86 code, but none of them actually have logic for each x86 instruction. All of them are internally RISCy. Are they x86-compatible? Obviously. Are they internally x86? No, nothing is any more.

    This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

    You cannot just change the decoder, the instruction set affect the internals a lot:

    1) Condition handling is totally different on different instruciton sets. This affect the banckend a lot. X86 has flags registers, many other architectures have predicate registers, some predicate registers with different conditions.

    2) There are totally different number of general purpose and floating point registers. The register renamer makes this a smaller difference, but then there is the fact that most RISC's use same registers for both FPU and integer, X86 has separate registers for both. And this totally separates them, the internal buses between the register files and function units in the processor are done very differently.

    3) Memory addressing modes are very different. X86 still does relatively complex address calculations on single micro-operation, so it has more complex address calculation units.

    4) Whether there are operations with more than 2 inputs, or more than 1 output has quite big impact on what kind of internal buses are needed, how many register read and write ports are needed.

    5) There are a LOT of more complex instructions in X86 ISA which are not split into micro-ops but handled via microcode. the microcode interpreter is totally missing on pure RISCs ( but exists on some not-so pure RISC's like Powe/PowerPC).

    6) Instruction set dictates the memory aligment rules. Architectures with more strict alignment rules can have simples load-store-units.

    7) Instruction set dictatetes the multicore memory ordering rules. This may affect the load-store units, caches and buses.

    8) Some instructions have different bitnesses in different architectures. For example x86 has N x X -> 2N wide multiply operations which most RISC's don't have. So x86 needs bigger/different multiplier than most RISCs.

    9) X87 FPU values are 80-bit wide(truncated to 64-bit when storing/loading). Practically all the other CPU's have maximum of 64-bit wide FPU values (though some versions Power have support for 128-bit FP numbers also)

    1. Re:This is a myth that is not true by hkultala · · Score: 4, Informative

      Some of what you said is legitimate. Most of it is irrelevant, since it does not speak to the postulate. You're speaking of issues which will affect performance. So what? You'd have a less-performant processor in some cases, and it would be faster in others.

      No.

      1) if the codition codes work totally differently, they don't work.

      2) The data paths needed for separate and compined FP and integer regs are so different that it makes absolutely NO sense to have them together in chip that runs x86 ISA, even though it's possible.

      3) If you don't have those x86-compatible address calculation units, you have to break most of memory ops into more micro-ops OR even run them with microcode. Both are slow. And if you have a RISC chip you want to have only the address calculation units you need for your simple base+offset addressing.

      4) In the basic RISC pipeline there are two operands, one output/instruction. There are no data paths for two results, you cannot execute operations with multiple outputs such as x86 muliply which produces 2 values(low and high part of result), unless you do something VERY SLOW.

      6) IF your RISC instruction set says you have aligned memory operations, you design your LSU to have only those, as it makes the LSU's much smaller, simpler and faster. But you need unaligned accesses for x86.

      9) If your FPU calculates with different bit width, it calculates wrongly.

      And