Slashdot Mirror


The Linux-Proof Processor That Nobody Wants

Bruce Perens writes "Clover Trail, Intel's newly announced 'Linux proof' processor, is already a dead end for technical and business reasons. Clover Trail is said to include power-management that will make the Atom run longer under Windows. It had better, since Atom currently provides about 1/4 of the power efficiency of the ARM processors that run iOS and Android devices. The details of Clover Trail's power management won't be disclosed to Linux developers. Power management isn't magic, though — there is no great secret about shutting down hardware that isn't being used. Other CPU manufacturers, and Intel itself, will provide similar power management to Linux on later chips. Why has Atom lagged so far behind ARM? Simply because ARM requires fewer transistors to do the same job. Atom and most of Intel's line are based on the ia32 architecture. ia32 dates back to the 1970s and is the last bastion of CISC, Complex Instruction Set Computing. ARM and all later architectures are based on RISC, Reduced Instruction Set Computing, which provides very simple instructions that run fast. RISC chips allow the language compilers to perform complex tasks by combining instructions, rather than by selecting a single complex instruction that's 'perfect' for the task. As it happens, compilers are more likely to get optimal performance with a number of RISC instructions than with a few big instructions that are over-generalized or don't do exactly what the compiler requires. RISC instructions are much more likely to run in a single processor cycle than complex ones. So, ARM ends up being several times more efficient than Intel."

26 of 403 comments

  1. Visual Studio is great, but what about MyCleanPC? by CRCulver · · Score: 4, Funny

    I'm glad Visual Studio also runs perfectly on Wine (I'm also making sure to have a party with my friends on Visual Studio 2012 Virtual Launch Party, where thousands of geeks around the globe connect together to party the release of latest Visual Studio).

    I'm happy for you that you can develop more efficiently with Visual Studio, but I'm piffed that MyCleanPC still isn't available for Linux. I mean, I'm looking at my friend on his Windows box, and ever since he installed MyCleanPC, his gigabits are running faster than ever!

    Plus, MyCleanPC completely eradicated any viruses on his computer, sped up his internet connection and gave him some peace of mind! We desperately need a Linux port of such outstanding software as MyCleanPC!

  2. oversimplified by kenorland · · Score: 5, Insightful

    ia32 dates back to the 1970's and is the last bastion of CISC,

    The x86 instruction set is pretty awful and Atom is a pretty lousy processor. But that's probably not due to RISC vs. CISC. IA32 today is little more than an encoding for a sequence of RISC instructions, and the decoder takes up very little silicon. If there really were large intrinsic performance differences, companies like Apple wouldn't have switched to x86 and RISC would have won in the desktop and workstation markets, both of which are performance sensitive.

    I'd like to see a well-founded analysis of the differences of Atom and ARM, but superficial statements like "RISC is bad" don't cut it.

    1. Re:oversimplified by lkcl · · Score: 5, Insightful

      I'd like to see a well-founded analysis of the differences of Atom and ARM, but superficial statements like "RISC is bad" don't cut it.

      i've covered this a couple of times on slashdot: simply put it's down to the differences in execution speed vs the storage size of those instructions. slightly interfering with that is of course the sizes of the L1 and L2 caches, but that's another story.

      in essence: the x86 instruction set is *extremely* efficiently memory-packed. it was designed when memory was at a premium. each new revision added extra "escape codes" which kept the compactness but increased the complexity. by contrast, RISC instructions consume quite a lot more memory as they waste quite a few bits. in some cases *double* the amount of memory is required to store the instructions for a given program [hence where the L1 and L2 cache problem starts to come into play, but leaving that aside for now...]
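      A rough sketch of the density argument above, not a benchmark. It adds up hand-assembled 32-bit x86 encodings for a tiny function that returns 1 and compares the total against a fixed 4-byte encoding. The choice of function, and the assumption that the fixed-width machine needs the same number of instructions, are illustrative only; real compilers and real code will differ.

```python
# Hand-assembled 32-bit x86 encodings for a tiny function that returns 1.
# The instruction choice is an illustrative assumption, not a benchmark.
X86_FUNCTION = [
    ("push ebp",     bytes.fromhex("55")),
    ("mov ebp, esp", bytes.fromhex("89e5")),
    ("mov eax, 1",   bytes.fromhex("b801000000")),
    ("pop ebp",      bytes.fromhex("5d")),
    ("ret",          bytes.fromhex("c3")),
]

FIXED_WIDTH_BYTES = 4  # classic 32-bit ARM (A32) instruction width

# Variable-length encoding: sum the actual byte counts.
x86_bytes = sum(len(encoding) for _, encoding in X86_FUNCTION)

# Assumption: the fixed-width machine needs the same number of instructions.
fixed_bytes = FIXED_WIDTH_BYTES * len(X86_FUNCTION)

print(x86_bytes, fixed_bytes, fixed_bytes / x86_bytes)
```

      For this hand-picked sample the totals come to 10 bytes against 20, which is the "double" case. A 16-bit encoding such as Thumb narrows the gap, so the ratio should not be read as a general result.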

      so what that means is that *regardless* of the fact that CISC instructions are translated into RISC ones, the main part of the CPU has to run at a *much* faster clock rate than an equivalent RISC processor, just to keep up with decode rate. we've seen this clearly in an "empirical observable" way in the demo by ARM last year, of a 500mhz Dual-Core ARM Cortex A9 clearly keeping up with a 1.6ghz Intel Atom in side-by-side running of a web browser, which you can find on youtube.

      now, as we well know, power consumption is a square law of the clock rate. so in a rough comparison, in the same geometry (e.g. 45nm), that 1.6ghz CPU is going to be roughly TEN times more power consumption than that dual-core ARM Cortex A9. e.g. that 500mhz dual-core Cortex A9 is going to be about 0.5 watts (roughly true) and the 1.6ghz Intel Atom is going to be about 5 watts (roughly true).
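      The arithmetic behind the "roughly TEN times" figure can be sketched as follows. It takes the square-law claim above at face value; the usual textbook relation for dynamic power is P ≈ C·V²·f, so the real exponent depends on how supply voltage scales with clock rate. The exponent is therefore a parameter here, and the function name is made up for illustration.

```python
def relative_power(f_mhz, f_ref_mhz, exponent=2.0):
    """Power ratio between two clock rates under an assumed power-law scaling.

    exponent=2.0 is the comment's square-law assumption, not a measured value.
    """
    return (f_mhz / f_ref_mhz) ** exponent

# 1.6 GHz Atom versus 500 MHz Cortex A9, same assumed process geometry.
atom_vs_a9 = relative_power(1600, 500)

# Scale the comment's rough 0.5 W figure for the A9 by that ratio.
estimated_atom_watts = 0.5 * atom_vs_a9

print(f"{atom_vs_a9:.2f}x, about {estimated_atom_watts:.2f} W")
```

      Under the square-law assumption the ratio is (1600/500)² = 10.24, which puts the Atom estimate at about 5.12 W, in line with the rough 5 W figure quoted above.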

      what that means is that x86 is basically onto a losing game.... period. the only way to "win" is for Intel and AMD to have access to geometries that are at least 2x better than anything else available in the world. each new geometry that comes out is not going to *stay* 2x better for very long. when everyone has access to 45nm, intel and AMD have to have access to 22nm or better... *at the same time*. not "in 6-12 months time", but *at the same time*. when everyone else has access to 28nm, intel and AMD have to have access to 14nm or better.

      intel know this, and AMD don't. it's why intel will sell their fab R&D plant when hell freezes over. AMD have a slight advantage in that they've added in parallel execution which *just* keeps them in the game i.e. their CPUs have always run at a clock rate that's *lower* than an intel CPU, forcing them to publish "equivalent clock rate" numbers in order to not appear to be behind intel. this trick - of doing more at a lower speed - will keep them in the game for a while.

      but, if intel and AMD don't come out with a RISC-based (or VLIW or other parallel-instruction) processor soon, they'll pay the price. intel bought up that company that did the x86-to-DEC-Alpha JIT assembly translation stuff (back in the 1990s) so i know that they have the technology to keep things "x86-like".

    2. Re:oversimplified by Zero__Kelvin · · Score: 4, Informative

      "Someday linux devs will resign themselves to the fact that linux is (somewhat) great for servers and terrible for almost everything else"

      You don't know anything about Linux. It powers all RISC / ARM based Android smartphones. It also runs on more than 33 different CPU architectures. A huge number of those platforms are embedded systems that are probably sitting in your living room and enabling you to watch TV, DVDs, Blu-ray, etc as well as listen to it all in Surround Sound.

      " In my opinion this entire article is trolling."

      To be blatantly honest, I haven't quite figured out if it is you that is trolling, or you are really just that ignorant of the facts.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    3. Re:oversimplified by kenorland · · Score: 4, Interesting

      As it turns out, it's much easier for software to do that job

      As it turns out, that's false. Optimizations are highly dependent on the specific hardware and data, and it's hard for compilers or programmers to know what to do. Modern processors are as fast as they are because they split optimization in a good way between compilers and the CPU. Traditional CISC processors got that wrong, as well as hardcore traditional RISC processors; the last gasp of the latter was the IA64, which proved pretty conclusively that neither programmers nor compilers can do the job by themselves.

  3. Re:Blast in time by Pseudonym · · Score: 4, Informative

    Hell, I remember using an Archimedes in 1988. Odd to think that my phone now has four of them.

    Back to the topic, the border between RISC and CISC is a bit fuzzy these days. Every modern CISC chip is basically a dynamic translator on top of a RISC core. But even high-end ARM chips can do some of this with Jazelle.

    To be fair, CISC does have a few performance advantages when power consumption isn't (as big) an issue. The code density is better on x86 (yes, even with Thumb), which does mean they tend to use instruction cache more effectively. ARM chips generally don't do out-of-order scheduling and retirement; that uses a lot of power, and is the main architectural difference between laptop-grade and desktop/server-grade x86en.

    I'd like to see what a mobile-grade Alpha processor looks like. But I never will.

    --
    sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  4. Short term vs Long term thinking by UnknowingFool · · Score: 4, Interesting

    Some here were immediately crying anti-trust and not understanding why Intel won't support Linux for Clover Trail. It's not an easy answer but power efficiency for Intel has been their weakness against ARM. If consumers had a choice between ARM based Android or Intel based Android, the Intel one might be slightly more powerful in computing but comes at the cost of battery life. For how tablets are used for most consumers, the increase in computing isn't worth the decrease in battery life. For geeks, it's worth it but general consumers don't see the value. Now if the tablet used a desktop OS like Windows or Linux, then the advantages are more transparent; however, the numbers favor Windows as there are more likely to be desktop Windows users with an Intel tablet than desktop Linux users with an Intel tablet. For short term strategy, it makes sense.

    Long term, I would say Intel isn't paying attention. Considering how MS have treated past partners, Intel is being short-sighted if they want to bet their mobile computing hopes on MS. Also have they seen Windows 8? Intel based tablets might appeal to businesses but Win 8 is a consumer OS. So consumers aren't going to buy it; businesses aren't going to buy it. Intel may have bet on the wrong horse.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
    1. Re:Short term vs Long term thinking by 0123456 · · Score: 4, Insightful

      Oh, you're right. A company the size of Intel couldn't possibly spare one or two people for a few weeks to get support for their new power management into Linux.

  5. Re:RISC is not the silver bullet by Anonymous Coward · · Score: 5, Insightful

    Like I posted elsewhere, intel hasn't made real CISC processors for years, and I don't think anyone has.
    Modern Intel processors are just RISC with a decoder to the old CISC instruction set.
    RISC beats CISC in price performance trade-off, but backwards compatibility keeps the interface the same.

  6. Sorry Bruce, but that is total nonsense. by guidryp · · Score: 5, Insightful

    "ARM ends up being several times more efficient than Intel"

    Wow. Someone suffered a flashback to the ancient CISC vs RISC wars.

    This is really totally out to lunch. Seek out some analysis from actual CPU designers on the topic. What I read generally pegs the x86 CISC overhead at maybe 10%, not several times.

    While I do feel it is annoying that Intel is pushing an Anti-Linux platform, it doesn't make sense to trot out ancient CISC/RISC myths to attack it.

    Intel Chips have lagged because they were targeting much different performance envelopes. But now the performance envelopes are converging and so are the power envelopes.

    Medfield has already been demonstrated at a competitive power envelope in smartphones.

    http://www.anandtech.com/show/5770/lava-xolo-x900-review-the-first-intel-medfield-phone/6

    Again we see reasonable numbers for the X900 but nothing stellar. The good news is that the whole x86 can't be power efficient argument appears to be completely debunked with the release of a single device.

  7. x86 to blame? by leromarinvit · · Score: 4, Insightful

    Is it really true that x86 is necessarily (substantially) less efficient than ARM? x86 instruction decoding has been a tiny part of the chip area for many years now. While it's probably relatively more on smaller processors like Atom, it's still small. The rest of the architecture is already RISC. Atom might still be a bad architecture, but I don't think it's fair to say x86 always causes that.

    Also, there is exactly one x86 Android phone that I know of, and while its power efficiency isn't stellar, the difference is nowhere near 4x. From the benchmarks I've seen, it seems to be right in the middle of the pack. I'd really like to see the source for that claim.

    --
    Proud member of the Ferengi Socialist Party.
  8. Re:Blast in time by TheRaven64 · · Score: 4, Informative

    Every modern CISC chip is basically a dynamic translator on top of a RISC core.

    And that's the problem for power consumption. You can cut power to execution units that are not being used. You can't ever turn off the decoder (except in Xeons, where you do in loops, but you leave on the micro-op decoder, which uses as much power as an ARM decoder) because every instruction needs decoding.

    But even high-end ARM chips can do some of this with Jazelle.

    Jazelle has been gone for years. None of the Cortex series include it. It gave worse performance to a modern JIT, but in a lower memory footprint. It's only useful when you want to run Java apps in 4MB of RAM.

    The code density is better on x86 (yes, even with Thumb), which does mean they tend to use instruction cache more effectively

    That's not what my tests show, in either compiled code or hand-written assembly.

    --
    I am TheRaven on Soylent News
  9. Re:RISC is not the silver bullet by UnknowingFool · · Score: 5, Informative

    I would argue the problem for Apple wasn't about performance but about updates, mobile, and logistics. PowerPC originally held promise as a collaboration between Motorola, IBM, and Apple. IBM got much out of it as their current line of servers and workstations run on it. Apple's needs were different than IBM's. Apple needed new processors every year or so to keep up with Moore's law. Apple needed more power efficient mobile processors. Also Apple needed a stable supply of the processors.

    Despite ordering millions of chips a year, Apple was never going to be a big customer for Motorola or IBM. Their chips would be so highly customized that none of their other customers needed or wanted them, and Apple needed updates every year. So neither Motorola nor IBM could dedicate huge resources for a small order of chips as they could make millions more for other customers. PowerPC might have eventually come up with a mobile G5 that could rival Intel but it would have taken many years and lots of R&D. IBM and Motorola didn't want to invest that kind of effort (again for one customer). So every year Apple would order enough chips they thought they needed. If they were short, they would have to order more. Now Motorola and IBM like most manufacturers (including Apple) do not like carrying excess inventory. So they were never able to keep up with Apple's orders as their other customers had more steady and larger chip orders.

    So what was Apple to do? Intel represented the best option. Intel's mobile x86 chips were more power efficient than PowerPC versions. Intel would keep up the yearly updates of their chips. If Apple increased their orders from Intel, Intel could handle it because if Apple wasn't ordering a custom part, they were ordering more of a stock part. There are some cases where Apple has Intel design custom chips for them, mostly on the lower power side; however, Intel still can sell these to their other customers.

    As a side note, for a contrast with the relationship between IBM and Apple, look at the relationship between MS and IBM for the Xbox 360 Xenon chip. This was a custom design by IBM for MS, but the basic chip design hasn't changed in seven years. As such, chip manufacturing has been able to move the chip to smaller lithographies (90nm --> 45nm in 2008), both increasing yield and lowering cost.
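    To put a number on the 90nm --> 45nm move, here is a minimal sketch that assumes ideal scaling, where die area shrinks with the square of the feature size. Real shrinks rarely scale this cleanly (I/O pads and analog blocks do not shrink in proportion), so treat the result as an upper bound. The function name is made up for illustration.

```python
def ideal_area_ratio(old_nm, new_nm):
    """Die area after a shrink relative to before, assuming ideal linear scaling."""
    return (new_nm / old_nm) ** 2

# Xenon's move from 90nm to 45nm, under the ideal-scaling assumption.
area_ratio = ideal_area_ratio(90, 45)

# Same wafer, smaller dies: the ideal gain in dies per wafer is the inverse.
dies_per_wafer_gain = 1 / area_ratio

print(area_ratio, dies_per_wafer_gain)
```

    In the ideal case the same design takes a quarter of the area, so about four times as many dies fit on a wafer, which is where the lower cost per chip comes from.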

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  10. RISC vs CISC, really? by fermion · · Score: 4, Informative

    Most 70's era microprocessors pretty much had 50 opcodes and a few registers. It was possible to memorize these all and decompile from hex in your head. I never had the mental acuity to do so, but many of my friends in high school could. By the 1980's, there was a lot of big iron that used RISC, but as I recall these had more opcodes than, say, a 6502, and I know that RISC does not just mean reduced instruction. It is a simplified instruction set. Right now I think we have a lot of hybrid chips on the market. The war between CISC and RISC has come to a place where both are used as needed. In the x86 space, legacy is an issue. MS has not done what Apple does which is to say support a machine for 3-5 years, then develop something that meets current demands. The common person would not even see a RISC processor until Apple switched to the PowerPC, which brought the conflict between CISC and RISC to the public. It is interesting to have this conversation now because this was exactly what was said back then. RISC is more efficient, so the chip can be about half as fast, and still be as fast as the CISC chip.

    So this OS specific chip is nothing new, and *nix exclusion is not new. Many microcomputers could not run *nix because they did not have a PMMU. The ATT computer ran a 68K processor with a custom PMMU. Over the past 10 years there have been MS Windows only printers and cameras which offloaded work to the computer to make the peripheral cheaper.

    Which is to say that there are clearly benefits for RISC and CISC. MS built an empire on CISC, and clearly intends to continue to do so, only moving to RISC on a limited basis for high end highly efficient devices. For the tablet for the rest of us, if they can ship MS Windows 8 on a $400 device that runs just like a laptop, they will do so. If efficiency were the only issue, then we would be running Apple type hardware, which, I guess, on the tablet we are. But while 50 million tablets are sold, MS wants the other 100 million laptop users that do not have a tablet, yet, because it is not MS Windows.

    --
    "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  11. Re:RISC is not the silver bullet by stripes · · Score: 5, Interesting

    First, RISC instructions complete in one cycle. If you have multi-cycle instructions, you're not RISC

    LOAD and STORE aren't single cycle instructions on any RISC I know of. Lots of RISC designs also have multicycle floating point instructions. A lot of second or third generation RISCs added a MULTIPLY instruction and they were multiple cycle.

    There are not a lot of hard and fast rules about what makes things RISCy, mostly just "they tend to this" and "tend not to that". Like "tend to have very simple addressing modes" (most have register+constant displacement -- but the AMD29k had an adder before you could get the register data out, so R[n+C1]+C2 which is more complex than the norm). Also "no more than two source registers and one destination register per instruction" (I think the PPC breaks this) -- oh, and "no condition register" but the PPC breaks that.

    Second, x86 processors are internally RISCy and x86 is decomposed into multiple micro-ops.

    Yeah, Intel invented microcode again, or a new marketing term for it. It doesn't make the x86 any more a RISC than the VAX was though. (For anyone too young to remember, the VAX was the poster child for big fast CISC before the x86 became the big deal it is today.)

  12. Misleading slant on mention of Atom's RISC core by Dogtanian · · Score: 5, Informative

    Like I posted elsewhere, intel hasn't made real CISC processors for years, and I don't think anyone has. Modern Intel processors are just RISC with a decoder to the old CISC instruction set.

    Exactly. Intel has been doing this ever since the Pentium Pro and Pentium II came out in the 1990s. Anyone who knows much at all about x86 CPUs is aware of this, and Perens certainly will be. That's why I'm surprised that that article misleadingly states:-

    So, we start with the fact that Atom isn't really the right architecture for portable devices (*) with limited power budgets. Intel has tried to address this by building a hidden core within the chip that actually runs RISC instructions, while providing the CISC instruction set that ia32 programs like Microsoft Windows expect.

    The "hidden core" bit is, of course, correct, but the way it's stated here implies that this is (a) something new and (b) something that Intel have done to mitigate performance issues on such devices, when in fact it's the way that all Intel's "x86" processors have been designed for the past 15 years!

    Perhaps I'm misinterpreting or misunderstanding the article, and he's saying that- unlike previous CPUs- the new Atom chips have their "internal" RISC instruction set directly accessible to the outside world. But I don't think that's what was meant.

    (*) This is in the context of having explained why IA32 is a legacy architecture not suited to portable devices and presented Atom as an example of this.

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    1. Re:Misleading slant on mention of Atom's RISC core by im_thatoneguy · · Score: 4, Insightful

      It also ignores the fact that in flops per watt Intel still dominates ARM.

      It's like comparing a moped to a bus and saying "see look how much more fuel efficient the moped is!"

      True... but then fill a bus with people and suddenly the mpg per person goes through the roof for the bus. You could get 300mpg per person from a bus. Good luck getting that with a moped.

      And like the introduction of plugin hybrids competing with even Mopeds for single occupancy MPG--you can also see RISC x86 chips out-competing ARM too on RAW watts. The next generation of Intel chips are going to be not only substantially faster but also on parity for watts.

      Simply stripping down technology inevitably will come back to bite you in the ass. I think the domination of ARM in the mobile space is about to evaporate within the next year on every conceivable metric.

  13. Re:The Year of Linux on Desktop Is Now by ColdWetDog · · Score: 5, Insightful

    So does it matter when someone sends you a .pptx file that Office 2003 freezes on? Yeah, yeah, I'm pretty sure you can get a converter, but I like telling people that if their file has an 'x' in the extension it means that it's 'experimental' and they shouldn't send it to others. They need to send the version without the 'x'.

    --
    Faster! Faster! Faster would be better!
  14. ARM is not RISC and x86-64 is not CISC by YA_Python_dev · · Score: 5, Informative

    Getting back on topic: the last ARM architecture, ARMv8, is far from what was called "RISC" back in the '70s. E.g. it can run instructions of different sizes (16 vs 32 bit), it has 4 specialized instructions for AES, registers with different sizes (32, 64 and 128 bits), instructions for running a subset of the Java bytecode, a rich set of SIMD operations and specialized instructions for SHA-1 and SHA-256.

    Similarly, the architecture supported by the new Atom chips (which is AMD64/x86-64 BTW, IA32 is only present for backward compatibility) is almost universally run on RISC-like processors that have instruction translators. Considering that the increased density of the x86-64 instructions usually saves more cache transistors than are required for decoding the instructions themselves, I think that the power consumption differences that we see are more due to the implementation and different traditional focus areas of ARM vs Intel/AMD than inherent differences in the instruction sets.

    --
    There's a hidden treasure in Python 3.x: __prepare__()
    1. Re:ARM is not RISC and x86-64 is not CISC by imroy · · Score: 4, Interesting

      The ARM ISA may seem "complex" when you describe it like you have, but each instruction is still a fixed size, they all follow one of only a limited number of formats (R-type, etc), and memory is only accessed by load/store instructions. That's why many prefer the term "load/store architecture". Anyway, these things really help to simplify your instruction decoder stage and keep memory accesses simple. These in turn make it easier to implement things like pipelines, out-of-order execution, branch prediction, etc. And that's only the stuff that has been implemented in ARM so far. I wonder how long until ARM develops a core with more advanced features, like register renaming and speculative execution, and how it will perform then compared to x86 (which already has these things).

    2. Re:ARM is not RISC and x86-64 is not CISC by Bruce+Perens · · Score: 5, Insightful

      None of today's "RISC" processors are what John Mashey was designing when RISC was introduced.

      I agree (and wrote in the article) that ARM has complicated their own architecture, and that Atom uses a RISC-like processor and instruction translation. However, backward compatibility with all of the generations of x86 still increases the complexity of Atom quite a lot.

      Thumb (ARM's 16-bit instruction set) is itself an instruction translator to the 32-bit opcodes, adding fixed or default operands for many of the instructions.

      The SIMD instructions used by Intel, AMD, and ARM go back to Pixar's CHAP compositing hardware in the 80's.

      None of this would have been in a Stanford MIPS.

    3. Re:ARM is not RISC and x86-64 is not CISC by Bruce+Perens · · Score: 4, Insightful

      I didn't write the summary posted on Slashdot. My summary (it's probably still in the "firehose" section) was one line. The Slashdot editor just scraped the first few paragraphs of my article. You can tell the number of people who actually read my article by the discussion of PowerVR graphics. There isn't one.

      Intel's competition with ARM right now is like a doped race-horse. They are hiding the problems of their architecture by using a semiconductor process half the size of the competition. Given equal FABs, we wouldn't see Intel as competitive.

    4. Re:ARM is not RISC and x86-64 is not CISC by AcidPenguin9873 · · Score: 5, Insightful

      Given equal FABs, we wouldn't see Intel as competitive.

      Intel has had a fab advantage for years, and it's only getting bigger. Ask AMD how it feels - AMD made nice gains with K8 while Intel had uarch problems (Itanium+P4), but as soon as Intel fixed that (Core2/Nehalem/Sandy/Ivy), AMD felt the pain of their fab advantage all over again, and now AMD has uarch problems AND fab disadvantage.

      Saying "given equal FABs" is a ridiculously stupid way to analyze the processor market. Real chips are what people buy, not some hypothetical ARM A15 produced on Intel's 22nm FinFET or an Atom produced in TSMC 28. If you want to talk about microarchitecture, sure, take process out of the equation. But people don't buy microarchitecture, they buy a final product. Fab advantage allows Intel to hide their uarch problems until they fix them. When the next-gen Atom (Silvermont/Valleyview) comes out, then Intel won't have uarch problems AND they will still have a massive fab advantage.

  15. Re:RISC is not the silver bullet by smash · · Score: 4, Interesting

    I bet that Apple did not make the decision based on technical grounds, it was probably a business decision.

    Actually it was both; the great irony is that Apple ditched PPC and went to x86 because of better power consumption with the new Intel gear. The Core onwards CPUs, in terms of performance per watt, have been awesome and were leaps and bounds better than anything IBM/Motorola could offer with the RISC PowerPC processors. If Apple didn't go x86 and tried to stick with PPC, they would have been slaughtered in the notebook market, which is/was the fastest growing personal computer segment. Neither IBM nor Motorola gave a crap about making a CPU to cater to Apple's 10% of the portable computer market.

    There's a lot of "RISC is so much better for power!" crap floating around, and maybe in theory it is. However in practice when you take into account real world applications and the "race to sleep", having a more powerful, CISC based core with an instruction set that provides many many functions in hardware can help offset the "in theory" better power consumption of the RISC competition. That and the fact that intel has the world's best fabs.

    --
    I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
  16. Re:The Year of Linux on Desktop Is Now by hack++slash · · Score: 4, Insightful

    Don't give up hope, hundreds of thousands of people in offices across the globe have made a living whilst playing Windows Solitaire.

    --
    To do something right, you often have to roll up your sleeves and get busy.
  17. Re:Visual Studio is great, but what about MyCleanP by ultrasawblade · · Score: 4, Informative

    Furthermore a distinguishing feature of CISC vs. RISC is number of general purpose registers. RISC always tried to do everything in registers and treat RAM as an I/O device, instead of stuff like "load accumulator with value in RAM and write it back to RAM" or "load this register with this value from RAM, multiply it with the value in this register, then store it back to RAM." - there are many instructions like this in CISC architectures that encourage treating RAM as just as good for temporary storage as registers - which, of course, it hasn't been for a long time now.
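    The difference described above can be sketched with a toy machine. Everything here is hypothetical (the opcode names, the register names, the access counting) and models no real ISA. It runs the same computation twice, once with a memory-operand multiply and once in load/store style, and counts how often RAM is touched.

```python
def execute(program, mem, regs):
    """Run a list of (op, *args) tuples; return the number of memory accesses."""
    accesses = 0
    for op, *args in program:
        if op == "load":        # load rd, addr
            rd, addr = args
            regs[rd] = mem[addr]
            accesses += 1
        elif op == "store":     # store rs, addr
            rs, addr = args
            mem[addr] = regs[rs]
            accesses += 1
        elif op == "mul":       # mul rd, ra, rb (registers only)
            rd, ra, rb = args
            regs[rd] = regs[ra] * regs[rb]
        elif op == "mul_mem":   # mul_mem addr, rs: mem[addr] *= regs[rs]
            addr, rs = args
            mem[addr] = mem[addr] * regs[rs]
            accesses += 2       # one read plus one write
        else:
            raise ValueError(f"unknown op {op!r}")
    return accesses

# Multiply the value at address 0 by r1 twice, using RAM as the scratch space.
memory_operand_style = [("mul_mem", 0, "r1"), ("mul_mem", 0, "r1")]

# Same job in load/store style: one load, work in registers, one store.
load_store_style = [
    ("load", "r0", 0),
    ("mul", "r0", "r0", "r1"),
    ("mul", "r0", "r0", "r1"),
    ("store", "r0", 0),
]

mem_a, mem_b = [6], [6]
accesses_a = execute(memory_operand_style, mem_a, {"r1": 7})
accesses_b = execute(load_store_style, mem_b, {"r0": 0, "r1": 7})
print(mem_a, accesses_a, mem_b, accesses_b)
```

    Both versions leave the same value in memory, but the memory-operand version touches RAM four times against two. A real compiler targeting x86 would normally keep the value in a register as well, and caches hide much of the cost, so this only illustrates the programming model, not actual performance.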

    Intel has become more RISCy with MMX/SSE and now with the amd64 extensions that give it 8 more general purpose registers.