Slashdot Mirror


Intel Dismisses 'x86 Tax', Sees No Future For ARM

MrSeb writes "In an interview with ExtremeTech, Mike Bell — Intel's new mobile chief, previously of Apple and Palm — has completely dismissed the decades-old theory that x86 is less power efficient than ARM. 'There is nothing in the instruction set that is more or less energy efficient than any other instruction set,' Bell says. 'I see no data that supports the claims that ARM is more efficient.' The interview also covers Intel's inherent tech advantage over ARM and the foundries ('There are very few companies on Earth who have the capabilities we've talked about, and going forward I don't think anyone will be able to match us' Bell says), the age-old argument that Intel can't compete on price, and whether Apple will eventually move its iOS products from ARM to x86, just like it moved its Macs from Power to x86 in 2005."

  1. Speed versus complexity by girlintraining · · Score: 5, Interesting

    You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly. And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can. Besides, look at Nvidia's GPU cores: They throw hundreds of cores onto the die, but it eats hundreds of watts as well. Massively parallel and simple instruction sets don't appear to translate into energy savings.

    --
    #fuckbeta #iamslashdot #dicemustdie
    1. Re:Speed versus complexity by phantomfive · · Score: 5, Insightful

      Intel won the CPU wars because of manufacturing, not because of a superior instruction set. They are always able to get a smaller manufacturing process.

      For example, taking your point about data bandwidth, because the x86 has so few registers, it has to do data IO a lot more compared to something like the PowerPC or SPARC.

      To make up for that, Intel built a lot of logic in microcode and pipe-lining. It was a lot of work, but they did it well, so the x86 gets acceptable performance. All that extra logic takes power though. So Intel has a tradeoff between power consumption and performance that they can make. This guy seems to be saying they will switch to reduce power consumption, and then make up for it by having the best manufacturing process once again.

      And they do. For probably as long as chips continue to get smaller, Intel will have the advantage.

      --
      "First they came for the slanderers and i said nothing."
    2. Re:Speed versus complexity by k(wi)r(kipedia) · · Score: 1

      That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can.

      I'm not sure I get where you're going. I think the more logical course of action, given your argument that it's faster inside than outside the CPU, would be to move everything inside the CPU. I know "that" would be a hell of a lot harder to fix/debug, but if you want the best power efficiency it's best to have fewer parts that send bits of electrons back and forth over some external bus. Compare the situation with how much more energy efficient it is to live in the city where you work than to commute from the suburbs.

    3. Re:Speed versus complexity by Anonymous Coward · · Score: 2, Interesting

      In terms of market share CISC isn't even close to touching RISC. Every ARM processor is RISC. It's not just smartphones and tablets you need to consider, but PMPs, consumer routers, and an unfathomable number of other devices that all use ARM (Advanced RISC Machine, previously Acorn RISC Machine).

    4. Re:Speed versus complexity by Man+On+Pink+Corner · · Score: 4, Informative

      The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter. CISC often has the ultimate advantage simply because it makes better use of the code cache.

    5. Re:Speed versus complexity by Chas · · Score: 5, Interesting

      Intel won the CPU wars because of manufacturing, not because of a superior instruction set.

      There's nothing inherently "superior" about ARM or PPC instruction sets.

      Each has its strengths and weaknesses and prescribed methods of capitalizing on the former while working around the latter.

      Is x86, possibly, more inelegant than ARM or PPC? Maybe. Then again, what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

      x86 may be ugly and hackish. But it's probably THE best documented platform in history and has very VERY few platform segregation points.

      --


      Chas - The one, the only.
      THANK GOD!!!
    6. Re:Speed versus complexity by danlip · · Score: 5, Insightful

      And we know who lost that one. Badly.

      We do? The world's fastest supercomputer (K computer) is RISC based, and ARM is RISC, so it seems very much alive. Also CISC now has pipelining which was the thing that originally made RISC awesome, and RISC has gotten more complex, so they have evolved to be closer to each other. I am sure there are other factors that are more important for energy efficiency (mainly transistor size) and I don't have an opinion on that, but I don't understand where you are coming from.

    7. Re:Speed versus complexity by WorBlux · · Score: 1

      Depends on what you are doing. For some things GPUs are much more efficient. Say bitcoin mining: the best ratio of Mhash/Joule was .3 for the best Intel processor, and 2.5 for some AMD cards. So even 3-4 years down the road there are other factors that may stop Intel. Switching speeds for silicon have really met the limit; significantly higher speeds will require entirely new processes. Main memory is becoming faster and cache is getting bigger. In reality the designs have moved towards each other. RISC almost always has a moderately long pipeline (5-9 stages) with branch prediction (because almost no one goes back after writing code to eliminate or minimize pipeline stalls) and vector processors. CISC has an internal RISC instruction set and an interpreter sub-assembly to reduce the number of transistors needed in the main units.

    8. Re:Speed versus complexity by Anonymous Coward · · Score: 3, Insightful

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      Wait, which one lost?
      RISC lost because instructions took too much space and caused cache misses.
      CISC lost because it couldn't perform -- practically every CISC processor designed today is a RISC processor + instruction set translation.
      As is frequently the case between two pure ideas (that are both legitimate enough to be seriously considered long enough for a decent flame war), the winner is actually a clever but "impure" choice combining the merits of both.

      And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can.

      Yes, which is why ARM, despite/because of being far from a true RISC, does so well: it speaks multiple instruction sets to fit more code in cache, and does some weird -- but elegantly implemented -- extra stuff to do more with data coming off the bus. (I'm thinking specifically of the inline barrel shifter, but there's a couple other, less drastic bits of cleverness I'm not thinking of ATM.) Plus there are SIMD extensions in wide use (just like x86 -- not an advantage, but not a disadvantage).

      No reason a pure CISC instruction set from the 80s can't make the same performance, but only at the cost of added logic for interpreting instructions -- and more logic means more power.

      Now this next bit is just silly, and a red herring to boot (since neither ARM nor x86 are typically "massively parallel" at all), but I'll answer it anyway:

      Besides, look at Nvidia's GPU cores: They throw hundreds of cores onto the die, but it eats hundreds of watts as well. Massively parallel and simple instruction sets don't appear to translate into energy savings.

      OK, now spec an x86 processor (with full modern SIMD instructions, naturally) that can do the operations typically done on such a GPU at the same speed. Or rather, spec a multi-processor machine or cluster with n CPUs, since that's what you'll need... Got it? Now multiply the TDP of that CPU by n, and compare. Oh, it looks like the massively parallel and simple instruction set does translate into energy savings for the same parallel-friendly workload.

    9. Re:Speed versus complexity by Anonymous Coward · · Score: 1

      I think what people most often miss is why Intel gets dismissed as being not serious on power. This is all Intel's own fault.

      Why does Intel have Laptop, Desktop, Server (Xeon), and Atom (Low Power) chips at all? There should only be one. There's no ECC on desktop and laptop designs, thus preventing cheaper energy-efficient parts from being used in serious server configurations. So ARM is coming into this field http://www.engadget.com/2012/05/29/dell-test-deployment-arm-servers/ and http://www.engadget.com/2011/11/02/hp-and-calxedas-moonshot-arm-servers-will-bring-all-the-boys-to/ , as quoted here, 7-15 watts per chip, vs Xeon's 69-135. When you can have 64 quad-core CPUs in the same physical space as, at most, 8 Xeons, and in the same power envelope as 4 Xeons, Intel is going to start seeing significant losses down the road.

      It's not a question about chip cost, it's power usage. Data center power is at a premium (e.g. $600/mo for a 42U rack; if I can stick 128 CPU cores in that rack, my cost per CPU core is now $4.69/mo instead of $18.75/mo). The current going rate for virtualized CPU cores is between 8 cents an hour and $2.40/hr. Amazon EC2 is $57.60/mo per CPU core. I only get about 10 CPU cores at Amazon and none of the benefits of physical servers for my $600/mo. The same 128 core count would cost $7,372.80 at the lowest tier at Amazon EC2.
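
      The arithmetic behind these figures can be checked with a short sketch. Two things here are assumptions inferred from the numbers rather than stated outright: the 32-core baseline (600 / 32 gives the 18.75 figure) and the 30-day month (8 cents x 24 x 30 gives 57.60). Prices are kept in whole cents to avoid floating-point noise.

```python
# Rack vs. EC2 cost per core, using the figures quoted in the comment.
RACK_COST_CENTS = 600 * 100      # $600/mo for a 42U rack
EC2_CENTS_PER_HOUR = 8           # lowest quoted virtual-core rate
HOURS_PER_MONTH = 24 * 30        # assumption: 30-day month


def dollars_per_core(rack_cost_cents, cores):
    """Monthly rack cost divided evenly across the cores in the rack."""
    return rack_cost_cents / cores / 100


dense_rack = dollars_per_core(RACK_COST_CENTS, 128)   # 128 cores per rack
sparse_rack = dollars_per_core(RACK_COST_CENTS, 32)   # assumed 32-core baseline

ec2_core_month_cents = EC2_CENTS_PER_HOUR * HOURS_PER_MONTH
# Whole EC2 cores the same monthly budget would buy.
ec2_cores_for_budget = RACK_COST_CENTS // ec2_core_month_cents
# What 128 EC2 cores would cost per month.
ec2_128_cores_cents = 128 * ec2_core_month_cents

print(f"dense rack:  ${dense_rack:.4f} per core per month")
print(f"sparse rack: ${sparse_rack:.2f} per core per month")
print(f"EC2 core:    ${ec2_core_month_cents / 100:.2f} per month")
print(f"EC2 cores for the rack budget: {ec2_cores_for_budget}")
print(f"128 EC2 cores: ${ec2_128_cores_cents / 100:.2f} per month")
```

      The rack budget divided by the EC2 per-core price is about 10.4, so it buys 10 whole cores.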

      At this point it would make more sense to just use the ARM version of linux (I somehow doubt Microsoft is producing a Windows RT Server, but you never know) and run the exact same stuff that you'd normally run on the x86 servers. Minus the closed source software that has no ARM binaries.

    10. Re:Speed versus complexity by Darinbob · · Score: 4, Interesting

      Power-wise the argument is right. There's very little difference between the two instruction sets that makes one more power efficient than the other. However, in practice the difference is that most Intel x86 family chips are optimized for high performance (desktop) whereas most ARM chips are optimized for cost and efficiency (low power embedded systems, phones, etc). ARM probably has more experience in designing chips to be small, but as it ramps up into faster desktop or tablet oriented CPUs it is going to lose out more.

      It really does come down to software ultimately I think. Software needs to do minimal work if it wants to save power; stop checking the net every minute to see if there's an update, put the CPU to sleep when not in use, use interrupts instead of polling, do more in a compiled low level language and less in a byte code interpreted language or scripting language, keep things small, and don't let Microsoft touch you. As soon as you start demanding the ability to run MS Office then you are giving up on power savings.

    11. Re:Speed versus complexity by Anonymous Coward · · Score: 5, Informative

      The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.

    12. Re:Speed versus complexity by Darinbob · · Score: 4, Informative

      The instruction set decoder should be an absurdly tiny part, but in modern Intel processors they're not necessarily small. They're dynamically converting an archaic x86 instruction set into an internal RISC-like set.
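
      To make the idea concrete, here is a toy sketch of how a decoder might crack a register-memory instruction into load/store-style micro-ops. Everything in it (the `crack` function, the `tmp0`/`tmp1` temporaries, the tuple format) is made up for illustration; Intel's real micro-op encoding is not public and this is not meant to resemble it.

```python
def is_memory_operand(operand):
    """Treat anything written like [rbx] as a memory operand."""
    return operand.startswith("[") and operand.endswith("]")


def crack(op, dst, src):
    """Split one x86-style two-operand instruction into simple micro-ops.

    Each micro-op is a tuple (operation, destination, source) and touches
    memory only through an explicit load or store, RISC style.
    """
    uops = []
    if is_memory_operand(src):
        # Memory source: load it into a temporary first.
        uops.append(("load", "tmp0", src))
        src = "tmp0"
    if is_memory_operand(dst):
        # Memory destination: read, modify in a temporary, write back.
        uops.append(("load", "tmp1", dst))
        uops.append((op, "tmp1", src))
        uops.append(("store", dst, "tmp1"))
    else:
        uops.append((op, dst, src))
    return uops


for instruction in [("add", "rax", "rcx"),
                    ("add", "rax", "[rbx]"),
                    ("add", "[rbx]", "rax")]:
    print(instruction, "->", crack(*instruction))
```

      A register-register add stays a single micro-op, while an add with a memory destination comes out as a load, an add and a store.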

    13. Re:Speed versus complexity by Darinbob · · Score: 4, Insightful

      No one who had never seen x86 would design an instruction set like it has. It exists this way not because someone designed it from scratch but because it is the end result of a long series of backward-compatible decisions, stretching all the way back to the 4004. Every time Intel tries to start from a clean slate those CPUs do not take off or get enough time in the marketplace to prove themselves. The customers always demand that the new CPUs be able to run old software.

      It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones). I think this is precisely because backward compatibility is not necessary there.

    14. Re:Speed versus complexity by Olorion · · Score: 2

      That's why ARM has the compact Thumb instruction subset.

    15. Re:Speed versus complexity by Darinbob · · Score: 1

      Intel really is the last hold out for CISC, and I don't think it even wants to be in that position. It does create newer CPUs that are RISC based but the customers demand x86 compatibility in the desktop which is their cash cow. Everywhere else you see RISC dominating to a ridiculous degree. Sure there are a few 68000 based SoCs around, some people actually use 8051 or 8086 here and there, but they're such tiny parts of the market compared to PowerPC, ARM, PIC, AVR, MIPS, and so forth.

    16. Re:Speed versus complexity by BasilBrush · · Score: 2

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one.

      CISC did. ARM is RISC, and there are far more ARM chips in use in the world than X86 chips.

    17. Re:Speed versus complexity by Billly+Gates · · Score: 1

      Risc won.

      That Intel processor in your machine is not a CISC processor anymore. It merely takes ancient 8086 CISC instructions and internally translates them into RISC ones. This is how the Pentium Pro became a truly competitive beast against PowerPC in the 1990s. Sadly it also means you can't touch the hardware anymore if you write assembly, as it simply translates your instructions into RISC.

      Also the Core i series can predict the next instruction before it is even loaded and use math tricks, guessing along with compiler optimizations, to do the work without actually loading the instruction, to compensate for the 1/30th. This is how Intel is killing AMD as well, whose branch prediction can't accurately guess the next set of instructions and execute them before they even load. This is why a lot of extra bandwidth shows little performance gain unless it is a server with a large load.

    18. Re:Speed versus complexity by philip.paradis · · Score: 1

      parts that send bits of electrons back and forth

      Listen man, we don't all have CPUs that incorporate significant quantities of Sr2CuO3. Damn kids these days and their newfangled electrons, decaying into goshdarned component bits. It's just shameful.

      --
      Write failed: Broken pipe
    19. Re:Speed versus complexity by KingMotley · · Score: 1, Informative

      x86-64 has 16 (64-bit) general purpose registers, but ARM has 8 (32-bit) general purpose registers, and a few specialized ones, some of which are only available in certain operating modes. PowerPC and SPARC both have 32 64-bit registers but can only do register-register type operations (load/store) which quickly forces the registers to be cycled, while x86-64 can do register-memory type operands which is much more efficient.

      And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great part is that because of it, the instruction set appears CISC externally, but internally through micro-ops, it gets 95% of the benefit that you would see through RISC. Today's x86 chips are more a CISC/RISC hybrid than they are of a pure CISC design. And of course a 200MHz SPARC was worse in performance than the Pentium 2's of the day, and isn't really in the same performance league as today's 4GHz x86-64 processors.

    20. Re:Speed versus complexity by 10101001+10101001 · · Score: 5, Insightful

      There's nothing inherently "superior" about ARM or PPC instruction sets.

      The GP didn't say anything of the sort. He was pointing out that to say "CISC won" is only true if you consider that x86 is CISC and that Intel spent gobs of money to be at the forefront of CPU manufacturing technology, both in shrinking die size/increasing clock speed and in shoehorning all the negative characteristics of the x86 design into a form that was more RISC-like so it could allow for super-scalar and deep pipeline designs. Intel deserves a lot of credit in proving just how far CISC design can go. But it certainly wasn't that CISC won because it had greater strengths.

      Is x86, possibly, more inelegant than ARM or PPC? Maybe. Then again, what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

      Sounds like Linux on the x86, actually. Seriously, though, RISC design tends to have a few very strong design elements: it tends to have a good many registers which absolves a lot of cache/stack work, it tends to have a fixed opcode size and requires aligned memory which usually improves throughput and allows for a much more streamlined instruction decoding engine, and precisely because there's a lot less need to support legacy platforms there's a lot more leeway to segment memory for power considerations.

      x86 may be ugly and hackish. But it's probably THE best documented platform in history and has very VERY few platform segregation points.

      Well, you can thank MS's monopolistic actions for that. Seriously, "ugly and hackish"* might well describe near everything MS and Intel can be known for, in their quest to maintain backwards compatibility. And if Intel had started out with an 8-bit RISC design, I'm certain there'd be the same problems, so it's not really an x86/CISC thing. Nevertheless, it's precisely the fact that Intel is unlikely to allow platform segregation points that x86 will probably never be low power.

      *And please realize, I say this with a great deal of respect towards both Intel and MS in maintaining performance given how many hacks they've put in over the years to compensate for not only their own bugs but the bugs of other developers. So, as pretty and clever as a lot of the hacks may be, it's still ugly overall to have the hacks in the first place and to have so many over so many places and to be so incapable of removing any without the risk of significant backlash or simply losing their customer base. I.e., the code may be pretty but it's put them in an ugly place.

      --
      Eurohacker European paranoia, gun rights, and h
    21. Re:Speed versus complexity by yakovlev · · Score: 5, Interesting

      For a "modern" CPU the instruction decoder is an absurdly tiny part. This is because the branch prediction, caches, issue queue, regfiles, etc. are all much larger or at least the same size.

      This isn't nearly so true in a super-low-power mobile design. The instruction decoder size for a given instruction set architecture is pretty much a fixed size per decode pipe. This means that in one of these tiny mobile chips the relative size of the decoders is dramatically larger. A super-low-power chip dramatically reduces the sizes of the caches and branch prediction, reduces the size of the regfiles, and often eliminates the issue queue. It probably also removes a decode pipe, but the relative reduction in decode size is much smaller than the relative size reduction in other areas.

      The limited register set absolutely hurts x86 on power usage, perhaps more than the decoders do, since it forces more data cache accesses for register spills and fills.

      Now, I'm not saying that x86 is necessarily worse than arm on power usage, as the richer instruction set may have other advantages such as reducing instruction cache miss rate which can be used to improve IPC which can be spent to lower frequency and reduce power. Also, microcoded instructions may turn out to be more power efficient because they don't have to access the instruction cache every cycle.

      None of this considers the fact that Intel has the best fab technology in the world. This means their processors will be a generation more efficient than everyone else's, which is probably more than enough to counter any "x86 tax" which the instruction set incurs.

    22. Re:Speed versus complexity by Tough+Love · · Score: 2

      Intel won the CPU wars because of manufacturing, not because of a superior instruction set. They are always able to get a smaller manufacturing process.

      When Intel was up against the 68000 they outperformed it at the same process size because of more compact instructions. This happened again with RISC which relied on the suboptimal premise that saving transistors in the processor trumps memory bandwidth and cache efficiency. Fail.

      ARM fixed that issue with its Thumb instruction set, a 16 bit instruction encoding without which Intel surely would have squashed it too. To be sure, Intel has a few single byte instructions, mainly register inc/decs, but in general its encoding efficiency is roughly similar to ARM Thumb. Intel can't win on that point. What Intel has working against it is a huge rambling legacy instruction set it has to support. For example, AAD still has to work properly, which can be pushed out to microcode but that still costs transistors, and how many flavors of SIMD now? And three different addressing sizes... yes 16 bit addressing still has to work. And real mode too with its implied 4 bit segment register shifts, that's still in there. And all kinds of weirdo cruft from back in the days when Intel had no idea how to design a memory protection model. All that costs a *lot* of transistors, and Intel isn't going to get that monkey off its back with a few rah-rah interviews.

      And the question of how Intel is going to keep its margins, and therefore its stock price up while competing with ARM on price is far from answered. Predatory pricing might work, but that's just asking for those nice guys from the DoJ to come sniffing around again.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    23. Re:Speed versus complexity by hairyfeet · · Score: 5, Interesting

      While this is true, frankly it's only been fairly recently that either AMD or Intel gave a crap about power, and look at how far they've come. Intel has gotten Atoms to less than 3w, AMD has gotten dual cores AND a decent GPU down to 9w in the C Series Bobcats, and of course the Intel CULV Core chips can do a scary amount of processing; I believe their latest are sub 10w. Now according to ARM their A9 duals are just a hair under 2w, and that of course isn't counting the chips like the hardware decoders typically found with ARM because it just can't match the IPC of X86.

      So I'd say as folks demand more and more performance out of their mobile devices the advantage will probably swing to Intel. They have the fabs and have been able to shrink quicker than anybody else, so 1 or 2 more shrinks and it's gonna be pretty damned close, and with such a huge IPC difference between X86 and ARM in a damned close race I'm sure many would rather have the faster Intel chip. AMD will most likely be stuck at the niche they are in now, at least as long as they stick with the faildozer "half a core" design, so that just leaves Intel and ARM, and with the money, the fabs, and the R&D budget that Intel has I think it'd be crazy to call it for ARM at this stage of the game.

      After all it wasn't too long ago that everyone was making netburst space heater jokes and look how quickly that situation changed. I seriously doubt Intel is gonna sit this one out and when looking at their past record there is no reason to think they can't make a chip that'll compete.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    24. Re:Speed versus complexity by Tough+Love · · Score: 1

      what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

      Transistor efficiency.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    25. Re:Speed versus complexity by Anonymous Coward · · Score: 3, Interesting

      The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.

      Intel is producing x86 SoCs (medfield) and yes, they are not PC compatible.

    26. Re:Speed versus complexity by Pulzar · · Score: 2

      but ARM has 8 (32-bit) general purpose registers, and a few specialized ones, some of which are only available in certain operating modes

      That's not correct.

      http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0245a/index.html

      2. Register set

      The ARM register set consists of 37 general-purpose registers, 16 of which are usable at any one time. The subset which is usable is determined by the current operation mode.

      --
      Never underestimate the bandwidth of a 747 filled with CD-ROMs.
    27. Re:Speed versus complexity by phantomfive · · Score: 2

      The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

      When it was originally designed, it didn't matter much because memory accesses weren't much slower than register accesses, so people did arithmetic directly from RAM. It was more convenient that way. As a result, x86 has a lot of interesting, convenient addressing modes, which were really great when it was built.

      In a modern computer, RAM is significantly slower than registers, so having more registers can give you a large performance boost. There are other issues, but that's the most obvious. If you read the article, Bell keeps on going back to the manufacturing process as Intel's main advantage. He says things like, "our competitors are going to have trouble making it to the 9nm scale." That's where their advantage is, and he knows it.

      But it's probably THE best documented platform in history and has very VERY few platform segregation points.

      The question isn't whether it's well documented, the question is how well it performs in power/performance. No one ever said, "x86 is better documented than ARM, so let's put Intel Inside." And the ARMARM is fine documentation whenever I've needed it.

      --
      "First they came for the slanderers and i said nothing."
    28. Re:Speed versus complexity by gweihir · · Score: 1

      The Nvidia example is not convincing. Nvidia has to produce very fast chips in a limited time-frame in order to maintain their image and market share. There is no time for power optimization at all. For the right workload, though, Nvidia (or AMD) GPUs are massively better than x86 CPUs in computing power per unit of power consumed. Just look, for example, at breaking encrypted passwords. You get speed-ups of 100...1000 compared to a normal PC CPU, while power consumption is only 1...10 times that of the CPU.
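
      Put as arithmetic, those ranges bound the performance-per-watt gain. A quick sketch, using only the speed-up and power ranges quoted above (the function name is just illustrative):

```python
def perf_per_watt_gain(speedup, power_ratio):
    """How much more work per watt the GPU does relative to the CPU.

    speedup:     GPU throughput divided by CPU throughput
    power_ratio: GPU power draw divided by CPU power draw
    """
    return speedup / power_ratio


# Worst case for the GPU: smallest speed-up at the largest power ratio.
worst_case = perf_per_watt_gain(speedup=100, power_ratio=10)
# Best case for the GPU: largest speed-up at the smallest power ratio.
best_case = perf_per_watt_gain(speedup=1000, power_ratio=1)

print(f"GPU does between {worst_case:.0f}x and {best_case:.0f}x the work per watt")
```

      So even at the pessimistic end of both ranges the GPU comes out an order of magnitude ahead per watt on that kind of workload.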

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    29. Re:Speed versus complexity by phantomfive · · Score: 1

      And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great part is that because of it, the instruction set appears CISC externally, but internally through micro-ops, it gets 95% of the benefit that you would see through RISC. Today's x86 chips are more a CISC/RISC hybrid than they are of a pure CISC design.

      It does that, but at the cost of power. The question is whether they can get the same performance while cutting power. Intel says they can. We'll see.

      --
      "First they came for the slanderers and i said nothing."
    30. Re:Speed versus complexity by naasking · · Score: 4, Interesting

      There's nothing inherently "superior" about ARM or PPC instruction sets.

      Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die. That's real estate that could be spent on bigger cache or more registers. ARM is a much better instruction set by comparison.

    31. Re:Speed versus complexity by Anonymous Coward · · Score: 1

      Do you know why MS made NT for those RISC? It's because those other companies paid MS to do it.

      No, they didn't. Moreover MS continues to develop the Windows CE product for x86, MIPS and ARM.

    32. Re:Speed versus complexity by phantomfive · · Score: 1

      Yes, the full pipeline, branch prediction, etc, is a lot bigger than the core instruction decoder. The point is, a complicated instruction decoder makes the pipeline circuitry that much larger.

      --
      "First they came for the slanderers and i said nothing."
    33. Re:Speed versus complexity by bzipitidoo · · Score: 5, Informative

      x86 is ugly. It's one of the most screwed up, inconsistent, crufty architectures ever created. Motorola's 68000 architecture was a lot cleaner. But Intel, through sheer brute force, has managed to patch up many of its shortcomings and make x86 perform well in spite of itself.

      They went with a load and execute architecture for the x86 instructions. Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip. I still remember the week when 80387 prices collapsed from $600 to $200, and still no one would buy, not with free emulators and the 486DX nearing release. Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes. In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.
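
      The 20-bit case is the classic real-mode calculation: the 16-bit segment is shifted left by 4 bits and added to the 16-bit offset, so 12 bits of the two overlap. A minimal sketch (the function name is mine; the wrap at 1 MB matches the original 8086, and later chips with the A20 line enabled behave differently):

```python
def real_mode_address(segment, offset):
    """Physical address for a real-mode segment:offset pair on an 8086."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment by 4 bits, add the offset, keep 20 address bits.
    return ((segment << 4) + offset) & 0xFFFFF


# Many different segment:offset pairs name the same physical byte.
print(hex(real_mode_address(0x1234, 0x0010)))
print(hex(real_mode_address(0x1235, 0x0000)))
# The top of the address space wraps back around to zero.
print(hex(real_mode_address(0xFFFF, 0x0010)))
```

      The first two calls give the same physical address, which is part of what made the scheme such a pain to work with.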

      x86 has been bashed for years for not having enough registers. And for making them special purpose. For instance, only one, AX, can be used for integer multiplication. Ask some compiler designers about the x86 sometime. Bet you'll get an earful.

      Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes. Then there's junk like the ASCII adjust and decimal adjust instructions: AAA, AAS, AAD, and AAM, and DAA, and DAS. Nobody uses packed decimal any more! And hardly anyone ever used it. Those instructions were a crappy way to support decimal anyway. If they were going to do it at all, should have just had AA for ASCII Add instead of "adjusting" after a regular ADD instruction. Then there's the string search instructions, REPNE CMPSW and relatives. They're hopelessly obsolete. We have much better algorithms for string search than that. They also screwed up the instructions intended for OS support on the 286. That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.
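
      For anyone who never had to touch them, this is roughly what "adjusting" after a regular ADD means for packed decimal. The sketch below models a byte-wide ADD followed by DAA as I understand the documented behaviour; the function name is illustrative, and it tracks only the carry and auxiliary-carry flags, so treat it as a sketch rather than an emulator.

```python
def add_packed_bcd(a, b):
    """Add two packed-BCD bytes (two decimal digits each), x86 style.

    Returns (result_byte, carry_out). Models a binary ADD into AL
    followed by the DAA adjustment step.
    """
    # Plain binary ADD: 8-bit result, carry flag, auxiliary carry flag.
    raw = a + b
    al = raw & 0xFF
    carry = raw > 0xFF
    aux_carry = ((a & 0x0F) + (b & 0x0F)) > 0x0F

    # DAA: fix up each decimal digit that overflowed past 9.
    old_al, old_carry = al, carry
    if (al & 0x0F) > 9 or aux_carry:
        al = (al + 0x06) & 0xFF
    if old_al > 0x99 or old_carry:
        al = (al + 0x60) & 0xFF
        carry = True
    return al, carry


result, carry = add_packed_bcd(0x38, 0x45)
print(f"38 + 45 -> {result:02X}, carry={carry}")
result, carry = add_packed_bcd(0x99, 0x01)
print(f"99 + 01 -> {result:02X}, carry={carry}")
```

      The binary add of 0x38 and 0x45 gives 0x7D, which is not a valid decimal digit pair; the adjustment is what turns it into 0x83.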

      You might be tempted to think x86 was good for its time. Nope. Even by the standards and principles of the 1970s, x86 stinks.

      Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density. Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.

      That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

      --
      Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
    34. Re:Speed versus complexity by hairyfeet · · Score: 3, Interesting

      But that is like saying "Linux beats Windows if you count routers", which is probably true, except in both cases you are talking about tiny low-margin embedded parts that, while they might look good as a numbers game, frankly aren't a market one should be chasing. It's like how Apple doesn't make any of the low-rent stuff yet is the richest company last I checked: you get that way by getting the good profit markets, not the low-end crap ones.

      In the end what will probably matter the most is money, and despite the bigger numbers for embedded ARM, Intel has it and ARM doesn't. Money gets you fabs, money gets you R&D, and frankly with the amount of both Intel has, it can simply win by continuing on the current path. Both the CULV chips and the Atoms are dropping in power draw every rev, the IPC Intel has been getting with each new chip is just insane, and everyone wants their mobile devices to do ever more stuff; all of that plays straight into Intel's hands.

      So I wouldn't be counting them chickens because after all ARM could keep every router and PMP on the planet and STILL lose, because if Intel can deliver sub 5w chips that can do HD video and play games and do all the other things folks want to do? I seriously doubt it'll be hard for them to sell it to the masses. Hell most folks don't even know what ARM is but they have sure seen plenty of "bong bong ba bong" Intel inside commercials.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    35. Re:Speed versus complexity by timeOday · · Score: 1

      In the summary Intel's "mobile chief" says: 'I see no data that supports the claims that ARM is more efficient,' so where's the evidence either way? Has somebody counted the watt-hours to compress an mp3 on various chips (with bit-identical results), or something like that? A quick googling did not find it for me.

    36. Re:Speed versus complexity by dgatwood · · Score: 5, Informative

      Three watts isn't even close to usable for a mobile phone. At that level of power consumption, you would either have to charge your phone every half hour (by the time you add in the chipset consumption) or build a phone that looks like one of those old portable phones from the 1980s with the small suitcase attached....

      Intel's latest Atom offerings, however, claim to draw about two orders of magnitude less power than that at idle, and are thus in the ballpark for being usable for phones and similar devices. It remains to be seen who will adopt it.

      BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    37. Re:Speed versus complexity by rev0lt · · Score: 1

      Intel deserves a lot of credit in proving just how far CISC design can go.

      Intel also deserves credit for some RISC processors, namely the i860 and the i960. A lot of the effort of the RISC division was later backported to their mainstream CPUs.

    38. Re:Speed versus complexity by rev0lt · · Score: 1

      But no div instruction for all? Since the early nineties even microcontrollers have a div instruction...

    39. Re:Speed versus complexity by Dahamma · · Score: 5, Informative

      And the most insightful post of the thread is from an AC... if you had posted non-AC I might have modded you up ;)

      It also points out how the GP post talking about slow off-die IO is way overrated and really not all that relevant to the mobile/embedded space.

      ARM is winning the embedded STB/TV/BD/phone wars because their core is tiny and integrates well in SoCs. Many of these SoCs have graphics, Ethernet, Wifi, USB, SATA, HW crypto, MPEG decoding, etc all on die, on a $10-20 part. Intel may have something a bit faster, but they don't have anything close in overall features for that price.

    40. Re:Speed versus complexity by KingMotley · · Score: 3, Informative

      Yeah, right. 37 of which you can only ever use at most 16. Of which, 5 are taken up already, and personally I wouldn't call the flags register a general purpose register, nor the stack pointer, etc, but apparently they do, lol. Also, look down at the nice graph right below your quote, you will also notice that during Fast Interrupt Routines, you have only 5 registers free to use (R8-R12), and during user mode, you only have 13 (R0-R12) free for your use, and during an IRQ you have 0 free? lol.

      So you have R0-R7 which are what most would consider general purpose, R8-R12 are special and only available in certain operating modes, and R13,14,15 aren't what most would consider general purpose.

    41. Re:Speed versus complexity by tibit · · Score: 2

      I think that on a modern x86 implementation, with the CISC instructions you can use about a cacheline worth of BP-relative RAM just as if it were registers. It's no slower than using registers, or so it seems. There's some instruction rewriting going on that makes it so, I bet.

      --
      A successful API design takes a mixture of software design and pedagogy.
    42. Re:Speed versus complexity by JDG1980 · · Score: 1

      The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

      On what basis do you exclude ebp, esi, and edi from consideration as general-purpose registers? It's been a while since I did any serious assembly programming on x86, but as far as I remember, the only real limitation on them is that you can't access them in 8-bit chunks. (And there are specific register requirements for the string instructions, but does anyone even use those any more?) So I'd say x86 has 7 general-purpose registers, not 4. Granted, it is a limitation, and one area where x86 does fall behind some other architectures.

      However, it's worth pointing out that x86 running in 64-bit mode has extra general-purpose registers, bringing it up to a total of 15. And you also get MMX and SSE registers on top of that on all newer x86 architectures.

    43. Re:Speed versus complexity by KingMotley · · Score: 1

      That's totally believable since the energy required for those things is (relatively) very small especially compared to the total system power, while you have to increase the memory bandwidth on ARM because of the larger fixed sized instruction set and higher number of instructions to achieve the same performance. So the higher CPU power usage is offset by not requiring a faster (and more power hungry) bus, a higher frequency execution unit, memory controller and memory.

      But we'll have to wait and see.

    44. Re:Speed versus complexity by phantomfive · · Score: 3, Insightful

      In that case, it would mean the CPU is doing the optimization instead of the compiler. I am unfamiliar with that particular optimization, but it sounds like a good idea.

      Unfortunately every time you add circuitry like that, you also increase power consumption. Which is where difficulty comes in for Intel, when it's trying to make the tradeoff between power consumption and performance.

      --
      "First they came for the slanderers and i said nothing."
    45. Re:Speed versus complexity by rev0lt · · Score: 1

      The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

      Not even Intel. Intel RISC processors have a lot more registers, and while you can also poke holes in their technology, they were the basis of modern Intel processors. The register limitation is more of a backward-compatible move than anything else.

      In a modern computer, RAM is significantly slower than registers, so having more registers can give you a large performance boost

      Yes and no. RAM is probably an order of magnitude slower than registers, but it is where your code (the code messing around with registers) lives. Modern processors also have a shorter pipeline than the pesky P4, so they usually have a smaller instruction cache. It all boils down to L1/L2 cache, and some schedulers go to great lengths not to trash it, and to have your code right there when the processor needs it.

      The question isn't whether it's well documented, the question is how well it performs in power/performance.

      The question is, can you have a baseline performance index for your product, given an arbitrary vendor? You can't. Not only is there a lot less fragmentation if everything comes from the same provider, but you can only provide consistency if the vendor is the same.

    46. Re:Speed versus complexity by rev0lt · · Score: 1

      Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die

      CISC instructions don't take "cpu die" at least since a decade ago. Modern processors translate whatever the opcodes are to "micro-ops" that are executed on the multi-lane, out-of-order (except Atom) RISC pipeline.

    47. Re:Speed versus complexity by phantomfive · · Score: 1

      Truly it's dangerous to bet against Intel. They have solid people.

      --
      "First they came for the slanderers and i said nothing."
    48. Re:Speed versus complexity by phantomfive · · Score: 1

      lol ok, we can make it seven. No one in their right mind would design a chip like that today.

      It's true x86 running in 64-bit mode has extra registers, and Intel really tried to go crazy on registers with Itanium, but are they really going to try to stick a 64-bit processor in a phone?

      --
      "First they came for the slanderers and i said nothing."
    49. Re:Speed versus complexity by tibit · · Score: 1

      Agreed.

      --
      A successful API design takes a mixture of software design and pedagogy.
    50. Re:Speed versus complexity by rev0lt · · Score: 4, Informative

      Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip.

      ... The same as Motorola (http://en.wikipedia.org/wiki/Motorola_68881). They began to integrate an FPU about the same time (68040/486DX).

      Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes.

      You mean, having a 16-bit CPU support a FULL MEGABYTE instead of the usual 64KB of RAM? In 1979? Pure evil.

      In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.

      You do know that XMS memory is just linear memory above real memory, right? And that EMS was just a PC-compatible paging memory layout, right? Because you seem to lack basic understanding of the architecture.

      Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes.

      Thank god. I can still run FreeDOS.

      They're hopelessly obsolete. We have much better algorithms for string search than that.

      While the instructions you mentioned are used for string comparison, that's not their sole purpose. They compare bytes, not strings.

      We have much better algorithms for string search than that.

      Please do tell. Because null detection in a couple of opcodes isn't something easy to come by.
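      As a rough illustration of the "null detection in a couple of opcodes" point, this Python sketch models what a REPNE SCASB loop does when used to find the end of a C string: scan memory byte by byte, decrementing a counter, until a byte equals AL or the counter hits zero. The function names and the (di, cx, zf) return convention are my own simplification, not any real API.

```python
def repne_scasb(memory: bytes, di: int, cx: int, al: int) -> tuple[int, int, bool]:
    """Model of REPNE SCASB (forward direction only).

    Compares al against memory[di], advancing di and decrementing cx,
    until a match is found or cx reaches zero. Returns (di, cx, zf),
    where zf is True if the last compared byte matched.
    """
    zf = False
    while cx != 0:
        zf = memory[di] == al
        di += 1
        cx -= 1
        if zf:
            break
    return di, cx, zf


def strlen(memory: bytes, start: int) -> int:
    """Length of the NUL-terminated string at memory[start:]."""
    di, _cx, zf = repne_scasb(memory, start, len(memory) - start, 0)
    if not zf:
        raise ValueError("no NUL terminator found")
    # di points one past the NUL, so back up by one.
    return di - start - 1


if __name__ == "__main__":
    buf = b"hello\0world\0"
    print(strlen(buf, 0), strlen(buf, 6))
```

      On real hardware the whole loop body is the single prefixed instruction; the Python loop just spells out what the microcode does per byte.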

      They also screwed up the instructions intended for OS support on the 286.

      If you are talking about the MMU, they didn't screw up. Nobody cared about 16-bit support.

      That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.

      Nobody cared about the i386 MMU either, up to Windows 3.0. That's why early versions of the 386 were buggy as hell (such as skipping the first GDT entry: yup, it's a 386 bug, not a feature).

      Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density

      That is the first thing in your comment that is spot on.

      Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.

      ...And then you lose it. Vector instructions were an FPU feature (the 487 ITT had it), and Intel also had a peek with their RISC CPUs, i860/i960. With the advent of DSPs, this kind of technology became even more common.

      That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

      Older programs will run faster on new CPUs. In many cases, they won't take advantage of SSE at all if both the algorithm and the compiler aren't optimized for the use of those instructions.

    51. Re:Speed versus complexity by Tough+Love · · Score: 1

      But no div instruction for all?

      Right, that's pretty extreme, isn't it? But there's a floating point divide. I don't know about you, but my code has precious few integer divides in the hot path. Still, even a multi-cycle divide implemented in microcode would be better than a subroutine.
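      For reference, the kind of subroutine a core without a divide instruction falls back on looks roughly like this: a textbook shift-and-subtract (restoring) division, one quotient bit per iteration. This is a generic sketch in Python, not the routine any particular ARM toolchain ships; the name udiv32 is made up.

```python
def udiv32(dividend: int, divisor: int) -> tuple[int, int]:
    """Unsigned 32-bit division by shift-and-subtract.

    Returns (quotient, remainder). Illustrative sketch of the classic
    restoring-division loop: 32 iterations, one quotient bit each.
    """
    if not (0 <= dividend <= 0xFFFFFFFF and 0 <= divisor <= 0xFFFFFFFF):
        raise ValueError("operands must be unsigned 32-bit values")
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


if __name__ == "__main__":
    print(udiv32(100, 7))
```

      Thirty-two trips around that loop per divide is why even a slow hardware or microcoded divider tends to win.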

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    52. Re:Speed versus complexity by rev0lt · · Score: 1

      The problem is, a quad-core i5 probably consumes less per core than a dual-core atom consumes per core, but will have much better performance per core. If you increase cache and clockspeed, you will also need more power. That's why Intel is investing heavily in increasing density (reducing consumption) instead of going after the GHz advantage.

    53. Re:Speed versus complexity by Animats · · Score: 1

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      The real question there is, do you want to go superscalar? Sequential RISC CPUs are simpler than sequential CISC CPUs, but once you have pipelines and multiple execution units, there's so much added complexity and transistor count that the difference disappears. If you're willing to have a slow RISC CPU, the transistor count can be quite low. Down at the bottom, where there's on-chip memory and no cache, as with Atmel ATMega CPUs, the transistor count is really low.

      ARM started down there, but it's been built up to a serious level of computing power, with multiple cores, layers of caches, MMUs, and all the stuff of a desktop CPU. Once you have all that stuff, the instruction decoder part is a tiny fraction of the transistor count.

      (Of course, x86 superscalar machines require the retirement unit from hell to manage all the hard cases, like self-modifying code. I still remember going to the talk where the Intel guy in charge of the Pentium Pro program explained how they did that. 3000 engineers on the design team at peak. One of the people at that talk was the guy who designed the Intel 8051 mostly by himself.)

    54. Re:Speed versus complexity by rev0lt · · Score: 1

      Also the icore series can predict the next instruction before it is even loaded

      That is a decade-old trick (badly) done in P4, because it led to costly cache invalidations. It's not like "trying to guess", but more like having the predictable execution paths already in cache.

    55. Re:Speed versus complexity by Bert64 · · Score: 1

      None of this considers the fact that Intel has the best fab technology in the world. This means their processors will be a generation more efficient than everyone else's, which is probably more than enough to counter any "x86 tax" which the instruction set incurs.

      So build up an advantage on manufacturing process, but instead of producing cpus as efficient as anyone else's but also on a more efficient process, pushing the cutting edge forwards and giving customers the absolute best product they can... They throw away the advantage to consumers by hobbling themselves with an inferior instruction set, so customers miss out on the advantages that the smaller process should bring.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    56. Re:Speed versus complexity by Bert64 · · Score: 2

      What killed them was the binary-only nature of most Windows software...
      At one time, MIPS, PPC and Alpha were all considerably faster than x86, but except for a few specialist applications none of the existing Windows software ran on them, making the hardware utterly useless.

      There is no incentive for a software developer, especially a commercial one to port to a platform with very few users, and there is no incentive for an end user to buy a platform for which there is no software.

      I don't remember seeing NT for PPC or MIPS being used anywhere, and I only ever saw Alpha or IA64 versions being used for 3d renderfarms and sql databases...

      PPC, MIPS and Alpha all saw somewhat more success as Linux servers (IBM still sell POWER based boxes with Linux and there are still embedded MIPS boxes around), because with the vast majority of applications coming with sourcecode you are not beholden to the original author to port them.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    57. Re:Speed versus complexity by Bert64 · · Score: 3, Interesting

      If you read the article, Bell keeps on going back to the manufacturing process as Intel's main advantage. He says things like, "our competitors are going to have trouble making it to the 9nm scale." That's where their advantage is, and he knows it.

      So basically he has a more efficient engine, but rather than give customers a more efficient car he adds lots of unnecessary weight that provides no benefit to users, so that the overall package isn't any better than what everyone else is offering.

      If he put that more efficient engine, in a car as lightweight as everyone else's then customers would benefit from a superior product.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    58. Re:Speed versus complexity by phantomfive · · Score: 1

      Yes. Your post basically summarized the main problem underlying the philosophy of intellectual property. That kind of thing happens all the time.

      --
      "First they came for the slanderers and i said nothing."
    59. Re:Speed versus complexity by hairyfeet · · Score: 2, Interesting

      You and the other poster seem to be forgetting ONE thing, which is nobody gives a shit how low the power draw is if it can't do what they want and what people WANT is MOAR, MOAR HD, MOAR games with MOAR graphics, MOAR MOAR MOAR.

      We must have a LOT of puppies using AC accounts here now, because they sure don't remember their history. this is the EXACT SAME THING we went through on X86, with people happy for years with weak ass chips and then suddenly everybody wanted sound, then they wanted games, then video, then HD video, just one after another. people now want to be able to do a hell of a lot more than just email with their mobile devices, they want to watch videos, play the latest games, use it as a PMP, and ALL of that takes MOAR POWER.

      ARM isn't "magical" and its been needing more and more extras like HD decoder chips to share the load because it simply can't get enough IPC to do the job on its own. That problem is only gonna get worse, as screens get bigger and HD, games get more complex and have better graphics, and the same fact is when it comes to doing all those tasks Intel can simply stomp ARM when it comes to IPC.

      So i have NO doubt ARM is gonna run into a wall, just as X86 ran into the heat barrier at 4GHz only with ARM it'll probably be the "too many cores" problem. We are already seeing 4 and 5 core ARM chips, trying to squeeze more performance out of the chip, but you can only go so far that way. The simple fact is the amount of IPC one gets on even the lowest end Atom now is truly insane compared to ARM. Look at the benches and you'll see the Atom for smartphones does a hell of a lot more for equal or less power than ARM. And remember this is just their first try, imagine what they will have after a couple of shrinks and a refresh?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    60. Re:Speed versus complexity by msgmonkey · · Score: 3, Interesting

      Any superscalar processor is going to be doing instruction conversion, and this includes RISC instruction set processors. The micro-ops that Intel processors convert instructions into are simpler than RISC instructions. Once you start implementing things like Tomasulo, the traditional advantages of RISC are eroded. If this wasn't the case, Intel would never have been able to leverage their process advantage to get better performance whilst retaining the x86 instruction set.

      In a high performance processor the instruction set is irrelevant, since 80%+ of the die area is cache anyway.

    61. Re:Speed versus complexity by Anonymous Coward · · Score: 2, Insightful

      Special operating modes in which R8-R12 aren't available are limited to certain exceptionally-low-latency-interrupt code, or in some supervisor modes. So unless you're actually in the code bridging to your kernel, or you're in an interrupt, R8-R12 are available for use. Even in those 'other modes' R8-R12 are available for use if you preserve them so that you don't trash over whatever regular code was running.

      If we don't consider the mode-switching stuff sanely, x86 has NO general purpose registers since in some operating modes (like handling interrupts) you have to preserve their contents before using them! How awful, you have to do that for 4 out of 16 registers on ARM since they don't switch them in hardware for you.

      Incidentally, there is no such thing as a stack pointer register on ARM. You can use the stack operations on any R0-15 register. Common C compilers use an ABI such that R13 is reserved for use as a stack pointer, but that's not an architectural requirement. There's nothing different between R13 and the other registers R0-12 - all the same instructions behave in all the same ways.

      R14 and R15 are certainly 'special', but you can still use them in virtually any instruction and any addressing mode.

      So for points of comparison, in normal operation, I'd claim that x86 has 4 general purpose registers, 4 effectively reserved for addressing magic since you're limited in what instructions can operate on them, and some others. On ARM, I'd claim 13 general purpose registers, with 2 that were special, and some others.

    62. Re:Speed versus complexity by Anonymous Coward · · Score: 1

      CISC instructions don't take "cpu die" at least since a decade ago. Modern processors translate whatever the opcodes are to "micro-ops" that are executed on the multi-lane, out-of-order (except Atom) RISC pipeline.

      ...and they do this with magic pixie dust, not transistors like those old fashioned competitors.

    63. Re:Speed versus complexity by msgmonkey · · Score: 2

      And there are far more 8-bit and 16-bit CPUs that use CISC instruction sets than ARM chips. The quantities mean nothing, it is who is making the most money and Intel certainly won that one.

    64. Re:Speed versus complexity by Jonner · · Score: 1

      You speak of CISC vs. RISC as if it's in the past. However, that's exactly the competition that is now heating up in the form of x86 vs. ARM. ARM and other RISC designs have dominated in power efficiency for many years. Now, Intel is attempting some serious competition with a low power x86 design. Neither RISC nor CISC has won or lost and it seems the competition is only going to get more intense.

    65. Re:Speed versus complexity by IntlHarvester · · Score: 2

      NT/Alpha at least had an emulator that worked well.

      What killed all these RISC PCs was the Intel Pentium Pro chip. It offered 90% of the performance of the DEC Alpha chip without any of the downsides of running some weird platform. (keep in mind NT was 32-bit only) In retrospect, the RISC PC groups had more prescience than the server crowd. They saw the writing on the wall and got the fuck out quickly.

      And you have to laugh at trolls knocking MS for not making software for dead platforms. As if Windows is the same as NetBSD or something.

      --
      Business. Numbers. Money. People. Computer World.
    66. Re:Speed versus complexity by KingMotley · · Score: 1

      during user mode, you only have 13 (R0-R12) free for your use

    67. Re:Speed versus complexity by gnasher719 · · Score: 2

      The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

      x86 has eight general purpose registers. In 64 bit mode, it's 16 general purpose registers. Plus 16 vector registers of 256 bit each, holding 64 double precision, or 128 single precision floating point numbers, or up to 512 bytes. (That's the current versions).
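      The vector register totals above are easy to check. A quick sketch of the arithmetic, assuming 16 registers of 256 bits each as stated, with 64-bit doubles and 32-bit singles:

```python
# Assumed figures from the comment above: 16 vector registers, 256 bits each.
REGISTERS = 16
BITS_PER_REGISTER = 256

total_bits = REGISTERS * BITS_PER_REGISTER
total_bytes = total_bits // 8          # 8 bits per byte
doubles = total_bits // 64             # 64-bit double precision values
singles = total_bits // 32             # 32-bit single precision values

if __name__ == "__main__":
    print(total_bytes, doubles, singles)
```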

    68. Re:Speed versus complexity by sFurbo · · Score: 1

      AFAIK, you are mostly correct. For simple programs, it is very much possible to read ahead and load the correct data. The problem arises with things like "if" statements, where you don't know which branch is going to be taken before it is taken, which can cause misses, which do take a long time to fix (the relevant parts need to be retrieved from RAM). I'm sure they are much more advanced than that today, and bigger caches will also help, as you can load both branches, but that is the general idea.
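      To give a feel for how hardware guesses at "if" statements, here is a toy simulation of a 2-bit saturating counter, a classic textbook branch predictor. This is only an illustration of the general idea; I am not claiming it is what any particular Intel or ARM core uses, and the function name is made up.

```python
def count_mispredictions(outcomes: list[bool], state: int = 2) -> int:
    """Simulate a single 2-bit saturating-counter branch predictor.

    state ranges 0..3; the branch is predicted taken when state >= 2.
    outcomes lists the actual branch results (True = taken).
    Returns how many predictions were wrong.
    """
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


if __name__ == "__main__":
    loop_branch = [True] * 9 + [False]   # loop back-edge: taken 9 times, then exit
    print(count_mispredictions(loop_branch))
```

      A loop branch like the one above is predicted well, while a branch that flips direction on every execution defeats this simple scheme about half the time.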

    69. Re:Speed versus complexity by unixisc · · Score: 2

      The reason MS made NT for RISC was not b'cos RISC vendors paid them anything, as the GGP alleged. SGI, for instance, which owned MIPS, was still very solidly in the Unix camp, and made only one MIPS workstation for NT. DEC's initial Turbochannel Alphas were OVMS and OSF/1 only. The reason NT was made to be portable was that at that time, RISC CPUs were way ahead of the Pentiums, and so MS thought that they needed to have RISC platforms ready in order to participate in the workstation and server markets.

      As you point out, since x86 binary emulation was needed, and usually, what one had then was an apples to pineapples comparison of an NT/Pentium running Wintel apps vs NT/Alpha w/ FX!32 running Wintel apps, the advantage was heavily w/ Intel. Also, Intel continued making major architectural developments to the Pentium - first making it more superscalar, then making it more superpipelined (in the Pentium Pro), then adding things like MMX, SSE and so on.

      But the main turning point was that once Windows 2000 merged both the NT and Windows 98 branches, multi-processing was a given, and Intel could start making multi-core CPUs, which made performance enhancement a lot easier. As a result, more CPU intensive applications became multi-processing aware, and the one advantage that multiprocessor RISC servers had was no longer there. Also, Intel's own non-CISC CPU - the Itanium - bombed when it came to running x86 binaries. It's a fine platform for native code, and it's a bit of a pity that OSs such as Monterey, RHEL, and Ubuntu abandoned the platform. Incidentally, I happen to think that if Intel is still going to develop it, it should be market segmented, and have a range of CPUs, just like one has the i3, i5 and i7, and offer them in everything from laptops to servers - just like the x86. Get Linux and BSD ported on them, so that support or lack of it from Microsoft is irrelevant.

    70. Re:Speed versus complexity by CoderJoe · · Score: 2

      ARM isn't "magical" and its been needing more and more extras like HD decoder chips to share the load because it simply can't get enough IPC to do the job on its own.

      The last time I checked, x86 needed extras like the GPU to offload HD decoding to as well.

    71. Re:Speed versus complexity by msgmonkey · · Score: 1

      And how much money do you think ARM makes on that $10 part?

    72. Re:Speed versus complexity by msgmonkey · · Score: 1

      I doubt it would add that much considering you already have to implement Tomasulo to go superscalar; it could be added onto that relatively easily. Of course it will add more circuitry, and I agree Intel will have problems making something as low power as the current crop of ARM chips. However, when we get to something that is, say, the midpoint between an A9 and an i3, Intel will be able to compete easily and also have its process advantage. It could easily be a case of ARM winning the battle and losing the war.

    73. Re:Speed versus complexity by Kjella · · Score: 1

      For example, taking your point about data bandwidth, because the x86 has so few registers, it has to do data IO a lot more compared to something like the PowerPC or SPARC.

      I should point out that when AMD made x86-64 they had a choice of how many registers to add; I remember reading an article about it long ago. They tested it out a lot and found adding another 8 registers (r8-r15) was the ideal number - their processors have more than 16 internally but it was more effective to let the hardware optimize the rest at run time rather than exposing them to the compiler. So it's not like this is really fixed; in fact I can't see any problem with adding more registers at any time. If you added r16-r31, existing binaries would never use them while a recompile would give you all of them to work with. That neither AMD nor Intel has chosen to do so suggests they're pretty close to the ideal already.

      --
      Live today, because you never know what tomorrow brings
    74. Re:Speed versus complexity by TheRaven64 · · Score: 3, Interesting

      The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter.

      Not true. It is quite a small part, but it is the part that you can not turn off or put in a low power state as long as the CPU is doing anything. This is why it becomes important on low-power systems: it's a constant power drain. Big FPUs and SIMD units draw a lot more power, but they draw almost nothing when executing scalar integer code.

      CISC often has the ultimate advantage simply because it makes better use of the code cache.

      If you're comparing to something like the Berkeley RISC or Alpha architecture, yes. If you're comparing to ARM... not so much. In the comparisons I've done, on both compiler-generated code and hand-written assembly, ARM and x86 are within 10% of each other in terms of code size, with ARM smaller in most cases. Note that this was comparing ARM to x86 and x86-64. For a modern ARM core, you would use the Thumb-2 instruction set, which is typically about 30% smaller, and 50% smaller in the best case.

      --
      I am TheRaven on Soylent News
    75. Re:Speed versus complexity by TheRaven64 · · Score: 2

      It's worth noting that with the Cortex A9 and newer, ARM has done a lot to standardise things. For example, interrupt controllers now have a standard well-defined interface. This means that once you have one Cortex A9 SoC working, getting the next working is about as hard as getting a new x86 laptop working: you may need device drivers for the GPU and a few other things, but the core functionality will be the same.

      --
      I am TheRaven on Soylent News
    76. Re:Speed versus complexity by msgmonkey · · Score: 1

      Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density. Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.

      This isn't the case; the only x86 processor that converted x86 instructions to RISC instructions was the AMD K5. In fact, even in RISC architectures the instruction decode stage expands out the instruction, and this is what happens on a modern x86 processor.

      The complexity in a modern processor is not in the instruction decode, but the multiple execution units.
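
      As a rough illustration of what "expanding out" an instruction at decode can mean, here is a toy sketch. It is not the internal format of any real CPU: the tuple layout, the `tmp0` temporary and the function name `expand` are all made up for the example. It only shows a register-memory operation being split into a load followed by a register-register operation.

```python
def expand(instruction):
    """Expand one toy (op, dst, src) instruction into internal micro-operations.

    Hypothetical format for illustration only; real decoders are far richer.
    """
    op, dst, src = instruction
    if src.startswith("["):
        # Memory source operand: load it into a temporary first,
        # then perform the operation register-to-register.
        return [("load", "tmp0", src), (op, dst, "tmp0")]
    # Register source operand: passes through as a single micro-op.
    return [(op, dst, src)]


print(expand(("add", "eax", "[ebx]")))
print(expand(("add", "eax", "ecx")))
```

      In this toy model the register-register form goes through unchanged and only the memory form grows into two steps.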

    77. Re:Speed versus complexity by serviscope_minor · · Score: 2

      oooooooooookay.

      talking about tiny low margin embedded that while might be good from a numbers game frankly isn't a market one should be chasing

      Given that there are many high volume, low margin companies out there, including ARM which are vastly more successful than anything you or I have done, I won't be taking your word for it that I should be doing something else.

      Its like how Apple doesn't make any of the low rent

      You know, you're right! I could never stand to be only Michael Dell level of rich when I could be Steve Jobs (estate) level of rich.

      In the end what will probably matter the most is money and despite the bigger numbers for embedded ARM Intel has it and ARM don't.

      Intel can't touch ARM in the majority of its market. They have nothing in remotely the same ballpark. Also, ARM doesn't need as much money, since they are fabless: they rely on various others. And that's another thing: since they license cores, others can make full SoCs, whereas with Intel, it's their SoC or nothing.

      So I wouldn't be counting them chickens because after all ARM could keep every router and PMP on the planet and STILL lose,

      My god if only I could lose with such wealth!

      --
      SJW n. One who posts facts.
    78. Re:Speed versus complexity by msgmonkey · · Score: 1

      Those hardly used instructions probably use less than 0.1% of the CPU die, that is because they are microcoded instructions and run hideously slow.

    79. Re:Speed versus complexity by RoboJ1M · · Score: 2

      pennies, but they "sold" 6 billion in 2010

    80. Re:Speed versus complexity by TheRaven64 · · Score: 3, Insightful

      Early ARM chips didn't have an integer divide instruction because it took up to 12 cycles[1] to perform integer division, and you could get the same performance from a software routine without complicating the pipeline. Integer division is often cited as one of the main reasons why RISC had problems: newer techniques reduced the number of cycles required to perform integer division, so newer CISC chips just used those in place of the old microcoded loops, while RISC code got no benefit unless the instruction set was extended and the code recompiled. Modern RISC chips - including ARM - do have integer division instructions though, and compilers use them, so this is something of a moot point.

      [1] It was a variable number, which made life very difficult for hardware designers. One of the benefits early RISC architectures had was the fact that their instructions took the same length of time to execute, so the pipeline could be very simple.
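
      For anyone wondering what division "in software" looks like, here is a minimal sketch of the textbook shift-and-subtract (restoring) algorithm. It is only an illustration: the function name `soft_udiv` is made up, and real runtime-library routines were hand-tuned assembly rather than anything like this.

```python
def soft_udiv(dividend, divisor, bits=32):
    """Unsigned division by shift-and-subtract, one quotient bit per step.

    Returns (quotient, remainder). Uses only shifts, compares and subtracts,
    which is what a CPU without a divide instruction has to work with.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Bring the next dividend bit down into the running remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:
            # The divisor fits: subtract it and set this quotient bit.
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder


print(soft_udiv(100, 7))
```

      The loop always runs `bits` iterations here; a tuned routine would typically skip leading zeros, which is one source of the variable timing mentioned in the footnote.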

      --
      I am TheRaven on Soylent News
    81. Re:Speed versus complexity by Anonymous Coward · · Score: 1

      To be fair, the decimal adjust instructions came from compatibility with the 8080 chip. Other 8-bit processors also had a similar system, for example both the 6800 and 6502 (IIRC, the 6800 could only adjust after decimal addition, not subtraction).

    82. Re:Speed versus complexity by itsdapead · · Score: 2

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      We do? Aside from the fact that the distinction is becoming less relevant as chips become more complex, it seems to me that pretty much any market that isn't dependent on MS Windows has gone with RISC.

      The x86 has a monopoly on desktop and laptop PCs and business servers not because it is a better architecture, but because of a huge, legacy code base - so big that even Intel failed when they tried to move to a new 64 bit instruction set (Itanium) and had to fall back on the current, backward-compatible solution. This monopoly gives them the economy of scale to adopt brute-force, complex solutions that (effectively) translate the x86 instruction set into something more efficient on-chip.

      The first ARM based computers (Acorn's Archimedes in 1987) weren't mobiles: they were powerful personal computers that could leave their intel-based contemporaries eating dust. But they didn't run Windows (other than by software emulation or, later, plugging in an actual x86) so they never had the market penetration to justify development of faster, desktop ARM chips and were left behind when Intel kept cranking up clock speeds and adding huge caches and on-chip floating point. ARM followed the money and focussed on mobile and embedded markets.

      Come the late 90s, PCs with Alpha RISC CPUs started to take off, and then died. Did they die because RISC was inferior, or was it more to do with MS suddenly dropping the Alpha version of Windows and Intel ending up owning the Alpha chip?

      Apple didn't switch from PPC to x86 because they wanted a CISC chip - they switched because of supply difficulties and because nobody was interested in making a low-power G5 PPC suitable for laptops and SFF systems - not impossible, just not economical for the sake of making a G5 Powerbook.

      Anywhere where the Wintel monoculture isn't important, RISC is still widespread: supercomputers, high-end servers and "mainframes" use the grown-up relatives of PPC, ARM dominates phones and tablets, and both PPC and ARM are prevalent in things like NAS and routers.

      x86 and CISC only has a niche market - it just happens to be an absolutely enormous niche...

      --
      In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
    83. Re:Speed versus complexity by TheRaven64 · · Score: 1

      x86-64 can do register-memory type operands which is much more efficient

      This thread is full of strange definitions. Requiring the CPU to do alias analysis, in hardware, at run time, is certainly not what I would call 'efficient', but sure, let's go with that definition for now...

      --
      I am TheRaven on Soylent News
    84. Re:Speed versus complexity by zixxt · · Score: 1

      AMD will most likely be stuck at the niche they are now, at least as long as they stick with the faildozer "half a core" design so that just leaves Intel and ARM and with the money, the fabs, and the R&D budget that Intel has i think it'd be crazy to call it for ARM at this stage of the game.

      It's not half a core, it's a full core with a shared FPU/SIMD unit. If you think that Bulldozer is just half a core, then what do you call most CPUs from before the i586 era that had no FPUs? Is the 68020 in my Macintosh not really a chip/core/CPU? Most ARM chips have no FPU either, so do they count as cores/CPUs?

      Stop trolling...

      --
      ---- GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
    85. Re:Speed versus complexity by TheRaven64 · · Score: 2

      There's very little difference between the two instruction sets that makes one more power efficient than the other

      There are two aspects of an instruction set that affect power consumption. One is density: how much instruction cache do you need for a given algorithm. ARM does about as well as i386 here, and Thumb-2 does better (typically about 20% smaller code than x86). Smaller instruction cache means less power consumption. The other is decoder complexity: how many transistors do you need to decode the instructions. x86 instructions are somewhere between 1 and 15 bytes, and the encoding scheme is highly non-orthogonal, so the logic required to decode them is very complex. ARM instructions are 32 bits and have fixed encoding and opcode fields, so you need a tiny handful of transistors to decode them. Thumb-2 instructions are variable length, but only come in two sizes (16 and 32 bits) and are only slightly more complex to decode. The micro-op decoder on a modern x86 chip is about as complex as the Thumb-2 decoder...
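
      To make the "only two sizes" point concrete, here is a small sketch of how instruction boundaries can be found in a Thumb-2 stream. The function name is made up; the rule it relies on is that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction, and anything else is a complete 16-bit instruction. Classic ARM needs no such step at all, since every instruction is 4 bytes.

```python
def thumb2_instruction_lengths(halfwords):
    """Return the byte length of each instruction in a list of 16-bit halfwords."""
    lengths = []
    i = 0
    while i < len(halfwords):
        top5 = (halfwords[i] >> 11) & 0x1F
        if top5 in (0b11101, 0b11110, 0b11111):
            # First half of a 32-bit encoding: consume two halfwords.
            lengths.append(4)
            i += 2
        else:
            # Complete 16-bit encoding.
            lengths.append(2)
            i += 1
    return lengths


# mov r0, r1 (16-bit); bl with zero offset (32-bit, two halfwords); nop (16-bit)
print(thumb2_instruction_lengths([0x4608, 0xF000, 0xF800, 0xBF00]))
```

      The length falls out of one five-bit check on the first halfword. An x86 decoder has to work through optional prefixes, the opcode, and then ModRM/SIB and displacement or immediate fields before it knows where the next instruction starts.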

      --
      I am TheRaven on Soylent News
    86. Re:Speed versus complexity by Kjella · · Score: 2

      BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.

      The answers are really all at the site the GP linked.
      Performance optimized: 1.9W
      Power optimized: 0.5W (250 mW/core)

      Anyway, Anandtech has a pretty good overview of actual phones. If you look at the normalized hours/watthour figures Medfield (the Xolo X900) is decidedly middle of the pack. It's not better than the ARM phones, but it's not terrible either. Of course newer ARM designs will beat it, but then again Intel isn't going to stand still either.

      --
      Live today, because you never know what tomorrow brings
    87. Re:Speed versus complexity by Ginger+Unicorn · · Score: 3, Informative

      There are two versions: speed optimised at 2 GHz and power optimised at 800 MHz to ~1 GHz. Speed optimised draws 1.9W and power optimised draws 0.5W.
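
      A quick back-of-the-envelope on those figures, as a sketch only. It assumes the quoted power is drawn at the quoted clock and treats clock cycles as a stand-in for work done, which ignores everything else that differs between the two implementations.

```python
def nanojoules_per_cycle(watts, gigahertz):
    # W / GHz = (J/s) / (1e9 cycles/s) = nanojoules per cycle
    return watts / gigahertz


speed_optimised = nanojoules_per_cycle(1.9, 2.0)      # 2 GHz part
power_optimised_low = nanojoules_per_cycle(0.5, 0.8)  # 800 MHz end of the range
power_optimised_high = nanojoules_per_cycle(0.5, 1.0) # ~1 GHz end of the range

print(speed_optimised, power_optimised_low, power_optimised_high)
```

      Under those assumptions the speed-optimised version spends roughly 1.5 to 1.9 times as much energy per cycle as the power-optimised one.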

      --
      (1.21 gigawatts) / (88 miles per hour) = 30 757 874 newtons
    88. Re:Speed versus complexity by citizenr · · Score: 1

      You and the other poster seem to be forgetting ONE thing, which is nobody gives a shit how low the power draw is if it can't do what they want and what people WANT is MOAR, MOAR HD, MOAR games with MOAR graphics, MOAR MOAR MOAR.

      And how is a CPU going to solve this problem? Like you said, people want "MOAR" GPU. Intel has no GPU.

      --
      Who logs in to gdm? Not I, said the duck.
    89. Re:Speed versus complexity by citizenr · · Score: 1

      AMD will most likely be stuck at the niche they are now, at least as long as they stick with the faildozer "half a core" design so that just leaves Intel and ARM and with the money, the fabs, and the R&D budget that Intel has i think it'd be crazy to call it for ARM at this stage of the game.

      It's not half a core, it's a full core with a shared FPU/SIMD unit

      oh, but but, but look at SuperPi score! AMD sucks!!!1
      Haha, those are the arguments I hear daily :) Meanwhile the next AMD CPU will finally implement their Fusion vision and connect the GPU to the CPU at the cache level. Who needs SSE/FPU when you can just crunch double precision on the on-die GPU?

      --
      Who logs in to gdm? Not I, said the duck.
    90. Re:Speed versus complexity by TwoBit · · Score: 1

      FWIW, ARM isn't a pure RISC instruction set.

    91. Re:Speed versus complexity by Alioth · · Score: 1

      Considering ARM outships all other architectures put together very handily, yes it's obvious: CISC lost badly. (ARM's original expansion is Acorn RISC Machine).

      But firstly: the right tool for the right job. x86 isn't going to go away in the foreseeable future, but on the other hand, neither is ARM. What makes ARM more efficient - especially in embedded devices - is that the part of an x86 processor that just figures out the length of the next instruction is the size of an entire ARM execution core. The other thing with ARM is the licencing model. You can buy the ARM IP and put it in any custom chip you want; you can't do that with anything Intel makes, you can only build what Intel wants to make. And thirdly, just as x86 has huge inertia in the desktop/server market due to binary compatibility and being well known and understood by the manufacturers, the same thing holds true for ARM in embedded.

      Of course someone from Intel sees "no future for ARM", but the reality is likely to be different. ARM is likely to continue to outsell all other architectures put together for the foreseeable future because of the embedded and handheld market where it already has a stronghold.

    92. Re:Speed versus complexity by Alioth · · Score: 1

      It's not just the instruction decoder, but branch prediction and pipelining. The x86 ISA makes all of these things much more complex. Also the low number of registers means that x86 isn't terribly efficient with cache either.

      The instruction decoder in x86 is the size of an entire ARM execution core.

    93. Re:Speed versus complexity by makomk · · Score: 1

      The drivers are crap and the power efficiency is too bad to be suitable for mobile. On mobile and low-power devices, Intel licenses PowerVR's GPU designs just like many ARM manufacturers do.

    94. Re:Speed versus complexity by makomk · · Score: 1

      The funny thing is that even Bulldozer should be quite capable in SSE terms, but SuperPi is so ancient that it doesn't actually use SSE at all...

    95. Re:Speed versus complexity by makomk · · Score: 2

      I think this is precisely because backward compatibility is not necessary there.

      It's actually quite funny. One of Intel's main problems in smartphones is that their chips aren't compatible with existing software, so they have to use dynamic translation. (There are some incorrect benchmarks out there that reckon it's as fast as native code but that's because they didn't realise that Intel had paid the manufacturer of the Android benchmarking suite they'd used to include a native x86 version and that it was using that instead of the ARM one.)

    96. Re:Speed versus complexity by TeknoHog · · Score: 1

      In that case, it would mean the CPU is doing the optimization instead of the compiler. I am unfamiliar with that particular optimization, but it sounds like a good idea.

      It's a good idea until someone comes up with a better optimization, and we are stuck with the old hardwired one.

      On the other hand I imagine CPU designers have more freedom to experiment with new internal designs, when the translation layer presents a stable x86 ABI to the outside. Sure, it would be great to access the RISC internals directly, and optimize GCC etc. accordingly, but that would be a moving target.

      --
      Escher was the first MC and Giger invented the HR department.
    97. Re:Speed versus complexity by walshy007 · · Score: 1

      but are they really going to try to stick a 64-bit processor in a phone?

      When phones start shipping with 4 GB of memory... yes. We already have phones with 2 GB of memory out right now; give it a year or a bit more and I expect even ARM phones will be 64-bit for address space needs.
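
      The arithmetic behind the 4 GB threshold, as a trivial sketch (the function name is made up): a flat 32-bit pointer can name 2^32 distinct byte addresses.

```python
def addressable_gib(pointer_bits):
    # Bytes reachable with a flat pointer of this width, expressed in GiB.
    return 2 ** pointer_bits / 2 ** 30


print(addressable_gib(32))  # 4.0
```

      So once RAM reaches 4 GB, RAM plus memory-mapped devices no longer fits in a flat 32-bit address space.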

    98. Re:Speed versus complexity by BitZtream · · Score: 2

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      Which one do you think won? I really hope you're not really trying to imply CISC won, since you know, there isn't a CISC CPU on the market today. There are plenty of CPUs that have CISC decoders front ending for RISC cores, but I can't think of one actual CISC chip that's been used since the Pentium.

      They throw hundreds of cores onto the die, but it eats hundreds of watts as well. Massively parallel and simple instruction sets don't appear to translate into energy savings.

      It does when you ... turn those extra cores off when not needed. Yes, if you have a ton of cores sitting idle, it's a waste of power; that's why they can be turned off. Even on your previous nVidia GPU the cores can be turned off, just go slap a Tesla card in your machine. Your setup may be another story, but mine, for instance, has absolutely no problem shutting off GPU cores, or the external GPU completely, and falling back on internal rendering with the CPU.

      I work with several non-x86 CPU types regularly. All of which are openly RISC, none of which have the problems you're referring to as RISC problems.

      I think you've been listening to far too much Intel marketing bullshit, as the real world contradicts you pretty much 100%. What's next, you'll tell us the slotted CPU really WAS required for higher speeds, not just for intentional incompatibility between Intel and AMD?

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    99. Re:Speed versus complexity by Pieroxy · · Score: 1

      Apple changed CPU architectures twice for its computers, all while maintaining backward compatibility.

      Intel alone cannot do anything to kill x86. Not without proper support from Microsoft that is. And even then, the Windows market is so fragmented that it may be impossible. Apple has a hold on its hardware and software giving them a big advantage here.

    100. Re:Speed versus complexity by SoupIsGood+Food · · Score: 1

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      CISC. Translating CISC into RISC and then back again was still faster than a native CISC instruction set - which is basically what x86 does these days, with some vector processing instruction-set special sauce.

      The failure of the RISC powerhouses (with the exception of SPARC, which always kind of sucked except for the Fujitsu chips) was mostly due to the internal politics of the companies using them. In the late '90s, early 00's, it was common knowledge that HP and Intel's new IA-64 chip family was going to be lightyears faster than RISC and that Windows was going to rule the universe. SGI and HP were incredibly invested in this strategy, to the point where they let their advanced RISC architectures wither and die, and stopped moving their Unix OS development forward.

      Well, SGI did. It spun off MIPS into its own company, and didn't give it any funding for high-performance R&D, though MIPS is still doing well as an embedded processor company. HP was a bit more prudent - probably as they were the ones actually developing the Itanium in partnership with Intel - they kept PA-RISC and HP-UX development humming along, and it likely saved the company. Meanwhile the mighty Alpha was given a kiss goodnight after HP bought Compaq, who had bought DEC. HP already had two high performance chip families (well, one and a half, Itanium wasn't ready yet) and didn't need a third, even if it was faster and better.

      So, since IA-64 was a decade late in arriving and didn't live up to the hype once it arrived, that left the platforms who relied on RISC hanging in the wind. HP, who kept development of PA-RISC active until IA-64 was ready for primetime, managed to hang on, and is now happily selling giant IA-64 Unix servers (or they were until Oracle pulled the rug out from under them). SGI is now owned by Rackable, and they just shove lots of x86 system boards in racks to run Linux these days. DEC lives on as HP OpenVMS running on Itanium servers. IBM is still kicking much ass with its POWER RISC architecture, although it's no longer in the high performance workstation game, and really, killing the desktop-class chip designs was a golden opportunity to screw over Steve Jobs (which backfired). Sun still kind of sucks, except for the Fujitsu chips.

    101. Re:Speed versus complexity by Kjella · · Score: 2

      Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die.

      Actually the most obscure instructions are implemented in software (microcode) and don't take up any hardware at all except the storage space. This makes them hideously slow but modern compilers avoid them and if you're running very old legacy code it runs fast enough anyway. Anyway, I heard these arguments back in the 90s when processors had 5 million transistors. Now they have 1.5 billion transistors and you still keep talking about the few thousand - yes, thousands - of transistors required. Sigh.

      --
      Live today, because you never know what tomorrow brings
    102. Re:Speed versus complexity by BitZtream · · Score: 2

      You realize that branch prediction is really simple right?

      Things like 'I see a branch not equal coming up, load the branch because I assume always not equal'

      and that it's up to the compiler to figure out the best way to write the code so that branch prediction works efficiently, right?

      Branch prediction is really just 'We're going to assume you code your branches to do X most of the time, so you try to do that if you want good performance'
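
      A minimal sketch of the static scheme described above: the predictor always makes the same guess for a branch, and the hit rate depends entirely on how the code was laid out. The function name and the example trace are made up for illustration. Many cores also use dynamic predictors that track branch history, which this sketch leaves out.

```python
def static_prediction_accuracy(outcomes, predict_taken=True):
    """Fraction of branch outcomes that a fixed, always-the-same guess gets right.

    outcomes: list of booleans, True meaning the branch was taken.
    """
    hits = sum(1 for taken in outcomes if taken == predict_taken)
    return hits / len(outcomes)


# A loop's backward branch: taken 99 times, then falls through once on exit.
loop_branch = [True] * 99 + [False]

print(static_prediction_accuracy(loop_branch, predict_taken=True))
print(static_prediction_accuracy(loop_branch, predict_taken=False))
```

      Guessing "taken" is right 99 times out of 100 for this trace and guessing "not taken" is right once, which is the sense in which the code layout and the hardware's fixed assumption have to agree.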

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    103. Re:Speed versus complexity by BitZtream · · Score: 1

      Also, ARM doesn't need as much money, since they are fabless: they rely on various others. And that's another thing: since they license cores, others can make full SoCs, whereas with Intel, it's their SoC or nothing.

      Intel would be happy to license you x86 tech for your SoC. However, if you're designing a SoC, you're probably smart enough to realize that Intel's price for the license, power and transistor count make it silly to use x86 rather than ARM unless you must support x86 ... in which case, Intel has the one you want anyway.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    104. Re:Speed versus complexity by drinkypoo · · Score: 1

      If you are talking about MMU, they didn't screw up. Nobody cared about 16-bit support.

      When the 286 was new it was a big deal and there were a lot of releases just for it. Hell, I had Xenix for the 286, best OS you could run on there. My 286 had 1MB RAM and ran at 6MHz and it was still quite usable. I had a whole 40MB RLL disk so no compiler suite or GUI, but I did run a UUCP node.

      Windows 3 would even run on the 286, anything that didn't require Win32S usually ran fine. The score would loop around on Tetris, though. And the game ran slow enough that it was easy to do.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    105. Re:Speed versus complexity by drinkypoo · · Score: 1

      What killed all these RISC PCs was the Intel Pentium Pro chip. It offered 90% of the performance of the DEC Alpha chip without any of the downsides of running some weird platform

      Pretty sure that the PPro 200 does not offer 90% of the performance of an Alpha 21064@300 or 350 or whatever they got it up to. Pretty damned sure. On the other hand, it probably offers performance at half the price per flop or less when you consider the price of the full system...

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    106. Re:Speed versus complexity by drinkypoo · · Score: 2

      x86 has eight general purpose registers. In 64 bit mode, it's 16 general purpose registers.

      General purpose, you say? Tell me, how many of those "general purpose" registers can you use with an integer multiply? Tell me, how many x86 instructions expect their operands to be in specific registers, and place their output in specific registers? Tell me, which of those registers can you use as the address operand for instructions which require it? x86 has zero general-purpose registers. amd64 (let's call it what it is) has sixteen.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    107. Re:Speed versus complexity by drinkypoo · · Score: 1

      a multi-cycle divide implemented in microcode would be better than a subroutine

      It might be more convenient, but it wouldn't be better. Your language/compiler/preprocessor/something should handle this for you and it doesn't have to be the CPU, plus that leaves opportunity for optimization in cases where you can make certain assumptions about the numbers you're working with.
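
      One example of that kind of optimization, sketched under the assumption that the divisor is a known constant and the dividend is an unsigned 32-bit value: the divide can be replaced by a multiply and a shift. The constant 0xCCCCCCCD is the rounded-up value of 2^35 / 10; the function names are made up for the example.

```python
def div10(n):
    """Unsigned 32-bit n // 10 using one multiply and one shift, no divide."""
    assert 0 <= n < 2 ** 32
    # 0xCCCCCCCD == ceil(2**35 / 10); the rounding error stays below one
    # quotient step for every 32-bit input.
    return (n * 0xCCCCCCCD) >> 35


def div8(n):
    """Division by a power of two is just a right shift for unsigned values."""
    return n >> 3


print(div10(1234567), div8(1000))
```

      A general-purpose divide instruction or subroutine cannot make these assumptions, while a compiler that sees the constant can.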

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    108. Re:Speed versus complexity by hattig · · Score: 1

      Phoronix just today compared ARM power efficiency (on a cluster of PandaBoards) to Atom, Ivy Bridge and AMD Zacate. http://www.phoronix.com/scan.php?page=article&item=phoronix_effimass_cluster&num=1

      End result: ARM was far more efficient - performance per watt was far superior to Atom and Zacate. And that was with GCC 4.6 and Ubuntu 12.04; GCC 4.7 optimises for ARM significantly better...

      Performance/Watt on IB was very good however, but overall power consumption was very high comparatively.

    109. Re:Speed versus complexity by drinkypoo · · Score: 1

      PCs with Alpha RISC CPUs started to take off, and then died. Did they die because RISC was inferior, or was it more to do with MS suddenly dropping the Alpha version of Windows and Intel ending up owning the Alpha chip?

      It was due to the Alpha being too expensive for the performance, while the Intel chips continued to improve and offered you more for your money. If they had made an Alpha PC instead of a bunch of Alpha workstations and servers they might have gotten further with it. I've read repeatedly that the Alpha wouldn't scale much further without a total redesign that wasn't really going to happen anyway, but I don't know if that is true or total shit.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    110. Re:Speed versus complexity by yakovlev · · Score: 4, Insightful

      Mobile processors (even those made by Intel) are NOT desktop processors. While it's pretty clear you know this, you make a mistake by trying to count the hardware decoders on the ARM but not on the x86. I don't care who makes the processor, no general-purpose mobile phone processor is going to be able to do 1080p video decoding in software. Intel couldn't even do it on Atom, which has substantially higher power draws than a mobile phone CPU. This is true of anything in the current generation of processors and should be true for the next few die shrinks. With technology scaling not providing the performance gains it once did, this really means it won't be possible for the foreseeable future. Even if the x86 cores could do 1080p video decoding, you'd still rather have the dedicated hardware, as dedicated hardware will use substantially less power doing it than the x86 core, and video decoding is one of the cases where power draw matters on mobile phones. The point of all this is that the graphics hardware comes out to be a wash when comparing Intel and ARM systems for mobile phones. Both of them need one, so you can't assume the x86 chip can get by without one. Thus, it really is comparing apples to apples to compare just the x86 cores and the ARM cores.

      As far as the IPC difference between Intel and ARM, I'm going to side with Intel this time and say that architecture doesn't really matter. The back-end of these chips all run RISC-like. Cache sizes are going to be similar and the Intel core isn't all that sophisticated. There is no reason to believe that, at a given frequency, x86 performance will be significantly better than ARM performance. The argument is whether or not, at a given frequency, the added area required to decode x86 represents a significant additional power draw (or, worse yet, additional pipeline stages, which would have a detrimental impact on x86 performance.)

      As far as fabs go, Intel is playing this in an interesting way. Intel seems to be using mobile chips as a way to keep their older fabs busy. This makes the mobile chips very nearly free for them to manufacture. They're just keeping up with ARM, rather than moving to their current process and absolutely blowing them away. So, let's be clear. Intel could be a die shrink ahead of where they are, which probably would make the x86 cores on a newer process better than the ARM ones on an older process. Intel is staying on the old process for cost reasons, not performance ones.

      AMD doesn't really have anything that plays in the mobile space, but their closest comparison is Bobcat. Bobcat is a pretty good core for the power envelope it works in. I think AMD could build an x86 core for the mobile space, if they wanted to. The real problem is that they couldn't maintain current performance while using a back-level process to compete with Intel on cost. In some ways Intel might prefer that they could, as it might make x86 in the mobile space seem less like locking yourself into a single vendor, indirectly helping Intel sell Medfield.

    111. Re:Speed versus complexity by CajunArson · · Score: 1

      You see, your post is why ARM has a reality distortion field around it that makes Apple's fanboys look completely objective.

      Ivy Bridge, even with a massive number of peripherals and extra devices that suck down power, was about THREE TIMES more efficient per-watt than the ARM boards. PCI controllers on those ARM boards? Nope. SATA controllers or drives on those ARM boards? Nope. USB 3? Nope. Real gigabit ethernet? Nope. Memory interfaces? Nothing near what Ivy Bridge has.

      Not to mention that the absolute performance figures for Ivy Bridge annihilated the ARM boards, even though these are integer-only and embarrassingly parallel benchmarks that are about as favorable to ARM as you are going to get.

      The end result is that ARM has a *huge* mountain to climb to even get Core 2 level performance and performance per watt. People on this site denigrate Intel because the latest A15s (that aren't even for sale yet) are ahead of Medfield. Those people think that a 30% advantage in Smartphones means that ARM has permanently destroyed Intel and that Medfield is mathematically proven to be the only chip that Intel can ever make for smartphones. They don't bother to think about the fact that ARM has *orders of magnitude* to go before they can even compete with consumer-grade Intel chips.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    112. Re:Speed versus complexity by tepples · · Score: 1

      I'd love to see more average people buying tablets and throwing away their PC's

      If that were to happen, it'd make things more difficult for those home users who want to take the step from passively viewing works to creating works with the devices that they already happen to own.

    113. Re:Speed versus complexity by JDG1980 · · Score: 1

      Yes. Your post basically summarized the main problem underlying the philosophy of intellectual property.

      I don't see the connection. Even if there were no patent laws, Intel's competitors wouldn't magically be able to catch up to its process technology. It has cost Intel countless billions of dollars to get where it's going – and I mean billions in actual, on-the-ground manufacturing and operations costs, not just R&D.

    114. Re:Speed versus complexity by squiggleslash · · Score: 1

      Given the spec of most mobile phones coming out right now isn't far from the spec of PCs that were mainstream five years ago (maybe a lower GHz, but 1G RAM, two or more cores, a GPU, etc), and how far forward that is compared to, say, two years ago at around this time, I'm inclined to think that there's little reason to avoid going 64 bit for mobile phones in the near future. Certainly, anyone designing new smartphone silicon right now should be thinking in terms of how they can make a 64 bit smartphone that has a decent battery life.

      I find all the comparisons with x86-32 to be a little off to be honest for that very reason. I don't think Intel is talking about 32-bit ix86 vs ARM. I think they're talking about ix86 TODAY vs ARM - and that's almost all 64 bit. Modern Atom sets are 64 bit. The last 32 bit Atom was launched two years ago with the exception of one very-very low end Cedarfield variant in late 2011.

      --
      You are not alone. This is not normal. None of this is normal.
    115. Re:Speed versus complexity by unixisc · · Score: 1

      Pentium Pro 200 was much slower than the Alpha of that time when it came to each running native apps, as was obvious from their SPECint and SPECfp numbers, which were the benchmarks at the time. However, the market comparisons between them were about how fast they ran Wintel binaries, and there, the PPro 200 was 90% of Alpha, with all the other advantages you listed above.

    116. Re:Speed versus complexity by naasking · · Score: 1

      Of course, you're conveniently ignoring the microcode translator itself and the memory to store the microcode, which are significantly larger than merely thousands of transistors.

    117. Re:Speed versus complexity by JDG1980 · · Score: 1

      Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes. Then there's junk like the ASCII adjust and decimal adjust instructions: AAA, AAS, AAD, and AAM, and DAA, and DAS. Nobody uses packed decimal any more! And hardly anyone ever used it

      Actually, if you're emulating a Z80 CPU in x86 assembly, you probably will be using DAA or DAS (depending on whether the virtual N flag is set) to emulate the Z80's DAA opcode. I've seen this done in real-world code not too many years back.
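
      For anyone who hasn't met decimal adjust before, here is a rough Python sketch of what the step does after adding two packed-BCD bytes. The names daa and bcd_add are made up for this example, the flag handling follows the commonly documented x86 DAA behaviour as I understand it, and only the addition case is modelled; a Z80 emulator would also need the subtraction path selected by the N flag.

```python
def daa(al, cf, af):
    """Decimal-adjust AL after a binary add of two packed-BCD bytes.

    al: 8-bit result of the binary add; cf/af: carry and half-carry from it.
    Returns the adjusted (al, cf, af). Hypothetical helper, addition only.
    """
    old_al, old_cf = al, cf
    cf = False
    if (al & 0x0F) > 9 or af:
        # Low digit went past 9: skip the six unused nibble codes.
        cf = old_cf or (al + 6 > 0xFF)
        al = (al + 6) & 0xFF
        af = True
    else:
        af = False
    if old_al > 0x99 or old_cf:
        # High digit went past 9: same correction one nibble up.
        al = (al + 0x60) & 0xFF
        cf = True
    else:
        cf = False
    return al, cf, af


def bcd_add(a, b):
    """Add two packed-BCD bytes (e.g. 0x19 + 0x28) and adjust the result."""
    raw = a + b
    af = ((a & 0x0F) + (b & 0x0F)) > 0x0F
    return daa(raw & 0xFF, raw > 0xFF, af)


al, carry, _ = bcd_add(0x19, 0x28)
print(hex(al), carry)   # 0x47 False, i.e. 19 + 28 = 47
```

      An emulator written in x86 assembly gets all of this from a single instruction, which is presumably why the opcode still earns its keep there.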

    118. Re:Speed versus complexity by Locutus · · Score: 1

      Here's the catch, though: Intel has to constantly do this on a smaller die process than ARM to beat them, and then they have to sell these at prices comparable to ARM. In the past, Intel has been able to use their smaller die process space for high cost / high end gaming and server CPUs. Not so good for Intel on the financial side, especially if they expect to keep their profits up.

      LoB

      --
      "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
    119. Re:Speed versus complexity by Hatta · · Score: 1

      That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

      If the CPU can convert x86 instructions to RISC, why couldn't that be done in software?

      --
      Give me Classic Slashdot or give me death!
    120. Re:Speed versus complexity by Bert64 · · Score: 1

      The x86 emulator Alpha had did work well, but it also significantly reduced performance. The idea that the ppro offered 90% of the performance of the alpha was based on running emulated code, where the ppro was considerably less than 90% of the price of the alpha.

      Running native alpha code, the alpha was still miles ahead of the ppro especially for floating point code, but obviously this advantage was wasted if you were running emulated x86 code... If you were running one of the few native nt/alpha applications, or were running a unix platform on which you had compiled native binaries of everything then there were still huge performance advantages over the ppro.

      Incidentally, running Alpha/NT had upsides too, it always seemed to be a LOT more stable than the x86 version, and booted much faster.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    121. Re:Speed versus complexity by fnj · · Score: 1

      x86 is ugly. It's one of the most screwed up, inconsistent, crufty architectures ever created.

      So? Who cares? No, really. It certainly isn't even visible or detectable by the user. It matters to compiler writers, but both proprietary and FOSS compiler writers mastered the cruft, so that's a done deal. Why would anyone else give a crap?

      Anyway, it's only x86_32 (aka i386) that is really grossly idiosyncratic. x86_64 fixes the gross idiosyncrasies. Unlike the paucity and specialization of registers in x86_32, x86_64 has sixteen general purpose registers, and is well suited to simple flat memory model programming.

    122. Re:Speed versus complexity by Bert64 · · Score: 1

      Multi core CPUs were first introduced by AMD, Intel came later and their first dual cores were quite a half assed effort being basically two complete processors on a single package.

      Prior to that, there were machines with multiple processors, but they were targeted at servers and high-end workstations; low-end Windows 9x based workstations obviously couldn't take advantage of multiple CPUs (not that it stopped stupid people from buying them anyway and running Win9x on them).

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    123. Re:Speed versus complexity by jedidiah · · Score: 1

      ARM isn't "magical" and it's been needing more and more extras like HD decoder chips to share the load because it simply can't get enough IPC to do the job on its own.

      The last time I checked, x86 needed extras like the GPU to offload HD decoding to as well.

      When was that? 1999?

      The x86 based AppleTV could manage HD decoding in software. Now while that was only true for MPEG2 and divx, it is still far beyond the ability of any CPU in subsequent ARM appliances.

      You need to get into BluRay level stuff before something like an Atom needs a special decoder for video.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    124. Re:Speed versus complexity by jedidiah · · Score: 2

      Intel has no GPU? Are you kidding? Intel has a GPU. It may not be the greatest but they certainly have one. If you don't like it then you do the same thing with Intel kit that they do with ARM kit.

      You grab 3rd party parts like Nvidia.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    125. Re:Speed versus complexity by jedidiah · · Score: 1

      > i hear it is very popular this days to have low ID

      That never ceases to amaze and amuse me whenever I hear it.

      Slashdot chic? Hilarious!

      --
      A Pirate and a Puritan look the same on a balance sheet.
    126. Re:Speed versus complexity by Pieroxy · · Score: 1

      ... Moreover MS continues to develop the Windows CE product ...

      I almost choked on that one !

    127. Re:Speed versus complexity by jedidiah · · Score: 1

      No. What killed the other versions of NT was a total lack of commitment from Microsoft. They were token gestures that Microsoft never put any real ongoing effort into.

      I don't even think Microsoft ported all of their stuff.

      There's even proprietary stuff that shows up for weird versions of Linux. It's not just the Free Software and in-house stuff.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    128. Re:Speed versus complexity by kiwix · · Score: 1

      You and the other poster seem to be forgetting ONE thing, which is nobody gives a shit how low the power draw is if it can't do what they want and what people WANT is MOAR, MOAR HD, MOAR games with MOAR graphics, MOAR MOAR MOAR.

      As far as I'm concerned, I don't give a shit about how much I can do with my phone if it draws too much power. If the battery cannot last at least ten hours on idle, a phone is just useless.

    129. Re:Speed versus complexity by hattig · · Score: 1

      I don't think it has been claimed that ARM is performance competitive with top-end expensive processors. The comparison was put in just because it could be.

      Cortex A8 and A9 are Atom-class in terms of overall performance. A15 will probably raise the game but still not by that much (probably Core Duo performance).

      But what we have in this test is a bunch of hobby ARM dev boards against a high-end PC. A real ARM based server (Calxeda) will show better characteristics (http://www.calxeda.com/products/energycore/ecx1000). Oh, and look: SATA, PCIe, 10GigE...

    130. Re:Speed versus complexity by gnasher719 · · Score: 1

      General purpose, you say? Tell me, how many of those "general purpose" registers can you use with an integer multiply?

      Oh, are we getting excited. Nobody gives a rat's arse for x86 32-bit mode. And in x86 64-bit mode, all 16 integer registers can be used with an integer multiply. By the way, nobody gives a rat's arse what AMD's involvement in this is.

    131. Re:Speed versus complexity by phantomfive · · Score: 1

      lol you're definitely the first person I've ever heard describe a CPU instruction set as an 'ABI'

      --
      "First they came for the slanderers and i said nothing."
    132. Re:Speed versus complexity by Chas · · Score: 1

      He was pointing out that to say "CISC won" is only true...

      And I'm saying CISC didn't win.
      And neither did RISC.

      Both platforms hybridized. So the distinction in modern processors is pointless.

      --


      Chas - The one, the only.
      THANK GOD!!!
    133. Re:Speed versus complexity by phantomfive · · Score: 1

      I don't see the connection.

      Then look a little more deeply before replying. Intel WOULD be able to make ARM chips without paying any licensing/etc.

      --
      "First they came for the slanderers and i said nothing."
    134. Re:Speed versus complexity by phantomfive · · Score: 1

      Do you really think they're going to stick a 64bit processor in a phone?

      --
      "First they came for the slanderers and i said nothing."
    135. Re:Speed versus complexity by drinkypoo · · Score: 1

      Oh, are we getting excited. Nobody gives a rat's arse for x86 32-bit mode

      That is a load of dingo's kidneys. The whole argument for x86 is backwards compatibility. Most people running Windows are still running 32 bit Windows, even on 64 bit platforms. This is finally changing but it's still the case. Only in FOSS-land does it seem like the majority of software is 64 bit, that's why we live here... But that's really not the case in the wider world.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    136. Re:Speed versus complexity by rabun_bike · · Score: 1

      Customization and the ability to use any certified fabricator is a huge deal with embedded device designers that use the ARM chip. The reality is that Intel does not want to go down the road of customization and only will if they absolutely have to. ARM is a very different business model and allows the device manufacturers to actually make products that offer different features. Apple has even pulled ARM development into their core functional design teams. They may even roll out the ARM in laptops, which makes sense if they can do it. It will allow them to target a common base and share technology across product lines. http://www.zdnet.com/blog/apple/rumor-apple-dumping-intel-for-arm-processors-in-2013/10093

      Finally, for those making the CISC vs RISC argument, you need to realize that the modern Intel processors run on a RISC core with a CISC microcode layer and a fairly large pipeline.

    137. Re:Speed versus complexity by phantomfive · · Score: 1

      I know one person who's done that already. She has an Android tablet with a keyboard so she gave her laptop to her parents.

      --
      "First they came for the slanderers and i said nothing."
    138. Re:Speed versus complexity by phantomfive · · Score: 1

      So it's not like this is really fixed, in fact I can't see any problem with adding more registers at any time.

      You make it sound so easy, but modifying an instruction set is something no one takes lightly.

      --
      "First they came for the slanderers and i said nothing."
    139. Re:Speed versus complexity by Chas · · Score: 1

      Whoever has the guts to say that "There's nothing inherently "superior" about ARM or PPC instruction sets.", he just shows that he has minimal or no knowledge about the instruction sets of Intel, ARM & PPC and he certainly has no significant experience in trying to optimize a program for any of those architectures.

      33 years of experience is laughing at this quote.

      --


      Chas - The one, the only.
      THANK GOD!!!
    140. Re:Speed versus complexity by phantomfive · · Score: 1

      The whole argument for x86 is backwards compatibility. Most people running Windows are still running 32 bit Windows, even on 64 bit platforms.

      Good point.

      --
      "First they came for the slanderers and i said nothing."
    141. Re:Speed versus complexity by mparker762 · · Score: 1

      By that definition CISC chips have always actually been RISC chips. What did you think all that microcode in those CISC chips really was? RISC was an early win because it exposed the microcode to the compiler for improved optimization opportunities, and the simpler fetch/decode/execute logic made it easy to implement an efficient pipeline scheme which compensated for the increased memory traffic.

      The weakness of RISC was that by moving the microinstructions out into main memory and exposing them to the compilers they wound up exposing too many details of their implementations, and made themselves vulnerable to changes in the speed differential between CPU and memory. RISC's initial advantage in internal complexity quickly eroded as they added multiple dispatch units, and their pipelines changed from 4 to 6 to 8 stages but the instructions were still claiming they had 4. And as the performance differential between the CPU and main memory increased, the functional density of CISC code became a win.

      Intel noticed that they could fake out more registers than the instruction set explicitly revealed, and could treat near memory off BP and SP as though they were registers, neatly mimicking the register windows of machines like SPARC (this basic trick had been used decades earlier in the TI 990 minicomputer). And Intel noticed that since their micro-instructions were similar to RISC ops, many of the basic optimizations that compilers were doing to improve the outputted code were sufficiently simple peephole-type optimizations that they could be implemented in hardware, if this hardware optimizer were able to see a wide enough window of micro-ops in flight. And since the burgeoning delta between CPU and memory speeds meant greatly increased on-die caches, it turned out that adding all this hardware didn't really change the die size much.

    142. Re:Speed versus complexity by artemis67 · · Score: 1

      Intel won the RISC vs CISC debate of the 1990s for several reasons that all had to do with business:

      1) The PPC Consortium basically fell apart. IBM signed on board, and then quickly lost interest in helping out with the desktop PPC processor line. They developed the original PPC line for the server market, and that's where all the profitability remained for them. Unfortunately, the server CPUs were unsuitable for the consumer market (and too expensive).

      2) The Motorola Brain Drain. Intel faced down the PPC threat by hiring away the best and brightest engineers from Motorola. Motorola floundered. Whereas the PPC was expected to be the first to hit the GHz range, they ended up being years behind Intel. For the longest time, Motorola's high-end PPC chip was stalled out at 533 MHz.

      3) Intel developed a blended processor. Where they could, they integrated RISC technology to achieve huge speed gains.

    143. Re:Speed versus complexity by mparker762 · · Score: 1

      Five years ago would you have seriously thought they'd stick multiple cores in a phone? 64-bit will happen because phones are rapidly becoming more complex, and also because just like the first multi-core phones it will be a huge marketing advantage, because suddenly all those 32-bit phones will look weak and puny.

    144. Re:Speed versus complexity by mparker762 · · Score: 1

      This may change as the x86/x64 compiler writers start paying attention to the "optimize for size" bits of the compiler. For the last few decades all the attention has been paid to speed optimizations. It's possible that some of the advantage of ARM in this regard is due to the historically different focus of their compiler teams.

    145. Re:Speed versus complexity by Bert64 · · Score: 1

      M68k was indeed cleaner, but Motorola chose cleanliness and dropped 68k in order to create PPC, while Intel went for backwards compatibility...
      The end result was that while the m68k was a very widely used processor, most of the vendors who had been using them switched to something else rather than PPC, which starved PPC of customers and ultimately of development.

      Had Motorola continued with 68k it might have been a different story, an extended 68k while still crufty would probably still have been better than an extended x86.

      Incidentally, Intel were not alone in selling a separate FPU... Motorola made the 68881 and 68882 FPUs for the 68k range, and it was not until the 68040 that they integrated the FPU. The 68040 FPU was also cut down relative to the 68882, as it removed some of the instructions and required software emulation in order to support software written for the 68882.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    146. Re:Speed versus complexity by Bert64 · · Score: 2

      You mean, having a 16-bit cpu support a FULL MEGABYTE instead of the usual 64Kb of Ram? In 1979? Pure evil.

      In 1979, in a rather crufty way... As opposed to the 68000, which was also released in 1979 and supports 16 FULL MEGABYTES of RAM. It does so using 32 bit addressing, such that even though only 24 address lines are connected, your code for the 68000 should run just fine on the 68020 (which could support up to 4GB of RAM), unless you do something crufty like use the upper 8 bits for storing data (as some software did).
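
      A small Python sketch of the difference, assuming a simplified model in which the bus address is just the pointer masked to the number of connected address lines. The constant and function names are illustrative only, not taken from any real toolchain.

```python
ADDR_MASK_68000 = 0x00FFFFFF   # 24 address lines -> 16 MB
ADDR_MASK_68020 = 0xFFFFFFFF   # 32 address lines -> 4 GB


def bus_address(pointer, mask):
    """Address the CPU actually drives onto the bus for a 32-bit pointer."""
    return pointer & mask


def real_mode_address(segment, offset):
    """8086 segment:offset pair folded into a 20-bit (1 MB) physical address."""
    return ((segment << 4) + offset) & 0xFFFFF


clean = 0x00123456
tagged = 0xAB123456            # a program hiding 0xAB in the unused top byte

# On the 68000 both pointers reach the same memory...
print(bus_address(clean, ADDR_MASK_68000) == bus_address(tagged, ADDR_MASK_68000))
# ...but on the 68020 the tagged pointer lands somewhere else entirely.
print(bus_address(clean, ADDR_MASK_68020) == bus_address(tagged, ADDR_MASK_68020))
# Two different 8086 segment:offset pairs can name the same byte.
print(real_mode_address(0x1234, 0x0010) == real_mode_address(0x1235, 0x0000))
```

      The three lines print True, False and True: flat pointers survive the move to a wider bus as long as nobody abused the top byte, while segment:offset pairs alias each other by design.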

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    147. Re:Speed versus complexity by dgatwood · · Score: 1

      At first glance, their prototype's battery life numbers are around half what the iPhone 4S gets (slightly better for Wi-Fi, slightly worse for 3G), and that's pitting a 32nm Intel chip against a 45nm ARM chip. The current 32nm ARM chips used in the new iPad reportedly increase its battery life by an additional 25% over 45nm versions. So at first glance, it would appear that 32nm dual-core ARM chips can still provide somewhere around 2.5x the battery life when doing light CPU activity.
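
      Spelled out, the back-of-the-envelope arithmetic looks like this. It is only a sketch: it takes "around half" and "an additional 25%" at face value and assumes the iPad's 32nm gain would carry over to a phone, which is a guess.

```python
# Figures as stated above, normalised to the iPhone 4S (45nm ARM).
iphone_4s_45nm = 1.0
intel_prototype_32nm = 0.5 * iphone_4s_45nm   # "around half"
arm_32nm = 1.25 * iphone_4s_45nm              # "an additional 25%"

ratio = arm_32nm / intel_prototype_32nm
print(ratio)   # 2.5
```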

      Of course, it's hard to say how much of the difference is caused by the CPU, how much of it is caused by the OS they're running, how much of it is caused by differences in the cellular radio (and/or aggressiveness when powering that radio off), compiler optimization, etc. I'll be interested to see if those numbers improve significantly once a few more cell phone companies begin fine tuning the OS and drivers.

      It will also be interesting to see how it performs under heavier load (e.g. games). It might be worth the loss in web browsing hours to get a gain in gaming hours for some people. Either way, I'm impressed by how quickly Intel has narrowed the gap, but at least at first glance, it looks like they have a little ways to go yet.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    148. Re:Speed versus complexity by GuB-42 · · Score: 1

      Macs switched from POWER to x86 too.

    149. Re:Speed versus complexity by Tough+Love · · Score: 1

      Then the nice man from the DoJ says "yeah then why do you sell this chip for $1 while selling that chip for $100 when they cost the same to make?" Busted.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    150. Re:Speed versus complexity by Fallingcow · · Score: 1

      My dual-core Atom can't handle 1080p in software.

      Linux was unusable because its video card drivers only sort-of understood how to use the thing's GPU, so I switched to win7--now, with proper drivers, 1080p works if it's a codec that gets offloaded to the GPU, but becomes a stuttering mess if it has to do it in software.

      Count me as someone who will never buy an Atom again. Some of them may be fine, but the platform consistency is dodgy as hell. Too risky.

    151. Re:Speed versus complexity by TeknoHog · · Score: 1

      OK, so I used the wrong term, but I hope it does not ruin my general point of a stable interface.

      --
      Escher was the first MC and Giger invented the HR department.
    152. Re:Speed versus complexity by hairyfeet · · Score: 1

      Not to mention that might be a valid argument IF...Intel and AMD weren't already building GPUs into their chips, but they are. And in both cases the power draw has been dropping steadily while the IPC has been going up and up. Hell, I get anywhere from 6 to 7 hours on my E350, and that's playing 720p video the entire time; if I'm only surfing and watching the occasional YouTube video (which is hardware accelerated, BTW), that time goes up. And of course the Intel Atoms are getting more advanced on the GPU side every day; it won't be long at all before they surpass the Radeon GPUs that AMD uses in their Bobcats.

      In the end it all comes down to IPC, where Intel has been stomping for half a decade. Look at the benches I posted and see that on their FIRST TRY they got 30% higher performance for the SAME POWER DRAW, which is just unreal. Now imagine... what do you think will happen after a couple of Intel's tick-tocks? It won't even be a contest.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    153. Re:Speed versus complexity by hairyfeet · · Score: 1

      No offense, but Linux and hardware acceleration? NOT the best of friends, in fact the only one I've seen do it consistently is Nvidia and that is only with proprietary drivers which rumor has it guts a lot of the graphics subsystem and replaces it with their own. This is why I prefer AMD on netbooks, because OOTB with Win 7 I could not only have every major format, including flash and DivX in full 1080p over HDMI, but I could do so while getting nearly 6 hours on the battery.

      But the new Intel chip already beats ARM by 30% on their very first try, and while they haven't been good for gaming, Intel HAS been improving their HD decoder pretty steadily and quickly. Give Intel a couple of tick-tocks and I have a feeling it won't be a contest. Again this comes down to money: Intel can afford to work out the bugs on the new shrinks quicker and they can afford insane amounts of R&D that frankly nobody else can. They can leverage their practically total control of X86 and pour some of that vast warchest into these new chips, whereas nobody doing ARM, hell, I doubt even Apple, will be willing to sink the kind of R&D that Intel is gonna be doing.

      In any case the one I see winning is the consumer, because just as we saw with X86 when there is competition the products get better and cheaper.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    154. Re:Speed versus complexity by rev0lt · · Score: 1

      I'm far from being an expert in ARM (or RISC), but according to Wikipedia, integer divide is only present in ARM since v7.
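
      Without a divide instruction, the work falls to a library routine. A minimal Python sketch of the classic shift-and-subtract approach is below; udiv32 is a made-up name, and real runtime libraries use more heavily tuned variants, so treat this as an illustration of the idea only.

```python
def udiv32(numerator, denominator):
    """Unsigned 32-bit divide by shift-and-subtract (restoring division).

    Returns (quotient, remainder) using only shifts, compares and subtracts,
    the kind of operations a core without a divide instruction still has.
    """
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Bring down the next bit of the numerator.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << bit
    return quotient, remainder


print(udiv32(1000, 7))   # (142, 6)
```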

    155. Re:Speed versus complexity by unixisc · · Score: 1

      Yeah, that was a shame. Had Microsoft ported all their software to, say, the Alpha and the MIPS, letting the Alphastations be used as workstations and MIPS be used as laptops, they'd have done just fine.

    156. Re:Speed versus complexity by rev0lt · · Score: 1

      The 68000 wasn't a direct competitor of the 8086. It was a high-end CPU, and reached the general public in 1980 (the 8086 was introduced in 1978). By then, you also had, e.g., the Z8000, which addressed up to 8 MB of RAM and supported 64-bit registers.
      The 68020 processor was released less than a year before the i386, and the i386 could also run 8086/8088/286 code. Intel could have easily beaten Motorola if they had stopped playing with the i432 architecture, and instead concentrated on x86.

      But yeah, the 68000 family is awesome :)

    157. Re:Speed versus complexity by IntlHarvester · · Score: 1

      There actually was a window before the 21064 started shipping where Intel was damn close in SPECmarks. Might not have been "fair", but it was enough to convince people that NT/Alpha was a bad investment.

      --
      Business. Numbers. Money. People. Computer World.
    158. Re:Speed versus complexity by Dahamma · · Score: 1

      Except I contributed to the discussion, and you did not. Hypocrite much?

    159. Re:Speed versus complexity by rev0lt · · Score: 1

      I had no experience with Xenix, but given that there was an earlier version for 8086/8088, it probably didn't implement memory protection in the 286 either. I still have a 8MHz 286 with a couple of 20MB RLL drives.

      Windows 3 would even run on the 286, anything that didn't require Win32S usually ran fine.

      When Windows 3 was released, high-end i386 workstations were already common. Windows 3 did not use any memory protection features with the 286 (it ran in "real" or "standard" mode). To take advantage of actual memory protection features, you needed a 32 bit processor ("enhanced" mode).

    160. Re:Speed versus complexity by IntlHarvester · · Score: 1

      It depends, as Drinkypoo pointed out, the later 21064 chips demolished the Pentium Pro. However, there was a window in 1995 when Intel was putting out comparable SPEC scores, and that was enough to 'freeze' interest in NT/Alpha. (and yes, I mean native code)

      Also NT/Alpha benchmarks generally didn't show the chip in the best light, probably due to running in 32-bit mode, MS compiler, etc. Some of the application benchmarks were pretty miserable as well, considering.

      --
      Business. Numbers. Money. People. Computer World.
    161. Re:Speed versus complexity by BasilBrush · · Score: 1

      It doesn't really matter what 8-bit and 16 bit chips are classified as. Where they are still being used it's in places where performance doesn't matter.

      ARM and X86, on the other hand, are head to head in current day computing devices. And ARM is vastly more used than X86.

    162. Re:Speed versus complexity by rev0lt · · Score: 1

      plus that leaves opportunity for optimization in cases where you can make certain assumptions about the numbers you're working with

      but the availability of a div instruction doesn't stop you from using whatever optimizations you see fit, as most compilers already do in architectures that do have a div instruction.
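
      To make that concrete, here is a Python sketch of two of the usual tricks a compiler applies when the divisor is a known constant, whether or not the target has a div instruction. The function names are made up; the magic constant 0xCCCCCCCD is the well-known reciprocal for unsigned 32-bit division by 10.

```python
def div_by_8(x):
    """x // 8 for unsigned x: a power-of-two divisor becomes a right shift."""
    return x >> 3


def div_by_10(x):
    """x // 10 for 0 <= x < 2**32 via multiply-and-shift.

    0xCCCCCCCD is ceil(2**35 / 10); the product needs 64 bits of headroom.
    """
    return (x * 0xCCCCCCCD) >> 35


for value in (0, 9, 10, 12345, 0xFFFFFFFF):
    assert div_by_8(value) == value // 8
    assert div_by_10(value) == value // 10
print(div_by_10(12345))   # 1234
```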

    163. Re:Speed versus complexity by Rich0 · · Score: 1

      That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

      If the CPU can convert x86 instructions to RISC, why couldn't that be done in software?

      I can use software (qemu) to convert ARM opcodes into x86 and simulate a 400MHz Smartphone on my 3GHz quad-core desktop. Said smartphone takes about 5 minutes to boot (just try out the Android SDK sometime).

      Sure, it can be done. You can also implement a quad-core i7 using a sendmail configuration file. However, it isn't going to perform as well as dedicated hardware that is HIGHLY optimized to the task.

    164. Re:Speed versus complexity by snadrus · · Score: 1

      Enough backward compatibility is a big reason why they're in tablets: because upgrading C-side Android software from the smart phone is a recompile with virtually no gotchas. Phones are always about the battery life and ARM is an order of magnitude better here while being fast enough to eke by.

      --
      Science & open-source build trust from peer review. Learn systems you can trust.
    165. Re:Speed versus complexity by painandgreed · · Score: 2

      FWIW, ARM isn't a pure RISC instruction set.

      I think that was his point. There is nothing that is "pure" CISC or pure RISC these days as they have been borrowing tech from each other for years now.

    166. Re:Speed versus complexity by Hatta · · Score: 1

      Sure, but presumably it only has to be done once. Run your x86 to risc converter on the binary and you're done, right?

      --
      Give me Classic Slashdot or give me death!
    167. Re:Speed versus complexity by crashumbc · · Score: 1

      Do you THINK they're not? Seriously, like the other poster mentioned, it's 2-4 years tops... probably much less for tablets...

      Look where we were 5 years ago...

      http://www.mobilewhack.com/top-ten-cell-phones-of-2007/

    168. Re:Speed versus complexity by Chuckstar · · Score: 1

      But that's not why Intel doesn't make ARM chips. The licensing fees are tiny.

    169. Re:Speed versus complexity by mzs · · Score: 1

      True, but that's not the case with ARM though since Thumb mode. Most instructions are then two bytes. ARM also has some neato features in its instruction set, like you can shift for free almost every time you do anything else with a register, like arithmetic. Also, condition codes only change when you want them to. And almost every instruction can be conditional. This makes it so that the little cache there tends to be is utilized pretty well and you don't really need the branch prediction logic as much. I think that, unlike with the other RISC chips, in the case of ARM, IA32 keeps up largely because Intel's chips are made on a more modern, efficient process.

    170. Re:Speed versus complexity by Tough+Love · · Score: 1

      No offense, but Linux and hardware acceleration? NOT the best of friends, in fact the only one I've seen do it consistently is Nvidia and that is only with proprietary drivers which rumor has it guts a lot of the graphics subsystem and replaces it with their own.

      I'm getting 75 million phong shaded triangles per second at 1920x1200 using the open source Radeon driver on a fanless 6450. If that isn't hardware acceleration, what is? The Intel driver is also turning in a respectable performance lately, not in that class but respectable.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    171. Re:Speed versus complexity by Tough+Love · · Score: 1

      No offense, but Linux and hardware acceleration? NOT the best of friends, in fact the only one I've seen do it consistently is Nvidia and that is only with proprietary drivers which rumor has it guts a lot of the graphics subsystem and replaces it with their own.

      You're basically talking out of your ass. AMD's proprietary catalyst driver turns in roughly the same performance as NVidia's driver, and the open source Radeon driver is perfectly usable, impressive even, provided you stick to core OpenGL, which you should anyway. I don't bother running Catalyst any more for that reason. And for your information it is normal for an OpenGL hardware driver to reimplement as much of the "graphics subsystem" as possible within itself. But like any other OpenGL driver on Linux, NVidia relies on GLX for DMA to the card, which works perfectly well.

      You need to update your clue.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    172. Re:Speed versus complexity by Alomex · · Score: 1

      Exactly this.

      Intel's victory has more to do with the adoption of RISC principles, which include deprecation of various truly CISC instructions of the x86 set, than with manufacturing power. Granted, initially they won some rounds on manufacturing power, but by the 386 they were already deprecating the worst instructions. They are still there for backward compatibility and they are silicon compiled on the fly into simpler RISC-like instructions, but compiler writers are told not to use those instructions to begin with.

    173. Re:Speed versus complexity by Alomex · · Score: 1

      That's real estate that could be spent on bigger cache or more registers.

      Right, because real estate is at such a premium that we can barely manage to fit in four cores on a single die with 8M cache, so we couldn't possibly afford a few hundred transistors to decode the arcane instruction set.

      You should go back to bed and set your alarm clock to 2012. Real estate for instruction decoders stopped being an issue over ten years ago.

    174. Re:Speed versus complexity by naasking · · Score: 1

      Right, because real estate is at such a premium that we can barely manage to fit in four cores on a single die with 8M cache, so we couldn't possibly afford a few hundred transistors to decode the arcane instruction set.

      Cores can be shut down to conserve power, as can caches in some cases, but instruction decoders cannot. I think you underestimate how power usage scales with numbers of transistors. Since this whole article is heavily biased towards low power and mobile computing, that's a very relevant factor.

    175. Re:Speed versus complexity by Alomex · · Score: 1

      Cores can be shut down to conserve power, as can caches in some cases, but instruction decoders cannot.

      This is not how a modern x86 processor works. The arcane instruction set doesn't even reach the instruction registers. At the pipeline stage a simple power efficient test can isolate the CISC-like instructions and handle them through a different, normally dormant silicon compiler. This means CISC instructions execute way slower, since they are pulled out of the fast path, which is why they are so heavily deprecated in Intel's technical documents, but they are still there, and no, they do not consume massive amounts of power to decode.

    176. Re:Speed versus complexity by Alomex · · Score: 1

      which are significantly larger than merely thousands of transistors.

      They are not. Microcode translators are not particularly big even for an entire instruction set, much less for a few deprecated instructions.

       

    177. Re:Speed versus complexity by arkane1234 · · Score: 1

      Macs switched from POWER to x86 too.
      ... and there were greater than zero legacy compatibility requirements.

      --
      -- This space for lease, low setup fee, inquire within!
    178. Re:Speed versus complexity by TheRaven64 · · Score: 1

      ARMv7 means anything with Cortex in its name. That's basically any ARM chip that you're likely to encounter these days, unless you are doing really low-end embedded development.

      --
      I am TheRaven on Soylent News
    179. Re:Speed versus complexity by TheRaven64 · · Score: 1

      Speaking as a compiler writer who works mainly on optimisation: we do the same size optimisations for ARM and x86. All of that stuff happens in the processor-independent part of the compilation pipeline. Anything that benefits x86 in this regard is likely to benefit ARM equally.

      --
      I am TheRaven on Soylent News
    180. Re:Speed versus complexity by Darinbob · · Score: 1

      Yes but you still had to recompile. Compatibility on desktop means binary compatibility.

    181. Re:Speed versus complexity by Bengie · · Score: 1

      Exactly. The Bulldozer idea is a great idea. FPU/SIMD is a lot of idle transistors for most workloads.

    182. Re:Speed versus complexity by Bengie · · Score: 1

      "Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions."
      What's wrong with SIMD style instructions? Many archs use it, from ARM to GPUs to x86.

    183. Re:Speed versus complexity by Bengie · · Score: 1

      "The ARM register set consists of 37 general-purpose registers, 16 of which are usable at any one time."
      In that case, Intel CPUs have hundreds of general-purpose registers, 16 of which are usable at any one time. The CPU does behind the scenes optimization with access to hundreds of registers. The 16 registers you see are virtual registers, not the real ones.
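
      A toy sketch of what "virtual registers" means here, assuming the simplest possible rename stage: every write gets a fresh physical register and nothing is ever freed. The sizes (16 visible, 128 physical) and the `rename` helper are illustrative assumptions, not how any particular Intel core does it.

```python
def rename(instructions, num_arch_regs=16, num_phys_regs=128):
    """Map architectural registers to physical ones.

    `instructions` is a list of (dest, [sources]) using names r0..r15.
    Each write is given a fresh physical register (p0..p127).
    """
    # Initially architectural register i lives in physical register i.
    mapping = {f"r{i}": f"p{i}" for i in range(num_arch_regs)}
    next_free = num_arch_regs
    renamed = []
    for dest, srcs in instructions:
        # Sources read whichever physical register currently holds the value.
        phys_srcs = [mapping[s] for s in srcs]
        if next_free >= num_phys_regs:
            raise RuntimeError("out of physical registers (freeing not modelled)")
        # The destination gets a brand new physical register.
        mapping[dest] = f"p{next_free}"
        next_free += 1
        renamed.append((mapping[dest], phys_srcs))
    return renamed


program = [
    ("r1", ["r2", "r3"]),  # r1 = r2 op r3
    ("r4", ["r1"]),        # r4 = f(r1), a true dependency on r1
    ("r1", ["r5", "r6"]),  # r1 = r5 op r6, reuses the *name* r1
    ("r7", ["r1"]),        # r7 = f(r1)
]
for dest, srcs in rename(program):
    print(dest, "<-", srcs)
```

      After renaming, the two writes to r1 land in different physical registers, so the second pair of instructions no longer has to wait for the first pair just because they share a register name.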

    184. Re:Speed versus complexity by Bengie · · Score: 1

      Allows for much much faster atomic updates when doing thread syncing.

    185. Re:Speed versus complexity by KingMotley · · Score: 1

      Why would you be comparing ARM to x86? If you are going to be pulling an ancient version of the instruction set for Intel from 1978, then let's do an apples to apples comparison and compare that to the 1978 version of ARM. The 1978 Intel chip (8086) had AX, BX, CX, DX, SI, DI, BP, and SP registers (not counting IP, and not including the segment registers CS, DS, ES, SS, the flag register, or floating point). You are right, there really only were 4 general ones (AX, BX, CX, DX), with some operations, like string operations, limited to using BX, SI, DI, BP, and CX for counting/loop operations. ARM had 0.

      Or we could compare the 32-bit version of Intel chips that have been around since 1985: extend those out to 32 bits and make them all general registers, add 2 more segment registers, add 8 multimedia registers (MM0-MM7), add 8 SIMD registers (XMM0-XMM7), and a few more status registers (CR0-CR4, TR3-TR4, DR0-DR3, DR6, DR7, TR, GDTR, LDTR, IDTR). Still ARM had 0 (unless you want to count Acorn, which came out a few months later, and an actual working computer 2 years after that).

      In 1987 Intel added register renaming making the total general purpose registers 128, while only being able to see 16. While you can't directly address these 128 general purpose registers, they work behind the scenes to effectively give you a much larger register size than what you see. All the benefit of more registers but without having to specifically use them.

      Or we could compare what we had in 2004. Intel allowed to specify an additional 8 more registers R8-R15 (Still 128 in the background). ARM has 13-14ish.

    186. Re:Speed versus complexity by phantomfive · · Score: 1

      Then why doesn't Intel make ARM chips?

      --
      "First they came for the slanderers and i said nothing."
    187. Re:Speed versus complexity by phantomfive · · Score: 1

      If it does, your song in your sig more than makes up for it

      --
      "First they came for the slanderers and i said nothing."
    188. Re:Speed versus complexity by hairyfeet · · Score: 1

      It is NOT a good idea and here is why: because the suits at AMD never bothered to pick up the damned phone and call MSFT you have a situation where AMD's newest chip will ONLY run decently on Windows 8 and probably won't run WELL until Windows 9, 5 years from now. Which of course by then the chip will look like a Pentium D.

      You see, for this idea to actually work Windows has to recognize the BD for what it is and schedule the chip completely differently from the way it's scheduled EVERY SINGLE CHIP ever made. It's the same way that you can't run Hyperthreading on Windows like Win98 because it didn't understand WTF to do with hyperthreading. MSFT has released a patch but even they have admitted it'll be Win 8 at the earliest before the scheduler is even fixed, which means it won't ACTUALLY be fixed or optimized fully until Win 9, as we have seen MSFT is stuck in the "One shitty one good" Star Trek formula.

      So you see friend, when you have the most popular OS that runs on your chip, that is on over 90% of X86 desktops and laptops AND of which nearly 70% are running previous versions, making a chip that runs like it has a boat anchor tied to it unless you go buy a new version in Oct that is so far looking to be the next MS Bob is a BAD IDEA. It is an insanely bad idea, hell even netburst wasn't that stupid of an idea as at least you could run current Windows on it. As it is, if you have XP, Vista, or Win 7 then bulldozer is faildozer and there isn't a damned thing you can do about it because MSFT has already said WON'T FIX. The best you get is a half assed patch that only added about 6% to the already piss poor numbers. Hell Thuban drinks Faildozer's milkshake on any of the above OSes damned near across the board!

      So I'm sorry friend, if AMD were writing their own OS and selling it along with faildozer, like say Apple? THEN they could do this. But since they are at the mercy of MSFT and didn't even bother to give anyone at Redmond the heads up as to what they are planning, Faildozer is screwed. On Win 7 (which looks to be the next XP) even the Intel duals stomp all over it unless you OC the living shit out of it. It's just not a good design when you can't get it to work without a low level OS rewrite when you don't control the OS.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    189. Re:Speed versus complexity by unixisc · · Score: 1

      I'd argue that this fault was HP's. Unlike Intel, they had a great RISC CPU - the PA-8x00 - which was the closest rival to the Alpha at that time (before POWER3 regained the lead). It was they who acquired 2 VLIW companies - Multiflow and Cydrome - and made VLIW the centerpiece of their next CPU architecture. If HP - w/ that expertise - thought that VLIW was a progression from RISC, who could blame Intel for believing that it would be even a bigger quantum jump from x86? The move to 64-bit was a golden opportunity to leave the CISC legacy behind, but AMD put a spanner in the works by coming out w/ the x64 instruction set extensions.

      The compiler argument is a good one, but it wasn't a mere case of the compilers being difficult. It's that some of the functions, such as branch prediction, speculative execution, register renaming and so on, which were implemented in silicon on RISC but moved to compilers in EPIC, turned out to have only a 5-10% savings in die area. As a result, it turned out that RISC was actually a good optimal balance b/w too complex hardware, as in CISC, vs too complex compilers, as in EPIC.

      From what I've read, Itanium did do a fine job emulating PA-RISC, which is how the initial transitions were made from PA-RISC to Itanium. Still, I do think that the Itanium's introduction into the market was a lamentable one, since it saw to the end of some great RISC CPUs - like PA-RISC and Alpha, as well as the marginalization of some other fine CPUs, such as MIPS and UltraSparc. However, w/ all OSs today being multicore, they now have the same opportunity as other CPUs of going multi-core and extracting more performance there. What they need to do - if they don't want to EOL the platform - is to drop the idea of it being exclusively a server platform - and market segment it for different uses - from laptops to supercomputers.

    190. Re:Speed versus complexity by TheRaven64 · · Score: 1

      If you want the update to be visible to multiple threads, it still needs to go via the cache coherency protocol. Whether you're doing a register-memory add or a register-register add and a store makes no difference.

      --
      I am TheRaven on Soylent News
    191. Re:Speed versus complexity by Glasswire · · Score: 1

      There are two things a shrink lets you do. Either add more transistor real estate (more cores, cache, CPU engines, specialized functions - basically more performance and functionality) OR reduce die size (or a mixture of both). Reducing die size means more dies per wafer, and since wafer cost is somewhat fixed, your cost per die comes down and you can more aggressively compete on price. When you imply that better process technology means lower profits, you're not necessarily right - you can increase your % margin even on processors you're selling for less money. Of course, to make the same or better profit, you need greater revenue (more of those lower price but lower cost parts sold).
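
      A back-of-the-envelope version of that arithmetic. Everything numeric here is an assumption picked for illustration (300 mm wafer, $5000 per processed wafer, a 100 mm² die shrinking to 50 mm²), the dies-per-wafer formula is the usual area-minus-edge-loss approximation, and yield is ignored entirely.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


def cost_per_die(wafer_cost, wafer_diameter_mm, die_area_mm2):
    """Wafer cost spread over every die on it (assumes 100% yield)."""
    return wafer_cost / dies_per_wafer(wafer_diameter_mm, die_area_mm2)


WAFER_COST = 5000.0   # assumed, in dollars
WAFER_MM = 300        # assumed wafer diameter

before = cost_per_die(WAFER_COST, WAFER_MM, 100.0)  # die before the shrink
after = cost_per_die(WAFER_COST, WAFER_MM, 50.0)    # same design, half the area
print(f"before: ${before:.2f}/die, after: ${after:.2f}/die")
```

      With the wafer cost held fixed, halving the die area roughly halves the cost per die, which is the room for either a lower price or a fatter margin described above. A real comparison would also have to account for yield and for the newer process costing more per wafer.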

    192. Re:Speed versus complexity by drinkypoo · · Score: 1

      but the availability of a div instruction doesn't stop you from using whatever optimizations you see fit, as most compilers already do in architectures that do have a div instruction.

      While that is true, if it's not going to buy you anything to put it in the CPU, putting it in there only increases the opportunity for an error you can't fix in software... So I guess the opportunity isn't for optimization so much as to not make a mistake in the first place, which I admit is pretty different.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    193. Re:Speed versus complexity by Glasswire · · Score: 1

      I don't think it has been claimed that ARM is performance competitive with top-end expensive processors.

      Obviously, Intel is going to introduce their best technology in the premium parts first, but Intel waterfalls architecture generations quickly. There are already 'value brand' versions of the Ivy Bridge predecessor, Sandy Bridge available and I'm guessing you'll see inexpensive IB gen procs by end of year.
      So there's nothing inherently expensive about Ivy Bridge, just the models available at this moment.

    194. Re:Speed versus complexity by Locutus · · Score: 1

      Good point, and valid as long as your failure rate after the die shrink plays in your favor. And history has shown that Intel hasn't dropped prices on CPUs on new processes when they first hit the market. They don't seem to be low balling their ARM competing chips as it is, so they are already trying to do a balancing act to keep profits up. So far, they have had little success in the mobile devices sector. Talk is cheap though.

      LoB

      --
      "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
    195. Re:Speed versus complexity by rev0lt · · Score: 1

      if it's not going to buy you anything to put it in the CPU

      I did not say that, but the opposite. Having a div instruction (specially if it is a high-performance one, such as the one available since the Pentium line in x86) greatly simplifies software development, and often performs faster than subroutine-based divs. That doesn't stop you from applying the usual tricks and optimizations for the special cases that will benefit from them.
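
      For a sense of what a "subroutine-based div" involves, here is a minimal sketch of the classic shift-and-subtract (restoring) division, assuming unsigned 32-bit operands. `udiv32` is a made-up name; real runtime library routines are more heavily optimized than this.

```python
def udiv32(n, d):
    """Unsigned 32-bit division by shift-and-subtract.

    Returns (quotient, remainder). One compare/subtract step per bit,
    so 32 iterations regardless of the operands.
    """
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    for i in range(31, -1, -1):
        # Bring down the next bit of the dividend into the remainder.
        r = (r << 1) | ((n >> i) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if r >= d:
            r -= d
            q |= 1 << i
    return q, r


print(udiv32(100, 7))
```

      The loop is short, but it is still dozens of dependent operations per division, which is why a hardware div usually wins when the divisor isn't a compile-time constant.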

    196. Re:Speed versus complexity by tzot · · Score: 1

      The x32 ABI would be most useful in phones. Hell, I can't wait till I can use x32 software on my computer.

      --
      I speak England very best
    197. Re:Speed versus complexity by drinkypoo · · Score: 1

      I did not say that, but the opposite. Having a div instruction (specially if it is a high-performance one, such as the one available since the Pentium line in x86) greatly simplifies software development, and often performs faster than subroutine-based divs.

      Yes, I know you said the opposite, but you were wrong. You're not going to add a multi-cycle div instruction to a RISC CPU because that's not RISC. If you can't handle inserting a 13 line macro you shouldn't be writing assembler, and if your compiler or preprocessor or whatever can't handle it then you need a new one.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    198. Re:Speed versus complexity by rev0lt · · Score: 1

      You're not going to add a multi-cycle div instruction

      You are hilarious. Let's say 15 years is moderately modern - every single x86 chip released in the last 15 years (at least) implements a single-cycle, pairable div instruction, so it's not rocket science. And while they *do* have a complex CISC instruction set, they are all internally decoded to RISC-like operations.

      If you can't handle inserting a 13 line macro you shouldn't be writing assembler,

      Actually, it is called "assembly", and some *very popular* RISC processors DO implement the instruction (ARMv7, POWER, PowerPC, SH4, etc). Oh, and OpenSPARC? You even have multiple versions of it. We are not in 1990 any more.

    199. Re:Speed versus complexity by cheekyboy · · Score: 1

      As I recall, those 5W Atom laptops actually used more, because the Intel chipset next to the CPU used 20W+ to do all the chipset I/O etc...

      --
      Liberty freedom are no1, not dicks in suits.
    200. Re:Speed versus complexity by Chuckstar · · Score: 1

      When Intel sold its ARM business in 2006 it said: "The sale also will enable Intel to focus its investments on its core businesses, including high-performance, low-power Intel Architecture-based processors and emerging technologies for mobile computing, including Wi-Fi and WiMAX broadband wireless technologies."

      They were making ARM chips. They decided to sell the business. It's unlikely that such a decision hinged on a few pennies a chip for the licensing fees.

  2. Well... by QuietLagoon · · Score: 5, Insightful

    What did you expect him to say... that an Intel product was not suitable for the mobile marketplace? That would have been career suicide for him. He is singing from the Intel songbook. Those songs may not be sung with what is best for the customer in mind.

    1. Re:Well... by symbolset · · Score: 1

      At least he settled one thing quite clearly. We need not hold off our purchases of a quad-core Android tablet like the new Nexus 7" tablets to be released soon, in hopes of getting a cool Intel Android tablet instead. Because they're not going Android on tablets anytime soon. He thinks tablets are for Windows. BWaaaa hahaha.

      --
      Help stamp out iliturcy.
    2. Re:Well... by mewsenews · · Score: 1

      Exactly. Intel seems like a great company with intelligent engineers, but look how long it's taken for them to come even close to ATI or Radeon discrete graphics. They're not going to be in the cell phone game anytime soon.

      And an intel based iphone? Not soon. Maybe in a few years. MAYBE. I'll believe it when I see it.

    3. Re:Well... by viperidaenz · · Score: 1

      x86 != Windows. Intel do work on x86 Android support as well.

    4. Re:Well... by yuhong · · Score: 1

      Funny that Intel once made the XScale ARM processor.

    5. Re:Well... by ganjadude · · Score: 1

      To be fair, Intel's bread and butter is their CPU, not the GPU. Obviously someone who specializes in GPUs should have the edge.

      --
      have you seen my sig? there are many others like it but none that are the same
    6. Re:Well... by fuzzyfuzzyfungus · · Score: 1

      Given that Intel's mobile graphics strategy has simply been 'license the same stuff from PowerVR as most of the ARM licencees that don't have an in-house design' there doesn't seem to be anything obviously uncompetitive about it.

      They aren't going to pull any design wins on the strength of their GPU; because it's the same damn GPU as a number of others; but they also aren't going to be put out in the cold by it...

    7. Re:Well... by symbolset · · Score: 1

      It seems that the no Intel Android tablet and the expensive WinRT licensing are part of the old WinTel tango. It seems the lovebirds are settling their spat.

      --
      Help stamp out iliturcy.
    8. Re:Well... by hattig · · Score: 1

      The issue is that with low power devices, the efficiency comes from dedicated hardware blocks, so the CPU doesn't need to be as powerful.

      (And Intel have licensed those blocks for their Atom SoCs, they're not stupid).

    9. Re:Well... by tlhIngan · · Score: 1

      Exactly. Intel seems like a great company with intelligent engineers, but look how long it's taken for them to come even close to ATI or Radeon discrete graphics. They're not going to be in the cell phone game anytime soon.

      And Intel's eating AMD's and nVidia's lunch in graphics - their graphics power something like 80-90% of the PCs shipped today. They don't need something fast and fancy - just something that an OEM can stick in a PC.

      Basically, OEMs making cheap PCs wanted a cheap video card. Intel obliged by providing an OK video card to go in them, and OEMs flocked there because they could start making PCs below the $1000 mark easily. ATI and nVidia were competing against the high end while the others were all low-end offerings that were still discrete. Intel put it on the chipset and OEMs could build a PC with one less set of chips (graphics+memory) saving a lot of money.

      Intel's not in the high end game and probably won't be - they just need to iterate as much as necessary to still be a viable GPU for the vast majority of cheap computers.

  3. Turn that boat around by busyqth · · Score: 5, Insightful

    Intel spent many years chasing performance with little thought of power draw.
    Now they are putting all their engineering muscle into minimizing power requirements, while maintaining high performance.
    I don't see any reason to think they won't succeed, and if they do, then ARM will end up a niche architecture.

    1. Re:Turn that boat around by Sir_Sri · · Score: 1

      They worried a lot about power draw and leakage current. They were just worried about somewhat arbitrary targets of 45, 65 and 130 W TDP. If you give their engineers an equally arbitrary 4.5W power envelope they'll work on that.

      The thing for Intel has always been that the easiest way to reduce power consumption is a die shrink. Which it is. If they can stay one node ahead of the competition and transistor for transistor match performance more or less they'll have a big advantage. And as you say, they've turned their attention to power consumption.

      From my perspective Intel has been designing its CPUs for very different markets than mobile. Gaming, cheap, and servers. Gaming they've basically lost out to nvidia and ATI/AMD on, because there's no way to make a CPU do floating point the way a GPU does. There's probably a lot of stuff they can 'leave out' and just have it work fine in mobile. You could probably do 32 bit for mobile and ditch virtualization.

    2. Re:Turn that boat around by Darinbob · · Score: 1

      A niche that already sells more CPUs per year than Intel does. The high end computing market such as the desktop, smart phones, netbooks, tablets, those are just a fraction of the total CPUs sold. Every automobile has at least one CPU now, every home is going to have a CPU or two in the electric meter (even dumb ones), every microwave oven has one, every new appliance will have one, etc. Even your phone will have one front end CPU for the display and apps but probably a couple behind the scenes CPUs to do the real work like making phone calls, cleaning up audio quality, and managing the radios. The small CPU market is anything but a niche market.

    3. Re:Turn that boat around by 10101001+10101001 · · Score: 1

      Just like how all those automakers will make a big turn around... Oh, right... No, I'm not so optimistic that Intel can have its cake and eat it too. So far, all Intel's efforts, while impressive by x86 standards, are horrible by ARM standards. The only chance I see Intel really having is the same that the big three in the US have--be willing to fork a new brand and release subcompacts with the full knowledge that (a) it might take years for it to catch on in any meaningful sense and (b) it'll probably never supplant your main line because way too many people want performance with little thought of power/gas draw. One could argue that's what Intel's Atom line is all about, but look above. I'd argue Intel's Atom line is tantamount to the whole hybrid/electric fad. It misses the point that the only way to strip out most the power draw is to significantly shrink the die usage/car weight. Once you've gotten to that inherent point of improved performance, only then do you look in ways to augment extant components to incorporate technology that doesn't increase the die usage/car weight while still decreasing the power draw. Intel sort of went that direction...but they're going to have to regress a lot further back than the Pentium M with its multiple instruction cores. :/

      --
      Eurohacker European paranoia, gun rights, and h
    4. Re:Turn that boat around by gweihir · · Score: 1

      Already too late. Intel is about 2 decades late and it will take even them a long time to catch up. Also note that for devices running a Linux kernel, using ARM is not that much effort and is already well known to the developers, so these people do not need x86 for anything. The only people that would desperately need x86 with low power is Microsoft, because they have this basically x86-only monster of an OS, just look at all the limitations of Win8 on ARM.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    5. Re:Turn that boat around by tibit · · Score: 1

      A mid-range car would have about 10 CPUs in it, easy. High-end cars -- a few times that many.

      --
      A successful API design takes a mixture of software design and pedagogy.
    6. Re:Turn that boat around by drinkypoo · · Score: 1

      I don't see any reason they won't succeed in lowering their power requirements, but it remains to be seen if they can get them as low as ARM; even Intel's ARM processors weren't as good as the real thing, so they abandoned them. Remember what happened when Intel tried to reduce the power consumption of the P4? They went back to the P3. I think you are ascribing competence to Intel that they do not possess. I think their only actual advantage is inertia. They were able to illegally abuse their monopoly position in order to get ahead, and now that they have more fabs with a fancier process they will be ahead right up until someone else eats their lunch with another approach, and so far, that's looking like ARM.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  4. He's mostly right by Erich · · Score: 5, Insightful
    All those scalar processors look the same. You can trade energy efficiency for performance and end up with a lower power processor that's a lot slower. When you push the performance, the architecture doesn't matter as much, because most of the energy is spent figuring out what to run and when to run it.

    Compounding this fact, ARM isn't that great of an architecture. It's got variable length instructions, not enough registers, microcoded instructions, and a horrible, horrible virtual memory architecture.

    The big thing that ARM has is the licensing model. ARM will give you just about everything you need for a decent applications SOC. Processor, bus, and now even things like GPU and memory controllers. Sprinkle in your own company's special sauce, and you have a great product. All they ask is for a little bit of royalty money for every chip you sell. And since everyone is using pretty much the same ARM core, the tools and "ecosystem" are pretty good.

    But there's not much of an advantage to the architecture... the advantage is all in the business model, where everyone can license it on the cheap and make a unique product out of it.

    And nowadays, the CPU is becoming less important. It's everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.

    --

    -- Erich

    Slashdot reader since 1997

    1. Re:He's mostly right by Anonymous Coward · · Score: 3, Interesting

      Phoronix just did an article on 6 clustered Panda boards (Cortex A9) VS the other guys. It's worth a read.

    2. Re:He's mostly right by Darinbob · · Score: 5, Informative

      ARM has fixed length instructions. Thumb is a separate instruction set from ARM and is also a fixed size set. You can't easily interchange ARM and Thumb without making a function call. There is Thumb 2 that interchanges them more easily now. However the instruction set decoder for Thumb to ARM is so very very simple that it could even be a standard project in an undergrad CS class. Thumb really is for people who are willing to give up some performance to save space anyway. ARM has plenty of registers compared to Thumb. I think it has the sweet spot of 16 registers which is enough to not feel cramped but not so many that context switching or interrupts get in your way. ARM is not micro-coded in any model as far as I know, it is RISC and there's no reason to do any micro-coding (maybe in an FPU coprocessor?).

      However it does have a goofy MMU at times, however this is treated as a separate coprocessor and is not intrinsic to the ARM (a different ARM system-on-chip will handle memory mapping and VM differently, it is not standardized).
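
      To give a feel for how simple that Thumb-to-ARM expansion can be, here is a sketch covering just one original-Thumb format, the "op Rd, #imm8" group (move, compare, add, subtract with an 8-bit immediate). The function name and table are hypothetical and a real decoder handles many more formats; this only shows that each 16-bit encoding is a field shuffle away from a 32-bit ARM one.

```python
# ARM data-processing (immediate) templates with condition "always".
# Illustrative table for this one Thumb format only.
_ARM_TEMPLATE = {
    0b00: ("MOVS", 0xE3B00000),
    0b01: ("CMP", 0xE3500000),
    0b10: ("ADDS", 0xE2900000),
    0b11: ("SUBS", 0xE2500000),
}


def expand_thumb_imm8(halfword):
    """Expand a 16-bit Thumb 'op Rd, #imm8' into its 32-bit ARM equivalent."""
    if halfword >> 13 != 0b001:
        raise ValueError("not a move/compare/add/subtract immediate")
    op = (halfword >> 11) & 0b11     # which of the four operations
    rd = (halfword >> 8) & 0b111     # Thumb can only name r0-r7 here
    imm8 = halfword & 0xFF
    name, word = _ARM_TEMPLATE[op]
    if name == "MOVS":
        word |= rd << 12                 # destination only
    elif name == "CMP":
        word |= rd << 16                 # first operand only, no destination
    else:
        word |= (rd << 16) | (rd << 12)  # Rd = Rd op imm8
    return name, word | imm8


name, arm_word = expand_thumb_imm8(0x2005)   # Thumb: MOVS r0, #5
print(name, hex(arm_word))
```

      It's table lookup plus a few shifts, which fits the claim that Thumb trades some expressiveness (8 registers, small immediates) for size rather than being a different architecture.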

    3. Re:He's mostly right by locketine · · Score: 1

      "When you push the performance, the architecture doesn't matter as much, because most of the energy is spent figuring out what to run and when to run it."

      I doubt that.

      Hyperthreading, an Intel tech, significantly increases speed while not doing the same to power consumption or die size. Another Intel only tech, power boost allows them to run the processor at an unsustainable clock speed for a short period of time. There's also a concept of pipelining that allows multiple instructions from a single thread to run staggered as long as they won't collide in their use of a particular component within the CPU architecture and don't have hard inter-dependencies such as reading the result of the previous operation.

      Basically, features specific to a CPU architecture very much impact execution performance and efficiency. I guess you could have been talking strictly about the instruction set but that's only a very small part of a CPU architecture.

      --
      Think globally but act within local variable scope.
    4. Re:He's mostly right by fermion · · Score: 1
      I see it this way. As processor cycles and memory became cheaper, it became less economical to pay humans to write efficient code. It also frees up cycles that can drive all the eye candy in the modern OS. There is a limit to this, as we saw with MS Vista Aero. People are not going to pay just for eye candy. The purpose of a faster processor is to reduce the overall cost.

      I think the ARM revolution is greater than the license issue. I think it has to do with minimizing the cost of the total product, which is going to be interacted with using high level APIs and not directly using the hardware. The low level routines must be written once, and then distributed globally. I think this is kind of the approach used with the Intel microcode. The fact is that on mobile devices cycles are not nearly as cheap as on desktops. There are real costs in terms of batteries and opportunity costs in terms of heat.

      What is interesting is that when I think of RISC losing the desktop and laptop, I think of the heat problem with the PowerPC. Just using a RISC processor does not mean efficiency. It must be designed in. We see this with phones. Not all phones use the SOC efficiently. I think this is what is going to determine the fate of the MS Windows phone.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    5. Re:He's mostly right by Tough+Love · · Score: 1

      And nowadays, the CPU is becoming less important. It's everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.

      The CPU gets important again when you start multiplying cores.

      Nice post.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    6. Re:He's mostly right by pitchpipe · · Score: 3, Funny

      You can't easily interchange ARM and Thumb without making a function call.

      ARMs weakness lies in the ELBOW implementation. Whereas Thumb is opposable to 4finGer which some see as a strength, but I find that the pinKey shadow architecture complements Thumb nicely with hAnd holding the whole set together in a CRISP burrito.

      --
      Look where all this talking got us, baby.
    7. Re:He's mostly right by rev0lt · · Score: 1

      Hyperthreading, an Intel tech, significantly increases speed

      It doesn't, at least on earlier processors, where the pipelines are shared between threads. And while an "Intel" trademark, it isn't really new or Intel tech.

      There's also a concept of pipelining that allows multiple instructions from a single thread to run staggered as long as they won't collide in their use of a particular component within the CPU

      Every x86-compatible CPU designed in the past 10 years has multiple pipelines. It is hardly an Intel concept. And those multiple pipelines are really a RISC core.

    8. Re:He's mostly right by buglista · · Score: 1

      AAAARRGH! The whole damn point of RISC is that there's no microcode. Small number of FIXED width instructions, that do one thing and do it well.

    9. Re:He's mostly right by TheRaven64 · · Score: 1

      Hi, what's it like in 2005? Over here in 2011, we have Thumb-2 as the standard instruction set for newly compiled ARM code (the Cortex-M series only supports it, the Cortex-A series supports ARM for legacy compatibility). Unlike Thumb-1, Thumb-2 can encode the entire ARM instruction set, with most instructions being 16 bits and some being 32. If you're using the unified assembly syntax, you can switch between the two with an assembler flag: the assembly code is the same because they're just two encodings of the same instruction set.

      --
      I am TheRaven on Soylent News
    10. Re:He's mostly right by Ed+Avis · · Score: 1

      Which ARM instructions are microcoded?

      --
      -- Ed Avis ed@membled.com
    11. Re:He's mostly right by mzs · · Score: 1

      Somebody mod this up, that's exactly what's done on ARM.

    12. Re:He's mostly right by illtud · · Score: 1

      I admire the work that went into this, but:

      "The network switch power consumption wasn't monitored as part of the power monitoring since eventually the cluster will move back to its intended location where it will be tapping an already present 24-port enterprise-grade network switch and thus not lead to any net increase in power draw"

      If you're comparing single multicore system per-watt performance with a cluster, you don't get to magic away the power draw of the switches that provide the fabric for communicating between the nodes, even if it's an 'already present' switch.

    13. Re:He's mostly right by Bengie · · Score: 1

      HT is a mixed bag. When the scheduler knows about it, it can give a decent average speedup; in some corner cases, it's bad.

      The whole issue is that modern desktop CPUs have all of these execution units to help speed up single thread performance by checking for dependencies and executing instructions in parallel when there is no dependency. Quite often, there is serial code that is just loaded with dependencies. For a small 10% transistor cost, you can make a second virtual CPU that can make use of these idle execution units.

      HT also kicks in when one thread stalls for memory loads/etc. It works at the cycle level, so it's really fast to switch. If there is even one cycle where the FPU/int/SIMD is free, HT can make use of it.

      Sounds great on paper, but then you realize that you need to share the same front ends like L1 cache.

      The good news is when one of the virtual HT cores is turned off (OS has to sleep it), it frees up the front end shared resources for the other virtual thread, letting it run as if HT is off.
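
      The slot-filling idea above can be sketched as a toy model, assuming a single shared execution unit and invented per-cycle traces; it is not a description of how any real Intel core schedules threads. Each trace entry is either "op" (wants the unit) or "stall" (waiting on memory), and the first thread gets priority.

```python
def cycles_serial(a, b):
    """Run thread a to completion, then thread b, on one execution unit."""
    return len(a) + len(b)


def cycles_smt(a, b):
    """Let thread b use the unit in cycles where thread a is stalled or done."""
    ia = ib = cycles = 0
    while ia < len(a) or ib < len(b):
        cycles += 1
        unit_free = True
        for name in ("a", "b"):  # thread a gets first pick each cycle
            trace, pos = (a, ia) if name == "a" else (b, ib)
            if pos >= len(trace):
                continue
            if trace[pos] == "stall":
                pos += 1             # a stall cycle passes without the unit
            elif unit_free:
                pos += 1             # issue one op on the shared unit
                unit_free = False
            if name == "a":
                ia = pos
            else:
                ib = pos
    return cycles


stally = ["op", "stall", "stall", "op"]
other = ["op", "op", "stall", "op"]
print(cycles_serial(stally, other), cycles_smt(stally, other))  # 8 5

busy = ["op"] * 4
print(cycles_serial(busy, busy), cycles_smt(busy, busy))  # 8 8
```

      In this model the gain comes entirely from stall cycles; two threads that never stall contend for the same unit and finish no sooner than they would back to back.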

    14. Re:He's mostly right by locketine · · Score: 1

      This article says HT increases performance 10-20% on average with a 30-40% observed maximum. That's with a modern HT enabled chip of course, at least I assume so since they are comparing it against a very recent AMD cpu with a similar architectural component. Considering AMD finally implemented something similar to HT after 10 years of competing with it, I doubt ARM will be implementing something similar anytime soon.

      I in no way implied or meant to imply that pipelining is unique to Intel. You are absolutely right that ARM has it as well. The point of my post was to show several examples of architectural components that do meaningfully impact CPU performance. I find the idea that ARM could somehow compete with Intel on a performance basis rather naive. Then again, just a few years ago when Intel Atom came out, I didn't think Intel could compete with ARM in power consumption. They proved me wrong, maybe ARM will do the same somehow.

      --
      Think globally but act within local variable scope.
    15. Re:He's mostly right by petermgreen · · Score: 1

      The arm instruction set is the original instruction set (extended many times over the years) of the arm series and is 32-bit fixed width.

      Thumb1 is a 16-bit fixed width instruction set, it was handy when you had very limited memory and/or very limited memory bandwidth but the performance penalty from the reduced instruction set was too high to make it a good choice for general use.

      Thumb2 is a variable width instruction set with a mixture of 16-bit and 32-bit instructions. Raw core performance is lower than with the arm instruction set but cache pressure is also lower so AIUI overall performance is comparable to the arm instruction set. Afaict this is the instruction set that arm is pushing people to use at the moment.
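
      A rough sketch of the code-size arithmetic behind the cache-pressure point, assuming the same instruction count in both encodings (a simplification) and a purely hypothetical 700/300 split between 16-bit and 32-bit Thumb-2 instructions.

```python
ARM_BYTES = 4        # every ARM instruction is 32 bits wide
THUMB_NARROW = 2     # 16-bit Thumb-2 encoding
THUMB_WIDE = 4       # 32-bit Thumb-2 encoding


def arm_size(n_instructions):
    """Code size in bytes with the fixed-width ARM encoding."""
    return ARM_BYTES * n_instructions


def thumb2_size(n_narrow, n_wide):
    """Code size in bytes for a mix of 16-bit and 32-bit Thumb-2 instructions."""
    return THUMB_NARROW * n_narrow + THUMB_WIDE * n_wide


# Hypothetical mix: 700 narrow and 300 wide instructions out of 1000.
arm = arm_size(1000)
thumb2 = thumb2_size(700, 300)
print(arm, thumb2)  # 4000 2600
```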

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  5. He's missing the point... by romanval · · Score: 5, Insightful

    ARM works because 1) it's good enough while being 2) cheap enough. As far as I know, ARM is getting license royalties in the pennies per chip or SoC core using their design. However much better Intel can make their low power x86 CPUs, it's going to have to compete with dozens of foundries churning out millions of ARM devices when it comes to pricing... and that's where I see Intel having a hard time.

    1. Re:He's missing the point... by gman003 · · Score: 2

      Actually, they answer that in the article. He claims that, even if Intel chips *are* more expensive, a) the price of the processor is pretty much negligible compared to the price of the full unit (particularly the screen), and b) the performance advantage is worth the cost.

      And he kind of has a point. The Raspberry Pi has been described as "a smartphone minus the screen". It's $25-$35. A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.

    2. Re:He's missing the point... by viperidaenz · · Score: 2

      Yes, a smartphone without the screen, GSM radio, WCDMA radio, Bluetooth, WiFi, GPS, battery, case... You can buy a 700MHz smartphone for $100

    3. Re:He's missing the point... by Tough+Love · · Score: 1

      A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.

      Smartphone prices are overdue for a precipitous drop. And ten times better battery life would be nice.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    4. Re:He's missing the point... by TheRaven64 · · Score: 1

      The Raspberry Pi has been described as "a smartphone minus the screen". It's $25-$35. A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor

      The Raspberry Pi has a 700MHz ARM11 core. A modern Smartphone has at least a 1GHz+ Cortex A8 core. The ARM11 was introduced in 2002. This is a really old design, the sort that you find in ultra-low-end $100 Android tablets.

      --
      I am TheRaven on Soylent News
    5. Re:He's missing the point... by romanval · · Score: 1

      But the vast majority of embedded/low power devices are not going to be $300-$600 smartphones or tablets. They're going to be things like routers, set top boxes, automobile dashboard screen computers, kids' toys, etc., each of them designed with various SoCs and at various price points.

  6. Re: No future?? by symbolset · · Score: 1

    In the very same article the author asks about the KRAIT ARM SOC at 22nm, which is on a process technology well ahead of the very same Intel smartphone chip he's flogging. At least the author was kind enough to put that after the remarks about others being unable to compete.

    --
    Help stamp out iliturcy.
  7. Definition of "efficient" by White+Flame · · Score: 4, Insightful

    From Intel: Work done per watt
    From ARM: System power draw small enough for handheld & long battery life

    A year or two ago, I read a study that the most ops/watt were still done by high-end Intel processors sucking tons of power each. They did so much work so fast that the per-watt work done was still beyond the tiny-power-sipping ARMs that were relatively slow but still quite capable. Has this changed in the last generation or two of CPUs?

    1. Re:Definition of "efficient" by Anonymous Coward · · Score: 1

      no it hasn't:

      The efficiency was at 85 Mop/s per Watt compared to the Effimaß cluster at 30.79 Mop/s per Watt

      http://www.phoronix.com/scan.php?page=article&item=phoronix_effimass_cluster&num=12
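
      A quick sketch of what the two quoted figures work out to. Since Mop/s per Watt is the same unit as Mop per Joule, the reciprocal gives energy per million operations. The variable names are illustrative; the numbers are only the two quoted above.

```python
first_system = 85.0        # Mop/s per Watt, first figure quoted above
effimass_cluster = 30.79   # Mop/s per Watt, the Effimass cluster figure


def joules_per_mop(mops_per_watt):
    """Energy in Joules spent per million operations."""
    return 1.0 / mops_per_watt


ratio = first_system / effimass_cluster
print(f"efficiency ratio: {ratio:.2f}x")
print(f"energy per Mop: {joules_per_mop(first_system) * 1000:.1f} mJ "
      f"vs {joules_per_mop(effimass_cluster) * 1000:.1f} mJ")
```

      This says nothing about total power draw, which is the other definition of "efficient" in the parent post: a system can win on work per Joule and still draw far too much power for a battery.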

    2. Re:Definition of "efficient" by romanval · · Score: 4, Insightful

      What matters is if you can comfortably keep the device in your bag/pocket and not have to recharge it more than once a day.

    3. Re:Definition of "efficient" by symbolset · · Score: 1

      So now all we need is for them to come out with a mobile processor with 0.1 cores, and it will have all-day runtime. Is that how this conversion factor is supposed to work?

      --
      Help stamp out iliturcy.
    4. Re:Definition of "efficient" by siddesu · · Score: 1

      and not have to recharge it more than once a day.

      That was back then when I had a Dell Axim with an extra battery. These days I only want to recharge it once a week, if that.

  8. Re: No future?? by greg1104 · · Score: 1

    So far this year Intel has basically finished off AMD from the high-end of the desktop CPU market, while advancing into the useful mobile desktop GPU market via their 22nm mobile Ivy Bridge HD 4000 chipset. There's nothing really competitive from them yet for under 15W of TDP, but it's obvious they intend to battle more on the mobile and SoC markets. Only new market to expand into at this point, and the only one still growing usefully. They're not there yet.

    But it wasn't that long ago that Intel's integrated GPUs were the target of jokes too. The HD 4000 isn't great, but advocates of discrete GPUs aren't just laughing now. In smartphone and tablet land, the interesting question is not about this year's product, it's how long it will take the beast to retarget. I'd wager that the 14 nm shrink of Haswell is where things will get interesting.

  9. Slashdot, please do something ! by Taco+Cowboy · · Score: 1

    This is getting too serious

    I will not mention the name, but the post I'm replying to, is littered with links to that joint

    I am not asking for censorship, but what those guys are doing (I am not sure it's one person or several) is too much

    Being parasitic is one thing, being parasitic _and_ annoying is a totally different beast altogether !!

    Do something, Slashdot, please do something !!

    --
    Muchas Gracias, Señor Edward Snowden !
    1. Re:Slashdot, please do something ! by Khyber · · Score: 1

      Which is why I'm able to search for it on Slashdot and find it, eh?

      No, SEO bombing is the way to go. Also notify google.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    2. Re:Slashdot, please do something ! by Billly+Gates · · Score: 1

      Doing my own searches for key terms that (don't want to reference it)is the top and rapidly rising with the other keywords. Apparently his schemes are working and it looks like he bought tons of bad pools or server farms which just repeat the same keywords over and over again.

    3. Re:Slashdot, please do something ! by nedwidek · · Score: 2

      rel="nofollow" is what you use with a link to indicate that it should not be considered for page rank. Slashdot already uses that as you note.

      It will still show up as a hit in a search.
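
      For anyone who wants to check the attribute without relying on a browser inspector, here is a minimal sketch using Python's standard `html.parser`. The sample markup and example.com URLs are made up for illustration and are not Slashdot's actual page source.

```python
from html.parser import HTMLParser


class LinkRelChecker(HTMLParser):
    """Collect (href, has_nofollow) for every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; a missing rel yields no tokens
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((attr.get("href"), "nofollow" in rel_tokens))


# Made-up sample markup, not taken from any real page.
sample = (
    '<p><a href="http://example.com/a" rel="nofollow">a</a> '
    '<a href="http://example.com/b">b</a></p>'
)
checker = LinkRelChecker()
checker.feed(sample)
print(checker.links)
# [('http://example.com/a', True), ('http://example.com/b', False)]
```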

      --
      Post anonymously - For when your opinion embarrasses even you!
    4. Re:Slashdot, please do something ! by Dahan · · Score: 2

      My examination of the link content (using Chrome, right-click the link and pick "Inspect element"), shows that there is NO rel=nofollow attribute for any link

      Maybe Chrome just sucks then, since the rel=nofollow is in fact there for all the links.

  10. Re: No future?? by symbolset · · Score: 1

    Oh, what they've got is interesting now if they'd drop Windows like the bad habit it is and give us a decent Intel Android tablet. You'd think they'd leap at it - bigger tablets mean more room for a bigger battery.

    It's not like Microsoft is holding back on the Tegra 3 WinRT tablets to give them a leg up.

    --
    Help stamp out iliturcy.
  11. Re:Make mycleanpc reference shit eating by Billly+Gates · · Score: 4, Insightful

    Oh come on moderators.

    That link is the 2nd most disgusting thing besides Goatse and I am sick and tired of that Mycleanx troll (won't say it as it will increase his SEO and page ranking).

    The only way we can stop that dipshit is to lower his Google ranking or the more he spams the more we will bring troll sites for his potential customers instead.

  12. Re:Again: You idiots fell for the straw man argume by Colonel+Korn · · Score: 2

    The topic with the *architecture* was about the simple and clean elegance of ARM vs x86 with its tons of old shit.

    And the topic with the *processors* was about efficiency.
    ARM processors are 10 times as efficient as Intel ones. The architecture isn’t even mentioned in that.

    Those are two completely separate things!

    And yet Intel's first real entry into the phone processor market, Medfield, is equivalent to ARM in terms of power efficiency. ARM is 1x as efficient as x86, not 10x.

    --
    "I zero-index my hamsters" - Willtor (147206)
  13. ARM has some advantages by Required+Snark · · Score: 4, Insightful
    The ARMed camp has intrinsic advantages over INTEL.

    They don't have all the legacy instruction set issues to deal with. Intel must be backward compatible with all previous versions. Remember, the 8080 subset is still alive and well in the INTEL architecture. This comes with a cost.

    It's easier to move up from a lower power system to a higher power system. In this context power can be thought of as both electrical power consumption and as compute power. Moving down means something must be simplified/eliminated, and the backwards compatibility issues makes this much harder.

    When it comes to mobile devices, ARM owns the market and has the network effect working for it. This is how INTEL kept a stranglehold on the PC market, but it works against them for mobile.

    ARM is not monolithic in the same way as INTEL. Because of the license based IP model, there are many more variations of ARM chips than INTEL chips. The resources to make variations come from the IP user base, not from ARM. A single company, no matter how dominant, cannot afford to support that many variants. If some of the versions fail, the cost is not borne by ARM. If INTEL guesses wrong and makes a dud, they have to absorb the cost.

    INTEL is no pushover, but I think ARM has the advantage.

    --
    Why is Snark Required?
    1. Re:ARM has some advantages by Anonymous Coward · · Score: 1

      I'm amazed that nobody seems to be mentioning ARM's biggest advantage. ARM is happy to license a core to a manufacturer to incorporate as part of an overall SoC design which means that manufacturers are free to create whatever overall SoC functionality they like. If you want a security chip, special IO, a DSP, some integrated RAM, a GPU, or anything else then any manufacturer is free to put together that chip and sell it. With Intel you get to use the chips that Intel produce with the features they decide to include and unless you're as big as Apple or Microsoft you've really got no say over the features that go into that chip. With Intel's manufacturing advantage that could work well in many cases, but the different approach will mean there's always a place for ARM. Maybe Intel will produce chips good enough to one day take over much of the tablet and mobile phone market if they do well, but computing capability is making its way into more and more devices and the CPU in, say, a smart electricity meter, or a home alarm system or a washing machine is far better suited to the custom offerings that ARM can provide, particularly since for many of those applications speed is not really an issue, so Intel's advanced manufacturing does not give them an advantage in those situations.

    2. Re:ARM has some advantages by FrostedWheat · · Score: 1

      Intel is not an acronym, you don't need to type it in capitals. Looks really odd.

    3. Re:ARM has some advantages by BitZtream · · Score: 1

      Remember, the 8080 subset is still alive and well in the INTEL architecture. This comes with a cost.

      Seriously? There is as much 8080 instruction set in a current x86 processor as there is in an ARM or AVR processor. They ALL share a small common set of instructions like nop and xor, but beyond that saying x86 is carrying that kind of legacy is silly.

      It's worth noting, however, that ARM already suffers from the same issue. Look at ARM devices that support the ENTIRE ARM instruction set. They are shit.

      What you do with ARM however is pick the instruction set you want and just build a CPU with that, and that is pretty efficient. You don't build an iPad with the entire ARM instruction set, legacy, thumb, native java, and the other couple major parts I can't remember for arm off the top of my head. You build an iPad with thumb and current native like the original iPads, or just the one instruction set as was done on the dual core iPads.

      Go check out http://en.wikipedia.org/wiki/ARM_architecture#CPU_modes. Almost EVERYTHING in section 6 of that page is an OPTIONAL component to ARM. With Intel and x86, you have to carry ALL of those things, you don't get to pick and choose. Should Intel start making the same options (note the incompatibility that will create) then you'll see the same kind of performance boosts.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  14. Re:STRAW MAN ARGUMENT! by FrankSchwab · · Score: 1

    The topic with the "architecture" was about the simple and clean elegance of 680x0 vs x86 with its tons of old shit.

    Oh, wait, am I in the wrong century?

    --
    And the worms ate into his brain.
  15. Caveat lector by gweihir · · Score: 3, Interesting

    Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.

    That said, there is indication that while x86 is not in principle more power-hungry than ARM, in practice, on silicon, it is today. The main reason is that it requires more chip area and more complex circuitry, which in practice leads to higher power consumption because of communication and signal distribution overheads and because complex circuits are far harder to optimize, not only for power consumption. Again, that does not mean that in principle it is infeasible. But note that larger chip area is also a strong argument against x86 if size matters.

    There is also the fact that low-power ARM is more energy efficient than low-power x86 when you look at the market. So maybe this person is just saying that Intel messed up and failed to make good low-power x86 implementations while ARM did not. Looking back at power-disasters like the P4, this would be plausible as well. If, on the other hand, I look at CPUs like the AMD LX800 x86 offering, (e.g. used in the Alix boards), these are pretty power efficient and may even get into ARM ranges. They are pretty slow at full load though and have a large chip area.

    So my impression is that the Intel person just said that while they do not have any offering comparable to ARM, it is their fault and not a fundamental problem of x86. I am unsure this is right, although I certainly agree that Intel does not have a leg to stand on in the market for power-efficient CPUs.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Caveat lector by dkf · · Score: 2

      Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.

      Did you know that Intel used to make ARM processors (StrongARM, XScale)? And that they are (probably) still an ARM licensee?

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    2. Re:Caveat lector by wwbbs · · Score: 1
  16. I see nothing by epine · · Score: 1

    Me thinks Sgt Schultz doth protest too much. Since my first post here in the early days of URL speak-and-spell, I've propounded that the disadvantages of x86 to RISC in performance were almost entirely illusory (brazen bubbles in the fabric of reality now feeding the worms notwithstanding).

    That said, on the power front, x86 bites. Possibly it bites like an undershot chihuahua in some small way that a billion dollars of doggy dentistry could adequately rectify—but it most certainly bites. Jumbles of instruction prefix opcodes and the inconsistent and partial nature of flag register updates spring to mind in bow-legged glory. A time machine erected in the lobby of an Intel design center with a small do-not-disturb sign hung above the door would sit unmolested by the stampede of pocket-protectors for not so long as a virgin newly arrived in 72 member frat-house of Perpetual Erection. (Turns out the prophet was a touch dyslexic. [snide]I've been reading God Is Not Great which I've privately subtitled Ridicule, Where Art Thou?. "Seventy-two virgins each? WTF? Do you think virgins grow on trees? It's a regrettable misprint. Sorry, you'll have to share—but not until you reach consensus on who goes first. I see nothing that prevents you from enjoying a satisfying afterlife all the same, so quit your bitching."[/snide]) The shrewdest Intel engineers will set the time machine to the late 1960s, enjoy the party for a year or two (virgins will be in short supply), before charting a cruise ship to California to doctor some 8008 family architectural specifications when no-one is looking.

    I'm kind of looking forward to the success of the SETI program so we can conduct some proper black-box bake-offs. Let's boxgram up the C language specification along with the ARM and x86 instruction set specifications and warble them into subspace to a couple of competitive Ferengi monasteries (Shaolin temples of combinatoric reasoning), giving only the fabrication detail that the embodied processors are fabricated primarily in the element silicon, and that we really care about power consumption. Then run the generated code from the Xeno-compilers side by side on the chips where Sgt Schultz presently sees nothing to see which wins and by what margin.

    The point I'm making is that over the years Intel has contributed an awful lot to the dentistry of GCC and other compilers to promulgate this mirage that there's nothing to see here.

    Yet rare is the architecture so trammelled by men it doesn't freshen up nicely advantaged by a die shrink.

  17. Are people insane? by twistofsin · · Score: 1

    ARM is a fairly open architecture. If you want to create ARM chips you buy a frigging license.

    How the hell can Intel be threatened by something that they can produce if they choose to?

  18. I still get more bang with intel by Osgeld · · Score: 1

    I can get a ~22 watt Intel Atom from local retail stores, drop any damn OS I please on it, make it do a job, and when it's over move on. With ARM I either need to settle for some decade-plus-old speed (yeah, it's drawing much less power, but it's taking much longer), or spend a buttload of time designing the dream machine ... only to keep up with the Atom, which costs lots of time and money.

    dumbass gp computing intel wins, fine tuned amazing technology arm wins, now how much money and time do you have?

  19. Re:Make mycleanpc reference shit eating by couchslug · · Score: 1

    Sounds good. There's no point in chivalry.

    --
    "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
  20. Re:Make mycleanpc reference shit eating by couchslug · · Score: 1

    Might as well add as many shock sites as convenient to the response.

    http://encyclopediadramatica.se/ has plenty of references.

    http://goatse.ru/ is a goatse mirror.

    When I pasted your content the links weren't highlighted as in your original post. Any idea why?

    --
    "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
  21. Yet nobody uses intel in the mobile market... by Foske · · Score: 1

    They can dismiss it, but when you look at all the tricks they have to apply to keep their current processors running MSDOS 1.0, their design is simply scary. As a processor designer I am amazed how well they manage to keep their bloated processors running, adding extensions of the x86 architecture on top of each other. I want to bet that if they would start from scratch and drop support for Microsoft Flight Simulator 1.0 (i.e. make a decent 64 bit processor, with a decent, not bytewise instruction set without 20 layers of extensions) they could easily lower the power consumption by a factor of 2.

    Then again, the ARM processors lean a bit too much to the RISC approach to be a fair comparison. (yes I know, under the hood modern Intel processors are not CISC any more either, but I'm talking assembly level) The performance per cycle of an ARM is really crap compared to modern intel architectures. The good news is: if ARM manages to improve that a bit, they will manage to stay in the mobile processors drivers seat.

    Intel and ARM are coming from a different direction when it comes to the sweet spot of mobile computing: ARM needs to improve performance, Intel has to reduce power. Oh, and ARM is powering the mobile world, so who are you to say Intel is better, mister marketing guy ?

  22. Did MyCleanPC... by unixisc · · Score: 1

    Okay, once your computer got cleaned w/ MyCleanPC, then what happened? Did you stop abusing your daughter - both physically, verbally and mentally? Did your insurance company restore your coverage? Did your cancer get magically cured? Did your wife come back to you?

    1. Re:Did MyCleanPC... by BlackSnake112 · · Score: 1

      Okay, once your computer got cleaned w/ MyCleanPC, then what happened? Did you stop abusing your daughter - both physically, verbally and mentally? Did your insurance company restore your coverage? Did your cancer get magically cured? Did your wife come back to you?

      No, that happens when you play country music songs backwards.

  23. Re:It might be pew pew along the lines of magazine by mister_playboy · · Score: 2

    Considering the usernames chosen for these posts, I have to conclude it's just GNAA-style trolling. A company paying people to post here probably wouldn't allow them to pick usernames like "JonesFuckAssFucker".

    --
    Do what thou wilt shall be the whole of the Law ::: Love is the law, love under will
  24. Backward compatibility is there... by SuperKendall · · Score: 2

    It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones).

    Since the iPhone and iPad are in effect the start of those becoming really widespread things, they are the definition of backwards compatible, the base... that's what will make it difficult to move the market away from them.

    The Motorola chips never had a totally massive market penetration the way Arm does now in mobile/tablet worlds... I am not sure even slightly superior chips from Intel would sway many hardware makers.

    I think Intel is really banking on Windows 8 to make headway in the tablet market so they can build up marketshare again to base an attack on Arm from.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
    1. Re:Backward compatibility is there... by MachineShedFred · · Score: 1

      If Apple decided to take the iPad / iPhone / iPod to x86, it wouldn't be their first barbecue. They've done that twice before (MC680x0 -> PPC -> x86).

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
  25. "RISC" is misleading by sqldr · · Score: 1

    What vital instructions does x86 have that ARM doesn't? ARM is far easier to program and I don't see anything missing. Most of the extra instructions on x86 are hacks to make up for the lack of conditional instructions and 15 registers which make ARM such a joy to program in the first place.

    --
    I wrote my first program at the age of six, and I still can't work out how this website works.
  26. Re:Again: You idiots fell for the straw man argume by TheRaven64 · · Score: 1

    And yet Intel's first real entry into the phone processor market, Medfield, is equivalent to ARM in terms of power efficiency

    This is a strange definition of 'equivalent' meaning 'uses more power at idle than a similarly performing ARM core does under full load'.

    --
    I am TheRaven on Soylent News
  27. Totally agree --- only SoC price matters by Morgaine · · Score: 2

    ARM works because 1) it's good enough while being 2) cheap enough.

    I think that you are totally right about this. Maintaining x86 compatibility may hurt Intel a little, but it's not the key issue.

    ARM-based SoCs cost under $10 in volume, and Intel simply cannot compete in that space. It doesn't want to. It likes large prices and huge profit margins.

    Meanwhile, ARM keeps improving the performance of their cores, while the SoC manufacturers keep improving the capabilities of their SoCs, including (critically) power savings. It's a marriage made in heaven, and the only way that ARM can lose this market to Intel is by upping their license royalties massively so that ARM-based SoC prices move into Intel's territory. There is no sign of that happening.

    Short version of the above: Intel fails in the mobile space because of price inertia. There is no sign of that changing either, at least judging by the article. They refuse to compete on SoC pricing. And they're in denial that price matters.

    Morgaine.

    --
    "The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
  28. GMA == Graphics My Ass by tepples · · Score: 1

    For devices with no keyboard, see makomk's comment. For laptops, when you compare Intel's graphics offering to AMD's and NVIDIA's, you'll probably end up with the impression that GMA stands for "Graphics My Ass".

  29. Re:Again: You idiots fell for the straw man argume by hattig · · Score: 1

    Microsoft Office 2013 is available for Windows RT (the ARM version). Indeed I believe it is included in the licensing cost, so it will come with every WinRT tablet, netbook and nettop. And that must make Intel sweat a little!

  30. Re: No future?? by hattig · · Score: 1

    That is just one measure of a process. Apparently TSMC's 40nm node had transistor density comparable to Intel's 32nm node, so it is quite possible that TSMC's 28nm has transistor density comparable with Intel's 22nm, even if other aspects aren't comparable. I wouldn't say it is a trick for marketing myself.

  31. x86 tax : The translation front-end by Theovon · · Score: 1

    Maybe things have changed, but the last time I checked out the Atom floor plan, about half the chip area was cache (which is normal), about a quarter was the actual computation back-end of the CPU, and the remaining quarter was the x86-to-RISC translation front-end. Like all modern x86 processors (as well as PowerPC and probably some other architectures), the CISC instruction set (well, more complex RISC in the case of PPC) is translated dynamically to a simpler RISC-like code that is easier to execute. In a Sandy Bridge, the translator is tiny compared to the rest of the huge 4-issue superscalar massively out-of-order back end. But Atoms are simple 2-issue in-order pipelines, which makes them very small and energy-efficient (albeit a lot slower), but there's not much we can do about that front-end.

  32. SOC + licensing, not ISA by mevets · · Score: 1

    It's in the module licensing that ARM really has the lead. A huge number of firms design their own SoCs with ARM core(s) and their own components. That means there is a generation (almost a generation and a half) of designers comfortable with ARM tools, integration, and the architecture itself.
    It took Intel until last year to sideline its approach of designing an SoC for each application it could see; it is now finally working on licensing cores for companies to include in their own designs. A bit late to the party, but who knows.
    20 years ago, industries like automotive electronics and telecommunications were owned by Motorola. Not for its ISA - 68k, 88k, ppc were all different - but because of the expertise of the hardware designers. Now x86 is in both those industries, and probably soon to dominate.

  33. Re:Again: You idiots fell for the straw man argume by jedidiah · · Score: 1

    > Do you really think you'll ever see the day where you genuinely want to run a desktop OS and Office on such a small device?

    Sure. Intel just announced a "desktop" system that's not much larger than an iPhone. Give a Phone an HDMI port and a USB port and a real OS and you can use it just like a desktop.

    Size or what's built into the device in terms of input peripherals is really quite irrelevant. It's a red herring that only distracts the clueless.

    --
    A Pirate and a Puritan look the same on a balance sheet.
  34. Re: No future?? by Bert64 · · Score: 1

    AMD were never really interested in the high end desktop market, they were the performance leader for a while but only because Intel dropped the ball...

    High end desktops are low volume, and mostly about marketing and bragging rights these days. A few years ago you bought the fastest cpu you could because even that would be relatively slow, and quickly obsolete. Today any CPU made in the last few years will suffice for 99% of users, so only a small niche need the high end cpus.

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  35. At this part of the Universe, people are different by marcosdumay · · Score: 1

    It's funny the way you describe history. I can't even guess where you met those people.

    Around here, the PC industry is facing its "demise" because PCs have become good enough. While people were always screaming MORE until the last decade, they've just stopped and realised that their hardware does everything they want nowadays.

    Also, those people more concerned with the processing power of mobiles than with their power consumption, well, I could never find one of them.

  36. Re:Again: You idiots fell for the straw man argume by shutdown+-p+now · · Score: 1

    This is a strange definition of 'equivalent' meaning 'uses more power at idle than a similarly performing ARM core does under full load'.

    That's not what I took away from Anandtech reviews of Medfield phones. If it draws so much power, how come its battery life is in the middle of the pack of ARM smartphones running the same OS with the same battery capacity?

  37. So how much faster would the chip have to be... by SuperKendall · · Score: 1

    If Apple decided to take the iPad / iPhone / iPod to x86, it wouldn't be their first barbecue. They've done that twice before (MC680x0 -> PPC -> x86).

    Either the chip would have to emulate ARM pretty well (the MC680x0 and PPC were not that different) or the new chip would have to run fast enough to make an emulation layer work.

    Developers could also re-compile pretty quickly, and it might be that Apple would leverage that. But I don't see the transition being as easy as the ones they did before.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
    1. Re:So how much faster would the chip have to be... by armv7 · · Score: 1

      There wouldn't be a financial incentive for Apple to do this, since they design their own ARM SoCs, which are manufactured by Samsung (Cortex-A8, A9, and A15 parts are all cheaper than Atoms). A company like Apple cares about its bottom line. The reason Apple ditched the PowerPC architecture was that Intel was producing mobile processors (Centrinos etc.) while Motorola/IBM were not designing any mobile (power-efficient) PowerPC chip and had no plans to. I wouldn't be surprised if Apple eventually ditched Intel in the MacBook Air line, replacing its chips with a custom Apple-designed ARM SoC (manufactured by either TSMC or Samsung), especially if it can offer comparable performance and power efficiency at a much, much lower cost. That would allow them to make an even higher profit on the device.

  38. Thumb uses R13 as stack by tepples · · Score: 1

    Common C compilers use an ABI such that R13 is reserved for use as a stack pointer, but that's not an architectural requirement.

    I'd say it's an architectural requirement if some of your code uses the Thumb ISA. The push and pop instructions in Thumb depend on R13.
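
    To illustrate the point about Thumb, here is a rough sketch of how the 16-bit PUSH and POP encodings are built. The helper names (`thumb_push`, `thumb_pop`, `_reglist`) are made up for this example. Note that the encoding has no field for a base register at all: the stack pointer (R13) is implied by the opcode, so software cannot pick a different register for these instructions.

```python
# Sketch of the 16-bit Thumb PUSH/POP encodings (helper names are hypothetical).
# Layout: 1011 L 10 R rrrrrrrr
#   L = 0 for PUSH, 1 for POP
#   R = also push LR (PUSH) or also pop PC (POP)
#   r = bitmask of low registers r0-r7
# There is no base-register field: the stack pointer (R13) is implicit.

def _reglist(regs):
    """Build the 8-bit register-list mask from register numbers 0-7."""
    mask = 0
    for r in regs:
        if not 0 <= r <= 7:
            raise ValueError("16-bit Thumb PUSH/POP can only list r0-r7")
        mask |= 1 << r
    return mask

def thumb_push(regs, lr=False):
    """Encode PUSH {regs[, lr]} as a 16-bit Thumb instruction."""
    return 0xB400 | (int(lr) << 8) | _reglist(regs)

def thumb_pop(regs, pc=False):
    """Encode POP {regs[, pc]} as a 16-bit Thumb instruction."""
    return 0xBC00 | (int(pc) << 8) | _reglist(regs)

print(hex(thumb_push([4], lr=True)))  # PUSH {r4, lr}
print(hex(thumb_pop([4], pc=True)))   # POP  {r4, pc}
```

    Every bit of the halfword is spent on the opcode and the register list, which is why the choice of R13 is baked into the ISA here rather than left to the ABI.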

  39. RISC by DarthVain · · Score: 1

    I was led to believe that RISC was going to change everything...

  40. ARM has no future? by nurb432 · · Score: 1

    Umm ya sure it doesn't.

    --
    ---- Booth was a patriot ----
  41. Re:ARM already won by arkane1234 · · Score: 1

    Intel lost the CPU wars about as much as Microsoft lost the computer market.

    It's still out there, and it's used in lots of places. ARM is just making a dent, it's nowhere near cornering the CPU market. I'm not saying it's inferior, I'm saying it's not as prevalent as you think it is apparently.

    --
    -- This space for lease, low setup fee, inquire within!
  42. Re:It might be pew pew along the lines of magazine by Hognoxious · · Score: 1

    A company paying people to post here probably wouldn't allow them to pick usernames like "JonesFuckAssFucker".

    Are you assuming that the company paying them is the same company they're talking about?

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  43. Re:Again: You idiots fell for the straw man argume by Bengie · · Score: 1

    ARM goes from 250 mW at 800 MHz to 5 W at 1.5 GHz. If you're willing to clock low, you can make your numbers really good.
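
    As a back-of-the-envelope check, taking the two figures above at face value (they are the numbers quoted in this comment, not measurements), the energy spent per clock cycle can be worked out like this. The function name is made up for the example.

```python
# Rough arithmetic on the two operating points quoted above.
# Energy per cycle = power / frequency (joules per cycle).

def energy_per_cycle_nj(power_w, freq_hz):
    """Return energy per clock cycle in nanojoules."""
    return power_w / freq_hz * 1e9

low = energy_per_cycle_nj(0.25, 800e6)   # 250 mW at 800 MHz
high = energy_per_cycle_nj(5.0, 1.5e9)   # 5 W at 1.5 GHz

print(f"low clock:  {low:.4f} nJ/cycle")
print(f"high clock: {high:.4f} nJ/cycle")
print(f"ratio:      {high / low:.1f}x energy per cycle for {1.5e9 / 800e6:.3f}x the clock")
```

    On those figures, a bit under twice the clock costs roughly ten times the energy per cycle, which is the sense in which clocking low flatters the efficiency numbers.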