The Linux-Proof Processor That Nobody Wants
Bruce Perens writes "Clover Trail, Intel's newly announced 'Linux proof' processor, is already a dead end for technical and business reasons. Clover Trail is said to include power-management that will make the Atom run longer under Windows. It had better, since Atom currently provides about 1/4 of the power efficiency of the ARM processors that run iOS and Android devices. The details of Clover Trail's power management won't be disclosed to Linux developers. Power management isn't magic, though — there is no great secret about shutting down hardware that isn't being used. Other CPU manufacturers, and Intel itself, will provide similar power management to Linux on later chips. Why has Atom lagged so far behind ARM? Simply because ARM requires fewer transistors to do the same job. Atom and most of Intel's line are based on the ia32 architecture. ia32 dates back to the 1970s and is the last bastion of CISC, Complex Instruction Set Computing. ARM and all later architectures are based on RISC, Reduced Instruction Set Computing, which provides very simple instructions that run fast. RISC chips allow the language compilers to perform complex tasks by combining instructions, rather than by selecting a single complex instruction that's 'perfect' for the task. As it happens, compilers are more likely to get optimal performance with a number of RISC instructions than with a few big instructions that are over-generalized or don't do exactly what the compiler requires. RISC instructions are much more likely to run in a single processor cycle than complex ones. So, ARM ends up being several times more efficient than Intel."
The x86 instruction set is pretty awful and Atom is a pretty lousy processor. But that's probably not due to RISC vs. CISC. IA32 today is little more than an encoding for a sequence of RISC instructions, and the decoder takes up very little silicon. If there really were large intrinsic performance differences, companies like Apple wouldn't have switched to x86 and RISC would have won in the desktop and workstation markets, both of which are performance sensitive.
I'd like to see a well-founded analysis of the differences of Atom and ARM, but superficial statements like "RISC is bad" don't cut it.
Like I posted elsewhere, intel hasn't made real CISC processors for years, and I don't think anyone has.
Modern Intel processors are just RISC with a decoder to the old CISC instruction set.
RISC beats CISC in price performance trade-off, but backwards compatibility keeps the interface the same.
"ARM ends up being several times more efficient than Intel"
Wow. Someone suffered a flashback to the ancient CISC vs RISC wars.
This is really totally out to lunch. Seek out some analysis from actual CPU designers on the topic. What I read generally pegs the x86 CISC overhead at maybe 10%, not several times.
While I do feel it is annoying that Intel is pushing an Anti-Linux platform, it doesn't make sense to trot out ancient CISC/RISC myths to attack it.
Intel Chips have lagged because they were targeting much different performance envelopes. But now the performance envelopes are converging and so are the power envelopes.
Medfield has already been demonstrated at competetive power envelope in smartphones.
http://www.anandtech.com/show/5770/lava-xolo-x900-review-the-first-intel-medfield-phone/6
Again we see reasonable numbers for the X900 but nothing stellar. The good news is that the whole x86 can't be power efficient argument appears to be completely debunked with the release of a single device.
I would argue the problem for Apple wasn't about performance but about updates, mobile, and logistics.. PowerPC originally held promise as a collaboration between Motorola, IBM, and Apple. IBM got much out of it as their current line of servers and workstations run on it. Apple's needs were different than IBM's. Apple needed new processors every year or so to keep up with Moore's law. Apple needed more power efficient mobile processors. Also Apple needed a stable supply of the processors.
Despite ordering millions of chips a year, Apple was never going to be a big customer for Motorola or IBM. Their chips would be highly customized that none of their other customers needed or wanted and Apple needed updates every year. So neither Motorola or IBM could dedicate huge resources for a small order of chips as they could make millions more for other customers. PowerPC might have eventually come up with a mobile G5 that could rival Intel but it would have taken many years and lots of R&D. IBM and Motorola didn't want to invest that kind of effort (again for one customer). So every year Apple would order enough chips they thought they needed. If they were short, they would have order more. Now Motorola and IBM like most manufacturers (including Apple) do not like carrying excess inventory. So they were never able to keep up with Apple's orders as their other customers had more steady and larger chip orders.
So what was Apple to do? Intel represented the best option. Intel's mobile x86 chips were more power efficient than PowerPC versions. Intel would keep up the yearly updates of their chips. If Apple increased their orders from Intel, Intel could handle it because if Apple wasn't ordering a custom part, they were ordering more of a stock part. There are some cases where Apple has Intel design custom chips for them, mostly on the lower power side; however, Intel still can sell these to their other customers.
As a side note, as a difference in the relationship between IBM and Apple look at the relationship between MS and IBM for the Xbox 360 Xenon chip. This was a custom design by IBM for MS, but the basic chip design hasn't changed in seven years. As such chip manufacturing has been able to move the chip to smaller lithographies (90nm --> 45nm in 2008) both increasing yield and lowering cost.
Well, there's spam egg sausage and spam, that's not got much spam in it.
LOAD and STORE aren't single cycle instructions on any RISC I know of. Lots of RISC designs also have multicycle floating point instructions. A lot of second or third generation RISCs added a MULTIPLY instruction and they were multiple cycle.
There are not a lot of hard and fast rules about what makes things RISCy, mostly just "they tend to this" and "tend not to that". Like "tend to have very simple addressing modes" (most have register+constant displacement -- but the AMD29k had an adder before you could get the register data out, so R[n+C1]+C2 which is more complext then the norm). Also "no more then two source registers and one destination register per instruction" (I think the PPC breaks this) -- oh, and "no condition register" but the PPC breaks that.
Yeah, Intel invented microcode again, or a new marketing term for it. It doesn't make the x86 any more a RISC then the VAX was though. (for anyone too young to remember the VAX was the poster child for big fast CISC before the x86 became the big deal it is today).
Like I posted elsewhere, intel hasn't made real CISC processors for years, and I don't think anyone has. Modern Intel processors are just RISC with a decoder to the old CISC instruction set.
Exactly. Intel has been doing this ever since the Pentium Pro and Pentium II came out in the 1990s. Anyone who knows much at all about x86 CPUs is aware of this, and Perens certainly will be. That's why I'm surprised that that article misleadingly states:-
So, we start with the fact that Atom isn't really the right architecture for portable devices (*) with limited power budgets. Intel has tried to address this by building a hidden core within the chip that actually runs RISC instructions, while providing the CISC instruction set that ia32 programs like Microsoft Windows expect.
The "hidden core" bit is, of course, correct, but the way it's stated here implies that this is (a) something new and (b) something that Intel have done to mitigate performance issues on such devices, when in fact it's the way that all Intel's "x86" processors have been designed for the past 15 years!
Perhaps I'm misinterpreting or misunderstanding the article, and he's saying that- unlike previous CPUs- the new Atom chips have their "internal" RISC instruction set directly accessible to the outside world. But I don't think that's what was meant.
(*) This is in the context of having explained why IA32 is a legacy architecture not suited to portable devices and presented Atom as an example of this.
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
So does it matter when someone sends you a .pptx file that Office 2003 freezes on? Yeah, yeah, I'm pretty sure you can get a converter, but I like telling people that if their file has an 'x' in the extension it means that it's 'experimental' and they shouldn't send it to others. They need to send the version without the 'x'.
Faster! Faster! Faster would be better!
Getting back on topic: the last ARM architecture, ARMv8, is far from what was called "RISC" back in the '70s. E.g. it can run instructions of different sizes (16 vs 32 bit), it has 4 specialized instructions for AES, registers with different sizes (32, 64 and 128 bits), instructions for running a subset of the Java bytecode, a rich set of SIMD operations and specialized instructions for SHA-1 and SHA-256.
Similarily the architecture supported by the new Atom chips (which is AMD64/x86-64 BTW, IA32 is only present for backward compatibility) is almost universally run on RISC-like processors that have instruction translators. Considering that the increased density of the x86-64 instructions usually allows to save more cache transistors than the ones required for decoding the instructions themselves, I think that the power consumption differences that we see are more due to the implementation and different traditional focus areas of ARM vs Intel/AMD than inherent differences in the instruction sets.
There's a hidden treasure in Python 3.x: __prepare__()