Why Do We Use x86 CPUs?
bluefoxlucid asks: "With Apple having now switched to x86 CPUs, I've been wondering for a while why we use the x86 architecture at all. The Power architecture was known for its better performance per clock; and still other RISC architectures such as the various ARM models provide very high performance per clock as well as reduced power usage, opening some potential for low-power laptops. Compilers can also deal with optimization in RISC architectures more easily, since the instruction set is smaller and the possible scheduling arrangements are thus reduced greatly. With Just-in-Time compilation, legacy x86 programs could be painlessly run on ARM/PPC by translating them dynamically at run time, similar to how CIL and Java work. So really, what do you all think about our choice of primary CPU architecture? Are x86 and x86_64 a good choice; or should we have shot for PPC64 or a 64-bit ARM solution?" The problem right now is that if we were going to try to "vote with our wallets" for computing architecture, the only vote would be x86. How long do you see Intel maintaining its dominance in the home PC market?
Until someone replaces the PC.
PC architecture sits in a local minimum where the fastest route to greater profits lies in improving existing designs, rather than developing new approaches.
The reason "We" use x86 is because "we" use PCs, where x86 technology is dominant and obvious. However, "we" also use PDAs, cell phones, TiVos and even game console systems. As the functions of those devices melt into a new class of unified devices, other architectures will advance.
The real irony is that, for most of these other devices, the underlying architecture is invisible. Few know that Palm switched processors a few years back. Fewer still know what kind of CPU powers their cell phone.
because they don't cost an ARM and a leg and they don't pose as much of a RISC
Change is expensive.
So don't change unless there is a compelling reason.
Hard to optimize? You only have to optimize the compiler once; spread over millions of devices, that cost is small.
With runtime interpreters/compilers, you lose the speed advantage.
Volume and competition make x86 series products cheap.
The reason given, which people seem to keep forgetting, was pretty simple and believable:
Performance per watt.
The PPC architecture was not improving _at all_ in performance per watt. Apple's market was growing fastest in the portable space, but it was becoming impossible to keep temperatures and power consumption down with PPC processors.
And IBM's future plans for the product line were focusing on the Power series (for high-end servers) and the console processors (for Xbox 360s) and not on the PowerPCs themselves.
While I've never had any particular love for the x86 instruction sets, I, for one, enjoy the performance of my Core 2 Duo MacBook Pro, and the fact that it doesn't burn my lap off, like a PowerPC G5-based laptop would.
Why do we drive on the right side of the road in some places, left in others?
Why do most screws tighten clockwise?
Why do we use a 7-day calendar, 60-second minutes, 60-minute hours, and a 24-hour clock like the Sumerians instead of base 10?
Why do we count in base 10 instead of binary, hex, base 12?
Why don't we all switch to Esperanto or some other idealized language?
Or if you're familiar with the story: Why are the Space Shuttle boosters the size they are?
Because sometimes it's easier to stick with a standard.
There. Question answered. Next article please.
The world is made by those who show up for the job.
There's no doubt that x86 is an ugly, hacked-together architecture whose life has been extended far beyond reason by various extensions which were hobbled by having to maintain backwards compatibility. x86 was designed nearly 30 years ago as an entry-level processor for the technology of the day. It was originally built as a 16-bit architecture, then extended to 32-bit, and recently 64-bit (compare to PowerPC, designed for 64-bit and, for the earlier models, scaled back to 32-bit with forward-looking design features). Even the major x86 hardware vendors, Intel and AMD, have long since stopped executing x86 instructions directly in their cores, choosing instead to design decoders which rapidly translate x86 instructions to the RISC-like internal operations the cores actually run.
So why the hell do we use x86? A major reason is inertia. The PC is centered around the x86, and there are mountains and mountains of legacy software in use that depend on it. For those of us in the open-source world, it's not too difficult to recompile and maintain a new binary architecture, but for all of the software out there that's only available in binary form, emulation remains the only option. And although binary emulation of x86 is always improving, it remains much slower than native code, even with translation caches. Emulation is, at this point, fine for applications that aren't computationally intensive, but the overhead is such that the clocks-per-instruction and performance-per-watt advantages of better-designed architectures disappear.
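For anyone who hasn't seen what a "translation cache" looks like, here's a minimal sketch in Python. Everything in it is made up for illustration: the guest instruction set (`add`/`mul` on one accumulator), the `Emulator` class, and `translate_block` are hypothetical and have nothing to do with real x86 or any real emulator. The only point is the shape of the idea: pay the translation cost once per block, then reuse the result every time that block runs again.

```python
# Toy dynamic binary translator with a translation cache.
# The guest instruction set is invented for illustration; it is not x86.

def translate_block(block):
    """Translate a tuple of (op, value) guest instructions into one callable."""
    steps = []
    for op, value in block:
        if op == "add":
            steps.append(lambda acc, v=value: acc + v)
        elif op == "mul":
            steps.append(lambda acc, v=value: acc * v)
        else:
            raise ValueError(f"unknown guest op: {op}")

    def run(acc):
        # Apply each translated step to the accumulator in program order.
        for step in steps:
            acc = step(acc)
        return acc

    return run


class Emulator:
    def __init__(self):
        self.cache = {}        # guest block -> translated callable
        self.translations = 0  # how many times the translation cost was paid

    def execute(self, block, acc):
        translated = self.cache.get(block)
        if translated is None:
            # Cache miss: translate once and remember the result.
            translated = translate_block(block)
            self.cache[block] = translated
            self.translations += 1
        return translated(acc)


emu = Emulator()
hot_loop = (("add", 3), ("mul", 2))  # hypothetical guest code block

results = [emu.execute(hot_loop, 1) for _ in range(1000)]
print(results[0], emu.translations)
```

Even in this toy, every guest instruction still goes through an extra layer of indirection after translation, which is roughly where the "slower than native even with a cache" complaint comes from.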
A side effect of the enormous inertia behind x86 is that a vast volume of sales goes to Intel and AMD, which in turn funds massive engineering projects to improve x86. All things being equal, the same investment of engineer man-hours would bear more performance fruit on MIPS, SPARC, POWER, ARM, Alpha, or any of a number of other more modern architectures, but because of the huge volumes the x86 manufacturers deal in, they can afford to spend the extra effort improving the x86. Nowadays, x86 has gotten fast enough that there are basically only two competing architectures left for general-purpose computing (the embedded space is another matter, though): SPARC and POWER. SPARC, in the form of the Niagara, has a very high-throughput multithreaded processor design great for server work, but it's very lackluster for low-latency and floating-point workloads. POWER has some extremely nice designs powering next-generation consoles (Xenon and the even more impressive Cell), but the Cell in particular is so radically different from a standard processor design that it requires changes in coding practice to really take advantage of it. So, even though the Cell can mop the floor with a Core 2 or an Opteron when fully optimized code is used, it's easier (right now at least) to develop code that uses an x86 well than code which fully utilizes the Cell.
Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
Andy Grove: "Not Much."
One perspective on the question:
Non-x86 architectures are certainly not inherently better clock for clock. That's a matter of specific chip designs more than anything else. The P4 was a fairly fast chip, but miserable clock for clock against a G4. An Athlon, however, was much closer to a G4. (Remember kids, not all code takes advantage of SIMD like AltiVec!) And the G4 wasn't very easy to bring up to super high clock rates. The whole argument of architectural elegance no longer applies.
The RISC Revolution started at a time when decoding an ugly architecture like VAX or x86 would require a significant portion of the available chip area. The legacy modes of x86 significantly held back performance because the 8086 and 80286 compatibility areas took up space that could have been used for cache or floating point hardware, or whatever. Then, transistor budgets grew. People stopped manually placing individual transistors, and then they stopped manually fiddling with individual gates for the most part. Chips grew in transistor count to the point where basically, nobody knew what to do with all the extra space. When that happened, x86 instruction decoding became a tiny area of the chip. Removing legacy cruft from x86 really wouldn't have been a significant design win after about P6/K7.
Instead of being a design win, the fixed instruction length of the RISC architectures no longer meant improved performance through simple decoding. It meant that even simple instructions took as much space as average instructions. Really complex instructions weren't allowed, so they had to be implemented as multiple instructions. Something that was one byte on x86 was always exactly 4 bytes on MIPS. Something that was 12 bytes on x86 might be done as four instructions on MIPS, and thus take 16 bytes. So, effective instruction cache sizes and effective instruction fetch bandwidth grew on x86 compared to purer RISC architectures.
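The arithmetic above is easy to check. Here's a small Python sketch using only the numbers from the post (1 byte vs. 4 bytes, 12 bytes vs. four 4-byte instructions); the function and variable names are just ones I picked for illustration.

```python
RISC_INSTRUCTION_BYTES = 4  # fixed width, as on classic 32-bit MIPS


def fixed_width_size(instruction_count):
    """Code size when every instruction is the same width."""
    return instruction_count * RISC_INSTRUCTION_BYTES


def variable_width_size(instruction_lengths):
    """Code size when each instruction has its own length in bytes."""
    return sum(instruction_lengths)


# Case 1: a simple operation that is a single 1-byte x86 instruction.
simple_x86 = variable_width_size([1])
simple_mips = fixed_width_size(1)

# Case 2: a complex operation that is one 12-byte x86 instruction,
# but needs four separate instructions on MIPS.
complex_x86 = variable_width_size([12])
complex_mips = fixed_width_size(4)

print(simple_x86, simple_mips)    # 1 4
print(complex_x86, complex_mips)  # 12 16
```

In both cases the variable-length encoding comes out smaller, which is the whole "effective cache size" argument in two lines of output.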
At the same time, the gap between compute performance and memory bandwidth on all architectures was widening. Instruction fetch bandwidth was irrelevant in the time of the PC XT, because RAM fetches could actually be done in something like a single cycle. That's less than it takes to get to on-chip SRAM caches today. But, as time went on, memory accesses became more and more costly. So, if a MIPS machine was in a super tight loop that ran in L1 cache, it might be okay. But if it was just going balls to the wall through sequential instructions, or a loop that was much larger than cache, then it didn't matter how fast it could compute the instructions if it couldn't fetch them quickly enough to keep the processor fed. But x86's absurdly ugly instruction encoding acted like a sort of compression, meaning that a loop was more likely to fit in a cache of a given size, and that better use was made of instruction fetch bandwidth.
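To put rough numbers on the "encoding as compression" point, here's a back-of-the-envelope sketch in Python. All the figures are assumptions I made up for the example, not measurements: a 16 KiB instruction cache, 64-byte cache lines, a 5000-instruction loop, and an average of 3 bytes per instruction for the dense encoding. It also assumes the same instruction count on both sides, which flatters the fixed-width case, since complex operations would need extra instructions there.

```python
import math


def code_bytes(instruction_count, avg_instruction_bytes):
    """Total size of a stretch of code, given an average instruction length."""
    return instruction_count * avg_instruction_bytes


def cache_lines_needed(total_bytes, line_bytes=64):
    """How many cache lines must be fetched to hold the code."""
    return math.ceil(total_bytes / line_bytes)


def fits_in_cache(total_bytes, cache_bytes):
    return total_bytes <= cache_bytes


ICACHE_BYTES = 16 * 1024   # assumed instruction cache size
LOOP_INSTRUCTIONS = 5000   # assumed loop length

dense = code_bytes(LOOP_INSTRUCTIONS, 3)  # assumed variable-length average
fixed = code_bytes(LOOP_INSTRUCTIONS, 4)  # fixed 4-byte instructions

print(dense, cache_lines_needed(dense), fits_in_cache(dense, ICACHE_BYTES))
print(fixed, cache_lines_needed(fixed), fits_in_cache(fixed, ICACHE_BYTES))
```

With these made-up numbers the dense loop stays resident in the cache and the fixed-width one spills out of it, so every trip around the loop goes back to slower memory.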
Also, people had software that ran on x86, so they bought 9000 bazillion chips to run it all. The money spent on those 9000 bazillion chips got invested in building better chips. If somebody had the sort of financial resources that Intel had to build a better chip, and they shipped it in that sort of volume, we might well see an extremely competitive desktop SPARC or ARM chip.
tIs' ebacsue ilttel neidna si ebttre.
-- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz
The x86 ISA hasn't been bound to Intel for some time now. There are currently at least three manufacturers making processors that implement the ISA, and of course there is a vast number of companies making software that runs on that ISA. Not only that, Intel isn't even the source of all of the changes/enhancements in their own ISA -- see AMD64.
With all of that momentum, it's hard to see how any other ISA could make as much practical sense.
And it's not like the ISA actually constrains the processor design much, either. NONE of the current x86 implementations actually execute the x86 instructions directly. x86 is basically a portable bytecode which gets translated by the processor into the RISC-like instruction set that *really* gets executed. You can almost think of x86 as a macro language.
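To make the "macro language" comparison concrete, here's a toy decoder in Python. It is a sketch under heavy assumptions: the micro-op names (`load`, `store`), the `tmp` register, and the tuple format are all invented, since real internal micro-op formats are vendor-specific and not something I can quote. It only shows the general pattern of one instruction with a memory operand being cracked into simpler load/compute/store steps.

```python
def is_memory(operand):
    """In this toy syntax, memory operands are written in brackets, e.g. "[rbx]"."""
    return operand.startswith("[")


def decode(instruction):
    """Expand one (op, dst, src) instruction into a list of simpler micro-ops."""
    op, dst, src = instruction
    if is_memory(dst):
        # Read-modify-write on memory becomes load, compute, store.
        return [("load", "tmp", dst), (op, "tmp", src), ("store", dst, "tmp")]
    if is_memory(src):
        # Memory source becomes a load followed by a register-only operation.
        return [("load", "tmp", src), (op, dst, "tmp")]
    # Register-to-register operations pass through unchanged.
    return [(op, dst, src)]


program = [
    ("add", "rax", "rcx"),
    ("add", "rax", "[rbx]"),
    ("add", "[rbx]", "rax"),
]

micro_ops = [uop for instruction in program for uop in decode(instruction)]
for uop in micro_ops:
    print(uop)
```

The software only ever sees the three instructions in `program`; how many internal operations they turn into is the chip designer's business, and it can change from one generation to the next.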
For very small processors, perhaps the additional overhead of translating the x86 instructions into whatever internal microcode will actually be executed isn't acceptable. But in the desktop and even laptop space, modern CPUs pack so many millions of transistors that the cost of the additional translation is trivial, at least in terms of silicon real estate.
From the perspective of performance, that same overhead is a long term advantage because it allows generations of processors from different vendors to decouple the internal architecture from the external instruction set. Since it's not feasible, at least in the closed source world, for every processor generation from every vendor to use a different ISA, coupling the ISA to the internal architecture would constrain the performance improvements that CPU designers could make. Taking a 1% performance hit from the translation (and it's probably not that large) enables chipmakers to stay close to the performance improvement curve suggested by Moore's law[*], without requiring software vendors to support a half dozen ISAs.
In short, x86 may not be the best ISA ever designed from a theoretical standpoint, but it does the job and it provides a well-known standard around which both the software and hardware worlds can build and compete.
It's not going anywhere anytime soon.
[*] Yes, I know Moore's law is about transistor counts, not performance.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
x86 only refers to the instruction set interface to the CPU, not to its internal architecture. As of the launch of the Pentium Pro, the modern "x86" processor is a RISC-based CPU with an internal x86 translation layer. Start your learning here. 32-bit x86 is also referred to as x86-32 or IA-32. And with the current generation of processors, we are leaving that behind for "x64", also known as x86-64, AMD64, EM64T or IA-32e in its various iterations (not to be confused with IA-64, which is the unrelated Itanium architecture). The "x64" series generally maintain "x86" interface compatibility in order to allow legacy operation. For instance, you can run Warp Server on a dual Opteron server just fine.