Slashdot Mirror


How to Kill x86 and Thread-Level Parallelism

kid inputs: "There's an interesting article discussing how one might go about 'killing' x86. The article details a number of different technological solutions, from a clean 64-bit replacement (Alpha?), to a radically different VLIW approach (Itanium), and an evolutionary solution (Opteron). As is often the case in situations like these, market forces dictate which technologies become entrenched and whether or not they stay that way (VHS vs Beta, anyone?). Another article by the same author covers hardware multi-threading and exploiting thread level parallelism, like Intel's Hyperthreading or IBM's POWER4 with its dual-cores on a die. These types of implementations can really pay off if the software supports it. In the case of servers, most applications tend to be multi-user, and so are parallel in nature."

19 of 72 comments

  1. endian-little post first! by zulux · · Score: 3, Funny



    Post! First

    A From Litte A system endian!

    Rules! x86

    --

    Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    1. Re:endian-little post first! by Haeleth · · Score: 2, Informative

      Programs do the exact same thing on every run. Jumps are jumps, they can be followed.

      If only it were that simple. Ever heard of a "computed goto"?
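
      To make the "computed goto" point concrete, here is a minimal sketch on a made-up toy machine (the instruction names and the `run` helper are hypothetical, not any real ISA). The jump target sits in a register filled from input, so the code alone does not tell you where the jump goes:

```python
# Toy machine, purely illustrative: each instruction is a tuple.
# "jmp_reg" jumps to the address held in a register, so the target is data.
def run(program, user_input):
    regs = {"r0": user_input, "out": None}
    pc = 0
    trace = []                      # addresses actually executed
    while pc < len(program):
        trace.append(pc)
        op, *args = program[pc]
        if op == "add":             # add a constant to a register
            reg, const = args
            regs[reg] += const
            pc += 1
        elif op == "jmp_reg":       # computed jump: target comes from a register
            pc = regs[args[0]]
        elif op == "set_out":
            regs["out"] = args[0]
            pc += 1
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs["out"], trace


PROGRAM = [
    ("add", "r0", 2),     # 0: r0 = input + 2
    ("jmp_reg", "r0"),    # 1: jump to whatever address r0 now holds
    ("set_out", "A"),     # 2
    ("halt",),            # 3
    ("set_out", "B"),     # 4
    ("halt",),            # 5
]

print(run(PROGRAM, 0))    # follows addresses 0, 1, 2, 3
print(run(PROGRAM, 2))    # follows addresses 0, 1, 4, 5
```

      Same program, different path on each run, and the only way to know which is to know the input.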

  2. How to kill x86 by Anonymous Coward · · Score: 2, Funny

    Buy Apple :D

  3. Don't forget by Misinformed · · Score: 4, Interesting

    The space shuttle still uses 16-bit x86s, the financial system is reliant on v_e_r_y old systems which spew out dot-matrix printed backups. Old systems survive today, and IMHO will always. It has to be organic.

    --

    Slashdot: Racism against Indians OK. China bad, USA good. Blue pill in water supply.
  4. Let's kill x86! by ObviousGuy · · Score: 2, Insightful

    We should rewrite all of our COBOL programs in C while we're at it.

    Might as well compound the folly of tossing out a perfectly good instruction set with the folly of tossing out perfectly good source code.

    Update, don't reinvent. The desire to reinvent is a junior engineer character flaw. It takes several experiences of spending long hours tracking down bugs in a new implementation, rather than simply updating some older code that worked fine, to grow out of it.

    --
    I have been pwned because my /. password was too easy to guess.
    1. Re:Let's kill x86! by boelthorn · · Score: 2, Funny

      There are two opposing opinions on this matter:

      1. The mythical man-month: Plan to build one to throw away. You will anyhow.

      2. Hack something together. Extend it. It will work fine. (This approach works really well in Common Lisp and proves deadly for Perl programs)

      It is true that Intel's base instruction set has survived the last 18 years quite unchanged, and even longer if you count the pre-80386 era. It is also true that it is proven and works. But if you have ever tried to write an assembler or disassembler for that instruction set, you know that it is an amazingly huge heap of crap.

      Intel's IA64 is a nice try, I personally like the approach of trying something new and clean, even though I dislike Intel's business strategies.

      Back to the point: It is sometimes really necessary to reinvent and not to place more and more stones onto an unstable foundation.

    2. Re:Let's kill x86! by Valar · · Score: 4, Informative

      The problem with keeping the x86 architecture and ISA is that it is carrying around legacy burdens from the 286. Even the P4 still boots into real address mode, and has to be PUT into protected mode. There are hundreds and hundreds of instructions and over 100 registers (but still only 8 GPRs), many of which overlap in purpose or are used for entirely non-intuitive purposes (CMPX EAX, EAX). x86 is ready, at the least, for a real version 2 that isn't afraid to break compatibility in order to add major architectural advances (I wouldn't mind a register ring :).

    3. Re:Let's kill x86! by Anonymous Coward · · Score: 2, Insightful

      It has to be put into protected mode at boot-up? Wow, that must take like 10 ns at least, every single time you cycle the power! You're right, better just scrap the whole thing...

  5. h/w vs s/w by StarBar · · Score: 2, Insightful

    This is much like my day-to-day work. The h/w guys think they are gods and always blame us s/w guys for not utilizing the smartness of their designs fast enough. s/w compatibility is what counts for general purpose systems, and it always will. You can cry your guts out about bad system design, segment hell, etc., and it will not help.

    1. Re:h/w vs s/w by oscast · · Score: 2, Interesting

      I think the opposite... software should accommodate hardware. Software should be the commodity and hardware the primary asset.

  6. This is what SUN calls by caesar79 · · Score: 4, Informative

    "Throughput computing"..where the performance is measured not individually but in aggregate.
    See their media kit available at
    http://www.sun.com/aboutsun/media/presskits/throughputcomputing/ for more details.

    However, I believe the whole idea is nothing new. AFAIK, there are only two ways of increasing the performance of a processor (operations per second): either increase the IPC (instructions per cycle) by increasing parallelism, or decrease the cycle time by increasing the clock rate (GHz).

    Each method has its limits and follows the law of diminishing returns. E.g., increasing the clock rate implies increasing the number of stages in the pipeline, and after, say, 10000 stages, the penalties imposed by flushing the pipeline might cancel out the increased GHz. Similarly, if you manage to place 100000 cores on a chip, scheduling amongst these cores and providing realtime access to memory for all of them will become the bottleneck. Hence, I take statements like "how to kill the x86" with a pinch of salt.
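
    The pipeline-depth half of that trade-off can be put into a back-of-the-envelope model. This is only a sketch under invented assumptions: the logic delay, per-stage latch overhead, branch fraction and mispredict rate below are made-up numbers, and a mispredicted branch is assumed to cost one full pipeline's worth of cycles.

```python
def throughput(depth, t_logic=10.0, t_latch=0.1,
               branch_frac=0.2, mispredict=0.05):
    """Instructions per nanosecond for a pipeline with `depth` stages.

    All parameters are illustrative assumptions, not measurements:
      t_logic     total logic delay of one instruction, in ns
      t_latch     fixed overhead added by every pipeline stage, in ns
      branch_frac fraction of instructions that are branches
      mispredict  fraction of branches that are mispredicted
    """
    # Deeper pipeline -> shorter cycle, but the latch overhead never shrinks.
    cycle_time = t_logic / depth + t_latch
    # Each mispredicted branch flushes the pipeline: `depth` wasted cycles.
    cycles_per_instr = 1.0 + branch_frac * mispredict * depth
    return 1.0 / (cycle_time * cycles_per_instr)


def best_depth(max_depth=1000):
    """Depth in 1..max_depth with the highest modelled throughput."""
    return max(range(1, max_depth + 1), key=throughput)


for d in (10, 100, 1000):
    print(d, round(throughput(d), 3))
print("best depth:", best_depth())
```

    With these particular numbers the model peaks at 100 stages, and 1000 stages is no faster than 10. The exact figures mean nothing; the shape of the curve is the point.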

    Finally, it will be the fabrication (physical) technology that decides which one of these dies. E.g., if tomorrow someone is able to come up with a process that enables 100 GHz chips (think extensions of SOI etc.), decreasing the cycle time will win. Similarly, if someone comes out with femtometre (10^-15 m) fabrication technology, then parallelism will win.

  7. it doesn't matter anymore by ajagci · · Score: 2, Insightful

    Two decades ago, the instruction set still mattered because it was closely tied to how the processor executed things. Today, we can put enough logic between the instruction stream and the processor that the instruction set makes no difference anymore.

    And VLIW in particular is quite unconvincing: processors should rely less on compilers, not impose a bigger burden on software writers.

  8. Architecture for software reliability by Animats · · Score: 4, Informative
    Depends on the goal. Here's an architecture for reliability. If vendors had to pay whenever a program crashed, we would have seen this years ago.
    • Channelized I/O. With current peripheral bus to memory interfaces, peripherals can store anywhere in memory. So drivers impact system stability. It doesn't have to be that way. IBM got this right in mainframe design in the 1960s. You want an MMU between the peripherals and memory. Drivers then become non-privileged programs. Existing peripherals don't even need to know there's an MMU in the middle, just as programs don't.
    • High-speed copy. Copying data in memory should be really fast, so fast that it's almost free, even if it takes copy-on-write hardware in the cache to do it. Why? Because then the temptation to put everything in one big address space decreases. With good interprocess communication (think QNX messaging, not CORBA, or horrors, SOAP), building programs out of components can actually work. This includes the OS. File systems, networking, and drivers should all be user programs.

      The neat hardware implementation of this would be to make all MOV instructions take nearly the same time, regardless of the amount of data moved. A MOV should result in a remapping of the source and destination memory in the cache system. Even if this were just implemented for aligned moves, it would be a big help. When your application's 8K buffer needs to be copied to the file system, that copy should be done by updating cache control info, not by really doing it.

    • Graphics MMUs. Get rid of the "window damage" approach, and have real hardware support for overlapping windows. All that's needed are big sprites. Then programs don't have to know or care which window is on top. "Overlay planes" do some of this, but it's not general enough.

      With this, windowing becomes far simpler. Each window is maintained locally. Shared window management is reduced to screen space allocation, which is done by commanding the window MMU.
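
    The high-speed copy item above can be sketched in software as copy-on-write remapping. This is a toy model, not a hardware design: the `Memory` and `Frame` classes and the `mov_page` name are invented for illustration, and only whole, aligned pages are handled. A "copy" just points the destination page at the same frame; bytes move only when one side is later written.

```python
PAGE_SIZE = 8192  # matches the 8K buffer mentioned above; otherwise arbitrary


class Frame:
    """One physical page plus a count of how many pages map to it."""
    def __init__(self, data=None):
        self.data = bytearray(data if data is not None else PAGE_SIZE)
        self.refs = 0


class Memory:
    def __init__(self):
        self.pages = {}         # page number -> Frame
        self.bytes_copied = 0   # bytes physically copied so far

    def _map(self, page, frame):
        old = self.pages.get(page)
        if old is not None:
            old.refs -= 1
        frame.refs += 1
        self.pages[page] = frame

    def alloc(self, page):
        self._map(page, Frame())

    def mov_page(self, dst, src):
        """Aligned whole-page 'copy': update the mapping, move no data."""
        self._map(dst, self.pages[src])

    def write(self, page, offset, value):
        frame = self.pages[page]
        if frame.refs > 1:      # frame is shared, so the real copy happens now
            private = Frame(frame.data)
            self.bytes_copied += PAGE_SIZE
            self._map(page, private)
            frame = private
        frame.data[offset] = value

    def read(self, page, offset):
        return self.pages[page].data[offset]


mem = Memory()
mem.alloc(0)
mem.write(0, 5, 42)
mem.mov_page(1, 0)              # application buffer "copied" to page 1
print(mem.bytes_copied)         # still 0: nothing was physically copied
mem.write(1, 5, 7)              # first write to the copy triggers the real copy
print(mem.read(0, 5), mem.read(1, 5), mem.bytes_copied)
```

    If the receiver only reads the buffer, which is the common case when handing data to a file system, the physical copy never happens at all.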

    1. Re:Architecture for software reliability by RalphBNumbers · · Score: 4, Interesting

      Actually, what you refer to under "Graphics MMUs" has been done for a while under OSX with Quartz and Quartz Extreme.
      Windows are drawn on OpenGL surfaces and their layering is handled entirely by the GPU in Quartz Extreme, and plain old Quartz does basically the same thing in software buffers. In either case, an app never has to do any redrawing when one of its windows is revealed; it's all handled by Quartz.
      And supposedly, whenever it eventually comes out, Longhorn will do more or less the same thing.

      Channelized I/O is probably a good idea, but it's either going to cost you some bandwidth (route all I/O through an expanded version of current MMUs), or be expensive (a separate MMU for I/O). I'm not saying it might not be worth it in the long run, but it will take a bite out of price/performance in the short term for questionable immediate stability gains (one would hope that most people writing kernel space drivers have the sense to KISS).

      High speed copy sounds really interesting, but I'm not sure how practical it is to add to current systems.

      --
      "The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
    2. Re:Architecture for software reliability by Spy+Hunter · · Score: 2, Interesting

      Explicit hardware support for overlapping windows is unnecessary. You don't really want the number of windows you can open to be limited by your video hardware, do you? It can be handled easily in software, using video card acceleration features that are standard today. XFree86 still does things the old-fashioned "redraw when windows are exposed" way, but I don't think there's any technical reason why a new X server couldn't keep all contents of all windows in memory at the same time and never redraw due to expose events. In fact I believe Keith Packard's new X server does this to allow neat effects in the style of Mac OS X.
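
      As a rough illustration of the "keep every window's contents in memory" approach, here is a minimal software compositor. It is a sketch only: the `Window` class and `compose` function are made up for this example and have nothing to do with the actual XFree86 or Quartz code. Each window owns an off-screen buffer, and the screen is rebuilt by painting the buffers back to front, so changing the stacking order never asks an application to redraw.

```python
class Window:
    def __init__(self, x, y, w, h, fill):
        self.x, self.y, self.w, self.h = x, y, w, h
        # Off-screen buffer, drawn once by the application and then kept.
        self.buffer = [[fill] * w for _ in range(h)]


def compose(windows, width, height, background="."):
    """Paint windows back to front onto a fresh screen; return text rows."""
    screen = [[background] * width for _ in range(height)]
    for win in windows:                       # later windows end up on top
        for row in range(win.h):
            for col in range(win.w):
                sx, sy = win.x + col, win.y + row
                if 0 <= sx < width and 0 <= sy < height:   # clip to screen
                    screen[sy][sx] = win.buffer[row][col]
    return ["".join(r) for r in screen]


a = Window(0, 0, 3, 2, "A")
b = Window(1, 1, 3, 2, "B")

print("\n".join(compose([a, b], 5, 3)))   # b on top
print()
print("\n".join(compose([b, a], 5, 3)))   # a raised: only the order changed
```

      The number of windows is bounded by memory, not by sprite hardware, which is the trade-off being argued here.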

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
  9. Multiple chips by Tablizer · · Score: 3, Interesting

    Why not define a new standard machine code set and start making new chips with it? Old software can use the old chip and new software use the new chip. Game machines do something like this.

    Emulators can be implemented such that old chips can still run code from the new standard (and vice versa), just slower. For development, training, simple apps, and testing that is usually fast enough.

    A box could come with both an X86 and an Alpha-clone, for example. Eventually over time the X86 chip is not worth it. The few old apps lying around just use emulation mode.

    1. Re:Multiple chips by Lapzilla · · Score: 2, Informative

      IIRC, Apple did this with their DOS Compatible Macs.... You could run both DOS and MacOS at the same time, and there was some BIOS function that switched between the two.

  10. Re:Power 4? by chez69 · · Score: 2, Informative

    The power 5 will have 2 cpus on a die, and they both will behave like hyperthreading intel cpus.

    so each 'cpu' (i.e. each die) will look like 4 logical cpus

    --
    PHP is the solution of choice for relaying mysql errors to web users.
  11. Re:Power 4? by beerman2k · · Score: 2, Insightful

    Don't underestimate Intel. Unlike the Gnomes they have a plan

    Step 1: Hyperthreading
    Step 2: Multicore
    Step 3: Crush competition (i.e. Profit)