Slashdot Mirror


How to Kill x86 and Thread-Level Parallelism

kid inputs: "There's an interesting article discussing how one might go about 'killing' x86. The article details a number of different technological solutions, from a clean 64-bit replacement (Alpha?), to a radically different VLIW approach (Itanium), and an evolutionary solution (Opteron). As is often the case in situations like these, market forces dictate which technologies become entrenched and whether or not they stay that way (VHS vs Beta, anyone?). Another article by the same author covers hardware multi-threading and exploiting thread level parallelism, like Intel's Hyperthreading or IBM's POWER4 with its dual-cores on a die. These types of implementations can really pay off if the software supports it. In the case of servers, most applications tend to be multi-user, and so are parallel in nature."

72 comments

  1. endian-little post first! by zulux · · Score: 3, Funny



    Post! First

    A From Little A system endian!

    Rules! x86

    --

    Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    1. Re:endian-little post first! by Directrix1 · · Score: 1

      Why can't there just be a recompiler for x86? Have a program that crawls through the executable, recompiling the instructions along the way, and at conditional jumps ignore the condition and recompile both possible paths. Doesn't seem too hard. Wouldn't this work?

      --
      Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
    2. Re:endian-little post first! by addaon · · Score: 1

      It's called binary translation. It's not impossible; search on citeseer. But how do you tell (in the general case) the difference between code and data?

      --

      I've had this sig for three days.
    3. Re:endian-little post first! by Directrix1 · · Score: 1

      That's why I'm saying have it virtually execute it. Have it trace along the program, jumping to all possible paths. Anything it executes will be code, everything else will be data. And also have it take both paths of conditional jumps. Why wouldn't this work?

      --
      Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
    4. Re:endian-little post first! by addaon · · Score: 1

      It would work for any code which does the exact same thing on every run. However, it would not work for any program that (a) depends on user input, (b) depends on input from other hardware, (c) uses a random number generator or is time sensitive, or (d) is not guaranteed to terminate. Oh, come to think of it, it'll fail on any program that (e) uses certain information as both code and data, which is less rare than you might think.

      It's the same issue as decompiling, really; binary translation is decompile-recompile, with certain optimizations made if you don't actually need the full source. Play with disassemblers, let alone decompilers, and you'll see how badly they do in the general case.

      --

      I've had this sig for three days.
    5. Re:endian-little post first! by Directrix1 · · Score: 1

      Programs do the exact same thing on every run. Jumps are jumps, they can be followed. Conditional jumps either jump or they don't. So you just follow both possible paths. You do have a point about the code & data in one place. But this usually only happens in JITs, virtual machines, and program packers. For the JITs and virtual machines you'll just have to have a port to the destination system (which is pretty likely to already exist if it's popular). You could have a plugin system that recognizes executables packed by popular packers and decodes them. I mean it's not going to be pretty, but wouldn't this work?

      --
      Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
    6. Re:endian-little post first! by Haeleth · · Score: 2, Informative

      Programs do the exact same thing on every run. Jumps are jumps, they can be followed.

      If only it were that simple. Ever heard of a "computed goto"?
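
      To make the objection concrete, here is a minimal sketch of the "follow both sides of every branch" idea. Everything in it is an assumption made for illustration: the toy instruction set, the opcode names (`jcc`, `jmp_reg`) and the `PROGRAM` table are invented and have nothing to do with real x86 encoding. The traversal finds every statically reachable address but has to give up at the computed jump, so the code behind it is never discovered.

```python
# Toy program: address -> (opcode, operand). Hypothetical instruction set,
# invented for this sketch; it is not x86.
PROGRAM = {
    0: ("cmp", None),
    1: ("jcc", 4),         # conditional jump: falls through to 2 or jumps to 4
    2: ("add", None),
    3: ("jmp", 6),         # unconditional jump with a fixed target
    4: ("load_reg", None),
    5: ("jmp_reg", None),  # computed jump: target is only known at run time
    6: ("ret", None),
    7: ("sub", None),      # real code, but reached only via the computed jump
    8: ("ret", None),
    9: ("data", None),     # genuine data
}


def discover_code(program, entry=0):
    """Follow every static path from entry; return (code addresses, unresolved jumps)."""
    seen = set()
    unresolved = []
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in seen or addr not in program:
            continue
        seen.add(addr)
        op, operand = program[addr]
        if op == "jmp":
            worklist.append(operand)
        elif op == "jcc":
            worklist.append(operand)    # branch taken
            worklist.append(addr + 1)   # branch not taken
        elif op == "jmp_reg":
            unresolved.append(addr)     # no way to know the target statically
        elif op == "ret":
            pass                        # path ends here
        else:
            worklist.append(addr + 1)   # ordinary instruction: fall through
    return seen, unresolved


code, unresolved = discover_code(PROGRAM)
print(sorted(code))   # [0, 1, 2, 3, 4, 5, 6]
print(unresolved)     # [5]
```

      Addresses 7 and 8 are code, yet the traversal classifies them the same way as the data at 9, which is the "code or data?" problem in miniature.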

    7. Re:endian-little post first! by addaon · · Score: 1

      Echo does different things when I call it with the argument "foo" than when I call it with the argument "bar", no?

      --

      I've had this sig for three days.
    8. Re:endian-little post first! by sadangel · · Score: 1

      The Crusoe processor from Transmeta did just that. Intel's own chips do some form of conversion from x86 CISC into RISC microcode, though to a lesser degree than the Transmeta chips. They have to perform such conversions in order to keep up with the RISCs.
      The x86 is terrible. It sprang from a chip designed for calculators and still carries all of the baggage from that origin. This is why the Athlon64 is so distressing. Sure, it's a more convenient transition and I'm all for dethroning Intel, but it maintains that horrible x86 garbage. "When you make a car, you don't want to get rid of the steering wheel," they argue. The fact is, the x86 is not a car, it's a boat that has been retrofitted with wheels and an engine, but still sports a useless sail and rudder. The Itanium, for all its faults, gave the promise of doing away with this CISC crap for good.

    9. Re:endian-little post first! by CTachyon · · Score: 1

      The dynamic linker is nothing more than a program that does the "same thing" every time it runs: what it does is equivalent to reading the requested executable, writing its contents to memory as data, then jumping to it. (The reality is both more and less complicated due to mmaping.) System-level programs and libraries, plus just-in-time environments like Java, do stuff like that all the time.

      --
      Range Voting: preference intensity matters
    10. Re:endian-little post first! by Anonymous Coward · · Score: 0

      C code like (*f[i])() cannot be traced unless you know the bounds of the f array.

    11. Re:endian-little post first! by sjames · · Score: 1

      If only it were that simple. Ever heard of a "computed goto"?

      Computed goto can be a problem, but it can be overcome, or in a new architecture eliminated entirely.

      A fully general solution requires JIT translation. Lay the code out in blocks with block metadata to indicate the state of translation. Pre-translate by starting at the entry point and following program flow through both sides of branches. As you note, a computed jump cannot be predicted for all cases. However, once a computed jump instruction is actually executed, start translating from its target as a new entry point. Since code blocks have metadata, the next time through, you can see that the block was already translated and move on.
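
      A rough sketch of that lazy scheme, under stated assumptions: the guest "instruction set", the `GUEST` table and the helper names `translate_block` and `run` are all hypothetical, and "translation" here just copies instructions into a cached block rather than emitting real machine code. The point is only the bookkeeping: a block is translated the first time control actually reaches it, including through a computed jump, and is reused afterwards.

```python
# Hypothetical toy guest program: address -> (opcode, operand).
GUEST = {
    0: ("set", 4),          # reg = 4
    1: ("jmp_reg", None),   # computed jump to the address held in reg
    2: ("out", "never"),    # statically looks like code, never reached here
    3: ("halt", None),
    4: ("out", "hello"),
    5: ("halt", None),
}

translated = {}       # block entry address -> translated block (the "metadata")
translate_count = 0   # how many blocks were actually translated


def translate_block(entry):
    """Translate the straight-line run starting at entry, at most once."""
    global translate_count
    if entry in translated:           # already translated: reuse it
        return translated[entry]
    translate_count += 1
    block, addr = [], entry
    while True:
        op, operand = GUEST[addr]
        block.append((op, operand))
        if op in ("jmp_reg", "halt"): # a block ends at a control transfer
            break
        addr += 1
    translated[entry] = block
    return block


def run(entry=0):
    """Execute translated blocks, translating jump targets on demand."""
    output, reg, pc = [], 0, entry
    while pc is not None:
        next_pc = None
        for op, operand in translate_block(pc):
            if op == "set":
                reg = operand
            elif op == "out":
                output.append(operand)
            elif op == "jmp_reg":
                next_pc = reg         # target only known now, at run time
            elif op == "halt":
                next_pc = None
        pc = next_pc
    return output


print(run())             # first run translates blocks 0 and 4
print(run())             # second run reuses both cached blocks
print(translate_count)
```

      Address 2 is never translated because nothing ever jumps there, so nobody has to decide ahead of time whether it is code or data.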

      In a new architecture, you can just design the instruction better. I can't think of a case where a computed jump semantically means that the execution may start anywhere at all (since the majority of addresses are data, undefined, or are not valid entry points). So, define an instruction for jump vector tables. Mark the vector table as special.

    12. Re:endian-little post first! by Anonymous Coward · · Score: 0

      Dude, learn how to read. Learn how to think. My guess is that either you aren't a programmer, or consider yourself above the "lowly" art of assembly language. I could explain to you why your post is stupid, but I think you'd learn more if you figured it out yourself. Hopefully you will.

    13. Re:endian-little post first! by Anonymous Coward · · Score: 0

      how about when C programmers use function pointers

      (*f)(a,b,c);

      i know this isn't asm, but this will make code that can't be virtually run, where does the code go from here?

      what would work better is more of a jit recompiler, run code till you get to a jump or rewrite that section of memory (such as loading a program or library)
      it seems arstechnica had an article about an HP processor doing something like that.

      and there it is.
      http://www.arstechnica.com/reviews/1q00/dynamo/dynamo-1.html

    14. Re:endian-little post first! by addaon · · Score: 1

      (My main source of income is assembly language programming for multiple architectures, primarily PowerPC/Altivec these days.)

      Please explain why my post is stupid... the original post that we are all responding to suggested doing static binary translation. I explained why this is extremely difficult. His response showed that he had not considered the case of variable input; if you neglect this case, you can statically compile all programs to a few print statements and a return value, so it's really just as well not to. I gave an example of a program that can not be statically translated this way.

      --

      I've had this sig for three days.
    15. Re:endian-little post first! by jovlinger · · Score: 1

      Do a google search for HP's dynamo. Fascinating stuff.

  2. How to kill x86 by Anonymous Coward · · Score: 2, Funny

    Buy Apple :D

    1. Re:How to kill x86 by Anonymous Coward · · Score: 0

      Why is this funny? This should be labeled interesting or informative.

    2. Re:How to kill x86 by Anonymous Coward · · Score: 0

      feimwi fdskmi v vs

    3. Re:How to kill x86 by Anonymous Coward · · Score: 0
      Why is this funny? This should be labeled interesting or informative.


      Because Apple is dying. Everybody knows that...

  3. Don't forget by Misinformed · · Score: 4, Interesting

    The space shuttle still uses 16-bit x86s, the financial system is reliant on v_e_r_y old systems which spew out dot-matrix printed backups. Old systems survive today, and IMHO will always. It has to be organic.

    --

    Slashdot: Racism against Indians OK. China bad, USA good. Blue pill in water supply.
    1. Re:Don't forget by Anonymous Coward · · Score: 0

      Well, keep in mind that financial institutions need printers that actually strike paper with force to work with carbon copies, so dot-matrix is about as good as they can do.

    2. Re:Don't forget by Haeleth · · Score: 1

      financial institutions need printers that actually strike paper with force to work with carbon copies, so dot-matrix is about as good as they can do.

      But why do they need that? Why use carbon copying to make three copies of a form when you can just run off three copies on a laser printer?

    3. Re:Don't forget by PD · · Score: 1

      Wrong, the shuttle does not use x86 processors.

    4. Re:Don't forget by Dwonis · · Score: 1
      I know of at least one instrument used on the space shuttle that *does* use x86 processors. 286, to be exact. The reason for this is that the 286 (at least, the one they're using) is fully static, so it won't get affected by the radiation the way that the dynamic components do in newer processors.

      Basically, if you use DRAM in space, the tiny capacitors inside end up getting disrupted by the ambient radiation, causing bits to get flipped.

    5. Re:Don't forget by PD · · Score: 1

      When people talk about the space shuttle computers they usually mean the 5 computers that control vehicle flight.

      Anyway, lots of Intel Pentium and later class computers have flown on the shuttle. I don't think they have too much trouble with the radiation, despite being off-the-shelf models. The shuttle is still well protected from radiation at its typical altitude.

    6. Re:Don't forget by Anonymous Coward · · Score: 0

      well, due to some bizarre legal quirks, "writing" in several countries requires making an impression with an implement (inc. finger..) on paper. Laser or inkjet printing doesn't qualify as writing under these strange laws. So if you need a "written" document, you have to dot-matrix or daisywheel it, even today...

    7. Re:Don't forget by jabuzz · · Score: 1

      That is not going to work for those pay slip things that get printed inside the envelope by a dotmatrix printer with no ribbon, is it now.

    8. Re:Don't forget by jovlinger · · Score: 1

      I was reading in ... ars?... that if you have a gigabyte of memory, you can expect about one random bit flip per week, just from quantum fluctuations. Course, most bit flips are benign, occurring on pages marked clean, or unused, or subsequently overwritten before being read.

      However, if desktop memory gets bigger, ECC RAM will become necessary. It appears to have been constant at 256/512 for a while now, so the increase has slowed, if not stopped.
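
      For anyone wondering how ECC recovers from a flipped bit, here is a minimal sketch using the textbook Hamming(7,4) code. This is an illustration only: real ECC memory uses wider codes over whole memory words, and the function names below are made up for the example. Three parity bits protect four data bits, and the pattern of failed parity checks (the syndrome) points at the position of a single flipped bit.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Fix at most one flipped bit; return (data bits, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 0 means every parity check passed
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the offending bit back
    return [c[2], c[4], c[5], c[6]], syndrome


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1                        # simulate a random bit flip in memory
recovered, position = hamming74_correct(stored)
print(recovered == word, position)
```

      This version only corrects single-bit errors; two flips in the same codeword would be silently miscorrected, which is why practical schemes add an extra parity bit to at least detect that case.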

  4. Let's kill x86! by ObviousGuy · · Score: 2, Insightful

    We should rewrite all of our COBOL programs in C while we're at it.

    Might as well compound the folly of tossing out a perfectly good instruction set with the folly of tossing out perfectly good source code.

    Update, don't reinvent. The desire to reinvent is a junior engineer character flaw. It takes several experiences in spending long hours tracking down bugs in the new implementation rather than simply updating some older code that worked fine.

    --
    I have been pwned because my /. password was too easy to guess.
    1. Re:Let's kill x86! by boelthorn · · Score: 2, Funny

      There are two opposing opinions in this matter:

      1. The mythical man-month: Plan to build one to throw away. You will anyhow.

      2. Hack something together. Extend it. It will work fine. (This approach works really well in Common Lisp and proves deadly for Perl programs.)

      It is true that Intel's base instruction set has survived the last 18 years quite unchanged, and even longer if you consider the pre-80386 era. It is also true that it is proven and works. But if you have ever tried to write an assembler or disassembler for that instruction set, you know that it is an amazingly huge heap of crap.

      Intel's IA64 is a nice try, I personally like the approach of trying something new and clean, even though I dislike Intel's business strategies.

      Back to the point: It is sometimes really necessary to reinvent and not to place more and more stones onto an unstable foundation.

    2. Re:Let's kill x86! by Valar · · Score: 4, Informative

      The problem with keeping the x86 architecture and the ISA is that it is carrying around legacy burdens from the 286. Even the P4 still starts in real address mode at boot up, and has to be PUT into protected mode. There are hundreds and hundreds of instructions, over 100 registers (but still only 8 GPRs), many of which overlap in purpose or are used for entirely non-intuitive purposes (CMPX EAX, EAX). x86 is ready, at the least, for a real version 2, that isn't afraid to break compatibility in order to add major architectural advances (I wouldn't mind a register ring :).

    3. Re:Let's kill x86! by Anonymous Coward · · Score: 2, Insightful

      It has to be put into protected mode at boot-up? Wow, that must take like 10 ns at least, every single time you cycle the power! You're right, better just scrap the whole thing...

    4. Re:Let's kill x86! by Valar · · Score: 1

      And of course, address translation doesn't cost anything in terms of die size, performance or power consumption. It didn't contribute to unnecessarily complicated code and operating systems. Obviously, the engineers at AMD agree with you, otherwise why would they have dumped the segmented memory model for x86-64... Oh wait.

    5. Re:Let's kill x86! by boelthorn · · Score: 1

      At least they could dump 16-bit Real Mode altogether, but imagine all the problems this would create. Most noticeable: the BIOS (which is itself mostly written for Real Mode, except for some half-hearted attempts to introduce Protected Mode interfaces, which proved to be too buggy to use anyway) as we know it may a) cease to exist or b) grow into something that has at least some importance to the OS.

    6. Re:Let's kill x86! by sjames · · Score: 1

      There's no reason that real mode can't be phased out. The first take on it will need a strap to determine if the CPU starts in real or protected mode. This is mostly to avoid a great deal of chaos for BIOS. There's no reason the CPU can't start in flat 32 bit mode with all segments set to 0-0xffffffff.

      Of course, LinuxBIOS spends as little time as possible in real mode before going to flat 32bit mode but other BIOS will need more significant changes.

      It should be possible to phase in a new mode where the internal registers are actively accessible in the machine codes as well.

      However, that adds complications for Intel since now, they are free to arbitrarily change the internals in each revision, but once they make access public, they're stuck with it. If it's not just so, they would have a problem at that point.

    7. Re:Let's kill x86! by LWATCDR · · Score: 1

      "Update, don't reinvent. The desire to reinvent is a junior engineer character flaw. "
      That is not always true. There was a company called Wright Aircraft engines, and yes, it was started by the Wright brothers. In the late 40s and early 50s they were one of the top engine makers. They did not want to waste time with those newfangled jet engines. Their piston engines were the standard in airliners and they thought it would go on forever... It didn't.
      Sometimes starting over is a good thing.
      The x86 is also far from a perfectly good instruction set.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    8. Re:Let's kill x86! by akuma(x86) · · Score: 1

      And of course, address translation doesn't cost anything in terms of die size, performance or power consumption

      Nope. It doesn't. It used to in the early 90s, but now we have transistors to spare. The ISA doesn't matter anymore. It's at most 2nd order effect on die size, power and performance. I design x86 processors for a living - It's a fact.

      There are a million tricks architects can play to get around poor ISAs. What are the fastest SPECint machines on the planet? Hmm...x86 machines!

      The only reasons Itanium wins on SPECfp are cache and memory bandwidth, which are completely orthogonal to the ISA.

      It didn't contribute to unnecessarily complicated code and operating systems

      The cost of developing and validating that code has been paid. We're enjoying the benefits of that labor. Now you want to start over? Re-validate? Re-compile? You're asking way too much. Compilers can compile to x86 now. Processors can optimize out the cruft. Productive programmers don't spend their lives programming in assembly anymore... Why should anyone care that they have an x86 ISA under the hood instead of, say, Alpha?

  5. h/w vs s/w by StarBar · · Score: 2, Insightful

    This is much like my day to day work. The h/w guys think they are gods and always blame us s/w guys for not utilizing the smartness of their designs fast enough. S/w compatibility is what counts for general purpose systems, and it always will. You can cry your guts out about bad system design and segment hell etc. and it will not help.

    1. Re:h/w vs s/w by oscast · · Score: 2, Interesting

      I think the opposite... software should accommodate hardware. Software should be the commodity and hardware the primary asset.

    2. Re:h/w vs s/w by StarBar · · Score: 1

      I agree with that idealistic view, but software is much more complex than hardware, and as a pragmatist I can tell you that a completely new architecture will not take over x86's domination of the current market. It's just too expensive. If only the hardware guys could understand that too. They need to invent something that is 10 times better and 10 times cheaper to manufacture to stir the bowl. Not twice the performance at half the price, because that will not be good enough.

  6. This is what SUN calls by caesar79 · · Score: 4, Informative

    "Throughput computing"..where the performance is measured not individually but in aggregate.
    See their media kit available at
    http://www.sun.com/aboutsun/media/presskits/throughputcomputing/ for more details.

    However, I believe the whole idea is nothing new. AFAIK, there are only two ways of increasing the performance of a processor (operations per second): either increase the IPC (instructions per cycle) by increasing parallelism, or decrease the cycle time by increasing the clock rate (GHz).

    Each method has its limits and follows the law of diminishing returns. E.g. increasing the clock rate implies increasing the number of stages in the pipeline, and after say 10000 stages, the penalties imposed by flushing the pipeline might cancel out the increased GHz. Similarly, if you manage to place 100000 cores on a chip, scheduling amongst these cores and providing realtime access to the memory for all these cores will become the bottleneck. Hence, I take statements like "how to kill the x86" with a pinch of salt.
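
    The pipeline half of that argument can be put into a back-of-the-envelope model. This is a sketch under invented assumptions, not measured data: the constants below (10 ns of total logic delay, 0.1 ns of latch overhead per stage, 1% of instructions causing a flush that costs about one pipeline's worth of cycles) are made up purely to show the shape of the curve. Time per instruction is cycle time multiplied by cycles per instruction, and the two factors pull in opposite directions as the pipeline gets deeper.

```python
# All three constants are illustrative assumptions, not real processor figures.
LOGIC_DELAY_NS = 10.0      # total logic delay of one unpipelined instruction
LATCH_OVERHEAD_NS = 0.1    # fixed overhead added by every pipeline stage
FLUSH_RATE = 0.01          # fraction of instructions that flush the pipeline


def time_per_instruction(depth):
    """Average nanoseconds per instruction for a pipeline with `depth` stages."""
    cycle_ns = LOGIC_DELAY_NS / depth + LATCH_OVERHEAD_NS  # deeper = faster clock
    cpi = 1.0 + FLUSH_RATE * depth                         # deeper = costlier flush
    return cycle_ns * cpi


best_depth = min(range(1, 10001), key=time_per_instruction)
print(best_depth)          # 100

for depth in (1, 10, 100, 1000, 10000):
    print(depth, round(time_per_instruction(depth), 3))
```

    With these particular numbers the sweet spot is 100 stages, and a 10000-stage pipeline ends up about as slow as no pipelining at all, even though its clock is roughly a hundred times faster.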

    Finally, it will be the fabrication (physical) technology that decides which one of these dies. E.g. if tomorrow someone is able to come up with a process that enables 100GHz chips (think extensions of SOI etc.), decreasing the cycle time will win. Similarly, if someone comes out with femto (10^-15) metre fabrication technology, then parallelism will win.

    1. Re:This is what SUN calls by Slack(er)ware · · Score: 1

      I just have to point out: 1 femtometer is about 10000 times smaller than the nominal "diameter" of a hydrogen atom.

      But who knows, maybe we will have superstring transistors in the future.

    2. Re:This is what SUN calls by jovlinger · · Score: 1

      Especially what the article called vertical MT (or coarse grained) was the basis of the coolest supercomputer evar: The Tera.

      It had no cache -- no cache logic! It did have hardware support for a god-awful number of threads per cpu tho. Each time one of them stalled, it would thread switch and keep going. After about 60 or so cycles (this was a few years back), the memory read would be back, so if you had 64 threads per CPU, you would never see a memory-latency related stall.
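
      The arithmetic behind that can be checked with a tiny simulation. The model is a deliberate simplification and an assumption of this sketch, not a description of the real machine: every instruction is a memory access, a thread issues one instruction and then cannot issue again until `mem_latency` cycles later, and the CPU issues from at most one ready thread per cycle.

```python
def utilization(num_threads, mem_latency, cycles=10_000):
    """Fraction of cycles in which some thread was ready to issue."""
    ready_at = [0] * num_threads   # cycle at which each thread may issue again
    issued = 0
    start = 0                      # round-robin starting point
    for now in range(cycles):
        for k in range(num_threads):
            t = (start + k) % num_threads
            if ready_at[t] <= now:
                # Issue one instruction, then wait for its memory access.
                ready_at[t] = now + 1 + mem_latency
                issued += 1
                start = (t + 1) % num_threads
                break              # only one issue slot per cycle
    return issued / cycles


for threads in (1, 16, 32, 61, 64):
    print(threads, round(utilization(threads, mem_latency=60), 3))
```

      In this model a thread is busy for 61 cycles per instruction (one to issue, 60 waiting), so anything from 61 threads upward keeps the issue slot full on every cycle, and 64 threads leaves a little slack. With fewer threads, utilization falls off roughly as threads/61.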

      As all things extreme (think CM-1) the practical issues and inexorable march of intel performance made it ultimately a losing idea, but is still one of the coolest, cleanest, most simplifying out-of-the-box thinkings I know of.

  7. it doesn't matter anymore by ajagci · · Score: 2, Insightful

    Two decades ago, the instruction set still mattered because it was closely tied to how the processor executed things. Today, we can put enough logic between the instruction stream and the processor that the instruction set makes no difference anymore.

    And VLIW in particular is quite unconvincing: processors should rely less on compilers, not impose a bigger burden on software writers.

    1. Re:it doesn't matter anymore by Anonymous Coward · · Score: 0
      Two decades ago, the instruction set still mattered because it was closely tied to how the processor executed things. Today, we can put enough logic between the instruction stream and the processor that the instruction set makes no difference anymore.

      It's this kind of mentality that makes the AMD Athlon XP in my server-closet generate 70 Watts of power, when it really would suffice with maybe 10 Watts using current technology and skipping DOS-compatibility. :P

      I say: sucks!

  8. Cost-efficiency > * by Anonymous Coward · · Score: 1, Interesting

    Since I use linux and it or its applications can be ported to most architectures you throw at it, I could theoretically have my pick of the litter for a future system. What I consider most is the bang-for-my-buck factor.

    Sure, I could spend $20 on eBay and get a Sparc Lunchbox, but there's not enough processing power in there for me. I could also go out and buy a year-old IBM mainframe, but I doubt any auction site will have them anywhere near my price range. I want something that's decent but also cheap. I don't care what architecture it is as long as it 'works' and I can afford it.

    This is for my Desktop/Workstation, mind you.

  9. Power 4? by rawgod0122 · · Score: 1

    The Power 4 is two full CPUs (cores) on a single die, not that crap that Intel puts out called Hyper-Threading where you only have a single full CPU and then some extra logic to quickly swap over to another thread when needed.

    1. Re:Power 4? by chez69 · · Score: 2, Informative

      The power 5 will have 2 cpus on a die, and they both will behave like hyperthreading intel cpus.

      so each 'cpu' will look like 4 logical cpus

      --
      PHP is the solution of choice for relaying mysql errors to web users.
    2. Re:Power 4? by Anonymous Coward · · Score: 0

      SMT is not crap, and that is what the next Power5 will have. SMT also only adds about 5-10% to the size of the die.

    3. Re:Power 4? by beerman2k · · Score: 2, Insightful

      Don't underestimate Intel. Unlike the Gnomes they have a plan

      Step 1: Hyperthreading
      Step 2: Multicore
      Step 3: Crush competition (i.e. Profit)

  10. Architecture for software reliability by Animats · · Score: 4, Informative
    Depends on the goal. Here's an architecture for reliability. If vendors had to pay whenever a program crashed, we would have seen this years ago.
    • Channelized I/O. With current peripheral bus to memory interfaces, peripherals can store anywhere in memory. So drivers impact system stability. It doesn't have to be that way. IBM got this right in mainframe design in the 1960s. You want an MMU between the peripherals and memory. Drivers then become non-privileged programs. Existing peripherals don't even need to know there's an MMU in the middle, just as programs don't.
    • High-speed copy. Copying data in memory should be really fast, so fast that it's almost free, even if it takes copy-on-write hardware in the cache to do it. Why? Because then the temptation to put everything in one big address space decreases. With good interprocess communication (think QNX messaging, not CORBA, or horrors, SOAP), building programs out of components can actually work. This includes the OS. File systems, networking, and drivers should all be user programs.

      The neat hardware implementation of this would be to make all MOV instructions take nearly the same time, regardless of the amount of data moved. A MOV should result in a remapping of the source and destination memory in the cache system. Even if this were just implemented for aligned moves, it would be a big help. When your application's 8K buffer needs to be copied to the file system, that copy should be done by updating cache control info, not by really doing it.

    • Graphics MMUs. Get rid of the "window damage" approach, and have real hardware support for overlapping windows. All that's needed are big sprites. Then programs don't have to know or care which window is on top. "Overlay planes" do some of this, but it's not general enough.

      With this, windowing becomes far simpler. Each window is maintained locally. Shared window management is reduced to screen space allocation, which is done by commanding the window MMU.
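
    As a rough software illustration of the high-speed copy item above: the sketch below models an aligned copy as a remapping of pages onto shared frames, with the real copy deferred until one side writes. The `Memory` class, its method names and the 4096-byte page size are assumptions made for the example; the proposal is for the cache hardware to do this, and this is only a model of the bookkeeping.

```python
PAGE_SIZE = 4096   # assumed page size for this sketch


class Memory:
    """Toy page-mapped memory: virtual page number -> shared physical frame."""

    def __init__(self):
        self.frames = {}       # frame id -> bytearray holding the page contents
        self.refcount = {}     # frame id -> number of pages mapped onto it
        self.page_table = {}   # virtual page -> frame id
        self.next_frame = 0
        self.bytes_copied = 0  # how much data was physically copied

    def _new_frame(self, data=None):
        fid = self.next_frame
        self.next_frame += 1
        self.frames[fid] = bytearray(data) if data is not None else bytearray(PAGE_SIZE)
        self.refcount[fid] = 0
        return fid

    def _map(self, page, fid):
        old = self.page_table.get(page)
        if old == fid:
            return
        self.refcount[fid] += 1
        self.page_table[page] = fid
        if old is not None:
            self.refcount[old] -= 1
            if self.refcount[old] == 0:   # nobody maps the old frame any more
                del self.frames[old]
                del self.refcount[old]

    def map_fresh(self, page):
        self._map(page, self._new_frame())

    def copy_pages(self, src_page, dst_page, count):
        """Aligned 'MOV': point the destination pages at the source frames."""
        for i in range(count):
            self._map(dst_page + i, self.page_table[src_page + i])

    def read(self, page, offset):
        return self.frames[self.page_table[page]][offset]

    def write(self, page, offset, value):
        fid = self.page_table[page]
        if self.refcount[fid] > 1:        # shared frame: copy on first write
            fid = self._new_frame(self.frames[fid])
            self.bytes_copied += PAGE_SIZE
            self._map(page, fid)
        self.frames[fid][offset] = value


mem = Memory()
mem.map_fresh(0)
mem.map_fresh(1)
mem.write(0, 0, 42)
mem.copy_pages(0, 10, 2)                  # "copy" an 8K buffer to pages 10-11
copied_after_mov = mem.bytes_copied       # still 0: only mappings changed
mem.write(10, 0, 7)                       # first write triggers the real copy
print(copied_after_mov, mem.bytes_copied, mem.read(0, 0), mem.read(10, 0))
```

    The cost of the copy itself is proportional to the number of pages remapped rather than the number of bytes, and data that is never modified afterwards is never physically copied at all.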

    1. Re:Architecture for software reliability by RalphBNumbers · · Score: 4, Interesting

      Actually, what you refer to under "Graphics MMUs" has been done for a while under OSX with Quartz and Quartz Extreme.
      Windows are drawn on OpenGL surfaces and their layering is handled entirely by the GPU in Quartz Extreme, and plain old Quartz does basically the same thing in software buffers. In either case, an app never has to do any redrawing when one of its windows is revealed; it's all handled by Quartz.
      And supposedly, whenever it eventually comes out, Longhorn will do more or less the same thing.

      Channelized I/O is probably a good idea, but it's either going to cost you some bandwidth (route all IO through an expanded version of current MMUs), or be expensive (a separate MMU for IO). I'm not saying it might not be worth it in the long run, but it will take a bite out of price/performance in the short term for questionable immediate stability gains (one would hope that most people writing kernel space drivers have the sense to KISS).

      High speed copy sounds really interesting, but I'm not sure how practical it is to add to current systems.

      --
      "The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
    2. Re:Architecture for software reliability by Animats · · Score: 1
      Channelized I/O is probably a good idea, but it's either going to cost you some bandwidth (route all IO through an expanded version of current MMUs), or be expensive (a separate MMU for IO).

      It shouldn't hurt bandwidth. The problem with MMUs is latency, and adding a few hundred nanoseconds to I/O latency isn't going to hurt. I/O accesses have far more coherency than regular memory accesses, so you don't need that much caching within the I/O MMU.
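
      For what an MMU between peripherals and memory would actually check, here is a minimal sketch. The class name `ToyIOMMU`, the window layout and the device names are all hypothetical, invented for illustration. Each device gets explicit windows mapping its own address range onto physical memory, and any DMA write that falls outside its windows is refused instead of landing somewhere arbitrary.

```python
class ToyIOMMU:
    """Checks and translates device DMA addresses before they reach memory."""

    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)
        self.windows = {}  # device -> list of (device_base, phys_base, length)

    def grant(self, device, device_base, phys_base, length):
        """Let `device` access `length` bytes of physical memory at `phys_base`."""
        if phys_base < 0 or phys_base + length > len(self.memory):
            raise ValueError("window lies outside physical memory")
        self.windows.setdefault(device, []).append((device_base, phys_base, length))

    def dma_write(self, device, device_addr, data):
        """Return True if the write was allowed, False if it was blocked."""
        for device_base, phys_base, length in self.windows.get(device, []):
            offset = device_addr - device_base
            if 0 <= offset and offset + len(data) <= length:
                start = phys_base + offset
                self.memory[start:start + len(data)] = data
                return True
        return False       # no window covers this access


iommu = ToyIOMMU(memory_size=64)
iommu.grant("nic", device_base=0, phys_base=32, length=16)

allowed = iommu.dma_write("nic", 0, b"packet")     # inside the window
blocked = iommu.dma_write("nic", 12, b"overflow")  # runs past the window's end
stranger = iommu.dma_write("disk", 0, b"x")        # device has no window at all
print(allowed, blocked, stranger)
```

      The device still addresses memory starting from 0 and never sees the translation, which matches the point that existing peripherals don't need to know the MMU is there.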

      The original Apollo Domain machines had an MMU between the CPU and the Multibus card cage for add-on peripherals. Apollo lost out to Sun for other reasons, their choice of a proprietary token ring networking system being one of them.

    3. Re:Architecture for software reliability by Spy+Hunter · · Score: 2, Interesting

      Explicit hardware support for overlapping windows is unnecessary. You don't really want the number of windows you can open to be limited by your video hardware, do you? It can be handled easily in software, using video card acceleration features that are standard today. XFree86 still does things the old-fashioned "redraw when windows are exposed" way, but I don't think there's any technical reason why a new X server couldn't keep all contents of all windows in memory at the same time and never redraw due to expose events. In fact I believe Keith Packard's new X server does this to allow neat effects in the style of Mac OS X.

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    4. Re:Architecture for software reliability by Anonymous Coward · · Score: 0

      I can't imagine how you can count SOAP, or even CORBA, as a tool for IPC! Get a clue.

    5. Re:Architecture for software reliability by Animats · · Score: 1
      There have been implementations with support for 8 windows, and that clearly wasn't enough. If you had support for, say, 1024, that probably would be enough, even if every icon took up a window slot. The transistor count for this isn't a big deal any more.

      The point here is that we're tied to some architectural decisions from an era when transistors were more expensive, and those decisions are worth a new look.

  11. Multiple chips by Tablizer · · Score: 3, Interesting

    Why not define a new standard machine code set and start making new chips with it? Old software can use the old chip and new software use the new chip. Game machines do something like this.

    Emulators can be implemented such that old chips can still run code from the new standard (and vice versa), just slower. For development, training, simple apps, and testing that is usually fast enough.

    A box could come with both an X86 and an Alpha-clone, for example. Eventually, over time, the X86 chip is not worth it. The few old apps lying around just use emulation mode.

    1. Re:Multiple chips by Lapzilla · · Score: 2, Informative

      IIRC, Apple did this with their Dos Compatible Macs.... You could run both DOS and MacOS at the same time, and had some bios function that switched between the two.

  12. heavily scripted page by Anonymous Coward · · Score: 0

    speaking of thread-levels and x86 tainting..

    I get a blank page except for the advert; front page too.

    I use netscape 4.77, cookies/javashit off. Now I'm a long way from text-only browsing but is it asking too much to have standardized, *.INT/insert_your_international_web_commission_here compliant sites or *at least* clean links to them? Can someone post a plaintext version?

    1. Re:heavily scripted page by Anonymous Coward · · Score: 0

      I use netscape 4.77

      That's your problem. Netscape 4 can't handle standard HTML. Use a browser that isn't broken! Even Internet Explorer is better than Netscape 4, for heaven's sake!

    2. Re:heavily scripted page by Dwonis · · Score: 1
      Netscape 3 is better than Netscape 4, because Netscape 3 doesn't support CSS at all, while Netscape 4's support for CSS is a broken mess that ends up destroying even well-designed pages.

      So, yes. Please stop using Netscape 4.

    3. Re:heavily scripted page by Anonymous Coward · · Score: 0

      real men browse with vi.

    4. Re:heavily scripted page by gl4ss · · Score: 1

      * real men browse with vi.*

      wait, real men browse????????

      --
      world was created 5 seconds before this post as it is.
  13. Re:Cost-efficiency * by aminorex · · Score: 1

    Yes, economy of scale determines who provides
    the most bang for the buck, but there are more
    dimensions to the purchasing decision than
    mips, mflops, and $$. There are watts and
    hours and then, god forbid, intangibles.

    ARM and PPC have the best shot at displacing
    ia32 and its best successor, amd64, because
    they accommodate very real market segments.
    We keep waiting for commodity PPC hardware,
    but it never emerges because the OSS community
    isn't big enough to drive sales to economical
    volume; but some magical event could happen
    at any moment in PPC-land, nonetheless, as
    IBM is quite motivated to see it happen.
    ARM has economy of scale, but no one
    is pushing its performance into competitive
    domains right now.

    --
    -I like my women like I like my tea: green-
  14. Re:Cost-efficiency * by Anonymous Coward · · Score: 0
    Since I use linux and it or its applications can be ported to most architectures you throw at it, I could theoretically have my pick of the litter for a future system.

    This assumes Linux compiles on your pick-of-litter system. If Intel moves to a non-x86 instruction set, then someone needs to port gcc to the new instruction set. Someone would also need to port the kernel to the new architecture. Userland apps may be mostly OK, but some of them will need to be tweaked, too.

    That is why everyone is hesitant to throw out an architecture upon which literally decades of computing is built.

  15. I would sell my soul for commodity PPC by Anonymous Coward · · Score: 0

    Maybe not quite, but I've programmed M68K, x86, and MIPS assembly, and if PPC is anywhere near as clean as M68K, I would absolutely love it on my desktop. Even if I never program in assembly. x86 just feels like some undead stalker that just crawled out of the basement, with bits of slime dripping off it.

  16. old stuff by SanityInAnarchy · · Score: 1

    So does old software. I've seen many a checkout-like system that appeared to be running on DOS, and some terminals I walked up to (even though I may not have been supposed to get to them physically, they were relatively unguarded) responded directly to the three fingered salute.

    And while old hardware still works, especially as long as you have software that's ported to it, old software does not. For that matter, since old hardware is so cheap, people who would keep using 16-bit processors should buy 32-bit ones just as 64-bit starts becoming the de facto standard, because that way they have a low-cost upgrade cycle. Not because they necessarily need it to accommodate new software, but because I'm sure they will start wanting to do new things with their computer system, and sometimes that requires hardware.

    --
    Don't thank God, thank a doctor!
  17. Smaller decoder benefits by tepples · · Score: 1

    Today, we can put enough logic between the instruction stream and the processor

    Wouldn't less decoder logic allow for a smaller decoder, which requires less die space and emits less heat?

    processors should rely less on compilers

    To the other extreme, do you propose a processor that can run Perl directly? What compromise would you find best?

  18. Multicore would increase the Windows Tax by tepples · · Score: 1

    I wouldn't expect to see multicore in home PCs within the next five years, even if multicore becomes so cheap Intel could start putting it in its Celeron chips. The limitation is that Microsoft charges for Windows licenses per core; a license for Windows XP Professional, which can handle two cores, costs much more than a license for Windows XP Home Edition, which can handle one core. Wouldn't multicore require selling the machine with a more expensive version of Microsoft Windows?

    I say "next five years" to estimate how long it will take before Linux desktop environments reach the maturity of even Windows 2000.

  19. not quite ... by vlad_petric · · Score: 1
    I agree with you that CISC vs. RISC is not an issue any more. The decoded trace cache that the P4 has, for instance, takes the conversion off the critical path.

    x86 however has a ridiculously small number of registers. This means that you have to go to memory A LOT. It's easy to make register operations fast, extremely hard to make memory fast. The performance gap between memory and processors is constantly increasing.

    That's why x86-64 has 16 general-purpose registers, Alpha has 32 (64 if you count floating-point), and Itanium ... 128.

    Bottom line: we do need a replacement for x86. Not because it's CISC, but because it has too few registers.

    And, btw, it's not a bad idea to demand more from the compiler. Big table structures on the chip require space and affect clock cycles. Compilers just don't.
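
The register-count argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: the `memory_loads` function is hypothetical, it manages the register file with a simple least-recently-used policy (real compilers use proper register allocation, not LRU), and the access trace is invented. It just counts how often a value has to be fetched from memory because it is not currently in a register.

```python
from collections import OrderedDict


def memory_loads(trace, num_registers):
    """Count loads from memory for a sequence of variable accesses.

    Assumption: values live in a register file of num_registers slots that is
    managed least-recently-used. This is a toy model, not how a real compiler
    allocates registers.
    """
    regs = OrderedDict()
    loads = 0
    for var in trace:
        if var in regs:
            regs.move_to_end(var)       # already in a register: no memory access
            continue
        loads += 1                       # value must be fetched from memory
        if len(regs) >= num_registers:
            regs.popitem(last=False)     # spill the least recently used value
        regs[var] = True
    return loads


# Invented workload: a loop body touching 12 distinct values, run 100 times.
trace = [f"v{i}" for i in range(12)] * 100

for n in (8, 16, 32):
    print(n, "registers:", memory_loads(trace, n), "loads")
```

In this model, 8 registers cannot hold the 12-value working set, so every single access goes to memory, while 16 or more registers need only the 12 initial loads. Real code is rarely this extreme, but it shows why a working set slightly larger than the register file hurts so much.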

    --

    The Raven

    1. Re:not quite ... by Anonymous Coward · · Score: 0
      And, btw, it's not a bad idea to demand more from the compiler.
      I use Gentoo, how will this affect me?
      Oh... badly, I see.