Slashdot Mirror


Linux On Another New Architecture: PowerPC 64-bit

An unnamed correspondent writes: "This one rather silently whizzed by on the kernel mailing list. IBM reports that they have ported Linux to PowerPC hardware running in 64-bit mode. This no doubt applies only to the larger processors but it's pretty cool all the same." I don't see this processor yet listed on the NetBSD page, even on the mind-bending list of not-yet-integrated ports; is this a first? :)

131 comments

  1. Re:IBM by rdean400 · · Score: 1

    I think the impetus for this was their commitment to deliver Linux running on the i-series (formerly AS/400). Of course, there's a lot more to bringing Linux to i-series than making it run on the processor (you also have to virtualize primary and secondary memory to map them into a single address space, which takes a bit of doing for OSes that weren't designed that way from the ground up).

  2. Systems that use the POWER3 chip by Anonymous Coward · · Score: 1

    The POWER3 processor is used in a number of RS/6000 systems and an E-server p-series system:

    - The 44P 170 desktop system
    - The 44P 270 deskside system
    - Three types of SP node
    - The E-server p-series p640 server

    These systems range from 1-2 processors (44P-170) to a 16-way system (Nighthawk II). The p640 is probably the most interesting system, as it is rack mounted and supports up to 4 processors. All in all, quite a few powerful systems to choose from. The more options the merrier!

  3. Re:OK, dumb 32/64 bit question by rgmoore · · Score: 1

    64 bit is going to be helpful if you're interested in high precision floating point math. A 64 bit processor can do operations on 64 bit FP numbers as a single operation, rather than 4 operations as you'd need with a 32 bit processor, so there's a big speedup in heavy duty number crunching. That's why you always see people doing really chunky numerical modeling, like predicting the weather, using 64 bit computers instead of cheaper 32 bit ones. Not what everyone needs, mind you, but the people who do need it are willing to pay.

    --

    There's no point in questioning authority if you aren't going to listen to the answers.

  4. how long until... by simpl3x · · Score: 1

    ...ibm produces inexpensive ppc workstations ala sun? would ibm rather use their own processors as opposed to intel's, given the issues intel is having with itanium? linux is linux is linux, whether it runs on intel hardware or ppc.

  5. Re:The real question by Tower · · Score: 1

    I grabbed an "old" Alpha 21164PC 533MHz/mainboard for $250 a while back. Miles faster than the Multia, and fits in an ATX case. Run it with standard PC hardware (ATA controllers built in). I haven't seen them for that cheap recently, but there's usually an auction on eBay once in a while.

    --
    "It's tough to be bilingual when you get hit in the head."
  6. Re:Big Time Linux: Itanium, S/390, PPC64 by biostatman · · Score: 1


    And where is a semi-usable UltraSPARC distribution?

    I actually have been running RH 6.2 for some time on this machine - runs like a champ. Often have to recompile software, but no real problems to speak of...

    [chris@gomez chris]$ cat /proc/cpuinfo
    cpu : TI UltraSparc IIi
    fpu : UltraSparc IIi integrated FPU
    promlib : Version 3 Revision 15
    prom : 3.15.2
    type : sun4u
    ncpus probed : 1
    ncpus active : 1
    BogoMips : 665.19
    MMU Type : Spitfire

    --
    For the love of $DEITY, loose != not win!!!!!
  7. Re:Dumb answer by Tower · · Score: 2

    _PowerPC Concepts, Architecture, and Design_ by Chakravarty and Cannon (McGraw Hill, 1994) mentioned the 64-bit architecture.

    --
    "It's tough to be bilingual when you get hit in the head."
  8. Re:OK, dumb 32/64 bit question by BlowCat · · Score: 1

    Bitness is very important for the programming model. If (char *) cannot address a byte in a file of a size comparable to the size of a modern hard drive (it is not unreasonable to have such files) it's a major problem - you must either refuse to support large files or use long long everywhere, which is slow without native CPU support.
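
    A minimal sketch of the arithmetic behind that, assuming a flat byte-addressed space where an N-bit pointer can name 2^N distinct bytes. The function name and the 40 GiB file size are made up for illustration:

```python
GIB = 2 ** 30


def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a flat pointer of this width can name."""
    return 2 ** pointer_bits


# Hypothetical file, roughly the size of a whole hard drive.
file_size = 40 * GIB

for bits in (32, 64):
    limit = addressable_bytes(bits)
    fits = file_size <= limit
    print(f"{bits}-bit pointer: {limit // GIB} GiB addressable, file fits: {fits}")
```

    With 32-bit pointers the limit is 4 GiB, so a file that size cannot be covered by a single pointer, which is the point at which you either refuse large files or fall back to a wider integer type for offsets.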

  9. PowerPC - does anyone care? by Ars-Fartsica · · Score: 2
    The PPC market seems to have splintered between AIX and the Mac, both declining markets...how long can IBM continue to develop and market an independent architecture when most users and developers have concluded that Intel processors are "fast enough"?

    Given the current trend of consolidation, I see room for Intel, AMD, and a high-end player yet to be named - either Alpha or PPC. I'm discounting the Mac userbase in advance as I believe Mac users care the least about the technical details of their platform, and hence constitute an OS market more than a microprocessor market.

    1. Re:PowerPC - does anyone care? by RevRigel · · Score: 1

      I'm not sure you understand the market IBM's Power4 chips are aimed at. These chips are two processors on a die, 1GHz, but they'd smoke anything Intel or AMD have even if they were running at 250MHz. They can have incredibly low yields and still make a profit, because these chips are freaking expensive ($10-20k for the chip, I think, although you wouldn't buy one outside an IBM RS/6000). Intel and AMD don't reach that high end, so they have to have the commodity market to prop up their higher priced offerings (Xeons are where the bulk of the profit is). IBM is able to skim off the top, selling a small number of units integrated with all their own hardware and software, and make plenty of money. These are not chips 99% of users will ever have to use or care about.

  10. Re:IBM Gaining Marketability in Mainframe Industry by Tower · · Score: 2

    Neither Sun nor Compaq builds mainframes... not sure what you are referring to...

    --
    "It's tough to be bilingual when you get hit in the head."
  11. apple, 64 bit ppc, etc. by hawk · · Score: 2
    > and the one IBM/Motorola 64-bit PowerPC, the 620, was a horrible flop, coming
    > out six weeks before the internally developed 630.

    The 620 made it into at least one Apple server, iirc. And when it trounced the wintel boxes in a benchmark, the predictable response back was that it wasn't fair to compare a 64 bit machine to 32 bit machines.


    hawk

  12. Why does this matter? by mr · · Score: 2

    If the PPC group can't get their changes into the Linux kernel (as has been noted on /.), then why does it matter?

    --
    If it was said on slashdot, it MUST be true!
  13. Re:IBM by hawk · · Score: 2

    > The guys in the server room who clean the crap off the floor get hot
    > and bothered by operating systems. The guys these people clean up
    > after only care about getting the job done right.

    For the most part. But if you told them to run NT on big iron, they would probably get hot under the collar and very bothered :)


    hawk


    :)

  14. Re:Actually, SMT is much better. by JohnZed · · Score: 2

    Ok, I'm sure I do need a clarification (I'm a compiler person, not a hardware person, and have never worked with the Power series). Why not just use multiple cores on a single chip? Last time I heard, that's what Power4 was going to do, right? Also, are you saying that the 8 execution units are split between several separate threads? If so, does anybody know if it's a fixed split (4 for first thread running, 4 for second) or dynamic (which would be surprising, but cool. . .)?
    Thanks, all!
    --JRZ

  15. Re:Big Time Linux: Itanium, S/390, PPC64 by ppetrakis · · Score: 1

    Convenient that you omit the only mature 64 bit port, which has almost as many distributions available as any platform, 2nd only to x86. That would be the Alpha processor. You can buy them new for around 2k and used under 1k.

    Peter
    --
    www.alphalinux.org

  16. Benchmarks used. by Christopher+Thomas · · Score: 2

    What set of benchmarks were you running when you collected those numbers? I assume you were using a benchmark suite such as SPEC. This is only one measure, useful to a certain audience.

    For that particular project, I was using the go and cc1 integer benchmarks from the SPEC suite (not sure which year). No special reason; these were just the ones I had on-hand, and for the project it didn't really matter (as I was interested in relative and not absolute results).

    I cite the figures from that project as a reference point to give some idea of the ballpark values that can be expected. A 50% increase in ILP for "average" code I might believe. A 400% increase I wouldn't.

    Yes, certain scientific applications can be written to be easily parallelized, but this is only one niche. For most code, I am deeply skeptical of filling 8 issue units per clock. SMT offers the potential for across-the-board speedup (as long as you're running more than one CPU-bound thread on the machine at once).

  17. Multiple cores on a chip. by Christopher+Thomas · · Score: 2

    Why not just use multiple cores on a single chip? Last time I heard, that's what Power4 was going to do, right? Also, are you saying that the 8 execution units are split between several separate threads? If so, does anybody know if it's a fixed split (4 for first thread running, 4 for second) or dynamic (which would be surprising, but cool. . .)?

    The multiple cores idea has been around for a while, and certainly works; SMT is just more resource-efficient.

    My impression is that the Power4 is going to have two cores, but I haven't been following it closely, so I could easily be wrong about that.

    In a SMT system, functional units are indeed shared dynamically between the threads. As far as most of the chip's concerned, there's only one instruction stream, composed of interleaved instructions from the two threads (well, not interleaved in lockstep, but close enough). All you'd need to do would be to add an extra bit onto the register specifier tags (so that the two threads access non-overlapping sections of the register file) and give each thread its own page table identifier (selected by a few bits tacked on to the address). You could even get away with having a single TLB cache.
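
    A minimal sketch of the register-tag idea, assuming two threads and 32 architectural registers per thread (both numbers are made up for illustration, as are the names `REG_BITS` and `register_index`). The thread ID becomes one extra high bit on the register specifier, so the two threads land in non-overlapping halves of one shared register file:

```python
REG_BITS = 5                      # assumed: 32 architectural registers per thread
REGS_PER_THREAD = 1 << REG_BITS
NUM_THREADS = 2                   # assumed: two hardware threads


def register_index(thread_id: int, reg: int) -> int:
    """Map (thread, architectural register) to a slot in the shared file."""
    if not 0 <= thread_id < NUM_THREADS:
        raise ValueError("unknown thread")
    if not 0 <= reg < REGS_PER_THREAD:
        raise ValueError("register number out of range")
    # The thread ID is the extra tag bit placed above the register number.
    return (thread_id << REG_BITS) | reg


thread0_slots = {register_index(0, r) for r in range(REGS_PER_THREAD)}
thread1_slots = {register_index(1, r) for r in range(REGS_PER_THREAD)}

print(register_index(0, 3), register_index(1, 3))
print("halves overlap:", not thread0_slots.isdisjoint(thread1_slots))
```

    The same register number from each thread maps to a different slot, which is all the rest of the pipeline needs in order to treat the two streams as one.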

    In summary, you can keep most of the design the same as for a single-thread machine, and make relatively minor changes in a few places to implement SMT. This takes far less silicon than dual cores, and lets you use the functional units more efficiently and use a wide issue unit efficiently (by boosting parallelism in the instruction stream).

    1. Re:Multiple cores on a chip. by Christopher+Thomas · · Score: 2

      In summary, you can keep most of the design the same as for a single-thread machine, and make relatively minor changes in a few places to implement SMT. This takes far less silicon than dual cores, and lets you use the functional units more efficiently and use a wide issue unit efficiently (by boosting parallelism in the instruction stream).

      I'm not sure I buy this argument. It is far easier to duplicate an existing design to make another core than to modify the design to support SMT. SMT requires big fetch/decode/rename hardware, a big register file and probably big caches, too. All of this stuff is in the critical path and will be difficult to run at the high clock speeds expected from modern cores.

      Actually, unless you want to take a moderate speed hit from recycling external bus protocols internally, you'll have quite a bit of design work on your hands building the internal communications bus for a multi-core system. Whether this is comparable to the amount of work needed for a SMT system is an open question, but it's definitely not negligible.

      The fetch/decode hardware is a straight duplication of the existing hardware - it doesn't take up any more space than for two duplicated cores.

      The architectural register file is in two independent banks; again, no more space than you'd have normally. Your physical register file is probably in the form of distributed reservation stations on the functional units; again, no more space than for duplicate cores.

      You do need more bandwidth on your result busses if you want to use more functional units at once, but this holds true for an aggressively-superscalar single-thread processor too. This is a manageable problem. If necessary, you can trade off bandwidth and latency when building the thing, because your execution stream is much less sensitive to latency than it would be with a single-thread machine.

      Renaming hardware is manageable. You'd need the bandwidth anyways on a wide-issue single-thread processor with the same issue rate.

      Caches will have less locality, which can be partly addressed by the operating system (keeping related threads on the same die), but will still be a problem. I don't think this will kill performance. You can pull tricks like having multiple cache banks to fake multiporting, or you can pipeline the cache and run it at a higher clock speed (as you don't mind _some_ extra latency), or do a number of other things. We're reaching the point of diminishing returns with cache size anyways, so you'll probably still have enough cache to effectively handle both threads.

      Re. clock speed, again, you don't mind a _moderate_ amount of extra latency, because you have enough parallelism to reschedule around it. You also can get away with a smaller instruction window, because you won't have to work as hard to find independent instructions. This saves latency in the scheduler.

      In summary, while you raise legitimate concerns, I don't think that they'll be significant problems.

    2. Re:Multiple cores on a chip. by David+Greene · · Score: 1
      Actually, unless you want to take a moderate speed hit from recycling external bus protocols internally, you'll have quite a bit of design work on your hands building the internal communications bus for a multi-core system.

      I don't know exactly what you mean by "recycling external bus protocols." Certainly you'd want to design the core interface to be efficient in a single-die environment, but it seems to me that the fundamental protocols (cache coherence, etc.) are the same. But I'm not an MP expert, so you probably have more insight on this than I do.

      The fetch/decode hardware is a straight duplication of the existing hardware - it doesn't take up any more space than for two duplicated cores.

      Not true. The SMT has to worry about fetch policy. This is not a trivial problem to solve. Starvation is a real concern here. Two independent cores don't need to worry about the fetch stream of one interfering with that of the other.

      Decode is not a trivial problem, either. IA32 has big problems with this. My (and others') guess is that's why a trace cache was put on the P4. It's a decode cache!

      Then there are problems with a large, multi-ported L1 instruction cache. Or interleaved fetch, which gets back to the first problem.

      The architectural register file is in two independent banks; again, no more space than you'd have normally.

      Except for the additional routing logic and wiring.

      Your physical register file is probably in the form of distributed reservation stations on the functional units; again, no more space than for duplicate cores.

      First off, pet peeve of mine. This is not directed to you personally, but to the computer architecture community in general. It's function (or execution, etc.) unit, not functional unit. I would hope all our execution units are functional. :)

      As for the physical file, distributing it implies a non-uniform register access time a la the 21264. It's not an impossible problem to handle but there is a penalty with such a large file. Register caching can help with this and it may not be a large concern in the end. More study is needed in this area.

      If necessary, you can trade off bandwidth and latency when building the thing

      Certainly. This is why engineering is fun. :) The point of my post is that SMT is not a guaranteed win. It might be beneficial in some situations, but not all. I'm not sure it's justified for a POWER-class machine.

      Renaming hardware is manageable. You'd need the bandwidth anyways on a wide-issue single-thread processor with the same issue rate.

      Ah, but there is no single-thread machine that has the bandwidth of the proposed SMT schemes. Building a 4-way machine is a challenge. 8-way should be much more challenging.

      We're reaching the point of diminishing returns with cache size anyways, so you'll probably still have enough cache to effectively handle both threads.

      That's not true on server-class machines. Capacity is still a problem. This is why we see superlinear speedup on MP systems. The extra cache on each chip makes the machine as a whole run faster than you would expect, given the number of processors. SMT takes this distributed cache and puts it in one big array. This will slow it down in one way or another.

      Re. clock speed, again, you don't mind a _moderate_ amount of extra latency, because you have enough parallelism to reschedule around it.

      Eh? ILP has nothing to do with cycle time, save the impact on cycle time that complex O-O-O hardware can have. One cannot "get around" a slow clock through scheduling. Pipelining can be used, but that has its own costs.

      You also can get away with a smaller instruction window, because you won't have to work as hard to find independent instructions. This saves latency in the scheduler.

      Hmm...maybe. But you argue above that any extra latency in the system can be masked by the O-O-O engine. To me, this implies a larger window. Eventually one thread or another is going to get backed up waiting on memory. When that happens you have to have room available to fetch from the other threads. Has anyone done any studies of instruction queue utilization in SMT? I'd like to know how often the queue is full and how big it has to be to sustain execution. I seem to recall some of the work out of U. Wash. doing this, but I can't find the reference at the moment.

      All this is not to say that SMT is worthless. Far from it. In fact, the fast thread context switching allows some super-cool techniques not previously possible. I'm trying to temper the enthusiasm for SMT a bit. Think of me as a Devil's Advocate. :)


    3. Re:Multiple cores on a chip. by peter · · Score: 1
      You're comparing one SMT CPU to two regular CPUs (on the same chip or not.) Certainly, two CPUs will perform better than a single SMT CPU (for most, but probably not all, tasks.) So what's SMT good for? It doesn't cost nearly as much as two whole CPUs. The idea is to get quite a bit more performance for a bit more cost. You say "SMT is not a guaranteed win", but relative to a single non-SMT CPU on the same amount of silicon, I think it's almost always a win. (Unless you only have one process that needs to run fast, so multithreading won't help you much).

      From what I've seen, SMT is based on the idea that "We have all these execution units to supply peak demand, but they go unused a lot." You add more of everything else, and share the execution units. It's unlikely that both threads will be making peak demands at the same time, so you can probably keep them satisfied. It's a good way to get the average throughput up. This is like ethernet, where the bandwidth is shared, but not everybody uses it all at once, so it works.
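
      A toy model of that sharing argument, assuming 8 execution units and invented per-cycle demand figures for two threads (all names and numbers here are hypothetical). It compares a fixed 4/4 split with a shared pool, and ignores the fact that instructions which fail to issue would really carry over to the next cycle:

```python
UNITS = 8  # assumed number of execution units

# Hypothetical per-cycle demand (instructions ready to issue) for two threads.
demand_a = [6, 1, 2, 7]
demand_b = [1, 6, 5, 1]


def issued_fixed_split(a, b, units=UNITS):
    """Each thread owns half the units and cannot borrow the other half."""
    half = units // 2
    return sum(min(x, half) + min(y, half) for x, y in zip(a, b))


def issued_shared(a, b, units=UNITS):
    """Both threads draw from one pool, limited only by the total."""
    return sum(min(x + y, units) for x, y in zip(a, b))


print("total demand:", sum(demand_a) + sum(demand_b))
print("fixed split :", issued_fixed_split(demand_a, demand_b))
print("shared pool :", issued_shared(demand_a, demand_b))
```

      Because the two threads peak on different cycles in this made-up trace, the shared pool serves all of the demand while the fixed split leaves units idle. Traces where both threads peak together would narrow the gap.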

      I also want to reply to some of your specific points:

      Decode is not a trivial problem, either. IA32 has big problems with this. My (and others') guess is that's why a trace cache was put on the P4. It's a decode cache!

      Decode is pretty much trivial on anything but an IA32. They have this problem because they have to run an old instruction set that wasn't designed to be easy for hardware like we have now to deal with. New designs try to be good compiler targets, and to make the CPU's job easy. Most instruction sets are like the decoded x86 instructions that are generated internally on modern IA32 processors.

      Eh? ILP has nothing to do with cycle time, save the impact on cycle time that complex O-O-O hardware can have. One cannot "get around" a slow clock through scheduling. Pipelining can be used, but that has its own costs.

      Of course you'd use pipelining! Pipelining has a similar goal to SMT: Keep more of the hardware busy more of the time, to increase total work/time. Since run time is the quantity of interest when measuring how fast a computer is, ILP and scheduling have everything to do with cycle time. You can trade off one to get the other, and have a CPU that gets the job done in the same amount of time.
      Time = instructions * CPI * clock period
      So raising the clock speed (smaller clock period) has the same effect as decreasing the average number of cycles per instruction (more ILP).
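
      A quick worked instance of that formula, with made-up numbers (one million instructions, a CPI of 2.0, a 1 ns clock period); the function name `run_time_ns` is only for illustration:

```python
def run_time_ns(instructions: int, cpi: float, clock_period_ns: float) -> float:
    """Time = instructions * CPI * clock period."""
    return instructions * cpi * clock_period_ns


baseline = run_time_ns(1_000_000, cpi=2.0, clock_period_ns=1.0)
faster_clock = run_time_ns(1_000_000, cpi=2.0, clock_period_ns=0.5)  # double the clock speed
more_ilp = run_time_ns(1_000_000, cpi=1.0, clock_period_ns=1.0)      # halve the CPI

print(baseline, faster_clock, more_ilp)
```

      Halving the clock period and halving the CPI give the same run time here, which is the trade-off being described. Whether a real design can get one without paying in the other is the part under dispute.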

      As for window size, remember the law of diminishing returns. Two small windows on two threads will find more instructions to run than one large window on one thread.

      All this is not to say that SMT is worthless. Far from it. In fact, the fast thread context switching allows some super-cool techniques not previously possible. I'm trying to temper the enthusiasm for SMT a bit. Think of me as a Devil's Advocate. :)

      Cool. Just remember that we're not claiming an SMT cpu can do the work of two whole CPUs. As I said, it's an idea in the same category as pipelining. One pipelined CPU with 5 stages probably isn't as good as 5 non-pipelined CPUs (as long as there are five tasks to keep all the CPUs busy. If there's only one job to do, and the compiler didn't parallelize it, the single pipelined CPU will be faster). However, a pipelined CPU only takes a bit more silicon, and gives a big (but not quite x5) speedup.
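
      To put a number on "big, but not quite x5": a minimal sketch assuming an ideal 5-stage pipeline with no stalls, where an unpipelined CPU spends all 5 stage-times on each instruction. The function names and the instruction count of 100 are hypothetical:

```python
def unpipelined_cycles(n_instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions: int, stages: int) -> int:
    """Fill the pipe once, then one instruction completes per cycle (no stalls)."""
    return stages + n_instructions - 1


def speedup(n_instructions: int, stages: int) -> float:
    return unpipelined_cycles(n_instructions, stages) / pipelined_cycles(n_instructions, stages)


for n in (1, 10, 100):
    print(n, "instructions:", round(speedup(n, 5), 2))
```

      The speedup approaches 5 as the instruction count grows but never reaches it, and real stalls and branch mispredictions would pull it down further.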

      You probably figured out some of that before I said it, but I hope that helped :)

      --
      #define X(x,y) x##y
      Peter Cordes ; e-mail: X(peter@cordes , .ca)
    4. Re:Multiple cores on a chip. by David+Greene · · Score: 1
      You're comparing one SMT CPU to two regular CPUs (on the same chip or not.)

      Yes, but I'm assuming equal (or as near equal as is practically possible) execution resources.

      So what's SMT good for? It doesn't cost nearly as much as two whole CPUs.

      See, now that's where I disagree. In terms of raw transistors you are right. But you are forgetting the cost of design and slippage in time-to-market. It is less costly to design and fabricate an MP design from previously designed cores than it is to take such a core and modify it for SMT.

      Realize that I am not saying this should never be done. Just that more evaluation is needed. Some of that evaluation will involve silicon, probably in the consumer market.

      but relative to a single non-SMT CPU on the same amount of silicon, I think it's almost always a win.

      If you look at raw transistor count, I might agree with you. It depends greatly on the architecture and the expense of the duplicated vs. shared resources of the two designs (decode logic, etc.). SMT has a cycle time impact and you have to balance that against the extra transistors required for a CMP.

      You add more of everything else, and share the execution units.

      This is an argument I've never understood. The execution units are such a tiny, tiny part of the die that I don't see much benefit in sharing them. Sharing the decode/O-O-O logic seems more beneficial, but even an SMT requires more of that (in terms of bandwidth).

      Decode is pretty much trivial on anything but an IA32.

      Sure about that? The POWER architecture is pretty complex. Even on a MIPS-like machine, the rename and dependency logic complexity rises rapidly with fetch/issue width. As does the wakeup logic with larger instruction windows.

      ILP and scheduling have everything to do with cycle time. You can trade off one to get the other, and have a CPU that gets the job done in the same amount of time.

      Time = instructions * CPI * clock period

      So raising the clock speed (smaller clock period) has the same effect as decreasing the average number of cycles per instruction (more ILP).

      My point is that raising ILP with SMT can decrease the clock speed (increase the cycle time). So you get more ILP but everything runs more slowly. A CMP with an equivalent number of contexts should get the same ILP without the extra cycle time penalty (ignoring messaging and coherence overhead). SMT does not make any one thread run faster. It increases throughput, which is exactly what a CMP does.

      Pipelining is not a panacea, either. There is a limit to how deep you want to make your pipe. This is one of the reasons good branch prediction is so important -- it allows a longer pipe.

      Remember also that cycle time is pretty much the only thing companies have to market, everything else (i.e. number of threads) being equal. Is this unfortunate? Clearly. But it is reality and engineers need to deal with it when making design decisions. Which reminds me to plug the excellent book The Soul of a New Machine by Tracy Kidder, a fascinating account of Data General's race to kill the VAX. There's a bit of discussion devoted to market times and perfect designs.

      Two small windows on two threads will find more instructions to run than one large window on one thread.

      A CMP has two small(er) windows for two threads. An SMT has one big window for two threads. Two smaller windows should run faster. Whether they find more ILP is an open question. SMT does have the advantage that it can trade off window space, etc. between threads. I think this is more difficult to do than most people realize due to the challenges with fetch policy.

      In any event, I am extremely curious to see what happens with the SMT chips coming out. Let's sit back and see if they make it! :)


    5. Re:Multiple cores on a chip. by David+Greene · · Score: 1
      My impression is that the Power4 is going to have two cores, but I haven't been following it closely, so I could easily be wrong about that.

      It has two cores for redundancy and error-checking, not for execution bandwidth.

      In summary, you can keep most of the design the same as for a single-thread machine, and make relatively minor changes in a few places to implement SMT. This takes far less silicon than dual cores, and lets you use the functional units more efficiently and use a wide issue unit efficiently (by boosting parallelism in the instruction stream).

      I'm not sure I buy this argument. It is far easier to duplicate an existing design to make another core than to modify the design to support SMT. SMT requires big fetch/decode/rename hardware, a big register file and probably big caches, too. All of this stuff is in the critical path and will be difficult to run at the high clock speeds expected from modern cores.


  18. Now what? by ajuda · · Score: 1

    It's all well and good that IBM has ported linux to the powerPC, but when are they going to port linux to the system that we've all been waiting for? I mean, I don't know how much longer I can wait to run linux on Natalie Portman. I can't wait to show her my uptime; the 'touch' command alone would make me happy... I mean, can you imagine a beowulf cluster of those? Ohhh mamma!
    This message was encrypted with rot-26 cryptography.

  19. Dumb answer by Matthew+Smith · · Score: 1

    One of us must be mistaken here... I've always thought that the general "bitness" of a processor referred to the width of the data bus, which can be vastly different from its address bus. There is no obvious (to me) reason why a 32bit CPU cannot address more than 16GB of memory provided its address bus is sufficiently wide. But then again my CPU knowledge is limited to 16bit CPUs so take this with a grain of salt as I generally have no clue what I'm talking about.

    1. Re:Dumb answer by rdean400 · · Score: 1

      The PowerPC has a 64-bit mode, added for the AS/400, that supports the 64-bit single address space (really cool concept...basically combine your RAM and DASD into a really big virtual memory space, where the program can assume that something is always in memory, and the kernel takes care of making sure that it's in RAM when needed).

    2. Re:Dumb answer by Phil+Wilkins · · Score: 1

      That sounds about right, as I said, it was a long time ago...

    3. Re:Dumb answer by Phil+Wilkins · · Score: 1

      Segment registers, ala 80286...

    4. Re:Dumb answer by Hank+the+Lion · · Score: 1

      The ancient Sinclair QL used a 68008, which could handle 32 bit addresses, and thus 4GB of memory, but only had an 8-bit combined address and data bus. It'd take 4 bus clocks to select an address, and another four to read/write a 32 bit value from/to the location.

      I'm sorry to correct you, but the 68000 series of microprocessors does not have a combined address / data bus. The 68008 had a separate address bus of 20 or 22 pins, so it could directly address 1 or 4 MB of memory (depending on package)

    5. Re:Dumb answer by Gsus2 · · Score: 1

      The 68000 address registers were 32bit long, but only the lower 24 bit were used for addressing. Externally the chip had 23 pins, because it addressed words instead of bytes, and had 16 data pins. It could access 2^24 = 16MB. Check out this link

    6. Re:Dumb answer by Phil+Wilkins · · Score: 2

      Actually even the width of the address bus isn't necessarily a limiting factor. The ancient Sinclair QL used a 68008, which could handle 32 bit addresses, and thus 4GB of memory, but only had an 8-bit combined address and data bus. It'd take 4 bus clocks to select an address, and another four to read/write a 32 bit value from/to the location.

      Ouch!

    7. Re:Dumb answer by Wesley+Felter · · Score: 2

      If your general purpose registers are 32 bits (which is the definition of a 32-bit CPU) and addresses are 64 bits, where do you store the pointers? That's why in most recent chips, pointers are the same size as the integer registers.

    8. Re:Dumb answer by Hank+the+Lion · · Score: 1

      The 68000 address registers were 32bit long, but only the lower 24 bit were used for addressing. Externally the chip had 23 pins, because it addressed words instead of bytes, and had 16 data pins. It could access 2^24 = 16MB

      Correct for the 68000, but the original poster complained about the 68008, not the 68000.
      The 68008 has only 20 or 22 address pins, and 8 data pins.
      My comment on the 68000 series was only that it did not have a multiplexed data/address bus, as the original poster claimed.
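
      The pin counts quoted in this sub-thread can be checked with a little arithmetic. A minimal sketch, assuming the address space is simply 2^pins locations times the bytes per location (2 for the word-addressed 68000, 1 for the byte-wide 68008); the function name is made up:

```python
MIB = 2 ** 20


def address_space_bytes(address_pins: int, bytes_per_location: int) -> int:
    """Bytes reachable with this many external address pins."""
    return (2 ** address_pins) * bytes_per_location


chips = [
    ("68000, 23 address pins, 16-bit words", 23, 2),
    ("68008, 20 address pins, 8-bit bytes", 20, 1),
    ("68008, 22 address pins, 8-bit bytes", 22, 1),
]

for name, pins, width in chips:
    print(f"{name}: {address_space_bytes(pins, width) // MIB} MiB")
```

      That gives 16 MB for the 68000 and 1 or 4 MB for the two 68008 packages, matching the figures above.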

    9. Re:Dumb answer by Guy+Harris · · Score: 2
      The PowerPC has a 64-bit mode, added for the AS/400, that supports the 64-bit single address space

      The 64-bit PowerPC architecture antedated the RISC AS/400's, as far as I know - as I remember, I saw a PowerPC architecture manual describing 64-bit mode before the RISC AS/400's came out (it was some time in 1994, I think, when I saw it).

      The PowerPC 620 was supposed to be the first 64-bit PowerPC; I don't know whether any machines shipped with it. IBM now have 64-bit PowerPC's in both the AS/400 and RS/6000 machines (I think some of the RS/6000's use the same chip as some of the AS/400's, with the tag bits and other AS/400 extensions disabled in the RS/6000's).

    10. Re:Dumb answer by Rares+Marian · · Score: 1

      Nah you just create Floozy processor that requires an 8 stage decoder and uses 8 bit registers.

      --
      The message on the other side of this sig is false.
    11. Re:Dumb answer by rdean400 · · Score: 1

      It's only within the last two or three years that the RS/6000's gained 64-bittedness. Prior to that, PPC chips were dual mode, with the AS/400s using the 64-bit "AS" instruction set and the RS/6000's using the 32-bit native mode.

  20. Re:OK, dumb 32/64 bit question by Ikkyu · · Score: 1

    "I am not a smart man" as a friend of mine is fond of saying, but I believe that it can best be explained this way. Let us say that we have two processors identical in every way save one is 32 bit and one is 64 bit. A majority of the work of a cpu is not in calculation but rather data transfer. A fetch takes 2 cycles while a write also takes 2 cycles. If we were to write 128 bits of information to disk from memory it would take 8 cycles for the 64 bit machine and 16 cycles for the 32 bit machine. End result: a 32 bit machine has to be clocked 2X the 64 bit machine to keep up in this sort of scenario.
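
    The cycle counts above can be reproduced with a few lines. This is a sketch of the simplified model in this comment only (2 cycles per fetch, 2 per write, one word per transfer); `transfer_cycles` is a made-up name, and real hardware with caches and burst transfers behaves differently.

```python
def transfer_cycles(total_bits: int, word_bits: int,
                    fetch_cycles: int = 2, write_cycles: int = 2) -> int:
    """Cycles to move `total_bits` one word at a time, fetching then writing each word."""
    words = -(-total_bits // word_bits)  # ceiling division
    return words * (fetch_cycles + write_cycles)


cycles_64 = transfer_cycles(128, 64)
cycles_32 = transfer_cycles(128, 32)
print(cycles_64, cycles_32)  # 8 16
```

    Under these assumptions the 32 bit machine needs twice the cycles, which is where the 2X clock figure comes from.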

  21. Re:The real question by robert-porter · · Score: 1

    http://slashdot.org/article.pl?sid=01/02/27/1458203&mode=thread

  22. Re:OK, dumb 32/64 bit question by CargoCult · · Score: 2

    Good way of looking at it, but lots of other things matter as well:

    pipeline efficiency, memory bus bandwidth, smp cache coherency efficiency.

    If you don't need >4GB of address-space then you're probably better off with a high-clock 32-bit chip and a good memory bus

    --
    **Vanuatu or bust**
  23. read the link first, dumbass! by rdean400 · · Score: 1

    power3 = the other 64-bit PowerPC implementation. Ignore previous comment.

  24. Now by mike260 · · Score: 1

    Playstation2 is *kind of* 64bit:
    sizeof(int)==4
    sizeof(long)==8
    sizeof(void*)==4 (IIRC)
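
    To compare with whatever machine you are reading this on, the same three sizes can be queried through Python's standard `ctypes` module. This is a sketch; it reports the sizes used by the C compiler that built your Python, not anything about the Playstation2 toolchain.

```python
import ctypes


def c_type_sizes() -> dict:
    """Sizes in bytes of the host platform's C int, long and void*."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "void*": ctypes.sizeof(ctypes.c_void_p),
    }


for name, size in c_type_sizes().items():
    print(f"sizeof({name}) == {size}")
```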

    1. Re:Now by Phil+Wilkins · · Score: 1

      It's also kinda 128bit, in that the register file is 128bits wide, although to access the upper 64 bits of a register you have to load it into the vector unit coprocessor, VU0. Thus allowing you to do such lovelies as single instruction, single cycle, floating point vector MACCs embedded in your normal instruction stream, and indeed, with a little help from a well defined vector class, generated by the compiler.

  25. Re:IBM by rgmoore · · Score: 1
    So if IBM can cut server OS development and maintenance costs by 50% by having much of the work done by the Linux community, that increases their profit margins. And it also benefits the Linux community, since they'd be developing and maintaining Linux anyway, and this adds IBM and IBM server customers to the people who have an interest in helping develop and maintain Linux.

    I think that this precisely underlines why Free/Open Source software is such a great idea. When you share, everyone wins. Getting more people onto the platform increases the development effort much less than the support base, so the average effort per user is less. As long as IBM is truly sharing by adding some effort into the system rather than leeching off everybody else, bringing them onboard helps everybody. Admittedly, adding new platforms as IBM is doing is more effort than more users on already supported platforms, but there's also potentially more benefit. Adding users with different needs adds new features (which is why it's more effort), but many of them will provide trickledown benefits to other users who wouldn't necessarily have been willing to develop them by themselves. And IBM is playing fair by putting in the development effort of adding those new platforms and features themselves rather than demanding that others do the work.

    --

    There's no point in questioning authority if you aren't going to listen to the answers.

  26. Get a life by Anonymous Coward · · Score: 1
    You've been reading slashdot way too long if you need to see a conspiracy in everything. IBM is interested in having Linux available on all of its server platforms and has committed to that. This included the former Netfinity which I most assuredly predict will include Itanium. (Either this or possibly Athlon's sledgehammer will be the processor for Windows which unfortunately customers demand.)

    You have to understand that IBM has traditionally been a hardware company but services really drive revenue and profit now. Sun hasn't stepped up to Linux the way that IBM, Compaq, and HP have so IBM is hoping to pick up some of the sales and services as Linux momentum picks up.

  27. Re:Linux is dying by Klaen · · Score: 1

    I've used GNU/Linux since early '98, and I haven't posted to USEnet since 1996, sooo.... what good are your figures?

    Seriously, most of us have better things to do with our time than chat in newsgroups or IRC (I finally gave up MUDding in 1997, and I still miss it...), like work for a living!

    Sheesh. This doesn't even qualify as FUD; your 'logic' is ^severely^ flawed!

  28. Articles on Testing Web Apps, Kernel by goingware · · Score: 2
    Hot off the presses tonight:

    Maybe the folks who write the Slashcode would find it helpful.

    I've posted this here before, but don't want the IBM folks who might be reading to miss it:

    Comments, criticism, additional links and resources to add, suggestions for future articles to write and of course articles you would like to write are appreciated.

    I could also use some help from someone with expertise in designing database schemas.

    Thank you for your attention.


    Mike

    --
    -- Could you use my software consulting serv
  29. Re:OK, dumb 32/64 bit question by tjrw · · Score: 1

    Dunno how the latest snapshots are doing, but historically, the ia32 gcc compiler has produced *really* lousy code for 64-bit long long data.
    So much so, that Linus has shunned use of 'long long' in the kernel as much as possible. This is rather a pity. I hope that someone improves this in the future.

    Tim

  30. Re:IBM by slyfox · · Score: 2
    Right now the top non-clustered TPC-C score is held by IBM's S80 system. TPC-C (not SPEC) is considered by many to be the most important server benchmark.

    The system from the benchmark report has 24 RS64-IV 64-bit processors running at 600 MHz with 96GB (yes, GB) of system DRAM. Each processor has 128kB L1 data cache, 128kB L1 instruction cache, and a 16MB L2 cache. The chips also support coarse-grain multithreading (simpler, but similar to SMT).

    (600 MHz sounds slow until you realize that it uses a simple, very efficient 5-stage pipeline. Intel and others achieve high clock rates through deep pipelines and rely on branch prediction and other techniques to keep the pipe full. Branch mispredictions and cache misses can kill the actual performance of these chips on real server code.)

    This system with 24 processors outperforms HP's 48 processor "SuperDome" and Sun's 64 processor UE10k (though the UE10k is an old system by now, it is the fastest server Sun is shipping.)

    The above system is not using the Power3 chip from the posted story. You can bet IBM will port Linux to this beast next. We won't see 24-processor systems with Linux right away, but an S80-like system would make a sweet 4-processor Linux server.

    One last note: these systems are not vapor-ware. A 12-processor system with an earlier version of the same processor has been shipping since the summer of '98.

  31. Re:The real question by Guanix · · Score: 1

    Try finding an old Multia. They have 166 MHz Alpha processors and generally cost around $100 on eBay.

  32. Re:Power3 or PowerPC? by chromatix · · Score: 2

    Here's some acronyms for you, which should clear up the mess:

    POWER == Performance Optimised With Enhanced RISC
    PowerPC == POWER for Personal Computers

    The PowerPC was developed as a cut-down (32-bit instead of 64-bit and lacking a few rarely-used and complex instructions), largely binary-compatible version of the POWER.

    PowerPC isn't really any particular processor, but a specification, which was first implemented as the PowerPC 601 back in 1994 (remember how it totally wiped out the Pentium-75?). Subsequently, embedded versions have been made, along with more powerful desktop versions of the PowerPC - the 603, 604, 750 (G3), 7400 (G4) and now the 7450 (G4+).

    Meanwhile, the POWER has been developed as well, remaining a high-end 64-bit monster for the enterprise-level RS/6000 machines. The PowerPC 601 was based more on the POWER1 than anything else, the chip shown in the log is a POWER3, and the current hot topic is the POWER4 with all these nice new features (one or two of which have reportedly already made it into the 7450...).

    The bottom line is that the POWER and the PowerPC are different but surprisingly similar beasts. They are nearly binary-compatible, which is why the kernel reports it as a PowerPC-class processor.

    --
    --- The key to knowledge is not to rely on people to teach you it ---
  33. IBM by Cirvam · · Score: 1

    What value does this have? I don't know of too many systems that use that processor. Was there a demand for it in the community or was IBM just scratching an itch?

    1. Re:IBM by civilizedINTENSITY · · Score: 1

      So if the last two are correct, does this mean we've won? Has IBM waved its wand and made Linux mainstream? Here's one hopin' happy human.
      :-)

    2. Re:IBM by civilizedINTENSITY · · Score: 1

      Tell me...my school has one and the bastards use it for a *mail server*. Physics can't get dick out of "Office of Information Systems" because they all have MBAs and think Win2k is *the* answer. Oh the shame!

    3. Re:IBM by carlfish · · Score: 4
      Off the top of my head:
      • Products such as Websphere can be released on one OS platform (Linux) and run on IBM's entire range of hardware.
      • Linux has a lot more "geek momentum" than AIX. The guys in the server room would probably be much more excited to get a kick-ass RS/6000 if it meant they could stick Linux on it.
      • It gives them something to talk about in their upcoming advertising campaign.
      • IBM is a hardware company. To them, software is a way to sell hardware. If Linux is popular, then it's in IBM's interest to make sure it runs on their most expensive kit. They'd rather sell an RS/6000 than a Netfinity. (this also explains their porting it to S/390 first)
      • I wouldn't be surprised if the long term plan was to fold the enterprise functionality of AIX into Linux, have the OS maintained by the open source community with much less IBM manpower than AIX takes, and then put AIX out to pasture.
      Charles Miller
      --
      --
      The more I learn about the Internet, the more amazed I am that it works at all.
    4. Re:IBM by Klaen · · Score: 1

      I have a number of rs/6000s at work, running AIX currently. It's nice to know that as soon as I don't need AIX (read: as soon as I don't need V4 CATIA or CATIA is ported to GNU(!)) I can run GNU/Linux on those bad boys!

      For those of you who haven't used a late model rs/6000, I highly recommend them. Those chips are darned fast!

    5. Re:IBM by slyfox · · Score: 1
      Yes, but if you don't artificially exlude [sic] clustered results because you don't like what they say...

      One of the well-known problems with TPC-C is that it uses a hierarchical system where all the data is part of a particular warehouse and only a small percent of transactions need to access any cross-warehouse data. The little inter-warehouse communication required is evenly distributed, so load imbalance is not a problem.

      For this reason, the TPC-C benchmark can run more efficiently on a cluster of servers than many real world OLTP setups. From what I've been told, real clustered DBMS setups suffer from load imbalance and are even more difficult to set up and tune than 'single instance' systems. The majority of OLTP systems in the field don't use clusters for performance.

      That said, one of the best uses of small clusters is to provide fault tolerance and high availability of data, but that is a different setup than the 'clustered' TPC-C results you mentioned.

    6. Re:IBM by StorminNorman · · Score: 2

      this is kinda offtopic and pedantic but...

      IBM haven't ported Lunix (A UNIX implementation for the Commodore 64/128) to the 64-bit PPC platform. They ported Linux. Get It Right.

      Sorry... someone had to do it though :)

      --
      life is a canvas/and the paint is hope and promise/the world is ours/no one can ever take it from us.
    7. Re:IBM by vipw · · Score: 1
      hehe thanks :)

      i was referencing some Jeff K stuff.

      i suppose the moderators are gone by now, here's the link ;)
      USAR FREINDLEY

      and this strip in particular

    8. Re:IBM by Anonymous Coward · · Score: 1

      Unfortunately, there is demand for 64-bit PowerPC processors out there now. IBM will not sell these chips outside of the company as a general rule. It is because these processors were designed by the RS6000 group, and not the microelectronics group. Further, Apple and Motorola have repeatedly turned their backs on a 64-bit implementation, and the one IBM/Motorola 64-bit PowerPC, the 620, was a horrible flop, coming out six weeks before the internally developed 630. IBM developed these processors as a follow-on to the 604 processors, and only for the RS6000 and AS400 lines. This move is more in line with the migration to move RS6000 away from being AIX-centric and towards Linux. Linux is an ideal convergence point for IBM. IBM at this time has 6 OSes they support throughout the company: OS2, Win9X, WinNT, AIX, OS390, and OS400. Merging all their different hardware platforms, Netfinity, RS6000, S390, and AS400, onto one common software platform seems ideal to them, and seems consistent with their other porting efforts reported on slashdot such as the very cool S390 ports. I for one would love to see IBM sell these processors to outside vendors, but until that day comes.....

    9. Re:IBM by dapprman · · Score: 1

      It's called AIX 5L, and is the next (and now late) release of AIX. The core feel is still AIX (I'm sorry I know this will cause arguments, but AIX has about the best LVM I've used, plus its ls.. commands I miss on other unices) but with Linux libraries in addition to the AIX ones, allowing for the easier porting of freeware and shareware.

    10. Re:IBM by Account+Number+Three · · Score: 1

      I wouldn't be surprised if the long term plan was to fold the enterprise functionality of AIX into Linux, have the OS maintained by the open source community with much less IBM manpower than AIX takes, and then put AIX out to pasture.

      Absolutely. IBM will have the service contract on its servers anyway, IBM doesn't make money selling AIX for other peoples' servers, and essentially nobody buys an IBM server for AIX.

      So if IBM can cut server OS development and maintenance costs by 50% by having much of the work done by the Linux community, that increases their profit margins.

      And it also benefits the Linux community, since they'd be developing and maintaining Linux anyway, and this adds IBM and IBM server customers to the people who have an interest in helping develop and maintain Linux.

    11. Re:IBM by vipw · · Score: 2

      IBM can build big-ass proprietary servers and deploy them for customers while still using standard software products. Big deal for IBM since lunix is now a well respected server operating system. Easy to port software to and easy to market.

      So, you can see this as yes IBM is scratching an itch, but at the same time making lunix more available in the high-end enterprise environment.

    12. Re:IBM by Anonymous Coward · · Score: 2

      Without passing value judgement...

      The PowerPC running in 64-bit mode will help them get Linux up-and-running on the eServer iSeries (that platform has been 64-bit for longer than just about any other major server). It allows them to fulfill their goal of getting Linux running across all the eServers.

    13. Re:IBM by swb · · Score: 1

      The guys in the server room would probably be much more excited to get a kick-ass RS/6000 if it meant they could stick Linux on it.

      The guys in the server room who clean the crap off the floor get hot and bothered by operating systems. The guys these people clean up after only care about getting the job done right.

    14. Re:IBM by rgmoore · · Score: 4

      This is probably IBM anticipating. After all, just because there's no demand now doesn't mean that there won't be demand when the system is available. Getting the system ready ahead of demand is smart; it means that when people running PPC want more horsepower, IBM will be able to provide them with a nice smooth path to 64 bit PPC. This looks like it's just a regular part of IBM's Linux strategy. They want to make it available everywhere, so companies can upgrade to more and more powerful systems without having to relearn everything.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    15. Re:IBM by Cef · · Score: 2

      Something that I can see this having a use for, is for boxes that are going to soon (very probably) reach an 'end of life' in the IBM OS camp. True there is Project Monterey, but some of these 64 bit machines could definitely end up lying in the dust. Especially with a whole new OS and designers that may decide that the earlier systems are too 'hard' to support easily.

      I'd much rather see IBM make sure that something decent still runs on older boxes, than having no shipping OS at all that will support such platforms.

      And I'm not saying IBM will drop support for these platforms anytime soon, but it's much easier to get the thing started now, than later, when it could be too little, too late.

      HP's PA-RISC's are a prime example. Linux runs on these, as various numbers of the machines fell into the hands of Linux developers that have experience in kernel code/porting. Unfortunately the machines never really made it, but at least the people out there have something that runs and is probably better supported than what you would otherwise get. It was the thing, just too late in the game. At least IBM aren't making the mistake of leaving people lingering with unusable hardware.

      Also remember that IBM has a considerably large Second Hand group that reconditions trade-in systems and then sells them to more disadvantaged groups or countries - and if they don't have an OS to ship on those machines, what do they do with them?

      Of course, what contributions they get from some of the truly bright sparks in the community with the Linux port, may actually improve the way their own code monkeys write/implement their next OS kernel, which they can only view as a win-win situation.

    16. Re:IBM by sawdey · · Score: 1

      A correction and a comment... RS64-IV (aka IStar) has a 4 stage pipe, not 5. Also notable is that due to the short pipe, there is no branch penalty in most cases. This processor also has HMT (hardware multithreading). It only executes one thread at a time, but can switch threads on cache misses or other long latency events.

  34. The real question by sageFool · · Score: 1

    is when are they going to make swanky 64+bit hardware I can actually afford? :)

    1. Re:The real question by mikefoley · · Score: 1

      Define "afford"?

      --
      What's my Karma Mr. Burns? "Excellent"
  35. Nope. by Guido+del+Confuso · · Score: 1

    I don't see this processor yet listed on the NetBSD page, even on the mind-bending list of not-yet-integrated ports; is this a first? :)

    Well, not exactly. Perhaps some of us remember MkLinux? Apple ported that to the older Nubus based Macs, something nobody else seems to have done with any other OS. So IBM porting Linux to a chip they designed? Whoo hoo. Yay Linux. They could've just as easily ported NetBSD or any other operating system to it, if they had the inclination (which, obviously, they don't given their latest Linux kick). Now, if they'd ported Mac OS X to the PPC64, that'd be something to write home about. =-)

  36. Re:Big Time Linux: Itanium, S/390, PPC64 by bugg · · Score: 2
    I'll bite, only because the parent comment somehow managed to be modded up.

    Linux was the first OS ever to boot on Itanium. (*bsd not there).

    Where are the Itanium computers? This port isn't of much use to nearly everyone.

    Linux was first on PPC64. (*bsd not there).

    Where are the PPC64 computers?

    Linux was first free OS on S/390. (*bsd not there.)

    How many people own an S/390?

    Linux was first on UltraSPARC.

    And where is a semi-usable UltraSPARC distribution?

    Heck, all of these ports require much hand-rolling. And you also mentioned hardware which the vast majority of people here have never even touched or seen- have you?

    Proof of concept ports, and ports that aren't deployed anywhere in the real world: these aren't of much use, regardless of if the port is of a Linux or a BSD.

    --
    -bugg
  37. Re:Big Time Linux: Itanium, S/390, PPC64 by norwoodites · · Score: 1

    Actually AIX was first on PPC64.

    Also most people cannot just buy new hardware, that is why NetBSD is ported to the VAX, but why not LINUX?

  38. Re:OK, dumb 32/64 bit question by CargoCult · · Score: 2

    Datacenter can address 64 GB via Intel's Physical Address Extension hack (basically allows 36-bit addresses via "dual address cycle" hardware, which can pump in two 32-bit addresses in one clock cycle).

    It performs pretty well for a kludge but does require your application to use the MS AWE (Address Windowing Extensions) memory allocation APIs, which have some restrictions, such as only providing page-fixed memory and only allowing you to dealloc in the same unit you alloc'd (so writing dynamic memory handling is not easy).

    The increased address space is cool if your o/s has a good (fast, influenceable) vm manager - you can strip out buffer mgt code from your app (reduces complexity)

    Also great for server apps that do lots of read io as you can buffer even at large concurrent user workloads, so can see Real/Oracle/Akamai type apps benefiting
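
    As a quick check on the address widths involved: 4 GB is what 32 address bits can reach, and 64 GB corresponds to 36 address bits. A sketch of the arithmetic (the helper name `address_bits_needed` is made up for illustration):

```python
def address_bits_needed(num_bytes: int) -> int:
    """Smallest number of address bits that can name every byte in `num_bytes`."""
    return (num_bytes - 1).bit_length()


GIB = 1024 ** 3

bits_4g = address_bits_needed(4 * GIB)
bits_64g = address_bits_needed(64 * GIB)
print(bits_4g, bits_64g)  # 32 36
```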

    --
    **Vanuatu or bust**
  39. Re:Big Time Linux: Itanium, S/390, PPC64 by barneyfoo · · Score: 4

    Where are the Itanium computers? This port isn't of much use to nearly everyone.

    Itanium represents the first commodity 64bit enterprise computing platform. A major advance if you ask me (regardless of performance), and linux will be there first, along with SCO, and win2k bringing up the rear.

    Where are the PPC64 computers?

    Ever hear of Power3 and Power4, and AIX? 'nuff said.

    How many people own an S/390?

    I think the count of people that use S/390 is far less important than the importance of those people. S/390 has no peer in its class as a mainframe. Sun's starfire comes close.

    And where is a semi-usable UltraSPARC distribution?

    Debian has a semi-usable distribution for Ultra Sparc. I believe they have Xfree working, among other things, along with the trivial ports that just require a linux kernel

    Proof of concept ports... these aren't of much use...

    Needless to say, I disagree.

  40. No SMP yet though by Cassivs · · Score: 1

    The little thread this started noted discrepancies in the number of CPUs reported in the bootlog (4, 8, and 1). There are 4 CPUs in the machine, it supports 8, and the native 64bit Linux port supports 1. But the 32bit Linux port (emulates 32bit on the Power3) supports SMP. I'd be interested to see a performance comparison between the 64bit native and 32bit emulation kernels. :)
    Of course, I assume SMP will be arriving sometime shortly.

    --
    -skip
  41. Re:Big Time Linux: Itanium, S/390, PPC64 by Lx · · Score: 1

    I said early 90s, not 1991. You're telling me that Linux was stable and usable in 91? I never ran into *anyone* running a server doing serious work on Linux before 95.

    -lx

  42. Re:Big Time Linux: Itanium, S/390, PPC64 by Lx · · Score: 1

    All I can do here is give my personal perspective, and point out that this is all based on a very small part of the original post I wrote.
    I'm a long time BSD user, who's also tried the top 10 Linux distros at various points. The only ones I even came close to liking were Slackware and SuSE. SuSE is very commercialized, and it's hard to get ISOs for, so I don't mess with it much.

    But I do know that the various times I've played with Linux over the years, it's proven to be quite a bit less stable (first time I installed redhat, it locked up after the first reboot, redhat, caldera, debian and corel all failed to install and boot on a rather flaky p166, whereas FreeBSD did flawlessly), far behind in terms of package management, i.e., rpm (I know about apt, and I think it's cool, but that's one Linux distro, and one I don't dig on much), and fragmented (even though this is what people say about the BSDs, the different distros are very dissimilar, and are quite large in number).

    These are the things that have made me stick with BSD over the years. I get my OS from a central location, worked on by people in an actual team environment with democracy and accountability, released under a license which is truly free, is easy to get ahold of and install (many linuces didn't have install over ftp for years, some still don't), they have great package management, great performance, great support, and great stability. I'm not saying there's no place for Linux, but given the reasons that I've just mentioned, why would I want to use it?

    Anyhow, that's all just my opinion, and you expressed yours. What I *do* take issue with is the "never caught up" bit. In what way is any given BSD distribution not equal or superior to Linux?

    -lx

  43. Re: That's why you need a damn good compiler by civilizedINTENSITY · · Score: 1

    SGI's, on the other hand, was redesigned from the ground up (starting with the gcc parser for compatibility) to use all of the neat, theoretical tricks that you need to get ILP in this situation. TurboLinux has already gone with it and demonstrated good results (that's one reason why NCSA will be using Turbo for the second stage of their huge, new cluster).

    This sounds a bit like the apache patches SGI did for the Accelerated Apache project. The patches were submitted but aren't (apparently) going into apache.

    My concern is that as more and more code is given to us, will we refuse it because we didn't write it? And would that be a bad thing?

  44. Re:Power3 or PowerPC? by RottenApple · · Score: 1

    Well, folks...
    I know that PowerPC is subset architecture of the Power. ( not Power2, Power3, Power4 )
    I'm not sure if current PowerPC processors have binary compatibility with current Power processors. Although the beginning of PowerPC is similar to the Power, I don't think PowerPC architecture is identical to Power architecture.

    You can't say that Z80 is x86. ( Z80 is said to have binary compatibility with x86. )

    Does AIX for RS/6000 with PowerPC processors run on RS/6000 with Power2, Power3 without modification? I'm not sure of it.
    Is there any person who tried it?

  45. Re:OK, dumb 32/64 bit question by ivan256 · · Score: 1

    Right, but who said these chips aren't targeted at servers? That's what I'll be using them for :)

  46. Re:Linux is dying by civilizedINTENSITY · · Score: 1

    Except for the server market where linux is the only platform to grow faster (24%) than MS (20%). As long as we can "falter" our way to the front of the pack, then let's "falter".

  47. Re:IBM Gaining Marketability in Mainframe Industry by searleb · · Score: 1

    Compaq and HP, my bad.

  48. Wonderful by wolfman3000 · · Score: 1

    Isn't the scalability and adaptability of linux wonderful! :) Way to go IBM

    --
    "Never let your sense of morals prevent you from doing what is right."
  49. Re:Big Time Linux: Itanium, S/390, PPC64 by Klaen · · Score: 1

    1) Q: Where are the Itanium computers?
    A: Still vapour.

    2) Q: Where are the PPC64 computers?
    A: Any late-model RS/6000. I have quite a few at work; don't you? *grin*

    3) Q: How many people own an S/390?
    A: A great many large corporations. Duh. What could anyone do with one in their living room! For that matter, how many people owned VAXen when they were new? Again, corporate hardware.

    4) Q: And where is a semi-usable UltraSPARC distribution?
    A: Try SuSE; you'll be glad you did.

  50. Re:Big Time Linux: Itanium, S/390, PPC64 by gimpboy · · Score: 1

    Debian has a semi-usable distribution for Ultra Sparc. I believe they have Xfree working, among other things, along with the trivial ports that just require a linux kernel

    dont forget mandrake. they also have a version for sparc: ftp://fr.rpmfind.net/linux/Mandrake-iso/sparc/

    although it is beta. i must confess that i have never used it, but i would think it's at least semi-useable.

    use LaTeX? want an online reference manager that

    --
    -- john
  51. Re:OK, dumb 32/64 bit question by Trepalium · · Score: 5

    eh? The 'bittiness' of the CPU rarely has anything to do with floating point capabilities. The Intel x86 line all have the ability to use 80-bit floating point numbers (10 bytes). In fact, it was because of this the [in]famous FPU memory move was created for the Pentium processors -- it was faster to move memory into the FPU registers and then out back to memory than it was to use the usual movsd instructions to do the same, because via the FPU you moved 8 bytes (64 bits) at a time, whereas with movsd, you were only moving 4 bytes at a time. On the Pentium Pro and Pentium II, they finally fixed this by the use of write combining so that movsd'ing a block of memory was as fast or faster than doing it via the FPU. The number of bits generally refers to one of two features of the CPU -- either its bus, or the size of the general purpose registers and address space. The Intel Pentium for example, had a 64-bit bus, but still only 32-bit registers and memory space. The Intel 80386SX had a 16-bit bus, and 32-bit registers.

    --
    I used up all my sick days, so I'm calling in dead.
  52. Power3/4 and OS/400 by wokie-bug · · Score: 1

    Power3/4 compatibility would be very nice for me. I can run Linux and OS/400 at the same time on the same machine. This means I do not have the need to buy other hardware, like pc-servers to provide a total solution from databases and mail with OS/400 to file, print and webserving with Linux. The hardware based on the powerseries chip is scalable to a maximum of 24 processors, so one machine does it all!

  53. Re:its pretty bad when the editors troll... by Tony-A · · Score: 1

    Ok, I'll bite.
    First, /. is biased, biased in favor of Linux, biased in favor of Open Source (whatever that really means).
    Second, /. likes to stir up controversy, with the result that the commentary is usually much more interesting and informative than the linked articles.
    Third, for anyone coming from outside, the flame wars help identify the major players, and occasionally even some useful information.

    There seems to be some kind of natural progression from Windoze to Linux to BSD, with a number of people running a mix. My idea of "World Domination" is half the desktops running OpenBSD, but then I've got a warped sense of humor ;-)

  54. Execution units rapidly reach diminishing returns. by Christopher+Thomas · · Score: 5

    "Unlike a typical PC microprocessor, the chip features eight execution units fed by a 6.4 gigabyte-per-second memory subsystem, allowing the POWER3 to outperform competitors' processors running at two to three times the clock speed"

    Eight execution units! I recall that the x86 line have half of that. And 6.4GB/s memory is not to be laughed at either!


    Memory bandwidth is a good thing. Low latency cache hits are a great thing, if you can get them (no idea if PPC does this or not).

    However, adding more execution units won't buy you much beyond a fairly small number. The reason: you just don't have that much extractable parallelism in the serial instruction stream.

    I had the good fortune to be playing with this recently via simulation. If you give the processor a *huge* instruction window (256 instructions) and the ability to execute *any* number of instructions of *any* type in parallel (except for memory accesses - see below), you still get an average Instructions Per Clock of about 2.1-2.2. 95% of the time, you're getting four instructions or fewer issued (and most of the time, far fewer than that).
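
    A rough sketch of why the number stays so low, under loudly stated assumptions: every instruction takes one cycle, there are unlimited execution units, and the instruction window is not modeled at all. The `dataflow_ipc` function and the six-instruction `stream` are hypothetical and are not the simulator described above. With those assumptions, IPC is simply the instruction count divided by the length of the longest dependency chain.

```python
def dataflow_ipc(deps):
    """Return instructions-per-clock at the pure dataflow limit.

    deps[i] lists the indices of earlier instructions that instruction i
    must wait for. Assumes unit latency and unlimited execution units,
    so each instruction issues one cycle after its latest producer.
    """
    cycle = []
    for producers in deps:
        # An instruction with no producers can issue in cycle 1.
        ready = max((cycle[p] for p in producers), default=0)
        cycle.append(ready + 1)
    return len(deps) / max(cycle)


# Hypothetical stream: chain 0 -> 1 -> 2, chain 3 -> 4, then 5 joins both.
stream = [[], [0], [1], [], [3], [2, 4]]
print(dataflow_ipc(stream))  # 1.5
```

    Even with unlimited hardware, the six instructions need four cycles because the longest chain is four deep. Extra execution units only help when the dependency graph is wide rather than deep.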

    When SMT is put in silicon, wider issue will become practical (due to increased parallelism in the instruction stream), but as it is, you're better off spending the silicon on other improvements.

    Re. memory accesses; the reason why it's extremely difficult to do memory accesses out-of-order with each other is that you have to check to see if any given two memory accesses refer to the same location (indicating a dependence). You often don't know what the target address is until late in the pipeline, and you'll still need to do a TLB translation to get the physical address, and compare two large bit vectors (the addresses).

    Remember, to be useful for scheduling, you have to be able to do all of this very quickly and very early in the pipeline.

    All of this makes out-of-order memory accesses very difficult to implement theoretically, and a nightmare to implement in real silicon. It's still sometimes done in a limited manner, but this doesn't affect the IPC very much.
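
    The dependence check itself is easy to state in software; the hard part is doing it early enough in hardware. A minimal sketch, assuming each access is a hypothetical `(kind, addr, size)` tuple and that `addr` is `None` while the address is still unresolved:

```python
def may_alias(addr_a, size_a, addr_b, size_b):
    """True if the byte ranges [addr, addr + size) overlap."""
    return addr_a < addr_b + size_b and addr_b < addr_a + size_a


def can_reorder(first, second):
    """Decide whether two memory accesses may be swapped.

    Each access is (kind, addr, size) with kind 'load' or 'store'.
    addr is None when the address has not been computed yet.
    Ignores multiprocessor memory-ordering rules (an assumption).
    """
    kind_a, addr_a, size_a = first
    kind_b, addr_b, size_b = second
    if kind_a == "load" and kind_b == "load":
        return True  # two loads cannot change each other's result
    if addr_a is None or addr_b is None:
        return False  # unknown address: must assume a dependence
    return not may_alias(addr_a, size_a, addr_b, size_b)


print(can_reorder(("store", 0x1000, 8), ("load", 0x1004, 4)))  # False
print(can_reorder(("store", 0x1000, 8), ("load", 0x1008, 4)))  # True
```

    The `None` case is the one that hurts: until the address is known, the scheduler has to assume the worst.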

  55. IBM using Macs? by AFCArchvile · · Score: 1

    Oh, the awful irony! Seeing cubes on desks where bulky PC AT computers once sat!

    --
    "Ancillary does not mean you get to rule the world." --U.S. Circuit Judge Harry Edwards, speaking to the FCC's lawyer
    1. Re:IBM using Macs? by joedumb · · Score: 1

      hate to say it, but you are wrong, my friend. IBM was one of the partners in developing the PowerPC chip, which was a combo of 88k technology by Motorola and the old POWER (not power3) technology by IBM

  56. Re:List of CPU architectures supported by Linux? by catpyss · · Score: 1

    "The difference is that NetBSD is an a complete Operating System, not just a kernel."

    Wow, you're right. All this time I have been interfacing my kernel by hand. You OS-bigots were right all along! What I need is an "operating system". I feel so bad that I have wasted all this time using a kernel when I could have been using a "cool" operating system. Sign me up!

  57. Re:Big Time Linux: Itanium, S/390, PPC64 by Lx · · Score: 2

    Right. Let's take a look at this.

    Itanium: Linux has an Itanium emulator written specifically for it, by Intel, I believe. That makes it kind of easier. Besides that, BSD does boot on the Itanium, even though they were severely impeded by lack of tools.

    PPC64: It was ported by a corporation, fuckwit. A corporation with more resources than a non-profit organization could ever put towards porting to a platform, porting Linux to run on their own hardware, whereas NetBSD is an independent effort. They can't just run out and get a PPC64 box for themselves.

    S/390: Same story.

    UltraSPARC: Both run on UltraSparc, but I don't know dates of when they first booted, or the extent of Linux/Sparc support. This might actually be...a *relevant point*!

    And then you call this stuff "mainstream, state of the art hardware". For all but the UltraSPARC, it's impossible for a normal person to even lay hands on one of those machines. Even in the case of corporations, how many do you know that are running Linux on IBM boxes instead of AIX? Why the hell would anyone want to, seeing as how AIX generally outperforms it anyhow?

    In any event, how about high-end hardware that people can actually buy? NetBSD was the first to be running on the Alpha, for instance, a high performance platform that actually matters. First on SGI boxes. How about i386, the architecture everyone uses? In the early 90s, NetBSD was far more complete and usable than Linux, and to this day has very complete hardware support for the platform. One could also point out that Linux has been lagging behind on new technologies, like IPv6. Might want to take that into account when you're tallying up the final "Score".

  58. Re:Linux VAX at Sourceforge by norwoodites · · Score: 1

    read the second line down, or so, they say just to go to netbsd anyway.

  59. Re:Power3 or PowerPC? by dapprman · · Score: 1

    Yes. I've used the same install media for a couple of winterhawk nodes (POWER2), a couple of 44P270s (POWER3) and an elderly C20 (Control workstation PPC), with no problems at all.

  60. Re:RS/6000 by dapprman · · Score: 1

    Good, I'm not the only AIX fan on Slashdot. As I wrote in a reply above, AIX 5L (the next version of AIX) will have Linux libraries in it as well as the AIX ones, allowing for easier (make that far easier) porting of freeware and shareware. I do not believe that IBM plans to port PSSP to Linux, which means the nodes in an SP frame (plus the control workstation and any attached S series RS/6000s) will have to use AIX, and as the SP frame and S80/S85 are the IBM flagship products I can see AIX being around for a fair bit longer.

  61. Re:List of CPU architectures supported by Linux? by Cassivs · · Score: 2
    There's a list of most of the currently supported architectures available here, mentioning the architectures actually in the kernel tree, and some that aren't.
    Of course, this is not all of them, S/390 is even missing.

    And uLinux runs on architectures like the DragonBall, and other things too. I don't know of a complete list anywhere.

    --
    -skip
  62. IBM Gaining Marketability in Mainframe Industry by searleb · · Score: 1

    IBM makes mainframes. They are third in the industry behind Sun and Compaq (DEC), respectively. I think they are aiming the 64-bit PPC Linux port to give them a boost in the market. Native linux has scalability problems but IBM has been putting an effort into developing large scale Linux solutions in recent projects. Combined with a recent deal with Redhat, they seem to be surging forwards into the linux mainframe market.

  63. Re:List of CPU architectures supported by Linux? by Cassivs · · Score: 2

    There's another list here, with some other ports mentioned, that a quick google search turned up.

    --
    -skip
  64. Re:Linux is dying by ViVeLaMe · · Score: 1
    OK i got it :-)

    i'm sure it's the *same* anonymous coward who posted the trail of trolls about BSD is dying in every /. news about *BSD :-)

    guess he must be working for microsoft, trying to get the various *NIX users to fight each other.. :o))

    --
    i had a sig, once..
  65. Re:Execution units rapidly reach diminishing retur by Jah-Wren+Ryel · · Score: 2

    IBM's current p680 box (up to 24 Power3 IV cpus) does implement a kind of multi-threading already; it is too coarse to be called SMT, but it is multi-threading. As it is now, each processor 'presents' itself as two cpus to the OS. They say that it took less than 5% of the chip real-estate to support this multi-threading. If you look at their benchmark results on Spec and TPC, it seems to have paid off quite well.

    --
    When information is power, privacy is freedom.
  66. Re:Execution units rapidly reach diminishing retur by valdis · · Score: 2
    I had the good fortune to be playing with this recently via simulation. If you give the processor a *huge* instruction window (256 instructions) and the ability to execute *any* number of instructions of *any* type in parallel (except for memory accesses - see below), you still get an average Instructions Per Clock of about 2.1-2.2. 95% of the time, you're getting four instructions or fewer issued (and most of the time, far fewer than that).

    Yes. You average 2.1-2.2, and 95% of the time you're only getting 4 or fewer. However, when you look at the other stuff the Power3 architecture includes, it's pretty obvious what the overall intent is:

    Hardware Loop Unrolling.

    IBM has got some customers that use some serious CPU. We're talking national labs and the like. For them, the ability to run 8 of those neat 'multiply-and-add' instructions per clock cycle is quite an important feature.

    The chip *starts* at 375MHz, and can do 16 floating point ops/clock (an amazing amount of code uses that mult-and-add over an array - and the IBM compilers are smart enough to detect and convert divide to multiply-by-inverse and add/subtract issues).

    And of course, IBM is hoping that even though the big SP/2 iron is limited to national labs and Fortune-500 companies (see The Top500 List for details), that they'll be able to sell a lot of the smaller 43P deskside boxes (1-4 Power3 CPUS) and the 8-16 CPU rackmount servers, to all the smaller companies that need number-crunching.

  67. How is this news? by Knobby · · Score: 1

    http://slashdot.org/articles/00/06/11/145227.shtml ... LinuxPPC boots on a Power4 -- June 11, 2000..

  68. Re:OK, dumb 32/64 bit question by Guy+Harris · · Score: 2
    Well, a 64-bit integer solves the Y2038 bug inherent in Unix.

    ...and a 64-bit integer can be manipulated on a 32-bit machine - and even fairly conveniently, if the compiler cooperates, and GCC does cooperate here (think long long int).
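
    Roughly what the compiler does for you behind a 64-bit integer type on a 32-bit machine: keep two 32-bit words and propagate the carry by hand. The sketch below uses Python only for illustration; `add64` and `MASK32` are hypothetical names, and the last lines just compute when a signed 32-bit seconds counter starting at the Unix epoch runs out (January 2038).

```python
from datetime import datetime, timedelta, timezone

MASK32 = 0xFFFFFFFF


def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values held as (high word, low word) pairs.

    Mimics a 32-bit machine: add the low words, then add the high
    words plus the carry out of the low addition.
    """
    lo = a_lo + b_lo
    carry = lo >> 32  # 1 if the low-word addition overflowed 32 bits
    hi = (a_hi + b_hi + carry) & MASK32
    return hi, lo & MASK32


print(add64(0, 0xFFFFFFFF, 0, 1))  # (1, 0)

# Last second representable in a signed 32-bit counter from the epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_32bit_second = epoch + timedelta(seconds=2**31 - 1)
print(last_32bit_second.isoformat())  # 2038-01-19T03:14:07+00:00
```

    So the wider integer is a software question, not a hardware one; a 64-bit CPU just does the same addition in one instruction instead of two.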

  69. List of CPU architectures supported by Linux? by r.+ghaffari · · Score: 3
    I was arguing with a friend a couple of days ago on the merits of BSD vs. Linux, and while he rattled off the list of CPU architectures that NetBSD supports (obviously not off the top of his head), I was unable to find a central listing of CPU's supported by Linux.

    My question is, is there such a page updated with such info? I don't believe that Linus Torvalds maintains all the different architecture branches..

    Thanks!

    r. ghaffari
    (25/M/Baltimore, MD)

    1. Re:List of CPU architectures supported by Linux? by Amon+Re · · Score: 2

      you can check http://www.kernel.org/ for a list and also you can look into the latest kernel source, check the directory /usr/src/linux/arch.

  70. OK, dumb 32/64 bit question by popular · · Score: 1
    Offhand, I'll guess that one of the biggest advantages is the ability to address more memory. I'm sure someone out there can find a reason to address more than the 8 or 16 GB that W2K Datacenter can, with some hacks they developed with Intel (flashback to LIM memory?).

    I don't see myself using that much RAM anytime soon, but the industry is moving toward it, so what else would 64 bit do for me?

    1. Re:OK, dumb 32/64 bit question by ivan256 · · Score: 1

      The way that >4GB of memory is implemented on IA-32 is basically a hack, and if you have >4GB of memory there is a performance hit.

      Either way, there are many advantages to having a 64 bit virtual address space even if you don't have enough RAM to use up the larger physical address space.
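
      For scale, the arithmetic behind those limits can be sketched as below. The 36-bit figure assumes the hack in question is IA-32's 36-bit physical address extension, which the thread does not name explicitly; `addressable_gib` is a hypothetical helper.

```python
GIB = 2**30  # bytes in one gibibyte


def addressable_gib(bits):
    """Bytes reachable with a flat address of the given width, in GiB."""
    return 2**bits // GIB


for bits in (32, 36, 64):
    print(bits, addressable_gib(bits))
# 32 4
# 36 64
# 64 17179869184
```

      A 32-bit virtual address still caps any single process at 4 GiB no matter how much physical memory the box has, which is where the 64-bit virtual address space earns its keep.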

    2. Re:OK, dumb 32/64 bit question by mrdisco99 · · Score: 2
      Well, a 64-bit integer solves the Y2038 bug inherent in Unix.

      +++

      --

      +++
      NO CARRIER

  71. Power3 (and Power4) have some really cool features. by zensonic · · Score: 5

    As in almost any area where there's money to be earned, Big Blue is in there with some really cool hardware.

    Taken from:

    http://www.rs6000.ibm.com/resource/pressreleases/1998/Oct/power3.html:

    "Unlike a typical PC microprocessor, the chip features eight execution units fed by a 6.4 gigabyte-per-second memory subsystem, allowing the POWER3 to outperform competitors' processors running at two to three times the clock speed"

    Eight execution units! I recall that the x86 line has half of that. And 6.4GB/s memory is not to be laughed at either!

    --
    Thomas S. Iversen
  72. The Alpha EV8 will have SMT by renoX · · Score: 1

    Details on the subject are scarce, but it WILL have SMT.

    Intel's Jackson is rumoured to have SMT too.

    IMHO SMT will be the next big thing: not too complicated to implement, and in some cases a big boost.

  73. really useful ports by influensa · · Score: 2
    Linux and/or NetBSD being ported to yet another chip is nice, but not really big time news.

    These are platforms I'd like to see some porting being done for:

    Other people's brains (remote access, perhaps X-10 integration)

    Garage door opener (so I can apply sound themes to it, and replace the "so 1991" screeching)

    Alarm clock (see garage door opener)

    Pets (obedience school in C, Perl, or any language you want).

    --


    Jeremy McNaughton

    ------ Live simply so that others may simply live.

  74. Not new, but still valuable by roguerez · · Score: 1

    While this is an interesting accomplishment, the fact that the port is to a 64-bit processor isn't something new. Linux has been running on 64-bit Alpha processors for a while now.

    It does underline IBM's commitment to Linux (again), though, which is ultimately good for the entire open source community, since it gains from the improved visibility to the public.

  75. Re: That's why you need a damn good compiler by Nexx · · Score: 2

    My concern is that as more and more code is given to us, will we refuse it because we didn't write it? And would that be a bad thing?

    In a word, yes. If the "Not Invented Here" syndrome becomes rampant, then large corporations will have less incentive to build improvements upon the system, and therefore, will start again with either a fork or a potentially closed-source proprietary system. Either way, support for the original open-source system will wither away, reducing the potential for corporate uptake.

    However, this is just a generalisation; I cannot comment specifically upon Apache itself.

  76. Re:Big Time Linux: Itanium, S/390, PPC64 by chrysrobyn · · Score: 1

    I think the count of people that use S/390 is far less important than the importance of those people. S/390 has no peer in its class as a mainframe. Sun's Starfire comes close. Sun's Starfire, aka Enterprise 10000, comes as close to competing with the S/390 as a Cessna does to an SR-71. They both fly, burn stuff to get from point a to point b and get their jobs done, but they cater to different people entirely.

    Please allow me to consult Google and present you with this 13 September, 1999 InformationWeek story:

    "The RS/6000 S80 symmetric multiprocessing server will be available next week with between six and 24 450-MHz PowerPC RS64 III chips based on copper technology for enhanced performance. IBM says a 24-way S80 outperforms Sun's 64-way Enterprise 10000 server. Though Independent Transaction Processing Council benchmark results aren't final, industry analyst Brad Day of Giga Information Group confirms IBM's claim. Pricing for the S80 starts at $290,000 for a six-way server. The vendor says high-end versions of the server will cost 50% less than Sun's high-end versions of the Enterprise 10000."

    Since then, the RS6k has continued to grow in speed. The S/390, on the other hand, does not specialize in massively parallel jobs like that. The S/390 is awesome at high volume data processing. I'm not sure who you insulted more: the S80 for forgetting about it or the S/390 for not knowing what it did.

  77. Re:Big Time Linux: Itanium, S/390, PPC64 by mrdisco99 · · Score: 1
    Where are the Itanium computers? This port isn't of much use to nearly everyone.

    They're not here... yet...

    Where are the PPC64 computers?

    Lessee... IBM's RS/6000 170 and 270 workstations, p640 rack-mount server, and Winterhawk and Nighthawk SP nodes.

    Granted, Linux probably isn't much use on SP nodes, but would be very appropriate on the other systems I mentioned. The p640 is a powerful 4-way machine capable of powering a medium sized web site or mail server.

    How many people own an S/390?

    This is totally irrelevant. These platforms aren't meant to be personally owned. However, many large corporations own S/390s and Linux is very handy on a partitioned S/390. Just imagine thousands of independent web servers on a single machine.

    Heck, all of these ports require much hand-rolling. And you also mentioned hardware which the vast majority of people here have never even touched or seen- have you?

    Again, whether or not people have touched or seen these platforms doesn't make them irrelevant. In fact, these platforms have more significance because they are the way toward corporate acceptance of Linux. NetBSD was recently ported to Dreamcast, which many people have touched and even own... but who cares?? The computer industry is a lot more than personal systems.

    And, by the way, I see and touch these systems every day... at work. Maybe your opinion would change if you actually worked with computers.

    Proof of concept ports, and ports that aren't deployed anywhere in the real world: these aren't of much use, regardless of if the port is of a Linux or a BSD.

    I agree... But, I don't think that applies to these ports.

    +++

    --

    +++
    NO CARRIER

  78. Re:its pretty bad when the editors troll... by catpyss · · Score: 2

    I completely understand your point and agree, but if I may offer an example or two of Slashdot reporting:

    http://slashdot.org/article.pl?sid=01/02/21/1450234&mode=thread
    "it looks like NetBSD could give Linux a run for its money in the handheld arena."

    http://slashdot.org/bsd/01/02/05/1859221.shtml
    " 'Linux 2.4.0 is available for no money. So is FreeBSD. Linux uses advanced hardware, so does FreeBSD. FreeBSD is more stable and faster than Linux, in my opinion. "

    Basically the precedent is that it is acceptable to be inflammatory as long as you aren't Linux. A majority of the articles comparing BSD and Linux do so on a well-known point, stress under high loads. Notice that Slashdot does not post articles comparing native application support, user-base, or multi-processor support. Posting such articles or comments will likely be considered inflammatory.

    In posting this, in no way am I trying to start a flamewar. However, I do feel Slashdot holds a double standard in how it treats BSD remarks, especially on the front page. Being immature and biased is useless, regardless of OS choice. Thoughts?

  79. Political considerations by Stephen+Samuel · · Score: 2
    IBM doesn't want to see Linux ported to the Itanium, but not the PPC, in 64 bit mode. If that were to happen, it would pretty much kill support for the chip outside of things like the AS/400.

    Far better for them to put the work into ensuring a stable port to their new chip. Now all they need to do is to wait for Intel to put out a sickly version of the Itanium (like they did with the first release of the P4).
    --
    Free Software: Like love, it grows best when given away.
  80. check your facts by RelliK · · Score: 1
    IBM makes mainframes. They are third in the industry behind Sun and Compaq (DEC), respectively.

    Since when has Sun started selling mainframes?
    ___

    --
    ___
    If you think big enough, you'll never have to do it.
  81. Power3 or PowerPC? by RottenApple · · Score: 1

    The log says that it's PowerPC, but on the bottom line of it, it says:
    cpu : POWER3 (630+)
    clock : 375MHz
    revision : 1.4
    bogomips : 748.75
    zero pages : total: 0 (0Kb) current: 0 (0Kb) hits: 0/0 (0%)
    machine : CHRP IBM,7044-270

    POWER3? I don't think Power3 is a PowerPC processor architecture.
    So, the log is confusing.

  82. Re:Why would you? by popular · · Score: 1
    http://www.netbsd.org/Ports/sparc64/
    (This link is to be butchered by SpaceDot, the Slash daemon of link mangling and patron saint of goat sex)

    --

  83. Re:Linus sucks dicks for money by rdean400 · · Score: 1

    You seem to have a real obsession with fellatio, judging by your posts in this thread. Perhaps you're having fantasies about Bill Gates?

  84. Re: That's why you need a damn good compiler by codealot · · Score: 1

    Doing this kind of parallelism extraction in the compiler just plain makes more sense than doing it on the chip (with SMT).

    He was talking about the compiler. The fact is, with the vast majority of code, achieving a high level of ILP with compiler technology is an extremely difficult problem with no easy solutions. GCC tries very hard to achieve ILP on RISC architectures too. The standard tricks are well-known, but not nearly good enough to keep 8 integer units busy. (In some cases it may be provably impossible.)

    Even if IA-64 solves the ILP problem, there is still the memory latency barrier. For those reasons many (myself included) are not so optimistic about IA-64 performance in the long haul.

    And it's also why RedHat is screwing over their users by sticking with gcc over SGI's (GPL'ed) IA-64 compiler

    I have no doubt that SGI's compiler is better suited to IA-64 today, since the backend was written exclusively for that chip. GCC's goal of compiling for every hardware is still worthwhile.

    Btw GCC is not "Cygnus' baby", it belongs to the FSF. And the FSF has specific rules concerning contributed code that have nothing to do with Red Hat.

  85. Re:IBM -it's for tivo by voxman · · Score: 1

    ibm makes powerpc chip for linux appliances.

  86. One size doesn't fit all. by crovira · · Score: 2

    Where do you get your figures (everybody does not know that, yadda, yadda...)

    Don't fall into the arrogant assumption of thinking one architecture is enough for everything. You wouldn't want just VGA graphics now would you?

    The 370 architecture is alive and quite well, thank you, and processing payroll, accounting and other mundane crap that you can't live without.

    But you wouldn't want to have to write a game for a 3279 terminal now would you. No more than you'd want to bank with somebody who'd balance your accounts on a PS2.

    The PPC architecture is alive and well and the G4 is very useful for some types of processing and totally useless for other things but what it does, it does damn fast.

    The x86 is as much of a dead-end as the z80. It will be utterly swamped by the requirements of voice processing and image recognition that a wired economy needs. Forget passwords. Just say your name and smile for the cam. (And that's only the first app. The one at the gate, so to speak. )

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  87. RS/6000 by Bender+Unit+22 · · Score: 1

    If I remember correctly, about a year ago, the smallest RS/6000 (almost a pizza box) was the only one they supported with Linux (Yellow Dog). When I asked if it would work on one of the bigger boxes, they just said, "It might, we just don't support it". I never got around to testing it, since I liked AIX better than Linux (sorry, but I think AIX really was better on that platform).
    Anyway, I don't think this is a plan to kill AIX and replace it with Linux. There is still Project Monterey, which is a 64-bit-only operating system, is Linux compatible, and will be available on Intel platforms (when available). Well, go there and read for yourself.
    btw. if you are interested in seeing the peace-love-linux, thing you can see it here: IBM'n'Linux
    --------

  88. Re: That's why you need a damn good compiler by JohnZed · · Score: 2

    Doing this kind of parallelism extraction in the compiler just plain makes more sense than doing it on the chip (with SMT). The compiler can see all of the source code at once and spend a huge amount of time studying the problem (which can be extremely complicated, if you want a really good, inter-procedural, flow-sensitive analysis) and then spit out code that bundles it up in explicit parallelism.
    That's exactly why IA-64 is going to kick the crap out of other architectures 3 years down the line (once the compilers actually get good). And it's also why RedHat is screwing over their users by sticking with gcc over SGI's (GPL'ed) IA-64 compiler. As the first author said, conventional compilers only get 2.X IPC at best. So two-thirds of the Itanium execution units are wasted when you use a compiler like gcc. SGI's, on the other hand, was redesigned from the ground up (starting with the gcc parser for compatibility) to use all of the neat, theoretical tricks that you need to get ILP in this situation. TurboLinux has already gone with it and demonstrated good results (that's one reason why NCSA will be using Turbo for the second stage of their huge, new cluster). But gcc is Cygnus' baby, and they will fight to keep using it, no matter how badly it hurts performance in the end.
    --JRZ

  89. Re: That's why you need a damn good compiler by codealot · · Score: 1

    The fork has merged. EGCS is now GCC.

  90. Now give one to Linus! by Midnight+Thunder · · Score: 1

    If the core Linux kernel development team had their hands on one of these machines, it might allow them to make any other modifications to the core Linux kernel needed to get it ready for when other processors come in 64-bit varieties.

    --
    Jumpstart the tartan drive.
  91. Re:Execution units rapidly reach diminishing retur by Chris+Colohan · · Score: 1

    I had the good fortune to be playing with this recently via simulation. If you give the processor a *huge* instruction window (256 instructions) and the ability to execute *any* number of instructions of *any* type in parallel (except for memory accesses - see below), you still get an average Instructions Per Clock of about 2.1-2.2. 95% of the time, you're getting four instructions or fewer issued (and most of the time, far fewer than that).

    What set of benchmarks were you running when you collected those numbers? I assume you were using a benchmark suite such as SPEC. This is only one measure, useful to a certain audience. IBM is more than happy to create special hardware to accelerate a single application -- DB/2. If they can show that building this hardware will run their customer's codes faster, they can (and will) ignore SPEC performance numbers.

    I use SPEC in my own research because it is a well packaged, easy to use and simulate set of applications, and it includes source code. It also lets me make direct comparisons to the research of others. But my results are not universally applicable -- you still can't beat evaluating hardware directly on your customer's code. You can bet I would be using a DB/2 (or other commercial database) to evaluate compiler and architectural tradeoffs if:

    • There was a simple benchmark
    • It had source code
    • It was small enough to run under a CPU simulator (ie, no more than 1 cpu second on our current infrastructure), and
    • It was free from restrictions on what results I could publish.

    Unfortunately, no such benchmark exists. So we are left simulating what we have, such as the vortex benchmark from SPEC.

  92. Actually, SMT is much better. by Christopher+Thomas · · Score: 2

    Doing this kind of parallelism extraction in the compiler just plain makes more sense than doing it on the chip (with SMT). The compiler can see all of the source code at once and spend a huge amount of time studying the problem (which can be extremely complicated, if you want a really good, inter-procedural, flow-sensitive analysis) and then spit out code that bundles it up in explicit parallelism.

    You seem to have an incomplete picture of what SMT is.

    SMT - Simultaneous Multi-Threading - is simply the ability to have multiple threads running on a chip at the same time, with separate fetch units and register files but with the instruction window and the functional units still shared.

    The threads don't even have to be from the same program, or in the same address space (though it'll reduce TLB and cache load if they are).

    No extra effort is needed on the part of the programmer, and you get N times as much instruction level parallelism with N threads as you would for one thread. In one instruction stream, you'll always have dependencies that can't be avoided - true dependencies. Parallel threads don't have any shared dependencies for register operations, and are much less likely to have dependencies for memory operations (under most conditions).
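
    A toy model of that scaling, under stated assumptions: unit latency, no memory dependencies between threads, and a fixed shared issue width. `shared_core_ipc` is a hypothetical function, not a description of any real chip, and it gives the earlier threads priority when the issue slots run out.

```python
def shared_core_ipc(threads, width):
    """IPC of several threads sharing one core's issue slots.

    threads is a list of per-thread dependency lists, where
    deps[i] names the earlier instructions (same thread) that
    instruction i waits for. At most `width` instructions issue
    per cycle across all threads; each takes one cycle.
    """
    finish = [[None] * len(deps) for deps in threads]
    total = sum(len(deps) for deps in threads)
    remaining = total
    cycle = 0
    while remaining:
        cycle += 1
        issued = 0
        for t, deps in enumerate(threads):
            for i, producers in enumerate(deps):
                if issued == width:
                    break
                # Ready only if every producer finished in an earlier cycle.
                ready = all(
                    finish[t][p] is not None and finish[t][p] < cycle
                    for p in producers
                )
                if finish[t][i] is None and ready:
                    finish[t][i] = cycle
                    issued += 1
                    remaining -= 1
    return total / cycle


chain = [[], [0], [1], [2]]  # fully serial thread: IPC 1 on its own
print(shared_core_ipc([chain], width=8))      # 1.0
print(shared_core_ipc([chain] * 4, width=8))  # 4.0
print(shared_core_ipc([chain] * 4, width=2))  # 2.0
```

    In this idealized model, four serial threads give four times the IPC of one, right up until the shared issue width becomes the limit.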

    A compiler, on the other hand, has to be made extremely complex to extract much more parallelism than is currently extracted, and still won't be able to capture a lot of it. I know this far too well, having seen the guts of compilers on a few occasions. You'll also get no benefit for legacy code or for code that was compiled with a mediocre compiler (as almost all code is, to Intel's continuing dismay).

    SMT is especially nice because there's almost no extra hardware overhead for implementing SMT. It's a winning strategy from all angles.