Slashdot Mirror


Apple Is Buyer of New 64-Bit IBM Chips

ohmygod2 wrote to us with a story from SF Gate that Apple, unsurprisingly, is going to be one of the purchasers of IBM's PowerPC 970. At this time, though, it's unclear where Apple is going to actually *use* said chip. Update: 10/14 15:53 GMT by H: Follow-up to Tim's story.

3 of 401 comments

  1. Where? That's obvious! by Bug-Y2K · · Score: 0, Redundant
    At this time, though, it's unclear where Apple is going to actually *use* said chip.

    Well, Duh! In a computer ya doofus! Where else would a computer company use a new CPU?

    They sure as hell ain't dropping it into an iPod or some vaporPDA!

  2. WTF?!!! by io333 · · Score: 1, Redundant

    Granted, I am unsure as of yet if Darwin runs 64 bit natively, but when it does, imagine a dual processor of these (with of course, quartz extreme pushing all of the video over to the Graphics processor).

    OK, what is wrong here? You the mod are going to mod me down to minus -1 and this numnut gets a 4? I don't care.

    IMAGINE A F&CKING BEOWULF CLUSTER OF THESE!!!!!!!!

    ha ha ha ha ha ha ha ha ha ha ha ha ha ha ha!!!!!

  3. Re:1.8ghz... Ignoring Pipeline Length by Yokaze · · Score: 3, Redundant

    >"Average code" is 20% branch instructions,

    Well, as I wanted to indicate, "average" is very vague. SPECint has 14-16% conditional branches. SPECfp has 3-12%.

    >On the first run [...] 50% prediction rate

    Not really. Statistical prediction and profiling can be applied. 85% of backward branches are taken (loops), and 60% of forward branches are taken. With predict-taken you get a misprediction rate (MPR) of roughly 35%. With profiling you can get an MPR of 10%-20%.
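    The predict-taken figure can be checked with a few lines of arithmetic. This is only a sketch: the taken rates come from the numbers above, but the share of backward branches is not given there, so `back_fraction` is an assumed parameter. A share of about 20% happens to reproduce the 35% figure.

```python
def predict_taken_mpr(back_fraction, back_taken=0.85, forward_taken=0.60):
    """Misprediction rate when every branch is statically predicted taken.

    back_fraction is an assumed share of backward branches (not from the post).
    """
    forward_fraction = 1.0 - back_fraction
    # A predict-taken scheme is wrong exactly when the branch is not taken.
    return (back_fraction * (1.0 - back_taken)
            + forward_fraction * (1.0 - forward_taken))


for back_fraction in (0.2, 0.35, 0.5):
    mpr = predict_taken_mpr(back_fraction)
    print(f"backward branches {back_fraction:.0%}: MPR {mpr:.1%}")
```

    The more loop-heavy the code (higher `back_fraction`), the better plain predict-taken does.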

    >[...] and a good branch prediction unit can give 90% correct predictions.

    Let's say an average one. (At best)

    A two-level adaptive scheme (T. Yeh, Y. Patt) delivers an MPR of 3%. Hybrid branch predictors deliver even better results.
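    To give a feel for why a two-level scheme does so well on loops, here is a toy sketch of one simple variant: a single global history register (first level) indexing a table of 2-bit saturating counters (second level). The class name, the 4-bit history length, and the test pattern are all assumptions for illustration. It is not the configuration measured by Yeh and Patt and will not reproduce their 3% figure.

```python
class TwoLevelPredictor:
    """Toy two-level adaptive predictor with one global history register."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.history = 0  # first level: last N outcomes, 1 = taken
        # Second level: one 2-bit saturating counter per history pattern,
        # initialised to 2 ("weakly taken").
        self.counters = [2] * (1 << history_bits)

    def predict(self):
        return self.counters[self.history] >= 2

    def update(self, taken):
        counter = self.counters[self.history]
        if taken:
            self.counters[self.history] = min(3, counter + 1)
        else:
            self.counters[self.history] = max(0, counter - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def misprediction_rate(outcomes, predictor):
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses / len(outcomes)


# A loop branch that is taken three times and then falls through, repeated.
pattern = [True, True, True, False] * 250
rate = misprediction_rate(pattern, TwoLevelPredictor())
print(f"two-level MPR on the loop pattern: {rate:.1%}")
```

    A static predict-taken scheme would miss every fourth branch of this pattern (25%), while the history-indexed counters learn the loop exit after the first pass.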

    >We then need to add the pipeline length

    The penalty is not always the whole pipeline length.
    For the P4, the pipeline has 28 stages, but only 19 have to be flushed (8 are needed for the trace cache).

    So let's review your calculation:

    >So this leads us to need a pipeline flush every 45 instructions (on average)

    20% branches with a 10% MPR means to me that 2% of instructions cause a pipeline flush, i.e. one flush every 50 instructions. How do you arrive at one every 45 instructions?

    I'd say something more like this:
    Conditional branch instructions: 20% (your guess is as good as mine).
    MPR 10% = 2E-2 pipeline flush probability (per instruction)
    MPR 5% = 1E-2
    MPR 3% = 6E-3

    Some Guesses:
    MPR: P3 5%, P4 3%, PPC970 3%
    Pipeline penalty: P3 10, P4 19, PPC970 10
    Overhead: P3 10%, P4 12%, PPC970 6%

    So, at least by my estimate, the P4's penalty relative to the PPC970 is not 18% but only about 6%.
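    The estimate above is just branch fraction × MPR × flush penalty, and can be reproduced with the short sketch below. The MPR and penalty values are the guesses from the list, not measurements. Reading the result as a percentage overhead additionally assumes a baseline of one instruction per cycle, which the post implies but does not state.

```python
BRANCH_FRACTION = 0.20  # conditional branches per instruction (a guess)

# name: (guessed MPR, guessed flush penalty in cycles)
cpus = {
    "P3": (0.05, 10),
    "P4": (0.03, 19),
    "PPC970": (0.03, 10),
}


def flush_overhead(mpr, penalty_cycles, branch_fraction=BRANCH_FRACTION):
    """Extra cycles per instruction lost to branch-misprediction flushes."""
    return branch_fraction * mpr * penalty_cycles


overheads = {name: flush_overhead(mpr, penalty)
             for name, (mpr, penalty) in cpus.items()}

for name, overhead in overheads.items():
    print(f"{name}: {overhead:.1%}")

gap = overheads["P4"] - overheads["PPC970"]
print(f"P4 vs PPC970: {gap:.1%}")
```

    The P4 comes out at 11.4%, which the list rounds up to 12%, so the gap to the PPC970 is between 5 and 6 percentage points.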

    > multi-tasking
    Umm, you are running at something like 1GHz for something like 10ms, so you'll get about 10M instructions per time slice. So the penalty for a cold BTB is most probably negligible. Otherwise, you're probably IO-bound anyway, and the CPU will be the least of your problems.
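    A rough upper bound on the cold-BTB cost can be sketched as follows. The BTB size (4096 entries) and the 19-cycle penalty are assumed values for illustration only, and the model pessimistically charges one full misprediction for every entry that has to be refilled after a context switch.

```python
def cold_btb_overhead(clock_hz, slice_seconds, btb_entries, penalty_cycles):
    """Fraction of a time slice spent re-warming a completely cold BTB."""
    slice_cycles = clock_hz * slice_seconds
    # Pessimistic: every BTB entry costs one misprediction when refilled.
    warmup_cycles = btb_entries * penalty_cycles
    return warmup_cycles / slice_cycles


overhead = cold_btb_overhead(
    clock_hz=1e9,         # 1GHz, as in the post
    slice_seconds=10e-3,  # 10ms time slice, as in the post
    btb_entries=4096,     # assumed BTB size
    penalty_cycles=19,    # assumed flush penalty
)
print(f"cold-BTB overhead per time slice: {overhead:.2%}")
```

    Even under these pessimistic assumptions the cost stays below 1% of the time slice.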

    The reason for better (or worse) performance probably lies somewhere else. Actually, the increase in other pipeline hazards may be one of them. How long instructions take is another. (Well, for a RISC processor a (non-fp) instruction takes 1 cycle, but for x86...) Not to mention caches and memory.

    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"