Slashdot Mirror


34 Design Flaws in 20 Days of Intel Core Duo

Pray_4_Mojo writes "Geek.com is reporting that Intel's errata (bug) documentation shows that the Intel Core Duo chip has 34 known issues found in the 20 days since the launch of the iMac Core Duo (you can read the list), with plans to fix only one of them. While bugs in hardware are nothing new (the P4 has 64 known issues, at this time Intel does not plan to fix a single one) this marks one of the first times that Intel released a processor with known bugs, and some of the bugs are of higher severity than in the past. Also alarming is the rate at which the flaws have been found: one and a half per day since the launch of the iMac Core Duo."

81 of 356 comments

  1. Up front by emerrill · · Score: 5, Interesting

    I just think it means that Intel is being more honest about the problems, rather than hiding them until others find them.

    1. Re:Up front by Ucklak · · Score: 2, Interesting

      They had issues with their first run of the P4's. Remember that there was a BIOS workaround which made it slower than the P3 at the time?

      --
      if you steal from one source, that is plagiarism, if you steal from many, well, that's just research.
    2. Re:Up front by ciroknight · · Score: 5, Insightful

      Take a look at the error list for a second. Over 50% of them are caused by dropping the processor into Debug mode, with over 75% of them only being observed by Intel themselves. Now, certainly there are more bugs reported so far, but does that mean that there are actually more bugs, or that Intel is getting better at finding bugs and reporting them?

      Only time will tell.

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
    3. Re:Up front by Radicode · · Score: 4, Informative

      And I would add that most "flaws" can be avoided by the compiler. Programmers (except the ones making the compiler) don't have to worry about those. These bugs occur in really rare conditions that can be avoided. CPU design is really complex... if you thought assembler instructions were executing one after the other, you're wrong. Usually, they will execute in mixed order, many at the same time. That's what makes a fast CPU.

      For those still reading books, I suggest "Computer Architecture" by John L. Hennessy and David A. Patterson.

      Radicode

    4. Re:Up front by Massacrifice · · Score: 2, Informative

      I thought the P4 was slower than the P3 when it started because of its lower IPC.

      --
      -- Home is where you eat your heart out.
    5. Re:Up front by Analog+Squirrel · · Score: 2, Informative

      Or possibly the one by David A Patterson and John L Hennessy...

      --
      I'd rather be flying
    6. Re:Up front by Kadin2048 · · Score: 2, Insightful

      That's a poor attitude to take. Almost certainly they did testing before they went to production and started making masks and all the rest -- but a responsible company doesn't just stop doing testing the moment the product rolls out the door.

      I work on a very large software project. In some ways, it's not unlike designing hardware; we have a very slow, inflexible release schedule. Once a release starts being rolled out to the users, it's done. While theoretically there might be a way to do an "emergency patch" in some extremely severe circumstance (followed by a ritual sacrifice of everyone involved), in practice it would be almost impossible. But that doesn't mean that we stop testing software once it goes into production -- and the fact that we still test production versions doesn't mean that we don't do a lot of in-house testing, either.

      You test, test, test before the product gets rolled out -- whether it's hardware or software -- and then you continue to test afterwards. What changes is your ability to fix things. Before the product has been frozen and you're committed, you can actually fix bugs. Afterwards, you are limited to impact mitigation and providing workarounds for your support teams. Not as good as actually eliminating the bug, but I think as a user it would be better to know about a bug in advance and be provided with a workflow that avoids it, than run into it on your own and be stuck.

      Frankly I think it would be irresponsible for a company not to continue testing, as long as they have the resources to do so. That's called maintenance.

      Furthermore, there is a certain point you get to (at least in my experience) where you can keep hammering out bugs, and eventually start creating new bugs as the result of your own fixes. It's a never-ending process; there will always be one more bug. This idea that anyone can produce a totally bug-free product, on a large scale (the size of a modern microprocessor or a huge software project), if they just threw enough resources at the problem, is incorrect and dangerous. At some point you have to stop fixing things and release the product -- especially if your goal is to make money and stay in business.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  2. Faster by mysqlrocks · · Score: 3, Insightful

    Maybe they're just getting faster/better at finding bugs?

    1. Re:Faster by Golias · · Score: 5, Funny

      Shh!!! You're ruining perfectly good FUD!

      --

      Information wants to be anthropomorphized.

    2. Re:Faster by adrianmonk · · Score: 5, Funny
      Maybe they're just getting faster/better at finding bugs?

      Yeah, I hear they're 2 to 3 times as fast now on the most important bug finding benchmarks.

    3. Re:Faster by Surt · · Score: 4, Insightful

      It seems likely that given the increasing complexity, the error rate is going to rise proportionally. I mean, how many errors do you expect in a 100,000 transistor chip vs a 100,000,000 transistor chip?

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    4. Re:Faster by Golias · · Score: 2, Insightful

      And we know that there are no plans to fix these "show stopper" bugs because geek.com says so. Also, we know they are "show stopper" bugs because geek.com says so.

      34 is actually a very tiny bug list for a bleeding-edge CPU.

      --

      Information wants to be anthropomorphized.

    5. Re:Faster by c_forq · · Score: 2, Insightful

      Future chips. This batch may have them until they are no longer pressed, but I would imagine any revisions or new families of chips will take these past mistakes into account.

      --
      Computers allow humans to make mistakes at the fastest speeds known, with the possible exception of tequila and handguns
    6. Re:Faster by Golias · · Score: 5, Insightful

      What I am saying is that in general, what's the use of getting better and faster at finding bugs if there aren't plans to fix it?

      Because the purpose of finding silicon bugs is almost never to fix them. Fixing CPU bugs is often impractical. You find the flaws so you can route around them. This is the case with every consumer chip on the market, including the one you are using to read this right now.

      --

      Information wants to be anthropomorphized.

    7. Re:Faster by VitaminB52 · · Score: 5, Informative
      It seems likely that given the increasing complexity, the error rate is going to rise proportionally. I mean, how many errors do you expect in a 100,000 transistor chip vs a 100,000,000 transistor chip?

      Given the fact that a very substantial part of the extra chip estate is being used as L1 and L2 cache, the error rate should increase less than proportionally. If you upgrade cache size from say 8 kB to 1 MB, then there is only a relatively small increase in complexity of the cache controller, not of the cache itself.
      Add the new chip design software and the use of hardware libraries for standard chip functionality, and the error rate should increase even more slowly.

    8. Re:Faster by diegocgteleline.es · · Score: 3, Insightful

      Indeed! It's like you would say that it's much easier to find bugs just after the first release of the CPU, and even easier when it's the debut of a completely new architecture like the Core Duo is! It'd be like posting links to the AMD errata docs!

      Like bugs in CPUs are something new... I want to know how many bugs were found in the first 20 days of the release of other Intel architectures and the Opteron, otherwise I can't know if the Core Duo is a bad CPU compared with others or not. This article just looks like anti-Intel FUD from AMD fanboys (Intel made a good CPU even with the bugs, deal with it, AMD is not going to give away free CPUs to you for being a fanboy).

      And let me doubt that there's any CPU manufacturer at all that releases CPUs without any "known bug"; many CPU bugs are fixed with microcode updates via new BIOS versions. There's a reason why both AMD and Intel CPUs allow microcode updates; they don't include features for fun.

    9. Re:Faster by hobbit · · Score: 2, Funny

      including the one you are using to read this right now.
      Tsh. Like all real geeks, I read Slashdot in Lynx under HURD on a custom ASIC I designed myself.

      --
      "Wise men talk because they have something to say; fools, because they have to say something" - Plato
    10. Re:Faster by gad_zuki! · · Score: 2, Funny

      >>Maybe they're just getting faster/better at finding bugs?

      Right. It's dual core, so it's twice the bugs found twice as fast. Amazing!

    11. Re:Faster by njh · · Score: 2, Funny

      Soon they'll be finding the bugs before they leave the factory!

  3. Re:Should've gone with AMD by Transeau · · Score: 4, Informative

    You do realize that there is an 85 page PDF of errors in the AMD64, right?

  4. "one of the first times"? by sczimme · · Score: 5, Insightful


    this marks one of the first times that Intel released a processor with known bugs

    No: either it is the first time or it is not. There can be only one... first time.

    and some of the bugs are of higher severity then in the past

    then != than

    --
    I want to drag this out as long as possible. Bring me my protractor.
    1. Re:"one of the first times"? by Golias · · Score: 5, Insightful

      this marks one of the first times that Intel released a processor with known bugs

      No: either it is the first time or it is not. There can be only one... first time.


      I disagree with the mod who marked you "Off-topic." It may look like you are just being a grammar nazi, but you raise a valid point.

      Saying "this marks one of the first times that Intel released a processor with known bugs" is pretty much the same as saying, "this is not the first time that Intel has released a processor with known bugs, but I want it to sound like alarmingly bad news for Apple."

      --

      Information wants to be anthropomorphized.

    2. Re:"one of the first times"? by laird · · Score: 2, Informative

      "this marks one of the first times that Intel released a processor with known bugs"

      Every chip Intel has ever shipped has had errata. This isn't unique to Intel, of course -- every chip ever shipped has had errata. The only news here is that apparently people have found a lot of bugs in this specific chip fairly quickly. But Mac users are a demanding bunch...

      http://www.amd.com/epd/desiging/tsdocs/2.erratashe/index.html lists AMD's errata sheets.
      http://www.rcollins.org/Errata/ErrataSeries.html documents some Intel errata from the late 90's.
      http://mysearch.intel.com/corporate/default.aspx?culture=en-US&q=errata&searchsubmit.x=12&searchsubmit.y=8 searching for Errata on Intel's site returns 6,520 hits (most for errors in documentation). This is to their credit -- everyone makes mistakes, and documenting them benefits everyone.
      http://www.freescale.com/webapp/search/MainSERP.jsp?QueryText=errata&RELEVANCE=false&showAllCategories=false&srch=1&assetLocked=false&pageSize=5&SelectedAsset=Product+Pages& and FreeScale has a ton of errata documentation as well.

      You get the idea.

  5. Does anyone know.... by Jaysyn · · Score: 3, Interesting

    How many "bugs" are in Athlons/Durons/Semprons?

    Jaysyn

    --
    There is a war going on for your mind.
    1. Re:Does anyone know.... by Surt · · Score: 4, Informative
      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  6. 20 days? by Anonymous Coward · · Score: 5, Insightful

    It's a little dishonest to use the phrasing "Core Duo chip has 34 known issues found in the 20 days since the launch of the iMac Core Duo."

    Most of these bugs were found well before the release of Core Duo. Many of the bugs are listed as having been observed by Intel only. That means the verification teams did hit these issues, either with a very bizarre code setup or by doing something that's probably not technically legal anyway. The odds of seeing most of them on an end-user platform are very low.

    1. Re:20 days? by Anonymous Coward · · Score: 5, Informative

      And AMD has no bugs in their chips? Here's the Athlon 64 Revision History document off of AMD's own website:

      http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf

      There's a lot more listed there than for the Core Duo so far, and quite a few are marked as "Won't be Fixed" and sound scary. Here's an example of a rather nasty looking ordering bug that results in a system hang:

      Downstream non-posted requests to devices that are dependent on the completion of an upstream
      non-posted request can cause a deadlock in the presence of transactions resulting in bus locks, as shown in the following two scenarios:

      1. A downstream non-posted read to the LPC bus occurs while an LPC bus DMA is in progress. The legacy LPC DMA blocks downstream traffic until it completes its upstream reads.

      2. A downstream non-posted read is sent to a device that must first send an upstream non-posted read before it can complete the downstream read.

      In both cases, a locked transaction causes the upstream channel to be blocked, causing the deadlock condition.

      Potential Effect on System
      The system fails due to a bus deadlock.
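The quoted erratum describes a circular wait, which can be sketched as a cycle in a wait-for graph. This is only an illustrative model: the node names are invented here, and the final edge (the locked transaction waiting on the downstream read to drain) is an assumption about how the cycle closes, since the erratum text itself only says the lock blocks the upstream channel.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (i.e. a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {node: WHITE for node in wait_for}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:             # back edge: we are waiting on ourselves
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(wait_for))


# Hypothetical model of scenario 2: each key waits on the listed nodes.
with_lock = {
    "downstream_read": ["upstream_read"],       # device must read upstream first
    "upstream_read": ["upstream_channel"],      # needs the upstream channel
    "upstream_channel": ["locked_transaction"], # channel blocked by the bus lock
    "locked_transaction": ["downstream_read"],  # ASSUMPTION: lock waits for drain
}
without_lock = {
    "downstream_read": ["upstream_read"],
    "upstream_read": ["upstream_channel"],
    "upstream_channel": [],
}

print(has_cycle(with_lock), has_cycle(without_lock))
```

Removing the lock from the model breaks the cycle, which matches the erratum's claim that the locked transaction is what turns an ordinary dependency into a hang.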

    2. Re:20 days? by Anonymous Coward · · Score: 4, Funny

      I just checked on my P1, and it's really 19.9999999999999999742919319 days, not 20.

    3. Re:20 days? by SpinJaunt · · Score: 2, Funny
      Q: How many Pentium designers does it take to screw in a light bulb?
      A: 1.99904274017, but that's close enough for non-technical people.

      Q: What do you get when you cross a Pentium PC with a research grant?
      A: A mad scientist.

      Q: What's another name for the "Intel Inside" sticker they put on Pentiums?
      A1: Warning label.
      A2: Truth in advertising.

      Q: What do you call a series of FDIV instructions on a Pentium?
      A: Successive approximations.

      Q: Complete the following word analogy: Add is to Subtract as Multiply is to

            1. Divide
            2. ROUND
            3. RANDOM
            4. On a Pentium, all of the above

      A: Number 4.

      Q: What algorithm did Intel use in the Pentium's floating point divider?
      A: "Life is like a box of chocolates." (Source: F. Gump of Intel)

      Q: Why didn't Intel call the Pentium the 586?
      A: Because they added 486 and 100 on the first Pentium and got 585.999983605.

      Q: According to Intel, the Pentium conforms to the IEEE standards 754 and 854 for floating point arithmetic. If you fly in aircraft designed using a Pentium, what is the correct pronunciation of "IEEE"?
      A: Aaaaaaaiiiiiiiiieeeeeeeeeeeee!
      TOP TEN NEW INTEL SLOGANS FOR THE PENTIUM

      9.9999973251 - It's a FLAW, Dammit, not a Bug
      8.9999163362 - It's Close Enough, We Say So
      7.9999414610 - Nearly 300 Correct Opcodes
      6.9999831538 - You Don't Need to Know What's Inside
      5.9999835137 - Redefining the PC--and Mathematics As Well
      4.9999999021 - We Fixed It, Really
      3.9998245917 - Division Considered Harmful
      2.9991523619 - Why Do You Think They Call It Floating Point?
      1.9999103517 - We're Looking for a Few Good Flaws
      0.9999999998 - The Errata Inside

      --
      /. is good for you.
  7. AMD errata by Anonymous Coward · · Score: 5, Informative

    Revision Guide for AMD Athlon™ 64 and AMD Opteron™ Processors. Just for balance. (only two of them are really interesting, #113 is one of them IIRC)

    1. Re:AMD errata by Tucan · · Score: 2, Interesting

      Just to balance the balance, the AMD document indicates that of 136 listed problems they plan to fix all but about a dozen.

  8. Statistics by emerrill · · Score: 3, Interesting

    Another thing here that people don't seem to get, is that just because there have been 1.5 'found' a day (I would bet most were known before general release), that says nothing about the total number of bugs. For all we know, there could be only 40 total, just most of them were found quickly.
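To put numbers on that point, here is a small sketch. The totals in the loop are made up purely for illustration; only the 34 errata and 20 days come from the story. Note that 34 / 20 works out to 1.7 per day, a bit above the "one and a half" in the summary.

```python
def discovery_rate(bugs_found, days):
    """Average number of errata published per day over the period."""
    return bugs_found / days


found, days = 34, 20
rate = discovery_rate(found, days)
print(f"{rate:.2f} errata per day")

# The same observed rate is consistent with very different totals
# (hypothetical values), so the rate alone says nothing about what remains.
for hypothetical_total in (40, 100, 400):
    remaining = hypothetical_total - found
    print(f"total={hypothetical_total}: rate={rate:.2f}/day, still unfound={remaining}")
```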

  9. First time with BUGs?!?! by Ninja+Programmer · · Score: 5, Informative
    ... While bugs in hardware are nothing new (the P4 has 64 known issues, at this time Intel does not plan to fix a single one) this marks one of the first times that Intel released a processor with known bugs, ...


    Huh? That's clearly wrong. When Intel had its famous FDIV bug, they shipped it knowing that the problem was there (the chips were already manufactured before they noticed it in their internal design validation.) In fact I would highly doubt that any Intel chip (or AMD chip) has shipped without some known bugs in them.

    It's just a question of severity. Most of these bugs tend to be highly marginal in a "real software doesn't push that hard on the CPU" sense.
  10. Why is this an Apple issue? by toupsie · · Score: 4, Informative

    Apple is not the only manufacturer using the Core Duo chip.

    --
    Strange women lying in ponds distributing swords is no basis for a system of government.
  11. Oh thats it! by catahoula10 · · Score: 5, Funny

    Why does Apple want to use an Intel chip?

    Oh, that's right:
    Microsoft Owns Apple.

    How can we tell?

          1. Apple's stock only rose 25% last week.
          2. Bill Gates's birthday now a paid holiday for Apple employees.
          3. Default Mac startup sound changed to "Taps."
          4. Wall Street brokers have stopped using Apple stock certificates as toilet paper.
          5. Apple's new slogan: "Almost as good as Windows!"
          6. Apple has been bent over with its pants dropped for so long now, even a geek like Bill Gates was bound to get lucky.
          7. Cute rainbow-colored apple now inhabited by cute rainbow-colored worm.
          8. microsoft comes out with an operating system incorporating Mac technology ... uh, wait a minute ...
          9. Phone and utilities mysteriously start working again at Apple's corporate HQ.
        10. Steve Jobs seen tending bar at the Gates' private lawn party.
        11. Diners in Microsoft's staff cafeteria can now enjoy their apple pie purely for its wholesome goodness and no longer as a symbolic act of global domination.
        12. Unsold Newtons used as cobblestones in Gates's driveway.
        13. Apple Employee of the Month gets to hunt loose change at Bill's house.
        14. New Apple employee dress code includes large "Property of B. Gates" tattoo on ass.
        15. Bill Gates still burned in effigy, but upper management no longer attends.

    (http://www.ehumorcentral.com/Directory/Jokes/838.html)

    I like #7 and #11 myself :-)

    --
    This has been another valuable and informative opinion from:
    Catahoula!
    1. Re:Oh thats it! by HeroreV · · Score: 2, Informative

      7. Cute rainbow-colored apple now inhabited by cute rainbow-colored worm.

      I like #7 and #11 myself :-)

      Apple hasn't used that rainbow-colored apple logo in ages, have they?

  12. All modern processors have bugs on release by tlhIngan · · Score: 5, Informative

    It's called "errata", and it's common for most processors to be released with pages and pages and pages of errata.

    Of course, what happens is that the alpha/beta silicon ships to select customers without many errata (though internal testing often finds them too, and they ship with those). Then the manufacturer goes back, resolves a few, then the cycle repeats until everyone is happy with the bugs and it's released with a book of errata on them, and workarounds for the severe ones.

    "No fix" errata are common. The most serious of those have workarounds. Fixed errata are for things where there can be no possible software workaround. But there's a large number of varying severity - from cache incoherences, lock failures (you try to lock something, and it either can't be unlocked the usual way, or it doesn't reliably indicate lock), to bus and spec violations.

    Nothing new here...

  13. Re:Should've gone with AMD by GeekDork · · Score: 2, Insightful

    Now, this would've been interesting or informative if you would have provided a link to that PDF. Pretty please?

    --

    Fight hunger. Filet a politician and send him to a 3rd world country of your choice.

  14. Equivalent PowerPC numbers? by Angostura · · Score: 3, Insightful

    This news would be a lot more interesting if I knew the size of the errata list for the G4 or the G5. I think it unlikely that there are zero unfixed bugs.

    Anyone? Bueller?

  15. Sensationalized by emerrill · · Score: 2, Insightful

    Geek.com has pumped up these problems by doing their own analysis and claiming 'show stopper' on many of them, yet there are already machines in the wild that seem to have no problem with many of them. Take their claim that machines wouldn't be able to wake from sleep because of one of them. Their analysis is a lot of FUD.

  16. Image Mirror by XMilkProject · · Score: 2, Informative

    It's going pretty slow, so here's a mirror I set up of the image with the list on it: http://www.xmilk.com/coreduo.gif

    --
    Big ones, small ones, some as big as yer 'ead!
    Give 'em a twist, a flick o' the wrist...
  17. The One I'm Waiting For by Nom+du+Keyboard · · Score: 3, Funny
    The flaw I'm waiting to see:

    Cannot run Windows XP. Classification: Minor.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  18. Re:A flawed design kept alife. by TheRaven64 · · Score: 4, Insightful
    Not quite the same. All that has been kept the same is the interface, not the implementation. It's the equivalent to having to keep an API/ABI stable. It can cause problems (see the WMF features for more information), but it's also often useful - Win3.0 apps running on Windows XP, for example, or UNIX code from the '80s compiling and running on Linux / BSD.

    The problem with x86 comes from the fact that a large number of instructions interact in relatively complex ways with others. Changing a small amount of silicon can change a side-effect of an instruction, which is then a bug. An ISA such as Alpha eliminated this by keeping inter-instruction interactions to a minimum (no condition registers, etc).

    --
    I am TheRaven on Soylent News
  19. Yeah some perspective would be nice... by sterno · · Score: 4, Insightful

    So not only how many bugs in Athlon, etc, but also...

    How many bugs in other Pentium chips?
    What was the rate of discovery of bugs in other chips?

    Keep in mind that during Intel's entire history they've released one desktop processor that had a bug sufficient to require a recall. Most of the bugs are easily worked around including that one. Hell, I've got an old P60 that I was using as a router until the last year or so and it just worked fine and it was always amusing to see Linux notice the FDIV bug on boot.

    --
    This sig has been temporarily disconnected or is no longer in service
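The boot-time FDIV check sterno mentions can be approximated in a few lines. This is a sketch using the widely circulated test operands 4195835 and 3145727; the function names are made up here and this is not the kernel's actual code. On a correct FPU the residual is zero or within rounding error, while the flawed Pentiums were widely reported to leave a residual of 256.

```python
def fdiv_residual(x=4195835.0, y=3145727.0):
    """Compute x - (x / y) * y, which should be (almost exactly) zero."""
    return x - (x / y) * y


def fpu_looks_flawed(residual):
    """Treat any residual of 1 or more as a sign of a bad divider."""
    return abs(residual) >= 1.0


residual = fdiv_residual()
print("residual:", residual)
print("FDIV bug suspected" if fpu_looks_flawed(residual) else "divider looks fine")
```

The threshold of 1.0 is a deliberately loose choice: it ignores ordinary floating-point rounding while still catching an error as large as the one the original bug produced.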
  20. All CPU, controllers, etc. have errata... by shawnce · · Score: 4, Informative

    Not sure I understand the point of this new article... all chips have errata. This is like reporting that the sun set again or that slashdotters have no love life.

    For example...

    The MPC7410 family of chips (aka G4) from Freescale (formerly part of Motorola) has 21 errata currently listed: MPC7410CE.pdf

    The MPC7447 family of chips (aka G4) from Freescale has 36 errata currently listed: MPC7457CE.pdf

    The PPC 970FX (aka G5) from IBM has 24 errata currently listed: 970fx_errata_dd3.x_v1.6.pdf

  21. AMD Opteron errata by mrm677 · · Score: 2, Informative

    The errata for the AMD Opteron is 85 pages long. I once spoke with a chipset designer and he told me that the Opteron errata was especially long with some convoluted workarounds, compared to other CPUs he's worked with.

  22. It's normal to not fix silicon bugs by Theovon · · Score: 5, Informative

    As an ASIC designer, I have produced my fair share of silicon bugs. Chips are expensive to produce, making bugs expensive to fix. As a result, chip designers (even ones with deep pockets like Intel) do not look at bugs as something to FIX, but rather as something to MASK. I don't mean to hide it from people (although that does happen), but to make it not a bug by working around it.

    Unless the bug is so fatal that you can't work around it, or the bug could potentially cost lives, the primary solution is to work around it. Either you write driver code to avoid the bug, or you find some other cheap solution. Sometimes, it's a simple matter of removing a feature from your marketing literature.

    Intel's typical means to mask processor bugs is microcode. This hurts performance, but they can typically create a workaround that routes everything around the bug. I can't read the article (it's slashdotted), but I'm sure that by saying they won't fix some bugs, they're saying that they won't respin the silicon but rather mask the bug in some other way.

    Listing the bugs (and not fixing them in this version) is an appropriate thing for Intel to do.

    (I'm no Intel fanboy. I think they're bastards. But this is NOT an example of them being bastards.)

    1. Re:It's normal to not fix silicon bugs by homer_ca · · Score: 3, Informative

      "Intel's typical means to mask processor bugs is microcode."

      That's true. Every Intel CPU since the Pentium Pro can update its microcode. Many times, BIOS will contain microcode updates from Intel. Linux also has a microcode update driver.

      "I'm sure that by saying they won't fix some bugs, they're saying that they won't respin the silicon but rather mask the bug in some other way."

      I'm not sure about that. "Will fix" seems to imply the errata could be fixed in silicon or microcode, while "Will not fix" means it won't get fixed at all.
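For anyone who wants to see which microcode revision is actually loaded, here is a rough sketch. It assumes an x86 Linux kernel recent enough to expose a `microcode` field in /proc/cpuinfo (older kernels may not have it), and the function name is invented for this example.

```python
def microcode_revisions(cpuinfo_text):
    """Map logical processor number -> microcode revision string."""
    revisions = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue                      # blank separator lines between CPUs
        key, value = key.strip(), value.strip()
        if key == "processor" and value.isdigit():
            current = int(value)
        elif key == "microcode" and current is not None:
            revisions[current] = value
    return revisions


# Sample text in the /proc/cpuinfo layout (the values are illustrative only).
sample = (
    "processor\t: 0\nmodel name\t: Example CPU\nmicrocode\t: 0x1f\n\n"
    "processor\t: 1\nmodel name\t: Example CPU\nmicrocode\t: 0x1f\n"
)
print(microcode_revisions(sample))
```

On a real machine you would pass in the contents of /proc/cpuinfo instead of the sample string. Differing revisions across cores would be worth a closer look.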

    2. Re:It's normal to not fix silicon bugs by masklinn · · Score: 2, Informative

      I'm not sure about that. "Will fix" seems to imply the errata could be fixed in silicon or microcode, while "Will not fix" means it won't get fixed at all.

      A workaround isn't considered as a FIX, WONTFIX is wontfix even with published workarounds (including microcode). WONTFIX means that the error won't be fixed at the silicon level, which is the subject of errata papers.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    3. Re:It's normal to not fix silicon bugs by stevesliva · · Score: 3, Informative
      Chip bugs are often due to the intersection of the domains that the "chip simulations" you mention cover. You get static timing analysis, power analysis, logic verification, transient simulation at various process and applied conditions. But many of the analyses are done without true interlock with the other simulators. And you get layered levels of abstraction, and all sorts of automated tools hooking all the abstracted components together...

      So if you look at the list of errata, you see things like flags not getting set properly after the execution of an instruction. What could cause this? 1.) The design was logically incorrect. 2.) The design was logically correct, but the flag is never properly latched on the correct cycle for all hardware. 3.) The flag doesn't get set for slow hardware. 4.) The flag doesn't get set for hardware that has issues with supply integrity. Etc etc.

      One would think that if they screwed up the implementation of a long-lived feature, it wasn't a logic error (likely to be caught by running verification) but an error caused by the analog or physical world intruding upon the digital domain. Some small amount of this may be expected-- oh crap! 1% of chips have an obscure timing issue we can't catch in test-- but if it is a true logic bug, someone screwed up.

      --
      Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
  23. I think this is what he meant by flyinwhitey · · Score: 3, Informative

    http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf And as an aside, it took two seconds (actually .08 seconds) to look up on Google. Maybe try that next time.

    --
    How pathetic are you that you follow me from topic to topic and waste all your mod points at once modding me down?
  24. Re:34 design flaws and only 1/4 faster.... by catmistake · · Score: 2, Funny

    We will miss the AltiVec Velocity Engine and 64-bit full RISC processing, no doubt. Let's hope Intel designs something as useful as AltiVec that developers can take advantage of, and gets Apple a 64-bit chip soon.

  25. Safety critical software developers beware.... by s31523 · · Score: 2, Insightful

    Being in the Aerospace/Defense industry, I find this disconcerting, especially for those of us that deal with the FAA and the infamous DO-178B. More demanding systems are forcing us to use more powerful processors, and if they are plagued with "known issues" it may be a problem getting through certification by some governing agency. Especially now that DO-254 has reared its ugly head... Has Intel gone the way of Microsoft? Delivering early to gain market share even though the product has severe quality issues, and then taking the "well, it's not a critical security flaw" line?

    1. Re:Safety critical software developers beware.... by emerrill · · Score: 4, Insightful

      That assumes that Intel wants the safety-critical market for this processor. In most cases, when you develop in this sector, you have to use hardware that is specifically designed for these applications. Developing chips that can be certified for SC applications can be a pain in the ass, and they may simply not care to for this chip.

  26. Re:Should've gone with AMD by freidog · · Score: 5, Informative
    Here you go

    I didn't bother to actually count the number of unfixed or no fix planned glitches / bugs in there, so I don't know if it actually validates the 80+ the grandparent claimed, but there are quite a few known bugs in A64 and its HTT bus.

    In fact, any CPU released is going to have these; even stuff like Power / Itanium / UltraSPARC is going to have errata like this. Microprocessors are incredibly complex equipment, and 100% stable and glitch free under all possible conditions just isn't going to happen. Whoever submitted this story is blowing this entirely out of proportion. The link is already Slashdotted so I haven't gotten a chance to read what the bugs / glitches are, but I would bet good money a normal user could go through the entire life of their Core Duo Mac and never notice one. These are typically very small glitches / bugs that occur under very specific conditions, and are meant more for hardware manufacturers to be aware of than they are to warn a user there could be problems with their chips.

    Publishing them publicly is, I think, a good move on Intel's part, but they do run the risk that people don't understand that this is a completely and utterly ordinary and expected thing to happen.

  27. I like the comment on bug AE9 by Phil+John · · Score: 2, Funny

    Coral Cache of the image

    Quoth the image: Show stopper, but only observed by Intel so far. Also, any OS developer who codes like this deserves this one.

    --
    I am NaN
  28. Re:No buy by manno · · Score: 3, Informative

    And you think that the A64 and P4 are clean and squeaky?

  29. It's because by RealProgrammer · · Score: 5, Funny

    ... for the first time, they're releasing the chip for a stable OS first.

    It used to be that testers only had an unstable testbed OS (designed primarily to run the same company's office suite) to use for validation. Testers were never quite sure before where the blue screens, lockups, funny noises, and billowing smoke actually originated.

    (Relax, it's just a joke).

    --
    sigs, as if you care.
  30. Re:A flawed design kept alife. by CountBrass · · Score: 2, Insightful

    No, that's what you get when you build something really complicated. The clever bit is that they still work despite the errors.

    --
    Bad analogies are like waxing a monkey with a rainbow.
  31. Re:No buy by mr100percent · · Score: 5, Informative

    All chips have errata, which are customarily well documented and published on the vendor's web site. BTW, errata can be something as simple as a correction to the datasheet. Most are usually minor and are dealt with by the compiler. For example, if there's an error with calculations dealing with a certain register and decimal values, the compiler would just not use that register for the calculations.
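    To make the "compiler just avoids that register" idea concrete, here is a toy sketch. The register names, the flagged register and the `allocate` helper are all made up for illustration; no real compiler or erratum is being described.

```python
# Toy illustration only: register names and the erratum are hypothetical.
ALL_REGISTERS = ["r0", "r1", "r2", "r3"]
ERRATUM_REGISTERS = {"r2"}  # assumed: r2 misbehaves for some class of calculation


def allocate(variables, avoid=frozenset()):
    """Assign each variable a register, skipping any register listed in `avoid`."""
    usable = [r for r in ALL_REGISTERS if r not in avoid]
    if len(variables) > len(usable):
        # A real compiler would spill to memory instead of giving up.
        raise ValueError("not enough usable registers")
    return dict(zip(variables, usable))


assignment = allocate(["a", "b", "c"], avoid=ERRATUM_REGISTERS)
print(assignment)
```

    The cost of such a workaround is one fewer register to hand out, which is why it is invisible to anyone who is not writing the compiler.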

    The documented and known errata are not what you should be concerned with. It's the unknown ones that freeze your computer or cause all robots to attack their masters.

    If someone's complaining about this, they should just turn off their computers, because as we ALL know, every operating system (the OS is what runs on chips that have the errata) is also shipped with hundreds, if not thousands, of known bugs. You're not going to find a perfect chip in the real world. How many errata did the G4/G5 have? By comparison the IBM PowerPC 970FX has 24 errata, none of which is planned for a fix. When you consider the 970FX is a fairly mature chip, 34 errata on a new chip is hardly newsworthy. As transistors get more and more compact and miniaturized, I'm sure we're bound to see more.

  32. I like this comment by jm91509 · · Score: 4, Funny

    AE 16:

    Show-stopper but only observed by Intel so far. Also, any OS developer who codes like this deserves this one.

  33. Re:One MAJOR flaw by 99BottlesOfBeerInMyF · · Score: 2, Interesting

    Mac PowerBooks and G5s are WIDELY used as THE computer for editing film on. The new MacBook does not properly run Final Cut Pro 4, one of the biggest names in editing software. BIG mistake Apple, big mistake.

    Ummm, because some company is going to run out and buy new machines right away and expect the software to have been ported, even though anyone who follows either the video editing or Apple news knows they announced Final Cut Pro would be ported in March? Do people really use iMacs for pro video editing? I'd think they would be going with towers, which work fine now and will likely not be Intel before March, or with PowerBooks, which won't ship till Feb, only a month before Final Cut Pro is ported. The only people who might get burned by this are the clueless.

  34. Re:34 design flaws and only 1/4 faster.... by aftk2 · · Score: 2, Funny

    Yeah, I know - there really isn't that much difference between a 1.8GHz Core Duo and the 1.8GHz dual-core G5 in the current PowerBooks.

    Er. Wait a minute. There's no such G5 in a PowerBook? The best we had was a single-core 1.5GHz G4? Oh - well perhaps there is a substantive difference in chips, after all.

    --
    concrete5: a cms made for marketing, but strong enough for geeks.
  35. Time for article moderation. by blair1q · · Score: 2, Interesting

    Let's not just moderate comments.

    I want to be able to moderate articles for depth, due diligence, and bias.

    This one's going to sit at top level for quite some time, trolling in everyone until they read the comments and discover they shouldn't have bothered.

  36. Re:34 design flaws and only 1/4 faster.... by jcr · · Score: 2, Insightful

    No, he said "something as good as Altivec."

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
  37. Re:Faster? Or under pressure from Apple? by davidsyes · · Score: 3, Interesting

    Hmmmm....

    What was the (newsworthy or not) bug per CPU per release count BEFORE switching to Intel? What happened to all that new-fangled "chip simulation" stuff? Seems if these errata are not just typos and such, then the SIMulation needs some STIMulation to be more useful.

    I wonder if AppTel did a "test design" before the Apple side of the house went to market. As for "finding the bugs faster", I am wondering if Apple found them and told Intel, "fixem or we go back to IBM, even if IBM charges more money to come back-but you can be sure we won't pay YOU over flaws we specced to be avoided...", assuming Apple could foresee and document what to avoid.

    As for Intel being "more honest", heck, I am willing to assume Apple has a better branding position than Intel, and Apple is not going to stand for Intel using its mammoth inventory and factory count to roll over people. Any heavy computer user-- particularly Mac users who make money by USING their computers in small businesses-- will not tolerate Intel chips if things don't turn around.

    And, finally, I imagine Jobs will do a war-dance job on Intel if they think ONE bug fix is all that's required or if they think they can get away with fixing only ONE bug. But, if they are firm on fixing only ONE "BUG", then maybe they have refunds, refurbs, exchanges, chip-swaps... and/or a new chip in the pipeline...

    --
    Previously: "Linux... Toward the Sunrise..." Now: "Linux... Toward the-- No, now, part of Every Sunrise"
  38. Hmm is only Apple using Core Duos? by podperson · · Score: 2, Insightful

    I've heard rumors that some small PC manufacturers, such as Dell and Gateway, are selling computers using this CPU.

  39. Intel should take google's lead by redcircle · · Score: 2, Funny

    Just leave it in "BETA"; Google does a good job at that.

  40. "85 pages" is a misleading comment. by wild_berry · · Score: 5, Informative

    Your comment is misleading. The document lists only 61 errata and contains their respective details. The initial table of errata -- table 5 -- is only four pages long (begins 13 and ends 16) and is most likely to group the problems by the wafer families; the next two pages reiterate the errata for each given brand name of AMD K7/K8 chip; all but one of the remaining pages detail the errata and their suggested workarounds/fixes. The last page is a list of extra resources.

    I don't dispute your comment regarding the experience of a chipset designer.

  41. Re:Should've gone with AMD by Overly+Critical+Guy · · Score: 2, Funny

    Sir, what are you doing? This is Slashdot, where everybody for some reason has a hard-on for AMD and ignores their flaws while pointing out Intel's to further their fanboy agendas. For crying out loud, we almost had a moment of calm, rational reasoning there. It's almost as if you're suggesting that the submitter is blowing things out of proportion, and that is IMPOSSIBLE HERE! Our system is fool-proof. Good day.

    --
    "Sufferin' succotash."
  42. Re:A flawed design kept alife. by Zathrus · · Score: 4, Interesting

    Well, that's what you get when you stick to crufted designs and try to keep them at all costs although there are known better archtectures. It's just like code: it gets unmaintainable over time.

    Ah. Ok. So then -- do these "known better archtectures [sic]" have no bugs then? Significantly fewer bugs? Are the bugs less severe? And how do they compare to the Intel/AMD architectures in terms of speed? I can assure you that I can make a chip that is 100% bug free -- it's also going to run somewhere in the vicinity of the original 8008.

    Frankly, I doubt you know all that much about the real ISA that Intel or AMD execute on their cores. The x86 instructions are never executed -- they're translated into an internal only ISA that doesn't look anything even vaguely like the x86 ISA.

    I'm so sick and tired of all these kids out of college whining about the x86 ISA. And yeah, I was there once too. But you know what? That decrepit, horrible, ghastly ISA has outlasted every single competitor, has been upgraded from 8-bits to 64-bits without losing backwards compatibility, and runs far, far faster than every chip that's tried to take away the title. And costs less. Intel's proven the doom 'n' gloom wrong every time -- including with their latest transition off the Netburst architecture. AMD has as well (I give Intel props because for decades they were the only real designers for the x86 ISA; AMD is pretty much responsible for the latest incarnation as x86-64 though).

    If you look at any of the modern chip architectures then none of them fall nicely and neatly into "CISC" or "RISC". The Power architecture is awfully CISC like in some ways. The x86 (the classic CISC) doesn't use a complex ISA internally, it has pipelining, branch prediction, caching, etc. -- all classic RISC subsystems that were never supposed to work on CISC. Everyone is multi-core now (to various extents).

    The x86 architecture isn't going anywhere. If anything Apple's move should've reinforced this concept -- the fact of the matter is that Intel spends more in R&D than every other (general purpose) chip maker on the planet. Combined. And sells their product for less. That kind of R&D budget makes up for a lot of paper shortcomings.

    Welcome to the real world.

  43. Re:3 Reasons by MORTAR_COMBAT! · · Score: 2, Interesting

    if apple doesn't have a serial number -> internal model number table, i would be heartily surprised. instead of asking all these questions about mirrored doors and DVI ports, why not just ask for the product serial number (which you'll need anyway to tie into warranty service)? or better yet, since they've already registered the serial number to their account, you just look up their account and see which machine they have. for larger accounts (several machines) asking for the serial number is more than appropriate, it would be necessary.

    my problem with the ipod versioning then becomes "the serial number on the thing is too damned hard to read".

    even on my thinkpads, yes there are "Thinkpad R-31" but that is hardly enough when needing detailed technical support, that is why there is easily available "real" type information (e.g. 2656-MU5) when you get down to technical support.

    --
    MORTAR COMBAT!
  44. Re:There isn't anything out of the ordinary about by SharpFang · · Score: 2, Informative

    Therefore its expected that a chip fabricated on a substrate whose minimum feature sizes are half those of the other chip and whose complexity is double the other chip would have 4x the errata items of the other chip.

    Complexity of the CPU contributes some to the amount of bugs - more project work = more bugs, though only in cases of introducing new algorithms, not in case of adding "more of the same" - dual core CPU is NOT supposed to have twice as many bugs as single-core counterpart, because the two cores are identical, contain the same flaws as the single core, and new ones are introduced only by the extra glue logic that makes it "dual". Twice the complexity usually means twice the number of gates, not twice the difficulty of design - stuff like cache memory swallows a major part of available space but 64KB of cache is associated with the same number of bugs as 4MB of it. So not x2 by complexity. At most x1.5 or so.

    And the errors are not manufacturing flaws, they are design flaws / software (VHDL) bugs. If I write a program twice as long as the original and save it to a hard drive of double the capacity, am I expected to have four times as many bugs? The new technology has its own share of problems but they are to be caught before releasing the chip from the factory, and a chip that has a technology-related fault is just faulty and should be replaced. It has nothing to do with what appears in errata.

    So - the new CPU can have more bugs than the old one. But not four times as many!

    --
    45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  45. Given the R&D costs... by jd · · Score: 3, Interesting
    I'd expect very close to no bugs in either. The costs involved in carrying out comprehensive design analysis, specification verification and implementation verification are virtually zero compared to the cost of producing the initial run of actual silicon.


    You also have to bear in mind that designs are modular and have limited connections, so N transistors is not a meaningful number - you should only be concerned with the number of modules and the number of interconnects. (eg: a 32-bit register will obviously take more transistors than an 8-bit register, but both are simply cut-and-paste copies of a 1-bit register. So long as you have the 1-bit form correct, there is no increase in complexity no matter how wide the register becomes.)


    As for the interconnects - if you have N modules, you have an upper limit of N! possible interactions, if you can string any possible combination together. That's a big number, even for small values of N. But most of those don't exist. You cannot feed the output of one operation directly into the input of another. There are some special cases where there is a chain of events, but it is not something you can program with total freedom. Many operations just produce a result which is pushed back into the registers. Thus, N modules will produce only a little more than N interactions of interest. That is a much more manageable number.
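    For a feel of how far apart the factorial upper bound and the "roughly N" estimate are, here is a quick sketch. The module counts are arbitrary illustrative values, and `orderings` is just a name picked for this example.

```python
import math


def orderings(n):
    """Upper bound if any module could be chained to any other in any order: n factorial."""
    return math.factorial(n)


# Arbitrary module counts, chosen only to show the growth rate.
for n in (4, 8, 16):
    print(f"{n} modules: {orderings(n)} possible orderings vs about {n} interactions of interest")
```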


    Then you need to consider that processors aren't "open floor plan". They are highly segmented. The term "floating point unit" literally does refer to a definable segment of the chip that is designed for floating point work. Again, from the standpoint of reliability, you can test each unit independently before doing an integrated test, so unit tests don't need to concern themselves with overall complexity or the number of other units out there.


    Next up is the cost of a recall. Recalls are expensive. From a pure profit standpoint, you want to spend less on QA than you'd spend on a recall, but the less you spend on QA, the more you are likely to end up spending on that recall. The ideal is to reduce the number of potentially serious bugs to the point where any further initial clean-up will cost more than the money lost in cleaning up afterwards. Less QA than that will cost more than it saves. More QA than that will also cost more than it saves unless it expands the market (ie: the chip becomes good enough to be used in mission-critical systems such as life-support or fly-by-wire systems), but is sometimes good to do anyway for PR reasons.


    Finally, not all transistors are "important". Once you know the cache algorithm works, the actual cache memory is irrelevant - memory is rarely implemented "incorrectly", it doesn't "do" anything (the active part is the algorithm), it's just heap.


    With modern software verification tools, chip validation suites and the high level of understanding of microelectronics, an average of one bug for every four or five instructions is high. I would consider a chip with a third as many bugs to be only just acceptable for home use, and a thirtieth as many for operations in which any significant number of people would be put at risk. The extra cost would be minimal (compared to all the other costs) and would still be much less than the cost to Intel of the Pentium divide bug or to Transmeta of the flaws in their initial Crusoe chips.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:Given the R&D costs... by OOGG_THE_CAVEMAN · · Score: 4, Insightful

      I think your estimates are *way* off.

      Silicon fab facilities are extremely expensive and capital intensive, but they produce shitloads of chips. The process scales; making 1000 wafers in these fabs is as easy as making one.

      Engineering analysis of complex IC designs is a perfect example of combinatorial explosion. Each bit of state in the chip doubles the state space in which bugs can exist. Yes, *most* of that state is in the cache which has regularity in its structure, but that regularity didn't happen by accident: it was *designed* that way.
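      The "each bit doubles the state space" point can be put into numbers with a short sketch. The rate of one billion states checked per second is an assumption picked purely for illustration, not a figure for any real verification tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def state_count(bits):
    """Number of distinct states that `bits` bits of state can hold."""
    return 2 ** bits


def years_to_enumerate(bits, states_per_second=1e9):
    """Time to visit every state once at an assumed, illustrative checking rate."""
    return state_count(bits) / states_per_second / SECONDS_PER_YEAR


for bits in (16, 32, 64):
    print(bits, state_count(bits), years_to_enumerate(bits))
```

      Even a tiny amount of state rules out brute-force enumeration, which is why the irregular control logic is where exhaustive testing stops being an option.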

      You can only test to a spec, and if the spec is imperfect and has gaps, you will leave space for bugs. Given that specs are written by engineers, they cannot be nearly complete for anything other than the most trivial circuits; the infrastructure used to support engineering of non-trivial circuits could itself have bugs.

      The part of the spec that covers the cache is simple, and can conceivably be error-free and well-tested, and perhaps with methods that are amenable to mathematical proof. But that's not where the errors crop up. The errors crop up in the hugely complex mechanisms that handle all the pipelining, branch prediction, translation to microinstructions, handling of interrupts, etc., etc., that are not highly regular and modular and are not easy to spec, and are not easy to approach with formal methods.

  46. Well this goes along with... by groman · · Score: 2, Funny

    Well this goes along with the new Apple announcement for a compatibility layer that recreates a genuine Mac OS 9 experience on an Intel-powered Mac. ... I'll shut up now.

  47. Re:A flawed design kept alife. by ooze · · Score: 2, Interesting

    That's exactly my point. All the effort that has to be put into it just to make it still work at all. And it's not just in Intel's R&D that all this effort has to be spent. This register-starved PU and this horrible MMU. It's funny how many design papers you read from people who really wanted to be inventive and bring up some clean designs. And in the introduction of those papers it's almost always a good bet to expect some sentence in the spirit of "Well, we'd rather do it this and this way, to have a clean, efficient and flexible design, but due to the xxx restriction of the x86 architecture, which is the dominant one on the market, we have to do the following suboptimal workarounds:" and then comes a list. Those kinds of sentences I have read in Java whitepapers (x86 is the very reason it's a stack engine), in the L4 kernel documents, in quite a few comments in compilers, and the list goes on.
    Up to about 15 years ago x86 was ok. Up to about 10 years ago it was bearable. Ever since then it's been a mere roadblock for software and hardware development. One that has to be steered around with much effort on a daily basis. Most people just don't notice it anymore, because they got used to it. Intel builds Ford Ts. They have a big advantage in manufacturing methods and in economy of scale. And it sure has its merits. But even the Ford T wasn't built for 20 years. If Ford did what Intel does we'd still have to start the car at the front with a lever. And actually we do. We start in real mode.

    --
    Just because I can imagine doing a hippopotamus, doesn't mean I'd like to do it.
  48. Re:Thank you by Twanfox · · Score: 2, Informative

    I think you have that last sentence backwards, or at least, incorrect. AMD chips run at a slower clock speed, but do more per clock cycle than the Intel chips do. While Intel chips are pushing 3GHz and faster, AMD chips are not clocked nearly as fast, and yet remain competitive in terms of 'work done'.
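    A rough way to see the 'work done' argument: throughput is clock rate times instructions completed per clock. The clock and per-clock figures below are made up to show the arithmetic and are not measurements of any AMD or Intel part.

```python
import math


def instructions_per_second(clock_ghz, ipc):
    """Throughput = cycles per second * instructions completed per cycle."""
    return clock_ghz * 1e9 * ipc


# Made-up figures, not benchmarks of any real chip.
high_clock = instructions_per_second(clock_ghz=3.0, ipc=1.0)
low_clock = instructions_per_second(clock_ghz=2.0, ipc=1.5)
print(high_clock, low_clock, math.isclose(high_clock, low_clock))
```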

  49. Re:Thank you by theLOUDroom · · Score: 2, Interesting

    AMD has always had more bugs, and some far more serious than the Intel one that sparked enough consumer backlash (out of panic) to have a recall.

    I have a hard time believing this is true.

    I might believe that AMD usually has more bugs, or has more bugs cumulatively, but the number of bugs, being a RANDOM variable, is quite unlikely to be so well behaved that the number of Intel bugs has NEVER exceeded the number of AMD bugs. I would like to see a source for your statement.

    --
    Life is too short to proofread.
  50. Re:Hardware vs. Software testing by ciroknight · · Score: 2, Informative

    Well then your point is flawed, because as any manufacturer of CPUs will tell you, errors will crop up after they are taped out and produced. AMD certainly is no stranger to it, neither is Freescale or IBM. Hell, there are smaller processors used in cellphones and calculators that have errors much worse than anything Intel's ever released, and yet you never hear about those. Why? Because these kinds of errors are trivial to fix in software.

    Secondly, no, these chips are probably revision 8 or 9 internally; they'll typically do a few runs at a time to make sure that yields are where they want them to be, and that mechanically the chip checks out. However, you can not do intrinsic debugging at this level, because of the simple supply problem; there are not enough chips made at this point to get all of your engineers looking at them. This is why most manufacturers won't catch an error until the first production run is underway, and by then it's far too late to go back to your design drafts, fix a bug, and re-tape the processor. It'd delay the product by 4-6 months; you've got to remake all of your lithograph templates and make sure they're all exactly created to spec, you've got to re-send out all of these plates to all of your fabs, you've got to then go through recert and make sure that the chips work (yes, that means you have to make more wafers of bad chips), and then you're still looking at debug time.

    And for what? Your processor's accidentally got a single instruction that's slightly flawed, which can be checked and fixed in software (if (value == (INTEL_DEBUG_VALUE && expected_value)) { intel_fix(); } ).
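    As a sketch of what "checked and fixed in software" can look like: detect the operand pattern that trips the flaw and route around it. Everything here is hypothetical, including the trigger value, the simulated broken multiply and the function names; it does not model any actual Core Duo erratum.

```python
# Hypothetical erratum: the "hardware" multiply is wrong for one operand value.
BAD_OPERAND = 0xDEAD  # made-up trigger pattern


def hardware_multiply(a, b):
    """Stand-in for a flawed instruction: wrong whenever an operand hits the trigger."""
    if a == BAD_OPERAND or b == BAD_OPERAND:
        return 0  # simulate the erratum
    return a * b


def software_multiply(a, b):
    """Slower but correct fallback: shift-and-add for non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result += a
        a <<= 1
        b >>= 1
    return result


def safe_multiply(a, b):
    """Use the fast path unless the operands match the known-bad pattern."""
    if a == BAD_OPERAND or b == BAD_OPERAND:
        return software_multiply(a, b)
    return hardware_multiply(a, b)


print(safe_multiply(6, 7), safe_multiply(BAD_OPERAND, 3))
```

    The check costs a comparison on every call, but only the rare triggering case pays for the slow path.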

    Lastly, if you need an example of any product shipping flawed, take a look over at the car industry. There are recalls, after recalls, after recalls on parts that are often bad, and require a new bolt to fix something. Think of this as the same thing, only you don't have to take your car into the garage; you are likely to never know, speak with, or hear of the people who are fixing the problems mentioned in this article. These are problems for OS developers, who are working in debug mode, who *might* run into this problem if and only if some crazy absurd bit-pattern is laid out just right in a register when a command is executed (for example).

    So please, before you tell a Computer Engineer how to make a microprocessor, make sure you know what you're talking about. It's better that they catch these problems in the weeks after release so that the OS developers will have time to fix them before their next major version goes out and they actually have to release a patch to deal with it. It's better that they catch them before they run the next production run, just in case there is an error that warrants fixing (and they've only discovered ONE of such errors, and they are probably going to wait until Core Duo rev B to do it). And it's better that they catch them at all, instead of a year down the line when everyone starts to realize their floating point math is going screwy on their multimillion dollar simulations.

    --
    "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush