Slashdot Mirror


Intel's Atom C2000 Chips Are Bricking Products -- And It's Not Just Cisco Hit (theregister.co.uk)

Thomas Claburn, reporting for The Register: Intel's Atom C2000 processor family has a fault that effectively bricks devices, costing the company a significant amount of money to correct. But the semiconductor giant won't disclose precisely how many chips are affected nor which products are at risk. In its Q4 2016 earnings call earlier this month, chief financial officer Robert Swan said a product issue limited profitability during the quarter, forcing the biz to set aside a pot of cash to deal with the problem. "We were observing a product quality issue in the fourth quarter with slightly higher expected failure rates under certain use and time constraints, and we established a reserve to deal with that," he said. "We think we have it relatively well-bounded with a minor design fix that we're working with our clients to resolve." Coincidentally, Cisco last week issued an advisory warning that several of its routing, optical networking, security and switch products sold prior to November 16, 2016 contain a faulty clock component that is likely to fail at an accelerated rate after 18 months of operation. Cisco at the time declined to name the supplier of that component.

59 comments

  1. Haiku error message by Anonymous Coward · · Score: 0, Offtopic

    "A crash reduces
    your expensive computer
    to a simple stone."

    1. Re: Haiku error message by Anonymous Coward · · Score: 0

      Beautiful. Almost made me cry

  2. Bricked? by AmiMoJo · · Score: 0

    The headline says products are bricked, but are they?

    It seems like a clock on the CPU is failing. Those CPUs are soldered on, so replacing them is not easy and you could fairly say that the device is bricked. But Intel claim that there is a "board level" fix. I wonder if they mean replace the CPU, or if there is some other bodge that can mitigate the problem.

    I can't imagine how a bodge would prevent a clock failing or replace it once failed. It sounds like there is a silicon level fault with the CPU, with the clock generation circuit inside it.

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    1. Re: Bricked? by Anonymous Coward · · Score: 0

      Let's ask clock boy.

    2. Re:Bricked? by ChrisMaple · · Score: 1

      Since we're not given the board level change, we don't know. I can think of a couple of possibilities. If the clock outputs are loaded too heavily, the chip's output transistors might be overheating, or aluminum traces overheating and failing by migration. If there's capacitive coupling between pins, there might be an over-voltage situation. Either might be fixed by external circuit changes.

      --
      Contribute to civilization: ari.aynrand.org/donate
    3. Re:Bricked? by Khyber · · Score: 2

      More like silicon-level change. The problem seems to be caused by a highly-biased transistor which has too thin of a gate.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    4. Re:Bricked? by ChrisMaple · · Score: 1

      That makes sense. A board level change would be to run the chip with a lower supply voltage, provided that does not degrade performance so much that the whole thing won't work.

      --
      Contribute to civilization: ari.aynrand.org/donate
    5. Re:Bricked? by arglebargle_xiv · · Score: 2

      the semiconductor giant won't disclose precisely how many chips are affected

      Thanks to a leak by an Intel integrator, we can now reveal that the number of defective devices is exactly 1,999,999.999975243.

  3. Intel dropping the ball by Eravnrekaree · · Score: 4, Insightful

    Intel for the past decade has dropped the ball. Its missing the boat on mobile and failing to push x86 chips into mobile phones has weakened their entire platform which really needs to be an "everywhere" platform. It has been clear for a while that mobile would be a majority of CPUs for a decade, why it has not pushed x86 into more phones is beyond me. It's totally incompetent, especially given x86 binary compatibility between desktop and mobile could be a selling point

    1. Re:Intel dropping the ball by drinkypoo · · Score: 5, Interesting

      It has been clear for a while that mobile would be a majority of CPUs for a decade, why it has not pushed x86 into more phones is beyond me.

      Because Intel cannot make a really low-power processor. They keep trying, and keep failing. Remember XScale? It was the fastest ARM implementation at the time, but it was also the most power-hungry.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      why it has not pushed x86 into more phones is beyond me. It's totally incompetent, especially given x86 binary compatibility between desktop and mobile could be a selling point

      Dear God,

      Please don't answer his prayer. Thank you.

    3. Re:Intel dropping the ball by Narcocide · · Score: 1

      I think they literally can't fit their crap into that small of a box. The whole x86 approach must be fundamentally too bloated, and their chip designs just simply too inefficient below a certain performance envelope. Their technology simply physically can't compete with ARM, and this current faceplant is more evidence of what happens when they try to.

    4. Re:Intel dropping the ball by TheGratefulNet · · Score: 3, Interesting

      check out intel's curie module (their really BAD arduino chip found in the arduino101 board). it's a horrible chip and intel actually made that chip the center of their reality tv show 'americas greatest makers'. little known fact: the contestants on that show tried using the chip and all its features and almost everyone failed, even with intel's help. I can't tell you how I know, but I know this for a fact.

      a prime time tv event and intel farked it up.

      only intel could make such a mess and squander such a good chance to get a message out and create some buzz and goodwill. 'makers' still refuse to use intel for many reasons and the biggest: intel still has NO CLUE what the maker market is really about. curie is not a maker chip and so far, no one really is taking that chip seriously.

      that's one example. but it seems typical of intel these days.

      --
      "It is now safe to switch off your computer."
    5. Re:Intel dropping the ball by bored · · Score: 2

      100% Agreement. Their failures to penetrate the mobile market could be understood for a few years as their technology was behind (SOC integration, power consumption, lack of modem/etc). But, over the last couple of years they have come out with some pretty good products. For most measurements, their stuff is actually better than pretty much everything but apple products.

      Yet, what does Intel management do late last year? Cancel the entire business! Which is so short sighted it's not even funny. I have no doubt that long term they are done for because of it. ARM's partners will continue to make subpar servers, until they don't, and the fat margins will evaporate and it's game over for Intel.

    6. Re:Intel dropping the ball by Anonymous Coward · · Score: 1

      The reason Intel has had a hard time breaking in to the mobile market has to do with Intel's business model.

      Intel makes a set of chips they control the specs for, and then device makers design systems around them. This works great for the PC market because Intel turns out really fantastic chips (and chipsets. Really, it's an entire platform) And it's a VERY stable and consistent platform.

      The mobile device market, however, is completely different. In that market segment ARM licenses their IP to chip makers - Everything from pre-designed SoCs, just cores, blocks to make up cores, or everything if you want to make your own chips from the ground up. Point is it's very flexible and you can make chips as cheap or as expensive as you want.

      The downside is different arm chips can be wildly incompatible and each device is basically a one-off. Each device needs a heavily customized OS image because they generally don't even boot the same way.

      Turns out, though, that extreme flexibility and customization works well for mobile device makers. Intel's one-SoC-fits-all business model does not work as well.

      Which is why Intel has started licensing cores to Rockchip. - Yeah. That's right. Go look it up. Intel atom cores in a chip not made by intel. Crazy times. I picked up a 40 dollar tablet with one of these chips and it's actually pretty good (for a 40 dollar tablet. It gets about 2 days on its battery and runs apps without crashing and not too much slowdown)

      http://www.androidauthority.com/intel-rockchip-anatomy-chip-deal-611880/

      https://en.wikipedia.org/wiki/Rockchip#Tablet_processors_with_integrated_modem

      http://www.cnx-software.com/2015/01/21/antutu-benchmark-rockchip-rk3288-arm-vs-intel-atom-z3735f/

    7. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      Price point is the answer. Intel does not want to sell chips as cheap as most mobile chips are, because it will cut into their profit margins, especially as those chips get faster.

    8. Re:Intel dropping the ball by networkBoy · · Score: 3, Interesting

      As a *former* Intel engineer I can tell you a little more about this bit:

      only intel could make such a mess and squander such a good chance to get a message out and create some buzz and goodwill. 'makers' still refuse to use intel for many reasons and the biggest: intel still has NO CLUE what the maker market is really about. curie is not a maker chip and so far, no one really is taking that chip seriously.

      Of us engineers in the trenches we had *many* makers, hackers, and all around nerds. Problem is very shortly up the food chain the view changes drastically. Marketing and management are generally clueless about it and adamantly refuse to listen to the real hackers in the company.

      As to how to ass up a design? Think of Intel as a medieval feudal system. Each Earl has his dukes, each duke has his lands with the peasants.
      Well, obviously the peasants can move as long as they're not indentured (A.K.A, can't move for a year after moving), but the dukes don't like it, because with fewer peasants they can't produce enough for the taxmen (from the Earls).

      The solution? put your thumb in other dukes' pies and force a design by committee; all the while not actually sharing useful information with the product team because a competing duke is the figurehead of the project.

      There is everything from passive aggressive to cloak and dagger interference that happens. (though normally it's just the PA Asshattery.)

      meh, glad I'm out. They offered me money to go away forever. I grabbed it (and my super ergo office chair) with both hands and bolted.

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    9. Re:Intel dropping the ball by networkBoy · · Score: 1

      deeper than that.
      Intel has a problem with Qualcomm and the latter's adamant refusal to licence any of their patents to the former. Thus Intel has to design around the patents and that leads to inefficient silicon.

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    10. Re:Intel dropping the ball by currently_awake · · Score: 1

      Arm doesn't own a chip fab, others make their stuff under license. Intel could do that, while working on designing low power chips for mobile use. Trying to shoehorn x86 chips into mobile use is a waste of resources when the market is already standardized on Arm chips.

    11. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      Intel's technology is brilliantly antiquated and twee.

    12. Re:Intel dropping the ball by K.+S.+Kyosuke · · Score: 1

      Arm doesn't own a chip fab, others make their stuff under license.

      Which is the very reason why they're forced to innovate and Intel isn't. The synthesizable design really requires ARM to squeeze everything out of it. The fact that Intel still isn't able to compete as an ARM manufacturer is even more surprising in this light, since Intel isn't really forced to compromise like that. They could work out a top-notch custom implementation if they wanted...but they don't want to. And why would they? Better milk the wintel cash cow some more...

      --
      Ezekiel 23:20
    13. Re:Intel dropping the ball by Aighearach · · Score: 1

      What would intel bring to the table? If I want ARM I can get a cheap Chinese generic, or a quality TI chip. Other quality vendors also have offerings.

      Intel's problem in trying to compete in this space is that there is very little room for premium products with lots of label value; they are commodity products, and the higher priced offerings are for higher specs, with very little label value.

      Maybe it is too many managers and marketers, or some other issue, but Intel really sucks at making things that will sell based on the datasheet. For CPUs they compete mostly at the higher end where there is label premium and they can get higher margins than the rest of the industry by being the biggest or fastest in class. That's great while CPU speeds are increasing, but they don't perform well on the industry plateaus. As more and more devices reach a speed where improvements wouldn't be noticed, this is going to be an increasing challenge for them.

      And now with this blunder, Intel's label value for embedded will be negative for years, even if they don't make any more blunders. Why would I pay the same price for the chip from the company that has a history of these types of problems? This is their biggest screw up in a while, but their past math bugs are still in people's memory, and act as a multiplier for this type of mistake. Somebody saved Intel a few cents by putting in fewer olives, and they do that repeatedly even when it screws their customers! They won't learn this lesson, because they would have already learned it.

      Compare Texas Instruments; if I pay a few cents extra for TI chips, they not only meet their higher specs but they consistently exceed them. I pay extra for TI because I get extra olives. I'm not going to pay even more for Intel in order to have a fancy holographic sticker in the package, and maybe bugs and breakdowns.

    14. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      Is that where the stuff for sale on your website came from? Or was that stuff, like your chair, stolen from another company? I ask because it looks like stuff the enterprise server labs would have tested many years ago.

    15. Re:Intel dropping the ball by networkBoy · · Score: 1

      lol, I forgot that was even up (and all that crud has long since been tossed).
      Nah, my long former employer was an electronics scrapper and this was stuff I had built up from working there.

      The chair, however, I took with full permission from my (now former) manager and the HR exit interview gal.

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    16. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      LOL. -1 from ARM superfans and I was pointing out the strengths!

    17. Re:Intel dropping the ball by Anonymous Coward · · Score: 0

      Jajaja. I was a former Intel engineer, I can confirm his words.

  4. Re:Intel chips are bricking by TheGratefulNet · · Score: 0

    are we not cheeto?

    WE ARE DEVO!

    (sorry) ;)

    --
    "It is now safe to switch off your computer."
  5. So Intel is continuing to lose the battle with ARM by Anonymous Coward · · Score: 0

    So much for knocking ARM out of the embedded processor market.

  6. Won't disclose which products at risk? by Anonymous Coward · · Score: 0

    Time to return them ALL then, no?

  7. It's a long slippery slope... by NMBob · · Score: 1

    The 8080 never had these problems.

    1. Re:It's a long slippery slope... by Anonymous Coward · · Score: 0

      the 8080 needed an external 2 phase high power driver, how is that comparable?

  8. atom chips also don't perform well in modems.. by Idisagree · · Score: 1

    Puma 6 issues (atom powered) - www.dslreports.com/shownews/The-Arris-SB6190-Modem-Puma-6-Chipset-Have-Some-Major-Issues-138411

  9. Not to worry by Waffle+Iron · · Score: 3, Funny

    Once you get a replacement CPU from Intel, it's easy to upgrade your system.

    Get a small screwdriver, and insert it in the gap under the chip near pin 1. Gently rock the CPU out of its DIP socket; you may have to alternate pulling at each end of the chip.

    The new chip's legs will be slightly splayed for use with automatic pick-and-place machines. You may need to gently bend them inwards before proceeding. Making sure that pin 1 is aligned with the marker on the motherboard silkscreen, gently push the new CPU straight down into the DIP socket. Your system is fixed!

    1. Re:Not to worry by Rick+Schumann · · Score: 1

      He still uses Intel microprocessors, LOL

      You clearly don't know anything about microprocessors, if you did you'd be using a Z80, not some antiquated thing like an 8080 that needs 3 supply rails and a multi-phase clock signal, LOL.

    2. Re:Not to worry by Anonymous Coward · · Score: 0

      This would be great advice if the Atom C2xxx series wasn't a soldered BGA chip design...

    3. Re:Not to worry by ilsaloving · · Score: 1

      Unless your machine is out of warranty, I don't see the point of this. The hassle and risks greatly outweigh just contacting your vendor and getting the part/unit replaced.

      And never mind that the average person won't have the skill necessary to do such a repair anyway.

    4. Re:Not to worry by Anonymous Coward · · Score: 0

      the z-80 had severe failure issues across many different brands, I have replaced hundreds of them

    5. Re:Not to worry by Anonymous Coward · · Score: 0

      2017: Still pedantic as hell, still no sense of humor, still a total buzzkill

      Go back to reddit, or wherever it is you killjoys come from.

    6. Re:Not to worry by Anonymous Coward · · Score: 0

      ^Joke
      ---------- < ISS

      ---------- < Jetplane

      \ O /
        |
        / \ < you

    7. Re:Not to worry by K.+S.+Kyosuke · · Score: 1

      Isn't it a jokoid until it hits the atmosphere?

      --
      Ezekiel 23:20
  10. All products? by Anonymous Coward · · Score: 0

    If they don't want to say which products are affected I will assume it is all of them.

  11. Immediate repair is cost of a virgin motherboard by Anonymous Coward · · Score: 3, Interesting

    Can't post to The Register, since they don't have ACs.

    Anyway, the issue is damage to the LPC (low-pin-count) bus clock line. This is a secondary bus where you hang old ISA-style devices, like the system FLASH. If the FLASH is the only thing in there, it will mostly render the system unbootable (so, stuff that never gets power-cycled would just keep going). But LPC can generate interrupts, and one often hangs other crap to that bus, such as i2c controllers for hot-swap bays, motherboard management controllers, and other sensors. In that case, you can expect severe runtime misbehavior.

    The issue is caused by *continuous degradation due to use*, so repairing it is easy, if costly: replace the motherboard with a new one under warranty (and even if out of warranty period wherever this kind of "stealth" manufacturing defect is not subject to warranty time period limitations, such as in Brazil). It will "reset" the counter. This is your zero-day solution to the issue.

    Depending on time-to-market for the new stepping (hardware revision) B1/C0 of the Atom C2000, you might need an interim solution, which is the "platform-level change", i.e. redesigned board with extra components that work around Intel's hardware design error. As soon as you have these, you start using these to replace any boards returned due to the defect, or start a "recall" to preemptively replace boards.

    Depending on the total cost of the board plus other components, you keep the old boards you replaced around, and when revision B1/C0 of the Atom C2000 is out, you BGA-replace them in a factory (about US$ 25 per board in large volumes, if that much), maybe replace any liquid electrolytic capacitors and other crap that ages badly, and use the boards either as new or as refurbished, depending on your corporate/regulatory ethics. This kind of repair almost always really resets the board's MTBF. If Intel supplies the replacement Atoms at no charge, the cost of repair might well be far less than the cost of the production run for boards you'd want to keep around for warranty services, anyway.

    Mind you, at 1.5 years per failure, it will be a rare piece of legislation or contract that forces more than one replacement... so, let's hope they don't replace a faulty board with a brand-new virgin but-still-timebombed board. You'd have trouble getting it replaced a second time if it fails after the warranty period.
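
    To make the timing argument concrete, here is a rough sketch of the replacement timeline. Everything in it is an assumption for illustration: the function and field names are made up, every unfixed board is taken to fail exactly 18 months after it enters service (real failures would be spread out), and the warranty clock is assumed NOT to restart when a board is swapped.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    month: int         # months since the original purchase
    in_warranty: bool  # True if the vendor still has to replace the board


def failure_timeline(warranty_months, months_to_failure=18,
                     horizon_months=60, replacement_is_fixed=False):
    """List the failures an owner would see within horizon_months.

    Hypothetical model: an unfixed board dies after months_to_failure
    of service, and the warranty runs from the original purchase only.
    """
    failures = []
    month = months_to_failure
    while month <= horizon_months:
        covered = month <= warranty_months
        failures.append(Failure(month, covered))
        # No further failures once a fixed board goes in, and no
        # further replacements once the warranty has run out.
        if replacement_is_fixed or not covered:
            break
        month += months_to_failure
    return failures


if __name__ == "__main__":
    for failure in failure_timeline(warranty_months=24):
        print(failure)
```

    With a 24-month warranty and an unfixed replacement board, the first failure at month 18 is covered and the second at month 36 is not, which is the "second time" scenario described above. Passing replacement_is_fixed=True models swapping in a board with the new stepping or the platform-level workaround.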

  12. Netgear also affected by Anonymous Coward · · Score: 0

    So far, Netgear's storage line, ReadyNAS, has some affected systems, including the ReadyNAS 3100 family. And possibly some other networking controllers like their wireless controller. Those are using various C2000 family chips.

  13. Damn interns by Anonymous Coward · · Score: 0

    They were supposed to fail after warranty period, not before

  14. Re:I wonder how many are failing by Anonymous Coward · · Score: 0

    Don't power it off, or it might not power back on.

  15. Crap have supermicro by bongey · · Score: 1

    I have Supermicro with a 2750F-O dammit

  16. Intel abandoned its engineering roots... by Anonymous Coward · · Score: 0

    and let marketers take over the company.

    They had issues with this dating back to the 80s, but the pentium and forward is where they really ruined themselves with it.

    Plus putting their engineers (both hardware and software) on Don Quixote or MacGyver (the original!) style quests to make an oversized peg fit into an undersized hole. And around the time the engineers figured out an ingenious way to make it work: PRODUCT CANCELLED. After putting months or years of overtime into a project like that, of course they are going to find fewer and fewer competent engineers working for them. The smart ones cut their losses after 5-20 years, just enough to build up a cushion for a career change. The rest are interchangeable and either aren't willing to risk jumping ship to somewhere better, or can't get somewhere better to hire them.

    1. Re:Intel abandoned its engineering roots... by Anonymous Coward · · Score: 0

      You mean like Apple?
      Marketing is all you need fuck whether it works very well, but look it's so pretty.

  17. Nope. I have the chip at home by stikves · · Score: 1

    Intel C2000 series was a dream come true for low power servers. I have an 8 core C2758 Atom server at home (from SuperMicro), and it is really a beast given that it runs at less than 10W total system power at idle or low utilization (excluding HDDs, of course, but including the MB, CPU and RAM).

    But they have dropped the ball, now in two ways:

    - There was no update to the Atom server line in the last two years. They probably do not want to cannibalize their other offerings (8 core CPU with AES and VT extensions is more than enough to host several VMs). But they also left the market empty (there is still no competition).
    - Now we learn that the chips are faulty. Without any replacement option (the chip is soldered on a particularly expensive motherboard), I'll just wait for the time it will fail.

    Given they did not provide any useful performance increase in the last two generations, heck make it three, I'm really disappointed with Intel at the moment. No more mobile chips, no more low power server chips, and three year old i7 4790K can easily compete against a recent 7700K with only 10% drop in performance.

  18. Re:Immediate repair is cost of a virgin motherboar by Anonymous Coward · · Score: 0

    Wow! You have a lot of insight into the technical issue AND the strategic implications/solutions. I suspect that you "run things" somewhere with mastermind skill. Thanks for sharing your thoughts!

  19. Re:Immediate repair is cost of a virgin motherboar by Anonymous Coward · · Score: 0

    Thanks for the sarcasm :-P Yeah, things are *never* that simple, but it is still a nice simplified view of how it *should* go.