Slashdot Mirror


OpenSUSE Beta Can Brick Intel e1000e Network Cards

An anonymous reader writes "Some Intel cards don't just not work with the new OpenSUSE beta, they can get bricked as well. Check your hardware before you install!" The only card mentioned as affected is the Intel e1000e, and it's not just OpenSUSE for which this card is a problem, according to this short article: "Bug reports for Fedora 9 and 10 and Linux Kernel 2.6.27rc1 match the symptoms reported by SUSE users."

129 comments

  1. Badly written firmware. by psergiu · · Score: 4, Insightful

    Any decent firmware for a device should not allow the user to accidentally destroy the device. Looks like Intel skipped on Q&A.

    --
    1% APY, No fees, Online Bank https://captl1.co/2uIErYq Don't let your $$$ sit in a no-interest acct.
    1. Re:Badly written firmware. by reashlin · · Score: 4, Informative

      Any decent device driver should also not be writing to the firmware, which I'm guessing is how the device can become bricked.

    2. Re:Badly written firmware. by arkhan_jg · · Score: 4, Informative

      A few years back, Mandrake merged a kernel patch in their new release that would accidentally brick certain LG CDROM drives using old firmware versions when it checked if it had writing capabilities. This was largely LG's fault for re-using a valid command code to mean 'start flashing me now' instead, and of course, no firmware was then forthcoming, leaving the drive in an unusable state.

      LG ended up replacing old affected drives, and the kernel patch was rewritten. Mandrake bore the brunt of the reputation hit though for quite a while, which I suspect will happen to SuSE.

      The e1000e driver is the new one for pci-e based intel pro 1000 chipsets, with the old pci and pci-x cards unaffected with the original e1000 driver. Still, that's going to be quite a lot of cards affected.

      --
      Remember kids, it's all fun and games until someone commits wholesale galactic genocide.
    3. Re:Badly written firmware. by SanityInAnarchy · · Score: 1

      I don't think that's entirely possible.

      Consider a cooling system. Is there any reason that shouldn't be software-controlled? If it is, the user could conceivably turn off all fans, thus overheating the device.

      Sure, you could take away enough functionality that the user can't do that. But that's the tradeoff -- functionality. No decent gun wouldn't allow you to shoot yourself in the foot.

      But then, I do think Linux should be in Intel's Q&A, especially for something like a network card.

      --
      Don't thank God, thank a doctor!
    4. Re:Badly written firmware. by Anonymous Coward · · Score: 1, Insightful

      Any decent firmware for a device should not allow the user to accidentally destroy the device.
      Looks like Intel skipped on Q&A.

      I think that's a bit unfair. To put it in terms of our favorite analogy subject, it would be like saying that a well-engineered car should not allow the driver to accidentally destroy the vehicle.

      There's only so much you can account for in silicon. Any device that has a function that can severely damage itself (e.g. a car being able to turn on with low oil levels or a motherboard with a firmware update function) will eventually get damaged by somebody, somewhere.

    5. Re:Badly written firmware. by PC+and+Sony+Fanboy · · Score: 1

      Ah, but when even a child can get the keys and crash the car, there is a problem.

    6. Re:Badly written firmware. by Anonymous Coward · · Score: 3, Informative

      It appears that the bug is a combination of memory-mapped control registers which can enable flash writing and another (graphics related) problem which causes random data to be written to that area. The driver itself does not attempt to rewrite the firmware.

    7. Re:Badly written firmware. by jacquesm · · Score: 1

      A good design would still failsafe by shutting down when the components reach their maximum rated working temperature.

      That's what engineering is all about: anticipating modes of failure and dealing with them.

    8. Re:Badly written firmware. by griego · · Score: 1

      It's "QA", not "Q&A". Sorry, pet peeve of mine. :P

    9. Re:Badly written firmware. by jandrese · · Score: 4, Interesting

      Except for the multitude of cards that require you to basically reflash the firmware as part of the initialization? Cheap 802.11 cards are notorious for this, and it's a pain because it means you have to ship a binary blob with the driver and all of the licensing headaches that entails.

      --

      I read the internet for the articles.
    10. Re:Badly written firmware. by Anonymous Coward · · Score: 0
    11. Re:Badly written firmware. by Anonymous Coward · · Score: 1, Interesting

      Let's say intel does test with Linux, and releases a GPL driver. Then debian programmers make a few changes to eliminate compiler/valgrind warnings (remember ssh?). Oops, they didn't know what they were doing and now your ethernet card is bricked. But, hey, they got rid of that compiler warning. Open source FTW!

    12. Re:Badly written firmware. by ChrisJones · · Score: 2, Insightful

      err, what's the point of having firmware if your driver can't talk to it?

      Also, this isn't about firmware, it's about Non-volatile memory. The chip uses it to store things like its MAC address.

      You fail.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    13. Re:Badly written firmware. by ChrisJones · · Score: 1

      Intel do do QA on Linux. One of their engineers reported on the Ubuntu bug for this issue that they had reproduced it while testing Alphas of Intrepid.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    14. Re:Badly written firmware. by pipatron · · Score: 1

      How can an invalid MAC address brick the hardware? Lesson one is to check user input, no matter if it's strings in a URL or user-written data in NVRAM.

      --
      c++; /* this makes c bigger but returns the old value */
    15. Re:Badly written firmware. by ChrisJones · · Score: 5, Informative

      the NVM is checksummed. If the checksum fails, the driver refuses to initialise the card.
      It seems that something is able to write garbage data to the NVM, leaving all of its settings broken.

      This isn't some database API where you get to do lots of nice high-level verification, this is twiddling bits in hardware. Of course it should be properly protected, and my discussions with Intel about this suggest that it is, and that something else is at work here, but until they release a fix, we won't know for sure.

      Also, their own DOS tools to restore NIC EEPROMs actually break the laptop NICs to the point that they won't enumerate on the PCI bus, so there is literally no hope of recovery unless you happen to have a BIOS update which will rewrite all of the memory the NIC uses.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
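
      To make the checksum behaviour described above concrete, here is a minimal sketch in Python. It assumes the layout commonly described for these parts: the first 64 16-bit words of the NVM must sum to 0xBABA, with the last of those words holding the checksum. The names `nvm_checksum_ok`, `with_valid_checksum` and `probe` are hypothetical and are not taken from the real driver.

```python
# Sketch of an NVM checksum check. Assumptions: the image is a list of
# 16-bit words, the first 64 words must sum to 0xBABA (mod 2**16), and
# word 0x3F is the checksum word. All function names are hypothetical.
NVM_SUM = 0xBABA          # assumed target sum
CHECKSUM_WORD = 0x3F      # assumed index of the checksum word


def nvm_checksum_ok(words):
    """Return True if the first 64 words sum to NVM_SUM modulo 2**16."""
    if len(words) <= CHECKSUM_WORD:
        return False
    total = sum(words[:CHECKSUM_WORD + 1]) & 0xFFFF
    return total == NVM_SUM


def with_valid_checksum(words):
    """Return a copy of the image with the checksum word recomputed."""
    fixed = list(words)
    partial = sum(fixed[:CHECKSUM_WORD]) & 0xFFFF
    fixed[CHECKSUM_WORD] = (NVM_SUM - partial) & 0xFFFF
    return fixed


def probe(words):
    """Hypothetical probe step: refuse to bring the card up on a bad checksum."""
    if not nvm_checksum_ok(words):
        raise RuntimeError("NVM checksum is not valid; refusing to initialise")
    return "initialised"
```

      A single flipped bit anywhere in the checked region changes the sum, so garbage written over the NVM is detected at probe time, but only after the damage is done. The check protects the driver from using bad settings; it does nothing to prevent the stray write itself.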
    16. Re:Badly written firmware. by PhilHibbs · · Score: 2, Insightful

      There's no way that a device can check all possible combinations of input for crash-inducing behaviour. If you think it can, go read "GÃdel, Escher, Bach". In fact, go read it anyway, it's awesome.

      Also there's a difference between a NIC and a web site: the NIC's API input comes from its owner, while the web site's input comes from its customers. If you're a piece of hardware, you do what your owner tells you.

    17. Re:Badly written firmware. by Fweeky · · Score: 1

      So what you're saying is Linux should be split into a microkernel with every driver in its own private address space?

      Good luck with that.

    18. Re:Badly written firmware. by Zero__Kelvin · · Score: 1

      Ah! ... now you are starting to understand why your parents won't let you have them ;-)

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    19. Re:Badly written firmware. by X0563511 · · Score: 2, Interesting

      Can you get us an ISBN for that book? The non-ASCII character in there got mangled by slashdot (or my browser) and all search results based on my assumptions are trash.

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    20. Re:Badly written firmware. by X0563511 · · Score: 1

      And that is what Intel is doing... when the nvram checksum is bad, it refuses to load the card assuming it is damaged.

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    21. Re:Badly written firmware. by Anonymous Coward · · Score: 1, Interesting

      Those cards need their on-card software reloaded every time because they use volatile memory to store it. Any changes to that memory are gone the next time you power off the computer. Depending on the rest of the hardware, that code could still damage the card though (software controlled voltage regulators, fans and amplifiers come to mind).

    22. Re:Badly written firmware. by Anonymous Coward · · Score: 0

      "Godel", where 'o' is an umlaut (two dots on it).

      And I am afraid there is no book by that name.

    23. Re:Badly written firmware. by Anonymous Coward · · Score: 0

      978-046502656, but you fail at being a nerd for not having read that book a long time age.

      (not that it has anything to do with the topic at hand)

    24. Re:Badly written firmware. by the_B0fh · · Score: 1

      Isn't this another reason not to use blobs?

    25. Re:Badly written firmware. by psergiu · · Score: 1

      The /. forced preview is useless if you don't actually preview the post before submitting :)

      --
      1% APY, No fees, Online Bank https://captl1.co/2uIErYq Don't let your $$$ sit in a no-interest acct.
    26. Re:Badly written firmware. by Khyber · · Score: 1

      "Is there any reason that shouldn't be software-controlled?"

      I can think of one. Software can be infected with malicious code. I'd prefer pure hardware-logic control. A simple potentiometer that controls how much power a fan receives. I can control it myself, don't need to learn any special commands/controls/programming, and it just works.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    27. Re:Badly written firmware. by norton_I · · Score: 1

      Actually, the cheap cards you are talking about wouldn't have this problem. Their firmware is not actually stored in flash, but in RAM which is why it must be refreshed on every boot. If you write a bad blob to it, a power cycle fixes it.

    28. Re:Badly written firmware. by Anonymous Coward · · Score: 1, Informative

      978-046502656, but you fail at being a nerd for not having read that book a long time age.

      That's 978-0465026562. Go home and practice copy and paste. The moving finger fails, and having failed, moves on.

    29. Re:Badly written firmware. by PC+and+Sony+Fanboy · · Score: 1

      Actually, I was saying that the children who use linux get what they deserve - it isn't exactly a user-friendly OS when you've switched from OSX, or have unsupported firmware, or use beta versions of things...

    30. Re:Badly written firmware. by dyftm · · Score: 1

      Yes, GEB is awesome. But it doesn't have anything to do with this. You can check every input, it may just take a while.

    31. Re:Badly written firmware. by SanityInAnarchy · · Score: 1

      In which case, the blame would be where it belongs -- on the open source developers. At least that way, they've got an opportunity to get it right.

      --
      Don't thank God, thank a doctor!
    32. Re:Badly written firmware. by PhilHibbs · · Score: 1

      You're right, my example was inappropriate.

    33. Re:Badly written firmware. by Anonymous Coward · · Score: 0

      In this case that would be an advantage. A card that has its firmware loaded on boot cannot be destroyed by overwriting said firmware. Power-cycle the card, and load the firmware again. Problem solved.

      And the firmware being loaded on boot means that the loader code is not a part of said firmware, and thus won't be overwritten by mistake.

    34. Re:Badly written firmware. by BarryJacobsen · · Score: 1

      I don't have an ISBN, but I know the book he's referring to is Godel, Escher, Bach.

    35. Re:Badly written firmware. by Shotgun · · Score: 1

      That depends entirely upon where you work.

      I've been at places where the quality team's job was to ask, "What the hell is this SUPPOSED to be?!"

      No design team (or documents). You're supposed to ask the developer how it is supposed to work, and then write your test cases accordingly.

      --
      Aah, change is good. -- Rafiki
      Yeah, but it ain't easy. -- Simba
  2. !Bricked by Anonymous Coward · · Score: 5, Funny

    Why won't people stop using the word brick to mean things that aren't bricked! All you have to do is use a quasi-negative reverse transponder linked to your flux capacitor to generate an inverse tachyon field, connect it to the JTAG while chanting Siaynoq and it will come right up. Sheesh!

    1. Re:!Bricked by binarylarry · · Score: 1

      Yes because such a technical term should not be bandied about lightly.

      --
      Mod me down, my New Earth Global Warmingist friends!
    2. Re:!Bricked by 3p1ph4ny · · Score: 1

      What's your definition? From TFA:

      The problem is described as "a serious issue with the potential to damage the network card in a way that it cannot be used any longer".

      I thought that was pretty much textbook "bricking".

    3. Re:!Bricked by ckthorp · · Score: 1

      You forgot the deflector dish.

    4. Re:!Bricked by clickety6 · · Score: 4, Funny

      I also hate it when people call these things bricked incorrectly.

      Bricked XBOXen, bricked PSPs, bricked iPhones and now bricked network cards.

      People, these things are not bricked! Believe me. I've tried building houses and garden walls out of them and they are absolutely fecking useless as bricks!

      Please use the correct term in future. These items are not bricked, they are just FUBI (fecked up by incompetence)

      Thank you...

      --
      ----------------------------------- My Other Sig Is Hilarious -----------------------------------
    5. Re:!Bricked by Anonymous Coward · · Score: 0

      WHOOSH

      That's the sound of a brick sailing over your head.

    6. Re:!Bricked by RedK · · Score: 1

      And I hate people that don't know that the plural of box is boxes, not boxen. Xboxes. BOXES!!!

      --
      "Not to mention all the idiots who use words like boxen."
      Anonymous Coward on Monday August 04, @06:49PM
    7. Re:!Bricked by darkonc · · Score: 1

      Actually, it takes four of them (and a boatload of epoxy) to make a brick -- but it's all the same result.

      I think, however, that I can safely claim that the 'brick' moniker is meant to imply the kind of result you get when you smash the device with a brick.
      It's my story, and I'm sticking to it (until something better comes along)

      --
      Sometimes boldness is in fashion. Sometimes only the brave will be bold.
  3. That's not what Bricked means!!!!oneone1 by OverlordQ · · Score: 5, Funny

    I hate it when people keep incorrectly using brick . . . . wait, what? They used it right? Oh . . . my bad, carry on.

    --
    Your hair look like poop, Bob! - Wanker.
    1. Re:That's not what Bricked means!!!!oneone1 by Anonymous Coward · · Score: 0

      I hate it when people keep incorrectly using brick . . . . wait, what? They used it right? Oh . . . my bad, carry on.

      Except they can be "unbricked" with a BIOS update, so it's not really a brick.

    2. Re:That's not what Bricked means!!!!oneone1 by ChrisJones · · Score: 1

      Except that that's not true.
      You *may* be saved by a BIOS update, but only if that update happens to include the LAN option ROM and NVM area.
      All of the publicly available BIOS images for my Thinkpad X300 do *not* include the LAN portions and so were unable to rescue my corrupted NVM.
      Ironically, Intel do ship rescue tools for this sort of thing, but while they run on laptop parts, they are not supposed to, so trying to use them actually made things worse: the NIC refused to initialise and didn't even appear on the PCI bus.
      The only solution (as confirmed to me by Intel) was to RMA the laptop.

      That's about as bricked as you get.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
  4. Kernel fix, perhaps? by Anonymous Coward · · Score: 4, Informative

    Kernel 2.6.27-rc7 has a changelog entry that reads:

    Christopher Li (1):
                e1000: prevent corruption of EEPROM/NVM

    1. Re:Kernel fix, perhaps? by enodev · · Score: 1

      No, it's the e1000e which is affected (note the trailing e)

    2. Re:Kernel fix, perhaps? by neonprimetime · · Score: 5, Informative

      From: Christopher Li

      Andrey reports e1000 corruption, and that a patch in vmware's ESX fixed it.

      The EEPROM corruption is triggered by concurrent access of the EEPROM read/write. Putting a lock around it solve the problem.
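
      The locking fix described in that changelog entry can be illustrated with a minimal sketch in Python. This is not the kernel patch: the `Eeprom` class, its two-step "select, then transfer" access and the `hammer` helper are hypothetical stand-ins, assuming only that an EEPROM access is a multi-step sequence that must not be interleaved with another one.

```python
import threading


class Eeprom:
    """Toy EEPROM: every access is a two-step sequence (select, then transfer)."""

    def __init__(self, size=64):
        self._cells = [0] * size
        self._addr = 0                     # shared address latch
        self._lock = threading.Lock()      # serialises whole accesses

    def write_word(self, offset, value):
        # Holding the lock across both steps stops another thread from
        # changing the address latch between "select" and "transfer".
        with self._lock:
            self._addr = offset                        # step 1: select
            self._cells[self._addr] = value & 0xFFFF   # step 2: transfer

    def read_word(self, offset):
        with self._lock:
            self._addr = offset
            return self._cells[self._addr]


def hammer(eeprom, offset, rounds, errors):
    """Write and read back one word repeatedly, recording any mismatch."""
    for _ in range(rounds):
        eeprom.write_word(offset, offset + 1)
        if eeprom.read_word(offset) != offset + 1:
            errors.append(offset)
```

      Without the lock, one thread could change the latch after another thread had selected its word, so the data would land at the wrong offset. With it, each access runs to completion before the next begins.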

    3. Re:Kernel fix, perhaps? by phorm · · Score: 1

      Is there a separate driver for the e1000e? Maybe both can use the "e1000" driver (I don't have that particular card) but it only bricks the e1000e.

    4. Re:Kernel fix, perhaps? by ChrisJones · · Score: 1

      e1000 and e1000e are separate drivers.
      Some of the devices which are now supported by e1000e were previously supported by e1000, so it's all a bit confusing.

      AIUI, if the part is PCI Express, it's now in e1000e.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    5. Re:Kernel fix, perhaps? by Fweeky · · Score: 1

      In FreeBSD at least, the split's not that clear cut; there's em(4) for older cards and igb(4) for newer ones, both of which share a separate e1000 module where lots of their code lives. I think the situation's similar on Linux.

  5. Re:haha! by denis-The-menace · · Score: 0

    That's because it's a closed driver.
    Only Intel knows what it does.
    Looks like it writes the NIC's firmware and cannot reset just the NIC. So it asks for a reboot.
    IOW: the NIC needs the reboot, not Linux.

    Almost all drivers in Windows are closed.

    --
    Obama's legacy: (N)othing (S)ecure (A)nywhere and (T)error (S)imulation (A)dministration
  6. Agreed, in this instance by davidwr · · Score: 2, Interesting

    Gun and air-conditioning aside, devices should not allow accidental bricking or physical damage unless it is inherent in the function of the hardware.

    For cases of loading bad firmware, the "load new firmware" instruction should have a few failsafes like magic words or what-not so it isn't accidentally invoked.

    Even better, hardware devices should have a failsafe firmware burned on silicon that can be reactivated by flipping a switch, setting a jumper, or some other hardware-action-required setting. This "failsafe firmware" may be nothing more than a stub that prepares the device to accept a new "real" firmware, but at least it will allow de-bricking.

    You don't really want this debrick/failsafe-mode to be triggerable through software alone, it's too much of an opportunity for malicious use.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
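
    The "magic words" failsafe suggested above can be sketched in Python as a small state machine. Everything here is hypothetical: the `FlashController` class, the two key values and the relock-after-write behaviour are illustrative assumptions, not a description of how any Intel part works.

```python
class FlashController:
    """Toy flash controller that ignores writes until an unlock sequence is seen."""

    # Hypothetical key values; a real device would define its own.
    UNLOCK_SEQUENCE = (0xA5A5, 0x5A5A)

    def __init__(self, size=16):
        self.flash = bytearray(size)
        self._progress = 0  # how many keys of the sequence have matched

    @property
    def unlocked(self):
        return self._progress == len(self.UNLOCK_SEQUENCE)

    def write_key(self, value):
        """Advance the unlock sequence, or reset it on any wrong value."""
        seq = self.UNLOCK_SEQUENCE
        if self._progress < len(seq) and value == seq[self._progress]:
            self._progress += 1
        else:
            self._progress = 0

    def program(self, offset, data):
        """Write data to flash only if unlocked; relock afterwards."""
        if not self.unlocked:
            return False  # stray write is silently ignored
        if offset < 0 or offset + len(data) > len(self.flash):
            raise ValueError("write outside flash")
        self.flash[offset:offset + len(data)] = data
        self._progress = 0
        return True
```

    A random scribble over the registers would have to reproduce both keys in order before its write took effect, which makes an accidental flash write far less likely. It does not stop deliberate misuse, which is why the hardware switch or jumper is suggested for the recovery path.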
    1. Re:Agreed, in this instance by Anonymous Coward · · Score: 0

      Or save yourself 50 cents a device when you sell millions and deny the issues if they come up and then only fix the issues for those with money that complain the loudest.

    2. Re:Agreed, in this instance by antifoidulus · · Score: 1

      For cases of loading bad firmware, the "load new firmware" instruction should have a few failsafes like magic words or what-not so it isn't accidentally invoked.

      So you are saying that QA engineers could learn something from BSDM? Um, sign me up for that training class!

  7. Oh great. by gandhi_2 · · Score: 4, Interesting

    Remember when Dell told customers that installing Linux on their computers voided the warranty?

    Remember how everyone on /. called bullshit?

    This doesn't look good for our cause.

    1. Re:Oh great. by Anonymous Coward · · Score: 0

      Now that I actually bought an inspiron 1420n with Ubuntu preinstalled, I can keep my warranty while only having no /support/. What a deal.

    2. Re:Oh great. by ChrisJones · · Score: 1

      Isn't there an option to buy support from Canonical?

      (Disclaimer: I work for Canonical, but not in the bits that produce or support Ubuntu)

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    3. Re:Oh great. by jimicus · · Score: 2

      This doesn't look good for our cause.

      It doesn't, but it does get me thinking: Given that it's possible for a badly behaved driver running in the kernel to stamp over NVRAM rendering hardware useless, how many pieces of bricked hardware have I thrown out over the years that were bricked because of a freak coincidence involving a rogue driver?

      Bear in mind that many drivers in Windows run quite close to ring 0 of the kernel and they would be just as capable of causing such a problem.

    4. Re:Oh great. by sloth+jr · · Score: 1

      No, I don't remember this. Given that you can buy a Dell without an OS whatsoever, and still have it warrantied, I do indeed call bullshit.

      sloth jr

    5. Re:Oh great. by slyborg · · Score: 1

      Yeah, I call bullshit again. WTH are you on about? That was about Dell not wanting to educate its script monkeys in Support about Linux. Don't have a problem with it, but it wasn't about any known issue with Linux damaging a computer.

      This issue is a specific model of ethernet card on a specific board type using a beta release sometimes getting corrupted. A BIOS restore restores function.

      This could easily have happened with a Windows beta release.

    6. Re:Oh great. by Anonymous Coward · · Score: 0

      Yeah, "easily".

      Keep dreaming. Linux breaks network cards.

    7. Re:Oh great. by Anonymous Coward · · Score: 0

      Yeah, but would dell actually put a PCI-E Ethernet in one of their boxes? Hmmm?

    8. Re:Oh great. by Anonymous Coward · · Score: 0

      If your installing Linux is the thing that breaks the computer then it should void the warranty. If the network card just breaks because it was defective then it shouldn't matter if you installed Linux or not, the warranty should cover it.

      Now for the car analogy. If I replace my car's coolant with kool-aid and it overheats because of it then my warranty shouldn't cover it. If I replace my car's coolant with a different brand of coolant and my car overheats because of a bad gasket my warranty should cover it.

    9. Re:Oh great. by darkonc · · Score: 1

      Yeah, but how many people have Windows on their machine and it works like a brick? It's not like even this beta leaves people worse off than Using MS Windows....

      --
      Sometimes boldness is in fashion. Sometimes only the brave will be bold.
    10. Re:Oh great. by gandhi_2 · · Score: 1

      Well...my friend. I 100% agree with you.

      I was simply talking about companies like Dell who simply needed an EXCUSE to avoid fulfilling warranty obligations. There were several cases where Dell owners (other OEMs did this too, I'm not picking on Dell exclusively) put a Linux/GNU Linux/BSD/etc on, only to find their warranties voided...only a few who fought loud enough got service. And these were the days before most OEMs sold systems with FOSS OSes.

      You are right, it really came down to service employees unable to trouble-shoot 37 distros over the phone and Dell et al trying to save a few bucks.

      My point is: to the business-volks, marketroids, and hot-buttered soccer moms out there, something like this doesn't sound good. It doesn't matter how many devices might be broken by Windows, how poorly the firmware was engineered, or anything else. Expect the "Linux voids warranties" idea to gain a little steam out of this. Hopefully not much.

    11. Re:Oh great. by Anonymous Coward · · Score: 0

      That is one of the major reasons that I won't use Linux, it tends to break hardware. I've never had that happen with any other OS, including Windows.

    12. Re:Oh great. by Anonymous Coward · · Score: 0

      So you mean the company that will install linux on your computer if you ask them to, said that doing it yourself voided the warranty. That would mean they have the drivers for the distro you wanted to install and if so you could get them from them under the GPL license agreement. :-)

      You have to admit they are fricken useless sometimes.

      Holy caped bat crap > What is with that intel ethernet? Anyone ask Intel to write a better driver? Anyhow, I have been using Linux since I discovered a distro many years ago and am very happy with openSUSE 11, which I run.

      The plasmoid desktop is pretty sweet even on my 5yr old box.

  8. What is really happening regarding this problem by ronch · · Score: 5, Informative

    I work on the e1000 team (including the e1000e driver) and here is what we know. A panic in another driver (believed to be the gfx driver, but this is uncertain) scribbles over the NIC/LOM non-volatile memory (NVM). This is only happening with the 2.6.27-rc kernels on ICHx systems. Since the NIC/LOM NVM is part of the whole BIOS image, other things in the system could be affected by this driver panic as well. An update of the system BIOS will restore the NIC/LOM to be operational. We have some patches under test right now that we will be releasing later today to protect the NIC/LOM NVM. That should help narrow down who is scribbling over NVM.

    1. Re:What is really happening regarding this problem by Anonymous Coward · · Score: 3, Funny

      This post was what to helpful and informative.

      It doesn't belong in the comments of a Slashdot article!

    2. Re:What is really happening regarding this problem by jacquesm · · Score: 1

      Would you like some toast with your acronym soup ? The scary thing is that it is in fact readable...

    3. Re:What is really happening regarding this problem by Alsee · · Score: 1, Funny

      You should take their crayons away until they learn to stop scribbling everywhere.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    4. Re:What is really happening regarding this problem by k.a.f. · · Score: 2, Informative

      I work on the e1000 team (including the e1000e driver) and here is what we know. A panic in another driver (believed to be the gfx driver, but this is uncertain) scribbles over the NIC/LOM non-volatile memory (NVM). This is only happening with the 2.6.27-rc kernels on ICHx systems. Since the NIC/LOM NVM is part of the whole BIOS image, other things in the system could be affected by this driver panic as well. An update of the system BIOS will restore the NIC/LOM to be operational.

      In other words, as usual, the device is NOT bricked.

    5. Re:What is really happening regarding this problem by Anonymous Coward · · Score: 0

      Thanks for posting some real details.

      Can you tell us; how come the NVM is even writeable outside of doing BIOS updates and such, in the first place?

      It's hardly inconceivable that at some point a driver in ring0 will barf and write garbage into that memory area.

    6. Re:What is really happening regarding this problem by ChrisJones · · Score: 1

      ronch: very interesting.

      I was given to understand that writing to the NVM on these parts was controlled by a hardware lock bit and so it shouldn't be possible for something to scribble over it?

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    7. Re:What is really happening regarding this problem by pipatron · · Score: 5, Funny

      This post was what to helpful and informative.

      Good thing you set it straight again with a sentence that's both incomprehensible and contains a typo.

      --
      c++; /* this makes c bigger but returns the old value */
    8. Re:What is really happening regarding this problem by ChrisJones · · Score: 3, Informative

      As I just pointed out elsewhere, that particular comment is not correct.
      A BIOS update *may* restore the NIC, but it depends purely on the BIOS update including that information.
      All of the available updates for my laptop do not include that, so the device was not recoverable. ie bricked.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    9. Re:What is really happening regarding this problem by ChrisJones · · Score: 1

      Most Ethernet drivers expose interfaces for writing to their NVM, specifically for ethtool to do its work.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    10. Re:What is really happening regarding this problem by incripshin · · Score: 1

      So, this is apparently not at all a problem with Suse or Fedora. Referring to the article and not the parent post, I hate it when people talk about distributions like they are responsible for anything at all. Distributions are 99% a collection of software written by others. If you're going to assign blame, it should be done properly.

    11. Re:What is really happening regarding this problem by Anonymous Coward · · Score: 0

      Look, I work on the e1000e team and here is what we know. Due to too many late night partying sessions during hardware Q&A the e1000e has a massive affinity towards bricking, "brickiness" as we like to call it. We realized the brickability of the e1000e after it was printed and shipped to retailers. Our one hope was that no OS would stumble upon the correct steps, but leave it to those Linux geeks to find a way. Now we're doing a massive PR stunt by having our public relations director, codenamed "ronch," post misleading comments on tech forums, hoping that the techies will then spread the propaganda and we can go back to our usual late night party...er, "training sessions."

    12. Re:What is really happening regarding this problem by T.E.D. · · Score: 3, Informative

      I've written an E1000 driver (for a realtime program on a different OS). The issue is that once you have the base address registers for the card mapped into system memory, they are there. There's no super-special secret mechanism devoted to writing only to the flash RAM.

      Oversimplifying things a great deal (so experts out there, please don't roast me over the nitty details), every PCI device in your system presents software drivers with up to 6 "Base Address Registers" (BARs). Most PCI devices really only use one or two of those. This is (mostly) the device-driver's only window into the PCI device.

      At bootup the system places the physical address of the device's control registers and memory into its BARs. When the device driver starts up later, it grabs those physical addresses and maps them into virtual memory so that software can get at them.

      Once this is done, *all* the device's control registers are available to software. If one of these registers can command the card to write data to flash (as one of the control registers on the E1000 does), then the proper (or improper) value written to that memory location by *anyone* will cause a flash write. It's that simple.
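
      The point about mapped control registers can be shown with a small simulation in Python. The register offsets, the `CTRL_WRITE` bit and the `ToyNic` and `stray_fill` names are all hypothetical; the sketch assumes only what is described above, namely that a write to one mapped register can command a flash write, no matter who performs it.

```python
class ToyNic:
    """Toy NIC whose mapped register space includes a flash-control register."""

    FLASH_DATA = 0x10        # hypothetical offset: data to be programmed
    FLASH_CTRL = 0x14        # hypothetical offset: command register
    CTRL_WRITE = 0x80000000  # hypothetical "start flash write" bit

    def __init__(self):
        self.regs = {}
        self.flash = [0xFFFF] * 8  # eight 16-bit words of NVM

    def mmio_write(self, offset, value):
        """Model a 32-bit store into the device's mapped register space."""
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == self.FLASH_CTRL and value & self.CTRL_WRITE:
            addr = value & 0x7  # low bits select the flash word
            self.flash[addr] = self.regs.get(self.FLASH_DATA, 0) & 0xFFFF


def stray_fill(nic, start, end, pattern):
    """Model a buggy driver filling a range that overlaps the NIC's registers."""
    for offset in range(start, end, 4):
        nic.mmio_write(offset, pattern)
```

      The device cannot tell a deliberate command from a stray store: a fill pattern that happens to set the write bit in the control register reprograms a flash word with whatever landed in the data register just before it.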

    13. Re:What is really happening regarding this problem by T.E.D. · · Score: 1

      On the E1000? I believe Intel has the specs online, although I couldn't find them with a quick website search. I have them handy here, but I've never written my driver to write to the NVRAM so I can't say I know entirely how it's done. However, skimming through it, I don't see anything about any "hardware lock bit".

      Think about this logically. If it's possible to flash it on purpose entirely through software, then it would be possible to do so accidentally with malfunctioning software.

    14. Re:What is really happening regarding this problem by ChrisJones · · Score: 1

      Of course, but the prevailing wisdom is that for the parts used on ICH8/ICH9 systems, a more complex series of actions is needed than simply flipping one bit.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    15. Re:What is really happening regarding this problem by Anonymous Coward · · Score: 0

      How did you understand it contains a typo, if it was incomprehensible to you?

    16. Re:What is really happening regarding this problem by Anonymous Coward · · Score: 0

      Generally in a modern OS (including Linux), memory mapped addresses for hardware are not going to be written by user-mode processes because the OS will set the page permissions such that they can't. It'll cause a page/segmentation fault. Apps would probably use a syscall, probably some kind of ioctl(), to tell the kernel to do something to the card. The kernel, and probably the card driver, then gets a chance to sanity check the operation before it happens.

      The issue here is that some other driver might have a bug that would write to the memory. If that driver runs in the kernel, this would be allowed. It might not even be as obvious as a buffer overrun in software, as the bad driver could be using DMA, telling the hardware to overwrite the wrong section of memory.

      I think there were talks about adding IOMMU hardware to modern systems to prevent rogue DMAs, but I think this was geared more towards virtualization technology, as a kernel-mode driver would have permission to write anywhere anyway.
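
The three access paths described above can be sketched roughly as follows. This is a toy model, not how any real kernel is structured: the `ToyKernel` class, the request names, the register offsets and the MTU bounds are all assumptions made up for the example.

```python
# Toy model of user-mode vs kernel-mode access to device memory.
# Every name and number here is hypothetical.

class PageFault(Exception):
    """Raised when user-mode code touches device memory directly."""


class ToyKernel:
    ALLOWED_REQUESTS = {"get_link_status", "set_mtu"}  # made-up ioctl names

    def __init__(self):
        self.device_regs = {}

    def mmio_write(self, offset, value, kernel_mode):
        # Page permissions: only kernel-mode code may write device memory.
        if not kernel_mode:
            raise PageFault("user-mode write to device memory")
        self.device_regs[offset] = value

    def ioctl(self, request, arg):
        # The driver gets a chance to sanity check before touching hardware.
        if request not in self.ALLOWED_REQUESTS:
            return -1
        if request == "set_mtu":
            if not 68 <= arg <= 9000:   # illustrative bounds only
                return -1
            self.mmio_write(0x20, arg, kernel_mode=True)
        return 0


kernel = ToyKernel()
kernel.ioctl("set_mtu", 1500)                  # checked path: accepted
kernel.ioctl("flash_now", 0)                   # checked path: rejected
kernel.mmio_write(0x10, 1, kernel_mode=True)   # buggy kernel driver: unchecked
```

The last call is the problem case: once code runs in kernel mode, none of the checks on the ioctl path apply to it.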

  9. Bad hardware design by Anonymous Coward · · Score: 1, Insightful

    There should be a jumper somewhere which physically disables writes to nonvolatile memory. That applies to all hardware. The current incident is a bug, but this could also be exploited by malware for embedding itself in the mainboard firmware or the PXE firmware on a network card. Also, bring back write-protection switches on USB sticks.

    1. Re:Bad hardware design by mapsjanhere · · Score: 1

      I can see how that will work well: "we need to change the bios update procedure. Where it says:
      "click on bios update, click yes, click yes, really, come back in 2 min"
      we need:
      "get on your knees, pull out computer from under desk
      get screw driver, take off cover, remove dust
      find flashlight and small needle nose pliers, locate jumper
      take out graphic card to access jumper hiding under oversized heatsink, move jumper, reinsert graphic card
      boot, click on bios update, click yes, click yes, really, come back in 2 min
      take out graphic card to access jumper again, move jumper back, reinsert graphic card
      put cover back on, move computer back in place, continue work"
      "don't you think we need instructions on where to pull the power cord and to disconnect the monitor cable too?"

      --
      I'm aging rapidly, I bought a new game and had no idea if my machine was good for it.
    2. Re:Bad hardware design by Anonymous Coward · · Score: 0

      The jumper could be a dip switch accessible via the mainboard I/O panel or the slot bracket of an extension card.

      Anyway, the jumper can be a software function, but it must control a hardware element which physically disables writes to the BIOS flash chip and cannot be reset by software. Likewise there could be a standard which allows the BIOS to send a command to the cards to disable flash writing until the next power cycle.
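
A minimal sketch of such a "locked until the next power cycle" latch, assuming a made-up `ToyFlash` device; no real card or BIOS interface is being described here.

```python
# Toy model of a write lock that software can set but not clear.
# The class and method names are hypothetical.

class FlashLockedError(Exception):
    """Raised when a write is attempted while the lock latch is set."""


class ToyFlash:
    def __init__(self, image):
        self.image = bytes(image)
        self._locked = False   # latch; only power_cycle() clears it

    def lock_until_power_cycle(self):
        # What the BIOS would send early in boot. There is deliberately
        # no matching unlock command.
        self._locked = True

    def write(self, image):
        if self._locked:
            raise FlashLockedError("flash writes disabled until power cycle")
        self.image = bytes(image)

    def power_cycle(self):
        # Stands in for the physical event that resets the latch.
        self._locked = False


flash = ToyFlash(b"v1")
flash.lock_until_power_cycle()   # from here on, any write() raises
```

The design choice is that the only way back to a writable state is an event software cannot fake, so a bug or malware running after boot has no command available to undo the lock.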

    3. Re:Bad hardware design by Anonymous Coward · · Score: 0

      Sheesh, kids these days...

    4. Re:Bad hardware design by the_B0fh · · Score: 1

      No no no.

      All you want is to confirm that the user really wants to perform that action, right?

      All you need is an "Accept or Deny" message box.

      Worked great for Vista!

    5. Re:Bad hardware design by Ant+P. · · Score: 1

      A hardware switch isn't really necessary. Just make the reflash instruction a 256 bit code that's impossible to come across during normal use. If you're worried about malware it could be made to only accept the instruction when the OS is initialising the card.

      If that's still not enough for you, maybe you should start worrying about them bricking your hard disk too...
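
To put the 256-bit suggestion in concrete terms, here is a rough sketch. The `ToyCard` class and its `reflash` method are invented for the example; the only real APIs used are Python's standard `secrets.token_bytes` and `hmac.compare_digest`.

```python
import hmac
import secrets

# Toy model of a card that only reflashes when given a 256-bit code.
# ToyCard and reflash() are hypothetical names.


class ToyCard:
    def __init__(self, unlock_code: bytes):
        self._unlock_code = unlock_code
        self.firmware = b"factory"

    def reflash(self, code: bytes, image: bytes) -> bool:
        # Constant-time comparison; a wrong or short code is simply refused.
        if not hmac.compare_digest(code, self._unlock_code):
            return False
        self.firmware = image
        return True


unlock_code = secrets.token_bytes(32)   # 32 bytes = 256 bits
card = ToyCard(unlock_code)

# A stray write would have to guess all 256 bits at once.
chance_per_stray_write = 1 / 2**256
card.reflash(secrets.token_bytes(32), b"junk")   # refused unless the guess matches
card.reflash(unlock_code, b"new-image")          # deliberate reflash
```

This addresses accidents, not intent: anything that knows the code, including malware that has read the driver source, can still reflash.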

    6. Re:Bad hardware design by Anonymous Coward · · Score: 0

      I do worry about that. ATA hard disks can be password protected and if the function is not disabled by the BIOS, then malware can set a random password and make all your data inaccessible in an instant.

  10. Isn't this the point? by Anonymous Coward · · Score: 1, Insightful

    It is a beta using an early release candidate kernel.

    Isn't this the sort of issue that testing releases are meant to catch? It is unfortunate that some users got bit by the bug but it probably isn't very widespread.

  11. One rule for one... by Anonymous Coward · · Score: 0

    Of course, if this was Windows the blame wouldn't be aimed at the hardware...

    1. Re:One rule for one... by darkonc · · Score: 1
      If this was Windows, we'd get a cryptic update on the next Patch Tuesday. Something about "de-ceramicizing for new Intel cards".

      Nobody would know any better -- except for the 5000, or so, users who had their systems bricked by the bug. .... of course, they wouldn't get the new update because their ethernet cards were bricked. (nor would they ever be told the real source of the problem).

      --
      Sometimes boldness is in fashion. Sometimes only the brave will be bold.
  12. Lesson finished by Anonymous Coward · · Score: 2, Insightful

    What do we learn from this incident?

    1. Beta is not for the common people.
    2. Programmers are humans, and humans are erroneous.
    3. "This program is distributed in the hope that it will be useful,
            but WITHOUT ANY WARRANTY; [...]"

    1. Re:Lesson finished by kimvette · · Score: 1

      3. "This program is distributed in the hope that it will be useful,
                      but WITHOUT ANY WARRANTY; [...]"

      Don't run any Microsoft or Apple operating system either, then. Both of them expressly declare there is no warranty.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    2. Re:Lesson finished by SanityInAnarchy · · Score: 1

      That's not "don't run", that's the "beta" part.

      The point is, if this happens to you, too bad. Unless your hardware warranty covers it, you're SOL.

      --
      Don't thank God, thank a doctor!
  13. I tried that, it doesn't work for long by davidwr · · Score: 2, Funny

    Unfortunately, due to inferior materials used in the chip's casing, exposing the device to a sufficiently strong inverse tachyon field will cause protonium breakdown which will in turn cause an endothermic reaction, which in turn will fracture the silicon along the sharp drop-offs in the resulting thermal gradient. As a side-effect of the presence of the inverse tachyons, the failure will happen in the near future rather than immediately. In other words, your device will work on the testbench but by the time you put it into production then *crack* there it goes.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    1. Re:I tried that, it doesn't work for long by J.+J.+Ramsey · · Score: 1

      But you can fix that by reversing the polarity of the neutron flow.

  14. BSDM? by davidwr · · Score: 1

    I've heard of BSD, NetBSD, OpenBSD, and FreeBSD, and others, but what is this BSDM of which you speak?

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    1. Re:BSDM? by pipatron · · Score: 1

      I bet it's not as painful as the alternatives you mentioned!

      --
      c++; /* this makes c bigger but returns the old value */
    2. Re:BSDM? by Anonymous Coward · · Score: 0

      A typo. In case you really don't know: BDSM, bondage/sado-masochism. People who are into that sort of thing usually have an agreed-upon safe word which they use instead of "stop" and which is unlikely to be used in that context, so that the "master" can safely ignore their (playful) begging. If they really want to stop, they speak the safe word. How do I know? Eurotrip.

      The hardware equivalent of a safe word would allow the card to ignore instructions to overwrite flash memory until a certain side-condition is met. That would make accidental overwrites less likely but still allow malware to exploit the flash writing capability to destroy the board.
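
As a sketch of that side-condition, here is a toy gate that ignores flash writes until a fixed sequence of values has been written to an unlock register. The sequence, the class and the method names are all made up; no real card is known to work exactly this way.

```python
# Toy "safe word" gate in front of flash writes. All names and values
# are hypothetical.

UNLOCK_SEQUENCE = (0xA5, 0x5A, 0xC3)   # made-up magic values


class ToyFlashGate:
    def __init__(self):
        self._progress = 0      # how many sequence values matched so far
        self.flash_writes = 0

    @property
    def unlocked(self):
        return self._progress == len(UNLOCK_SEQUENCE)

    def write_unlock_reg(self, value):
        if self.unlocked:
            self._progress = 0  # any further write re-locks and starts over
        if value == UNLOCK_SEQUENCE[self._progress]:
            self._progress += 1
        else:
            # A wrong value resets progress (it may itself start a new attempt).
            self._progress = 1 if value == UNLOCK_SEQUENCE[0] else 0

    def write_flash(self, data):
        if not self.unlocked:
            return False        # ignored, like begging without the safe word
        self.flash_writes += 1
        self._progress = 0      # one write per unlock
        return True


gate = ToyFlashGate()
gate.write_flash(b"stray")           # ignored: gate is locked
for value in UNLOCK_SEQUENCE:        # deliberate (or malicious) unlock
    gate.write_unlock_reg(value)
gate.write_flash(b"intended")        # accepted
```

The stray write is dropped, but code that knows the sequence gets through, which is exactly the limitation described above.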

    3. Re:BSDM? by Splab · · Score: 1

      Google it, should be pretty apparent what BDSM is.

      He is talking about a safe word for when things get a bit out of hand - but the other way around; a word to allow things to get out of hand.

    4. Re:BSDM? by Anonymous Coward · · Score: 0

      *whoosh*

    5. Re:BSDM? by Anonymous Coward · · Score: 0

      *whoosh*

      I know for a fact that a baday doesn't go *whoosh*.

  15. This with a suse bata not a finale release with Ma by Joe+The+Dragon · · Score: 1, Informative

    This is with a SuSE beta, not a final release as with Mandrake.

  16. A cooling system should have a build in over ride by Joe+The+Dragon · · Score: 1

    A cooling system should have a built-in override that turns it back on when it gets too hot. There was a movie that got this part right.

  17. Re:haha! by ChrisJones · · Score: 1

    Are you talking about the Linux driver? e1000e isn't at all a closed driver, it's in the kernel.org source, which is published under the GPL v2. That's about as un-closed as you can get.

    --
    Chris "Ng" Jones
    cmsj@tenshu.net
    www.tenshu.net
  18. Re:This with a suse bata not a finale release with by ChrisJones · · Score: 1

    It's by no means limited to a SuSE beta. Ubuntu's Alphas of 8.10 (Intrepid) are also using 2.6.27-rc kernels.
    SuSE are taking the heat because they were first to put out a warning about the issue, but it is not their fault, or specific to them.

    --
    Chris "Ng" Jones
    cmsj@tenshu.net
    www.tenshu.net
  19. Sony Beta? by tepples · · Score: 1

    1. Beta is not for the common people.

    Is that why Betamax (consumer video tape format) died and Betacam (professional video tape format family) lived?

  20. Hardware needs to be designed better by Skapare · · Score: 1

    For example, it needs 2 layers of control. One layer should be minimal and not subject to any possible reprogramming. It might even be implemented in pure hardware (not firmware). But if it is implemented in firmware, it needs to be in "read ONLY not physically possible to write" memory (e.g. legacy ROM). This layer is the one that implements the function to write the flash that controls the next layer of the device. And this first layer needs to function even if the next layer is hosed or doing strange stuff. Or alternatively, the first layer would have functions to stop, reset, and restart the next layer. This would allow the flash to be reloaded even if it is completely fouled up now. It should not be necessary to have a JTAG port for this.
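
A rough sketch of the two-layer idea, modelling the rewritable second layer as a plain Python callable. The `TwoLayerDevice` class and the command names are assumptions made for illustration only.

```python
# Toy model: a fixed first layer that can always reload the second layer.
# All names are hypothetical.

class TwoLayerDevice:
    def __init__(self, firmware):
        self._firmware = firmware   # layer 2: rewritable, may be broken

    def command(self, name, payload=None):
        # Layer 1: fixed logic that runs before layer 2 sees anything,
        # so it keeps working even if layer 2 is hosed.
        if name == "REFLASH":
            if not callable(payload):
                return "rejected"
            self._firmware = payload
            return "reflashed"
        # Everything else is handed to layer 2.
        try:
            return self._firmware(name)
        except Exception:
            return "layer2-fault"


def good_firmware(name):
    return "ok:" + name


def hosed_firmware(name):
    raise RuntimeError("corrupted image")


dev = TwoLayerDevice(hosed_firmware)
dev.command("SEND")                      # layer 2 is broken
dev.command("REFLASH", good_firmware)    # layer 1 still answers
dev.command("SEND")                      # layer 2 works again
```

The point of the split is that the reflash path never depends on the code it is replacing.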

    --
    now we need to go OSS in diesel cars
    1. Re:Hardware needs to be designed better by Anonymous Coward · · Score: 0

      At Cedaron, our hardware is similar. We can flash it easily. However, the flash module itself can only be flashed after assembly by soldering wires onto pins on the chip (we only had to do it twice, when two chips were installed before preparation).

  21. Re:A cooling system should have a build in over ri by Anonymous Coward · · Score: 0

    No, as one of the previous posters stated, it should shut down. A system should always go into the safest mode of operation if a problem is encountered. If you're reaching a critical temperature and the operator has turned off the fans, you shut down. The system doesn't know why the fans were disabled and so it should not try to start them. (What if the operator shut them down because of a fire?)

  22. So which "bricked" is this? by Anonymous Coward · · Score: 0

    Which definition of "bricked" is this? The real "bricked" meaning "doesn't work anymore" or the Apple iPhone definition of "bricked" which has come to mean "doesn't work unless you reboot it"?

  23. Was Vista Better? by Anonymous Coward · · Score: 0

    One of the Vista betas bricked an old RTL 8029 clone card I was using. Vista lasted about 2 hours installed before I decided it was rubbish and went fully GNU/Linux instead.

  24. Don't forget! by denzacar · · Score: 1

    ... to reverse the polarity.

    --
    Mit der Dummheit kämpfen Götter selbst vergebens
  25. "don't just not" by orkim · · Score: 1

    Dear Editors,

    I am writing today specifically to request that the phrase "don't just not work" be banned from this site. This phrase is ridiculous and utterly confusing. Even expanding it to the root words of "do not just not work" does little to help.

    The English language is mutilated just fine without needing extra help by poor editing.

    Thanks,

    Concerned Reader

  26. Re:This with a suse bata not a finale release with by Knuckles · · Score: 1
    --
    "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
  27. Intels NIC quality is bad, bad, bad.... by gweihir · · Score: 1

    I once had to literally throw away 18 of 24 e1000's, because they could not be made to work reliably under Linux. Unfortunately among those working were my two test cards that I got beforehand. I will not get any networking equipment from Intel ever again. It seems to me that just because of their well-known name they believe they can get away with massively substandard quality.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Intels NIC quality is bad, bad, bad.... by amorsen · · Score: 1

      Intel's e1000-series is really great. They perform wonderfully and they always work. It is unfortunate that they have been hit by this problem, but such is life. I have never seen a bad e1000.

      --
      Finally! A year of moderation! Ready for 2019?
  28. Not the enterprise class stuff by Giant+Electronic+Bra · · Score: 1

    It is pretty much THE gold standard. There are one or two other vendors that have good chipsets as well, but probably half the servers in existence are using Intel NICs, and 3/4 of the rest are Broadcom.

    When you pay 20 bucks for an ethernet card, you get what you pay for. 95% of the time it is fine for everyday use in a PC, but they aren't at the same quality level (and not even close to the same performance level) as the enterprise class products. Same is true of all the other vendors.

    --
    "Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
    1. Re:Not the enterprise class stuff by Gazzonyx · · Score: 1

      The cheapies use Realtek chips, too.

      --

      If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

  29. Mandriva notification by AdamWill · · Score: 3, Insightful

    This can also affect Mandriva Linux 2009 pre-releases. To be clear, the bug is in the upstream kernel itself, not in any code specific to any distribution.
    It affects any 2.6.27rc kernel, whether it's in a distribution or a clean upstream build.
    We have posted a full, detailed notification of the issue for Mandriva users.

  30. Tried that too by davidwr · · Score: 1

    That causes a fire to start 5 minutes ago.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  31. C+/- by Ant+P. · · Score: 1

    Some Intel cards don't just not work

    card == (card & NOT_WORK); /* 0 */

    with the new OpenSUSE beta, they can get bricked as well.

    card == (card & NOT_WORK|BRICK); /* 1 */

    Get it now?