Slashdot Mirror


Blackout Cause: Buggy Code

blanca writes "The big northeast blackout from last summer was caused in part by a software bug in an energy management system sold by General Electric, according to a story on SecurityFocus. The bug meant that a computerized alarm that should have been triggered never went off, hindering FirstEnergy's response to the train of events that led to the cascading blackout. Investigators found the bug in an intensive code audit following the outage, and a patch is now available."

23 of 377 comments

  1. Uh... by Short+Circuit · · Score: 5, Interesting

    Didn't the story used to be that after a tech did maintenance on the machine, he forgot to re-enable an alarm?

  2. Another opinion: maybe Blaster is to blame by kraker · · Score: 3, Interesting
    Bruce Schneier had a very interesting theory in his crypto-gram issue of December. The Blaster virus could be one of the reasons for the power outage:

    http://www.schneier.com/crypto-gram-0312.html#1

    A snippet of the article:
    Let's be fair. I don't know that Blaster caused the blackout. The report doesn't say that Blaster caused the blackout. Conventional wisdom is that Blaster did not cause the blackout. But it seems more and more likely that Blaster was one of the many causes of the blackout. Regardless of the answer, there's a very important moral here. As networked computers infiltrate more and more of our critical infrastructure, that infrastructure is vulnerable not only to attacks but also to sloppy software and sloppy operations. And these vulnerabilities are invariably not the obvious ones. The computers that directly control the power grid are well-protected. It's the peripheral systems that are less protected and more likely to be vulnerable. And a direct attack is unlikely to cause our infrastructure to fail, because the connections are too complex and too obscure. It's only by accident--Blaster affecting systems at just the wrong time, allowing a minor failure to become a major one--that these massive failures occur.
  3. speaking of outsourcing... by tuxette · · Score: 2, Interesting
    Does anyone know if the code-writing was outsourced abroad?

    With all the lip service about "homeland security," one ought to be concerned about anything affecting national infrastructure being sent abroad where you really don't know who is doing the coding, whether the coding projects are being further outsourced to, say, alQaidaSoft, etc.

    --
    People say I'm crazy, I got diamonds on the soles of my shoes...
  4. And Another... by Marxist+Commentary · · Score: 3, Interesting

    How about the energy companies?

    Certainly, the energy corporations must be somewhat culpable for not rigorously testing the software in the first place? It is not in the interest of a for-profit company to see to it that such systems are functioning correctly, as that cost will detract from the bottom line profit. Only when disaster strikes can they be goaded into looking into problems.

  5. Who coded this? Homer Simpson? by prgrmr · · Score: 3, Interesting

    From the article:

    When a backup server kicked-in, it also failed, unable to handle the accumulation of unprocessed events that had queued up since the main system's failure. Because the system failed silently, FirstEnergy's operators were unaware for over an hour that they were looking at outdated information on the status of their portion of the power grid, according to the November report.

    How in the world did they manage to build a system nearly completely dependent upon computers, and yet not know when they lost not just one, but two computers that monitored the system?

    Homer: Don't turn off the computer! Don't turn off the computer! Don't turn off the computer!

    "Click"
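
    A minimal sketch of the kind of staleness check that would make a silent failure like the one quoted above visible to operators. It assumes each status update carries a timestamp; the class name, the 30-second limit, and the sample times are all hypothetical and have nothing to do with the actual GE system.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedMonitor:
    """Tracks when a monitoring feed last delivered an update (hypothetical)."""

    max_age_seconds: float               # assumed staleness limit
    last_update: Optional[float] = None  # time of the most recent update

    def record_update(self, now: Optional[float] = None) -> None:
        # Remember when the latest update arrived.
        self.last_update = time.monotonic() if now is None else now

    def is_stale(self, now: Optional[float] = None) -> bool:
        # A feed that has never reported is treated as stale too.
        now = time.monotonic() if now is None else now
        if self.last_update is None:
            return True
        return (now - self.last_update) > self.max_age_seconds


monitor = FeedMonitor(max_age_seconds=30.0)
monitor.record_update(now=100.0)
fresh = monitor.is_stale(now=120.0)  # data is 20 s old, within the limit
stale = monitor.is_stale(now=200.0)  # data is 100 s old, past the limit
```

    The point of the design is that the display side checks the age of the data itself, so it does not depend on the failed process announcing its own failure.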

  6. Re:Yes but how is Microsoft responsible? by Anonymous Coward · · Score: 2, Interesting
  7. Re:Development vs Engineering by Jeff+DeMaagd · · Score: 4, Interesting

    I'd sort of tend to agree, although under your standards, the stuff I do as an EE really would fit under development; we don't have the budget to send out for external certification and external testing. No biggie, I guess I can live with being a hardware developer.

    Is it true that some states have prohibited Microsoft from issuing MCSEs? I heard this somewhere but I can't remember where. Something about Microsoft not having the authority to certify engineers.

  8. Re:Hmm by A55M0NKEY · · Score: 4, Interesting
    Once upon a time, there was a power grid without any software. This is true because electricity predates computers. What did they do then?

    I bet they had much wider safety margins built into the system which prevented blackouts. But these safety margins probably cost money (I say this without knowing a thing about the electrical system); they probably mean a less efficient use of resources. So power companies buy GE's software. They don't buy it so that they can have an added measure of blackout prevention, they buy it because it enables them to cut out expensive/inefficient safety margins without (supposedly) sacrificing reliability. They do this to lower their cost of providing electricity to you.

    --

    Eat at Joe's.

  9. 50MV arc'd to a tree by tvh2k · · Score: 2, Interesting

    By my calculations, assuming air ionizes at about 10,000 Volts / centimeter, a 50MV line should be at least 5,000 cm (or 50 meters) from any ground. 50 meters on either side of a line is a lot of property for an electrical company to buy, and with a surge in the line I'd bet the distance would need to be even more.
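
    The arithmetic above can be checked in a few lines. This only reproduces the poster's assumed figures (10,000 V/cm and a 50MV line); neither number is verified here, and the function name is made up for illustration.

```python
def min_clearance_m(line_voltage_v: float, breakdown_v_per_cm: float) -> float:
    """Distance in meters at which the assumed breakdown field is reached."""
    clearance_cm = line_voltage_v / breakdown_v_per_cm
    return clearance_cm / 100.0  # centimeters to meters


# Poster's assumptions: a 50 MV line and 10,000 V/cm for air.
clearance = min_clearance_m(50e6, 10_000.0)
print(f"Minimum clearance: {clearance:.0f} m")
```

    Because the relationship is linear, any change in either assumed figure scales the clearance by the same factor.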

    1. Re:50MV arc'd to a tree by Anonymous Coward · · Score: 2, Interesting

      ever see the big transmission lines in the middle of nowhere, that are clearcut on both sides of the line?

      the problem is First Energy - they (as a corp) weren't keeping up with basic maintenance procedures, and as a result brought down the entire grid.

    2. Re:50MV arc'd to a tree by plover · · Score: 5, Interesting
      My property abuts a set of high voltage transmission lines. (I'm about three miles from a coal plant.) The lines cut a long, skinny park through my city. The plat for the site shows a 200 foot wide easement, which is about 30 meters to the property on either edge of the park. I've never measured the height of the towers, but my rough guess is that the line itself is perhaps 25 meters above ground. That puts the line itself about 39 meters from the edge of my property.
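
      The distance estimate is a right-triangle calculation, sketched here under the assumptions that the line runs down the center of the easement and the property starts at its edge. The 25 m height is the poster's rough guess; the variable names are hypothetical.

```python
import math

FEET_TO_METERS = 0.3048

easement_width_ft = 200.0  # from the plat, per the comment
line_height_m = 25.0       # the poster's rough guess

# Horizontal distance from the centerline to the edge of the easement.
half_width_m = (easement_width_ft / 2.0) * FEET_TO_METERS

# Straight-line distance from the conductor to the property edge.
distance_m = math.hypot(half_width_m, line_height_m)
print(f"Half width: {half_width_m:.1f} m, distance to line: {distance_m:.1f} m")
```

      With the exact foot-to-meter conversion the result stays close to the rounded figure of about 39 meters.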

      The land beneath the lines was clear-cut about 12 years ago. But there are now trees under this line that are about 10 meters high.

      Years ago when my wife was concerned about "power line emissions" the power company loaned her a meter that showed "electrical fields." I don't remember the scale, or even what it was supposed to measure, but I do remember that we had to actually get about 200 feet from the wire before the field from the line stopped affecting the meter. (Yes, on a humid summer day I once stood in my back yard with a neon bulb and caused it to illuminate by simply dangling a three foot wire from one lead and touching the other.) I had always assumed it was a 750kV line, and that the 100 feet of easement on either side was more than sufficient. Now, I wonder. Hey, maybe this is enough of an excuse to go out and get one of those IKE toys!

      --
      John
  10. testing, testing, 1,2, 3 by hakalugi · · Score: 2, Interesting
    "When a backup server kicked-in, it also failed, unable to handle the accumulation of unprocessed events that had queued up since the main system's failure"

    what good is a backup system if it's never been tested?

    --
    If she floats, she's a witch.
  11. Re:Metroid by kabocox · · Score: 2, Interesting

    Actually, from the way the article sounds, the blackout might not have been as large or as long, or might not have happened at all, if the software had been updating properly. The electrical grid is constantly falling apart. It is never all up. That's OK. It is the status quo. It is when the electrical company doesn't know what is happening and can't get people to the trouble spots that these things become noticeable. Usually they are fixed within 30 minutes to 2 hours. From everything that I've read it wasn't a big problem at all. It was a fixable problem that was allowed to exist too long. After that point it became a big problem. I'd hold the monitoring software responsible.

  12. What about the actual Engineers involved? by GoofyBoy · · Score: 4, Interesting

    The software handled one part of the electrical system involved.

    What about a good Electrical/Mechanical/Civil Engineering solution that would have prevented it from cascading through different systems / electrical companies / countries?

    One piece of software which didn't raise an alarm is shocking. The fact that it cascaded over such a wide area is simply mind blowing.

    Before we talk about "software engineers" how about talking about "traditional engineers" and their role in this massive failure?

    --
    The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
  13. We kick MS, but GE did the whopper... by CokoBWare · · Score: 2, Interesting

    We may slam Microsoft for all of its bugs, but it's really hard to top a software bug triggering an international blackout the size of the one last summer. I think I should sue GE for making me walk 3.5 hours home in the heat with no money in Toronto, uphill, because I couldn't take a subway home. I smell a lawsuit the size of the eastern seaboard.

  14. Re:Development vs Engineering by Anonymous Coward · · Score: 5, Interesting

    In Canada, "Engineer" is a protected term, like "Doctor."

    Doctor is not a protected term. Perhaps you mean "Medical Doctor"? There are lots of non-medical doctors.

    I was arguing once with an MD friend of mine who thought that PhDs (like myself) don't have the right to call themselves Doctor. I explained that while medicine has been around for a very long time, the degree of MD has not. PhD degrees have a much longer history than MD degrees.

    It gets very funny when another friend of mine (who has a PhD in nursing) is called "Dr" in her hospital.

  15. Software engineering *not* possible. by master_p · · Score: 3, Interesting

    After lots of years as a developer, I realized that the engineering process that goes into other professions (for example, civil engineering) can't be applied to software. The reason is simple: software is many orders of magnitude more complex. Software has many interdependencies between components, has many states, and it is subject to change every minute. It's very difficult to see ahead and provide APIs that fit all the needs; that's why we go back and change the damn thing. What does a civil engineer have to do? He/she has to combine parts and test if they hold together. There are a lot of parts, but the general principles are few and can be easily remembered...unlike software.

    Furthermore, the tools we have for the job are inadequate. The programming languages are primitive. The debugging tools are dumb. The machines are not clever and strong enough to prove the mathematical theorems behind their programs. We don't even learn these things in college...we learn how to use programming languages, but we don't learn how to program...but I seriously believe we will never learn how to program, because a program's complexity increases tenfold for each line of code written!!!

  16. Re:Software "Engineering"? by zeus_tfc · · Score: 3, Interesting

    Just a nitpick,

    Creating a true software engineer is different than making them PE's. Right now, most of the engineers that design things in industry don't have PE's and if they do, they don't make it known publicly for the very reasons you mentioned.

    The rest of us without PE's don't need the insurance, as that is supplied by the company.

    Also, keep in mind that just because an engineer worked on something doesn't mean that it will be expensive. Most of what I engineer costs less than a dollar.

    If you haven't guessed, IAAE (I am an Engineer)

    --
    "...At the end of the day"..."when everyone goes home, you're stuck with yourself." RIP Layne Staley
  17. Re:Development vs Engineering by Tassach · · Score: 4, Interesting
    I like to think I'm an engineer, not a developer. The problem is not that I don't know how to do good SW engineering, it's that I'm usually not allowed to do good SW engineering. Good engineering is expensive in terms of time and money. The people who sign the checks aren't usually willing to pay for it and aren't willing to wait. The sad part is that they're often right: if you can't afford to wait, and you can't afford to pay the price, you have to settle for what you can get and hope that it's good enough to keep you moving forward.

    You have 4 main variables in the software development equation: Time, Quality, Functionality, and Efficiency. Notice that we only measure time, not man-hours or monetary cost. As we know from reading The Mythical Man-Month, we cannot reduce time by adding more people or by spending more money. While we list efficiency as a variable, we really have to treat it as a constant within the scope of a single release cycle. Improvements in efficiency are generally very gradual and incremental, and for the most part cannot be effectively implemented in the middle of a release cycle.

    I postulate that Time is directly proportional to the product of Quality, Functionality, and Efficiency [T = EQF]. Since E is constant within the scope of a single release, we can't use process improvements or similar techniques to improve quality in the short term. Assuming our goal is to improve quality, we either have to decrease functionality or increase time. Since monetary cost is directly proportional to time (time is money!), managers are very reluctant to give you more time. Furthermore, we are frequently under hard time constraints due to contractual obligations or market pressure. If we can't change time, we either have to sacrifice quality or functionality. Missing functionality is very obvious, whereas low quality isn't necessarily noticeable in the short term, so it should be no surprise that quality almost always takes the back seat to functionality.
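
    The trade-off in the postulate above can be sketched by solving T = EQF for Q. The units are arbitrary, the proportionality constant is assumed to be 1, and the function name and sample numbers are hypothetical.

```python
def quality_budget(time: float, efficiency: float, functionality: float,
                   k: float = 1.0) -> float:
    """Solve T = k * E * Q * F for Q, the quality attainable in a release."""
    return time / (k * efficiency * functionality)


E = 1.0  # treated as constant within a single release cycle

baseline = quality_budget(time=12.0, efficiency=E, functionality=4.0)
fewer_features = quality_budget(time=12.0, efficiency=E, functionality=3.0)
more_time = quality_budget(time=16.0, efficiency=E, functionality=4.0)

print(baseline, fewer_features, more_time)
```

    With E fixed, the only ways to raise Q in this sketch are the two named in the comment: cut functionality or add time.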

    --
    Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
  18. Re:What does the watchdog watch? by AB3A · · Score: 3, Interesting

    I always treat watchdog software with just a bit of skepticism. The problem, as pointed out by NERC, was that a process in the system was somehow present, but not communicating well.

    The alarm subsystem is often a separate process. It doesn't talk to the field. That's the job for other elements of the SCADA system. It was supposed to watch for semaphores, messages, or read shared memory somewhere. How do you watchdog something like that if it gets the message, but doesn't do what it's supposed to?

    In a SCADA system near and dear to my career, we set alarm thresholds so low that the operators expect a certain amount of alarm traffic even for routine events. This helps to discover any misbehavior in the alarm system.

    There is such a thing as a control center which is TOO quiet.
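
    A minimal sketch of the "too quiet" idea described above: if routine alarm traffic is expected, a window with too few alarms is itself suspicious. The class name, window length, and threshold are hypothetical, and the sketch assumes alarm timestamps arrive in increasing order.

```python
from collections import deque


class QuietAlarmDetector:
    """Flags an alarm stream that has gone suspiciously silent (hypothetical)."""

    def __init__(self, window_seconds: float, min_expected: int) -> None:
        self.window_seconds = window_seconds  # how far back to look
        self.min_expected = min_expected      # routine alarms expected per window
        self._timestamps = deque()            # alarm times, oldest first

    def record_alarm(self, timestamp: float) -> None:
        self._timestamps.append(timestamp)

    def too_quiet(self, now: float) -> bool:
        # Discard alarms that have fallen out of the look-back window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) < self.min_expected


detector = QuietAlarmDetector(window_seconds=600.0, min_expected=3)
for t in (0.0, 100.0, 200.0, 300.0):
    detector.record_alarm(t)

normal = detector.too_quiet(now=400.0)   # four alarms in the last 10 minutes
silent = detector.too_quiet(now=1000.0)  # nothing since t=300
```

    This does not answer the harder question raised above (a process that receives the message but does the wrong thing); it only catches the case where expected traffic stops arriving.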

    --
    Nearly fifty percent of all graduates come from the bottom half of the class!
  19. Not very analogous... by Svartalf · · Score: 2, Interesting

    In the case of the electric blankets, you're not exposing yourself to a lot of any B or H fields- there's not enough current present to generate much. Now, if you'd said something like a hair dryer, where the field is concentrated to power the motor...

    The phone may generate more relative power, but it's at a different frequency- in regards to electricity and the human body, frequency matters as much as anything else.

    For DC, 10 mA of current may not be noticeable to a person.

    For 50/60Hz AC at the same current, it's going to cause a twitching of the muscles.

    For DC, 100 mA to 1 A of current, you're going to get a zap similar in nature to sticking your tongue on a 9V battery, proportionate to the current in question.

    For 50/60Hz AC, 100 mA to 1 A, it's going to be causing painful contractions of your muscles, and very probably stopping your heart outright if the conduction pathway crosses it.

    There have been studies that tend to prove that even low energy densities of 50/60Hz AC can accelerate tumor growth; no studies have actually proven that they generate them, though. Effects like the one mentioned tend to be caused more by continuous exposure than point exposure, so the low levels of the energy radiated by the high-tension lines may be a problem if you're next to them, since it's a continuous background level sort of thing.

    --
    I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
  20. Echoes of Y2K by Anonymous Coward · · Score: 1, Interesting
    The whole blackout situation reminds me of a famous quote.

    If architects built buildings the way programmers wrote programs, the first woodpecker to come along would destroy civilization.

    I just never thought I'd see this in reality.

  21. Re:Bad bugs by klafhat · · Score: 2, Interesting

    But the second unit had failed in the identical manner a few milliseconds before. And why not? It was running the same software.

    I have read that story before on a different site. Everybody keep this in mind before you assume redundant systems can protect you against software errors.

    --

    Tell me more, tell me more