Slashdot Mirror


Delta Air Lines Grounded Around the World After Computer Outage (cnn.com)

Delta Air Lines says it has suffered a computer outage throughout its system, and is warning of "large-scale" cancellations after passengers were unable to check in and departures were grounded globally. The No. 2 U.S. carrier said in a statement Monday that it had "experienced a computer outage that has impacted flights scheduled for this morning. Flights awaiting departure are currently delayed. Flights en route are operating normally." A power outage in Atlanta at about 2:30 a.m. local time is said to be the cause of the computer outage. CNN reports: "Large-scale cancellations are expected today," Delta said. While flights already in the air were operating normally, just about all flights yet to take off were grounded. The number of flights and passengers affected by the problem was not immediately available. But Delta, on average, operates about 15,000 daily flights, carrying an average of 550,000 daily passengers during the summer. Getting information on the status of flights was particularly frustrating for passengers. "We are aware that flight status systems, including airport screens, are incorrectly showing flights on time," said the airline. "We apologize to customers who are affected by this issue, and our teams are working to resolve the problem as quickly as possible."

5 of 239 comments

  1. Re:Shouldn't have upgraded to W10 ! by Joe_Dragon · · Score: 1, Informative

    The auto-install of Windows updates drains your battery and does not stop for battery mode or UPS shutdown commands.

  2. Report: Fire destroyed generators by McGruber · · Score: 4, Informative

    A fire at the datacenter caused the outage, according to a post from "walterD" in Flyertalk.com's "Delta computers down ..." thread:

    According to the flight captain of JFK-SLC this morning, a routine scheduled switch to the backup generator this morning at 2:30am caused a fire that destroyed both the backup and the primary. Firefighters took a while to extinguish the fire. Power is now back up and 400 out of the 500 servers rebooted, still waiting for the last 100 to have the whole system fully functional.

    1. Re:Report: Fire destroyed generators by Critical+Facilities · · Score: 4, Informative

      Well, to be clear, I'm just speculating here, but I'm not implying that the GENERATORS blew up, I'm speculating that the ATS blew up. It is a very common topology to have multiple Generators connect to one main bus, and then have that bus connect to the Data Center via an ATS. In other words, yes, there is/are redundant Generator(s), but they all connect to one central bus, which then connects to the UPS Systems via the ATS and other switchgear.

      The failure rate of ATSs is pretty low (when they're maintained), so it often becomes a value engineering decision during design. Yes, you could have each Generator connect via its own ATS, thus distributing the risk, but in so doing you increase your construction costs, increase your maintenance costs, etc. The bean counters don't like that, and it becomes hard to convince them that it's worth it when you can't come up with statistical proof that a failure of the ATS is likely.
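
The trade-off described above can be put into rough numbers. The sketch below is only a back-of-the-envelope model: it assumes failures are independent, that any single generator can carry the full load, and it uses made-up failure probabilities (`p_gen` and `p_ats` are illustrative, not measured figures). It compares several generators behind one shared ATS with a design where each generator has its own ATS.

```python
def p_loss_shared_ats(p_gen: float, p_ats: float, n_gens: int) -> float:
    """Backup power is lost if the single shared ATS fails OR every generator fails."""
    all_gens_fail = p_gen ** n_gens
    return 1 - (1 - p_ats) * (1 - all_gens_fail)


def p_loss_per_generator_ats(p_gen: float, p_ats: float, n_gens: int) -> float:
    """Backup power is lost only if every generator+ATS path fails."""
    path_fails = 1 - (1 - p_gen) * (1 - p_ats)
    return path_fails ** n_gens


if __name__ == "__main__":
    # Hypothetical probabilities of failing when called upon; not real data.
    p_gen, p_ats, n_gens = 0.05, 0.01, 2
    print("shared ATS:       ", p_loss_shared_ats(p_gen, p_ats, n_gens))
    print("one ATS per genset:", p_loss_per_generator_ats(p_gen, p_ats, n_gens))
```

With a shared ATS, the result can never be better than `p_ats` itself, no matter how many generators are added. Whether the difference justifies the extra construction and maintenance cost is exactly the value engineering argument, and it depends on failure data this sketch does not have.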

  3. Sounds like a problem with flight planning by Ami+Ganguli · · Score: 5, Informative

    I used to work on one of these systems.

    The flight planning system takes inputs from several sources - weather forecasts, notices about airspace closures, etc. (NOTAMs), and booking info - and creates an optimal flight plan for the aircraft.

    A modern airline doesn't have enough flight planning staff to take over manually if the system fails, so if your flight planning goes out, your fleet is gradually grounded.

    The large number of servers is due to the optimization problem. You need to take into account the flight conditions and fuel costs in different locations in order to decide your route, altitude, and fuel loading. Since fuel is a huge share of the operating cost of the airline, it pays to invest a little extra computing power into optimizing these and save a bit of fuel on each flight.
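
As a toy illustration of the kind of decision described above, the sketch below scores a handful of route/altitude options and decides whether to load fuel for the next leg at the origin or buy it at the destination. Everything here is an assumption made for illustration: the `Option` type, the prices, the burn estimates, and the flat `carry_penalty` (the fraction of extra fuel burned just to carry it). A real flight planning system works with far richer models of weather, aircraft performance, and airspace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    route: str
    altitude_ft: int
    trip_burn_kg: float  # hypothetical burn estimate for this route/altitude


def plan_cost(opt, price_origin, price_dest, next_leg_fuel_kg, carry_penalty):
    """Return (total_cost, carry_from_origin) for the cheaper fuel-loading choice."""
    base = opt.trip_burn_kg * price_origin
    buy_at_dest = base + next_leg_fuel_kg * price_dest
    # Simplification: carrying fuel costs a flat extra fraction of that fuel.
    carried_kg = next_leg_fuel_kg * (1 + carry_penalty)
    carry_from_origin = base + carried_kg * price_origin
    if carry_from_origin < buy_at_dest:
        return carry_from_origin, True
    return buy_at_dest, False


def best_plan(options, **kwargs):
    """Brute-force search: score every option and keep the cheapest."""
    scored = [(plan_cost(o, **kwargs), o) for o in options]
    (cost, carry), opt = min(scored, key=lambda s: s[0][0])
    return opt, carry, cost


if __name__ == "__main__":
    options = [Option("A", 35000, 10000.0), Option("B", 37000, 9500.0)]
    print(best_plan(options, price_origin=0.60, price_dest=0.80,
                    next_leg_fuel_kg=5000.0, carry_penalty=0.10))
```

Even this toy version multiplies quickly: every extra route, altitude, and fuel-loading choice is another combination to evaluate for every flight, which hints at why the real problem needs many servers.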

    Our system had lots of redundancy but, with all the data feeds, there are lots of moving parts. It's not hard to imagine a scenario where, for example, you get everything transferred over to your disaster recovery site, but for some reason the weather feed isn't coming in and you can't make flight plans.
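
The failure mode described above, where the disaster recovery site is up but one input is missing, can be sketched as a simple freshness check. The feed names come from the inputs listed earlier in this comment; the `max_age` threshold, the `last_seen` dictionary, and the function names are hypothetical and only show the idea.

```python
from datetime import datetime, timedelta, timezone

# Inputs named in the comment: weather forecasts, NOTAMs, booking info.
REQUIRED_FEEDS = ("weather", "notams", "bookings")


def stale_feeds(last_seen: dict, now: datetime, max_age: timedelta) -> list:
    """Return the required feeds that are missing or older than max_age."""
    stale = []
    for name in REQUIRED_FEEDS:
        ts = last_seen.get(name)
        if ts is None or now - ts > max_age:
            stale.append(name)
    return stale


def can_plan_flights(last_seen: dict, now: datetime, max_age: timedelta) -> bool:
    """The site is only usable for planning if every required feed is fresh."""
    return not stale_feeds(last_seen, now, max_age)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_seen = {
        "weather": now - timedelta(hours=3),    # feed never resumed after failover
        "notams": now - timedelta(minutes=5),
        "bookings": now - timedelta(minutes=5),
    }
    print(stale_feeds(last_seen, now, timedelta(minutes=30)))
```

The point of a check like this is that "servers are up" and "we can make flight plans" are different conditions, and a failover test that only verifies the first can pass while the second fails.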

    --
    It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow

  4. Insurance by sjbe · · Score: 4, Informative

    Without Federal requirements there is no way a corporation is going to spend that kind of money.

    A few failures like this one and they'll dig into the couch cushions to find the change for it. Having a backup data center for stuff that will shut the company down is not exactly a tough thing to justify. This shutdown alone would probably justify the cost in a single day.

    They have legal protections in place to assure they retain their terminal slots, so while they aren't making money now they won't lose in the long run.

    Perhaps, but if they managed their IT properly they wouldn't have to lose money now. They can buy the insurance or they can take the risk of serious illness, so to speak. Their choice and their funeral. Sounds like they rolled the dice and came up snake eyes today.

    The only businesses with total data recovery sites and plans to actually use them are Banks, and that is because they are required by the FDIC.

    Not true. Some medical practices have them. Some internet firms have them (at least for the mission critical stuff). Some bits of the military and government have them. Insurance companies have them. Stock exchanges have them. And there are more as well. If it's valuable enough you have a backup data center of some sort.