Slashdot Mirror


Database Glitch Grounds American/US Airways

An anonymous reader writes "According to numerous news sources, all American Airlines and US Airways flights were grounded for two or three hours this morning. Both problems were caused by a computer glitch in the systems hosted by EDS. Quote: The operating system that drives the airline's flight plans went down."

4 of 274 comments

  1. Re:Windows by Anonymous Coward · · Score: 5, Informative

    The following entities were NOT mentioned in the article you're linking to:

    (1) American Airlines,
    (2) US Airways,
    (3) EDS.

    So, what the hell are you talking about?
    Why did you link to this article?
    (I know, I know, because nobody will read it anyway)

  2. Not Windows, Unix by JohnQPublic · · Score: 5, Informative

    This is undoubtedly a problem with Sabre, which EDS runs on behalf of Sabre Holdings. Both American Airlines and US Airways use Sabre for much of their operations.

    Sabre started its life as an American Airlines internal system (SABER, slight spelling difference), running on a rare operating system (PARS, later called ACP and currently TPF) on IBM mainframes. In the last few years Sabre completed a lengthy migration to HP Unix on Non-Stop (i.e. ex-Tandem) hardware. The mainframe systems were rock solid, but software talent was hard to come by, so they decided the time had come to switch.

    Sorry, no Microsoft to blame here!

  3. You probably won't hear it by Anonymous Coward · · Score: 5, Informative

    The systems that run the aircraft and the navigational and communication systems really are redundant. It's the law. It also means that there are usually two different ways to do something, not just the same thing repeated twice.

    Example 1 - The pilot and co-pilot can't eat the same meal. That way, only one of them can get food poisoning.

    Example 2 - The hydraulic system fails and the wheels won't go down. There's a hand crank.

    Example 3 - The communication systems at every tower I have worked at have two separate backbones. There are two of absolutely everything. If that fails, there are emergency radios under the desk. If the emergency radios don't work ... We used to joke that the controllers would climb to the top of the tower and wave fire extinguishers to warn the planes away. (I think it was a joke.)

    Example 4 - You can't fly very far over open water in a single engine aircraft.

    It used to be frustrating working on systems older than I was but we never had to worry about surprises.

    Of course all of this redundancy is very expensive. You spend the money where people's lives are at stake. On the other hand, if the worst problem is that some planes will be late, perhaps you don't spend the big bucks.

  4. Not "OS" by Master+of+Transhuman · · Score: 5, Informative

    When they said "operating system", they meant "operations system" - not the OS.

    See this quote from one of the articles:

    Wagner said a database malfunctioned that "basically runs every aspect of our client operations -- aircraft dispatch, crew scheduling (and) reporting weight, passenger load, balance."

    This system is hosted by EDS, who only said it was a "systems issue".

    So there's no evidence it was an OS problem. It could have been anything - OS, Oracle/DB2/SQL Server database, application code, upgrade, whatever.

    Nothing to conclude here except that somebody screwed up - and even that isn't certain - could have been a bad memory board someplace, who knows.

    Not having a backup is beside the point, since the "backup" might have taken three hours to bring up when you're dealing with a production system like this. "Failover" is what you want, and what they should have had, but if something got screwed up there, it could still have taken three hours.

    Shouldn't have happened, but crap like this happens all the time because nobody can do their damn jobs.

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!