Slashdot Mirror


Database Glitch Grounds American/US Airways

An anonymous reader writes "According to numerous news sources, all American Airlines and US Airways flights were grounded for two or three hours this morning. Both problems were caused by a computer glitch in the systems hosted by EDS. Quote: The operating system that drives the airline's flight plans went down."

14 of 274 comments

  1. Operating System (singular) by Hypharse · · Score: 5, Insightful
    "The operating system that drives the airline's flight plans went down."

    How in the world can they state that as singular? Surely they have a backup of some sort. Especially with all the supposed "increased security" around air travel, you are telling me that one system crash can knock out half of the major airlines? That's ridiculous. Have they not learned about redundancy?

    1. Re:Operating System (singular) by Anonymous Coward · · Score: 5, Funny

      Yeah, Have they not learned about redundancy?

    2. Re:Operating System (singular) by njcoder · · Score: 5, Funny
      "Have they not learned about redundancy?"

      Yep, they're so good, even the failure was replicated!

  2. Last thing you want to hear by xIcemanx · · Score: 5, Funny

    I'm guessing the last thing you want to hear on a plane now is the pilot saying, "What do you mean, fatal exception error?"

    >_ Why don't they switch to Linux?

    1. Re:Last thing you want to hear by Brandybuck · · Score: 5, Funny

      Q: How far can the plane fly after a fatal exception error?

      A: All the way to the scene of the crash. Hell, it will probably beat the paramedics there by half an hour!

      --
      Don't blame me, I didn't vote for either of them!
  3. EDS? Quelle surprise. by leathered · · Score: 5, Interesting

    Sorry, have to rant where I see EDS mentioned.

    EDS, in cahoots with the UK government, have wasted millions of pounds of taxpayers' money on failed IT projects. Notable ones include the Inland Revenue (UK IRS), Child Support Agency (£50M over budget and still not working) and an email and directory service for the NHS (withdrew at last minute allowing C&W to steal at a much inflated price).

    The blame cannot be laid completely at the door of EDS, though: the government has been guilty of sloppy auditing and, worst of all, of a willingness to hand over extra money whenever EDS has come around with the begging bowl.

    --
    For all intensive porpoises your a bunch of rediculous loosers
  4. I thought everyone knew by Xerp · · Score: 5, Funny

    NEVER open Windows in an airplane!

  5. Re:Windows by Anonymous Coward · · Score: 5, Informative

    The following entities were NOT mentioned in the article you're linking to:

    (1) American Airlines,
    (2) US Airways,
    (3) EDS.

    So, what the hell are you talking about?
    Why did you link to this article?
    (I know, I know, because nobody will read it anyway)

  6. Not Windows, Unix by JohnQPublic · · Score: 5, Informative

    This is undoubtedly a problem with Sabre, which EDS runs on behalf of Sabre Holdings. Both American Airlines and US Airways use Sabre for much of their operations.

    Sabre started its life as an American Airlines internal system (SABER, slight spelling difference), running on a rare operating system (PARS, later called ACP and currently TPF) on IBM mainframes. In the last few years Sabre completed a lengthy migration to HP Unix on Non-Stop (i.e. ex-Tandem) hardware. The mainframe systems were rock solid, but software talent was hard to come by, so they decided the time had come to switch.

    Sorry, no Microsoft to blame here!

  7. What *REALLY* happened... by catdevnull · · Score: 5, Funny

    At about 4:30 a.m., the outsourced SysAdmin was setting up to do routine patches to Windows 2003 server nodes. But just before, he decided to check his e-mail with Outlook and he opened an important message from his system administrator advising him that his e-mail would be de-activated if he didn't open the important attachment. I think we all know what happened after that...

    --

    I might know what I'm talkin' about, but then again, this is Slashdot...
  8. You probably won't hear it by Anonymous Coward · · Score: 5, Informative

    The systems that run the aircraft and the navigational and communication systems really are redundant. It's the law. It also means that usually there are two different ways to do something, not just the same thing repeated twice.

    Example 1 - The pilot and co-pilot can't eat the same meal. That way, only one of them can get food poisoning.

    Example 2 - The hydraulic system fails and the wheels won't go down. There's a hand crank.

    Example 3 - The communication systems at every tower I have worked at have two separate backbones. There are two of absolutely everything. If that fails, there are emergency radios under the desk. If the emergency radios don't work ... We used to joke that the controllers would climb to the top of the tower and wave fire extinguishers to warn the planes away. (I think it was a joke.)

    Example 4 - You can't fly very far over open water in a single engine aircraft.

    It used to be frustrating working on systems older than I was but we never had to worry about surprises.

    Of course all of this redundancy is very expensive. You spend the money where people's lives are at stake. On the other hand, if the worst problem is that some planes will be late, perhaps you don't spend the big bucks.

  9. I found the root of the problem by Anonymous Coward · · Score: 5, Funny

    There is a line of code that raised the problem but is commented in Punjabi, I think it says "fuck this $3/hour job".

  10. Probably Sabre Holdings, rest probably wrong by Markus+Registrada · · Score: 5, Interesting
    First, they didn't "complete a migration". They're still deep in the middle of it, and will be for years to come.

    Second, this failure isn't in the Sabre reservations system, it's in some ancillary product, so who knows? Maybe they have no intention of switching it to Unix.

    Third, he didn't say so, but the migration isn't just to Unix. It's also migration to MySQL! (Hahahahahahahaha. Then again, coming from TPF, coded in assembly language for 4Kword pages, and a hierarchical database, that might seem pretty advanced.) Sabre had to fund a MySQL port to 64 bits, and a new "stored procedures" feature.

  11. Not "OS" by Master+of+Transhuman · · Score: 5, Informative

    When they said "operating system", they meant "operations system" - not the OS.

    See this quote from one of the articles:

    Wagner said a database malfunctioned that "basically runs every aspect of our client operations -- aircraft dispatch, crew scheduling (and) reporting weight, passenger load, balance."

    This system is hosted by EDS, who only said it was a "systems issue".

    So there's no evidence it was an OS problem. It could have been anything - OS, Oracle/DB2/SQL Server database, application code, upgrade, whatever.

    Nothing to conclude here except that somebody screwed up - and even that isn't certain - could have been a bad memory board someplace, who knows.

    Whether they had a backup is irrelevant anyway, since the "backup" might have taken three hours to bring up when you're dealing with a production system like this. "Failover" is what you want, and what they should have had, but if something got screwed up there, it could still have been three hours.

    Shouldn't have happened, but crap like this happens all the time because nobody can do their damn jobs.

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
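
The backup-versus-failover distinction in comment 11 can be illustrated with a minimal sketch. Nothing here reflects how EDS or Sabre actually ran their systems: the `Node` class, the `query_with_failover` function, and the "dispatch AA100" request are all hypothetical names made up for illustration. The point is only that a hot standby answers the very next request, whereas a cold backup first has to be restored.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical database node; `healthy` stands in for a real health check."""
    name: str
    healthy: bool = True

    def query(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def query_with_failover(nodes: list[Node], request: str) -> str:
    """Try each node in priority order and return the first successful answer."""
    for node in nodes:
        try:
            return node.query(request)
        except ConnectionError:
            continue  # this node is down; fall through to the next standby
    raise RuntimeError("all nodes are down")


primary = Node("primary")
standby = Node("standby")

print(query_with_failover([primary, standby], "dispatch AA100"))
primary.healthy = False  # simulate the primary failing
print(query_with_failover([primary, standby], "dispatch AA100"))
```

The second call is served by the standby with no restore step in between. If the standby is also unhealthy, the function raises, which is the "if something got screwed up there" case from the comment.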