Slashdot Mirror


More Airline Outages Seen As Carriers Grapple With Aging Technology (reuters.com)

An anonymous reader writes: Airlines will likely suffer more disruptions like the one that grounded about 2,000 Delta flights this week because major carriers have not invested enough to overhaul reservations systems based on technology dating to the 1960s, airline industry and technology experts told Reuters. Airlines have spent heavily to introduce new features such as automated check-in kiosks, real-time luggage tracking and slick mobile apps. But they have avoided the steep cost of rebuilding their reservations systems from the ground up, former airline executives said. Scott Nason, former chief information officer at American Airlines Group Inc, said long-term investments in computer technology were a tough sell when he worked there. "Most airlines were on the verge of going out of business for many years, so investment of any kind had to have short pay-back periods," said Nason, who left American in 2009 and is now an independent consultant. The reservations systems of the biggest carriers mostly run on a specialized IBM operating system known as Transaction Processing Facility, or TPF. It was designed in the 1960s to process large numbers of transactions quickly and is still updated by IBM, which did a major rewrite of the operating system about a decade ago.

6 of 145 comments

  1. Dumb by geek · · Score: 5, Insightful

    "Most airlines were on the verge of going out of business for many years, so investment of any kind had to have short pay-back periods,"

    You really only see this type of thinking in the West. Most sensible companies know that when times are good, you build a war chest, when they are bad you invest the war chest to grow your business and be competitive. The problem wasn't that times were bad. You can always say times are bad. The problem was that they didn't make the best of things when times were good, and therefore deserve the cluster fuck situation they are in now.

    1. Re:Dumb by prograsm · · Score: 5, Insightful

      Agreed. Aging tech isn't the problem here; a complete inability to listen to or fund IT is. If they had a usable rolling backup system, it wouldn't matter how old everything is. If they had all brand new equipment and no functional load balancing system to compensate for outages, which will always be a potential issue, they would still be offline for as long as it takes to fix everything. I have a hard time believing the words "off site redundancy" never came up in any IT budget meetings over the past half century, so their failures are 100% bad business decisions, not IT issues. It would be no different if they had refused to budget for any more fuel than exactly what they predicted they would need. Blaming the aircraft rather than the person who made the stupid decision to run out of fuel wouldn't make sense. It only works here because people don't understand IT, and the people who chose to allow outages like these aren't willing to admit it, so they will repeat them.

    2. Re:Dumb by clodney · · Score: 4, Insightful

      From what I have read, this was not an obvious WTF moment. Delta apparently has a complete disaster recovery facility with duplicate hardware. But they had a single point of failure in their infrastructure, which caused them to lose power to the entire datacenter, and everything went down. That part might be a WTF. But once they got everything booted up again, they had to contend with trying to get a system restarted that simply wasn't designed to ever fail completely. So it took hours to get all the pieces back up and communicating again.

      Then there are the real world problems - flight A feeds into flight B, but flight A was late, meaning all those connections were missed and passengers have to be rebooked. And flight B can't fly anyway, because the plane is still sitting 500 miles away because the flight that would bring it to this airport was cancelled as well. And the flight crew that was supposed to bring flight B to this airport technically went on duty the moment they reached the airport, and now they have reached the max allowed hours in the day, so a new crew is needed. But that crew is in a different city...

      This incident will spawn some fascinating failure analyses, and no doubt people will get fired and lawsuits will be filed. And like most DR scenarios, it is way harder in real life than it seems in planning and exercises. I wouldn't be surprised if this causes a big project to deal with outages and restarts, so that this doesn't happen as easily next time.

  2. Aging? by sexconker · · Score: 5, Insightful

    What's wrong with aging tech? If most airlines are on TPF and TPF works and TPF is still maintained by IBM, what's the problem with TPF?
    Something being old doesn't mean it's bad. Quite often, the reverse is true. The mainframe is still the king when it comes to reliability and transaction integrity, for example.

  3. Re:If it isn't broke... by LWATCDR · · Score: 5, Insightful

    The Delta outage was caused by a power outage. Seems like TPF is not the problem.
    Considering how well this 1960s tech seems to be working, replacing it and doing it better may not be all that easy.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.

  4. I call BS by wcrowe · · Score: 4, Insightful

    This is bullshit. Software does not "age" the same way that a car or a washing machine ages. The hardware can age, but the hardware can be replaced, and in this case we are talking about IBM software and hardware, which have a long-standing reputation for reliability and for maintaining backwards compatibility.

    I think the more likely story is that the interfaces to these systems are being compromised. That's why it's happening, first at one airline, then another. Someone, somewhere is fucking around with the airlines' reservation systems.

    I think these stories about "fires" and "aging" software are covering up for the fact that these systems are getting hacked. If people start to lose confidence in the systems they'll fly less or stop flying altogether.

    --
    Proverbs 21:19