More Airline Outages Seen As Carriers Grapple With Aging Technology (reuters.com)
An anonymous reader writes: Airlines will likely suffer more disruptions like the one that grounded about 2,000 Delta flights this week because major carriers have not invested enough to overhaul reservations systems based on technology dating to the 1960s, airline industry and technology experts told Reuters. Airlines have spent heavily to introduce new features such as automated check-in kiosks, real-time luggage tracking and slick mobile apps. But they have avoided the steep cost of rebuilding their reservations systems from the ground up, former airline executives said. Scott Nason, former chief information officer at American Airlines Group Inc, said long-term investments in computer technology were a tough sell when he worked there. "Most airlines were on the verge of going out of business for many years, so investment of any kind had to have short pay-back periods," said Nason, who left American in 2009 and is now an independent consultant. The reservations systems of the biggest carriers mostly run on a specialized IBM operating system known as Transaction Processing Facility, or TPF. It was designed in the 1960s to process large numbers of transactions quickly and is still updated by IBM, which did a major rewrite of the operating system about a decade ago.
"Most airlines were on the verge of going out of business for many years, so investment of any kind had to have short pay-back periods,"
You really only see this type of thinking in the West. Most sensible companies know that when times are good, you build a war chest, when they are bad you invest the war chest to grow your business and be competitive. The problem wasn't that times were bad. You can always say times are bad. The problem was that they didn't make the best of things when times were good, and therefore deserve the cluster fuck situation they are in now.
If it isn't broke, don't fix it. .... Wait. It is breaking, but you aren't fixing it?
Anyway.... Why is it breaking to begin with? Age isn't a problem in IT. Are they adding features that can't integrate with this system? Are there too many transactions? What gives?
What's wrong with aging tech? If most airlines are on TPF and TPF works and TPF is still maintained by IBM, what's the problem with TPF?
Something being old doesn't mean it's bad. Quite often, the reverse is true. The mainframe is still the king when it comes to reliability and transaction integrity, for example.
Southwest Airlines' reservation system is run off an IBM System/360 mainframe they inherited from Pan-Am. I'd be surprised if there was another functioning unit anywhere else in the world. You could probably emulate the whole damn thing on a cell phone.
Who's surprised by this? In the quest for the lowest fare possible, who has money for preventing something that *might* happen that keeps aircraft on the ground, say like a power outage in your computer center? Apparently NOT Delta. I'm guessing most of the other carriers too, they've just not been lucky enough.
Makes you wonder about all that expensive aircraft maintenance really getting done...
Think of that next time you strap one of their aircraft on for a few hours..
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
I thought the Delta failure was related to a transfer switch failure and the Southwest failure was attributed to a router failure. What does that have to do with using TPF on a mainframe?
Aging tech isn't the reason. The reason is that the airlines have to interface with the department of "homeland security" as per FAA demand, and that's a big mess.
^--Clearly you've never had a job in the real world with your "how hard can that be" attitude.
Always short sighted and thinking tomorrow will be the same as today.
What I'm afraid of is that this style of business / investment / management continues to infect the rest of the world. I can't wait until all of the stock markets are controlled by algorithmic trading, with the next quarter's number the sole goal.
What's that? Trouble grappling with aging technology?
Don't you mean - having trouble grappling with H1B's that are moving up that pay scale? Never seemed to have such problems with "aging technology" in decades past when new tech was coming out every decade.....
Talk about correlation not being causation, was slashdot ever good?
Don't be surprised if the airline industry lobbyists are hounding President Clinton for a government bailout in 2017.
The companies are installing luggage tracking.... 40 years after it was needed.
Mobile Apps - because the latest buzzword has to be important...
All of this is just paint for the creaky old barn - it needs rebuilding.
Maybe, for example, they should hire a few programmers who develop porn websites, where the flow of information is 24/7 - 365 and credit cards work, and such, on a scale that surpasses the airline reservations system...
And maybe invest in switchover-backup systems and new datacenters so business is 24/7-365 instead of 23/6.9 - 361.
And not be so short-sighted about investment versus profit maximization. Delta will pay/lose much more than they gained because of their lack of a philosophy for the long run (3-10 years versus next quarter).
This is bullshit. Software does not "age" the same way that a car or a washing machine ages. The hardware can age, but the hardware can be replaced, and in this case we are talking about IBM software and hardware, which has a long-standing reputation for reliability and for maintaining backwards compatibility.
I think the more likely story is that the interfaces to these systems are being compromised. That's why it's happening, first at one airline, then another. Someone, somewhere is fucking around with the airlines' reservation systems.
I think these stories about "fires" and "aging" software are covering up for the fact that these systems are getting hacked. If people start to lose confidence in the systems they'll fly less or stop flying altogether.
Proverbs 21:19
The late Dan Weinreb had a great presentation on ITA's effort to build their own reservation system named Polaris. If I recall correctly, the basic problem was the cornucopia of input formats in which the relevant airline data is being periodically updated, all of them being "mainframey", with all the idiosyncrasies it entails. It's a clusterfuck. I wonder if the reservations themselves aren't a comparatively simple part of the whole thing.
Ezekiel 23:20
It is likely that many airline managers have no knowledge of technology, but like to make decisions anyway.
Also, managers are dominant. They hire low-pay employees and don't train them so that they can make more money. Yesterday's Delta story: Delta Air Lines employees mistake New Mexico for Mexico (Aug 11, 2016)
He's not technically wrong, you know. You probably could build a reservation system like that. Even a decent one, perhaps. Just not one that could deal with all the hard crap that needs to be dealt with, especially interfacing. (Hell, I even remember one that we were doing at a programming competition when I was like nine or so. In BASIC. On an eight bit home computer. But that's obviously several degrees lower still...)
Ezekiel 23:20
While it's true that the technology itself is "dated", so is Unix. Also, as TFS mentions: " is still updated by IBM, which did a major rewrite of the operating system about a decade ago." Of course, they could do a RAD/SCRUM/No-SQL/Other-Buzzword-compliant technology rebuild and achieve the same results, with no downtime and seamless transitioning, right? Right?
Sounds awfully much like the old e-mail form that used to get passed around every time someone had a solution for spam.
Disclaimer: I used to work on what is essentially a middleware message processor for military use. It supported dozens of different inputs from myriad systems, some indeed "mainframey". The failing of the system wasn't the system itself - it was rock solid, UNIX based, and had decent hardware and procedures for maintaining maximal uptime in crappy environmental conditions. The ancillary systems that provided message feeds, on the other hand, weren't so reliable, and when they failed, guess who got blamed?
I suspect airline reservation systems probably are in the same boat.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Continuous complaints that times are bad or that the union is rendering the business unprofitable have never stopped their officers from drawing ever larger compensation packages, nor have they prevented their boards from approving those compensation packages. To claim that there's no money to reinvest into company infrastructure is but a self-serving lie.
ELOI, ELOI, LAMA SABACHTHANI!?
Airline profits are 1% of every ticket.
40% is taxes. That is why they have no money.
"Most airlines were on the verge of going out of business for many years, so investment of any kind had to have short pay-back periods,"
You really only see this type of thinking in the West. Most sensible companies know that when times are good, you build a war chest, when they are bad you invest the war chest to grow your business and be competitive. The problem wasn't that times were bad. You can always say times are bad. The problem was that they didn't make the best of things when times were good, and therefore deserve the cluster fuck situation they are in now.
Do you know how to become a millionaire in the airline industry? Start off as a billionaire and then buy an airline.
Generally speaking there are / were few-to-no good times over the past few decades. Most airlines have declared bankruptcy (sometimes more than once).
Seriously: I have no idea why anyone would want to be a shareholder as the ROI tends to be negative.
The first programming book I checked out from the library before I owned a computer in 1983 was COBOL programming. Payroll was the killer app in the 1960's.
In the Delta example, the problem was having the primary and backup servers in the same physical datacenter. This isn't an "aging technology" problem. It's a bad management decision problem. Professionals have known for decades that the failover server needs to be physically located somewhere else. Article is just clickbait nonsense.
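To make the "failover must live somewhere else" rule concrete, here is a minimal sketch of a placement check. The group names, roles, and site labels are hypothetical illustrations, not anything taken from Delta's actual setup.

```python
def single_site_groups(deployments):
    """Return the failover groups whose members all sit in one physical site."""
    risky = []
    for group, members in deployments.items():
        # Collect the distinct sites used by this group's servers.
        sites = {site for _role, site in members}
        if len(sites) < 2:
            risky.append(group)
    return sorted(risky)


# Hypothetical inventory: (role, site) pairs per failover group.
deployments = {
    "reservations": [("primary", "ATL-1"), ("backup", "ATL-1")],
    "crew-scheduling": [("primary", "ATL-1"), ("backup", "MSP-1")],
}

print(single_site_groups(deployments))
```

A check like this only catches the placement mistake. It says nothing about whether the failover has actually been tested under load.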
Based on my experience in the late 90's, to "webify" the access to these systems, you pretty much have it.
"Flyin' in just a sweet place,
Never been known to fail..."
...and 60 years later most companies still can't get this shit right.
...is that some salespeople are trying to take advantage of one rare period of downtime to hawk unnecessary software, maybe possibly fixing something that's broken but introducing a thousand other bugs in the process.
It's reworded neutrally to look like a piece of journalism, but that's good salesmanship for you.
When automated check-in kiosks were new they were easy, just type in your number (scanner never worked), select baggage, and go. Now more than every other page is an ad. Do you want to change seats on your first flight? Second flight? Add trip insurance? Buy an earlier place in line? Add rental insurance? An earlier place for your second flight? Are you sure you don't want to upgrade your seating? How about you change your class instead? These people paid us money to show you their logos. Look at these hotel and rental car brands. Are you done looking? Your plane is extra small so you'll have to check your luggage. Our airline credit card is giving you 50,000 free miles (I actually took that offer, but not from the kiosk. It saved me $470). Are you sure you don't have explosives? How about little kids? Sign up for this service and you're magically not carrying anything bad so you can go through the TSA-Pre line. Add $50 for the executive lounge that's 80% empty with leather couches and big chairs while people are standing downstairs because they've run out of chairs. Etc...
Kiosks are so fucking annoying. Ok, so all those things bring in $$$ for the airlines, but their service is crap and is just getting shittier and shittier while Greyhound's buses are getting nicer and nicer. They've added all the crap onto their existing systems, removing it should lighten its load.
At least the last few times I've flown I didn't hear the "Do not leave your baggage unattended" messages. Or maybe I've just completely tuned them out.
Working as a developer in Dallas, TX... Every team I have ever been on has contained at least one bass-ackward programmer who "used to work at southwest" and liked to poison our source control with garbage. "The Southwest way!"
Every time I have been looking for a job, 3 or 4 recruitroaches would call me daily about 3 month contracts with Southwest!!! great place to work!!!!111!!!.
you know you only get the primo talent when you are only willing to budget for 3 months of work at a time.
I used to work with a guy who claims to have written part of their wildly complex "fare forecasting" software, you could probably imagine how shitty his code was to work with.
He's the kind of asshole who would do a bit shift in a SQL statement instead of multiplying by 2 because it was faster that way on 1950s hardware.
There's probably a huge swathe of shit for brains IT/developers who have no clue as to the business model, the requirements or for that matter the ability to design large scale systems. But hey, nothing another majestic herd of H1B's or similar couldn't solve.
A reservation system could be implemented in chapter 10 of your first programming book. It seems a trivial thing?
It's actually really really complex.
It's not just a "reservation system" where you lock out a ticketed space for X seconds until someone completes a transaction. You really have to view it as "The Company" when you're talking about airlines. Let's say a pilot has been in the air too long due to a delayed departure in New York. He hits his max flight time for the next 24 hours but he was scheduled to fly from his destination to another leg. So now you need to replace the pilot. Which pilot? Well there is a plane coming into the airport around the same time as the NY flight. But is that pilot rated to fly that same aircraft? Ok he is, great. But because 30 of the 300 passengers are going to miss their connections now because of the delayed arrival they need to be moved to different flights. But those flights are maxed out. So you have to bump some passengers on a scheduled flight and move them to a later flight as well. Because the plane is getting in late it's also going to depart late. So you also need to either arrange all of the passengers at the next destination to be on different flights and set off a chain reaction or you need to pull in a different plane at the 2nd destination to short circuit the chain reaction. But where can you get a plane from for the cheapest? And how much will it cost to put people up in a hotel vs flying an extra crew in on overtime?
This is all simple enough to calculate with like 1-2 planes. But when you have 1,000 aircraft and all of the seat assignments effectively being interdependent along with business interests (profit/loss of changes), customer service interests such as ticket class... and you have to stay up to date instantaneously with dozens of terminals all trying to do the same thing manually in addition to the automatic callbacks for unexpected events... it's a big engineering effort to not create some sort of automatic-trading style feedback loop that accidentally sets off a chain reaction that cancels every flight in the country.
Every change has a cost. No human can orchestrate thousands of interdependent variables with millions of passengers manually. You have to have a central director system which instantaneously handles all of the callbacks and dependencies for a change throughout the entire graph.
It's actually very cool when you stop and think about how well it does at keeping everything relatively straight.
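A toy version of that chain reaction, as a sketch only: a late flight pushes its delay onto every flight that depends on it (same aircraft, same crew, connecting passengers), and each downstream flight soaks up whatever slack it has before passing the rest along. The flight names, slack values, and the single "slack" number per flight are all made-up simplifications. A real system would also weigh crew duty limits, aircraft ratings, and cost.

```python
from collections import deque


def propagate_delay(downstream, start, delay_min, slack_min):
    """Breadth-first walk of the dependency graph.

    Each dependent flight absorbs its own slack and inherits the remainder.
    Returns a dict of flight -> delay in minutes for every affected flight.
    """
    delays = {start: delay_min}
    queue = deque([start])
    while queue:
        flight = queue.popleft()
        for nxt in downstream.get(flight, []):
            carried = delays[flight] - slack_min.get(nxt, 0)
            # Only record (and keep walking) if this makes the flight later
            # than anything already computed for it.
            if carried > delays.get(nxt, 0):
                delays[nxt] = carried
                queue.append(nxt)
    return delays


# Hypothetical schedule: NY100 feeds two flights, one of which feeds a third.
downstream = {"NY100": ["CH200", "CH210"], "CH200": ["LA300"]}
slack_min = {"CH200": 30, "CH210": 120, "LA300": 45}

print(propagate_delay(downstream, "NY100", 90, slack_min))
```

With these numbers the 90 minute delay out of New York reaches CH200 and LA300, while CH210 has enough slack to leave on time. Scale that to a thousand aircraft with cost trade-offs at every node and the "central director" described above starts to look less trivial.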
The UN once wanted a single simple reservation system. The concept was simple, and the requirements were very simple compared to what most people would assume. Users were equivalent to a small international employee base... all could do English. Vendors were very few and contracts were simple.
Except it had ONE requirement that made it impossible: There should be ONE standard.
Simple enough, except each party said "Yes, we demand one standard, as long as it's ours."
Never underestimate the sheer stupidity of the human mind and its confusion of want and need.
Even the industry lobbyists don't make the ridiculous claims you do, but instead say profit margin is 3% and taxes are 21%. (http://airlines.org/media/ticket-cost-breakdown/) And you can pretty much guarantee even those figures are crap.
A casual glance, for example, at Delta's 2015 FY statement shows total operating revenue of $28,898 million, and net income of $4,526 million -- that's 11% profit, inclusive of cargo. Looking at passenger traffic specifically, they show an operating cost per available seat mile of 13.33 cents, and a passenger mile yield of 16.59 cents. That's a 24%+ profit margin.
Yes, that's looking at the most profitable US airline specifically, but your numbers are complete hogwash even if you look at the others. The US airline industry is posting record profits -- about US$22 billion between American, Southwest, Delta and United last year -- and has plenty of money to invest in its future if it wants to. But you can pretty much guarantee it won't, because it will rely on a bailout to save its ass again if the profits dry up.
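For anyone who wants to redo the arithmetic, here is a small sketch that takes the figures quoted above at face value (none of them are re-checked against the filing here). The variable names are just labels for this example.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100.0 * part / whole, 1)


# Figures as quoted in the comment above (millions of USD, cents per mile).
revenue_musd = 28_898
net_income_musd = 4_526
cost_per_seat_mile = 13.33
yield_per_passenger_mile = 16.59

spread = yield_per_passenger_mile - cost_per_seat_mile

net_margin = pct(net_income_musd, revenue_musd)        # income as share of revenue
markup_over_cost = pct(spread, cost_per_seat_mile)     # spread as share of cost
margin_on_yield = pct(spread, yield_per_passenger_mile)  # spread as share of yield

print(net_margin, markup_over_cost, margin_on_yield)
```

Two caveats. On the quoted revenue figure the net margin comes out nearer 16% than 11%. An 11% margin would need a revenue base of roughly $41 billion, so the $28,898 million may be a passenger-only subtotal, which is a guess and not a checked fact. The per-mile comparison gives about 24% as a markup over cost but closer to 20% as a share of yield, and it mixes a per-available-seat cost with a per-passenger-mile yield, so it is rough either way. None of this rescues the "1% profit, 40% taxes" claim.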
Do reporters even read these stories as they are writing them?!? "Airlines will likely suffer more disruptions like the one that grounded about 2,000 Delta flights this week because major carriers have not invested enough to overhaul reservations systems based on technology dating to the 1960s... [TPF] is still updated by IBM, which did a major rewrite of the operating system about a decade ago."
Big, complicated system, written by a big, experienced company, still maintained... Do they think we'd be better off if it were rewritten from the ground up as a Ruby on Rails app or something?
Psst, I don't want to cause a panic, but I heard that large, important chunks of the Internet run UNIX, which also dates back to the '60s.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Pretty much every logistics network I've come across is the same. Systems as old as I am, running in some forgotten cabinet that nobody has even laid eyes on in years because someone tweaked it til it was unkillable. Instead of going to the trouble of finding the box, they just built new software on top of the interface. In the event that someone did find the box, well the software vendor disappeared a decade ago, so the best we could do without gutting the entire stack was to host it on a VM. And hey, the stack NEEDED gutting, but who's going to sign off on that?
The worst are the goofy, home-grown solutions to integrate legacy and modern systems on the client side. Someone said "release early and release often" but as soon as the bugs in the first live version get fixed, development abruptly ceases.
Couriers, workshops, airlines apparently, they're all the same.
The article about TPF was fundamentally flawed. Using a former CIO from United who was replaced because he didn't do his job as your anchor for your premise that TPF was at the root was foolish. The writer obviously didn't fact check and let his article hang on the quotes of a disgraced former CIO and allowed the message to imply that TPF was and is going to be the cause of future failures. So far we've all been able to read that the common factor in all of the failures, not just the Delta and Southwest ones of late, is actually faulty disaster recovery plans. If the plans were sound and tested, then when invoked they would have worked. In these cases they didn't. The TPF systems are designed to recover quickly and aren't going to be down for 6 hours - no architecture team would allow it to be set up that way, with the exception of Bob Edwards' team at United.
We have perfectly tuned devices and VR on the way but flying a metal tube a few thousand miles just isn't worth the trouble to get right. It's other basic infrastructure, too: people will pull all nighters to get an app done but nobody's pulling all nighters to make sure the drinking water's clean and the bridges are sound.
Combine the worst qualities of offshoring & agency labor with the worst of legacy systems and you get this disaster.
Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.
It seems we assume old systems should be bad. I am not sure modern stuff is more reliable than what was produced decades ago.
NSA coordinated the Delta flight interruption. Now suddenly on FBI slashdot they anticipate more.
Investors salivate... wow that could be contracts...
Remember the airline shorts of 9/11/2001? Just some innocent folks herding goats on camel back no worries friends.
The issue isn't simple number crunching and database retrieval, it's the fact that it's global.
Cogito, igitur comedam pizza.
The parent is spot on.
And just to add to that, until their recent run of profitability, the last time the airlines as a whole were consistently profitable was in the 1990s, before the dot-com bubble popped. Between roughly 2001 and 2011, they cumulatively lost money (the one bright spot was 2006, but of course the Great Recession hit).
http://web.mit.edu/airlines/analysis/analysis_airline_industry.html (apologies for the tiny image, but historical data more than 5 years out is typically paywalled).
It wasn't until we exited the Great Recession, airlines started charging for food and bags, and airlines did more to increase the passenger load factor (percentage of seats that are filled) to historically crazy levels that they finally became profitable as they have been in the past few years. Until then, even in decently good times, the underlying costs were pulling them down. Too many pilots and attendants drawing too high of a salary, too many flights going out less than full (i.e. too much spare capacity), etc.
So you can imagine why airlines weren't in any rush to invest in high cost, risky IT upgrade projects. When you're trying to just stay in the black, any optional cost not part of the core business (flying) is a risk.
Maybe they do need to upgrade their systems. And actually, Delta is making a profit right now. Maybe they have the money, maybe not. I don't know.
Final disclaimer: I don't know the details of what caused Delta's meltdown. But I'll share my own, much smaller-scaled personal experience to let you know why I shall at least hesitate before pointing fingers at the airline.
I work for the best company in US radio broadcasting. (Personal opinion, but there you go.) (Heh.) We are willing to spend the money on new equipment and systems to keep our radio stations on air. We have a backup generator at our studios and UPS units on all critical systems. They're tested and serviced frequently.
We've had severe storms in our region (I'm in Birmingham, Delta is in Atlanta) lately. We have had power failures where the AC will flicker on, off, on, off, rapidly, for several seconds, then finally die. Speaking from experience, this can cause all sorts of problems. (Don't believe me? Plug your favorite UPS into an outlet strip and toggle the AC on and off, on and off, while it's under load. Don't be surprised if it finally barfs.)
At any rate: our generator controller got confused and refused to crank the genset and a couple of critical UPS units shut off. I won't bore you with the details, but by any definition, it was a low probability event. We fixed it, we got back on air, and I designed a mod for our 10-year old Kohler generator controller. In fact, I'm ordering the parts now.
Here's the point: it's always something. If you lock the doors, the bad guys come through the window. If you bar the windows, they'll chop a hole in the ceiling. It's a never-ending battle. You examine the failure, do a post-mortem, then figure out a way to prevent it from happening again ... and THEN, wait for the next Big Bite(tm). :)
So ... maybe Delta mighta-shoulda spent some money to prevent their failure from happening. I'm not going to say that they've invested the money to ensure that what happened shouldn't have happened. But I'm also prepared to give them the benefit of the doubt. :)
Cogito, igitur comedam pizza.
"But they have avoided the steep cost of rebuilding their reservations systems from the ground up,"
As somebody who had family affected by the Southwest outage 3 weeks ago, the reservation system was one of the systems that remained up the longest. Southwest still could happily take your money even if nobody was going anywhere. (I suspect they manually took it down later, once it became clear the day was lost.)
Focusing on the reservation system sounds like a contractor lobbying to sell something...
The greybeards that wrote the original are on a pension or somewhere else. There is no competence available here, so it will all be offshored to a company that lied that it knows it all. That is, if the managers are insane. If they are sane they will not let this shit be rewritten completely at all unless they are forced to. They will do small updates till kingdom come.
"Henry Harteveldt, founder of the travel consultancy Atmosphere Research Group, said some airlines are choosing to risk outages that might cost them $20 million to $40 million rather than invest, for example, $100 million on technology upgrades. He believes investors and the general public will apply increasing pressure on airlines to avoid outages at any cost."
How much did this cost Delta? Both directly and indirectly.
How much will preventing this single point of failure cost?
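A back-of-the-envelope sketch of the trade-off in the quote, using only its own numbers ($20 million to $40 million per outage, $100 million for upgrades). It assumes, very generously, that the upgrade prevents every such outage and that indirect costs like lost goodwill are zero.

```python
def breakeven_outages(upgrade_cost_musd, outage_cost_musd):
    """Number of avoided outages needed before the upgrade pays for itself."""
    return upgrade_cost_musd / outage_cost_musd


# Figures from the quote, in millions of USD.
upgrade_cost = 100
for outage_cost in (20, 40):
    n = breakeven_outages(upgrade_cost, outage_cost)
    print(f"${outage_cost}M per outage: break-even after {n:g} outages")
```

So on the quoted figures the upgrade only pays off after somewhere between 2.5 and 5 avoided outages, which is probably why some carriers take the gamble. The answer shifts quickly once indirect costs are counted.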
That reserves cannot be built up, looks like an inherent vulnerability in the system.
fucking polluting noisy bullshit meant for busybodies
Not only that, but an Airline Reservation System has a lot of moving parts. Airlines are usually divided into market segments (A, B, C, D) and each class has an increasingly complex number of requirements.
For example, some reservation systems need to interface with other third-party systems to support GDS, Government Security services (per country, mind you), as well as side services like car rentals and insurance.
Don't forget that modern airlines also have multiple points of booking: web, phone, mobile, travel agencies, etc. Yes, the reservation system should handle those as well while maintaining scalability.
Oh, you have a customer loyalty program? The reservation system also handles that.