Slashdot Mirror


Airlines Suffer Worldwide Delays After Global Booking System Fails (bloomberg.com)

rastos1 writes: Airlines worldwide were forced to delay flights Thursday as a global flight-bookings system operated by Amadeus IT Group SA suffered what the company called a "network issue." Major carriers including British Airways, Deutsche Lufthansa AG, Cathay Pacific Airways and Qantas Airways were among those reportedly impacted by the outage. Singapore's Changi airport said via Twitter that a technical issue affecting some operators was delaying the check-in process, with boarding passes having to be issued manually. "Amadeus confirms that, during the morning, we experienced a network issue that caused disruption to some of our systems," the Madrid-based company said in a statement. Technical teams took immediate action to identify the cause of the issue and services are "gradually being restored," it said.

74 comments

  1. Good news. by msauve · · Score: 1, Funny

    I'm happy the slashdot IT team found new jobs so quickly.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
    1. Re:Good news. by computational+super · · Score: 5, Informative

      I can't believe this hasn't even been addressed on Slashdot. The site was completely down for two days and they're trying to pretend like nothing happened.

      --
      Proud neuron in the Slashdot hivemind since 2002.
    2. Re:Good news. by TechyImmigrant · · Score: 3, Funny

      I can't believe this hasn't even been addressed on Slashdot. The site was completely down for two days and they're trying to pretend like nothing happened.

      But putting the servers back to work hosting slashdot seems to have borked the airline booking service.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    3. Re:Good news. by Anonymous Coward · · Score: 1

      I doubt that. It's obvious that Slashdot shit the bed. They're probably [still] scrambling to figure out what went wrong, and what to fix first. Sourceforge went down too. According to reports "equipment was fried" - which usually means "just reboot" and "restore from backup" are out of the question. Given the nature of SF, I'd bet that was given priority over Slashdot. It's probably a frantic, understaffed scramble that involves emergency purchases and a sudden, unplanned, total migration of *multiple* properties (SF, slashdot).

      Nothing to envy, for sure.

    4. Re:Good news. by Anonymous Coward · · Score: 0

      Don't know why this was modded down....

    5. Re:Good news. by jafiwam · · Score: 2

      SD seems to have lost data. The user interface here looks like it did while they were still futzing with it after the purchase. They lost front end HTML and CSS / Script files it seems like.

    6. Re:Good news. by rholtzjr · · Score: 1

      I was wondering when that was going to be discussed. I have not heard any information on what happened or on the root cause of the site's intermittent availability.

  2. Time to rewrite this software using Rust? by Anonymous Coward · · Score: 0

    Since this is critical software, rewriting it in a modern language designed to ensure security and reliability may be a good idea. The obvious candidate is the Rust programming language. It is being created by Mozilla for use with Servo, their next generation browser engine. Rust is being built from the ground up to be safe and secure and reliable. This is just what's needed when creating complex software systems that have to work properly all of the time, like airline booking systems.

    1. Re:Time to rewrite this software using Rust? by thomn8r · · Score: 2

      Thank you for this - with the week I'm having, I needed a good belly laugh...

    2. Re:Time to rewrite this software using Rust? by Anonymous Coward · · Score: 0

      Sigh, in case you haven't noticed human beings are involved in the writing of software, this pretty much guarantees that reliability will be poor. It doesn't matter what language, OS, framework, flavor of the month you use, something involved in your execution path was designed/built by a foolish/short-sighted/over-confident human being.

      Some project that you write in Rust will fail, it might not be under your control, it might not be in code you wrote, but you will be blamed.

      BTW, don't think that magic code writing AI will save you, guess who wrote the basis for the AI?

    3. Re:Time to rewrite this software using Rust? by Anonymous Coward · · Score: 0

      Rust with MongoDB, it will be fast as hell.

    4. Re:Time to rewrite this software using Rust? by cheesybagel · · Score: 1

      Nah. It must be rewritten in Javascript. With Angular.js, Node.js, and a whole load of other *.js's or it won't Web Scale!

  3. Re:profit form change fees and walkup fairs + lost by Anonymous Coward · · Score: 1

    Your words are in English, but your post makes no sense.

  4. SABRE was a classic case study by 140Mandak262Jamuna · · Score: 2
    In my software engineering course, back in grad school, the SABRE airline reservation system was a case study. It was supposed to be a textbook example of how to implement and manage the life cycle of complex software systems. I still have the book Software Engineering by Shooman.

    That would have been 17 Moore's Law generations ago! In human terms, it's like looking at the farming methods or weaving techniques or marine navigation procedures or military maneuvers of 1592!
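
    The 1592 figure can be reproduced with a little arithmetic. The sketch below is only a guess at the reasoning: it assumes roughly 18 months per Moore's Law generation and roughly 25 years per human generation, neither of which is stated in the comment, and takes 2017 (the year mentioned elsewhere in this thread) as the present.

```python
# Assumed figures (not stated in the comment): ~1.5 years per Moore's Law
# generation and ~25 years per human generation. 2017 is taken as "now".
GENERATIONS = 17
MOORE_YEARS_PER_GENERATION = 1.5
HUMAN_YEARS_PER_GENERATION = 25
CURRENT_YEAR = 2017


def years_ago(generations, years_per_generation):
    """Return how many years the given number of generations spans."""
    return generations * years_per_generation


# When the course would have taken place, counting in Moore's Law generations.
course_year = CURRENT_YEAR - years_ago(GENERATIONS, MOORE_YEARS_PER_GENERATION)

# The same number of generations, counted in human lifetimes instead.
human_equivalent_year = CURRENT_YEAR - years_ago(GENERATIONS, HUMAN_YEARS_PER_GENERATION)

print(course_year)            # 1991.5
print(human_equivalent_year)  # 1592
```

    Under those assumptions the course lands in the early 1990s, and 17 human generations back from 2017 is indeed 1592.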

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:SABRE was a classic case study by CMOS4081 · · Score: 1

      In my software engineering course, back in grad school, the SABRE airline reservation system was a case study. It was supposed to be a textbook example of how to implement and manage the life cycle of complex software systems. I still have the book Software Engineering by Shooman.

      That would have been 17 Moore's Law generations ago! In human terms, it's like looking at the farming methods or weaving techniques or marine navigation procedures or military maneuvers of 1592!

      Back in the day I've dealt with AMADEUS, SABRE and GALILEO. This news sure brings back memories!

    2. Re:SABRE was a classic case study by rholtzjr · · Score: 4, Informative

      I agree. Unfortunately, by today's standards, the waterfall approach has its limitations with respect to time to market and R&D costs. SABRE finally did make it away from the mainframe-centric architecture (some time in the late 90's) and thus could adopt more modern life cycle management techniques.

      On another note, the SABRE system (American Airlines) was in fact built around the core system originated by Eastern Airlines (named System One) back in the late 70's, early 80's. SABRE and System One added functionality so that these two separate systems could be marketed to travel agencies, who previously had to use telephones to call an airline's reservation center. System One was later branched off of Eastern Airlines into a separate entity under Continental Holding Co. before Eastern went bankrupt in the early 90's, and was later sold to Amadeus. SABRE is still SABRE as far as I know; Amadeus, however, was the original purchaser of System One.

      And how do I know this? That was my first job that got me started down the path of computers as well as my distaste for COBOL after C was standardized in 89.

      There were actually 4 big ones marketed to the travel industry back in the late 80's

      PARS (TWA)

      SABRE (American)

      DatasII (Delta)

      System One (Eastern)

      There were other airlines that utilized similar systems: Pan Am had "Panamac" (I think), and Continental/America West/Alaska Airlines had "Shares".

    3. Re:SABRE was a classic case study by Anonymous Coward · · Score: 0

      I still have the book Software Engineering by Shooman.

      If I paid $100 for a textbook, I would keep it too.

    4. Re:SABRE was a classic case study by rholtzjr · · Score: 1

      Ditto, that is where I started my IT career back in the early 80's. System One in Miami, Fl.

    5. Re:SABRE was a classic case study by dpalley · · Score: 1

      Actually, the original Sabre system dates back to the 1950's. It was only used internally by AA until the 70's when it was rolled out to travel agencies.

      https://en.wikipedia.org/wiki/...

      Dan

    6. Re:SABRE was a classic case study by sconeu · · Score: 1

      And in the '80s they gave access to CompuServe. Anyone remember EaasySABRE?

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    7. Re: SABRE was a classic case study by Anonymous Coward · · Score: 0

      Another Creimer affiliate link. Brought to you by Creimer posting as AC.

    8. Re:SABRE was a classic case study by rholtzjr · · Score: 1

      You are correct that they had their own internal system; however, Eastern had the first distributed reservation system that communicated with all major airlines in real time, not just their own carrier. So while they may have developed their own internal system, what we see today is NOT entirely their doing.

    9. Re:SABRE was a classic case study by rholtzjr · · Score: 1

      Yes, that was the beginning of the "End of the Travel Agency". This is yet another example of a technology advancement that does actually take out other elements of an industry.

    10. Re: SABRE was a classic case study by Anonymous Coward · · Score: 0

      Give him a break. When you've got a $50,000/month gay porn habit, every dollar of income counts.

  5. Must be the Russians by Anonymous Coward · · Score: 0

    First it was Slashdot, then it was the airlines.

  6. Would cloud hosting have prevented the /. outage? by Anonymous Coward · · Score: 0

    As everyone is surely aware, /. and SourceForge have suffered some serious outages and problems over the last two days. Reportedly it was due to equipment failure. The big question is, would hosting /. in cloud infrastructure like Amazon's AWS or Microsoft's Azure have avoided these outages? The answer could very well be a resounding "YES".

    Modern cloud platforms allow for an unparalleled amount of redundancy and geographic distribution. Web sites are no longer constrained to just a single data center, but instead can be distributed across the globe with ease. The use of virtual machines and high-level services makes hardware more and more irrelevant. In fact, such cloud platforms are typically designed with hardware failure in mind, with such failure being a routine part of their massive-scale operations.

    I think that the recent outages here highlight the need for web sites and other online services to take advantage of the expertise and efficiencies offered by the leading cloud platform providers. A web site in 2017 shouldn't just run from a server or two in a single data center. A web site in 2017 should be globally-distributed, with redundant web servers across the globe. The best part of this is that thanks to economies of scale, cloud-based hosting can often be significantly cheaper than traditional dedicated hosting.

  7. Time to replace by emil · · Score: 0

    If the whole system is this fragile, it will be more cost-effective to select a stronger platform and development tools, and begin redesigning it now.

    I hear that ADA works very well for building reliable software that doesn't exhibit surprises or unexpected behavior.

    1. Re:Time to replace by computational+super · · Score: 3, Interesting

      You don't understand the principal rule of managing software projects (according to the MBAs who manage software projects): if anything takes more than a couple of hours to do, it's not worth doing.

      --
      Proud neuron in the Slashdot hivemind since 2002.
    2. Re: Time to replace by Anonymous Coward · · Score: 0

      I've never used ADA so I have some questions about it. Does it offer zero-cost abstractions? Does it have move semantics and a borrow checker? Does it have guaranteed memory safety? Does it have threads without data races? Does it have trait-based generics? Does it have pattern matching? Does it have type inference? Does it have a minimal runtime? Does it support efficient C bindings? Those are the kinds of features I'm looking for in a modern, high-reliability programming language. I know that the Rust programming language has such features, but I'm not sure about ADA. Do you know which of those features it supports? Does it have other features we should know about?

    3. Re:Time to replace by Anonymous Coward · · Score: 0

      This whole thread is bizarre - with an amazing reply above me from AC if it is a parody of an RFI.

      I'm pretty sure this has little or nothing to do with language choice - this is about distributed systems and operations being shitty. I used to do some incident response (IT incident, not emergency) type software that we used to work with people in the airline world on - and their systems were all strung together with string and hope, and a lot of overworked, not all that capable administrators.

    4. Re: Time to replace by Anonymous Coward · · Score: 0

      I'm pretty sure this has little or nothing to do with language choice

      Are you for real? Do you really think that software written in joke languages like JS or PHP is comparable to real software written in real languages like C++, Java and Rust? It's like you're going up to a carpenter and telling him that putting in nails by hitting them with a rock is just as efficient and effective as using a modern pneumatic nail gun. Well it isn't. There are huge differences between programming languages and the degree to which they can be used to create reliable software. If you believe otherwise then you're wrong.

    5. Re:Time to replace by thegarbz · · Score: 1

      Yes, let's replace a worldwide booking system that for the most part handles 3.7 billion passengers every year without issue because of a very occasional outage causing a few queues.

      What could possibly go wrong.

    6. Re: Time to replace by Anonymous Coward · · Score: 0

      The system can't be 'without issue', to use your words, if the system experiences even just one 'outage', again to use your word. You contradict yourself.

    7. Re:Time to replace by emil · · Score: 1

      ...until the manager's flight is canceled, and [s]he is standing in an airport full of people in the same situation, including FAA regulators and Rand Paul.

      Then there might be sufficient motivation to refactor.

    8. Re:Time to replace by computational+super · · Score: 1

      You're attributing much longer memories to them than they seem to have. A co-worker and I were talking about refactoring and tech debt (and why we had so much still floating around) and he made an observation that (to management) tech debt is like a leaky roof. You don't have to fix it when it's not raining.

      --
      Proud neuron in the Slashdot hivemind since 2002.
    9. Re: Time to replace by Anonymous Coward · · Score: 0

      So it's confirmed, Rust is a real language?

      Show me one project that's big that's using it. And don't link me to fucking servo.

    10. Re:Time to replace by Anonymous Coward · · Score: 0

      The name of the language is Ada -- NOT ADA.
      Jesus you little idiots can't even get her name correct.
      Ada => named after Augusta Ada King-Noel, Countess of Lovelace.
      ADA => American Dental Association, Americans with Disabilities Act ... etc.

  8. Re:Would cloud hosting have prevented the /. outag by Anonymous Coward · · Score: 0

    By the second paragraph it becomes clear this is just an advertisement in disguise. But thanks Anonymous Advertiser!

  9. All the new word-smith phrases for getting DDOS'd? by adosch · · Score: 2

    Are these airline information systems really all that fragile in this sector? I know we all say that, but I personally don't have a F clue; I'm 100% media driven on this from what I read, consume or read-between-the-lines. I'm hoping someone close to this could chime in or reply...

    For any of us with a moderate amount of IT experience in the trenches, at any level, supporting any ops or for-profit system, it's hard to dismiss a generic statement such as "network issue." Management I've worked under in the past at other organizations, private and government, would pre-can some huggable, downplayed message like that, and I totally get it: any end-user disruption is embarrassing on any level. But we'll never know why.

    The number of breaches and DDoS attacks, and what seems like a popular resurrection of the word 'Hacking' (as if we were all hoping for a Hackers reunion with Jonny Lee Miller and Angelina Jolie), is just nauseating, but it is a very real state of affairs these days given how poorly security practices are implemented.

  10. Re:Would cloud hosting have prevented the /. outag by knightghost · · Score: 3, Insightful

    AWS has had plenty of outages.

    Personally I don't think there is any such thing as "technical issue". There are resourcing, risk management, and personnel management issues. I've built systems with the right teams before that could stand anything short of a nuke, and we came in under budget. Honesty and the right people give you results.

  11. Re:Would cloud hosting have prevented the /. outag by Anonymous Coward · · Score: 0

    ...The best part of this is that thanks to economies of scale, cloud-based hosting can often be significantly cheaper than traditional dedicated hosting.

    Actually, the best part of this will be listening to the grand excuses justifying "cheaper" after the cloud provider is hacked.

    Targeting and hacking a single organization is one thing. Hacking a cloud provider can result in exponentially larger gains, so it tends to make the (very few) major cloud providers huge fucking targets. And for the "will never happen" crowd, let me guess...you were an Equifax customer too? Or perhaps you used a bitcoin exchange...

  12. they don't have local console or able to use ISO by Joe_Dragon · · Score: 1

    they don't have local console and you are not able to use your own ISO to install an OS.

  13. Re:they don't have local console or able to use IS by Anonymous Coward · · Score: 0

    Holy fuck. The idiocy that we see here at /. these days is astounding! Have you ever actually used any sort of a cloud provider?

    ISOs are a relic of the 1990s. The recommended best practice today is to upload a preconfigured VM image.

    Here are some instructions for doing it using Microsoft's Azure platform.

    And here is some information about doing it using Amazon's AWS platform.

    The advantages of this approach should be obvious. But since you seem oblivious, let's look at some of them. First of all, you're not stuck using an ISO image as the installation medium. Many Linux distros and even other OSes offer network-aware installations that bring in only the software you actually need. So you can build your VM using only the software you want, and then upload it to your cloud provider. There are other benefits, like being able to configure the VM locally, rather than remotely. You inherently get a local initial backup. It's often quicker to upload a small 100 MB compressed VM image than it is to upload an 800 MB or even 4 GB ISO image.

    And a local console is irrelevant in the world of virtual machines and virtual storage. If you have a problem with a VM, you can disconnect its virtual disk, attach the virtual disk to another working VM, apply whatever fixes are necessary, and then reattach the virtual disk to the initial VM. In the extraordinarily rare case that there's a problem with the VM, then you just spin up a new one and destroy the old one, after detaching any attached virtual disks to preserve the data.

    You're literally stuck in the 1990s, from what I can see. The world has moved on long ago, but you're still dicking around with primitive approaches. The problems you're facing don't even exist for the rest of us because we moved past them over a decade ago!

  14. I used to work for this company by Anonymous Coward · · Score: 0

    The core software is written in C++, and originally very well too, I might add. However, it is HUGE, and no one person or even one team understands even an overview of how it all fits together, never mind the specifics. Which is fine until there's a cascade failure and problems get batted around from one team to another. Also, because no one really understands the low-level side, when new features were added someone would just bolt more code onto the top or side instead of finding out if the core stuff could already do it. After years of that you end up with something immense, unwieldy and essentially ineffable in its whole, even to top-class programmers.

    Posting A/C for obvious reasons.

  15. Re:Would cloud hosting have prevented the /. outag by easyTree · · Score: 1

    The best part of this is that thanks to economies of scale, cloud-based hosting can often be unpredictable at best

  16. Perhaps someone hasn't told you yet.. by Viol8 · · Score: 1

    ... but pushing your pet language in every goddam comments section is a perfect way to make people get sick of hearing about it and give it the finger before they've even tried it. Who knows, perhaps that's your intention. Either way, give it a rest you buffoon.

  17. All large systems can be fragile by Viol8 · · Score: 1

    And it doesn't matter what language they're written in. The fragility generally isn't down to a low-level language issue such as memory, threading or pointer issues (though obviously those errors happen too); it's usually a logic problem in handling edge cases, unexpected code paths and errors correctly. No language is going to save you from broken logic, however much their proponents would pretend otherwise.

  18. Re:Would cloud hosting have prevented the /. outag by Anonymous Coward · · Score: 0

    Too bad they're not hiring "miracle workers," eh creimer? Actually, given the availability of the site, I dare say you'd fit right in in their IT department!

  19. Re:Would cloud hosting have prevented the /. outag by Viol8 · · Score: 1

    Every human built system can suffer from technical issues. Saying otherwise is just pretending the problem doesn't exist.

    "I've built systems with the right teams before that could stand anything short of a nuke"

    You're modest, aren't you? Systems always look bulletproof - until they go wrong. I doubt yours are any better or worse than hundreds of others that have been written to be resilient.

  20. Re:profit form change fees and walkup fairs + lost by Anonymous Coward · · Score: 0

    Surely you can provide a link to such a job posting, creimer? You're not just endlessly repeating the same meme because you think it's amusing?

  21. Re:Would cloud hosting have prevented the /. outag by Anonymous Coward · · Score: 0

    Too bad they're not hiring "miracle workers," eh creimer? Actually, given the availability of the site, I dare say you'd fit right in in their IT department!

    Who the fuck is creimer??

  22. Re:profit form change fees and walkup fairs + lost by Anonymous Coward · · Score: 0

    Surely you can provide a link to such a job posting, creimer? You're not just endlessly repeating the same meme because you think it's amusing?

    Give it a rest, FakeFuck39, and get a fucking life.

  23. missing Over by Anonymous Coward · · Score: 0

    Airlines Suffer Worldwide Delays After Global OVERBooking System Fails

  24. Re: Would cloud hosting have prevented the /. outa by Ogive17 · · Score: 3, Insightful

    I've built databases that work flawlessly for a year and stop working when I'm on vacation without explanation. Shit happens sometimes.

    --
    "Action without philosophy is a lethal weapon; philosophy without action is worthless."
  25. Re: Would cloud hosting have prevented the /. out by Anonymous Coward · · Score: 0

    wait, say what?! Slashdot was down?!
    I've been troubleshooting my network for the last 3 days!

  26. Re:profit form change fees and walkup fairs + lost by Anonymous Coward · · Score: 0

    So that's a "no, I can't provide an example of a job posting that does this"?

    tsk tsk, creimer.

  27. ADA... by emil · · Score: 1

    ...is mostly syntax-equivalent with Oracle PL/SQL. The GCC toolchain targets ADA with GNAT. As such, it would obviously link against C.

    ADA is quite old and is likely missing many of the features you've outlined. Some of them may be present in the popular descendant of ADA known as SPARK.

    It is well-known that our software breaks far too much. Denying the problem does not solve it. ADA was designed to address this issue head-on, which is why Boeing's airplane control software is not written in C.

  28. Re: Would cloud hosting have prevented the /. outa by Anonymous Coward · · Score: 0

    You are, that's who.

  29. My heart goes out to Amadeus by shuz · · Score: 1

    As a person who has had similar international headlines for systems that I can impact, my heart goes out to you. System failures are never fun, and failures that affect a lot of customers are just plain stressful. Document processes and learn from this event all that you can. Customers care most about what was learned and how to prevent this and similarly scoped events from occurring again.

    Hang in there!

    --
    There is or can be built a machine that can simulate any physical object. -Church-Turing principle
  30. Re: Would cloud hosting have prevented the /. outa by Anonymous Coward · · Score: 0

    If it is happening at 6:45 in the morning just ask the cleaning staff where are they plugging in the vacuum and the coffee maker.

  31. I work there ... by Anonymous Coward · · Score: 0

    Sadly, for contract reasons I can't give more details than the published one.

    https://www.dallasnews.com/business/southwest-airlines/2016/07/30/southwest-ceo-router-failure-grounded-flights-equated-thousand-year-flood (paywalled but you can read it if you have NoScript or equivalent installed)
    https://www.wsj.com/articles/why-a-single-failed-router-can-ground-a-thousand-flights-1489743001 (paywalled)

    The airline business suffers not only from aging systems, but also from highly interconnected, interdependent and extremely complex systems that keep growing in complexity, with new systems and companies getting connected all the time.
    Many of these systems are mission critical. The amazing thing is that all this is working almost ALL the time. The bad thing is that an error on a critical path leaves everybody grounded.
    I have to acknowledge that Amadeus systems are much more stable than others in the sector (Sabre had a few high severity failures in the last couple of years, like this one https://www.dallasnews.com/business/southwest-airlines/2016/07/30/southwest-ceo-router-failure-grounded-flights-equated-thousand-year-flood)
    In Amadeus there are some nice things going on, as published here:
    https://www.thecompanydime.com/mainframe/
    and here:
    https://www.nextplatform.com/2015/08/04/amadeus-takes-off-with-containers-and-clouds/
    Even if not everything is yet "in the cloud"

    Besides that, and here is my rant about the company:
    I believe the main problem is incompetent management and too much politics in the middle, played to keep everybody not unhappy enough instead of doing what needs to be done. Add an HR department that is mainly there to make senior management happy instead of doing the right thing, and the equation gets no better. This leads to: most of the really good ones either quit or stop caring (although some manage to do really well what they want to do), the mediocre ones get to do most of the work, and the bad ones who like talking and taking credit for other people's work get to manage. Amadeus treats developers and engineers only as a "cost" and not as an investment (even if their words say the contrary).
    This leads to slow change and errors in technical and strategic decisions. I know of some, because I fought over some while I still cared.
    What seems to have happened (in my opinion) can be dealt with (and I'm sure they will because some heads might be rolling right now) in the future.

    1. Re: I work there ... by Anonymous Coward · · Score: 0

      Wrong. HR is there solely, not mostly, to keep senior management happy. Generally by minimizing payout when staff sue, file for unemployment, etc.

      As for the airlines, and major airports, they should each have their own system and share data with a common API for interconnections. That way no airport or airline gets jammed by an outsider failure

  32. Re:profit form change fees and walkup fairs + lost by Anonymous Coward · · Score: 0

    So that's a "no, I can't provide an example of a job posting that does this"?

    tsk tsk, creimer.

    Who the fuck is creimer??

  33. Re: Would cloud hosting have prevented the /. outa by Anonymous Coward · · Score: 0

    You are, that's who.

    Anyone who disagrees with you is creimer. Gotcha. Have you ever thought about getting psychological help?

  34. Re:Would cloud hosting have prevented the /. outag by K.+S.+Kyosuke · · Score: 1

    Reportedly it was due to equipment failure

    If you trust someone with the name "Logan, A Bot"...

    --
    Ezekiel 23:20
  35. Re:Would cloud hosting have prevented the /. outag by K.+S.+Kyosuke · · Score: 1

    Who the fuck is creimer??

    Isn't that the guy living next door to Alice?

    --
    Ezekiel 23:20
  36. Re:profit form change fees and walkup fairs + lost by hackwrench · · Score: 1

    He's using the wrong words, but essentially he's saying that they get to charge you for their mistakes caused by having lousy IT that is lousy because they are saving money by not spending it on training.

  37. Re: profit form change fees and walkup fairs + los by Anonymous Coward · · Score: 0

    Creimer is a sad troll, a symptom, but not the proximate cause, of Slashdot's death spiral.

  38. Re:they don't have local console or able to use IS by Anonymous Coward · · Score: 0

    The disadvantages of using a potentially hacked OS provided by someone who definitely cannot be trusted, should be obvious.