Slashdot Mirror


Computer Error Grounds Japanese Flights

zephiros writes "Mainichi Daily News reports that a "computer glitch" in Tokyo air traffic control systems resulted in the cancellation of 203 flights this weekend. At 7am Saturday, the error "caused the names of airlines and flight numbers to disappear from radar screens." A Japan Times article suggests the problem may be related to upgrades on a system which exchanges flight plans with the Defense Agency. Makes one wonder about the integration and maintenance risks of systems like CAPPS II."

7 of 154 comments

  1. ATC and CAPPS II are NOT connected by MyNameIsFred · · Score: 4, Informative
    Obviously, upgrades to Air Traffic Control (ATC) systems and to the communication links that feed them can cause problems. There is a significant safety-of-flight issue, so the FAA maintains strict control of these systems and, in fact, has a dedicated network reserved for ATC. Only "essential" programs and systems are allowed to connect to it.

    Passenger listings, airline booking systems, and related software are NOT connected to the ATC network. Since CAPPS II looks at booking data, credit card info, and related data, it would not be connected to the ATC network.

  2. Re:2 things I want to know... by Anonymous Coward · · Score: 5, Informative
    > 1) How the hell did the flights get DOWN once the radar died? It said they disappeared from radar, and you don't keep radar on the planes that are on the ground, so....?

    Read the article. It says that just the airline name and flight number tags printed beside the radar blips vanished. The radar worked just fine.

    > 2) Whose bright idea was it to do a "systems upgrade" while there were large, flying metal objects carrying many people still in the air?!?!

    Read the article. The change was made early in the morning on a weekend. When would you suggest?

    > Wouldn't you do a test run, install it on a backup system, or one that's not systems-critical?

    The article (did you read it?) hints that it might have been a networking problem when they integrated the military database with the civilian database. A backup system is a good start, but it isn't always the same as the production system. Network problems can't always be perfectly tested or simulated.

  3. Yes, it is that bad by Anonymous Coward · · Score: 2, Informative

    I've lived here for several years now, and the above stories really are an average selection. On a truly freaky, awful day, you would see stories far worse.

  4. A few thoughts on redundancy. by muonzoo · · Score: 5, Informative
    I think this is one of those rare times where I have an opinion that's actually relevant. :-)

    First, people need to understand that no Bad Things will happen if an ATC system goes offline while planes are under its jurisdiction. ICAO member countries (and most nations for that matter) have strong procedural rules in place that keep planes separated without the help of radar. This is especially true in the enroute case. (Area control centres handle overflight and enroute traffic.) Everyone is separated by at least 1000' vertical and 3 miles horizontal at all times. The altitude restrictions and clearances that each pilot receives are chosen specifically so that in the event of loss of communications, the pilot can continue to his "clearance limit" without any problem. Well, you ask, what happens when he gets to his clearance limit and still isn't communicating with air traffic control? They hold. This is all laid out quite clearly. These rules have been around since before RADAR because that's the way it was done.
    Just take a look at the RADAR coverage map of Canada (one is visible at the link above). There are lots of places that don't even HAVE radar coverage.
    The old tried and true clearance and time/speed based conflict resolution works and works well.

    Secondly, and more importantly, there really isn't any news in this article. It's scaremongering. This happens all the time. It's an inconvenience, but rarely a safety concern.

    For those who asked about it; yes, typically a new system is run in parallel with the legacy system for a period of time (sometimes 24 months) before it is used as the primary control. Notice that the old system is live and the new system is shadowing. That way, anomalies that are found do not impact any flights.
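    The parallel-run arrangement described here can be sketched in a few lines. This is only an illustration under stated assumptions: `run_in_shadow`, `legacy`, and `candidate` are hypothetical names, not anything taken from a real ATC system. The point is that only the legacy result is ever acted on, while disagreements from the new system are logged for later review.

```python
def run_in_shadow(legacy, candidate, inputs):
    """Feed every input to both systems; act only on the legacy result.

    `legacy` and `candidate` are hypothetical callables standing in for
    the old and new systems. Returns (live_results, anomalies).
    """
    live_results = []
    anomalies = []
    for item in inputs:
        live = legacy(item)  # the old system stays in control
        try:
            shadow = candidate(item)  # the new system only shadows
        except Exception as exc:
            # A crash in the shadow system is recorded, never propagated.
            anomalies.append((item, live, repr(exc)))
        else:
            if shadow != live:
                anomalies.append((item, live, shadow))
        live_results.append(live)
    return live_results, anomalies
```

    For example, if the candidate drops the label for one flight, that flight still gets its label from the legacy system and the mismatch simply lands in the anomaly list.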
    [*flame proof underwear on*]
    Is it just me, or does the press dig around for 'news' in about as diligent a manner as Slashdot?
    1. Re:A few thoughts on redundancy. by Microlith · · Score: 2, Informative

      Mainichi Daily News (daily daily news) is often regarded (especially MDN English) as being a tabloid.

      Generally they go for sensational headlines and stories (their "Wai-Wai" section is the most popular).

    2. Re:A few thoughts on redundancy. by Oswald · · Score: 2, Informative
      Hmmm. Perhaps I can help with a few misconceptions here, based on over 19 years of air traffic control experience at Atlanta Center.

      First, people need to understand that some bad things might happen if enough ATC systems go offline at once. Bad things are less likely to happen, as the poster states, if the outages occur in the enroute (my) environment, because the planes are generally farther apart than in terminal airspace. (Picky notes: enroute separation is 5 miles (not 3) OR 1000 feet--not AND--but I'm sure that was just a misstatement.) But they're not THAT far apart. This post makes it sound like any time we want to we can drop back to good old non-radar control. Well, standard separation in a non-radar environment is as high as 10 minutes flying time (longitudinally, which is to say along the same route). That's a lot more than the five miles I was using when the radar was working. The transition will be a bit tricky, and if I have to do it for any length of time, traffic will slow to a virtual standstill.
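      To make the "OR, not AND" point concrete, here is a minimal sketch of the separation figures quoted above. It assumes "miles" means nautical miles and uses flat x/y positions; the `Aircraft` type, the function names, and the 480-knot ground speed in the usage note are hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass

RADAR_LATERAL_NM = 5.0            # enroute radar minimum quoted above
VERTICAL_FT = 1000.0              # vertical minimum quoted above
NONRADAR_LONGITUDINAL_MIN = 10.0  # "as high as 10 minutes" without radar


@dataclass
class Aircraft:
    x_nm: float         # assumed flat-earth position, nautical miles
    y_nm: float
    altitude_ft: float


def radar_separated(a: Aircraft, b: Aircraft) -> bool:
    """Separated if EITHER the lateral OR the vertical minimum is met."""
    lateral = math.hypot(a.x_nm - b.x_nm, a.y_nm - b.y_nm)
    vertical = abs(a.altitude_ft - b.altitude_ft)
    return lateral >= RADAR_LATERAL_NM or vertical >= VERTICAL_FT


def nonradar_gap_nm(ground_speed_kt: float,
                    minutes: float = NONRADAR_LONGITUDINAL_MIN) -> float:
    """Distance covered in the non-radar time interval at a given speed."""
    return ground_speed_kt * minutes / 60.0
```

      Two aircraft 3 miles apart but 1000 feet apart vertically pass the check; the same pair at 500 feet apart does not. At an assumed 480 knots, `nonradar_gap_nm(480)` works out to 80 miles in trail, which is the gap between radar and non-radar spacing that the paragraph above is describing.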

      What's more, it is simply not true that aircraft clearances cover eventualities like lost communications or lost radar. This is a myth, and one that new on-the-job trainees quickly get de-programmed out of their heads. It's not possible to issue clearances that are good all the way to your clearance limit--every aircraft that departs, deviates for weather, changes destination, or even changes altitude (say, for turbulence) has the potential to screw up everybody else's "perfect" clearances. We truly don't even try to come up with such clearances. As for the idea that everybody will get to their clearance limit (actually, it's the published holding pattern for the route they're on to their clearance limit--probably that was simplified for clarity) and hold, that's great until you get the part about "until their estimated time of arrival" (original poster left that part out). Now you have planes dropping out of holding (and BTW, who assigned altitudes to make sure 6 aircraft didn't hold at the same altitude when the radios went out?), not necessarily from the bottom first, and flying to their destination airport. It's a 5-times-a-day event at hubs like Atlanta for 30+ aircraft to be scheduled over one fix in an hour--what are we gonna use for sequencing? TCAS? Common Traffic Advisory freqs? Get serious.

      I'm not trying to scare anybody here. There are redundant systems (and they're pretty well-seasoned at this point anyway, so they almost never break), and ways to get hold of aircraft through company radios, and it really is a big sky. But it doesn't do anybody any good to pretend that it's not dangerous to try to sort out a major arrival rush by looking in your fish-finder and chatting with the other pilots til the controller gets back.

      ATC was invented many decades ago because airplanes flew into each other without it. Those were props, flying to destinations with a tenth the volume of a modern hub. Maybe someday we'll have some cool hive-mind software that will allow the airplanes to sort everything out between themselves, and there won't be any more ground-based controllers. (I won't see it in my career, 'cause I retire in less than 6 years.) Until that time, controllers and reliable control equipment will continue to be necessary for safety as well as expediency.

  5. Re:Risk Maintentance 101 by kryonD · · Score: 3, Informative

    Actually, the damage to the Japanese air system was minimal. The delay only lasted 50 minutes. Unlike American travellers, Japanese people will quietly and in an orderly fashion board a fully booked 747 in under 20 minutes. If asked to hurry, they will board it even faster. That, combined with Narita and Haneda's ability to handle traffic far above their average, had most flights back on time before noon. Only a small handful of international passengers may have had to rebook a connecting flight. Domestic flights are almost always direct.

    As far as risk management, had there actually been a perceived emergency due to the malfunctioning radar display system, the airports would default to an agreement with Yokota and Atsugi US airbases to provide fallback flight control facilities.

    This is really a non-news item. The system administrator correctly applied upgrades during non-critical operation time (i.e., not during the main business week). The problem was identified early on and corrected pretty damned quick. This happens hundreds of times a week all over the world. Had the glitch actually halted the entire Japanese air system for a long period of time, then it would make more sense.

    --
    I've dirtied my hands writing poetry, for the sake of seduction; that is, for the sake of a useful cause. --Dostoevsky