Software Upgrade Crashes UK Air Traffic Control System
pitpe writes "Earlier today the computer system controlling most of the UK's airspace failed, after tests in preparation for an upgrade failed. The original failure occurred at the West Drayton centre, which is an old (70's) system, as opposed to the new system at Swanage, which has had its own problems. A system wide reboot to fix the system resulted in the entire system being taken down temporarily."
And I was going to put the blame on M$, but if it's a 70's system we're talking about I'll just shut up.
Ah the good old reboot and hope for the best method :D
"which is an old (70's) system". As long as it's not 30-year-old hardware then the software should still be fine. Why does everyone think that simply because software was written in the past it is bad?
Considering that up until about 2000, all of the major Air Traffic Control centers in the US were running on vacuum tubes, we were lucky nothing like this ever happened here. Sure, there were glitches at regional centers, that required controllers to do everything by hand, but nothing that required a full reboot of the entire country's ATC system.
Hopefully the UK will get the new system tested and online before it causes more problems!
Whoever stated that signature sizes should be limited to one hundred and twenty characters can just go ahead and kiss my
It seems they have been having problems with their computer systems since 2001 when it was "privatized".
"The air traffic service has been beset by problems since it was partially privatized in 2001. A $484 million center at Swanwick in southern England opened five years late in 2002.
The opening was delayed by problems with computer software, and the glitches continued for months afterward, as controllers misread aircraft altitudes and destinations because of hard-to-decipher computer screens. In at least one case, controllers mistook the Scottish city of Glasgow for Cardiff in Wales."
Now, that seems like a pretty big mistake to me, especially for an air traffic controller to make. However, the article later states that:
"Transport Secretary Alistair Darling said Thursday's problem did not lie at Swanwick but at the older West Drayton center, which is due to be closed by 2007."
Thank goodness that old one is closing, however it doesn't sound like its replacement is doing any better!
"If you want to know what is wrong with transport in this country it is that over decades successive governments did not spend enough on the infrastructure and air traffic control is no different," Darling told BBC radio."
Excellent quote! While terrorism is on everyone's mind, we sometimes forget that transportation safety should be just as high a priority. I couldn't imagine pilots relying on themselves to fly airplanes amid the thousands of others without the aid of traffic controllers and their computers.
Hmmm.
Perhaps a person experienced in ATC software or hardware could enlighten us on the specific system in use, its OS and other trivial bits.
It would help to reduce the coming surge of Microsoft jokes, which are very likely not relevant here.
Vos teneo officium eram periculosus ut vos recipero is.
National Air Traffic Services http://www.nats.co.uk/services/index.html are the outfit responsible for this.
They have a press release http://www.nats.co.uk/news/news_stories/2004_06_03.html which explains quite nicely what they did and why.
Matt Thompson - Actuality - Insert product here.
There are redundant systems in place. Analog radar, humans with brains.
At least there should be. Computers crash, break, have bugs, etc. They're a tool - a more efficient and convenient tool to be sure.
But when they break, there are contingencies so that planes can still take off and land, and won't just fall out of the sky.
This is also why Y2K was such a bunch of stupidity. We really aren't as reliant on computers as people think. We know they crash and are prepared to handle it when they do.
I don't need no instructions to know how to rock!!!!
Much the same thing happened last week in Dublin
It appears you are trying to land a plane. Would you like to:
[x] Allow Windows to detect new hardware ?
[ ] Allow planes to circle in uncertainty ?
[x] Show this window at all airports
This wouldn't have happened had they been using Linux.
This might have happened even if they were running linux. If the software that is used for the air traffic controlling was written badly it still could have crashed.
Evolution or ID?
http://abclocal.go.com/ktrk/news/050404_local_airport.html
Check out gflightcontrol-0.01, then run the usual:
Of course, it requires gnome 2.6 and all deps. Planes will have to circle while everything emerges.
cpghost at Cordula's Web.
This wouldn't have happened had they been using Linux.
No, the air traffic controllers would still be figuring out how to cut/copy/paste while a 747 is on its final approach.
as for the system crashing in the first place, it's unfortunate, but a good thing that they were able to cope and keep everyone safe - that's the main thing, right? (it's certainly my main concern)
and as for the software not being up to the job, it may well not be. after all, air traffic has increased ever so slightly since the 1970's - is it reasonable to expect a program presumably designed for 70's hardware, and 70's air traffic loads to cope with heathrow in 2004?
The new centre is at Swanwick in Hampshire, not Swanage in Dorset!!
--
This sig is inoffensive.
Swanage is a pleasant little seaside resort. I know it well and stayed there a few nights when on my honeymoon.
Finding Swanwick and Swanage on a map of southern England is left as an exercise. Hint: Mapquest may be a good place to start.
Paul
Lasciate ogne speranza, voi ch'intrate
Yes, I think that the software structure of a critical realtime system like ATC is much more important than which OS or language it's written in. It should be built like a strange composite stranded cable, with different strands of simple structure that can survive sporadic (even systemic) failure of its parts. In such a system, there should be no such thing as a system-wide reboot, since the only thing that is truly system-wide is the data.
Without this structure, Linux would probably fail at an unacceptable rate too.
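To make the "stranded cable" idea a bit more concrete, here is a minimal sketch in Python. It is purely illustrative and has nothing to do with how NATS or any real ATC system is built: the names Strand and Cable, the two handlers, and the record fields are all made up for the example. The point it tries to show is that a failing strand gets restarted on its own while the other strands and the data carry on, so nothing resembling a system-wide reboot is needed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Strand:
    """One simple, independent processing path (hypothetical name)."""
    name: str
    handler: Callable[[dict], str]
    restarts: int = 0

    def restart(self) -> None:
        # Only this strand is reset; the data and the other strands are untouched.
        self.restarts += 1


class Cable:
    """A bundle of redundant strands working on the same flight data."""

    def __init__(self, strands: Iterable[Strand]) -> None:
        self.strands = list(strands)

    def process(self, record: dict) -> str:
        # Try each strand in order; a failure restarts that strand only
        # and the record falls through to the next one.
        for strand in self.strands:
            try:
                return strand.handler(record)
            except Exception:
                strand.restart()
        raise RuntimeError("all strands failed for this record")


def primary(record: dict) -> str:
    # Richer output, but it raises KeyError on records missing a field.
    return f"{record['callsign']} -> {record['dest']} FL{record['level']}"


def backup(record: dict) -> str:
    # Deliberately simple: needs only the callsign.
    return f"{record['callsign']} (limited data)"


cable = Cable([Strand("primary", primary), Strand("backup", backup)])
good = cable.process({"callsign": "BAW123", "dest": "EGLL", "level": 350})
degraded = cable.process({"callsign": "EIN456"})
print(good)
print(degraded)
```

The second record is incomplete, so the primary strand fails and is restarted by itself, and the backup strand answers with degraded output. A real system would obviously need far more than this (isolation between processes, health checks, replicated data), but the shape is the same.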
[You have a stable society when some nut guns down a schoolyard and the law doesn't change.]
Slackware has had ATC for years.
- These characters were randomly selected.
To quote from the NATS (National Air Traffic Services) press release:
"The FDP was being tested overnight for a future upgrade. The system was successfully returned to service but at 06.03 errors were detected in the distribution of flight data between Centres. As a precaution, we decided to restart the FDP (known as a cold restart) causing an interruption to full service. The data processing system was restored at 06.42 and declared fully operational at 07.03. Flight capacity restrictions were lifted at 08.05. The system is now fully operational and we are confident that it is stable.
Through the response team at West Drayton, we have been working with airports and airlines to clear the delayed departures, and expect the backlog to be cleared quickly.
Our investigation into the cause of the problem is continuing."
Let me get this straight: they ran a test on the FDP. The FDP glitched. They rebooted the FDP. They are still investigating the problem.
Now, unless I am mistaken, I can only infer from their statement above that they are now running the FDP which is still susceptible to the problems highlighted by the test.
Never just test software upgrades on live systems.
Rus
Cheap UK and US VPS
Yes they did, and no, using a cell phone is not a certainty to cause problems.
It does, however, carry the potential to introduce errors in various systems.
Would you want the altimeter to read 200 feet too high, or have an uncommanded left turn, because some numbnuts is yakking on the cellphone?
"DC-9 flight crew experienced an involuntary turn by the autopilot during cruise. Autopilot reacted normally after the captain asked passengers to turn off any personal electronic devices. Crew later learned that a cell phone in an overhead bin was heard during the time of the autopilot problem."
Unconfirmed reports are stating that apparently one of the air traffic controllers accidentally clicked on the "Windows update" icon. :P
My dad is helping the FAA and the US military design and roll out the next gen ATC software here in the US. He comes home and tells stories that make my skin crawl.
The first version of the software was built using standard current interface guidelines and widgets and the testing group that had no experience with older ATC systems were wowed at how simple and yet powerful it was. Pretty much any random person off the street could look at the screen and easily figure out what was going on and how to do various basic tasks. When that version was demoed to the ATC union the union freaked out at how different it was and thus began a cycle of making it more and more backwards.
So, nowadays the next gen ATC software almost exactly replicates the UI of the old non-computerized and semi-computerized systems. On-screen toggle switches and dials, that sort of thing. The FAA and the ATC union have decided that retraining all of their ATCs to use modern computer interfaces would be a Bad Thing. When the computer screen doesn't exactly replicate the interface of the 50+-year-old systems, they freak out and scream bloody murder. On the flip side, kids coming into the field today that have been using computers most of their lives are finding the interface to be counterintuitive to the point of being almost unusable. Middle-aged workers who are both highly proficient ATCs and home computer users report that switching between the two types of interfaces each night when they go home requires conscious effort on their part, since they are so orthogonal.
So who wins? Historical inertia, of course. Why fix the problem today when you can wait for your successors to fix it in 25 years?