Database Glitch Grounds American/US Airways
An anonymous reader writes "According to numerous news sources, all American Airlines and US Airways flights were grounded for two or three hours this morning. Both problems were caused by a computer glitch in the systems hosted by EDS. Quote: The operating system that drives the airline's flight plans went down."
How in the world can they refer to that in the singular? Surely they have a backup of some sort. Especially with all the supposed "increased security" around air travel, you are telling me that one system crash can knock out half of the major airlines? That's ridiculous. Have they not learned about redundancy?
EDS is by no means a Windows shop. They work extensively with "big iron" mainframes. In fact, they recently got the contract to handle the database of terrorist information that'll be used at airports. Likely this will be hosted on a 390 or something... Windows can't handle that kind of I/O.
What scary news? The airplanes are piloted by people, not computers. And certainly not the computers that control flight plans. Do you think that airplanes will start falling from the skies because a computer went down somewhere? I guess you packed your basement with cans of beans for Y2K too.
Support the First Amendment. Read at -1
They aren't telling the whole story.
I come from Solaris/Veritas/Oracle and Red Hat/Oracle RAC environments. In a properly built HA setup, a single system going down cannot take out the service. Database HA is somewhat complicated and expensive, but it's not rocket science, regardless of platform.
I find it very difficult to believe that they would have any single points of failure in a system of that importance. Blaming MS is the easy way out.
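To make the "no single point of failure" idea concrete, here is a minimal sketch of client-side failover in Python. It is only an illustration: the `Node` class, the `NodeDown` exception, and `query_with_failover` are hypothetical names made up for this example, not how EDS, Oracle RAC, or Veritas actually do it. Real database HA also has to deal with shared storage, replication, and split-brain, none of which is modeled here.

```python
from dataclasses import dataclass
from typing import Sequence


class NodeDown(Exception):
    """Raised by a (hypothetical) node that cannot serve the request."""


@dataclass
class Node:
    # Stand-in for a database server; 'healthy' simulates whether it is up.
    name: str
    healthy: bool = True

    def query(self, sql: str) -> str:
        if not self.healthy:
            raise NodeDown(self.name)
        return f"{self.name} answered: {sql}"


def query_with_failover(nodes: Sequence[Node], sql: str) -> str:
    """Try each node in order and return the first successful answer."""
    failed = []
    for node in nodes:
        try:
            return node.query(sql)
        except NodeDown as exc:
            # Remember which node failed, then fall through to the next one.
            failed.append(str(exc))
    # Only reached when every node is down: the outage case.
    raise RuntimeError("all nodes down: " + ", ".join(failed))


if __name__ == "__main__":
    cluster = [Node("primary", healthy=False), Node("standby")]
    print(query_with_failover(cluster, "SELECT 1"))
```

With the primary marked down, the standby answers the query. The service only fails when every node in the list is down, which is the whole point of paying for the second box.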
If we're going to lay blame, let's make sure we're spreading it evenly. A lot of contracts, and especially government ones, suffer from extreme scope creep. I have seen projects that started out with a 20 page description grow to over 150 pages by the end of the project.
EDS and other large IT vendors try their best to discourage scope creep by making changes-after-the-fact billable for time and materials, instead of a negotiated cost. This makes the project go over budget. If the clients knew what they wanted at the beginning, instead of wasting time and money doing engineering on the fly during the project, then the costs wouldn't be so high.
Don't be so quick to slag EDS about the outage either. There are lots of factors out there that could have contributed. I have worked on projects where the clients say the servers are mission critical, yet can't be bothered to shell out money to upgrade from Ultra 1s and Ultra 5s, let alone pay for an HA solution. The technical people keep providing the justification and making the requests, but it's the project managers and accountants who really determine what kind of solution is feasible.
Don't think you can say much better or anything different about CSC or the rest of the small cadre of IT companies who specialize in winning government contracts around the globe. They've had their share of multibillion-dollar fiascos too.
The problem with these companies is they specialize in winning big contracts. They put their best people on the proposals. They don't specialize in delivering great systems. Their best people probably move to the next RFP and they mostly fill the contracts with warm bodies.
They can get away with it because it's pretty rare for them to actually be punished for poor performance. If they get blacklisted by the agency that awarded the contract, the agency ends up just replacing EDS with CSC or vice versa, and the results don't get any better. I'd be interested if someone could cite a huge government IT contract that actually worked well. At some point governments need to figure out this methodology doesn't work and try something new.
@de_machina