Slashdot Mirror


Comair System Crashes; Passengers Stranded

Broerman writes "30,000 people have had their flights cancelled by Comair this weekend thanks to a computer system shutdown. It appears that, due to weather and other problems, flights began to be cancelled on Thursday and the backlog choked the system. 1,100 flights have been cancelled so far, including all flights through 12/26. Does anyone know what platform their system was based on? What kind of system just totally crashes? The official statement is that 'There was a cumulative effect with the canceled flights and trying to get crew assigned that caused the system to be overwhelmed.' It seems highly improbable that a system would crash because it had too many reservations. The system should only be able to hold as many reservations as it has flights/seats. It seems more likely that the system was overloaded with use and that caused a meltdown. When you add in the problems experienced by US Airways, this hasn't been a Merry Christmas for many."

398 comments

  1. Fire away! by weeksie · · Score: 5, Funny

    Anybody know what they were running? I'd like to see this flamewar get started as soon as possible.

    1. Re:Fire away! by mirko · · Score: 5, Insightful

      There was recently a big card problem here in Europe.
      It did not come from a particular OS; it happened simply because a partition got filled by index tablespace extents.
      So it could just be that they ran out of space and it froze the whole application.

      --
      Trolling using another account since 2005.
    2. Re:Fire away! by Deviate_X · · Score: 3, Interesting

      Interesting...

      Job postings might give some insight into what they are using: Comair, Inc. jobs.

    3. Re:Fire away! by Anonymous Coward · · Score: 0

      The mess could have been caused by the existing platform. The IT job postings are for the platform they are moving to. We don't know yet.

    4. Re:Fire away! by Anonymous Coward · · Score: 0

      This is why middle managers in sales should not be allowed to distribute their homebrew MS Access apps.

    5. Re:Fire away! by theonetruekeebler · · Score: 2, Insightful
      Based on those postings, I'm guessing the application is based on either Oracle or Sybase on HP-UX.

      My preliminary diagnosis: blown rollback segment. With too many flights being cancelled, the simultaneous rescheduling of all those crew resulted in a SQL transaction that exceeded the size of what the DBMS could undo. So an uncommitted statement failed and the application code either was not prepared for such a possibility or could only handle it by timing out. Scheduling tasks could no longer move forward, and right now some poor DBA is hoping to Christ that he printed out that e-mail he wrote asking for more disk space...

      --
      This is not my sandwich.
    6. Re:Fire away! by Anonymous Coward · · Score: 0

      Comair's pilot scheduling software was apparently written by a company in Montreal called Ad Opt Technology (according to this press release: http://tinyurl.com/6z4hk). Ad Opt was recently acquired by Kronos (http://tinyurl.com/69xku).

    7. Re:Fire away! by [Xorian] · · Score: 5, Informative

      Someone from Comair (who shall remain anonymous) provided me with some details which people here would be interested in:

      The computer system in question runs AIX. The box itself is still up and running just fine; this is purely an application error. This application was not written in-house at Comair, but by another large aerospace company -- SBS (http://www.sbsint.com/, owned by Boeing.) This bit of software does not use an external database, it tracks everything itself. It is a dedicated system responsible only for flight crew assignments. (The blather in the original submission about passenger reservations is way off-base. Those functions are handled by a completely different system.)

      The great majority of Comair's traffic flows through the midwest, and the central base of operations is in Cincinnati. The midwest was hit by a major snowstorm this week, causing many, many crew reassignments. It appears right now that the application in question has a hard limit of 32,000 changes per month (ouch). Consider that Comair runs 1,100 flights a day and there are usually 3 crew members on each aircraft. A big storm like this can cause problems for days after the snow stops falling. That's a whole lot of crew changes.
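
      As a rough sanity check on those figures, here is a back-of-the-envelope sketch. The flight count, crew size, and monthly limit are the ones quoted above; the idea that a fully disrupted day changes every crew assignment exactly once is an assumption made only for illustration.

```python
# Figures quoted in the comment above.
FLIGHTS_PER_DAY = 1_100
CREW_PER_FLIGHT = 3
MONTHLY_CHANGE_LIMIT = 32_000

# Assumption (not from the source): on a fully disrupted day, every
# crew assignment is changed exactly once.
changes_per_bad_day = FLIGHTS_PER_DAY * CREW_PER_FLIGHT

# Days of total disruption the monthly limit could absorb.
days_until_limit = MONTHLY_CHANGE_LIMIT / changes_per_bad_day

print(changes_per_bad_day)         # 3300
print(round(days_until_limit, 1))  # 9.7
```

      Under that assumption, normal churn earlier in the month plus a few days of storm recovery would be enough to approach the limit.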

      In Comair's defense, this has never happened before and is unlikely to happen again. The crew system was already on the chopping block long before this incident, with its replacement scheduled to go live in January. If this freak storm had happened a month later, this likely never would have occurred.

      --
      CVS is teh suck. Use Vesta instead.
    8. Re:Fire away! by [Xorian] · · Score: 4, Informative


      Just to be absolutely clear: I've only ever communicated with this person on-line, and I can't verify who they are in real life or that they actually work for Comair. It seemed credible though, and it seemed worth posting to de-bunk the slashdot knee-jerk reaction of blaming Microsoft. To me, an application using a 16-bit integer for something seems like a very likely explanation.
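
      For anyone wondering why a 16-bit integer would line up with a limit near 32,000, here is a minimal sketch. It assumes a counter kept in a signed 16-bit field; the function name is hypothetical, and nothing here is taken from the actual application.

```python
import ctypes

# Largest value a signed 16-bit integer can hold.
INT16_MAX = 2**15 - 1  # 32767

def bump(counter: int) -> int:
    """Increment a counter stored in a signed 16-bit field (hypothetical)."""
    # ctypes.c_int16 does no overflow checking; the value wraps around.
    return ctypes.c_int16(counter + 1).value

print(INT16_MAX)        # 32767
print(bump(32_766))     # 32767
print(bump(INT16_MAX))  # -32768
```

      Whether a real program wraps silently like this or refuses the update depends on the language and on how the limit is checked.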


      --
      CVS is teh suck. Use Vesta instead.
    9. Re:Fire away! by Anonymous Coward · · Score: 5, Informative
      If it was the crew scheduling system, and it was SBS's Maestro Crew scheduling system, I can fill in some details.

      Maestro is delivered on AIX, uses a rather old version of Informix for it's database, and is tied together using the TUXEDO TP monitor from BEA.

      The business logic is written in C, and abstracted away using Tuxedo.

      In the case of a major schedule disruption, this program isn't responsible for "solving" the problem, but is responsible as being the system of record for holding the new crew schedule.

      My guess is that the changes to the crew schedule were large enough that some piece of the system was overwhelmed. ( For example, a transaction that was too large and overran the rollback buffers in Informix ).

      Without the system of record in place, a manual process would be very difficult. You would have to figure out:

      • Which crews where in which locations
      • What aircraft each crew member was qualified on.
      • How long they had flown already that day. ( Legalities about how much time you can fly before you need mandatory rest )
      • Which routes to send those crews on
      • How to get the crews back to a specific city to run the next day's schedule
      Of course, any mistakes you made doing this manually would overflow into other systems. For example, you might send an aircraft that's due maintenance to a city with no maintenance facilities.

      Also, for those that were critical of the system not being highly available... this doesn't sound like the kind of problem that HACMP and replicated databases would have helped with. The hot standby would have choked at the exact same point.

    10. Re:Fire away! by Daa · · Score: 5, Informative

      Just to give you an idea, here is the applicable FAA reg for crew scheduling; the pilots' contract may have additional terms that must be met.

      121.471 Flight time limitations and rest requirements: All flight crewmembers.

      (a) No certificate holder conducting domestic operations may schedule any flight crewmember and no flight crewmember may accept an assignment for flight time in scheduled air transportation or in other commercial flying if that crewmember's total flight time in all commercial flying will exceed--

      (1) 1,000 hours in any calendar year;

      (2) 100 hours in any calendar month;

      (3) 30 hours in any 7 consecutive days;

      (4) 8 hours between required rest periods.

      (b) Except as provided in paragraph (c) of this section, no certificate holder conducting domestic operations may schedule a flight crewmember and no flight crewmember may accept an assignment for flight time during the 24 consecutive hours preceding the scheduled completion of any flight segment without a scheduled rest period during that 24 hours of at least the following:

      (1) 9 consecutive hours of rest for less than 8 hours of scheduled flight time.

      (2) 10 consecutive hours of rest for 8 or more but less than 9 hours of scheduled flight time.

      (3) 11 consecutive hours of rest for 9 or more hours of scheduled flight time.

      (c) A certificate holder may schedule a flight crewmember for less than the rest required in paragraph (b) of this section or may reduce a scheduled rest under the following conditions:

      (1) A rest required under paragraph (b)(1) of this section may be scheduled for or reduced to a minimum of 8 hours if the flight crewmember is given a rest period of at least 10 hours that must begin no later than 24 hours after the commencement of the reduced rest period.

      (2) A rest required under paragraph (b)(2) of this section may be scheduled for or reduced to a minimum of 8 hours if the flight crewmember is given a rest period of at least 11 hours that must begin no later than 24 hours after the commencement of the reduced rest period.

      (3) A rest required under paragraph (b)(3) of this section may be scheduled for or reduced to a minimum of 9 hours if the flight crewmember is given a rest period of at least 12 hours that must begin no later than 24 hours after the commencement of the reduced rest period.

      (4) No certificate holder may assign, nor may any flight crewmember perform any flight time with the certificate holder unless the flight crewmember has had at least the minimum rest required under this paragraph.

      (d) Each certificate holder conducting domestic operations shall relieve each flight crewmember engaged in scheduled air transportation from all further duty for at least 24 consecutive hours during any 7 consecutive days.

      (e) No certificate holder conducting domestic operations may assign any flight crewmember and no flight crewmember may accept assignment to any duty with the air carrier during any required rest period.

      (f) Time spent in transportation, not local in character, that a certificate holder requires of a flight crewmember and provides to transport the crewmember to an airport at which he is to serve on a flight as a crewmember, or from an airport at which he was relieved from duty to return to his home station, is not considered part of a rest period.

      (g) A flight crewmember is not considered to be scheduled for flight time in excess of flight time limitations if the flights to which he is assigned are scheduled and normally terminate within the limitations, but due to circumstances beyond the control of the certificate holder (such as adverse weather conditions), are not at the time of departure expected to reach their destination within the scheduled time.
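
      Paragraphs (a) and (b) above translate fairly directly into code. This is a minimal sketch, assuming hours are tracked as plain numbers; the function names are made up for illustration, and the reduced-rest exceptions in paragraph (c) are deliberately left out.

```python
def required_rest_hours(scheduled_flight_hours: float) -> int:
    """Minimum consecutive rest in the preceding 24 hours, per paragraph (b)."""
    if scheduled_flight_hours < 8:
        return 9    # (b)(1): less than 8 hours of scheduled flight time
    if scheduled_flight_hours < 9:
        return 10   # (b)(2): 8 or more but less than 9 hours
    return 11       # (b)(3): 9 or more hours

def within_flight_time_limits(year_hours: float, month_hours: float,
                              week_hours: float, since_rest_hours: float) -> bool:
    """Check projected totals against the caps in paragraph (a).

    The rule forbids totals that *exceed* a cap, so hitting a cap exactly passes.
    """
    return (year_hours <= 1000 and month_hours <= 100
            and week_hours <= 30 and since_rest_hours <= 8)

print(required_rest_hours(7.5))                     # 9
print(required_rest_hours(8))                       # 10
print(within_flight_time_limits(900, 101, 10, 5))   # False
```

      A real scheduler would have to evaluate checks like these for every candidate crew/flight pairing, on top of whatever the pilots' contract adds.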

    11. Re:Fire away! by Anonymous Coward · · Score: 0

      The computer system in question runs AIX.
      That was the old one. They did the switch after July 4th to a new one that was designed by EDS.

    12. Re:Fire away! by Anonymous Coward · · Score: 5, Informative
      No. It is the version of SBS that pre-dated Maestro. It was brought into Comair in the early 1980s. It's written in FORTRAN and uses whatever record management system came with the compiler.

      As such it used some very interesting data representations. For example, it tracked time using Julian minutes. There are 44,640 minutes in a 31-day month. That's small enough to fit in a 16-bit unsigned variable. This approach, nearly taboo by modern standards, was a godsend during Y2K. The system never needed to know what year it was. It became the running wisecrack: "You can't have a Y2K problem if you don't have a 'Y'".
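
      The arithmetic behind that representation can be sketched as follows. This assumes "julian minutes" means minutes counted from the start of the month; the helper name and the exact encoding are guesses for illustration, not the system's actual format.

```python
MINUTES_PER_DAY = 24 * 60
UINT16_MAX = 2**16 - 1  # 65535, largest unsigned 16-bit value
INT16_MAX = 2**15 - 1   # 32767, largest signed 16-bit value

def minute_of_month(day: int, hour: int, minute: int) -> int:
    """Minutes elapsed since 00:00 on day 1 of the month (day is 1-based)."""
    return (day - 1) * MINUTES_PER_DAY + hour * 60 + minute

minutes_in_31_days = 31 * MINUTES_PER_DAY
print(minutes_in_31_days)                # 44640
print(minutes_in_31_days <= UINT16_MAX)  # True
print(minutes_in_31_days <= INT16_MAX)   # False
print(minute_of_month(25, 12, 0))        # 35280
```

      Note that a full month fits in an unsigned 16-bit value but not in a signed one, so the choice of signedness matters for this kind of field.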

      Aircraft-to-flight assignment is handled by another system, but the two share information.

    13. Re:Fire away! by Anonymous Coward · · Score: 1, Insightful

      Ahh. I'm surprised the "pre Maestro" stuff still exists. In fact, I think SBS's preferred platform for the older stuff was Ultrix. If COMAIR waited this long to address replacing this ancient FORTRAN spiderweb, they made their own bed. I think SBS released Maestro to replace that stuff in 1993 or so.

    14. Re:Fire away! by pVoid · · Score: 2, Interesting
      I don't think they keep a SQL transaction running for as long as the flight hasn't taken off.

      SQL transactions generally last seconds and involve operations like "open tr, is there space in this flight?, reserve space, close tr". Not "open tr, wait for flight to fill up, close tr". Rescheduling or canceling flights probably isn't accomplished using transactions: it's application level logic.
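
      The "open tr, check, reserve, close tr" pattern can be sketched with Python's built-in sqlite3 module. The table and column names are invented for the example, and SQLite is only a stand-in; nothing is known here about the real reservation system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, seats_left INTEGER)")
conn.execute("INSERT INTO flights VALUES (1, 2)")
conn.commit()

def reserve_seat(conn, flight_id):
    """Reserve one seat in a single short transaction; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE flights SET seats_left = seats_left - 1 "
            "WHERE id = ? AND seats_left > 0",
            (flight_id,),
        )
        # The check and the reservation are one statement, so the
        # transaction is over in milliseconds either way.
        return cur.rowcount == 1

print(reserve_seat(conn, 1))  # True
print(reserve_seat(conn, 1))  # True
print(reserve_seat(conn, 1))  # False
```

      Each call touches one row and commits immediately, which is why transactions of this kind stay small no matter how long the flight remains open for booking.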

      My personal diagnosis: I think it has nothing to do with the backlog, and that the system just melted under high strain (of millions of people trying to book other flights). Either that, or they ran out of disk space.

    15. Re:Fire away! by theonetruekeebler · · Score: 1
      According to TFA (please R), the problem appears to be with the software used to reassign crews to planes, not the reservation system. Assigning crews works like this:
      1. here is a list of flights that need crews
      2. here are the unassigned crewmembers
      3. mash them together into a fine paste.

      Step 3 is the one that can be an O(2^n) problem of assigning weights to different crew/flight combinations. Even with a very clever set of heuristics the problem is at the very least still an O(n^2) or more likely O(n^2*lg(n)) with an unconscionably high coefficient.

      There are an amazing number of variables:

      • where are the flights going from/to
      • where are the crewmembers currently located
      • cost of deadheading crew
      • which crew is rated for what type of equipment
      • which crew are already assigned
      • which multirated crew can/should be reassigned
      • how close are crew to their legal flight time limits
      • which flights will not put those crew over their limits

      As you can see, this sort of thing tends to stack up, and involves building lots of intermediate data. If you feed 500 flights into a system designed to handle 20 or 30 at a time, well, the problem is somewhere between 275 and 625 times more complex for O(n^2) and 1300 times more complex for O(n^2*lg(n)), and while building the cost matrix for the flight assignments, yes, they ran out of fucking disk space. Do you not know what a rollback segment is? It's what makes you run out of disk space while updating a table 1300 times larger than you thought it would be. Sheesh.
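
      Those multipliers can be reproduced with a few lines. This is only a sketch of the arithmetic: it treats the cost models as exact and ignores constant factors, and the function names are illustrative.

```python
import math

def growth(n_small: int, n_big: int, cost) -> float:
    """How many times more work n_big needs than n_small under a cost model."""
    return cost(n_big) / cost(n_small)

def quadratic(n: int) -> float:
    return n ** 2

def quadratic_log(n: int) -> float:
    return n ** 2 * math.log2(n)

print(round(growth(30, 500, quadratic)))      # 278
print(round(growth(20, 500, quadratic)))      # 625
print(round(growth(20, 500, quadratic_log)))  # 1297
```

      So the O(n^2) range comes from designing for 30 versus 20 flights, and the roughly 1300-fold figure corresponds to the O(n^2*lg(n)) model with a design size of 20.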

      --
      This is not my sandwich.
    16. Re:Fire away! by pVoid · · Score: 2, Insightful
      Do you not know what a rollback segment is? It's what makes you run out of disk space while updating a table 1300 times larger than you thought it would be

      Yes, but you pretty much spelled out what my point was in that the n^2 complexity issue is unrelated to transactional operations. That is, a transaction is a transaction, it is scalable, so it doesn't matter whether the actual operation for computing stuff is O(n^2), the transaction is still a fixed cost. On a side point: I don't agree that because the problem is 1300 times more complex, the updates are 1300 times bigger. The problem is still based on n elements: it just happens that computing the solution of a problem with n elements takes n^2 time... the end result though is still n elements to update.

      That being said, I am fairly confident modern relational databases are scalable to the point of being able to handle a 500 fold increase (if only by simply slowing down to a crawl - but not crashing). If anything, it's probably internal application logic that wasn't able to handle the added computational complexity and at a certain point hit a hard limit of its scalability (some fixed sized arrays, or indexes of some sort).

      My comment about 'ran out of disk space' was more along the lines of "it's either an application fault, or something mundane like someone forgot to check if they had sufficient disk space" (something which can happen anytime due to neglect).

    17. Re:Fire away! by Anonymous Coward · · Score: 0

      for it's database

      "its".

      where in which locations

      "were".

    18. Re:Fire away! by theonetruekeebler · · Score: 1
      I don't agree that because the problem is 1300 times more complex, the updates are 1300 times bigger.
      More precisely, the complexity is a function of the number of unassigned crew multiplied by the number of unassigned flights. In a typical solution, a matrix is built, costs are assigned to each combination, and assignments are built from there. Certain heuristics can keep ludicrous combinations out of the matrix, but the matrix is still going to be big. Since this is an RDBMS-based application, cost setting goes like this:
      UPDATE crew_cost_matrix SET distance=get_distance(crew_id, flight_id);
      Updating the entire table in one pass. So: If the table is big enough, the rollback segments (or something else) explode.
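
      To make the size of that matrix concrete, here is a sketch using Python's sqlite3 module as a stand-in database. The table name and get_distance come from the statement above; everything else is assumed, including the placeholder distance function. The batched update is a swapped-in alternative to the single-pass UPDATE, shown only to illustrate how smaller transactions bound the amount of undo data per commit.

```python
import sqlite3
from itertools import product

def build_cost_matrix(conn, crew_ids, flight_ids):
    """Create one row per (crew, flight) pair; returns the row count."""
    conn.execute("CREATE TABLE crew_cost_matrix "
                 "(crew_id INTEGER, flight_id INTEGER, distance REAL)")
    with conn:
        conn.executemany(
            "INSERT INTO crew_cost_matrix VALUES (?, ?, NULL)",
            product(crew_ids, flight_ids),
        )
    return conn.execute("SELECT COUNT(*) FROM crew_cost_matrix").fetchone()[0]

def update_in_batches(conn, crew_ids, batch_size, get_distance):
    """Set distance one batch of crew at a time, committing after each batch."""
    conn.create_function("get_distance", 2, get_distance)
    commits = 0
    for start in range(0, len(crew_ids), batch_size):
        batch = crew_ids[start:start + batch_size]
        marks = ",".join("?" * len(batch))
        with conn:  # one transaction per batch keeps each undo log small
            conn.execute(
                "UPDATE crew_cost_matrix "
                "SET distance = get_distance(crew_id, flight_id) "
                f"WHERE crew_id IN ({marks})",
                batch,
            )
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
crews, flights = list(range(60)), list(range(50))
print(build_cost_matrix(conn, crews, flights))  # 3000
# Placeholder distance function, purely for illustration.
print(update_in_batches(conn, crews, 20, lambda c, f: abs(c - f)))  # 3
```

      Sixty crew and fifty flights already give 3,000 rows; the row count grows with the product of the two, which is the point being made about the single-pass update.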

      I will cheerfully concede, however, that the process could require O(n^2) computation time, thus bringing the system to a halt that way, before the disks even light up.

      --
      This is not my sandwich.
    19. Re:Fire away! by pVoid · · Score: 1
      Agreed.

      (always makes me feel warm inside that you can have logical conversations every once in a while on /. =)

    20. Re:Fire away! by theonetruekeebler · · Score: 1

      Yeh. Sorry I got snippy earlier. New baby in the house. Sleeplessness affects mood.

      --
      This is not my sandwich.
    21. Re:Fire away! by Anonymous Coward · · Score: 0
      all good man. =)

      -pVoid

    22. Re:Fire away! by Anonymous Coward · · Score: 0

      It wasn't Maestro Crew. It was something much older that is being replaced.

    23. Re:Fire away! by TechSoft04 · · Score: 1

      I live in Cincinnati and we have storms like this every few years. Comair has been lucky in prior years that this application hasn't choked. My hunch is that Comair has been growing in the past few years and has added more pilots and stewardesses, which caused the application to hit that critical breaking point in a storm like the one they had. But to not have any type of contingency plan in place is insane! Even a hot site or live backup server could have helped them with regard to limiting how many changes could be made after the crash. Load balancing hardware and software could have helped. I have worked with hundreds of IT departments in large companies over 22 years, and having a failover system and a contingency plan has been priority 1. IT shops understand how much it costs their companies to be down just one hour. No matter what software or hardware Comair is using, there is no excuse not to have redundancy.

    24. Re:Fire away! by Anonymous Coward · · Score: 0

      Oh god fuck sita. Those guys are stupid. And crazy. Never call them, you get fwd'd to India then to Pakistan then to Atlanta then to Africa.

      Those guys should be avoided like the clap.

    25. Re:Fire away! by Tassach · · Score: 1

      That's because once you start a highly technical thread, the 14 year olds can't even understand the conversation enough to interject a comment.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    26. Re:Fire away! by pVoid · · Score: 1

      Ahh. Man, look. I'm really not a "told you so" kind of guy, and I'm just being jovial here, but it turns out I was right. It *was* an internal scalability issue after all! =)

    27. Re:Fire away! by theonetruekeebler · · Score: 1
      Rather than fret that my theory didn't pan out, I'd just like to say

      BAAHAHAHAHA! What'd they prototype this on, a Gameboy?

      TFA says there was a hard limit of "32000". I wonder if they mean a signed 16-bit counter or a #define TABLE_MAX 32658. Either way, ewww!

      --
      This is not my sandwich.
    28. Re:Fire away! by ddusza · · Score: 1

      Rats! I couldn't get my FAR/AIM 2004 out quick enough for this story! Don Student--PPSEL 44.5 hrs and holding....

      --
      Don't fear the penguins
    29. Re:Fire away! by pVoid · · Score: 1
      I know man. Thing is, I'm sure this is in some obscure part of their mainframe system, and that code was written while Pan-Am was still around.

      I wouldn't be surprised if they actually didn't have the source anymore. But I'm sure they do now, since they're a responsible company, right? =)

    30. Re:Fire away! by Anonymous Coward · · Score: 0

      The writer above is correct. Comair uses SBS' old crew scheduling system TRACK which has a limit of 32,780 trips in the database. Whenever a trip is modified, a new trip with a new trip number is created. (this allows the company to see the history of a pairing). When the trip limit is reached, the system does not accept the modifications. Comair's problems probably started on the 23rd or 24th and they soon realised they did not know where their crews were and hence had to shut down on the 25th.
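
      The behaviour described above (each modification consumes a fresh trip number, and modifications are refused once the numbers run out) can be modelled in a few lines. This is a toy sketch, not TRACK's actual design; the class name, the error type, and the airport codes are invented for illustration.

```python
class TripTable:
    """Toy model: every modification creates a new trip number, up to a cap."""

    def __init__(self, max_trips: int):
        self.max_trips = max_trips
        self.trips = {}          # trip number -> (pairing, previous trip number)
        self.next_number = 1

    def _new_trip(self, pairing, previous):
        if self.next_number > self.max_trips:
            raise OverflowError("trip limit reached; modification rejected")
        number = self.next_number
        self.trips[number] = (pairing, previous)
        self.next_number += 1
        return number

    def create(self, pairing):
        return self._new_trip(pairing, None)

    def modify(self, old_number, new_pairing):
        # The old record is kept so the pairing's history stays visible.
        return self._new_trip(new_pairing, old_number)

table = TripTable(max_trips=3)   # tiny cap for demonstration; the comment cites 32,780
t1 = table.create("CVG-ORD")
t2 = table.modify(t1, "CVG-DTW")
t3 = table.modify(t2, "CVG-ATL")
try:
    table.modify(t3, "CVG-LGA")
except OverflowError as err:
    print(err)  # trip limit reached; modification rejected
```

      Because old trips are never reused, a burst of reassignments burns through the number space far faster than the count of live trips would suggest.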

      As for Comair, they have an excellent reputation in the industry for using IT aggressively and well. Many would rank their IT department as one of the best. Compared to the majors' IT systems, they are very sophisticated. (United has a hard limit in its flight attendant crew scheduling system of 32,000 total flight attendants for the same reason as SBS.)

      Comair has wanted to replace the SBS for many years but no adequate replacement could be found. In late 2003, they cut a deal to replace SBS with Sabre's AirCrews system. This system should go online in January. (No one cuts over to a new system in December. Who wants to take that risk?)

      Don't blame Comair or SBS, Stuff Happens - we're all human

  2. Happens all the time... by Anonymous Coward · · Score: 5, Interesting

    When I lived in Chicago, they would lose their radar system on what seemed like a strong wind. And I got stuck in Denver overnight once because the computer system they use to calculate the weight of departing flights crashed. I have a feeling these kinds of crashes are much more common than most people think.

    1. Re:Happens all the time... by hughk · · Score: 4, Informative
      I have a lot of friends working at a large airline.

      Yes, but it is mostly recoverable. The heavy iron handles things like backend reservations, checkin and cargo. Smaller systems handle things like weight/balance and fuel and PCs are typically used for the front-ends.

      Weight/balance calcs can be done more or less by hand if necessary; however, a larger fuel margin is needed. Checkin can be done by hand (you have seen those sticky label systems). However, to lose reservations is a major problem.

      --
      See my journal, I write things there
    2. Re:Happens all the time... by dattaway · · Score: 1

      Apparently, this kind of crash is recoverable, but I wouldn't feel good about it happening.

    3. Re:Happens all the time... by Greyfox · · Score: 4, Interesting
      From looking at the various terminals that the airline people use, I suspect that most of those airline systems are held together with duct tape and library paste and no one really understands how the whole system works anymore. We see that a lot in non-IT industries (And a few IT ones, too.) Of course, the folks using the IBM ones are not ever supposed to go down...

      I moonlighted as an AS/400 operator for a cruise line for a while. We had the system go down once because the janitor turned off the air conditioner in the closet the AS/400 lived in. They didn't dedicate a more secure facility for the computer because the computer wasn't demonstrably central to how the company made money. Turns out they couldn't launch a ship without it. Oops. I suspect that mentality is also prevalent throughout the non-IT industries. They don't know how important their computers are to their business models until those computers die on them.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    4. Re:Happens all the time... by Rosonowski · · Score: 2, Funny

      Wow. I would hate to be the one sitting there when that happened.

      There's some...thing on the ... wing

      --
      01101001 01100001 01101101 01101110 01101111 01110100 01100001 01101100 01100001 01110111 01111001 01100101 01110010
    5. Re:Happens all the time... by Anonymous Coward · · Score: 0

      What the hell is library paste? Paste to fix book bindings?

    6. Re:Happens all the time... by Anonymous Coward · · Score: 0
      There's some...thing on the ... wing
      Surely it would be:
      There's... something... on the wing.
    7. Re:Happens all the time... by budgenator · · Score: 2, Insightful

      Not too hard to imagine. I see a system that's a combination of Fortran 66, COBOL, and C all sort of working together over the years. All parts have had numerous patches and changes applied over the years until no one understands it anymore, with each iteration making the system more fragile. Now they are lucky if they have the source code for the current build.
      Each time the industry is making money and IT is flush, a project is started to examine all the code in the system and refactor and rewrite to modern standards, and each time the project gets just past the planning phase the economy takes a dump and the team gets laid off.
      Now that the problem has had an economic impact on the company, the PHB is going to send it off to India, to some kid 6 months out of college who is going to have to google the meaning of a GOTO statement, used in the million lines of code that are older than he is.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    8. Re:Happens all the time... by TopShelf · · Score: 4, Funny

      I used to work with a guy who at one time was an HP3000 operator back when those things were as big as your average washer/dryer combo. His shop had about a dozen of these things, and one night he and a buddy were playing frisbee with the circular write-protect rings that were used on the reel-to-reel tape drives.

      Sure enough, his buddy whipped one at his head, and as he ducked out of the way, he fell back and by accident hit the power switch located on the back of one of the HP3000s. In an instant, all the ticket terminals for one airline (I can't recall which one) at O'Hare airport went down, prompting a frantic call from VPs wondering what disaster had struck. So who knows what could have happened this time around...

      --
      Stop by my site where I write about ERP systems & more
    9. Re:Happens all the time... by Anonymous Coward · · Score: 0

      Not if there were some thing on the wing

    10. Re:Happens all the time... by Anonymous Coward · · Score: 0

      My ship ... my ... crew.

    11. Re:Happens all the time... by phil+reed · · Score: 2, Insightful

      Of course, the folks using the IBM ones are not ever supposed to go down...
      There's a difference between the machine crashing and the application crashing.

      --

      ...phil
      "For a list of the ways which technology has failed to improve our quality of life, press 3."
    12. Re:Happens all the time... by Anonymous Coward · · Score: 0

      held together by years old patch and paste legacy big iron mainframe or unix systems.

    13. Re:Happens all the time... by Anonymous Coward · · Score: 0

      as opposed to some one?

    14. Re:Happens all the time... by arodland · · Score: 1
      Don't you mean:

      There's a man! on the wing of the plane!

    15. Re:Happens all the time... by Rosonowski · · Score: 1

      Ok, I'll admit. It's been so long since I've actually seen the movie that I'm just working off of faded memories and parodies, so you might be right.

      --
      01101001 01100001 01101101 01101110 01101111 01110100 01100001 01101100 01100001 01110111 01111001 01100101 01110010
    16. Re:Happens all the time... by A+Naughty+Moose · · Score: 1
      as opposed to some one?


      No, as opposed to some thing. As in the Twilight Zone episode Nightmare at 20,000 Feet, where William Shatner played a guy who got a window seat on a plane and saw some thing messing around on the wing, presumably trying to make the plane crash.
    17. Re:Happens all the time... by Coniagas · · Score: 2, Funny

      Without mentioning names, I have also worked on several airline systems on a contract basis. Two years ago I was asked to look at a problem with flight ops and was shown a 486 DX2/80 running Novell 2.11. I was told to just patch it till they could look at replacing the system. I did, and a few months ago I was in the same office and was asked to look at the flight ops server that was "burping". They had upgraded to a P2-400, still running Novell 2.11.

      I was told this was a major upgrade. Some things never change.

    18. Re:Happens all the time... by Greyfox · · Score: 2, Insightful
      Funny how you never really hear about the applications written in COBOL, Fortran and PL/1 crashing. You get the impression that all those applications run for years at a time without so much as a hiccup. It's only with the invasion of GUIs and "modern" design techniques and languages that you start hearing about crashes like this. Granted the newer applications tend to be more ambitious about what they do...

      I'd love to see some uptime numbers for past systems versus the systems we have today. I wonder if they'd show the downward trend that I suspect they would.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    19. Re:Happens all the time... by hughk · · Score: 0

      Write-permit rings or write rings are light and don't actually fly well, so I doubt they were playing with that as it is unlikely to mass enough to flip the switch. More likely frisbee with the 9-track canisters.

      --
      See my journal, I write things there
    20. Re:Happens all the time... by Kristoffer+Lunden · · Score: 1

      Mr Smith: "There is a gremlin destroying the plane! You've gotta believe me!"
      Speaker: "Why should I believe you? You're Hitler!"

    21. Re:Happens all the time... by myov · · Score: 1

      A few months back, Air Canada shut down for at least a day because the system which calculates fuel requirements (outsourced to IBM) went down. I thought it was due to an upgrade, but I could be wrong.

      --
      I use Macs to up my productivity, so up yours Microsoft!
    22. Re:Happens all the time... by Anonymous Coward · · Score: 0

      Thanks Einstein. Next time, read the whole reply thread before sounding like comic book guy. First off, thanks so much for the 411 on the TZ episode. I can see how you would assume someone reading /. doesn't know their TZ episodes.

      Second, the entire reply thread was regarding the pause being on some...thing or ...something. Putting the pause as some...thing implies it is either someone or something. The other way works better.

      So what, I hear you saying. Well, I guess my whole point is if it doesn't interest you, at least have the courtesy to read the entire thread before sounding like an ass munch.

    23. Re:Happens all the time... by HiThere · · Score: 2, Interesting

      There were many of them that did, however, crash. But the reason you don't hear about it much is that most of them weren't designed to be running all of the time, but only occasionally. If one crashed (and was a known good program) you'd just re-run it. Frequently that was your only choice, as you might not have anything but the binary. (Sloppy contracts often left consultants with the only copy of the source.)

      I did hear of one company that went out of business because their accounting system was written in a combination of those languages (plus a bit of assembler, and some binary patches). When it was done, they let the consultants go. A few years later the consultants didn't have a copy of the source anymore, and some tax law changes took effect. Oops! (That's not exactly a crash, but it wiped out the whole company.)

      OTOH, when I was writing Fortran I had frequent crashes. I never got programs as solid as I later did with C. But they were "good enough". (Actually, a bit better than good enough. I was criticized for "gold plating" code that didn't need it.)

      A new degree of error frequency, however, entered with dynamic memory allocation. This allowed memory leaks that had previously been the province of the compiler (and assembly language subroutines). One must write very disciplined C code to avoid memory allocation problems unless you just don't do dynamic memory allocation. And as multi-tasking operating systems became common it also became more common to have interaction problems. Etc.

      But I can guarantee you that it's quite as possible to have those problems with PL/1 if you use a multi-tasking OS. And likewise if you use Java or Python, or a similar language with constraints on pointer use, you can avoid those problems. (This doesn't get rid of other problems. Thread synchronization problems are still problematical ... though you might check out Erlang or Inferno. I think they both claim to have general solutions. [The Erlang solution has been ported to Python under the name of Candygram, but I haven't checked it out yet.])

      But if you haven't heard of the older programs failing, it's because they are older, and the flaky ones have been retired or repaired.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    24. Re:Happens all the time... by jjp5421 · · Score: 1

      Preaching to the choir bro!!!

    25. Re:Happens all the time... by randombit · · Score: 1

      Funny how you never really hear about the applications written in COBOL, Fortran and PL/1 crashing.

      Not really. They were probably written 20 or more years ago, there has been plenty of time to catch most of the bugs. Not a function of language, techniques, or skill, just time and use. And stuff written in COBOL and PL/1 probably doesn't get touched much, so no new bugs.

      I'd love to see some uptime numbers for past systems versus the systems we have today. I wonder if they'd show the downward trend that I suspect they would.

      On average, sure. Most servers don't stay up for more than a few months, a couple of years at best. But they also cost a lot less than an S/360 did in 1965. You want a box that will stay up, buy a Tandem or something.

    26. Re:Happens all the time... by arodland · · Score: 1

      Or we're talking about different things. But I still like mine better. Best... Shatner... Ever!

    27. Re:Happens all the time... by some+guy+I+know · · Score: 1
      In addition to what the other two responders wrote, the requirements for present-day software are higher than those of 20 years ago:
      • Applications are expected to interact with other applications.
      • Tax laws and regulations are more numerous and Byzantine than they used to be.
      • The people who use the software tend to get less training than they used to, making it more likely that they will do something stupid.
      • Management is less in awe of and respects less the people who run the machines, and, with rapidly decreasing profit margins (and even without them), are likely to under-allocate the resources needed to do a project well.
      --
      Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
    28. Re:Happens all the time... by Greyfox · · Score: 1

      Hmm. That's interesting. By the way, I see a lot of Java programs "crash" with uncaught NullPointerExceptions. Just because Java eliminates the burden of dealing with dynamic memory allocation doesn't mean your programs magically become crashproof in that language either. It takes a sloppy programmer, but that's most of the programmers out there.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    29. Re:Happens all the time... by Anonymous Coward · · Score: 0
      Putting the pause as some...thing infers it is either someone or something
      You mean implies, yes? Sheesh.
    30. Re:Happens all the time... by Hard_Code · · Score: 1

      NullPointerExceptions are a hell of a lot more recoverable than say, segfaults, or worse yet, corruption that keeps a system nominally "working" but in a corrupted state, e.g. an int wrap around, etc.
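
      A minimal sketch of that contrast, assuming a counter stored as a signed 16-bit integer. Python integers do not wrap on their own, so the wraparound is simulated with a bit mask; the function names are made up for illustration and do not describe any real system discussed in this thread.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def add_int16_wrapping(a: int, b: int) -> int:
    """Add as a signed 16-bit counter would: overflow wraps around silently."""
    # Shift into the unsigned range, keep the low 16 bits, shift back.
    return ((a + b + 0x8000) & 0xFFFF) - 0x8000


def add_int16_checked(a: int, b: int) -> int:
    """Same addition, but fail loudly instead of returning a corrupted value."""
    result = a + b
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a signed 16-bit integer")
    return result
```

      The wrapping version returns a plausible-looking number and the program carries on in a corrupted state; the checked version stops at the point of the error, which is the recoverable kind of failure.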

      That's sort of like saying "If you think about it, you can put all the armor in the world on a tank and it can still be blown up".

      --

      It's 10 PM. Do you know if you're un-American?
    31. Re:Happens all the time... by aminorex · · Score: 1

      C'mon, insightful? Ludicrous. Wait for 30 years,
      to allow all the easy prey to die off, and the same
      could be said for Java apps, or C#, or OCAML, whatever.

      Selection of the fittest, man. COBOL, Fortran,
      and PL/1 died off because they could not compete.
      The apps they compiled live on, but only while they are competitive.

      --
      -I like my women like I like my tea: green-
  3. Official my arse... by Omicron32 · · Score: 4, Insightful

    Sounds like my Mother wrote the official statement. A techy would never report something in that way.

    Besides, it's pretty obvious their OS wasn't digitally signed. :p

    1. Re:Official my arse... by Saven+Marek · · Score: 2, Informative

      You know I think it was. btw the system being used by Comair?

      It's one of SCO's last large scale deployments. You know who to blame now.

      Online Anime Gallery's

    2. Re:Official my arse... by Anonymous Coward · · Score: 0

      Yeah Microsoft :)
      This is Sheepdot after all

    3. Re:Official my arse... by AKnightCowboy · · Score: 1
      The Comair system runs on Linux using an IBM DB2 backend. No wonder it crashed. Linux isn't built to handle that kind of load. The Windows 2000 Server system they were previously running with MS-SQL handled last year's Christmas rush with no problems.

      /pulled that out of my ass, Merry Post-XMAS day! :-)

  4. Someone's gotta say it... by mOoZik · · Score: 3, Insightful

    Yep, it was Windows XP. ;)

    I don't know. Frankly, it has less to do with the platform than the custom software that runs on it.

    1. Re:Someone's gotta say it... by Lamieur · · Score: 0

      The truth was: PEBKAC :)

      What difference does the OS make? It's not hard to guess it's too much money for them to lose upon OS crash/hardware failure, thus they surely have backup hardware running in parallel. It's the custom software that is buggy and causing all this mess.

    2. Re:Someone's gotta say it... by jcr · · Score: 2, Interesting

      Well, judging by the IT jobs they're advertising on their web site, it looks like a combination Windows/Linux/UNIX shop.

      At any rate, I suspect they'll be looking for a new IT director Real Soon.

      -jcr

      --
      The only title of honor that a tyrant can grant is "Enemy of the State."
    3. Re:Someone's gotta say it... by Anonymous Coward · · Score: 0

      Looks like a new COO. Something like this would almost seem to have to be an inevitable result of refusing to upgrade/maintain infrastructure due to poorly considered efforts at cost control. Naturally, he'll be sent off with a huge bonus to serve as an officer for other companies and probably ruin a smattering of startups who chose to go the venture capital route.

    4. Re:Someone's gotta say it... by mOoZik · · Score: 1

      I agree. Either the system was never thoroughly tested or there was a weak link that went undiscovered. In any event, heads should roll, as 30,000 people were affected and it resulted in a lot of lost revenue for the airline.

    5. Re:Someone's gotta say it... by adeydas · · Score: 1

      Just the words out of my mouth. It was most probably the software or a hardware glitch that brought the system down...

    6. Re:Someone's gotta say it... by lachlan76 · · Score: 1

      At any rate, I suspect they'll be looking for a new IT director Real Soon.

      At what point did the head sysadmin become responsible for finding bugs in the code?

    7. Re:Someone's gotta say it... by Brando_Calrisean · · Score: 1


      At what point did the head sysadmin become responsible for finding bugs in the code?


      At what point did IT Directors become 'head sysadmins'? In my experience, IT Directors are executive-types responsible for the entire IT infrastructure of an organization - and this failure would certainly fall on them.

      --
      Don't call me a cowboy, and don't tell me to slow down!
    8. Re:Someone's gotta say it... by Pharmboy · · Score: 3, Insightful

      You would think so. The IT Director is responsible for making sure everything IT works. Not to do it himself, but to make sure it is done and done right. I can't see how someone can argue with that. Even if it IS the janitor unplugging the UPS to plug in a floor buffer.

      Whether it is the cooling system for the computers, the operating system, the applications or simple hardware issues, it HAS to be the IT Director's responsibility. I mean, who the hell else?

      --
      Tequila: It's not just for breakfast anymore!
    9. Re:Someone's gotta say it... by Antique+Geekmeister · · Score: 4, Insightful

      Occasionally, however, the head IT guy gets over-ridden by management or by available finances. I've been there, saying "we need to spend money on this" and having to make do with much less money, or even with a cut in funding. You need to document the problem in advance to cover your ass, and get it in print and saved offsite to protect yourself from that kind of mistake. I've done that, too. It helped protect me from a nasty lawsuit because I demonstrated where I had told a consulting client, in print, when the systems would start failing and the resulting legal liabilities, and gotten it signed by the company notary.

    10. Re:Someone's gotta say it... by Subm · · Score: 1


      They were using a cluster of Nintendo NES's.

      It was working fine until someone told them they could overclock them.

      Even then there were only minor glitches until someone pushed the overclocking to 6MHz.

    11. Re:Someone's gotta say it... by Anonymous Coward · · Score: 0

      I'm sorry but it seems you're wrong.

      According to http://www.comair.com/hr/other/ they are running HP-UX and many Novell boxes and trying to get Linux up and running.

    12. Re:Someone's gotta say it... by Anonymous Coward · · Score: 2, Informative

      I've done contract programming for Comair. They use HP/UX for the servers that handle most things except HR which is mostly Windows. The system that went down was a COTS app that handles crew scheduling. It is a bid system that allows crew to bid for flights based on seniority w/ constraints to match FAA rules.

      Their IT director is really sharp, but he faces some real problems. First, IIRC, they only created a dedicated IT shop about three years ago. Second, their budget is small compared to the task they have to perform. Comair is an airline and airlines have been in real trouble since 9/11.

    13. Re:Someone's gotta say it... by Lally+Singh · · Score: 1

      He's the one responsible for testing it :-)

      --
      Care about electronic freedom? Consider donating to the EFF!
    14. Re:Someone's gotta say it... by rah1420 · · Score: 1

      Their IT director is really sharp, but he faces some real problems.

      Is this the first, second or third envelope? :)

      --
      Mit der Dummheit kämpfen Götter selbst vergebens.
    15. Re:Someone's gotta say it... by Marillion · · Score: 1

      I know that most of the facts in the previous post are true, except that the IT director is a SHE.

      --
      This is a boring sig
  5. Re:The system runs Linux by Anonymous Coward · · Score: 0, Offtopic

    So it was a hardware problem. Good call.

  6. It doesn't matter... by Anonymous Coward · · Score: 2, Insightful

    They're a bunch of incompetent boobs. The news keeps reporting on a "computer glitch" or a "computer malfunction". That's bullshit. This happened because some human(s) fucked up.

    1. Re:It doesn't matter... by Anonymous Coward · · Score: 0, Flamebait
      This happened because some human(s) fucked up.

      Yeah, a Micro$oft human.

    2. Re:It doesn't matter... by stfvon007 · · Score: 1

      Computer glitches and malfunctions happen, and that is forgiveable. What makes this so idiotic of the company is they did not have any backups. No competent company would go around without redundancy of critical equipment as well as backups.

      --
      All misspellings and grammatical errors in the above post are intentional and part of my artistic expression.
  7. Bringing the /. effect to the weary masses. by Anonymous Coward · · Score: 2, Funny

    Linking to their home page will surely help the situation..

  8. What kind of system just totally crashes? by Anonymous Coward · · Score: 0

    What kind of system just totally crashes?

    Wow. What a seemingly trollish attempt at baiting.

  9. My theory? by Ckwop · · Score: 4, Funny

    The janitor pulled out the plug for the mainframe and used it to drive his floor polisher.

    Simon.

    1. Re:My theory? by Zorilla · · Score: 1

      Nerd (Doug): We need the outlet for our rock tumbler.
      Bart & Lisa: PLUG IT IN! PLUG IT IN!
      Nerd (Doug): What, the rock tumbler or the TV?
      Bart & Lisa: THE TV! THE TV!

      (Itchy and Scratchy theme plays, Krusty comes back on)

      Krusty: WOW! They'll never let us show that one again... never in a million years!

      --

      It would be cool if it didn't suck.
    2. Re:My theory? by bcmm · · Score: 3, Funny
      BOFH excuse #38: secretary plugged hairdryer into UPS.
      That is where you got the idea for that post, right?
      --
      # cat /dev/mem | strings | grep -i llama
      Damn, my RAM is full of llamas.
    3. Re:My theory? by Anonymous Coward · · Score: 0

      More likely the tech support forgot to run Windows Updates.

      No Service Pack 2 means no airplane for j00!

    4. Re:My theory? by rlauzon · · Score: 5, Funny

      Probably not. It's an old story (quickly retold):

      Army base computer going down every night. So the grunt in charge of it stayed the night to see what was happening. When the computers went down, he heard the hum of the floor buffer.

      The janitor had plugged his floor buffer into the same power as the computers and it caused the crashes. It was quickly fixed by telling the janitor to not do that and putting locking covers on the power outlets.

      But they dreaded telling the base commander what the issue was. So they told him it was "a buffer problem."

    5. Re:My theory? by caluml · · Score: 1

      It does happen. Every night at 8pm at a web hosting company I used to work at, all the sites on a server would go down, and come back in 5 minutes. It was a cleaner using a socket.

    6. Re:My theory? by budgenator · · Score: 1

      Yes VIKI, your logic is inarguable, we'll just run windows update, and convert all the legacy systems to .NET.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    7. Re:My theory? by TheWanderingHermit · · Score: 1

      When I was just starting my business, my system was developed on a Linux box and I had to put one in each client's office (until, eventually, I had a Java program that was multi-platform). I hadn't done a lot of programming (almost none in 10 years), but the one smart thing I did was make sure my program logged everything it did, with the time it did it.

      I had a system in an office 2 hours away that went down the same night every week. After checking the logs, an idea hit me and I called the client.

      ME: Do you ever work late in that office?
      CLIENT: Sometimes.
      ME: When does the cleaning staff get there?
      CLIENT: What's that got to do with it?
      ME: I'll explain in a minute.

      It turns out the cleaning staff, once a week, was coming in and turning off the switch on the powerstrip that fed my system. I had them put tape over the switch and add a warning note to it. It never happened again.
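
      The kind of log check described above can be sketched roughly as follows, assuming the program writes a timestamped entry at least every few minutes while it is running. The function names and the ten-minute threshold are illustrative assumptions, not details of the original system.

```python
from collections import Counter
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def find_gaps(timestamps, max_gap=timedelta(minutes=10)):
    """Return (last_seen, next_seen) pairs where the log went silent for longer than max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


def gap_weekdays(gaps):
    """Count the weekday on which each silent period started."""
    return Counter(WEEKDAYS[start.weekday()] for start, _ in gaps)
```

      If nearly every gap starts on the same weekday at about the same hour, the cause is more likely something on a schedule (such as a cleaning crew) than a software fault.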

    8. Re:My theory? by jridley · · Score: 5, Funny

      A friend was sysadmin at a manufacturing plant, and the janitor kept plugging into the power conditioned sockets with a very large, power-hungry floor polisher. He was actually blowing power supplies. Every one cost several thousand dollars in service calls to replace the power supply and downtime.

      My friend put "COMPUTER USE ONLY" stickers OVER the power-conditioned sockets. The janitor ripped them off to plug in, and blew another power supply.

      My friend finally confronted the janitor, who was a really obstinate PITA. He stood there and said "Yeah, I did it, and I'm gonna keep doing it, and I don't give a damn about you or your fu*kin' computers."

      This was an automotive union shop, very difficult to get people fired.

      But, in a show of karma rarely witnessed by mortals, the VP of the division was standing within earshot but out of sight. When the janitor finished saying he didn't give a damn that he was costing the company $10,000 a week because he was too lazy to go get an extension cord, the VP walked around the corner and said hi. I don't know whether the guy ran to his car or the VP kicked his ass right over the top of it.

    9. Re:My theory? by Decimal+Dave · · Score: 1

      This happened to me once, except the janitor didn't have to unplug anything. What he did was plug the floor waxer into an open socket on the UPS that fed our editorial system. The waxer overloaded the power supply and it shut down. It took us a while to determine exactly why the UPS failed since the diagnostics said everything was ok. Then we noticed how clean the floor was...

      --

      "Leave the strategizing to those of us with planet-sized brains." -Tycho
    10. Re:My theory? by itwerx · · Score: 1

      ...logic is inarguable

      VIKI actually said, "My logic is undeniable."

    11. Re:My theory? by Anonymous Coward · · Score: 1, Funny

      We saw the same problem a few years ago, in DB2 Support. Customer kept getting a system crash around the same time of day, nothing of consequence in either the db2diag.log or the system logs. The customer couldn't recreate, and traces were getting us nothing. Eventually, as the customer kept blaming us (of course) and went critsit, we asked him to physically monitor the system terminal to see what was going on. As he was on a conference call, a janitor came in and (IIRC, got this story second-hand) plugged his vacuum into the UPS circuit.

    12. Re:My theory? by Anonymous Coward · · Score: 0

      This was an automotive union shop, very difficult to get people fired.

      But, in a show of karma rarely witnessed by mortals, the VP of the division was standing within earshot but out of sight. When the janitor finished saying he didn't give a damn that he was costing the company $10,000 a week because he was too lazy to go get an extension cord, the VP walked around the corner and said hi. I don't know whether the guy ran to his car or the VP kicked his ass right over the top of it.

      Oh, you implied that the guy's job was secure by the company's contract with the union, but then the VP could kick his butt?

      Sounds like a tall tale to me.

      Most union/employer contracts are very clear about rule breaking - break the rules and you're fired or reassigned.

      If the VP could fire him, so could his boss.

    13. Re:My theory? by ces · · Score: 1

      We saw the same problem a few years ago, in DB2 Support. Customer kept getting a system crash around the same time of day, nothing of consequence in either the db2diag.log or the system logs. The customer couldn't recreate, and traces were getting us nothing. Eventually, as the customer kept blaming us (of course) and went critsit, we asked him to physically monitor the system terminal to see what was going on. As he was on a conference call, a janitor came in and (IIRC, got this story second-hand) plugged his vacuum into the UPS circuit.

      I've seen similar in person. At the university lab I used to spend way too much time in back in the late 80's and early 90's I watched a cleaner plug his vacuum into the same circuit as a bunch of the lab's X-terminals, Macs, and NeXTs were on. There was a 'pop' sound and about 30 X terminals, 15 Macs and 10 NeXTs all suddenly lost power. The resulting power surge killed a number of devices that happened to be on the same circuit outright, along with many others that were coupled to them electrically via 10BASE2 cable. For at least a year afterward devices would die and when pulled apart would show evidence of arcing on the circuit board from a power surge.

      Needless to say the university managed to rewire the lab areas in that building with outlets off of the big power conditioner in back for the datacenter gear in record time. Fortunately the janitors already knew better than to plug into the special orange outlets if they wanted to keep their jobs.

      --
      Happy Fun Ball is for external use only.
    14. Re:My theory? by jridley · · Score: 1

      The point was that there was an exceedingly credible witness. There were others there, but everyone else around at the time was union and would not have backed up my friend's story. He just got lucky that there was management around that the guy didn't spot before confessing loudly and obstinately.

      Certainly his boss could have fired him. His boss being around the corner would have been just as good, but having a corporate officer there just made it that much sweeter.

    15. Re:My theory? by oh · · Score: 1
      BOFH excuse #38: secretary plugged hairdryer into UPS.
      That is where you got the idea for that post, right?

      This may sound stupid, but it happens.

      I was once helping to move the computer of the CIO's PA, and she had a little 2000 W heater under her desk. Of course, she didn't know that there was a difference between the red power points and the white ones. She was quite surprised when I explained that the red points are reliable power, and are generator backed; she had just used the closest point. (Which was red).

      In another company the generator runs were performed over a weekend, with advance notice given to staff. So food left in the fridge wouldn't go off, people would simply plug the fridge into reliable power rather than clean it out for the weekend.

      There are other stupid things, like running the aircon off the UPS rather than just the generator (if the generator doesn't start you will drain the UPS before the room gets too hot). You may have UPS power, but will your door entry system work? Getting power right can be hard.

      --
      Democracy isn't about no one telling you what to do. It's about everyone telling you what to do.
  10. stating the obvious by Anonymous Coward · · Score: 5, Insightful

    "Does anyone know what platform their system was based on? What kind of system just totally crashes?"

    A stab in the dark here but I'm assuming a system without foresight and redundancy?

    1. Re:stating the obvious by Anonymous Coward · · Score: 0

      that is fair in this day and age, but when was the original software written? My guess is back in the 70's before robustness, fault tolerance... No matter what, it SHOULD be upgraded and fixed!

  11. It's obvouis... by bcmm · · Score: 3, Funny
    What kind of system just totally crashes?
    Oh come on...
    That doesn't need answering.
    --
    # cat /dev/mem | strings | grep -i llama
    Damn, my RAM is full of llamas.
    1. Re:It's obvouis... by Zorilla · · Score: 1

      Oh come on...
      That doesn't need answering.


      Damn! We warned them to test KDE 3.3 out before upgrading!

      (Ok, so just more obnoxious than anywhere near fatal)

      --

      It would be cool if it didn't suck.
    2. Re:It's obvouis... by Anonymous Coward · · Score: 0

      Apparently it's *nix, SCO being the vendor.

  12. Software. by eightball01 · · Score: 1

    I blame the software. Sounds like a more likely culprit than the OS, even if it is Windows.

    1. Re:Software. by canuck57 · · Score: 1

      I blame the software. Sounds like a more likely culprit than the OS, even if it is Windows.

      I don't doubt it was the software, but the real cause was the management of that software component. Was it tested, or was testing skipped to save a few bucks? Were the testers telling management what they wanted to hear rather than the truth? Or perhaps they needed bigger computers or smarter software developers, but got them cheap.

      I don't know Comair too well, but it is safe to say they should be looking at the management practices that led to this if they really want to fix the problem.

      But it is easy to blame the software, as it is not capable of lying a lot or of denial, and it usually deals poorly with irrational behavior and input.

  13. It was running on SCO Unix... by bani · · Score: 4, Funny

    They obviously didn't take mcbride's "license or we will have you shut down" threats seriously enough.

  14. blaming the system can backfire by ext42fs · · Score: 5, Insightful

    It's not the OS, it's the people behind it who are to blame. Yes, stupidity and MSW often go together, but in a few years one will probably occasionally see a massive Linux outage due to... similarly stupid people.

  15. Scalability and Twelve Step TrustABLE IT by NZheretic · · Score: 2, Interesting

    Sounds like Comair could have used a little virtualized scalability and third party audited builds.
    See Twelve Step TrustABLE IT : VLSBs in VDNZs From TBAs.
    and also The ActiveGrid(TM) Grid Application Server and Grid Computing in general.

    1. Re:Scalability and Twelve Step TrustABLE IT by Anonymous Coward · · Score: 0

      How would that help?
      Probably the grid would be overloaded at the same time, and would have caused meltdowns in more than one place...
      It can be compared with the credit card companies here in Sweden, where the systems are not designed to cope with the Christmas shopping.

    2. Re:Scalability and Twelve Step TrustABLE IT by hughk · · Score: 4, Insightful
      No, it's more difficult in the airline industry. The system by default tries to keep as many planes in the air earning money as possible. If you have an outage which disrupts this choreography, there is a tremendous knock-on effect as passengers/urgent cargo must be rebooked.

      I have seen the major hub for an airline closed because of snow for just a couple of hours in the early morning, but the resulting chaos of rescheduling/rebooking caused the reservations system to crash after just a few minutes of uptime. The same would keep happening after restarts.

      It is normal to test a system at up to several times normal load, but they were seeing peaks at over 100x. The old, 3270 emulator based system would have slowly got through it but the newer system died.
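
      One common way to make a system "slowly get through" a spike rather than die is to bound the backlog and refuse work beyond it. The sketch below illustrates that general idea only; the class name and capacity are hypothetical and do not describe how either of the systems mentioned above actually worked.

```python
from collections import deque


class BoundedWorkQueue:
    """A backlog with a hard capacity: excess requests are rejected, not queued."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = deque()
        self.rejected = 0  # how many requests were turned away

    def offer(self, item) -> bool:
        """Accept the item if there is room; otherwise count it as rejected."""
        if len(self._items) >= self.capacity:
            self.rejected += 1
            return False
        self._items.append(item)
        return True

    def drain(self, max_items: int) -> list:
        """Remove and return up to max_items of the oldest queued items."""
        count = min(max_items, len(self._items))
        return [self._items.popleft() for _ in range(count)]
```

      Under a 100x peak the callers see rejections and have to retry later, but memory use stays fixed and the system keeps working through what it has accepted, so a restart does not immediately fall over again.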

      --
      See my journal, I write things there
    3. Re:Scalability and Twelve Step TrustABLE IT by gl4ss · · Score: 1

      they aren't building them for normal use so why didn't they test it under the chaos that comes when there is downage?

      it's not an excuse to miss research on what the system could be hit with.

      --
      world was created 5 seconds before this post as it is.
    4. Re:Scalability and Twelve Step TrustABLE IT by Rebar · · Score: 1

      The old, 3270 emulator based system would have slowly got through it but the newer system died.

      Wow, I didn't know that 3270 emulators were even programmable, and surely wouldn't try to base an airline reservation system on them. Seems far better to use something like a mainframe than a grid of terminal emulators, although there must be a few distributed MIPS there...

    5. Re:Scalability and Twelve Step TrustABLE IT by hughk · · Score: 1

      The old system used keyboard macros but many experienced users just typed the commands completely themselves. The backend ran on heavy metal (Unisys) and that did most of the work. It seems a waste of the PC but the system functioned well.

      --
      See my journal, I write things there
  16. Re:It's OBVOUIS ... by Anonymous Coward · · Score: 0

    Well, no wonder it crashed. The French used an obvouis on it. And since there is no way to translate primitive French dialect into English we will never know how to stop this 'obvouis' again. /end communication.

  17. This is getting a little too common for them. by jhobbs · · Score: 4, Interesting

    Back on May 1st of this year Delta's internal traffic monitoring system grounded them worldwide when it was hit by a worm (forget which one). Yours truly was flying that day. I spent 7 hours on a runway in Cleveland. (Talk about adding insult to injury.) Comair is a regional carrier of Delta's. I wonder who handles Delta's IT needs?

    1. Re:This is getting a little to common for them. by Anonymous Coward · · Score: 0

      A herd of Indian preschoolers shuffling little paper markers around...

    2. Re:This is getting a little to common for them. by sacremon · · Score: 1

      When I was working there as internal tech support about six years ago, there was a dedicated division, named Delta Technology, that did all the development work. That included not only the software that ran on the mainframes and Unix boxen but also the hardware like the ticket scanners that they've implemented in that time. Given how well they knew how to deal with their desktop machines (Win2K Pro boxes), the vast majority of the software developers didn't know squat about Windows. Of course, that doesn't stop them from developing software for it...

      --
      If you can't beat them, embrace and extend them.
    3. Re:This is getting a little to common for them. by Deadstick · · Score: 1
      I spent 7 hours on a runway

      No you didn't...but you may have spent 7 hours on a ramp or taxiway.

      rj

    4. Re:This is getting a little to common for them. by Anonymous Coward · · Score: 0
      I spent 7 hours on a runway

      No you didn't...but you may have spent 7 hours on a ramp or taxiway.

      Do you doubt my modelling credentials? How dare you!

    5. Re:This is getting a little to common for them. by jhobbs · · Score: 1
      No you didn't...but you may have spent 7 hours on a ramp or taxiway.

      Allow me to rephrase. . . I spent 7 whole hours in a non-moving plane that was parked on a concrete surface and filled with just under 200 really pissy people screaming at flight attendants.

  18. But how can that be? by Anonymous Coward · · Score: 0

    I thought SCO Unix runs Linux?

  19. Travel tip by Anonymous Coward · · Score: 1, Interesting

    FAA's Rule 240 says that if your flight gets canceled for any reason other than weather, the airline has to get you on the next available flight to your destination, regardless of carrier. So if you're stuck in an airport bar reading this article go talk to your airline!

    1. Re:Travel tip by xlation · · Score: 5, Informative

      From: http://www.fly.faa.gov/FAQ/faq.html

      The term "Rule 240" refers to a rule that existed before airline deregulation. There is no longer an actual Rule 240. The term, as it is now used, refers to each airline's "conditions of carriage" policy. You would need to contact the airlines to obtain this.

    2. Re:Travel tip by reallocate · · Score: 1

      True, but...It's Christmas, everyone is booked up, and thousands of flights were already cancelled due to weather.

      --
      -- Slashdot: When Public Access TV Says "No"
    3. Re:Travel tip by Anonymous Coward · · Score: 0

      Delta did reschedule my flight on a different airline with no additional charge, despite us having $175 travelocity tickets. I think paying the difference is why they're so eager to get people refunded. Too bad it's a day and a half later, and goes through 4 airports ending at the wrong airport and 1 1/2 hour drive in a rented van.

  20. Re:The system runs Linux by bcmm · · Score: 1

    No, it means they didn't make a big enough swap partition.

    --
    # cat /dev/mem | strings | grep -i llama
    Damn, my RAM is full of llamas.
  21. Failure due to inability/unwillingness to test/QA by Anonymous Coward · · Score: 1, Insightful

    It is not easy to do real world extreme situation testing on large systems, but I wish people would at least try.

    It is fun to say Windows, blah, blah but given the number of buffer overflow problems found in programs/packages on all platforms, I would say that many programmers of every stripe severely underestimate the real world range/type/size of data their programs will encounter when in non-typical situations.

    To whoever wrote/maintains/admins this software: Global "climate change" means weather "events" will be more frequent and more extreme in coming years, another terrorist event on US soil may cause days of air travel disruption. Please "refactor" your shit with those things in mind. You're on the East Coast and Midwest for god's sake you're going to get storms that will shut down regions for days at a time. What happens when the FAA finds some issue with an aircraft part or maint. procedure and grounds your whole damn fleet to have it fixed?

  22. BUG! by Piranhaa · · Score: 1

    Now, I could be wrong, but I heard from a lot of people that it is the year 2005 bug... How did we not see this come?

  23. Re:Failure due to inability/unwillingness to test/ by Anonymous Coward · · Score: 0

    How about trying to reboot the server once in a while. You know if I was running for 400 days straight my legs would be killing me.

  24. Re:The system runs Linux by Anonymous Coward · · Score: 0

    The day a plane crashes into a control tower from not having enough swap space to run the 'landing' command is the day I chuck my PC out the window.

    *waits patiently*

  25. Platform and software? by MadFarmAnimalz · · Score: 1

    This and this might offer a clue.

    --
    Blearf. Blearf, I say.
    1. Re:Platform and software? by Kalak · · Score: 1

      It doesn't look like the staff scheduling software, unless customers need to use a web interface to schedule their own flight attendants and pilots.

      --
      I am, and always will be, an idiot. Karma: Coma (mostly effected by .hack)
  26. Problem lies with Application Program by mahesh_gharat · · Score: 0, Redundant

    I doubt the problem is with the hardware or OS. It seems more likely that the application program could not handle all the cancelled reservations efficiently. The programming logic for cancelling reservations and assigning them a new time and date seems to be the root of this problem.

    They could have avoided the problem if they would have included more rigorous test cases before deployment.

  27. Re:Slashdot this by Anonymous Coward · · Score: 0

    A guide to converting cd to mp3?

    WTF? if thats your site then you ARE a retard and must be whacked.

  28. Scalability on demand and third party servers by NZheretic · · Score: 1
    First of all, it all depends on what the bottlenecks are in the processing of the transactions. That is dictated by the combination of the hardware, the network bandwidth and the overall design of the existing software system. The worst cases are bottlenecks in the design of the software, where all transactions have to pass some/all data through a single process/processor. If the problem is just hardware scalability or reliability then grid/cluster computing can help.

    If you choose a standardized virtualized platform then you need not be limited to using in-house clusters. Check out the ActiveGrid(TM) info page; it includes support for third-party distributed hosting providers such as Akamai. Other providers, in the future, will provide massively scalable systems such as Cray's Red Storm Cluster. All running Linux.

  29. Where did the system fail under stress? by NZheretic · · Score: 1

    Was it the design of the software or the limitations of the hardware? See my post on Scalability on demand and third party servers.

    1. Re:Where did the system fail under stress? by hughk · · Score: 1

      I do not know the nature of the Comair system, but software design is the major issue with systems that degrade catastrophically rather than gradually. Please remember that major airlines used to run with much slower hardware up to the eighties (indeed, much less processing power than my PDA), however they did have very high I/O throughput and intelligent frontends.

      --
      See my journal, I write things there
    2. Re:Where did the system fail under stress? by Anonymous Coward · · Score: 0

      .. based on my experience of actually being an admin in that shop, and having once had root on that system, it would be a flaw in the software. The hardware, probably still an RS/6000, always worked fine, but they had monthly issues with the software application from SBS.

  30. But management saved 13.7% by hiring H1-Visas by Soyobob · · Score: 2, Interesting

    Too bad the airline will go bust because of this. But then all airlines lose are loosing billions except for Southwest.

    1. Re:But management saved 13.7% by hiring H1-Visas by canuck57 · · Score: 1
      I noticed in your subject:

      But management saved 13.7% by hiring H1-Visas

      So maybe upper management will outsource management as they can no longer blame the American worker. Poor I/T and computer management practices are not problems of the workers, they are a problem of management, and H1 visas really have nothing to do with it. But poorly skilled American workers need to hone their skills even if company training programs don't exist, as many are poorly skilled for their job titles. I actually met a Senior Web Developer who didn't know which part of a URL was the host and which the domain, and they were American. You can go overseas and get this cheaper.

    2. Re:But management saved 13.7% by hiring H1-Visas by Anonymous Coward · · Score: 0

      ... all airlines lose are loosing billions...

      Hit that preview button!

  31. Crew assigment is a hard problem by rsilva · · Score: 5, Informative

    'There was a cumulative effect with the canceled flights and trying to get crew assigned that caused the system to be overwhelmed.'

    I am only trying to make sense out of the above comment from the official statement above.

    Crew assignment is a hard problem; it is usually an MILP (Mixed Integer Linear Programming) problem.

    Such problems may be very hard to solve in reasonable time. Maybe (I'm shooting in the dark here) the first delays made the crew assignment problems grow too large to be solved in reasonable time. This would generate a snowball effect, as the assignment problems would keep on growing, making the system "crash".

    We may never know what really happened but this would be a nice example for my classes :-)
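
    To illustrate how quickly this kind of problem grows, here is a minimal sketch that brute-forces a tiny crew-to-flight assignment. It is not a claim about how Comair's software worked: the crews, flights, hours and the helper names (`feasible`, `brute_force_assignments`, `search_space`) are all made up for illustration, and the only constraint modelled is remaining duty hours.

```python
from itertools import permutations
from math import factorial

# Hypothetical data: hours each crew may still work, and hours each flight needs.
crew_hours_left = {"crew_a": 5, "crew_b": 3, "crew_c": 8}
flight_hours = {"CVG-ATL": 2, "CVG-DTW": 4, "CVG-JFK": 6}


def feasible(assignment):
    """True if every crew has enough remaining duty hours for its flight."""
    return all(crew_hours_left[c] >= flight_hours[f] for c, f in assignment)


def brute_force_assignments():
    """Enumerate every one-to-one crew/flight pairing and keep the legal ones."""
    crews = list(crew_hours_left)
    results = []
    for flights in permutations(flight_hours):
        assignment = list(zip(crews, flights))
        if feasible(assignment):
            results.append(assignment)
    return results


def search_space(n):
    """Number of one-to-one pairings of n crews with n flights."""
    return factorial(n)
```

    With three crews there are only 6 pairings to check, but `search_space(20)` is already about 2.4 quintillion, which is why exhaustive search stops being an option long before an airline-sized schedule.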

    1. Re:Crew assigment is a hard problem by Anonymous Coward · · Score: 0

      If I could mod you up I would ... you are right it was crew scheduling issues and the program they use. The program doesn't like it when you don't have PEOPLE to put on a plane to fly it and work it. They have only so many CVG based pilots and flight attendants. When you don't get new ones flown in from other flights (due to weather) you run into issues.

    2. Re:Crew assigment is a hard problem by timeOday · · Score: 1

      You can't blame something like this on algorithmic complexity. Finding an optimal solution may require an impractical amount of time, but a workable solution within a few percent of optimal is normally much easier. In the long run, a few percent may mean the difference between life and death for an airline, but you must retain the ability to cope with short-term emergencies, even if it means a lot of scrambling around and some wasted money in the short term. Most businesses can't afford many complete meltdowns like this Comair scheduling disaster.
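
      As a rough sketch of the "workable but not optimal" idea, a greedy pass can hand each flight to the tightest-fitting legal crew in one sweep. Everything here is assumed for illustration: the function name `greedy_assign`, the sample data, and the single duty-hours constraint. A greedy rule like this carries no guarantee of being within any given percentage of optimal; it only shows that a fast answer is easy to produce.

```python
# Hypothetical data: remaining duty hours per crew, required hours per flight.
example_crews = {"a": 5, "b": 3, "c": 8}
example_flights = {"F1": 2, "F2": 4, "F3": 6, "F4": 7}


def greedy_assign(crew_hours_left, flight_hours):
    """Give each flight, longest first, the legal crew with the least spare time.

    Returns (assignment, uncovered): a dict flight -> crew, and the list of
    flights that no remaining crew could legally fly.
    """
    free = dict(crew_hours_left)
    assignment, uncovered = {}, []
    # Longest flights first, since they have the fewest eligible crews.
    for flight, need in sorted(flight_hours.items(), key=lambda kv: -kv[1]):
        candidates = [c for c, hours in free.items() if hours >= need]
        if not candidates:
            uncovered.append(flight)
            continue
        best = min(candidates, key=lambda c: free[c])
        assignment[flight] = best
        del free[best]
    return assignment, uncovered
```

      The point of the design is that it always terminates quickly and reports which flights are left uncovered, so humans can scramble on those instead of waiting on a solver.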

    3. Re:Crew assigment is a hard problem by sporty · · Score: 1
      Not particularly. You can also use graph theory for this. Once you have a means to convert the data into a graph, and a means to convert something like, vertex colouring back into people assignments, it's as easy as your graph colouring algorithm to be. You can do it either brute force, or via a heuristic in reasonable time with a good enough computer. Even in a huge, huge graph.


      Then it's a matter of feeding the data at a rate so you don't backlog faster than the amount of data you get in.


      If you are wanting an explanation of turning it into a graph, I can spend an hour or so turning the various constraints into one. Just need to know/think about all the constraints, such as, making sure enough staff are moving into the right places to go to a "next flight".

      --

      -
      ping -f 255.255.255.255 # if only

    4. Re:Crew assigment is a hard problem by sporty · · Score: 2

      Gah, I said vertex colouring when I meant network flow. My bad. :)
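
      A minimal sketch of the network-flow view, assuming the simplest possible version of the problem: each flight needs exactly one crew and each crew can take one flight, which makes it a unit-capacity flow (bipartite matching) solved with augmenting paths. The name `max_matching` and the `can_fly` input format are hypothetical, and real constraints such as positioning or duty hours would have to be folded into which crews are listed for each flight.

```python
def max_matching(can_fly):
    """Match crews to flights, covering as many flights as possible.

    can_fly maps each flight to the crews qualified and available for it.
    Returns a dict crew -> flight, built by repeatedly searching for an
    augmenting path (the unit-capacity special case of max flow).
    """
    crew_to_flight = {}

    def try_assign(flight, seen):
        for crew in can_fly[flight]:
            if crew in seen:
                continue
            seen.add(crew)
            # Use the crew if it is free, or if its current flight can be re-crewed.
            if crew not in crew_to_flight or try_assign(crew_to_flight[crew], seen):
                crew_to_flight[crew] = flight
                return True
        return False

    for flight in can_fly:
        try_assign(flight, set())
    return crew_to_flight
```

      The recursion is what distinguishes this from a greedy pass: an already-assigned crew can be bumped to another flight if that frees up a crew for the flight currently being placed.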

      --

      -
      ping -f 255.255.255.255 # if only

    5. Re:Crew assigment is a hard problem by coyote-san · · Score: 4, Interesting

      It's far harder than that alone since you also have to get the aircraft back to the right city (many are in the wrong city due to airport shutdowns due to the weather). Obviously you want to optimize the number of passengers carried along for those flights, but at the same time you'll be "burning" allowed worktime for the crew.

      Even worse the crew and aircraft are independent variables. Obviously you need a crew to operate a flight, but the crew may end up in the "wrong" city for the usual schedule. It may be better to leave a plane on the ground and fly its crew "deadhead" to the "right" city than to have them fly a load of passengers to the "wrong" city.

      There are reasonably efficient algorithms to solve these problems, but we spent most of my entire second-semester graduate-level algorithms class studying them (network flows). The algorithms most developers would come up with (including me after a decade of experience and a graduate-level algorithms class) are extremely inefficient and scale horribly.

      The bottom line is that it's easy to imagine a system that has no problem with perturbations from the regular schedule but is totally overwhelmed when starting from scratch. I hope the bean counter who saved the company a few bucks by insisting on far more modest hardware gets canned for his costly lack of foresight, but we all know that IT will catch the heat.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    6. Re:Crew assigment is a hard problem by forkazoo · · Score: 1

      On one of the old PBS Math programs, possibly Sol Garfunkel's show, they did a piece on the crew assignment problem. They tried to feed a simplified version of the problem into a massive mainframe, which ran out of memory, and they were never able to get an answer out of it. Granted, this was probably in the 80's, when a massive system had multiple megawords of core, but it impressed me nonetheless!

    7. Re:Crew assigment is a hard problem by Anonymous Coward · · Score: 0

      It's even harder than "independent variables", because there are dependencies between crew and aircraft. The cockpit crew must have up-to-date qualifications for the type of aircraft they fly. If you have a crew qualified for an Embraer (sp?) turbo-prop, that doesn't mean they can crew a flight on a regional jet. Even the cabin attendants need to have training for each type of plane: "There are two exit doors at the rear...wait a second, those look like the restrooms". So the scheduling program must be sure that when the current wave of flights arrives, there will be a qualified crew, with enough time left under their duty-hours limit, to handle the plane's next leg. And so on, and so on throughout the holiday week. It will take them a while to clean up this fiasco.

    8. Re:Crew assigment is a hard problem by Anonymous Coward · · Score: 0

      If we can get a computer to stand toe-to-toe with a world chess champion, then get the algorithms efficient enough to run on an Apple IIe, one would hope that over the years the same kind of effort would be expended on real-world network flow problems. I guess that the real problem is that they are not glamorous enough to tickle the imagination of the best and the brightest.

    9. Re:Crew assigment is a hard problem by Tablizer · · Score: 1

      Software only tends to do well under conditions it has been road-tested under. I suspect one section/component/application got overwhelmed with requests that it was not designed to handle, and dumped the problem off to another application that similarly was not designed to handle them. Maybe each section got stuck in retry loops and caused a feedback cycle of the problems. And/or, perhaps there was a deadlock where component A was waiting on component B, and B was waiting on C, which was waiting on A.
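
      The A-waits-on-B-waits-on-C-waits-on-A case can be sketched as a cycle search in a "wait-for" graph. This is only an illustration of the general technique (depth-first search with three colours); the function name `find_wait_cycle` and the component names are hypothetical, and nothing is known here about how the real components interacted.

```python
def find_wait_cycle(waits_on):
    """Return one cycle in a wait-for graph as a list of names, or None.

    waits_on maps a component to the components it is blocked on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in waits_on.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path loops back to it.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_on):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

      For example, `find_wait_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})` reports the loop through A, B and C, while a plain chain with no loop returns `None`.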

    10. Re:Crew assigment is a hard problem by coyote-san · · Score: 1

      They're coupled, but independent in the sense that you have to solve the problem of getting the planes into place _and_ the problem of getting the people into place. The crew that flew a plane in doesn't have to be the crew that flies it out. You can deadhead crew but you can't strap a 737 onto the back of an A3.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    11. Re:Crew assigment is a hard problem by coyote-san · · Score: 1
      I guess... that they are not glamorous enough to tickle the imagination of the best and brightest.


      No, remarkable progress has been made. The problem is that the appropriate use is rare enough that it's not worth the time to teach the algorithms to most students - the time would be better spent on other algorithms.


      But the results can be dramatic in those one-off situations that can use the algorithm. My professor mentioned rewriting the med student-training hospital matching program and the intern-specialty scheduling programs using network flow algorithms. Overnight runs now take seconds.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  32. Re:Failure due to inability/unwillingness to test/ by Anonymous Coward · · Score: 0

    Shouldn't be an issue, I have had Solaris boxes with 1200 days of uptime and they could handle bursts of load just fine. Lots of other OSs have obscene uptimes and even hardware fault tolerance, too - VMS, FreeBSD, IBM VM MVS, NEC, Stratus, Tandem (now HP NonStop), etc. so 400 days shouldn't be a problem for a critical system like that. Erlang, a language that supports hot code replacement/upgrades, allows for code upgrades without even bringing down the application.

    So adequate tools/hardware exists for them to create a system that shouldn't go down except for programming errors or complete connectivity failures.

  33. Leasing third party servers for stress testing by NZheretic · · Score: 1

    One more advantage of a Virtualised Standard platform, would be the ability to do development and stress testing on third party servers. Full on stress testing is something that most organizations cannot afford to do on the currently deployed hardware.

    1. Re:Leasing third party servers for stress testing by hughk · · Score: 1
      The problem is that the daily schedule of an airline is extremely complicated. One issue is that many airlines have downsized their older and more experienced staff so they lack the ability to run the airline without their extensive IT systems. Even with the knowledge, you still need to be able to reschedule slots with the airports as well as new flight plans (also usually filed by computer).

      It is then an issue as to whether you really want to design IT systems for every scenario. It costs a *lot* of money to do this and is usually only warranted in a safety critical domain (i.e., ATC). Comair's solution was to scratch the flights and thus ensure that aircraft were at their start positions for the next day.

      --
      See my journal, I write things there
    2. Re:Leasing third party servers for stress testing by budgenator · · Score: 1

      Somehow a blizzard during the Christmas or Thanksgiving rush would seem like a likely event rather than a bizarre event for an airline.
      The real impact is,
      1. The customers will see this not as an unusual event but as "they screwed up my annual vacation", and they will remember for life. How are they going to sell to a customer after they screwed up the last Christmas a loved relative was alive?
      2. When the airlines beg for a bail-out the customers will remember that when they had a chance to make some money they blew it.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    3. Re:Leasing third party servers for stress testing by hughk · · Score: 1

      As mentioned, a lot of airlines have been able to cope with this kind of thing in the past. To shut down for the day seems to be rather an exceptional response, and I hope the market remembers them.

      --
      See my journal, I write things there
  34. Can anyone say... by carlmenezes · · Score: 2, Funny

    ...slashdotted reservations?

    --
    Find a job you like and you will never work a day in your life.
    1. Re:Can anyone say... by ScrewMaster · · Score: 1

      Since we're talking about airline systems, I'd say "crashdotted" would be a better word.

      --
      The higher the technology, the sharper that two-edged sword.
    2. Re:Can anyone say... by simon+hughes · · Score: 0

      Does anybody have a mirror?

  35. Sad thing is by Anonymous Coward · · Score: 0

    The computer system was probably operating perfectly. 30,000 reservations wouldn't crash my USB key. Will find out real reason when stock holders are told of loan foreclosures and impending bankruptcy.

  36. Re:The system runs Linux by carlmenezes · · Score: 1

    Yeah, but the real cause for the crash was an Access backend. So there! :)

    --
    Find a job you like and you will never work a day in your life.
  37. 30,000? by __aafkqj3628 · · Score: 4, Funny

    30,000 passengers? Getting dangerously close to an integer overflow there.

    1. Re:30,000? by La+Gris · · Score: 1

      Sure, now the counter says they put -32768 passengers in the void.

      The overall cost of all this will be less than #NaN and by chance, the security risk has been rated as low as #DIV0 %.

      --
      Léa Gris
    2. Re:30,000? by sporty · · Score: 1

      This is the 21st century. Integers are 64 bit now. 30k is close to a short, a signed one at that. Such waste. What are schools teaching you these days. Back in my day, an integer was 1 bit! and you liked it!

      --

      -
      ping -f 255.255.255.255 # if only

    3. Re:30,000? by Euler · · Score: 1

      Wouldn't surprise me at all if there were still 16 bit integers in an old creaky database system.

    4. Re:30,000? by forkazoo · · Score: 1

      Don't worry, they are running on an 18 bit minicomputer. They can handle 256 thousand passengers before they get angry at the programmer! Good solid PDP-8, or something. :)

    5. Re:30,000? by edp · · Score: 5, Funny
      "30,000 passengers? Getting dangerously close to an integer overflow there."

      That is not a bug but an accurate model of reality. When you strand 32,768 passengers, they will turn negative.
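
      For anyone who wants to see the wraparound the joke relies on, here is a small sketch using Python's `ctypes.c_int16`, which does no overflow checking and simply keeps the low 16 bits. The helper name `as_int16` is made up, and this says nothing about what integer width the real system used.

```python
import ctypes


def as_int16(n):
    """Reinterpret an integer as a signed 16-bit value, wrapping on overflow."""
    return ctypes.c_int16(n).value


largest = as_int16(32767)       # largest value a signed 16-bit counter can hold
one_more = as_int16(32767 + 1)  # one past the limit wraps to the most negative value
```

      So 30,000 still fits, but the 32,768th stranded passenger flips the count to -32768.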

    6. Re:30,000? by T-Ranger · · Score: 1

      If it is an IBM mainframe app, or an app for a different mainframe who followed IBMs lead, or an app for a PC written by a mainframe programmer, it would likely be 31 bits. Mainframes have been 32 bit systems since the late '60s.

    7. Re:30,000? by handy_vandal · · Score: 1

      When you strand 32,768 passengers, they will turn negative.

      Made me laugh!

      -kgj

      --
      -kgj
  38. Re:Slashdot this by Gentlewhisper · · Score: 0, Offtopic

    Please attack my CD to MP3 guide site with all the vehemence you can muster. Thanks.

    W00t.. Another fellow Australian on Internode!

  39. new kernel? by Anonymous Coward · · Score: 0

    have had their flights cancelled by Comair this weekend thanks to a computer system shutdown.

    Comair Cancels All 1,100 Flights
    Saturday, December 25, 2004


    hum...

    2.6.10 2004-12-24 22:38 UTC

    probably they were restarting the boxes from the new kernel? :\

  40. System Tracked Crew Location, Not Reservations by reallocate · · Score: 5, Informative

    Of course, a techie didn't write the PR release. Who in their right mind would let a techie anywhere near a PR release?

    BTW, Comair, a Delta feeder headquartered outside Cincinnati, says the system that crashed was used to monitor crew locations and track working hours to ensure no one went over the legal maximum. Comair says the system crashed as a result of massive crew rescheduling following a record snow in their service area on Wednesday. There is no backup.

    --
    -- Slashdot: When Public Access TV Says "No"
    1. Re:System Tracked Crew Location, Not Reservations by Impy+the+Impiuos+Imp · · Score: 2, Funny

      Gosh, looks lke idiot programmer assumed a 256 length crew relocation array was big enuf fer anybuddy!

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    2. Re:System Tracked Crew Location, Not Reservations by Pharmboy · · Score: 4, Funny

      You know, I have my OWN reservations about flying on an airline when they have no backups and can't keep their computers from crashing. Whats to keep their planes in the air?

      The last thing I want to hear at 30k feet is that my current flight has been cancelled...

      --
      Tequila: It's not just for breakfast anymore!
    3. Re:System Tracked Crew Location, Not Reservations by shyster · · Score: 3, Funny
      You know, I have my OWN reservations about flying on an airline when they have no backups and can't keep their computers from crashing. Whats to keep their planes in the air?


      The Bernoulli Principle. And I don't think computers crashing are going to affect it. This isn't the Matrix, after all.

    4. Re:System Tracked Crew Location, Not Reservations by Pharmboy · · Score: 2, Interesting

      I think you are overthinking it. My point is simply that a company that can not be trusted to keep their computers fully functional, can not be trusted to keep their aircraft fully functional. This is based on the premise that it is easier to keep the computers running than the aircraft, which I can easily assume, based upon my own experience.

      I also don't eat at diners where the help isn't properly groomed. Same principal: if you can't take of simple stuff, you probably can't take of something more important and/or complex.

      --
      Tequila: It's not just for breakfast anymore!
    5. Re:System Tracked Crew Location, Not Reservations by logicnazi · · Score: 2, Insightful

      Do you also refuse to eat at a relative's house if their computer is virus laden or crash prone? After all if they can't be trusted to keep their computer working why should you trust them to make safe, sanitary food.

      Perhaps if computer usage/programming had evolved to the level of personal hygiene, namely that routine effort anyone could do would prevent computer crashes, your point would be convincing. However, in practice we realize even the best professional programmers make errors, even buffer overflows (and we don't even really know it's an 'error'; perhaps the program exited gracefully after realizing the demands exceeded its capacity and simply hadn't been programmed to handle a situation this size). So unlike your hygiene example this hardly impeaches the basic organizational discipline/competency.

      Had this really been a computer engaged in flight-critical tasks I would feel quite differently. Programming error or even an unanticipated shutdown is not acceptable in systems necessary for real-time flight control. Since this was instead a system to reassign crew and guarantee compliance with federal labour law I feel much differently. In fact if this system had been subject to a rigorous source code review by an outside team to check for bugs, or linked into some sort of failover system with differently programmed systems accomplishing the same task, I would worry that their priorities were being misplaced.

      Arguably an airline, given their limited budgets, which puts too much redundancy into their non-critical systems has an incorrect set of priorities.

      --

      If you liked this thought maybe you would find my blog nice too:

    6. Re:System Tracked Crew Location, Not Reservations by Mr.+Slippery · · Score: 1
      Whats to keep their planes in the air?

      The Bernoulli Principle.

      Not so much. Read the next page at the site you linked.

      --
      Tom Swiss | the infamous tms | my blog
      You cannot wash away blood with blood
    7. Re:System Tracked Crew Location, Not Reservations by Anonymous Coward · · Score: 0

      So you've been running both computers and airplanes? Interesting, perhaps you should visit the yahoo jobs link posted a few comments up, they're looking for someone like you. /It's a Joke. Flaming will only leave you saddened.

    8. Re:System Tracked Crew Location, Not Reservations by benjamindees · · Score: 2, Insightful

      Perhaps there's a better principle you could apply, namely that anyone, be it company or person, only has a finite amount of resources (time, money) at their disposal, and choose to dedicate them to specific tasks.

      Perhaps unkempt diners are more concerned with the quality of their food than the ambiance. Perhaps the IT guy with twelve certifications knows more about getting certifications than about working on computers. Perhaps the vendor that sends you a Christmas card every year is pulling employees off of doing real work in order to make it look to you like they have their shit together. Perhaps the antisocial guy with the unkempt hair and the socks-with-sandals is more concerned with proving his latest theory than with what you think of him.

      Perhaps appearances can be deceiving.

      --
      "I assumed blithely that there were no elves out there in the darkness"
    9. Re:System Tracked Crew Location, Not Reservations by jonbrewer · · Score: 1

      My point is simply that a company that can not be trusted to keep their computers fully functional, can not be trusted to keep their aircraft fully functional.

      And there you are very wrong. There are a million controls on maintenance of aircraft. Every part of a plane has a history and a pile of paperwork to go with it.

      If such documentation and controls existed for every component of every software package used by the airlines:

      1. New Airline Information Systems would be built by two or three programming shops in the world.
      2. They'd be massively expensive
      3. They would have lifecycles of 30+ years.
      4. The government would audit IT organizations working on the systems and would investigate every system crash.

    10. Re:System Tracked Crew Location, Not Reservations by Anonymous Coward · · Score: 0
      There may be a million controls, but those same controls were in place at Alaska Airlines, and you can see how well that worked (http://seattlepi.nwsource.com/flight261/).

      Grandparent post is on the right track - if they were cutting the budget to the bone on the crew training system,they could be doing the same to the maintenance budget.

    11. Re:System Tracked Crew Location, Not Reservations by Anonymous Coward · · Score: 1, Insightful
      I also don't eat at diners where the help isn't properly groomed. Same principal: if you can't take of simple stuff, you probably can't take of something more important and/or complex.
      Principle, not principal. You also left out the word "care", not once but twice. If you can't take care of simple stuff like grammar and spelling, you probably can't take care of something more important and/or complex (like programming, maybe?).
    12. Re:System Tracked Crew Location, Not Reservations by Shark · · Score: 1

      The Bernoulli Principle. And I don't think computers crashing are going to affect it. This isn't the Matrix, after all.

      Pfff, we all know which pill *you* picked...

      --
      Mind the frickin' laser...
  41. Y2K by MicklePickle · · Score: 1

    With the amount of Y2K 'fixes' I have seen around, (and some of them very dubious), I wouldn't be at all surprised if it was a Y2K problem. Looks like someone didn't have all their test cases written down properly and/or didn't test properly and/or tried to 'fix' the problem.

    --
    -- main(s){printf(s="main(s){printf(s=%c%s%c,34,s,34) ;}",34,s,34);} $p='$p=%c%s%
  42. Hey Editor, READ THE ARTICLE by Anonymous Coward · · Score: 0

    It isn't the reservations system that crashed, it is the CREW ASSIGNMENT system. Very, very different.

  43. I don't know about their internal system... by Glowing+Fish · · Score: 2, Interesting

    As a preliminary finding that may or may not give us a clue as to what the internet system was running, Netcraft reports that www.comair.com is running Apache on HP-UX.
    So don't assume that the internal system was Windows just yet. Then again, don't assume that it wasn't.

    --
    Hopefully I didn't put any [] around my words.
    1. Re:I don't know about their internal system... by Zachary+Kessin · · Score: 1

      Also don't assume it was the OS that died; it is very possible that the computers were up, just not responding to the client software or otherwise screwed up. I would guess that it was their own custom software that died.

      --
      Erlang Developer and podcaster
    2. Re:I don't know about their internal system... by Anonymous Coward · · Score: 0

      Yes, but bear in mind that HP-UX has undergone a lot of changes with HP trying to integrate a lot of the Tru64 UNIX technology that they bought from Compaq, and it's entirely possible that there could be a systems integration issue underlying all that. A lot of the Tru64 technology (as well as the Alpha) has since been scrapped, and it's possible that the technical folks who understood it best might have left HP already.

      Even NASA has had monumental screw-ups over stuff like American/English units versus metric, so you never really know until it's too late.

    3. Re:I don't know about their internal system... by Anonymous Coward · · Score: 0

      HP-UX on legacy hardware. It doesn't take much more
      than a breath of fresh air to set these guys on their heels. Recently started working with these
      ancient behemoths coming from a straight linux/bsd
      x86 bg and have learned ready rules for behavior.

      Don't use these systems for anything that you think
      will schedule for more than 30 seconds, don't count
      on the nfs code to do the 'right thing', and count
      on it to break in routine operation without a single ascertainable clue.
      In other words: use it only till you can replace it.

    4. Re:I don't know about their internal system... by angrykeyboarder · · Score: 1

      Their web server is unrelated to their Crew Scheduling system or reservations systems.

      --
      Scott

      ©20014 angrykeyboarder & Elmer Fudd. All Wights Wesewved
  44. From the writeup by jstrain · · Score: 0

    The system should only be able to hold as many reservations as it has flights/seats. In accordance with our policy of overselling flights, this flight has been oversold...

  45. whole story? by confusion · · Score: 4, Informative
    This Comair story is all I'm seeing getting press. I think it's a lot bigger than that.
    My sister flew Delta on Dec 23rd from Detroit to Atlanta. Plane was 2 hours late, but no big thing. Waited 5 hours for her luggage, with no dice. By the time we got in line for luggage services, there were at least 600 people in the line already.
    Talking to other passengers from 10+ different flights from different cities, no one got their luggage that night. Apparently, it wasn't just Atlanta - the local news in Tampa and Detroit had segments on how the airports had taken over parts of taxiways to sort through seas of bags that didn't make it on to planes.
    It's been 2 days, and Delta has no idea where the stuff from that flight is. I'm guessing it isn't just Comair that got hit by some computer problems.

    Jerry
    http://www.syslog.org/

    1. Re:whole story? by garcia · · Score: 4, Interesting

      Personally I think that Delta was being a bunch of assholes about the whole thing...

      Seeing that my 7pm flight was cancelled for the 23rd I spent 20 minutes redialing from two different phones until I got past a busy signal. After 50 minutes on hold I got through to a representative who scheduled me for the 24th's 7pm flight. I spent the rest of the time rearranging time off from work, the dog's time to be spent at the kennel, car rental stuff, and phone calls to my fiance who would meet me at the airport, and to family we were supposed to see.

      At 7am on the 24th the flight was already cancelled. At this point I didn't give a shit anymore. Delta was saying I would have to use my tickets by the 15th of January because "it wasn't their fault". I knew it wasn't the fucking weather down there as plenty of people were saying it was fine in the area. So I call again and get through after redialing for 65 minutes. I get through to a rep after 50 more minutes in queue. She tells me she can't do anything but schedule me for the 25th at 7pm so I'd have to get in queue for the reissue desk. Fine...

      After 2 hours and 11 minutes in queue (with no hold music or sound for that matter) someone calls on my home line at 5:15pm from Delta to tell me my 7pm flight is cancelled (cute, I would have been at the airport by then). I tell that rep to get me into the reissue queue as I've been on hold with them for 2 hours.

      I finally get through and tell them I want my money back. They tell me I need to speak to customer service. After waiting on hold (with the reissue rep) for 25 minutes the reissue rep offers to refund my money.

      We can't fly out for New Year's as the kennel is booked and I'd feel horrible asking someone to watch our dog in our house for more than 1 night. So basically we have to wait quite some time to fly down there again.

      It was a little bit of a pain in the ass to wait on hold and be jerked around for two days for something that was their fault when they continually claimed wasn't. BAD WAY TO TRY AND PLEASE A CUSTOMER.

      Thanks for ruining our Christmas.

    2. Re:whole story? by Anonymous Coward · · Score: 0

      Detroit was hit with 7 to 8 inches of snow on the 23rd.

      You might want to factor that in.

      Also they had planes sliding off runways, and they were showing baggage trucks sliding and stuck on the tarmac on television.

    3. Re:whole story? by wwahammy · · Score: 1

      So they acted like a regular airline?

    4. Re:whole story? by Anonymous Coward · · Score: 0

      The group in charge of snow storms will check with you before scheduling the next one.

    5. Re:whole story? by realdpk · · Score: 1

      It's amazing to me how much energy has been spent trying to convince us that airport security is better now, and how not having nail clippers and stuff on planes is making us safe, and yet they still can't get something as basic as "put the bag on the plane with the passenger" done right.

    6. Re:whole story? by Anonymous Coward · · Score: 0

      Credibility suffers when you cite numbers so often as you do.

      Credibility suffers when you are trolling.

      Only a really anal OC person would do that, and if you're one, then you're more than likely also borderline antisocial and nearly impossible to please anyway.

      Only a really upset individual trying to get his fiancee to Ohio for Christmas to see her family for the first time in 8 months would make sure to document everything on his $550 plane tickets closely as to not get screwed on either end.

      Good for Delta to give you what you probably give to others.

      Hopefully you will get modded -1 and get what you deserve.

    7. Re:whole story? by Anonymous Coward · · Score: 0

      Thanks for ruining our Christmas.

      Did the airline force you to leave on the 23rd? Why not the night of the 22nd? I have little sympathy for people that whine about holiday travel when they didn't plan for things like this.

    8. Re:whole story? by HeghmoH · · Score: 2, Informative

      These days, "we hate the customer" seems to be the motto of all of the big airlines.

      This summer, I was flying from Paris to Ft. Lauderdale via Philadelphia on USAir. The Paris->Philadelphia leg was handled by the same plane that does USAir's Philadelphia->Paris flight that same day. The incoming flight was about four hours late, so of course our outgoing flight was also four hours late. Sucks, but what can you do.

      So we get into Philadelphia at about 9PM instead of 4:30PM and everybody rushes to get any last-minute connections they can. I was already stuffed and had to wait for the next day's flight, but a lot of people had chances to make late flights to their destination. We all get off the plane, go through customs, get to USAir's rebooking desk.

      Two people are working this desk.

      An airplane with three hundred people comes in four hours late. USAir knew that this flight would be late almost a full day in advance, since it was a cascade effect from the other flight's delay. And yet, none of USAir's genius managers had the presence of mind to call in a few extra employees that night to speed things along.

      A lot of people missed connections they otherwise could have made, because they had to wait in line for an hour to get new tickets.

      Since USAir obviously hates their customers exceptionally strongly, I won't be flying with them again.

      This isn't really an isolated incident, either, just the most recent bad one. The entire industry has a serious problem with this, and I have a feeling it's going to take a couple of high-profile bankruptcies before they get a grip on it.

      --
      Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
    9. Re:whole story? by the+pickle · · Score: 1

      Apparently the United and US Airways bankruptcies weren't high-profile enough...

      p

    10. Re:whole story? by winwar · · Score: 2, Informative

      "I have little sympathy for people that whine about holiday travel when they didn't plan for things like this."

      Okay troll, I'll bite. Maybe he had a limited amount of time off. Maybe that was the most convenient time to fly. Whatever. It doesn't matter.

      He shouldn't have to plan for weather, high traffic, and/or computer screwups. That is the airline's JOB. You know, the people who took the money and agreed to get him from point A to point B. Bad weather in the winter? From the massive effects it has on the airlines, you'd think this is the first time they have ever experienced it.... Running a computer system they KNOW will fail under load?!? Other airlines running out of deicing fluid?!? Excuse me, it IS THE AIRLINE'S FAULT. When your system is such that one winter storm will screw it up, and it happens repeatedly, and you do nothing to change it, it is broken and your fault.

      But they don't care. And that was the grandparent's point. Admit it is your fault, refund his money, and let him make other plans.

      People accept that there will be problems-lying to them just pisses them off and guarantees that they WON'T believe you if it ever really isn't your fault. Accepting blame tends to build respect.

    11. Re:whole story? by babbage · · Score: 1

      According to news stories I heard this weekend, there were two problems going on: the meltdown at Comair/Delta due to weather (and as known now, software), and a work stoppage among baggage handlers trying to negotiate a new contract. It sounds like you may be dealing with fallout from the baggage issue, which was purely a human matter, not a software one.

  46. WINDOWS DUH by Anonymous Coward · · Score: 0

    I am not even going to read all the comments.
    But we all know (assume) it was NT based.
    What we don't know for sure is what flavor it is.
    The original, 2k or 2k3?
    I doubt it was the OS; it was most likely the poorly written, specially developed software that could not handle it (lack of space, corrupt tables, etc.).

  47. Re:Slashdot this by Zorilla · · Score: 1

    Don't smirk; it'll be the next topic on Ask Slashdot.

    --

    It would be cool if it didn't suck.
  48. Re:Dumbfucks by Anonymous Coward · · Score: 0

    What's the big deal? Just reboot. A plane that cannot take some reboots without failing is badly engineered; this has nothing to do with Windows.

  49. Not surprising, coming from Comair by Anonymous Coward · · Score: 5, Interesting

    Some of my co-workers are on contract developing Java software for Comair.

    Comair are very tied to particular systems, and don't want to change even when the developers have pointed out problems. Case in point: a J2EE-based employee portal, based on Novell exteNd (Novell Portal Service) and a one-way HPUX server. NPS runs in Tomcat, which is servicing requests (via mod_jk) through Apache. No other application shares the machine, and Comair will only consider vertical scaling, not horizontal.

    The application creates at least two threads per connection, and when the thread count goes beyond a relatively low threshold (between 300 and 400), Tomcat deadlocks. It's not because they're running out of space in the allocated JVM heap, and they've tuned mod_jk to allow for heavy load. The current solution is to restart Tomcat when the system locks up.

    Novell's support has been less than stellar, so the Java contracting group was informally asked what to do. We had all kinds of useful suggestions, from dumping NPS for another portal implementation, to creating custom thread-pools, to using JDK 1.4 new I/O and a minimally-threaded design, and even using round-robin DNS and a group of independent portal servers to share the load. Comair are wedded to particular minimal cost solutions, however, and it shows.

    At least when the portal crashes, it only impacts employees and not passengers.
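
    The "custom thread-pools" suggestion above can be sketched roughly as follows. This is a minimal illustration in Python, not the portal's actual Java code; MAX_WORKERS, handle_request and serve are hypothetical names, and the sleep stands in for real request work. The point is only that a fixed-size pool caps the thread count no matter how many requests arrive, instead of creating threads per connection.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # hypothetical cap, kept far below any deadlock threshold

_active = 0   # requests being handled right now
_peak = 0     # highest concurrency observed so far
_lock = threading.Lock()


def handle_request(request_id):
    """Stand-in for real request handling; tracks peak concurrency."""
    global _active, _peak
    with _lock:
        _active += 1
        _peak = max(_peak, _active)
    try:
        time.sleep(0.01)  # simulated work
        return f"request {request_id} done"
    finally:
        with _lock:
            _active -= 1


def serve(request_ids):
    """Run every request through one bounded pool; extra requests queue up."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(handle_request, request_ids))


def peak_concurrency():
    """Highest number of requests that were ever in flight at once."""
    return _peak
```

    With this shape, a burst of 100 requests is slower to drain but never runs more than MAX_WORKERS handlers at once.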

    1. Re:Not surprising, coming from Comair by Anonymous Coward · · Score: 0

      The current solution is to restart Tomcat when the system locks up.

      As much as I like Tomcat, why are big businesses that can afford support contracts using it?

  50. Read the code, Luke (episode II) by chiph · · Score: 4, Funny

    Somewhere deep in the code is a comment that says:

    // I don't need to check for this condition because
    // my asshole manager Steve Johnson says it'll
    // never happen


    {friggin' slash - When I say plain old text, I mean plain old text!}

  51. It's a tragedy when... by Anonymous Coward · · Score: 0

    Computers are blamed when the probable truth is they had no backup plan - and if they did, it was never tested. Management approved this risk - or thought they could outsource it, the result is self evident.

    Disaster management strategy is not optional, and it is time someone asked difficult what-if questions. If jobs were outsourced to India, I would love to see management flapping and grasping at straws. As they are not up within 24 hours, and, oh, out of de-icing, we can see these clever cutbacks were not so clever after all.

    1. Re:It's a tragedy when... by gl4ss · · Score: 1

      actually disaster management strategy IS optional.

      this story as an evidence.

      though seriously, there's quite a lot of companies out there that instead of hiring incompetent people could be better off buying the services from outside. outsourcing doesn't necessarily mean it's crap, there's a lot of domestic in-house crap and idiots everywhere.

      --
      world was created 5 seconds before this post as it is.
  52. I'm surprised by antifoidulus · · Score: 2, Interesting

    that in the name of sensationalism reporters haven't said, "terrorism is probably not to blame but the Dept. of Homeland Security is looking into it." It seems that after Sep. 11th, the news wants to try to connect everything even remotely bad with terrorism, and of course the Dept. of Homeland Security encourages them by using as vague of language as possible. Are people that easily frightened?

    1. Re:I'm surprised by HangingChad · · Score: 3, Insightful
      It seems that after Sep. 11th, the news wants to try to connect everything even remotely bad with terrorism

      What else do they have to do? They've got this huge ass budget, all those people watching a lot of honest citizens. It was 10 years between the first attempt on the world trade center and the second. We've built and paid for this entire monster agency for an event that might be 10 or 15 years away. What are they going to do in the meantime? Grope women at the airport. They have to do something to justify their existence. Otherwise we'd have to admit we over-reacted to 9-11.

      --
      That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
    2. Re:I'm surprised by Quixote · · Score: 1, Insightful
      Are people that easily frightened?

      As Nov 2 showed, yes they are.

    3. Re:I'm surprised by Anonymous Coward · · Score: 0

      You're confusing the press with the Executive branch. The press usually just regurgitates whatever the administration tells it.

    4. Re:I'm surprised by Anonymous Coward · · Score: 0

      "+5, Interesting"? For something that has absolutely nothing to do with the situation at hand?

    5. Re:I'm surprised by ScrewMaster · · Score: 1

      Well ... on the other hand, your reply has nothing to do with the situation at hand either, and you're currently modded 0, so I guess the moderation system works after all.

      --
      The higher the technology, the sharper that two-edged sword.
  53. Exactly by 36-bitter · · Score: 1

    Computers don't freak out or get depressed when work piles up. Backlogs mean nothing; they just keep processing one piece at a time until the pieces run out. I think someone was speaking imprecisely.

    I suppose that the system *could* have been built with a rule to detect that the results are becoming more and more untimely, and at some point just say "TILT!" and deliberately exit. I can't imagine why, though; getting there late is better than sitting in the terminal forever.

    1. Re:Exactly by Anonymous Coward · · Score: 1

      A lot of their problems are handled by algorithms with quadratic or worse performance. It *should* detect when work is piling up too fast and seek good enough solutions... but that obviously wasn't done.

      Some programmer probably got fired for telling their boss that the code was shit, and unless time was spent fixing it a meltdown was probable. Or maybe I'm imagining things; I know the last company I worked for is going to have this type of problem.
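
      A rough sketch of what "seek good enough solutions" could look like, assuming a toy crew-to-flight cost matrix. Everything here (greedy_assign, assign_with_budget, the cost table) is hypothetical and has nothing to do with Comair's real scheduler. The idea: compute a cheap greedy answer first, then spend only a fixed time budget trying to improve it, so a backlog makes the answer worse rather than making the system fall over.

```python
import itertools
import time


def greedy_assign(cost):
    """Each flight takes the cheapest crew still free. Fast, not optimal."""
    free = set(range(len(cost)))
    plan = []
    for flight in range(len(cost)):
        crew = min(free, key=lambda c: cost[flight][c])
        free.remove(crew)
        plan.append(crew)
    return plan


def total_cost(cost, plan):
    """Sum of cost[flight][crew] over the whole plan."""
    return sum(cost[flight][crew] for flight, crew in enumerate(plan))


def assign_with_budget(cost, budget_seconds):
    """Start from the greedy plan, then improve it until time runs out."""
    deadline = time.monotonic() + budget_seconds
    best = greedy_assign(cost)
    best_cost = total_cost(cost, best)
    # Exhaustive search is factorial time, so it must be interruptible.
    for perm in itertools.permutations(range(len(cost))):
        if time.monotonic() >= deadline:
            break  # out of time: keep the best plan found so far
        candidate_cost = total_cost(cost, perm)
        if candidate_cost < best_cost:
            best, best_cost = list(perm), candidate_cost
    return best, best_cost
```

      With a generous budget the search finds the optimum for small inputs; with no budget at all it still returns the greedy plan instead of nothing.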

  54. From old information... by gminks · · Score: 5, Informative
    According to this article [written in 1995] , Dell and AT&T created a new company called TransQuest Information Solutions.

    This article outlines how this joint venture re-vamped Delta's IT systems (again remember, this is 1995):


    During 1995 and 1996, TransQuest reengineered Delta's systems to migrate them from Hitachi mainframes running Natural, Adabas, and DB2 to an open systems environment. The new systems are written in C++ and access Sybase databases of reusable and distributed objects. The systems run primarily on Sun, HP and AT&T servers under UNIX with clients running under UNIX, MS-DOS, and Windows. The clients are connected to the servers over high bandwidth TCP/IP frame relay networks.

    Job titles for the company's 1,100 computer professionals include Systems Engineer and Software Engineer 1 through 8. Staff members recently developed an aircraft weight balance system that can be accessed by pilots to determine how luggage and fuel have been distributed within the aircraft for balance during a flight. This system was developed in C++ on AT&T and HP UNIX servers and will be available on 40,000 devices to 2,000 users.


    The trail runs dry here, job postings stopped around 2001.

    Which really raises suspicions that all the code is written and maintained offshore. The question now becomes who is handling this for Delta.

    One of Tata's spinoffs, Airline Financial Support Services, is described as


    "an example of an external service provider that handles a wide range of back-office functions for the airlines. AFS handles sales, refund, traffic and cargo; performs fare audits; manages yields and revenues by performing departure and post-departure processing checks; books crews; deals with overbooked flights and wait-lists; adminsters frequent flyer programs; draws up flight navigation charts; such as landing or route facility charts; and provides customer care." This according to ebstrategy.com


    Wipro handles some of Delta's inbound reservation calls in India and the Philippines.

    In conclusion, it would appear that either Tata's AFS arm or Wipro do the IT for Delta airlines.
    1. Re:From old information... by Kalak · · Score: 1

      Since it appears to be crew scheduling that was the issue, I'd look in the direction of Tata. Thanks for the info.

      --
      I am, and always will be, an idiot. Karma: Coma (mostly effected by .hack)
    2. Re:From old information... by The+Wookie · · Score: 1

      Having been a part of the ugly TransQuest fiasco, I am pretty familiar with most of this. TransQuest was a joint venture between Delta and NCR, which was owned by AT&T at the time. TQ had an absolutely terrible turnover rate and made a lot of temp agencies rich.

      Around the middle of 1996, I think, AT&T divested itself of NCR and Delta ended up making TQ a wholly-owned subsidiary. I left in early 1997, and I think it wasn't more than a year or two before TQ became Delta Technologies and the TQ employees became Delta employees again. As far as I know, Delta still does most of its own IT, I know some of my former co-workers are still there.

      Delta does contract some things out, but they are almost always installed and maintained in Atlanta.

  55. Re:The system runs Linux by Antique+Geekmeister · · Score: 1

    Nah, under Linux you can trivially create new files as swap space when needed. It may mean they overflowed available partition space on critical systems, or were unable to administer a heavily loaded system fast enough to add swap before it overflowed.

    Knowing nothing else, I'd guess they overflowed a key database partition. A lot of old programmers very foolishly over-partition available disk, trying to outguess the OS about what partition will need how much space and, instead of protecting themselves from disastrous overflows, actually causing disastrous overflows. It's an old programmer's habit that's hard to train people out of.
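
    For what it's worth, watching for a partition that is about to overflow is cheap. A minimal sketch in Python using the standard shutil.disk_usage call; the 90% threshold and the function names are assumptions for illustration, and which mount points to watch depends entirely on the system.

```python
import shutil


def usage_fraction(path):
    """Fraction of the filesystem holding `path` that is in use (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystems can report zero size
    return usage.used / usage.total


def partitions_near_full(paths, threshold=0.90):
    """Return the paths whose filesystem is at or above the threshold."""
    return [p for p in paths if usage_fraction(p) >= threshold]
```

    Run from cron against the database mount points, something like this at least gives a warning before the partition fills rather than after.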

  56. Oh, I know what happened by Anonymous Coward · · Score: 0

    They updated gcc. That always did it for me.

  57. No manual process? by SCHecklerX · · Score: 1

    WTF can't they do it manually? It's just keeping track of seats on planes for fsck's sake. Sure, they may not be able to accommodate everyone right away, but they could certainly do better than "nobody can fly at all because our computer system crashed". If a restaurant loses their computer, they don't stop admitting people. They just go back to paper orders/receipts.

    1. Re:No manual process? by aggles · · Score: 2, Insightful

      Hopefully someone from Comair reads /. and will not be able to resist spilling the beans. This sounds like a lawsuit in the making. It was not weather related - it was someone trying to either save a buck by writing crappy software or having poor operational procedures. This is a Sarbanes-Oxley event - and hopefully, the truth will come out about what happened, and why the backup procedures were either not-in-place or did not work. I don't want to see them go bankrupt, but they should be held accountable.

    2. Re:No manual process? by Anonymous Coward · · Score: 0

      It's not just tracking seats. You have to try and figure out crew assignments, plane assignments, etc. Match crew to types of aircraft they're qualified to fly. Try to make sure that the crew don't go over their hour limit that day. Make sure that the type of plane can land at the airport you're scheduling it to go to. Make sure you have access to stands/gates that can take that type of plane. Ground equipment that can handle that type of aircraft at that airport.

      Sure you can do it manually. But unless you do it right then the problems just get worse and worse and you never recover. That's where the computer comes in.
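
      To make two of those checks concrete (type qualification and the daily hour limit), here is a small Python sketch. The Crew and Flight records, the aircraft type labels and the 8-hour MAX_DUTY_HOURS figure are all made up for illustration and are not real airline rules. Each check is trivial on its own; the hard part is running all of them across every crew, plane and gate at once.

```python
from dataclasses import dataclass

MAX_DUTY_HOURS = 8.0  # hypothetical daily limit, not a real regulation


@dataclass
class Crew:
    name: str
    qualified_types: frozenset  # aircraft types this crew may fly
    hours_flown_today: float


@dataclass
class Flight:
    number: str
    aircraft_type: str
    duration_hours: float


def can_assign(crew, flight):
    """True if the crew is type-qualified and stays within the hour limit."""
    if flight.aircraft_type not in crew.qualified_types:
        return False
    return crew.hours_flown_today + flight.duration_hours <= MAX_DUTY_HOURS


def eligible_crews(crews, flight):
    """Names of all crews that pass both checks for this flight."""
    return [crew.name for crew in crews if can_assign(crew, flight)]
```

      A real scheduler would add airport, gate and ground equipment checks in the same style, which is where the combinations start to pile up.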

    3. Re:No manual process? by wwahammy · · Score: 1

      It's a little more complex than a restaurant. We're talking about tens of thousands of passengers, plus rerouting flights to new places, assigning the correct crew members to flights and lots of other accounting type things. I would assume also that the reservation software is linked in some way to one of Homeland Security's wonderful databases so that just makes everything a lot harder. It's not quite as simple as tallying how many people get on the plane.

    4. Re:No manual process? by winwar · · Score: 1

      "Its a little more complex than a restaurant. We're talking about tens of thousands of passengers, plus rerouting flights to new places, assigning the correct crew members to flights and lots of other accounting type things."

      Sure, it's complex. But they have lots of employees. Anyone not working could have been called in (aka, mandatory overtime).

      The process would have been slow and inefficient. But at least it shows you are doing something. That you give a damn. And it gets something done. Because, after all, it was the airline's fault in the first place....

      But, they don't care and are cheap. And it shows.

    5. Re:No manual process? by Anonymous Coward · · Score: 0

      Really, though, it's not a matter of just throwing people at the problem.

      You act as if those planes are just sitting there, waiting to be filled with passengers. Airlines shuffle planes like crazy; a little weather disturbance in one critical city can mean that the needed planes aren't even at the airport. Sorting things out takes serious coordination.

      Crew members get shuffled around, too. The pilot may be in one city, the crew in another. More coordination.

      Additionally, planes have to undergo mandatory maintenance after every X number of hours. Crew (and ground personnel) are unionized, so they have a maximum number of hours they can work. Checking that all the planes have received their required maintenance involves a lot of verification. Usually, the computers do this automatically; without the system, you'd have to send runners down to the ground crews and maintenance people to make sure that plane N435333DA is flightworthy.

      And then there's baggage. Each bag must be matched to a passenger. This happens rarely enough even when things are working correctly.

      These things snowball quickly as delays mount.

    6. Re:No manual process? by Anonymous Coward · · Score: 0

      Risk Management 101.
      They better have a manual process.
      Phones, conference calls and airport motels, coupled with the incentive that many may have no job next year, will solve most problems.
      Now if Homeland security has nixed manual options, the airline can apply for an exception, as a packed airport with milling pax is a risk.
      A crass alternative would be to cancel all flights hoping pax would go and stay home. Seems this failed, as the airline has been less than forthcoming with details and particulars.

  58. seen it before by sacrilicious · · Score: 1
    It appears that due to weather and other problems that flights began to be cancelled on Thursday and the backlog choked the system. 1,100 flights have been cancelled so far, including all flights through 12/26. Does anyone know what platform their system was based on? What kind of system just totally crashes?

    Sounds like Diebold may have been contracted for the job.

    --
    - First they ignore you, then they laugh at you, then ???, then profit.
  59. Detroit, December 23rd by Anonymous Coward · · Score: 0

    I flew Northwest on December 23rd into Detroit, and from Detroit to Minneapolis. Due to the weather, the problems that day were terrible. My friend's connecting flight was canceled, and the lines to rebook flights were pretty outrageous. Thankfully, I remembered NWA has phone banks for this sort of thing, so we got her on the last flight out that day.

    Hope you get your luggage back soon.

  60. I'd like to know by HangingChad · · Score: 2, Interesting
    Not just the database platform and front end but who built it. This just has E-D-S stamped all over it. Everybody has a system go down once in a while, but it just seems like EDS has had more than their share.

    This is a worst case scenario for a system of that nature because of so many dependent calculations and calls to other systems. It takes more than just having a plane and a crew...which is a lot of work all by itself. It has to have a gate and connecting flights. Then multiply all that by 30,000 people, roughly 120 plane loads, and complicate it by some airports being closed. I bet you could actually watch the lights get dimmer in the server room. Still when you know the potential peak demand you have reserve capacity. Slow is okay, stop is unacceptable.

    --
    That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
  61. True story by john-gal · · Score: 1

    Well, the janitor did not pull the plug, but at a major airline, the carpenter did. He wanted to plug in his electric drill and pulled the plug on the server and its power backup and every other wire plugged into a power source in sight!!! This came to light after it was reported to my company (we provide the software) that the system had crashed totally. We sent 2 people over and they came back laughing so hard that it took them 30 minutes to tell us what had happened.

  62. Bailout by parliboy · · Score: 1

    After 9/11, pretty much all of the domestic airlines were bailed out by the government to keep them from going poof (except for Southwest and a couple of others, who didn't have their heads up their asses). So I just want to know how long it will take this Delta affiliate to plead for money. Not only has it screwed over all of those passengers, but the taxpayers will collectively pay for it.

    --
    "You're never ready, just less unprepared."
  63. MOD PARENT UP by johannesg · · Score: 1

    At least, if he is speaking the truth about this ;-)

    1. Re:MOD PARENT UP by Anonymous Coward · · Score: 0

      When I worked there before the dot-com craze, there was a single PC running SCO.

      Well, for some people, I guess that might be considered large-scale.

  64. Scare thought by ZeroReality · · Score: 1

    Some companies refuse to do a complete system overhaul. They just keep patching and upgrading their decade-old software.

    I did a little research: Comair was made Y2K compliant with only minor changes.
    http://budgettravel.about.com/library/weekly/aa101 599.htm What I am saying is the software could be a couple of decades old and be a rat's nest of code. Kobold anyone?

  65. Re:Huge earthquake by Anonymous Coward · · Score: 0

    Looks like the "true believers" were confused, but I got it.

    To the outraged -- this guy is making fun of you by exaggerating common Christian attitudes to the extreme. It's a parody. Get it?

  66. Southwest refuses to drink the Kool-aid by Oswald · · Score: 4, Interesting
    This computer problem of Comair's just demonstrates how unworkable the hub-and-spoke system of flight scheduling is. It's a flawed concept, foisted on a naive public by an industry locked in some sort of mass psychosis. In the pursuit of minor economies of scale, the big airlines treat their passengers like packages (hey! it works for Fedex, and their cargo can't even walk itself to the next gate...), treat airport runways and air traffic controllers like unlimited resources, and waste vast amounts of jet fuel. The fact that Southwest Airlines (which does not use a hub-and-spoke scheduling system) is profitable, and the rest of our major airlines are either in, just out of, or about to go into, bankruptcy doesn't seem to dent their thick skulls.

    I have watched the operation at Atlanta for over 21 years, and I've seen how cutthroat the competition for a major hub is, but it feels like watching two dogs fight over two bones--you can't tell if they're fighting out of greed or stupidity. Southwest doesn't even fly into Atlanta--they know that only a pyrrhic victory would be possible under those circumstances. Management at the other airlines has been criminally incompetent ever since airline deregulation, but it's the passengers, employees and shareholders who pay the penalty time and again.

  67. Possible system OS by Anonymous Coward · · Score: 1, Informative

    Judging by www.comair.com and their job ops, it's probably HP-UX or Windows. More than likely the Unix flavor rather than Windows. Why down for a couple of days? Probably a database restore. Never happens in TPF. Those mainframe systems crash and are back up with very little database degradation. By the way, in the job ops, if you want to be a crew scheduler, you only need a HS diploma!

    1. Re:Possible system OS by Anonymous Coward · · Score: 0

      "Never happens in TPF."

      Ha ha ha ha ha ha ha ha.

      That's so silly, you must be a TPF programmer.

      They think it's no big deal to take $500,000 and 6 months to change one input parameter to a program. I can't say enough bad stuff about TPF. It's a 1962 OS that still acts like a 1962 OS.

      The whole system sucks. Why do you think sabre is doing their best to run away from it?

    2. Re:Possible system OS by Anonymous Coward · · Score: 0

      What I meant was that if this crash is causing them to do a database restore, I've seen this happen often in Unix. TPF has a good record on database integrity following a crash.

      TPF like a 1962 OS? TPF is continuously upgraded by IBM.

      When I hear Unix people gloat on the hundreds of transactions per second, TPF is handling thousands of transactions per second.

  68. Re:Huge earthquake by Anonymous Coward · · Score: 0

    ACTroll of the day? Not the only troll, but it is affecting some who (had) decent Karma

  69. Netcraft Confirms by Anonymous Coward · · Score: 0

    Comair is dying!!

  70. TransQuestites by Anonymous Coward · · Score: 0

    ...Dell and AT&T created a new company called TransQuest Information Solutions....

    ...the employees of which were referred to as "TransQuestites" (really! I was a subcontractor working on site at Delta at the time!)

  71. Must Account for Overbooking by syntap · · Score: 1

    The system should only be able to hold as many reservations as it has flights/seats.

    Useless commentary. Airlines overbook to fill planes and the system has to accommodate this.
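
    To put a rough number on that, here is a minimal Python sketch of a booking cap that allows overbooking. The function names and the `show_rate` parameter are made up for illustration; nothing here is taken from Comair's or any other airline's actual system.

```python
import math


def booking_limit(seats: int, show_rate: float) -> int:
    """Largest number of bookings whose expected show-ups still fit in the cabin.

    show_rate is the assumed fraction of booked passengers who actually turn up
    (hypothetical input; a real airline would estimate it per flight).
    """
    if not 0 < show_rate <= 1:
        raise ValueError("show_rate must be in (0, 1]")
    return math.floor(seats / show_rate)


def accept_booking(current_bookings: int, seats: int, show_rate: float) -> bool:
    """Accept a new reservation only while we are under the overbooking cap."""
    return current_bookings < booking_limit(seats, show_rate)
```

    With 50 seats and an assumed 90% show rate the cap works out to 55 bookings, so the number of reservations is bounded, just not by the seat count.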

  72. Re:Huge earthquake by Anonymous Coward · · Score: 0

    And what do you know about the sins "these people" have committed?

    Have you never read Matthew 22:35 - 40?

    Or John 8:3 - 9?

    I don't know what sins "[those] people" have committed, but you have committed the sins of pride and of setting yourself up as their judge (Psalms 7:8), a role which is reserved for the Lord.

    You know nothing of these people, their religion, their spirituality, their worthiness, or the opportunities they've had to "accept Christ", or whether or not they turned it down.

    You are just so insecure in your own spirituality that you have to assume the rest of the world is going to "face an eternity in hell" just so you can feel good about your shaky place in the kingdom.

    You may want to re-read Matthew 7:3 - 5. Heck, re-read (or probably read for the first time) the whole Bible, while you're at it, paying particular attention to when Christ himself speaks. You may learn something of His attitude towards the sinners, which had nothing in common with your attitude.

  73. Huh? by slavemowgli · · Score: 1

    What kind of system just totally crashes?

    Is that a trick question?

    --
    quidquid latine dictum sit altum videtur.
  74. Do some debugging. by Anonymous Coward · · Score: 0

    Seriously, it doesn't sound like you guys have gone through all the steps necessary to say you have really debugged the problem. Depending on your vendor to solve issues like this is asking for pain and suffering.

    IIRC the 1.4 fixed many of the thread locking and file handling problems that people couldn't track down in tomcat installations.

  75. Re:Southwest refuses to drink the Kool-aid by HR · · Score: 3, Interesting

    The problem with your analysis is that point-to-point flying doesn't work when you start talking about international travel. It's just not possible to fly passengers to, say, Germany or Japan from every domestic airport. The way you do it is to accumulate passengers at a major hub on the coast and then fly from there.

  76. Re:Southwest refuses to drink the Kool-aid by PPGMD · · Score: 3, Insightful
    It isn't that easy. For the longest time Southwest was the hardest airline to book a flight on, because they had no web system that could figure out its route system (they only just released one, 5 years later). Up until about July of this year, to book a flight on the web you needed a route map and schedule to figure out what cities you had to go through if there was no direct flight option.

    The hub-spoke system is easier to manage, and can be profitable if the airlines realize that they aren't unlimited resources, and decentralize the hubs on a limited basis.

    Anyway, Southwest doesn't drink anyone's Kool-Aid: they run all their own in-house designed systems (I am not sure they are even on Sabre anymore), including web apps. It's an interesting concept, but it probably causes their IT managers to pull their hair out.

  77. Car dealer by Anonymous Coward · · Score: 1, Interesting

    I worked on a car dealers' wide area network for a short time. Their entire network, all connections to other dealerships, internet connectivity, not to mention their Novell network, dealership inventory, parts, and tie-in to the manufacturer(s) was tied to a single router. They had problems, and I finally drove out there, and found the router "installed" in the drop ceiling above the mechanics' bathroom. The opposite side of that wall was the backer board for the telephone lines, located in a broom closet. I pulled the router down, and the inside had green mildew on the board. Routinely, the housekeeping service would unplug the 25 foot ORANGE extension cord plugged into the single-socket bathroom outlet! I advised the general manager about these problems, told them that they'd best extend their demarc, move the router to a better location, but they never bothered to fix it.

  78. Speculate much? by Anonymous Coward · · Score: 0

    It seems highly improbable that a system would crash because it had too many reservations. The system should only be able to hold as many reservations as it has flights/seats. It would seem that it's more likely that the system was overloaded with use and that caused a meltdown.

    Oh, of course. Given a specific reason from the people in the know, it's much better to assume that isn't the truth and make up your own vague reason, blaming such nasties as "overloading" and "meltdowns". Thanks, random Slashdot poster, I feel much more informed now!

  79. Er, I think I found the problem...they pay squat! by SharpNose · · Score: 2, Interesting

    From Yahoo Jobs:

    Software Engineer Cincinnati, OH $40K -$50K

  80. Probably TPF by Anonymous Coward · · Score: 1, Informative

    More than likely it is TPF as Delta is a TPF shop.

    TPF (http://www-306.ibm.com/software/htp/tpf/index.html) has been around since the '60s and is used by all the major airlines, most of the large hotels and, most bizarrely, NYC 911.

    1. Re:Probably TPF by Anonymous Coward · · Score: 0

      First of all if it was a Delta system it would have also affected Delta flights. Comair reservations are with Delta which are hosted by Worldspan but their flight operating system is apparently theirs.

  81. Re:Southwest refuses to drink the Kool-aid by Anonymous Coward · · Score: 0

    So much negativity... do you have any solutions or something positive to say? I'm sure you, as one person, know oh so much more about major airport planning and massive people conveyance than the hundreds of thousands of employees working in the airline industry for years. It's so very easy to be an armchair quarterback when you don't have costs/safety on one side and people/schedules on the other.

  82. It was just another scout... by Anonymous Coward · · Score: 0

    Prepare for the rise of the machines!!!

  83. A better snow job. They need it. by twitter · · Score: 1, Interesting
    I am only trying to make sense out of the above comment from the official statement above.

    My wife says things just snowballed.

    Crew assignment is a hard problem...

    Records keeping, very tricky. You would not want to try that with any old database, no sir, it might pop a window. Just thinking about how every other airline has managed this tricky problem since before computers makes my head hurt.

    We may never know what really happened but this would be a nice example for my classes :-)

    Yeah, it's a real class act for those 30,000 people sitting around in airports for Christmas, employees doing the same and those who have to recover from this disaster. Management is going to be happy about the publicity they just earned while their huge capital investment in AIRPLANES sits idle during a time of year that's supposed to be their most profitable because their far too expensive M$ "solution" "melted". A chain is only as strong as its weakest link. Employees, I'm sure, are also stranded for Christmas. For the New Year they get to ponder layoffs. What a happy company for you to dissect at your leisure next semester. Season's Best!

    Here's what I'll bet you might learn: WHEN SOMETHING MELTS, YOU LOSE YOUR ASS IF YOU DEPEND ON IT. MICROSOFT MELTS AND HAS POOR OR NO FAIL OVER CAPABILITY, SO YOU BETTER NOT DEPEND ON IT.

    --

    Friends don't help friends install M$ junk.

  84. Re:Southwest refuses to drink the Kool-aid by Anonymous Coward · · Score: 4, Informative
    Actually, the only thing that makes these sort of problems easier for Southwest is the consolidated fleet types. With nothing but 737's, you don't add complexity to the scheduler for things like pilot and f/a qualifications.

    What happened to Comair here could happen to just about any airline. There is no comprehensive suite of software that handles crew scheduling, aircraft scheduling, reservations, and the myriad of other functions that are needed to run an airline.

    Reservations, for all but tiny airlines, are still managed by large TPF mainframes. TPF is a very "bare bones" operating system that runs on IBM mainframes, and was written specifically to deal with high volume / high transaction rate systems. Personally, I've seen 5 attempts at 3 different airlines to replace it with something modern (like Unix with an RDBMS). Each attempt failed miserably, and the airline went back to TPF. Note that TPF is not MVS, OS/390, or any other more mainstream mainframe OS. It's purpose-built.

    Unfortunately, this means that all of the other applications have to interface with TPF via screen scraping. To further compound the problem, no "suites" exist to handle the following functions, so most airlines have to "sew together" best of breed solutions for these basic functions:

    • Crew Scheduling - F/A's and pilots bid on slots to fly, this system takes those bids and turns it into a schedule.
    • Aircraft Scheduling - Tracks which tail numbers are flying which flights for the dispatchers
    • Optimization - Different optimizers to do things like:
      • Fuel Tankering - Use the jets as "tankers" so that you buy fuel where it's cheapest for flights later in the day
      • Crew Optimization - "Traveling Salesman" type solver to incur lowest labor cost, get crews back to home base, etc
      • Schedule Optimization - Use the aircraft in the most cost efficient way to cover all of your scheduled flights.
      • Maintenance Optimization - Pull aircraft in for Scheduled Maintenance at the optimum time.
      • Reaccommodation - When things go wrong (weather, mechanicals, whatever), pull in all of the above variables to crank out a new schedule, crewing, mx schedule, etc.
    • Booking Engines, for the internet and reservations agents
    • Point of Sale and Boarding functions for agents, skycaps, and kiosks
    • Interline functions where other airlines sell your tickets, and transfers for baggage, etc
    Anyhow, this list isn't comprehensive, but shows enough of the disparate pieces that you can imagine why these "glitches" happen. Very few of the items from the list above come from the same vendor, or even run on the same platforms.
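
    As a rough illustration of just one of those pieces, the fuel tankering decision boils down to simple arithmetic. This is only a sketch: the function names, the per-kg prices and the `burn_penalty` fraction (the share of loaded fuel burned just hauling the extra weight) are hypothetical, and a real optimizer would juggle far more constraints.

```python
def tankering_saving(extra_kg: float, price_origin: float,
                     price_dest: float, burn_penalty: float) -> float:
    """Money saved by hauling extra_kg of fuel from the origin instead of
    buying it at the destination. A negative result means tankering loses money.
    """
    if not 0.0 <= burn_penalty < 1.0:
        raise ValueError("burn_penalty must be in [0, 1)")
    # Part of the tankered fuel is burned carrying its own weight, so more
    # than extra_kg has to be loaded to have extra_kg left on arrival.
    loaded_kg = extra_kg / (1.0 - burn_penalty)
    cost_if_tankered = loaded_kg * price_origin
    cost_if_bought_later = extra_kg * price_dest
    return cost_if_bought_later - cost_if_tankered


def should_tanker(extra_kg: float, price_origin: float,
                  price_dest: float, burn_penalty: float) -> bool:
    """Tanker only when the price gap outweighs the extra burn."""
    return tankering_saving(extra_kg, price_origin, price_dest, burn_penalty) > 0.0
```

    Even this toy version needs prices from one system and burn figures from another, which hints at why the pieces end up sewn together.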
  85. Re:Southwest refuses to drink the Kool-aid by tigerc · · Score: 1

    Hub and spoke isn't the problem. You NEED it to get anywhere that's not a popular destination. By saying that hub and spoke is a flawed concept, you effectively resign smaller cities to death.

    Say I want to go to Butte, Montana and I live in Boston. How is a direct route method of Southwest airlines going to get me there? Isn't the most efficient and cost effective way for an airline to transport me on a larger jet, to say Denver, and then, use a smaller less than 100 passenger plane to fly me to my destination of Butte? How do you say to medium sized cities without rail lines, we're not going to use the hub and spoke method and we're going to destroy any sort of business you have (not just tourism, but meetings/conferences)? How do you tell that to the people who will have to drive hours to get to a major airport?

    I agree that the upper management is corrupt. It really is. But just because management is corrupt, you can't go saying that hub and spoke is flawed. In the future, I foresee a multitude of direct route airlines, and one big airline that still employs hub and spoke (either government subsidized, or just large enough and efficient enough to turn a profit). After all, how are airlines like Southwest going to get us overseas? Suddenly, the Jet Blue/Southwest system doesn't seem so efficient.

  86. Here is a clue by AnuradhaRatnaweera · · Score: 1
    Does anyone know what platform their system was based on?
    This site has a clue: check the link about Alan Cox's Windows 2000 snapshot capabilities. ;-)
  87. Re:Southwest refuses to drink the Kool-aid by ScrewMaster · · Score: 1

    So, what you're saying is this industry is operated by a disorganized hodge-podge of cross-connected hacked-up computer systems, such that it is a minor technological miracle that planes a. get off the ground, or b. ever make it to where they're going. Thanks ... I feel much better now.

    --
    The higher the technology, the sharper that two-edged sword.
  88. Yep, you are right! by Anonymous Coward · · Score: 5, Informative

    Your statements are accurate.

    I was a Unix sys admin there, but left for greener pastures during the dot-com craze. The non-redundant hardware at the time ran AIX, and had a great support contract from IBM. The SBS application, however, always had monthly issues, at least at that airline. They were looking for a replacement then, and I'm not surprised they still haven't replaced it.

  89. Re:British Isles DESTROYED by asteroid-tidal wave by Anonymous Coward · · Score: 0

    USA east coast destroyed by tidal wave.

    Oh dear.

  90. Simple Solution by jeephistorian · · Score: 2, Informative

    Take Amtrak!

    Amtrak receives around $500 million for a total budget, while air travel receives around $15 billion in subsidies. Take the train and save everyone money!

    _____________

    --
    Huh?
    1. Re:Simple Solution by jcuervo · · Score: 1
      Take Amtrak!
      Amtrak is actually way more anal about photo ID. Where the airlines accepted my Bank of America debit card with my photo on it as identification, Amtrak didn't. Fortunately, it wasn't that long of a drive...
      --
      Assume I was drunk when I posted this.
    2. Re:Simple Solution by HeghmoH · · Score: 1

      It's too bad Amtrak sucks harder than I believed physically possible, and can't even keep their trains on time in clear weather during the quietest periods of the year.

      --
      Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
    3. Re:Simple Solution by the+pickle · · Score: 3, Insightful

      Sure, that's eminently practical. I can take 48 hours to get from Detroit to LA, or I can take six (including travel time and check-in time at both airports).

      p

  91. Southwest's fuel by alder · · Score: 1
    ...just demonstrates how unworkable the hub-and-spoke system of flight scheduling is
    Quite possibly that is a big factor. Yet Southwest, at least for now, cannot be used as a litmus test. Their timely fuel hedging helps them reduce the cost enormously. (see also here)
    1. Re:Southwest's fuel by thogard · · Score: 1

      You can look at other airlines around the world that use the old TWA style system and compare them to the airlines using the Southwest style system. The ones using the Southwest system are doing well and the ones using the old steamship ticketing model are having huge problems.

      I still don't understand why they put the plane departure time on the ticket when they should put the checkin time.

      I don't care about airlines anymore. I have a pilot's license and only use them for very long flights or overseas flights.

  92. poor SKS ... by Anonymous Coward · · Score: 0

    lol

  93. Outsourcing, of course! by Anonymous Coward · · Score: 0

    They probably outsourced the software development to India.

    Think of how much they saved by paying the programmers peanuts!

    Think of how much they lost by having to shut down their business!

    1. Re:Outsourcing, of course! by hoolie · · Score: 1

      Yep, They will do anything to "cut costs"

  94. Re:A better snow job. They need it. by Lord_Dweomer · · Score: 1
    "What a happy company for you to dissect at your leisure next semester. Season's Best!"

    Oh come off it. What would you like him to do, sit there feeling bad for all the people who got screwed over? At least he's being proactive about it studying and analyzing it to try to figure out ways to prevent it in the future, as opposed to sitting on Slashdot and bitching how someone is being analytical when the troll (sorry, poster) thinks they should be sympathetic instead.

    Of course, he might be sympathetic as well, but who cares when we have blanket assumptions like the ones you've made in your flamebait post.

    --
    Buy Steampunk Clothing Online!
  95. Re:Southwest refuses to drink the Kool-aid by Zak3056 · · Score: 1

    It's an interesting concept, but it probably causes their IT managers to pull their hair out.

    Interesting question: Would you rather have an easy(ier) job with a company that loses billions of dollars a year, or a hard(er) job with a company that actually makes money and is going to be around for the next five years?

    Me, I'd choose the latter. I'd rather be bald than working for a company with no future.

    --
    What part of "shall not be infringed" is so hard to understand?
  96. wrong... by Anonymous Coward · · Score: 0

    .... no EDS at that shop. The old vice-VP of IT, now "senior vice president of customers", preferred IBM Global Services to get his 'master plan' from. Back in 1999, they wasted gobs of money on IBM to tell them how to 'make their systems go', but in the end, they didn't really change a thing.

  97. Re:Southwest refuses to drink the Kool-aid by Anonymous Coward · · Score: 0
    Heh. Yes, that's exactly what I'm saying. Also, this isn't the worst industry as far as IT goes.

    The absolute worst one is Healthcare. They don't pay well for IT, and they have a culture where each department gets to pick their own technology. So, Admissions, Radiology, the Pharmacy, etc, pick their own stuff. Then a bunch of underpaid IT staffers cobble the stuff together with crappy JCL, shell scripts, etc.

  98. Re:Southwest refuses to drink the Kool-aid by hemp · · Score: 1

    Yep...SWA is still on SABRE or more precisely SAS, Braniff's old rez system in Tulsa(still TPF based).

    Everything else is done in house in Dallas.

    Which leads one to a question - why does the most profitable airline in USA not outsource its IT??

    --
    Skip ------ See the latest from http://www.anArchyFortWorth.com
  99. Delta sucks balls by Anonymous Coward · · Score: 0

    Out of all the airlines I've flown, Delta is the absolute worst in terms of customer service. Actually, Delta Shuttle is the absolute worst--the flight attendants are overtly hostile. Delta totally sucks and I never use them if I don't have to.

    I love SWA. :)

  100. Delta fuct me and didn't even kiss me! by y2imm · · Score: 1

    I got into Boston on the evening of the 24th for the final leg of my HNL-YFC flight. The small gate area for Comair passengers was overflowing with not-so-happy-looking folks. Checking with a few of them I discovered many had been waiting since early afternoon for their flights, with nothing more from the Delta folks but glib, non-informative announcements now and then. The aircraft were on the ground, but there were no crews available to fly them. I waited patiently while my flight got delayed and delayed again. Only after many passengers had taken the only available flights remaining to alternate destinations did Delta/Comair finally admit our flight was cancelled. They gave us hotel/meal vouchers (well, I got a hotel voucher anyway) and we went our way disappointed but expecting to get to our destinations a day late.

    The next morning when we arrived back at Logan, our flight had already been cancelled. The lineup stretched from one end of Terminal C to the doors leading to Terminal B, and then some. While we waited, an airport rep was walking the lines asking where people were going. When our Comair destinations came up, he said "Go home. There'll be no flights out until maybe Monday or Tuesday." I thought people were going to cry right there. Our little band decided to try renting a car and driving. No luck, with no cars available at Logan that day. We tried buses, but no buses were running to our destinations that day. I seized upon the idea of using my Air Canada Aeroplan miles to get a free ticket to my destination (plus booking and taxes, $60).

    When I was waiting in the Air Canada line, I saw one of the people I was with earlier. She told me the Delta rep told her to go to other airlines and ask if they would give carriage based on the Delta tickets she had. Basically, Delta was telling people they were on their own. One guy told me he watched a Delta agent tell a lady to fuck off. Anyway, I got to the counter and asked the Air Canada agent if they were doing anything for Delta passengers. She told me unless Delta signed the ticket over to us they were not doing anything special for us. I didn't know what that meant, but I knew Delta was doing fuck all for us, so I went ahead and used my Aeroplan mileage ticket.

    The Delta fuck-up was basically all the news on Christmas and Boxing Days. Even the Canadian Immigration guys who never talk or joke around were feeling sorry for us. I don't know what I can get from Delta for the HUGE PITA they caused me, but I don't think I'll ever fly them again.

    1. Re:Delta fuct me and didn't even kiss me! by angrykeyboarder · · Score: 1

      "...he told me unless Delta signed the ticket over to us they were not doing anything special for us. I didn't know what that meant..."

      The Air Canada agent was probably assuming that the Delta ticket in question was a non-refundable ticket. If so, it was only valid on Delta (or its regional affiliates like Comair).

      In the case of a ticket endorsed "Valid only on XX" (as non-refundable tickets are) said ticket would have to be reissued to reflect validity on another airline before any other airline could take it.

      In other words, the other airline wants to be paid by Delta for accepting their passengers.

      --
      Scott

      ©20014 angrykeyboarder & Elmer Fudd. All Wights Wesewved
  101. Didn't RTFP very well by A+nonymous+Coward · · Score: 1

    He said he fell back and by accident hit the power switch.

    You said Write-permit rings or write rings are light and don't actually fly well, so I doubt they were playing with that as it is unlikely to mass enough to flip the switch.

    His body certainly had enough mass.

  102. mod parent up! by Anonymous Coward · · Score: 0

    something's wrong with the way mba's are trained .... very wrong ...

  103. Mod parent up! by Anonymous Coward · · Score: 0

    Hell yeah!

    And Amtrak posts an operating profit on the NE Corridor line. Acela is a way better way to travel along that corridor--more legroom, AC outlets, tables, picture windows, beer on tap, you can use your cell phone, smoother ride, easier to get in and out of terminals, and the conductors are nicer than flight attendants.

  104. Mod parent up! by Anonymous Coward · · Score: 0

    Hell yeah!

    Amtrak 0wns in the NE corridor.

  105. Southwest's business model fails in isolation by Anonymous Coward · · Score: 0
    Southwest's cherry-picking of good routes that enables them to make a profit would fail in a vacuum, or if all other airlines did the same thing.

    How many folks fly Southwest from X to Y, but use another airline to go other places?

    Southwest - with political help from John McCain, probably - gets the profitable X to Y and leaves the rest for Delta and United and US Air and American.

  106. You're fuct just by flying Delta by Anonymous Coward · · Score: 0
    You can either go through the glorified Comair bus terminal in Cincinnati, or fly through Atlanta.

    Either way, from a traveller's point of view, you need a belt to bite on.

    I really fear what Platinum Medallion members (Delta's highest level of frequent flier) must be capable of...

  107. Re:bullshit. by Anonymous Coward · · Score: 0

    The system in question was a custom app running on AIX. Not that this makes any difference to certified kooks like yourself, but you should probably find another thread to blather your well-rehearsed anti-Microsoft rants.

  108. Re:Southwest refuses to drink the Kool-aid by Anonymous Coward · · Score: 0

    It's a flawed concept, foisted on a naive public by an industry locked in some sort of mass psychosis.

    Odd that you don't give any alternatives to this kool-aid. What do you propose?

    I always thought an air-taxi service would work for the execs that fly first class. Arrive at the small regional airport, get on with 12 other people (no screaming kids or dorks with 4 pieces of luggage), and you're off.

    However, that only works for 12 people at a time. You have some fixed costs (pilots making $150k a year, no matter if they fly 300 people at a time or 12) that require an airline to stay with the hub and spoke, so tickets would be 5x more expensive than a normal flight.

  109. Southwest refuses to drink the Kool-aid-Monopoly by Anonymous Coward · · Score: 0

    "Heh. Yes, that's exactly what I'm saying. Also, this isn't the worst industry as far as IT goes."

    Tough choice. Hodge-podge with some competition, or monolithic monopoly, where there's none.

    Which one would make people "feel better"?

  110. McPay. by Anonymous Coward · · Score: 1, Insightful

    "From Yahoo Jobs:

    Software Engineer Cincinnati, OH $40K -$50K"

    That's more than I make at McDonalds.

    1. Re:McPay. by groot · · Score: 1

      Not by much!

      --
      "Just remember, it takes a village idiot." -- The Motley Fool.
  111. Exactly-Blue Screen Of Depression. by Anonymous Coward · · Score: 0

    "Computers don't freak out or get depressed when work piles up. "

    Try loading Windows on them.

  112. Gee, What kind of system just crashes? :-D by Anonymous Coward · · Score: 0

    Can anyone say buffer overflow?

  113. Re:A better snow job. They need it. by Dachannien · · Score: 1

    A few years ago, a fellow was running an antique steam engine at the county fair. The steam tank ruptured, and one person was killed.

    This is obviously a tragic event for the person who was killed and his family, but it's also an interesting and important engineering problem. So much so, in fact, that the following year's PhD qualifier in our mechanical engineering department consisted of an analysis and redesign of the steam engine, so as to prevent such an explosion from happening again.

    It's a *good* thing to study problems like these, even in an academic sense, because academia is the very first step in the production of usable goods. If people aren't learning from these mistakes, then problems like the stranding of thousands of passengers in airports or the accidental death of a poor fellow at the county fair will happen again.

    because their far too expensive M$ "solution" "melted"

    In case you forgot to read the first half of the comments to this thread, it's already been revealed that ComAir's system runs on AIX, a product of IBM, and that the software was developed by a subsidiary of Boeing. When you flame Microsoft for something, at least make sure they're *involved* first.

  114. Not entirely accurate by Anonymous Coward · · Score: 0

    "Rule 240"...refers to each airline's "conditions of carriage" policy.
    You would need to contact the airlines to obtain this.


    By coincidence, I had occasion to do this the other day.
    The 1st link has a bunch of links at the bottom of the page to many airlines' policies.

    gewg_

    1. Re:Not entirely accurate by angrykeyboarder · · Score: 1

      Still, "Rule 240" went out with Airline Deregulation years ago. It's a generic term these days and every airline can pretty much do what they want as far as re-accommodating passengers.

      For the most part they are fairly consistent and I'm sure that's mostly only for customer service/competition reasons.

      --
      Scott

      ©20014 angrykeyboarder & Elmer Fudd. All Wights Wesewved
  115. response from an AA employee by dan_bethe · · Score: 2, Interesting

    I sent a summary of these Slashdot comments to my cousin who works at American Airlines hq in Dallas. Here's his response!

    ---

    "ugh... I worked 9pm-1am yesterday (xmas day). I spent the first two
    hours of my shift calling people to tell them their flight was
    cancelled and reschedule them. Most of them were taking flights out to
    Miami and the Caribbean to spend New Years Eve partying on the beach.
    Honestly, I had little pity telling them they were going to miss out on
    one day of tanning especially since they seem to 'blame' the weather on
    us.

    "One hour into my shift our reference system went down. No IT people
    were willing to come in and fix it. I had the system up for booking
    flights and making reservations, but I could not look up any of our
    rules and regulations. Ah well, enjoy your xmas off IT guys!! Enjoy
    the weather in Cabo San Lucas!! Cheers!!

    "Fortunately, we have a backup of all our html files saved as text
    files. However each text file can only hold several hundred text
    characters. So, when I want to look up our baggage policies the normal
    html file is called BAG INFO. In the backup system BAG INFO is
    separated into 10 or 20 text files and I have to 'page' through them by
    typing BAG INFO P2, BAG INFO P3, BAG INFO P4. The text files are not
    indexed and are not searchable. It took me 10 minutes to find and
    advise someone how big a bag they can take to Puerto Rico.

    "After I started taking incoming calls again, there were people calling
    in on Christmas day to book their trips for Spring Break. There were
    over 100 calls on hold to talk to us, and there were people sitting on
    hold for half an hour to ask me how much it would cost to book a trip
    to Fort Lauderdale in March. Couldn't that wait until the day after
    Christmas?

    "Yes, the airline industry does not prepare for emergencies as well as
    it could for the holidays when people want to travel in record numbers.
    However, I think the general public could try to have their own backup
    plans in place as well and realize that the travel industry in general
    does not have the equipment or the staff to handle everyone in the
    country wanting to travel all at once in one week. Do people stock
    their refrigerators year round with enough food to feed everyone in
    their families at one meal like they do at Christmas?

    "Even though we try to accommodate everyone as best as we can on the
    holidays, we want to have a holiday just as bad as the rest of
    everyone else. Working in the travel industry should not indenture us
    to be your slaves over holidays. The public needs to have a little bit
    of compassion and realize how much we give up in our own personal lives
    just to help you get where you are going. Frankly, the way most people
    treat me on the phones I don't think they deserve our help and
    compassion. And don't call on Christmas day to book flights in March.
    That phone call is making someone work on a day they shouldn't have to.

    "anyways.... heh..... guess i had a bad night at work last night, huh

    "MERRY XMAS!"

    1. Re:response from an AA employee by winwar · · Score: 1

      Some comments:

      "One hour into my shift our reference system went down. No IT people were willing to come in and fix it."

      I wonder why. If this is accurate, of course. I would define this as an emergency; willingness would have nothing to do with it. In other words, if you don't come in, you have just accepted unemployment.... Of course, if this wasn't in their contract, etc., it's just one more instance of incompetent management....

      "Yes, the airline industry does not prepare for emergencies as well as it could for the holidays when people want to travel in record numbers. However, I think the general public could try to have their own backup plans in place as well and realize that the travel industry in general does not have the equipment or the staff to handle everyone in the country wanting to travel all at once in one week."

      Ahh, yes. The "I'm incompetent, but it's your fault" defense. If they didn't have the staff or equipment, why did they sell the tickets? I mean, it's not like this was a surprise; this increase in travel happens every year. Hell, the airlines encourage it. Look, it is the airlines' job to prepare for things like this. If they can't or won't, then why don't they advertise this fact? Perhaps because they will be seen (rightly) as greedy and incompetent idiots? I mean, if they offered to refund people's tickets, even those that weren't refundable, most of the animosity would disappear.

      "Working in the travel industry should not indenture us to be your slaves over holidays."

      What a crock of crap! You get paid for your work. All labor laws remain in effect. Therefore you are not slaves.

      "The public needs to have a little bit of compassion and realize how much we give up in our own personal lives just to help you get where you are going."

      And exactly what do you give up? It's called a job. You get paid for it. With the full knowledge of what it is going to be like this time of year. And it's not like you helped people get where they PAID to go very well....

      "Frankly, the way most people treat me on the phones I don't think they deserve our help and compassion."

      And the last time I was treated with compassion by the airlines in this situation was, oh, let's see, never. Look, you took their money for a service, now you are calling to say, oops sorry, we can't deliver this service because of the weather (because we are actually incompetent goes unsaid). Did you offer refunds? Probably not (oh, but you bought a non-refundable ticket....) Look, people don't deserve crap over the phone, but I doubt they were getting much help or compassion....

    2. Re:response from an AA employee by dan_bethe · · Score: 1
      All labor laws remain in effect. Therefore you are not slaves.

      Unbelievable. How to even respond ... the original author used an expression called "hyperbole". Comparable to "reduction to absurdity" for the purpose of illustration and summary.

      This was commentary on the personal experience of one individual inside a megacorporation -- an individual you haven't encountered and know absolutely nothing about. He merely advocates compassion and common sense. You're trying to project the idea that individual clerks are in any way responsible for the industry's strategic calamity. Then you ignore that he conceded the calamity in the first place. Then you use that projection as a strawman to stand on as a cheap way to try to elevate your own nondescript allegations.

      And the last time I was treated with compassion by the airlines in this situation was, oh, let's see, never.

      As his commentary proves, you would have if you'd called him! Or any of the subset of countless other highly trained and compassionate clerks who do their best to improve the calamity from within their isolated station, some of whom could use a pick-me-up after the last 50 people who dumped on 'em.

      If you want to pay attention and write about your experience or any other *original* thought so as to *add* to the discussion, then please do. Otherwise, I'm done with regurgitating negativity in the Slashdot guerilla verbal warfare zone.

    3. Re:response from an AA employee by Lord+Flipper · · Score: 1

      Agreed 100%

      I use a Mac at home, Windows at work, and started on old IBM mainframes (Fortran/COBOL) in the early Seventies.

      I read, respond to, and often ignore, posts on a variety of Mac user support LISTSERVS and whatnot. I hear bitching, from people who never read manuals, QuickStart guides, ReadMe.txt files, etc, all the fucking time, saying "This co. sucks", "these -company name- tech support/call desk people Suck", etc, etc... Okay, already, we all know that 1) Help Desk isn't a lot of people's first priority when it comes to Vocational Choices, and 2) Nobody had a gun held to their head (unless you count the need for food, shelter, 'stupid' things like hungry kids, as a 'gun', which I would), when they took those jobs. All that being said, every company I've seen singled out as being the 'worst' happens to be a company I've called. Take Earthlink: there have been times, back in Boca, after severe storms, when I wanted to rule out local damage as a 'fault' for net interruptions, and, even though I knew they might not admit it, I would call Earthlink in Atlanta, to see if the DHCP servers in Miami were okay. Nine times out of ten I could get the guy on the phone to get past "All systems are go", or "Did you reboot?" script-talk, and actually poll the servers in Miami, and give me info. Amazing, eh? How did I do that? By being polite on the fucking phone.

      For a while I would just say "Win2kPro" when they'd ask what OS are you on? Heheh, and then I'd silently translate their advice into mac-speak. That worked pretty good too, but talking to human beings as if they were human beings is just the simplest 'ticket', not only for 'support', but in Life.

      One last example: Microsoft. Everybody hates the company, of course, that's a no-brainer, but the people that work there? I don't know, they might be 'people' too... let's see: I run VirtualPC on a Titanium Powerbook. (I run a lot of things in VPC, but 2kpro was my example.) So, after a few years, some unrelated issues and whatnot, I find myself with a re-install of Windows, that was simple, and no ID, no Auth number, no serial, etc, can't find the "card" with the number on it. (Which I'm convinced never existed, until 9 months later when I find it). Ok? So, I call Redmond, expecting the worst, and who wouldn't expect the same? Call answered on 2nd or 3rd ring. Whoa. No voice mail. (WTF?). Nice enough fellow, believes I never had 'the Card', says "Hang on a minute, I'll be right back". Meanwhile, he knows I'm on a Mac, because I set the scene... 15 fucking seconds later he's back on the line, gives me a new serial, and Auth Number, says if I find the 'card that never existed', "don't worry about it, you'll have 2 then". And I'm golden. So much for that. The point is: Companies are about people. "Corporations" are 'entities'. There's a difference. And as for attitude on the part of help desk, etc, it's a software issue, used to have an acronym: GIGO, anybody old enough to remember that? The old Garbage In, Garbage Out.

      As for the guy with the 'Insult to injury' regarding Cleveland... I spent 5 hours in rain, on the tarmac at Kennedy, before they taxied back to the gate to let us all off a cramped 747 to stretch and eat. (airline regs said nothing but peanuts and candy till at elevation)... I stood in a nice long line, was one lady away from the cash register with 'dinner', when the girl from the Airline entered with "Flight blah blah Reboarding Immediately!!!", waved bye-bye to dins, and proceeded to re-board, and wait another 5 hours out on the tarmac. Heheh. A greasy spoon, with a coffee, a smoke, and dinner-on-the-way in Cleveland would have been "Nirvana Tonight", as far as I was concerned. (this from a current New Yorker).

    4. Re:response from an AA employee by Anonymous Coward · · Score: 0

      AA outsources its IT to SABRE/EDS. Holidays are 'out of scope', hence no IT guys coming in unless AA pays through the nose.

  116. Re:Southwest refuses to drink the Kool-aid by PPGMD · · Score: 1
    Some of the systems work fine in almost every environment, it's simply that overall Southwest refuses to use them. Whenever possible they like to develop it in house.

    They believe in doing things differently from all the rest of the airlines, but IMO they are throwing the baby out with the bath water in some cases. Developing your own software has left them behind when it comes to web orders, which means possible revenue and fewer empty seats.

    Don't get me wrong I respect Southwest, but there are areas where they could have done better.

  117. Re:Southwest refuses to drink the Kool-aid by angrykeyboarder · · Score: 1

    As has been pointed out by others, while hub-and-spoke has its flaws, it's also the best solution (from an economic standpoint) to get from say Billings, MT to Jacksonville, FL.

    There is no way any airline could make money flying that route directly.

    And Southwest does, in fact, operate on a hub-and-spoke system. They have major hubs in Dallas (DAL), Houston (HOU), Phoenix (PHX) and Baltimore (BWI) among others.

    However, they also supplement that system with regular point-to-point service, but only on profitable routes (from big metro area "a" to big metro area "b"). Again, they'd lose their shirt on the "Billings-Tallahassee" route.

    On the other hand, Delta can get you from Billings to Tallahassee....
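
    To put rough numbers on the economics argument above, here is a minimal sketch, assuming an idealized network in which every city must be reachable from every other. The function names are made up for illustration, and real route planning involves far more than counting links.

```python
def point_to_point_routes(cities: int) -> int:
    """Nonstop routes needed to link every pair of cities directly."""
    return cities * (cities - 1) // 2


def hub_and_spoke_routes(cities: int) -> int:
    """Routes needed when one of the cities acts as the single hub.

    Every other city gets exactly one spoke to the hub, so any trip
    between two non-hub cities takes at most one connection.
    """
    return max(cities - 1, 0)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, point_to_point_routes(n), hub_and_spoke_routes(n))
```

    Under these assumptions the direct-route count grows quadratically while the spoke count grows linearly, which is one way to see why a thin route like Billings to Tallahassee tends to go through a hub.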

    --
    Scott

    ©20014 angrykeyboarder & Elmer Fudd. All Wights Wesewved
  118. Proof that you guys mean NOTHING to Joe Sixpack by gfecyk · · Score: 1

    A freak weather event did more damage to a computerized reservation system in one night, than all of the hackers, viruses, trojans, spyware and idiot lusers combined over all of 2004.

    Unless you take the overinflated guesstimates of the likes of mi2g at face value, anyway.

    Toronto faced a blizzard last week. Some two hundred flights cancelled because of bad weather. Air Canada, WestJet, Jetsgo, etc didn't go down even if their planes did. That too, caused more damage to YYZ's fiscal health than all of its computer security woes combined through 2004.

    And then I read the letter posted here about the IT guys sunning themselves during all of this.

    Merry xmas. I hope you can justify your jobs in 2005.

    --
    Use Evolution instead of Outlook? Bewa
  119. Re:bullshit. by Anonymous Coward · · Score: 0
    Moderators: Please note that "twitter" is a known fanatical sycophant whose obnoxious offtopic rants are legend here on Slashdot. It doesn't matter what the topic is, he'll find a way to scrape in some pointless Microsoft bashing. While nobody expects us to love Microsoft in any way, his particularly tepid style of calling anyone he replies to "troll" or "liar" or "fanboy" because he happens to disagree with whatever they're saying is well documented and should not be rewarded. If anything, twitter is the type of person that should not be part of the open source/free software community. He is an anathema to all that is good about free software.

    I'm posting this so that you (the moderator) have some context to consider twitter and not mod him up whenever he posts his filler preformatted rants about installing Knoppix or Mepis or whatever that unfortunately get him karma every single time and allow him to continue posting his trademark toxic crap (read on) day in and day out. You may consider this a troll - I consider it community service. And I ain't kidding.

    If you're a /. subscriber, I invite you to look through some of his posting history. I guarantee that you'll be hard pressed to find someone that is more "out there" than twitter. You'll also probably notice he's got quite an AC following. Don't just read his posts, make sure you go through the replies.

    To get an idea of what I'm talking about, check this post out. This is an article about email disclaimers. The parent of the post is complaining about the ads in the linked page and so on, and twitter actually goes off on a rant to blame it on Microsoft and recommend Lynx, because "is teh free".

    Here's another. In this post twitter not only calls the OP a troll but attempts to "tell it like it is" while making some vague argument about "GNU". Yes, if you're confused, you're not alone. The reply (modded +4) proceeds to simply destroy his bogus argument. You will notice he did not reply. This is what some people call "drive-by advocacy". A sort of I'll just leave you with my thoughts here and move on to the next flamebait kind of deal. In fact, he almost never replies because he knows that his fanatical arguments simply do not hold up to any sort of discussion. It's not that he's chosen the wrong cause - he's just going at it in a completely wrong way.

    Here's that drive-by advocacy and FUD in motion: twitter goes on about some topic and then drops the usual "oh and M$ is teh evil" because "WMP phones home" or some such. Called on his FUD, he then claims that WMP stores every song and movie you've ever played in a file, somewhere. Pressed further, he just sort of slithers out of sight, his FUD-spreading complete. This is not about some Microsoft technology that nobody likes anyway; it's about lying for the sake of lying. Way too many of his posts are exactly like this one.

    More? Just read through this post and the subsequent replies. I guess this stands on its own. Or these two. Or this one. Or this one.

    Still not convinced? This is what twitter considers "humour" while going about his daily "M$" routine.

    M

  120. Re:A better snow job. They need it. by Anonymous Coward · · Score: 0
    most profitable because their far to expensive M$ "soloution" "melted".

    Try again, bitch. That's an IBM "solution" Comair was running. IBM, yes. Does that hurt? Feeling mighty retarded yet?

    I cannot believe someone actually modded you up when your other offtopic inflammatory "M$ is teh sux" post got its well-deserved troll rating.

    BTW, it's right after Christmas - give it a rest.

  121. Re:Southwest refuses to drink the Kool-aid by winwar · · Score: 1

    "Hub and spoke isn't the problem."

    Yes and no. Southwest does have hubs. I would say that Salt Lake City is a Southwest hub.

    "You NEED it to get anywhere that's not a nonpopular destination."

    Well, you have a few choices.

    Accept that you can't fly everywhere. (Olympia, WA, the capital of the state, has had great trouble keeping air service of ANY kind; they seem to survive).

    Accept that it will be more inconvenient. Instead of flying to a hub, you will have to fly to a city from which the flight that goes to "nowhere, USA" originates. Many Southwest flights have multiple stops....

    Accept that you will have only a few large profitable "hub and spoke airlines" (fewer than now because they aren't profitable....) And they will routinely suffer huge delays....

    "By saying that hub and spoke is a flawed concept, you effectively resign smaller cities to death."

    No, they just don't get convenient/cheap air service. If I want to go to Olympia, WA, I have to fly into SeaTac (an hour drive). Always been that way, probably always will be.

    Remember the old joke (paraphrased):

    How do you become a millionaire? Buy an airline when you are a billionaire....

  122. Algorithms by coyote-san · · Score: 1

    I think the key algorithms we studied in the mid-90s were developed in the late 80s. If my poor memory is right, the complexity dropped from O(n^3) to something like O(n^lg(n)) or O(n^lg(lg(n))).

    There's even more stunning improvement in the algorithms for solving multidimensional partial differential equations. (e.g., weather). Put a modern vector processor supercomputer running the algorithms from the 70s in one room, and a TRS-80 running the latest algorithms in another room, and the TRS-80 will easily beat the supercomputer. (Assuming it has sufficient memory to hold all of the model data, of course!)

    Implement the modern algorithm on a 1024-node cluster....
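
    A back-of-the-envelope sketch of the "better algorithm on worse hardware" point, under loudly stated assumptions: the O(n^3) and O(n log n) costs and both machine speeds below are made up for illustration. They are not the complexities recalled above, nor real supercomputer or TRS-80 figures.

```python
import math

# Hypothetical machine speeds in simple operations per second.
FAST_MACHINE_OPS = 1e9   # stand-in for the "supercomputer"
SLOW_MACHINE_OPS = 1e4   # stand-in for the hobbyist micro


def seconds_cubic(n: int, ops_per_second: float) -> float:
    """Run time of an assumed n^3-step algorithm."""
    return n ** 3 / ops_per_second


def seconds_nlogn(n: int, ops_per_second: float) -> float:
    """Run time of an assumed n*log2(n)-step algorithm."""
    return n * math.log2(n) / ops_per_second


def slow_machine_wins(n: int) -> bool:
    """True when the slow machine with the better algorithm finishes first."""
    return seconds_nlogn(n, SLOW_MACHINE_OPS) < seconds_cubic(n, FAST_MACHINE_OPS)


if __name__ == "__main__":
    for n in (10, 1_000, 10_000, 100_000):
        print(n, slow_machine_wins(n))
```

    With these made-up numbers the fast machine wins on tiny inputs, but the slow machine pulls ahead once the problem is large enough, and the gap keeps widening after that.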

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  123. Surely you know the ditty ... by chris_sawtell · · Score: 1

    Time to spare?
    Go by Air.

  124. at our plant, it was little under desk heaters by DeprecatedFeature · · Score: 1

    that all the women had. inductive loads are a real bitch.

    --
    maybe one day i'll be smart enough to come up with a cool sig, too.
    1. Re:at our plant, it was little under desk heaters by jrockway · · Score: 1

      A heater is a resistive load.

      The blower is inductive, but the heating element is what's using the power.

      --
      My other car is first.
    2. Re:at our plant, it was little under desk heaters by Walt+Dismal · · Score: 1

      I had a similar problem. I was in a corner cubicle bordered by two other people. My monitor constantly had fluctuating image size and I figured there was something on the line pulling the voltage down. A rather obstinate co-employee had a massive heater plugged into her cubicle. She refused to disconnect the load. I came in a few days later in the afternoon to find that a fire had occurred that morning, in the cubicle walls. She had pulled enough current to heat up the wires in the plastic channels for power cables, and the plastic actually caught fire. They had smelled smoke but nothing was visible for a while. Cubicle wiring does not have circuit breakers; it depends instead on the building. The building was wired for a big load of current.

  125. not really a technical problem by snow_man · · Score: 1

    in my way of thinking this was a management failure. technical, financial, and operational management share this joyful problem.

    my understanding is that the software wasn't designed for the volume of txn pushed down its throat. slowly this information "trickled up" the food chain and right back down. rinse, repeat.

    the folks that supported the system would say it's overloaded. the technically-responsible management would tell the next layer up that they needed money to fix the problem. the financially-responsible management layer would tell the operationally-responsible layer that someone wanted more money than the fiscal budgeting planned for and it would be painful to fix. the operational-management would squeal like stuck pigs and never, ever, tell shareholders a story like that one. the financially-responsible people would convey this unhappiness downwards. technically-responsible management would "find a way to make it work" and divert the unhappiness to the support people. the support people would piss/moan/bitch and make-it-so as best as possible.

    repeat this cycle 'til flames shoot out of the mission critical system(s).

    then "suddenly" the problem is handled properly but in a somewhat hurried manner.

    --
    i am snow. fear me.
  126. Re:Southwest refuses to drink the Kool-aid by RollingThunder · · Score: 1

    The trick is to have a crewcut. That way you can't get a good enough hold to pull the hair out. ;)

  127. The Scourge Strikes! by Anonymous Coward · · Score: 0

    Jim and Brent, you knew it was coming ... I can see Pat smirking now ...

  128. Re:Southwest refuses to drink the Kool-aid by /dev/trash · · Score: 1

    What if I wanna fly to Atlanta?

  129. Re:bullshit. by Anonymous Coward · · Score: 0
    The system in question was a custom app running on AIX. Not that this makes any difference to certified kooks

    Log in, bitch, and we might trust you.

  130. proactive and reactive. by twitter · · Score: 1
    At least he's being proactive about it studying and analyzing it to try to figure out ways to prevent it in the future,

    The proactive people said not to use M$ crap. Studying the mess afterwards is the very definition of reactive. The lesson was obvious before it happened, and the rest of us are entitled to a happy, "I told you so".

    we have blanket assumptions... in your flamebait post.

    Microsoft sucks, yeah, yeah, yeah. All your apologies are worthless.

    --

    Friends don't help friends install M$ junk.

    1. Re:proactive and reactive. by Anonymous Coward · · Score: 0

      Try again, bitch. Or, you can stop. You're clearly out of your league anyway.

    2. Re:proactive and reactive. by Anonymous Coward · · Score: 0

      and the rest of us are entitled to a happy, "I told you so".

      Just like I'm entitled to a happy "Get back to sucking my dick you little bitch Twitter... you aren't doing your job well enough".

      Is that proactive for you?

      - Stallman

  131. bring it on. by twitter · · Score: 1
    It's not the OS, it's the people behind who's to blame. Yes, stupidity and MSW often go together but in a few years one will probably occasionally see a massive linux outage due to... similarly stupid people.

    It's interesting how you dismiss an OS with a track record of failure in order to blame any other programmer. This assumes Microsoft has better programmers than anyone else, an assumption Microsoft marketing loves you for but is unsupported by any objective review of performance. The same "stupid people" have been and still are writing applications for Linux, Unix and Mac right now but they have better tools and make fewer mistakes there than they have with M$'s crappy SDK's and pathetic OS.

    Two examples of software that just works are Apache and Sendmail. People write all sorts of applications for both of them without this kind of meltdown and both dominate their "markets". Microsoft's efforts at both, IIS and Exchange, have been a total disaster.

    Wanna bet what crappy OS is behind this? Blaming your developers is sorry stuff.

    --

    Friends don't help friends install M$ junk.

    1. Re:bring it on. by Anonymous Coward · · Score: 0
      Moderators: Please note that "twitter" is a known fanatical sycophant whose obnoxious offtopic rants are legend here on Slashdot. It doesn't matter what the topic is, he'll find a way to scrape in some pointless Microsoft bashing. While nobody expects us to love Microsoft in any way, his particularly tepid style of calling anyone he replies to "troll" or "liar" or "fanboy" because he happens to disagree with whatever they're saying is well documented and should not be rewarded. If anything, twitter is the type of person that should not be part of the open source/free software community. He is an anathema to all that is good about free software.

      I'm posting this so that you (the moderator) have some context to consider twitter and not mod him up whenever he posts his filler preformatted rants about installing Knoppix or Mepis or whatever that unfortunately get him karma every single time and allow him to continue posting his trademark toxic crap (read on) day in and day out. You may consider this a troll - I consider it community service. And I ain't kidding.

      If you're a /. subscriber, I invite you to look through some of his posting history. I guarantee that you'll be hard pressed to find someone that is more "out there" than twitter. You'll also probably notice he's got quite an AC following. Don't just read his posts, make sure you go through the replies.

      To get an idea of what I'm talking about, check this post out. This is an article about email disclaimers. The parent of the post is complaining about the ads in the linked page and so on, and twitter actually goes off on a rant to blame it on Microsoft and recommend Lynx, because "is teh free".

      Here's another. In this post twitter not only calls the OP a troll but attempts to "tell it like it is" while making some vague argument about "GNU". Yes, if you're confused, you're not alone. The reply (modded +4) proceeds to simply destroy his bogus argument. You will notice he did not reply. This is what some people call "drive-by advocacy". A sort of I'll just leave you with my thoughts here and move on to the next flamebait kind of deal. In fact, he almost never replies because he knows that his fanatical arguments simply do not hold up to any sort of discussion. It's not that he's chosen the wrong cause - he's just going at it in a completely wrong way.

      Here's that drive-by advocacy and FUD in motion: twitter goes on about some topic and then drops the usual "oh and M$ is teh evil" because "WMP phones home" or some such. Called on his FUD, he then claims that WMP stores every song and movie you've ever played in a file, somewhere. Pressed further, he just sort of slithers out of sight, his FUD-spreading complete. This is not about some Microsoft technology that nobody likes anyway; it's about lying for the sake of lying. Way too many of his posts are exactly like this one.

      More? Just read through this post and the subsequent replies. I guess this stands on its own. Or these two. Or this one. Or this one.

      Still not convinced? This is what twitter considers "humour" while going about his daily "M$" routine.

      M

    2. Re:bring it on. by Anonymous Coward · · Score: 0

      The same "stupid people" have been and still are writing applications for Linux, Unix and Mac right now but they have better tools and make fewer mistakes there than they have with M$'s crappy SDK's and pathetic OS.

      Hey dumbass, those "stupid people" they were referring to might not have necessarily been the people who do the coding, but maybe the people who implement the software, or *GASP*, maybe use the software?!

      Two examples of software that just works are Apache and Sendmail.

      Comparing those to IIS/Exchange as software "that just works" is lame. IIS/Exchange is just as easy to implement, and while IIS might not be anywhere near as common as Apache on websites, Sendmail doesn't compare to Exchange when it comes to features.

    3. Re:bring it on. by ext42fs · · Score: 1

      I don't dismiss the OS: M$ is crap and smart people just don't use it. The guy who picks a crappy OS to start with is a deep cause. Incompetent programmers, operators, whatever are a deep cause as well: they don't have enough clue and will fsck up some day even when they use linux. Are we then going to blame linux?

      Another example:
      If you can't view a web page because it is written for IE, blaming M$ is one thing but it is more effective to educate the world about the stupid webdesign company which did that lousy job. By now, only the really stupid people are not aware of M$ crappiness.

  132. Some clarification by Anonymous Coward · · Score: 1, Informative

    Well... to try and provide a little clarification here, as I work for Comair. Here's the skinny:

    Crew and aircraft scheduling is done through a software package called SBS Track. This very same software package is used by many other airlines, including the two I worked for before coming to Comair. I don't know if their systems have the same hard-coded limit that ours does or not. This software package has _nothing_ to do with reservations, or anything concerning passengers whatsoever. It is simply the software we use to schedule our aircraft and crews to fly the list of flights that Delta wants us to fly.

    Crew scheduling is done by creating "pairings". A pairing is a sequence of flights that comprise a crewmember's trip. Anytime a change is made, a new pairing is generated, with the new sequence of flights. The system has a hard-coded limit of 32k pairings ("transactions" is what the IT folks call it) in a calendar month. As of 10:00 pm on 12/24, that limit was reached. Crew Scheduling was unable to create any new pairings, unable to track who would be flying what airplane to where, and basically unable to keep the airline flying at that point.

    It was not any kind of a hardware failure, there are backups for that. It is simply a software limitation, that when it was coded many years ago, nobody realistically thought it would ever be reached. Why they hardcoded a limit into it in the first place is beyond my knowledge. :)
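
    One plausible reading of a "32k" ceiling is a counter stored in a signed 16-bit integer, which tops out at 32,767. That is an assumption, not something confirmed here. The sketch below models a per-month counter with such a cap; the class and function names are hypothetical and have nothing to do with the actual scheduling package.

```python
import ctypes

MAX_PAIRINGS = 2 ** 15 - 1  # 32767: largest value a signed 16-bit integer holds


class PairingLimitReached(Exception):
    """Raised when the monthly counter cannot be advanced any further."""


class MonthlyPairingLog:
    """Hypothetical per-month change counter capped at a 16-bit maximum."""

    def __init__(self) -> None:
        self.count = 0

    def record_change(self) -> int:
        """Register one new pairing; refuse once the cap is hit."""
        if self.count >= MAX_PAIRINGS:
            raise PairingLimitReached("no more pairings this month")
        self.count += 1
        return self.count

    def start_new_month(self) -> None:
        """The counter resets when the calendar month rolls over."""
        self.count = 0


def wrapped_increment(value: int) -> int:
    """What an unchecked signed 16-bit increment would produce instead."""
    return ctypes.c_int16(value + 1).value
```

    In this model a month of heavy rescheduling burns through the counter much faster than a normal month, every further change is refused once the cap is hit, and normal service resumes only when the month rolls over. `wrapped_increment` shows the alternative failure mode: without the check, the value would wrap around to a negative number.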

    A major part of the problem is Comair's concentration in Cincinnati. CVG is our only crew base, and it is the largest single crew base of any airline in the world. Over 1800 pilots and 1100 flight attendants in one base. Not even any of the majors have a single base that large. Several of our software packages are woefully inadequate, and replacements have been sought for some time.

    As for getting things up and running on paper, this is a monumental task. Scheduling for 160+ aircraft and 2900+ crewmembers, and compliance with all FAA regulations, maintenance requirements, crew rest requirements, and contractual requirements is incredibly complex. In addition, we have crews and aircraft stranded across the country due to the weather that moved through that caused this whole mess in the first place. Add to that the very limited number of people who actually have the knowledge of all the requirements for scheduling, and coming up with a full schedule for the next day would be nearly impossible.

    Jan. 1 starts a new month, and the system will return to full functionality then. Until that date, however, our operations will be very limited.

    1. Re:Some clarification by DenvilleSteve · · Score: 1

      Really good post! How could Comair IT Management not have been aware of the limitation in this critical application?

    2. Re:Some clarification by Anonymous Coward · · Score: 0

      They were notified of this. It was in the Y2K report that was given to them in August 1999 by a risk analysis contractor which they paid for.

      An investigation on Delta stockholders' behalf is warranted and called for.

    3. Re:Some clarification by DenvilleSteve · · Score: 1

      or lawsuits by stranded passengers!

  133. No, by twitter · · Score: 1
    Try again, bitch. That's an IBM "solution" Comair was running. IBM, yes. Does that hurt? Feeling mighty retarded yet?

    No, I don't feel hurt by some highly moderated anonymous "inside information" that clearly contradicts

    --

    Friends don't help friends install M$ junk.

    1. Re:No, by Anonymous Coward · · Score: 0
      Clearly contradicts what, you wack job? The OS is AIX. Did you RTFA or did you go off on another one of your neurotic "M$" cry-me-a-river blab fests because you had nothing better to do?

      Go watch your lawn grow or something. You're just trolling Slashdot, as always. Let me guess - you're going to re-post your flamebait posts again because you consider the moderation to be unjust? What a sad joke.

    2. Re:No, by Anonymous Coward · · Score: 0

      Twitter, please do us all a favor and stick the cock back in your mouth.

  134. Re:A better snow job. They need it. by Anonymous Coward · · Score: 0

    Did you miss the part where the OS was not MS Windows?

  135. Oh jesus christ... by Anonymous Coward · · Score: 1, Informative

    Some of you have no clue.

    The BS&T quotient on your average travel application is on the relatively nuts scale. Expedia, Travelocity, hotwire, priceline, whatever -- I'd ask that some of you with simple solutions go and speak to the lead travel-server dev for the product.

    You'll probably have to change pants after the conversation. Travel is stable, reliable, and generally rock-solid. The algos for selecting airline flight prices or hotel room block-reservations are known and well-tested. The methods and protocols of communication are well-documented and generally straightforward.

    Until recently, it was all on hardware (and I'm speaking generally about the large travel providers -- Worldspan and Sabre come to mind) that was considered arcane. Ancient versions of Netware on an X.25 pad; screen-scrapers on top of it. Have Fun trying to modernize!

    This does not surprise me in the slightest. We are stressing our ancient systems more than ever these days, and it should not be a surprise when the occasional ancient application (ctime, folks) gets floored and dies a bloody death.

    It'll be patched in a month.

  136. Re:Huge earthquake by cammoblammo · · Score: 1

    Okay, nobody's going to read this, because it was yesterday's story and it's been modded into oblivion.

    But can somebody please explain to me how this is off-topic? Somebody posted something that wasn't, admittedly, perfectly on topic (but worth a `Flamebait' more than `Offtopic') and I replied to that exact post.

    I could cope with being modded Flamebait. Having a read, it probably does qualify. I'd even be happy with a Troll. But I made a perfectly relevant, and thus ontopic, reply to a post.

    I could go on, but there's no real point. Metamods, a bit of justice please?

    --

    Cogito, ergo sig.

  137. Re:Southwest refuses to drink the Kool-aid by PPGMD · · Score: 1

    Learned that long ago from a military man, I understood why later, he did a tour at the Pentagon.

  138. Re:Southwest refuses to drink the Kool-aid by Lord+Flipper · · Score: 1

    I have to wonder about the Healthcare area being worse.

    I worked, off and on, at a major VA Hospital. The VA, of course, like all parts of the Military system not related to procurement, is seriously underfunded. But the only localised system in the entire facility I worked in was the CT scanners. They needed super hi-def imaging, so they used SGIs, and something that 'looked like' an X-window-type OS. I never ran those boxes, so I don't have better, more accurate info on that, sorry. (It might have been some version of Solaris with a KDE or Gnome lookalike desktop, don't know.) They were all leased from GE, anyway, and had really intelligent people running them. Every other department was tied in to the same overall 'system', as far as boxes and software, but separated from other departments (for obvious reasons, "Admissions" didn't need to be looking at/accessing Food Services, etc).

    As for the rest of the hospital, all of it was on a unified system. With the exception of paper-based Medical Records. You cannot imagine the enormity of the paper-based issue. It defies simple 'scanning' and conversion to electronic docs, due to the wide range of forms, crazy handwriting, etc. All that aside, maybe the 'normal' hospital systems are different. Or your post was referring to the Healthcare industry at the governmental level. Not sure.

    But (this is the 'plug'): If you, or anyone, care(s) about the kids and fellows that have already 'done their duty' (whether you're pro- or anti- political war is irrelevant, IMHO) be aware that the Administration (White House, power structure, whatever) is contemplating further severe cutbacks in health care, hazard duty pay, death benefits for families, etc, and write to your representatives, expressing your opinions.

    As for the attitude, expressed by some, that the airlines (or any company, for that matter) should go bankrupt, to be 'taught a lesson' for management stupidity or bean-counting decisions: it may seem reasonable, but the only people hurt (and they are numerous) are the ones with the least responsibility for the failures, as a whole. Think of it this way: in a country that has a tax and Congress (whose primary purpose is to divvy up tax dollars between competing corporations, aka the real 'Special Interests') working as a system designed to facilitate Corporate Welfare for the rich and powerful, the last thing we need is more 'little people' needing money for food, shelter, their families, etc.

  139. IBM Capacity on demand. by Anonymous Coward · · Score: 0

    IBM sells Capacity on Demand. My i5 Midrange has more CPUs than I purchased. If I have a need for additional power, I can turn on several additional CPUs, and just pay for the time I spent using them. So you can scale up for holidays, without paying for this scaling year round.

    www.ibm.com

  140. Maybe example of too much Horizontal vs Vertical by petepdx · · Score: 1

    Vendor A, B, C, ...; package Z, Y, X, ...;
    integrator 1, 2, 3 ...; contractor ....

    Maybe I'm too old, but it sure was nice when
    at least the applications were written in house,
    yea we would bitch when the writer of the code
    was long gone, but still, we could fix it.

    Oh well, I just wish I could get used to saying
    "it's the vendor's fault" whenever something goes
    wrong.

    Comair made the decision not to replace/upgrade
    before January. End of Year, make the profits look better? The roll of the dice?

    -pete

  141. always preview by DeprecatedFeature · · Score: 1

    which i didn't do. i clipped out the bit referring to the floor buffers which also nailed us. sorry, thanks.

    --
    maybe one day i'll be smart enough to come up with a cool sig, too.
  142. Re:AIX? by Anonymous Coward · · Score: 0

    Your MS-bashing zeal has blinded you, twitter. Did you consider a company as large as Comair might have people working on a number of different projects on heterogeneous platforms?

  143. War stories, get your war stories heah! by hey! · · Score: 1

    Back in the early 80's I used to work making applications for and servicing very early microcomputers from a company called IMS. They basically consisted of a dishwasher-sized box with an S100 bus into which a number of single board z80 computers plugged, sharing a single hard drive (that's "Winchester" to you sonny) controlled by a master computer, and communicating using a hacked up CP/M compatible system called TurboDos. The result was one of the first multiuser computer systems suitable and affordable enough for routine office use; you could equip a half dozen people for a paltry twenty or thirty thousand dollars.

    Anyway, the manufacturer discovered that the office environment was a bit more, uh, challenging than they anticipated. Static was turning out to be an issue. So one day they issued a tech bulletin saying that secretaries should stop wearing pantyhose. In the formal (in those days) frozen Northeast, women did not wear slacks to work, any more than men went to work without a tie. So the resourceful ladies began to bring spray bottles of fabric softener to work; they'd run into the ladies room to "freshen up" every hour or two.

    Of course, aside from the inexperience of the manufacturers, we still had the usual load of PEBKAC malarkey. We had one customer who was complaining about hardware crashes. Now you can imagine these weren't altogether uncommon given the primitive nature of these boxes and the operating system they ran, but he seemed to be really having a bad time of it. We sent a tech, who on entering the customer's newly built and rather swanky computer room, noticed a dimmer switch on the wall next to the door.

    "What's this?" he asked.

    "Oh, that runs the computer."

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  144. Re:AIX? by Anonymous Coward · · Score: 0
  145. Re:AIX? by Zeinfeld · · Score: 2, Funny
    While AIX, "Ain't unIX", might be described as Unix and the advert looks like HR drool, I'd still wager that some thing M$ failed something Sybase and that the AIX rumor is someone blowing smoke up your ass. Comparing reputations, AIX vrs. M$, the choice is clear.

    So let's think this one through for a second. The people who work there say the system that failed runs on AIX and that it's the application that's gone whoopsie. So they obviously must be lying since everyone knows that the minute an application is ported to AIX all the bugs fall out of it.

    Of course with this type of thinking there is no way that reputations are ever going to change since every computer error is attributed to Windows even if it has nothing to do with the issue.

    I suspect that the HR advert is for a completely unrelated job.

    I would also hazard a guess that the real problem at the place now is not the system anymore. The system is probably back up but they are now having to deal with planes that are in the wrong places and crews that have no flying hours left because of decisions that were taken manually while the system was down.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  146. Re:AIX? by Fallen_Knight · · Score: 1

    It's a 24+ year old application; it should have been replaced years ago with a more modern system.

    Not much an OS can do about a variable overflow in an application program, Windows or AIX.
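
To make the "variable overflow" point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, a signed 16-bit counter; the thread does not say which language or field width the real application used, and the names `bump_int16` and `bump_checked` are hypothetical. The wraparound happens entirely inside the application's own arithmetic, which is why the operating system underneath cannot help.

```python
import ctypes

# Largest value a signed 16-bit integer can hold (assumed width, for illustration).
INT16_MAX = 2**15 - 1


def bump_int16(counter: int) -> int:
    """Add 1 the way a signed 16-bit field would, wrapping silently on overflow."""
    # ctypes integer types do no overflow checking; the value is truncated to 16 bits.
    return ctypes.c_int16(counter + 1).value


def bump_checked(counter: int) -> int:
    """Add 1, but raise instead of wrapping when the 16-bit range is exhausted."""
    if counter >= INT16_MAX:
        raise OverflowError("counter would exceed the signed 16-bit range")
    return counter + 1


if __name__ == "__main__":
    print(bump_int16(INT16_MAX - 1))  # still fits in 16 bits
    print(bump_int16(INT16_MAX))      # wraps around to a negative number
```

The checked variant is one possible design choice: failing loudly at the limit is usually easier to diagnose than a counter that quietly goes negative.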

  147. Best (flame bait) quote for this situation. by notany · · Score: 1

    High on the list of things Lisp offers that most other languages botch is the idea that (+ x 1) for any integer x should return a number bigger than x in all cases. It seems like such a small point, but it's often quite useful. -- Kent M. Pitman

    --
    Dyslexics have more fnu.
  148. Re:Southwest refuses to drink the Kool-aid by MadHungarian1917 · · Score: 1

    Because SWA knows that IT is a critical part of their business, not an "expense" to be minimized. Anybody with the proper qualifications can build an airline. To run one efficiently, one needs to be in control of all the variables. To do this, one needs efficient and effective IT.

  149. Re:AIX? by Anonymous Coward · · Score: 0

    Twat!!

  150. As was reported first on slashdot by Anonymous Coward · · Score: 0

    it was a 2 bit system with a 16 bit problem and tight beancounters.

  151. Looks like I was wrong. by twitter · · Score: 1
    So let's think this one through for a second. The people who work there say the system that failed runs on AIX and that it's the application that's gone whoopsie. So they obviously must be lying ...

    At the time, you did not know that people who worked there said that. All you had to go on was a post by another Slashdotter claiming an anonymous person told them that. It was pure hearsay, but it seems to have turned out correct as Comair has come out and said the same thing.

    with this type of thinking there is no way that reputations are ever going to change since every computer error is attributed to Windows even if it has nothing to do with the issue.

    Actually, the reputation will not change because Microsoft will not change. This one whoopsie just happened to not be Windows; that does not make the platform any more stable. Microsoft has been warned about the dangers of their system designs but has chosen to blunder forth.

    Given a choice, which one would you rather be responsible for? Which one would you use for a mission critical application? Unbelievably, many airlines use Windows as a terminal for ticketing and other very important functions.

    --

    Friends don't help friends install M$ junk.

    1. Re:Looks like I was wrong. by Anonymous Coward · · Score: 0
      Actually, the reputation will not change because Microsoft will not change.

      Your post was offtopic, inflammatory and stupid, like most everything you spew. Like a good little sheep zealot you jumped the gun and started making dubious arguments as to why this was yet again another proof that "MS is teh evilz". The same tired, bogus arguments you always use. And now you accept you were utterly, stupidly wrong and still take a potshot at Microsoft. Silly little vengeful child.

      If anyone had any doubts before I think this pretty much proves you are a certified borderline retard that is more interested in finding new and exciting ways to hate Microsoft than to do anything useful with your limited communication skills. I suggest you go talk to a shrink about this dead-end job of yours that makes you hate a company (a company!) so much. You must have an ulcer the size of Wisconsin by now. Good luck.