Slashdot Mirror


Big Red Button Disasters?

FredDC asks: "The Daily WTF has a story about a Big Red Button disaster. What Big Red Button disasters have you experienced? Which ones have you caused? Are there any that you've heard about, or do you know of any that can happen any day now?"

6 of 508 comments

  1. A QA Intern Story... by __aaclcg7560 · · Score: 5, Interesting

    I was a QA intern at Fujitsu working on the WorldsAway chat world when I discovered a rare crash bug with a new artist tool that I could reproduce successfully but my boss couldn't. Since the tool was supposed to be used on the test server only, my boss approved release of the update to the production server. Everything was fine for a day before the production server started crashing. Turns out that the artists were creating new content on the production server instead of the test server and using the new tool that caused the crashes. The production server was shut down for three days, a complete code rewrite was required, and Fujitsu lost $250,000 USD in revenue. My boss kept his job as he led the programming team to rewrite the code. I, on the other hand, was given two weeks' notice that my six-month contract wasn't going to be renewed. Two weeks after I left the company, one-third of the division was laid off to pay for the lost revenue.

  2. Small Red Button by LiquidCoooled · · Score: 5, Interesting

    All new keyboards have a single key Shutdown/sleep thing.

    Arghhhhhhhhhhhhhhhhhhh @ little fingers.
    I either rip the bastard thing right off the board or dig out the regkey thingy to disable it.

    --
    liqbase :: faster than paper
  3. First Job Ever by daeg · · Score: 5, Interesting
    I was told to fix an invalid credit card number in the database. I didn't design it, I just worked there, so don't knock me for storing credit card numbers. Although what I did "fixed" that security problem...

    update customer_cc set card_number = '1234567890123456';

    Whoops. Backups were corrupt, too (not my task). Needless to say, it suddenly became a "security feature" that we stopped storing credit card numbers.
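
    The statement above has no WHERE clause, so it overwrote every row in the table. As a rough sketch of how that kind of one-off fix can be guarded, the Python below wraps the update in a transaction and rolls back unless exactly one row was touched. Only the customer_cc table and card_number column come from the post; the customer_id column, the fix_card_number helper, the in-memory SQLite database, and the placeholder values are all assumptions made for illustration.

```python
import sqlite3


def fix_card_number(conn, customer_id, new_number):
    """Update one customer's card number; roll back unless exactly one row changed.

    customer_id is a hypothetical key column, not something named in the post.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "UPDATE customer_cc SET card_number = ? WHERE customer_id = ?",
            (new_number, customer_id),
        )
        # Refuse to commit if the statement touched zero rows or many rows.
        if cur.rowcount != 1:
            raise RuntimeError(f"expected 1 row, touched {cur.rowcount}")
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def demo():
    """Build a throwaway table with placeholder values and fix a single row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_cc (customer_id INTEGER PRIMARY KEY, card_number TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer_cc VALUES (?, ?)",
        [(1, "1111"), (2, "2222"), (3, "bad")],
    )
    conn.commit()
    fix_card_number(conn, 3, "3333")
    return conn.execute(
        "SELECT customer_id, card_number FROM customer_cc ORDER BY customer_id"
    ).fetchall()
```

    Calling demo() changes only the row for customer 3. The row-count check is a cheap guard for hand-run fixes, not a substitute for working backups.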
  4. Taking out an entire City... by waa · · Score: 5, Interesting

    While not an official "Big Red Button" story I think it is worth telling.

    In 1999 while I was working as a private consultant for the capital city of a small New England state, a colleague of mine was attempting to make a change to the city's core switches. Per usual with this guy, he over-sold his skill set and was way out of his league - while never willing to admit it.

    Meanwhile, I was working in the server room on the squid web caching server while he was attempting the change...

    I kept hearing him say things like "I wonder what this command does", and "I wonder what the reset command means. Should I enter it?"

    Suddenly I was no longer ssh'ed into the proxy server... I looked up and asked "What the hell did you do?"

    His answer: "I entered the reset command"
    Me: "Well, fix it. Restore the configuration. It looks like you just reset EVERYTHING..."

    Well, needless to say, there was NO saved configuration to restore, and no documentation for the city's network nor the equipment installed, and on this equipment the reset command was the command to reset it to its default settings. (BTW, he entered the reset command on the core switch) There were several local switches (connected via copper), and many fiber connections to all the remote departments across the city - several fire departments, the main police department, city hall, you name it... All off-line.

    In the end, the city's network was DOWN for 3-4 full days while he contacted qualified people to attempt to rebuild the network...

    We would have been better off if he had hit the big red button near the sliding glass door at the server room's exit.

    sigh...

    P.S. I am pretty sure he blamed it all on me.

    --
    Windows is not the answer.
    Windows is the question.
    The answer is "NO."
  5. THNTD by Kadin2048 · · Score: 5, Interesting

    Working at a computer center, I think the best design I've seen was one where the "Big Red Button" was actually 2 buttons, spaced far enough apart that you couldn't hit them both at once with one hand, but close enough together that they were obviously related. They were also much higher off the raised floor than any other switches, and clearly marked.

    Just as trivia, that type of circuit is common on industrial equipment (think of the big press from the end scene in Terminator 1) and is called a Two-Hand No-Tie-Down. Basically there are two switches, and they have to both be depressed within a certain interval in order to close the circuit (generally 0.5s or so). If you "tie down" one of the switches, or have something leaning against it, or whatever, pressing the second switch won't trigger (otherwise it would be just a simple AND gate).
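
    The logic described above can be sketched in software. This is only an illustration of the timing rule, not how a real safety relay is built: the TwoHandNoTieDown class, its method names, and the "left"/"right" labels are invented for the example, and the 0.5s window is the rough figure mentioned above. Timestamps are passed in explicitly, in seconds.

```python
class TwoHandNoTieDown:
    """Toy model of a two-hand no-tie-down control (hypothetical names)."""

    def __init__(self, window=0.5):
        self.window = window  # max seconds allowed between the two presses
        self.pressed_at = {"left": None, "right": None}  # press time, or None if released
        self.armed = True  # False until both buttons have been released again

    def press(self, button, t):
        """Record a press at time t; return True only if the circuit should close."""
        self.pressed_at[button] = t
        other = "right" if button == "left" else "left"
        other_t = self.pressed_at[other]
        if other_t is None:
            return False  # only one button down so far
        if self.armed and (t - other_t) <= self.window:
            self.armed = False  # one trigger per pair of presses
            return True
        # The other button was held too long (tied down) or we already fired:
        # stay disarmed until both buttons are released.
        self.armed = False
        return False

    def release(self, button):
        """Record a release; re-arm only once both buttons are up."""
        self.pressed_at[button] = None
        if all(v is None for v in self.pressed_at.values()):
            self.armed = True
```

    Pressing "left" at t=0.0 and "right" at t=0.3 triggers; holding "left" down and pressing "right" five seconds later does not, and neither does releasing and re-pressing "right" alone. The armed flag is what separates this from a plain AND of the two switches.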

    The circuits to do it are pretty standard and easily available. What's cooler is that you can actually get a basically-identical circuit that uses compressed air or other gas instead of electricity (for use in chemical plants and other explosive atmospheres). One of the cooler things I've gotten to see made was a pneumatic "circuit board" cut out of Lucite for this purpose. I've always thought they would make a nice demonstration device for teaching kids about electronic circuits.

    --
    "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  6. Nuclear Power Plant Emergency Shutdown by slashdotmsiriv · · Score: 5, Interesting

    Arguably the biggest shutdown-button screw-up in history ...

    From http://en.wikipedia.org/wiki/Chernobyl_disaster :

    "At 1:23:04 the experiment began. The unstable state of the reactor was not reflected in any way on the control panel, and it did not appear that anyone in the reactor crew was fully aware of any danger. Steam to the turbines was shut off and, as the momentum of the turbine generator drove the water pumps, the water flow rate decreased, decreasing the absorption of neutrons by the coolant. The turbine was disconnected from the reactor, increasing the level of steam in the reactor core. As the coolant heated, pockets of steam formed voids in the coolant lines. Due to the RBMK reactor-type's large positive void coefficient, the steam bubbles increased the power of the reactor rapidly, and the reactor operation became progressively less stable and more dangerous. As the reaction continued, the excess xenon-135 was burnt up, increasing the number of neutrons available for fission. The prior removal of manual and automatic control rods had no substitute, leading to a runaway reaction.

    At 1:23:40 the operators pressed the AZ-5 ("Rapid Emergency Defense 5") button that ordered a "SCRAM" - a shutdown of the reactor, fully inserting all control rods, including the manual control rods that had been incautiously withdrawn earlier. It is unclear whether it was done as an emergency measure, or simply as a routine method of shutting down the reactor upon the completion of an experiment (the reactor was scheduled to be shut down for routine maintenance). It is usually suggested that the SCRAM was ordered as a response to the unexpected rapid power increase. On the other hand, Anatoly Dyatlov, chief engineer at the nuclear station at the time of the accident, writes in his book:

    "Prior to 01:23:40, systems of centralized control ... didn't register any parameter changes that could justify the SCRAM. Commission ... gathered and analyzed large amount of materials and, as stated in its report, failed to determine the reason why the SCRAM was ordered. There was no need to look for the reason. The reactor was simply being shut down upon the completion of the experiment."

    The slow speed of the control rod insertion mechanism (18-20 seconds to complete), and the flawed rod design which initially reduces the amount of coolant present, meant that the SCRAM actually increased the reaction rate. At this point an energy spike occurred and some of the fuel rods began to fracture, placing fragments of the fuel rods in line with the control rod columns. The rods became stuck after being inserted only one-third of the way, and were therefore unable to stop the reaction. At this point nothing could be done to stop the disaster. By 1:23:47 the reactor jumped to around 30 GW, ten times the normal operational output. The fuel rods began to melt and the steam pressure rapidly increased, causing a large steam explosion. Generated steam traveled vertically along the rod channels in the reactor, displacing and destroying the reactor lid, rupturing the coolant tubes and then blowing a hole in the roof.[7] After part of the roof blew off, the inrush of oxygen, combined with the extremely high temperature of the reactor fuel and graphite moderator, sparked a graphite fire. This fire greatly contributed to the spread of radioactive material and the contamination of outlying areas ... "