Slashdot Mirror


The Tech Failings of Hawaii's Missile Alert

Over the weekend, Hawaii incorrectly warned citizens of a missile attack via their phones. According to The Washington Post, the error was a result of a staffer picking the wrong option -- missile alert instead of test missile alert -- from a drop-down software menu. Hawaiian officials say they have already changed protocols to avoid a repeat of the scenario. The report goes on to add: Part of what worsened the situation Saturday was that there was no system in place at the state emergency agency for correcting the error, HEMA (Hawaii Emergency Management Agency) spokesman Richard Rapoza said. The state agency had standing permission through FEMA to use civil warning systems to send out the missile alert -- but not to send out a subsequent false alarm alert, he said. Though the Hawaii Emergency Management Agency posted a follow-up tweet at 8:20 a.m. saying there was "NO missile threat," it wouldn't be until 8:45 a.m. that a subsequent cellphone alert was sent telling people to stand down. Motherboard notes that new regulations require telecom companies to offer a testing system for local and state alert originators, but because of lobbying by Verizon and CTIA, this specific regulation does not go into effect until March 2019.

In a piece, The Atlantic argues that the 90-character messages sent by the system aren't suited to the way we use our devices.

13 of 232 comments

  1. Uforgiveable by Ol+Olsoc · · Score: 3, Insightful

    You need a mechanical physical switch with a switch guard. The very fact that an actual alert would be triggered by a menu item indicates a completely incompetent design. I seldom call for people's jobs, but I'll make an exception in this case.

    --
    The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    1. Re:Uforgiveable by pablo_max · · Score: 5, Insightful

      What's worse is that the menu items were right under each other: "Missile alert" and "Missile alert Test". Both items give the same "are you sure" confirmation.
      While it was certainly a boneheaded mistake, it was one that was easily possible for someone in a hurry. As this fellow was just wrapping up his shift, he was clearly trying to get everything done in time.

      I don't get the people calling for this guy to get fired. Like none of those assplugs have ever made a mistake on their job. How many know someone in the office who accidentally did a reply-all, or forwarded some email chain to external Eric rather than the internal Eric?
      Shit happens. Clearly the design of that system isn't the best.

    2. Re:Uforgiveable by Ol+Olsoc · · Score: 4, Insightful

      What's worse is that the menu items were right under each other: "Missile alert" and "Missile alert Test". Both items give the same "are you sure" confirmation. While it was certainly a boneheaded mistake, it was one that was easily possible for someone in a hurry. As this fellow was just wrapping up his shift, he was clearly trying to get everything done in time.

      I don't get the people calling for this guy to get fired. Like none of those assplugs have ever made a mistake on their job.

      I was perhaps not clear. I'm calling for the people who designed and implemented a system that was so mistake-prone to be sent on permanent vacation. The guy who sent out the alert was just a person making a mistake on fatally flawed software.

      Their design and implementation indicate either a lack of knowledge of life-critical systems or a callous indifference to them. You have to place interrupt-safe (yeah, an oxymoron) points at key places. Running an alert test? Have a nice alert-test physical switch. Switch guard, different color. Actual alert? Another switch with a guard and a different color. Never a menu item. The colors indicate the difference; the switch guards function as an "Are You Sure?" message. A degree of separation between testing the system and activating the system must be in place. There was essentially no separation in this incompetent implementation.

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    3. Re:Uforgiveable by Gaxx · · Score: 5, Insightful

      Whilst I believe that you are right in identifying a mechanical failsafe as an incorrect approach, I don't think this should fall to the user to pick from two items next to each other on a drop-down. Intelligent, highly skilled operators make mistakes in these sorts of circumstances, and a bit of decent UI design goes a long way in preventing such things (without the need for mechanical safeguards).

      Something as simple as giving obvious visual cues to distinguish test from live messages (icons, colour, font weight, etc.), or separating the items on the drop-down into obvious lists for test and live messages, would help.

      Getting only a _little_ more complicated in UI, a subsequent message confirming a live message (possibly with an action that requires a user to type 'live' or something to ensure that the validation request has been received and understood) would almost certainly eliminate any chance of a live message being sent in place of a test one.

      Decent design does not rely on users doing the right thing any more than it has to.
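A minimal sketch of the typed confirmation described above, in Python. The names `AlertTemplate`, `may_send`, and the phrase `LIVE` are hypothetical and chosen only for illustration; nothing here reflects how the actual Hawaii system works.

```python
from dataclasses import dataclass

# Hypothetical phrase the operator must type before a live alert goes out.
LIVE_PHRASE = "LIVE"


@dataclass(frozen=True)
class AlertTemplate:
    """One entry in the (hypothetical) alert menu."""
    name: str
    is_live: bool


def may_send(template: AlertTemplate, typed: str = "") -> bool:
    """Return True if the alert may be sent.

    Test alerts need no extra input. Live alerts require the operator
    to type the live phrase, so a reflexive "yes" is not enough.
    """
    if not template.is_live:
        return True
    return typed.strip().upper() == LIVE_PHRASE
```

Because the live path asks for something the test path never asks for, clicking through on muscle memory no longer sends a live alert.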

      --
      -- Gaxx
    4. Re:Uforgiveable by iamgnat · · Score: 5, Insightful

      What's worse is that the menu items were right under each other: "Missile alert" and "Missile alert Test". Both items give the same "are you sure" confirmation. While it was certainly a boneheaded mistake, it was one that was easily possible for someone in a hurry. As this fellow was just wrapping up his shift, he was clearly trying to get everything done in time.

      I don't get the people calling for this guy to get fired. Like none of those assplugs have ever made a mistake on their job. How many know someone in the office who accidentally did a reply-all, or forwarded some email chain to external Eric rather than the internal Eric? Shit happens. Clearly the design of that system isn't the best.

      I agree. Shit happens. It just was, unfortunately, some really bad shit in this case. I haven't made such public mistakes, but I've made some big ones. He is just a scapegoat here.

      The real problems I see here are that A) it wasn't blatantly obvious (through using a different workflow and by clear visual (and audio?) indicators) that he was going down the live path rather than Test and B) that having permission to use the EBS doesn't automatically carry the ability to send an "oh shit! we didn't mean to do that" message as well.

      At the point where the workflow path deviates between Test and Real it should be impossible for someone, no matter how rushed/tired/bored, to get it wrong. Glaringly different color schemes. Audio prompts. Full screen dialogs so they can't be paying attention to something else. Extra steps down the Live path. Having a second account confirm the action. Etc...

      Make it so that you have to be either blatantly ignorant or blatantly malicious to get to the point of sending a Live alert when you shouldn't. The time-sensitive nature of the system, however, does present some challenges, since you want to delay getting the alert out as little as possible.
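One of the ideas above, having a second account confirm the action, could be sketched roughly as follows in Python. The class `PendingLiveAlert` and the account names are hypothetical, and a real system would also need authentication and a time limit that this sketch leaves out.

```python
from typing import Optional


class PendingLiveAlert:
    """A live alert that is held until a second account approves it."""

    def __init__(self, message: str, requested_by: str) -> None:
        self.message = message
        self.requested_by = requested_by
        self.approved_by: Optional[str] = None

    def approve(self, account: str) -> None:
        # The requester cannot approve their own alert.
        if account == self.requested_by:
            raise PermissionError(
                "approval must come from a different account than the requester"
            )
        self.approved_by = account

    @property
    def ready_to_send(self) -> bool:
        """True only after a different account has approved the alert."""
        return self.approved_by is not None
```

The trade-off is the one already noted: every extra step on the Live path adds delay, so the approval would have to be quick to give.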

      Now what I think is really being missed here is that this was a blessing in disguise. Yes, it inconvenienced and scared the crap out of a lot of people, but based on all the reports I've seen, no one had a clue what to do with it. Given the short time involved for a missile to get from NK to Hawaii and the devastation a nuclear warhead would do, I question the point of giving warning (I'd rather die blissfully ignorant than in a panic, or linger through injury/radiation poisoning), but if there is going to be a warning, people need to know what to do and react accordingly.

      They are concerned enough to spend money on the warning system, but have they spent the money on enough bunkers to hold the population of the islands? Are they located so that everyone has a reasonable chance of getting to one regardless of traffic/panic of everyone else trying to get there?

    5. Re:Uforgiveable by Megol · · Score: 3, Insightful

      Yes this is clearly a system design problem and something that can be solved in several obvious ways. The easiest would be to have a confirmation stage that makes it very clear if it's a test or if it is a "sharp" alarm that will be triggered.

      Note that how to make it very clear is still a problem, but one that has been studied.

      That the same system that can send an alert can't also send a false alarm alert is also an obvious systems design flaw.
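As a rough illustration of that last point, here is a Python sketch in which anything the console can send, it can also retract over the same channel. `AlertConsole`, `Alert`, and the retraction wording are all hypothetical; in the real incident the limitation was reportedly one of standing permission, which code alone would not fix.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    alert_id: int
    channel: str
    text: str


class AlertConsole:
    """Hypothetical console: every channel it can alert on, it can also retract on."""

    def __init__(self) -> None:
        self.sent: List[Alert] = []

    def send(self, channel: str, text: str) -> Alert:
        alert = Alert(len(self.sent) + 1, channel, text)
        self.sent.append(alert)
        return alert

    def retract(self, alert: Alert) -> Alert:
        """Send a false-alarm notice on the same channel as the original alert."""
        if alert not in self.sent:
            raise ValueError("cannot retract an alert this console did not send")
        notice = f"FALSE ALARM: disregard alert #{alert.alert_id}. There is no threat."
        return self.send(alert.channel, notice)
```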

  2. There was no tech "failing". by Anonymous Coward · · Score: 0, Insightful

    The alert went through as chosen and selected. It worked as designed.

    What failed was the operator not paying attention to their work.

    1. Re:There was no tech "failing". by RobinH · · Score: 4, Insightful

      No, you're wrong. UI design plays a major role in the correct operation of a system. Very few people in my experience are detail-oriented people, and even the ones who are still make predictable mistakes. The system must account for how real people actually behave. To do otherwise is bad system design.

      Looks like this was just a test of connectivity. I don't know why they didn't automate the test (send a test file once every 8 hours, write in the log that it got sent, and write in the log that a confirmation came back, then have another job that looks for those log entries in the appropriate time range and alerts the operators if it didn't work). Yes, you still need to manually test, but not as often.

      In a case like this, there should be a prior action required to "arm" any of the "real" messages, so there are two different processes that you won't mix up. A generic "are you sure" query isn't good enough because it's the same message whether you picked a real message or a test message. Muscle memory kicks in and you just click Yes; after all, that's what you did the last several hundred times.
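The automated connectivity check suggested in this comment could look something like the Python sketch below. The event names `test_sent` and `test_confirmed`, the in-memory log format, and the function name are assumptions made for illustration; only the 8-hour interval comes from the comment.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Interval suggested in the comment; everything else here is hypothetical.
TEST_INTERVAL = timedelta(hours=8)

LogEntry = Tuple[datetime, str]  # (timestamp, event name)


def connectivity_ok(log: List[LogEntry], now: datetime,
                    window: timedelta = TEST_INTERVAL) -> bool:
    """Return True if a test was sent within the window and later confirmed.

    The monitoring job would alert the operators whenever this is False.
    """
    cutoff = now - window
    sent_times = [ts for ts, event in log
                  if event == "test_sent" and cutoff <= ts <= now]
    return any(
        event == "test_confirmed"
        and ts <= now
        and any(sent <= ts for sent in sent_times)
        for ts, event in log
    )
```

A scheduled job would write the two log entries, and a separate job would call `connectivity_ok` and page someone when it returns False, so nobody has to pick a test item from a menu just to prove the link is alive.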

      --
      "I have never let my schooling interfere with my education." - Mark Twain
  3. Follow up Tweet? by ogar572 · · Score: 5, Insightful

    Seriously, contact all the major TV and radio stations in the area first. The expectation that everyone should get critical information from "social" media is a joke.

  4. Takes time by DrYak · · Score: 4, Insightful

    Seriously, contact all the major TV and radio stations in the area first.

    Which should take some time, unlike sending a tweet on an account already owned by the emergency center.

    Also, contacting TV and radio stations might be hampered by people actually attempting to follow the instructions of the previous wrong alert.

    Though most TV and radio crews might wonder how come there's an alert about a missile attack on their *phones* while, at the same time, they have not received the full list of information that they have to broadcast immediately to the population while interrupting the normal programming.

    So, while the HEMA guys are heading for the simplest thing to do to communicate information (blasting it on accounts that they actually own, like Twitter), the TV and radio stations should be the ones trying to contact HEMA to understand why they weren't asked to broadcast any emergency information (it might have been an error, as in this case; or, in the alternative case of an actual live attack, the general population might be missing critical information that the radio should have been broadcasting and that got stuck somewhere in the process).

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  5. Re:Prod vs. Subprod? by quetwo · · Score: 3, Insightful

    This is actually a test OF the prod system. You can have a totally separate system for testing, but you do need to test the production system to make sure that some system hasn't broken, a bird hasn't eaten through a wire, or a service credential hasn't expired. Data centers test transfer switches once a month in production. Across the Midwest, they test tornado sirens once a month.

  6. Hindsight 20/20, Foresight 20/200 by ancientt · · Score: 4, Insightful

    You're right. If I had mod points, I'd give you a bump. Your insight that the blessing here outweighs the cost is one I haven't seen given enough attention. Fresh eyes will be looking at how the process should work to prevent mistakes and that's a good thing. Likely they'll find other areas that need improvement.

    Using a system intended for conveniently notifying the public with information to instead notify the public of an emergency is a dangerous mistake, one of which they're now aware. Finding out that the public doesn't know how to respond is priceless information that they have now. The guy who clicked the wrong menu option may not deserve a medal, but put him on the committee determining how to fix the system and plan responses. Redemption is a strong motivator.

    Now the public knows that they need a response plan for such an emergency. Having public pressure to get prepared is perhaps the greatest thing that could happen. People trying to get the public prepared would have been frustrated before this, but now they'll have the public on their side. That's the kind of thing that makes budgets happen.

    --
    B) Eliminate all the stupid users. This is frowned upon by society.
  7. The error is in process, not execution by sjbe · · Score: 3, Insightful

    While it was certainly a boneheaded mistake, it was one that was easily possible for someone in a hurry. As this fellow was just wrapping up his shift, he was clearly trying to get everything done in time.

    If this was indeed the setup, the mistake was idiotic programming and software design. The end user screwing it up was entirely predictable and probably inevitable. The problem occurred when the system was designed. If a system can fail because of the design, it almost certainly will fail sooner or later.

    Part of my day job is to write work instructions and design procedures. When something goes wrong the first question I have to ask is "what did I do wrong", NOT "who screwed up"? 90+% of the time the problem was unclear/wrong/misleading instructions, a badly designed process, or some other problem where the person tasked with carrying out the instructions was set up to fail. In other words, my fault. We as engineers tend to take too little responsibility for our own failures and blame user error when in fact the error was a badly designed program or procedure. We tend to think we are the smartest people in the room and while that may be true sometimes it doesn't mean we are perfect.