What Hurricane Sandy Taught IT About Disaster Preparedness
StewBeans writes: The National Oceanic and Atmospheric Administration Climate Prediction Center is calling for calmer than normal storm activity this hurricane season, which runs through Nov. 30. But it's likely that data centers and IT companies in NYC are still taking disaster preparedness seriously. Three years ago, Hurricane Sandy devastated homes, businesses, transportation, and communication in New York, and taught many companies (the hard way) how to keep the lights on when the lights were literally off for weeks on end. Alphonzo Albright, former CIO of the Office of Information Technology in New York City, gives a behind-the-scenes account of what life and business were like in the dark, cold days following Hurricane Sandy in NYC. He also shares tips for other tech leaders to create their own Business Continuity Plan in case this year's storms take a turn for the worse.
First rule: have facilities capable of running your business in more than one location. Everywhere is susceptible to disaster of one sort or another, but if you pick areas far apart that aren't geographically similar they probably won't both suffer disasters at the same time.
Second rule: the probability of disaster taking out your main facilities is 100%. It will happen. The only question is exactly when it'll happen, and the only constant in the answer is that it won't be at a good time. If anyone in your organization doesn't like this, remind them that reality doesn't really care what they like.
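The first rule above can be turned into a quick sanity check. This is only a rough sketch: it computes the great-circle distance between two candidate sites and compares it with a minimum separation. The 160 km threshold, the function names, and the approximate coordinates are all assumptions made up for illustration, and distance alone says nothing about whether two sites share the same flood plain, fault line, or power grid.

```python
import math

EARTH_RADIUS_KM = 6371.0
MIN_SEPARATION_KM = 160.0  # hypothetical threshold, not a figure from the thread


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def far_enough(a, b, minimum_km=MIN_SEPARATION_KM):
    """True if the two sites are at least `minimum_km` apart."""
    return distance_km(a, b) >= minimum_km


# Approximate city-centre coordinates, for illustration only.
manhattan = (40.7128, -74.0060)
jersey_city = (40.7178, -74.0431)
chicago = (41.8781, -87.6298)

print(far_enough(manhattan, jersey_city))  # a few km apart: same storm
print(far_enough(manhattan, chicago))      # over a thousand km apart
```

A backup site across the river fails this check; one in another region passes it, though it still needs its own review for local hazards.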
it was a storm, whatever.
Sandy was a much bigger storm when it was hitting Cuba and fucking up the southern end of the Atlantic coast. It was an actual hurricane then, in fact.
But allllllllllll we fucking hear about is how New York was unprepared. New York isn't special and doesn't deserve special attention for being unprepared, but it sure turned into a fucking media event that's still going strong today.
I don't know if the media wanted another Katrina or simply wanted to pander to their favorite place in the world (NYC), but it got old real fucking fast.
And what the fuck is up with "The National Oceanic and Atmospheric Administration Climate Prediction Center is calling for calmer than normal storm activity this hurricane season"? You don't call for that, you predict it. Calls are to be answered (or not). Predictions are to be met (or not).
Nothing. Disaster recovery plans are like backups... if you don't test them every so often, you have to assume that they don't work.
Companies should have already tested their plans and known that they worked so that when any interruption from the storm kicks in their backups would take over as planned.
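To make the backup analogy concrete, here is a minimal sketch of a restore drill: back a file up, restore it somewhere else, and compare checksums. The file name and directory layout are hypothetical, and `shutil.copy2` is only a stand-in for whatever backup and restore tooling is actually in use.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path):
    """Hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def restore_drill(source, backup_dir, restore_dir):
    """Back up `source`, restore it elsewhere, and compare checksums."""
    backup_copy = Path(backup_dir) / Path(source).name
    restored_copy = Path(restore_dir) / Path(source).name
    shutil.copy2(source, backup_copy)         # stand-in for the real backup job
    shutil.copy2(backup_copy, restored_copy)  # stand-in for the real restore
    return sha256(source) == sha256(restored_copy)


def run_demo():
    """Run the drill against a throwaway file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "backup").mkdir()
        (root / "restore").mkdir()
        source = root / "orders.db"  # hypothetical file name
        source.write_bytes(b"critical business data")
        return restore_drill(source, root / "backup", root / "restore")


print("restore verified:", run_demo())
```

The point is the comparison at the end: a backup job that exits cleanly proves nothing until the restored copy has been checked against the original.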
You talk of geographic diversity, but that's only part of the picture. Software diversity is critical, too. Not all disasters are storms. Sometimes we have disasters of software architecture. Some will say that systemd is an example of this. Some of its architectural decisions, such as the use of binary logging and how it has subsumed so much unrelated functionality, prove to be very problematic for many users. That's totally separate from its implementation. Even a perfect implementation, which of course is not possible, would suffer from these architectural flaws.
Yet just when Linux needs software diversity the most, we've seen almost all of the major distros using systemd. Linux users don't even have a choice any longer; unless they want to use an absolutely archaic distro like Slackware, or an impractical distro like Gentoo, they're going to be burdened with systemd. This systemd monoculture poses a huge risk, in my opinion. All it will take is one serious flaw in systemd, and we'll have a situation much worse than the bash/Shellshock disaster. So while using different Linux distributions used to provide some protection, the diversity of the Linux ecosystem is approaching rock bottom, which means the risk is shooting through the roof.
It's getting to the point where even long-time Linux shops have to start introducing FreeBSD, OpenBSD, and even Windows to hedge against the increasing level of risk that can come with using modern Linux and its rapidly developing monoculture. Only through a diversity of software can at least some degree of protection be attained.
Not at all irrelevant, about how much is taught, really, with not a bit of disaster.
Happiness in intelligent people is the rarest thing I know.
Ernest Hemingway
Wasn't there a datacenter guy who posted here on /. when Katrina hit about all the stuff they went through keeping things up and running at some sort of minimal level?
Been drinking and Google-fu is off but perhaps someone can post it. IIRC it included a blog of what was going on, etc.
Don't blame me, I voted for Kodos
Not an option for High Frequency Traders. Geographic diversity means locating your fiber optic connection farther away from the transatlantic fiber head ends which make HFT possible.
Almost certainly nothing.
The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
You can guarantee these IT idiots are going to leave the status quo intact for job security.
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
With so many comments here focusing on servers, I'd just like to point out that those servers are generally useless without power, environmental controls and internet connectivity. If your disaster recovery plan doesn't address all of these issues, plus physical security (some looters just want to watch the corporate world burn), it isn't worth the storage capacity it occupies.
First we must be willing to build very large facilities capable of storing thousands of tents and food enough to last people for many weeks. And we need to do this in regional centers such that deliveries can take place the day after a storm leaves an area. Areas such as Miami simply cannot be evacuated as the population is way too large. A strong storm can knock out roads and rails and put a large area into severe isolation. A few days' worth of groceries will simply not help much. In my area our grocery stores were destroyed when three storms hit us back to back. Getting a car on the road was next to impossible and quite dangerous. Gasoline was not available at all as the gas wells were all flooded. I had no power for three solid weeks. A situation like that can get to the point at which people raid each other trying to keep from starving. I see no effective measures at all. If another Katrina hit New Orleans the results would be very much like the original Katrina. The repairs made are not designed to stand up to a class 5 hurricane. And it is only a matter of time before New Orleans gets hit by a class 5 storm. Miami is in the same boat. We roll the dice here constantly and count on good luck not to bring on a class 5 storm.
The immense sums forwarded after 9/11 to harden the infrastructure in the NY area were wasted on largess and patronage as usual and a storm came along and proved it. What to do was known, they just didn't do it, and still haven't.
It taught us that 1920s electricity infrastructure shouldn't be in use in one of the richest cities on the planet - wet wood in contact with high voltage is a bad idea and the inevitable fires happened.
Funny thing is I know a transmission guy who said "I told you so" based on what he said in the 1960s. Fifty years later that shit was still in service and it burned.
Eventually things pretty much go back to the way they were before. I remember seeing a discussion about the lessons learned from Hurricane Andrew (not just IT specific) and how after 7 years things that were important were forgotten or deemed less important. I'm sure the same happened with Hurricane Katrina, Sandy, and many others. It seems to be our human nature that these things eventually wear off and become less important. I think Neil deGrasse Tyson was on Joe Rogan's podcast a few years ago and touched on the subject as well.
Keep the Classic Slashdot.
The backup generators failed because there was no electricity to power the fuel pumps. Don't site your critical infrastructure in the basement.
Story time: A few years ago I was working on a web app for a US intel/LEO agency in Northern Virginia. The app had started as a demo, then kind of grew. Like a fungus. It was never really designed, much less designed to shut down and restart unexpectedly. There were some other similarly "designed" apps running in the data center.
The data center, being under the flight path for an airport, had a continuity of operations ("coop") plan and hardware. The "UPS" was a big generator with a switch so that it would take over when mains power went down. There was also a system designed to handle hot mirroring of everything and switch all network traffic to the backup center if the main center went down.
A great system which was never tested because what if the test takes the system down for 15 minutes and we thus miss the opportunity to prevent the Next 9/11 and Thousands Die and, worse yet, we have to testify in front of Congress?
So one day the fire marshal came through the building and, as part of his testing, hit the Big Red Switch. The switch designed to detect this and start the generators (and which was reported to cost $15) failed. All the systems went down, hard. The network switch in place to notify the hot backup site and send all the traffic there also failed. And the Vital Systems Protecting Our Nation From the Next 9/11 went down, worldwide.
Don't just have a plan, test it.
p.s. We never were able to determine how much, if any, data was lost....
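For what "test it" could look like in miniature, here is a toy failover drill. It is a simulation under stated assumptions, not a description of the system in the story: the `Site` and `Router` classes and the `failover_drill` function are hypothetical names, and a real drill would cut actual power or network paths rather than flip a flag.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    up: bool = True

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Router:
    """Sends each request to the first site that answers."""

    def __init__(self, sites):
        self.sites = sites

    def route(self, request):
        for site in self.sites:
            try:
                return site.handle(request)
            except ConnectionError:
                continue  # try the next site in line
        raise RuntimeError("all sites are down")


def failover_drill(router, primary):
    """Take the primary down on purpose and report whether traffic still flows."""
    primary.up = False
    try:
        router.route("drill-request")
        return True
    except RuntimeError:
        return False
    finally:
        primary.up = True  # always bring the primary back after the drill


main = Site("main")
backup = Site("backup")
print("with backup site:", failover_drill(Router([main, backup]), main))
print("without backup site:", failover_drill(Router([main]), main))
```

Run on a schedule, a drill like this finds the dead switch on your terms instead of the fire marshal's.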
Don't shut down the servers that have the emergency disaster plans saved on them.
That's what my I.T. department did. It took us 3 meetings to convince them to turn them back on.