ISP Recovers in 72 Hours After Leveling by Tornado
aldheorte writes "Amazing story of how an ISP in Jackson, TN, whose main facility was completely leveled by a tornado, recovered in 72 hours. The story is a great recounting of how they executed their disaster recovery plan, what they found they had left out of that plan, data recovery from destroyed hard drives, and perhaps the best argument ever for offsite backups. (Not affiliated with the ISP in question)"
Hopefully no one was hurt when the trailer park got levelled.
when Munchkins overrun the web now that this ISP got relocated by the twister.
"So, ah, your ISP here.. what's your uptime for the last year?"
"99.18% for our service, and 96.2% for our building."
That's pretty good for getting levelled by a tornado.
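For what it's worth, the 99.18% figure in the joke matches the 72-hour outage if it is measured over one year. A minimal sketch of that arithmetic, assuming a 365-day year and treating both figures as annual percentages (the function names are made up for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def uptime_percent(downtime_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


def downtime_hours(uptime_pct: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime implied by an uptime percentage over the period."""
    return period_hours * (1.0 - uptime_pct / 100.0)


# 72 hours of downtime in a year rounds to the 99.18% quoted in the joke.
print(round(uptime_percent(72), 2))

# The joke's 96.2% "building uptime" would imply roughly two weeks without a building.
print(round(downtime_hours(96.2), 1))
```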
My server company (whose name I will not mention) sometimes just shuts down because of a gentle breeze. Then when I call them, I hear "whoops, I guess it is down!"
Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
And I'm sure every minute of those 72 hours was characterized by irate phone calls to tech support.
"Are you guys down again? You're down more than you're up! I'm going to find another service... etc..."
"Ma'am our facilities have been entirely leveled by a tornado, we'll be back up in 72 hours."
"72 HOURS?! I have photos of my grandchildren I have to mail! Worst ISP ever! Let me speak to your supervisor!"
"Ma'am our supervisor was also leveled by the tornado."
*click*
Not that I work tech support for an ISP and am bitter...
Now that that's out of the way, it never ceases to amaze me how many companies have little to no severe disaster recovery plans, and how a little bit of ingenuity(sp?) can go a long way in a company.
Times of crisis and how one deals with them are the mark of successful businesses/employees/people. I don't think that we could recover so quickly should a disaster of that size hit my job, but it'd be fun to try.
This is what happens when people make intelligent plans and then modify them as they see other plans work or fail. I'm glad to see that this was a work in progress rather than some arcane plan in a binder somewhere that no one ever looked at.
The Blaster Master Fighting for Truth, Justice, and Evil Pie since 1979
That is some very well thought out planning. Big props to those guys!
When your business gets pelted with the equivalent force of 100,000 elephants, you better have a friggin contingency plan.
--"The perfect example of the man of action is the suicide." - William Carlos Williams
"Move somewhere where the wind don't blow quite that much" =)
However, it's amazing how soon after a 'total disaster' a system can be up and running again. I distinctly recall seeing a lot about just that in the paper (the one made from dead wood) after 9/11. Kudos, I say!
Everything in the world is controlled by a small, evil group to which, unfortunately, no one you know belongs.
Twisters, hurricanes, floods (oh my)
SEPTEMBER 03, 2003 ( CIO ) - The evening of Sunday, May 4, 2003, at Aeneas Internet and Telephone began as any previous Sunday evening had. The Jackson, Tenn.-based company that serves about 10,000 Internet and 2,500 telephone customers was closed for the weekend, awaiting the return of its 17 employees the next morning. Just before midnight, however, all hell broke loose. An F-4 category twister touched down just outside of town, then tore through Jackson's downtown area, leveling houses, historical sites and municipal buildings alike. The tornado ripped straight through Aeneas's one-story building, leaving only a pile of rubble.
Meanwhile, Aeneas CIO and Operations Manager Josh Hart, who'd heard about multiple tornadoes in the area that day, was home, 52 miles away in Martin, Tenn., huddling in his bathroom with his family. As soon as he was able, he flipped on the TV for news footage of the devastation. What he saw looked like "a war zone," bricks and concrete everywhere and piles upon piles of rubble.
At 2 a.m., with those images in the background, Hart's cell phone rang--it was Aeneas Network Administrator Jason Warren calling from what he likened to Ground Zero to report that everything in Jackson was lost. Another call came in from CEO Jonathan Harlan.
"I'm listening to [Warren] tell me what it's like, and he says, 'It doesn't even look like there was an office here,'" remembers Hart, 25. "The tornado destroyed our computers, our desks, everything. I couldn't believe what he was telling me."
Aeneas lost nearly $1 million in hardware and software that night, and an estimated 72 hours of downtime. But just as Aeneas in Virgil's Aeneid endured the worst the gods had to offer, so too did this Aeneas. This one, however, was wise enough to have created a contingency plan--one that minimized the damage and kept the company afloat during its darkest hour.
The company is not alone. After a nationwide scramble to prepare for high-impact, low-probability events similar to the attacks of Sept. 11, CIOs have since realized that their organizations are far more likely to succumb to another type of event--one that has a high probability of occurring and, curiously enough, is probably simpler to predict: the weather. For example, in June, while the Atlantic seaboard was bracing for the start of hurricane season, Arizona was busy battling forest fires. And in Harris County, Texas, in 2001, a tropical storm and resulting flood taught one IT executive the importance of flexibility.
Both Aeneas's Hart and Steven W. Jennings, Harris County's executive director of central technology, share their experiences here in an effort to provide best practices and battle-tested secrets about which preparations work best. According to Carol Kelly, vice president of government strategies for Meta Group, these are lessons from which everyone can learn. "When disaster strikes, you want to be ready with a plan of action and an approach of how to deal," she says. "You might be ready for the next terrorist attack, but if you're not ready for the next nor'easter, your plans won't amount to much."
Big plans for a small company
Aeneas launched its contingency plan when it was founded in 1996; since then, CIO Hart has enhanced the strategy gradually almost every year. In early 2002, as the ISP neared 10,000 Internet customers, he and his network administrator, Warren, thought up the company's most comprehensive approach yet. While they determined that the likelihood of a terrorist attack on the western Tennessee town of Jackson, population 59,600, was slim to none, they concluded that because of the municipality's location in the central U.S.'s infamous Tornado Alley, the plan should respond to the next most likely cause of disaster--twisters. What ensued was a three-pronged plan that hinged upon colocation, distribution and backups.
First, by employing Border Gateway Protocol (BGP) programming on a high-class circuit shared with an ISP 90 miles
funny munging
...is a good enough argument for off site backups. If you don't have them, your backup plan is not enough.
Hah, they can recover from a tornado. That's no biggie. How 'bout a SLASHDOTTING, then!
A Tornado huh?
Well that's what you casemodders get for installing twenty overpowered cooling fans in every one of your 1000 servers!
Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
Mirrors anyone???
We were somewhere around Barstow on the edge of the desert when the drugs began to take hold. - HST
Let the OZ jokes flow:
"Bring me the router of the wicked switch of the Qwest!"
Although, I am starting to wonder. Has anyone checked to see if this ISP has a record of resisting RIAA subpoenas? Perhaps the RIAA levelled it after acquiring cloudbuster equipment.
Don't blame Durga. I voted for Centauri.
I swear I read the headline as "ISP Leveled by Tomato"
A couple of friends of mine were badly burned because the web hosting company they were using lost all their data (customer and their own) in one humungous crash, and didn't have any backups. They didn't even have a spare copy of their customer database, so they couldn't even contact their customers to tell them what was going on. Nor could they tell what customers they had and how much service they'd paid for, etc.
The next Cmdr Taco duplicate will be ready soon, but subscribers can beat the rush and see it early!
long article with no link.
.. now let's see if they can recover from a slashdotting. :)
Aeneas Internet
- cnb
Wow...Jackson, TN has electricity. Now it has computers and the Internet. What's next? Evolution in schools?
There are a huge number of yeast infections in this county. Probably because we're downriver from the bread factory.
Worst. Troll. Ever.
. . . how long will it take the article's host to recover from the slashdot effect?
'I ain't a liar, baby, and I ain't proud I just want what I'm not allowed.' -- Violent Femmes, 36-24-36
All told, colocation was down for about a day and a half. Although it is exciting that they recovered from such an accident quite fast, it also means that their customers were out of business for 36 hours. This could kill a small company. Maybe they should overhaul their strategies.
They really were mostly back in that time frame. It was amazing. All Hail the guys who went without sleep to make it happen.
Jason Warren has 3 level 60 EverCrack characters...
No, in Russia Tornado does not own you. Neither does ISP. It is not, step 1) tornado step 2) ??? step 3) ISP recovers. There is not a beowulf cluster of these, and the tornado doesn't run Linux.
Build a better building!
Unfortunately, computerworld.com may take longer to recover from the ./ effect.
As for cable monkeys, I would like to point out that a programmer is a programmer is a programmer. A programmer is just another dumbass like yourself, who may or may not know his code from a hole in the ground. Programmers can be replaced with folks in India and the Baltic States without anyone thinking twice, or caring. But a good Network Admin who knows his stuff and how best to implement it in a business environment is a gem.
I, for one, welcome our new twister overlords.
+1 Informative?!?
Does that mean that some moderator actually believes that we have, indeed, been conquered by twisters?
Network admins are just IT jack monkeys running around plugging stuff into switches and getting in the way of everyone who has actual work to do.
...evolution of their knuckle-dragging residents.
-Looking for a job as a materials chemist or multivariat
let me get this straight, all the houses around the isp have no power, no phone... but they still need to get online?
Runnin' On Empty
I thought I told you to shut the hell up.
Then I've seen the other end of the spectrum - a 6 Billion dollar corporation's world HQ IT center... wow. They have disaster recovery sessions and planning like I never would have imagined. Very cool facility, but it has to be like that. Some day if they get burned, it's all over.
Berto
...a Slashdot recovery plan
now I just hope somebody comes up with a way soon to recover that fast after a slashdotting!!!
But, as a programmer, I just don't care.
When I was a sophomore, working on my electrical engineering degree, I worked for a small, network-centric company that employed what seemed to be an abnormal number of snooty programmers and technical writers. Maybe it wasn't so abnormal.
Me: "Hi, IT support."
Stratjakt: "Hey, I know you're just a high-school educated 'IT person', but you need to get one of your cable monkeys up here and find out why I can't see the network!"
Me: "OK, but let's check a couple of things quickly before I dispatch a technician. It may save some time."
Stratjakt: "Hey, I'm a programmer! I just don't care!"
Me: "I understand...I realize that my mundane existence doesn't have the exhilaration and excitement of the thrilling, edge-of-your-seat world of a computer programmer, but there are just a few simple things that we could do to resolve this problem that will be faster than you waiting for a technician."
Stratjakt: "I just don't care."
Me: "No problem, I'll dispatch a technician."
An hour later...
Technician: "Stratjakt is all fixed up. I plugged his network cable back into the jack."
What amazes me isn't that these people were able to restore service to their customers in 72 hours. They used standard systems administration techniques. BGP was specifically mentioned.
No, what amazes me is that this is news. The IT industry is so full of idiots and morons and MCSEs that taking basic precautions earns you a six-figure salary and news coverage. These folks didn't even have off-site backups, it was luck that they were able to resume business operations (ie: billing) so soon.
Moral of the story? When automobile manufacturers start getting press coverage for doing a great job because unlike their competition, they install brakes in their vehicles, you know that the top-tier IT managers and executives have switched industries.
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
OK, I may just be jaded; I work in a sector that thinks 5 minutes is an earth-shattering amount of downtime. 72 hours would have me, everybody that works for me, and some C-level guys fired at the companies I work for. First things first: what did they do wrong? Backups stored on site. This is page 2 of a disaster recovery howto: backups need to be stored onsite and remotely, and they also need to be verified as functional (yes, I am that manager who insists that servers be restored and checked for functionality on the backup hardware during a work window). From the story, it wasn't even client data so much as it was their billing DB and other office information. When will people learn that information makes a lot of businesses and needs to be protected? It's a nominal cost to do proper backups and house them remotely, even if it's in a bank vault a few towns over, preferably on the other coast. Satellite uplinks can provide decent amounts of bandwidth in a pinch, though the latency is horrid.
No sir I dont like it.
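The "verified as functional" point above can be partly automated. Here is a minimal sketch, assuming the backup is a plain directory copy of the source tree. The function names and layout are hypothetical, and a checksum comparison only shows that the copy is complete and intact; it does not replace the full restore test on backup hardware that the comment describes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        # A file counts as a problem if the backup lacks it or its contents differ.
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `verify_backup(Path("/data"), Path("/mnt/offsite/data"))` (both paths are placeholders) would mean every source file has a byte-identical copy in the backup.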
>But just as Aeneas in Virgil's Aeneid endured the worst the gods had to offer
A single tornado is NOT the worst thing that could happen. It is pretty bad, but never underestimate how messed up God can make you.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
...cuz they'd be the laughing stock of their industry.
The TV Ops Geeks always get such a big kick out of their l'il Internet brothers. It was fascinating to watch the melding of these two geek cultures in some shops circa late 90's. The I-Net guys were still getting all the dough and new toys, but thought that "Five Nines" was the name of a new Goth Band. Meanwhile, the TV dudes ("Yo, I got yer 'streamin' video' right *here*!") were beside themselves as they watched the Young Gods attempt to re-invent rich-content distribution.
This is not meant as a knock to either "side" of course; the story just inspired a brief traipse down Memory Lane...
Our ISP was leveled in a Tornado.
"Learning is not compulsory... neither is survival."
--Dr.W.Edwards Deming
Wrong on SOOOOOO many levels.
Let me start with this line:
"I realize that slashdot is mostly populated by high-school educated "IT people", who give a shit about logs and backups"
You claim to be a programmer. I have been a programmer and am now a sysadmin, and in both roles the BEST way to troubleshoot was from the logs. Unless you are the supreme programmer whose code never needs debugging and whose users never mispunch something and cause an error, a log file will let you see and know what has happened.
Now for this line:
"and restoring backup tapes is exhillirating and exciting."
I have restored from tape backup. We had a "programmer" (BS from Virginia Tech, Masters from UMass) who was certain he knew exactly what he was doing when he blew away an entire production database. (Actually he was a really good guy who just made a simple mistake.) Fortunately we had tapes to restore from. But if ANYONE thinks that a restore is "exhillirating" (yes, I left your typo/mistake in there) then they are just strange. That was one of the most tedious and boring things I have had to do. But we had been tedious in backing EVERYTHING up, so production was not severely impacted.
Now for where you directly insult everyone:
"I fully expect the PHBs and army of cable monkeys to get the network up and running in our new location."
So as a systems admin do I become a cable monkey? Or am I a PHB? Either way I would be VERY needed if a disaster strikes, just as I am needed every day. As for the elitist attitude and your lack of knowledge and concern for the backend of systems, I am glad you do not work anywhere near me, as I hate IT personnel who have to call me to run Windows Update on their system when the latest worm comes around, or to show them how NOT to click ignore when Norton tells them they have a virus.
In short, Please show some respect for your coworkers and realize that these guys were prepared and did what their plan stated they could do.
If not don't be alarmed if somehow your account gets disabled and everything blown away and surprisingly they won't have backups, cause you "just don't care" for them.
I am 31337 or something.
"There's fiber and wireless out in these woods"
Come to think of it, the Bronze Age could be called "wireless" as well. Makes them sure look advanced?
Don't blame Durga. I voted for Centauri.
I, for one, welcome our new Tornado-beating ISP overlords.
Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
don't blame me, I voted for Kodos...
ugh I knew I forgot one. :(
Any ISPs looking to learn their lesson through these guys should check out this company. They make wireless disaster recovery systems for ISPs. Pretty nifty stuff. AES encryption and Mesh protocol. Cool beans!
http://www.wavewireless.com
Raging in an online forum won't do anything for the world around you. To see change, you must take action.
Yep, that's the way it works. I don't crawl around on the floor plugging shit in and getting dirty.
...5 minutes later
But you forgot the rest of the story.
Me "Yeah, it took IT a fucking hour to plug my computer back into the switch."
CTO "An hour? What the hell."
Me "Yeah some high school kid argued with me on the phone for 15 minutes about jibber and jabber and didn't want to come up here. Then some kid shows up and sits around just randomly clicking shit on my desktop, and types 'ping 127.0.0.1' into the command prompt like he knows what he's doing. I told the kid 50 times that there's something physically wrong with the network. But you know, he's taken a weekend course on computers and needs to act like he's got some sort of skill. It was a friggin joke. I swear to god, it took an hour for him to plug in an ethernet cable"
CTO "Who was it? They're fired."
Seriously, you talking about taking an hour to plug in a cable like it was someone else's fault just justifies everything I feel about dipshit administrators.
They're just added bureaucracy for the computer world, and I work to replace them each and every day with more sophisticated self-administering software.
I don't need no instructions to know how to rock!!!!
First, this is a joke... we have a saying around here... goes something like: a good programmer is one who can replace all sys admins with a shell script.
It's a funny joke.
But anyway, I would be careful with your generalizations. I, as a programmer, have an enormous amount of respect for the admins because, well, they know their shit and are pretty cool to boot... and you can't just say a programmer can be replaced by folks in India or Russia... someone who has as much experience as I do, not just hacking code but doing analysis, design, consulting and all of the other things one gains by years of experience, is not so easily replaced and is also considered a gem. Yeah sure, college kids fresh from classes looking at jobs to do web development or whatever have a steeper hill to climb, but hey, that's the way it's supposed to be... paying the dues and all that jazz.
sad robot making broken music
Can they recover from the slashdot effect???
The slashdot effect differs from a tornado in a few subtle ways:
1) You can't see it coming (unless you pay money to be a subscriber)
2) It doesn't hurt anything, except for webservers, the occasional OC line lit up like New Year's Eve, spammers, and the odd *IAA executive.
3) A tornado doesn't typically smell like armpits, cheetos, empty 64oz soda cups, burning plastic, your parent's basement and/or too much cologne for that first date.
4) It travels at the speed of light, a lot quicker than a tornado.
5) Does not require specific atmospheric conditions to be present...just a link on the front page.
Anything else?
Technician: "Stratjakt is all fixed up. I plugged his network cable back into the jack."
Also, a programmer that cannot diagnose problems at multiple levels is a bad programmer. This is why I think tools like GUI IDEs can cause more harm than good, because they trick programmers into thinking everything is dandy and cool. However, when those tools fail, I've seen programmers waste days on what should be trivial to fix (it turns out that nifty tool is quite inflexible, indeed).
Healthcare article at Kuro5hin
YHBT, sucka.
HAND
Well, my ISP (in Ottawa) was down for more than 24 hours, and after that spent one more day using a slow link to the Internet, with a caching proxy.
What was the disaster? A nuke attack? Plane crash? No! It was the famous blackout in mid-August. Out of power (and UPS), with no generator or fuel.
I wonder whether they plan to serve Ottawa.
This
but isn't the new moderation system leading to the first few good posts on any topic all getting modded up to 5 while the rest get ignored?
You're a VB programmer, aren't you?
As the air to a bird or the sea to a fish, so is contempt to the contemptible -W.B.
Wow! This is exactly the reason that systems administrators generally dislike most members of their development group. Your attitude does not do very much to endear us 'cable monkeys' and 'PHB's to you.
"IT people", who give a shit about logs and backups and think plugging a PC and monitor into a powerbar is "computer science"
If you think this is all that is involved in running a remotely large and reliable network, you are sadly mistaken my friend. A lot of thought, planning and testing goes into most corporate network infrastructures.....kinda like software development.
"Computer Science" is a very broad term that encompasses much more than just 'programming'.
Many companies in the World Trade Center thought that off-site backup meant the other building.
Cave, wreck, and deep diver.
You know, companies used to need to employ a bank of telephone operators. They'd answer the phone, ask "how may I direct your call? Hold please." and plug the cable into the appropriate jack.
They were some of the most critically important employees in any company in the 50s and 60s.
Then they were replaced by a box in the back of the closet.
Network administrators are today's telephone operators. Obsolete in a few decades. Just a part of a system that hasn't been implemented in silicon yet.
Since the event occurred in Tennessee, how could you forget to include something about Hot Grits?
Unfortunately (?), I wasn't an active Slashdotter when Natalie Portman and Grits were associated in the minds of the troll community, so I can't come up with anything myself. Maybe that's a Good Thing.
Stressed? Me? Of course not. Stress is what a rubber band feels before it breaks, silly.
I find it amazing that this ISP can be up and running within 72 hours of losing its main building.
I'm lucky if I can get change control OK for a very minor change on a single UNIX server within a week.
Maybe all the managers that do nothing but make everything slower and more expensive all died in the tornado.
Just goes to show... disaster recovery can make the difference between coming back online in a few days or not at all... http://www.nccomp.com/sysadmin/whatif-5.html
My generalization was out there, but it sounds like you do more than programming. At some point, you have to acknowledge the fact that you are no longer just a programmer, but rather you are a software designer.
I'm sure you would admit that there are a good many pompous programmers, but most folks who have been around are pretty humble. They know how much they don't know, and network admins who get to deal with 100 users and 30 different programs a day learn to be humble a lot faster.
There are a lot more asshole programmers than there are Mordacs.
I know this guy, and he's a pretty nasty h4XX0r. I didn't know he could take out whole buildings.
Oh, wait, you mean, this was an *actual* tornado. Crap, that must've hurt.
If you're going to post, at least take 5 seconds and think about what you're trying to say. Sheesh, what a ramble...
What takes an hour is that the technician has to take care of the other 20 people who can't be bothered to plug a cable back into the wall on their own.
Oh, and, of course, the tech also has to take care of real work - like fixing the programmer's machine after he installs the latest Webshots and Gator software.
Me: "It took our technician an hour to get all of the malware off of Stratjakt's computer that he downloaded from the Internet."
CTO: "Didn't he read the email that I sent out every month for the last six months telling the employees not to install non-work-related software?"
Me: "Well, I asked him about that...he said that he was a programmer and just doesn't care."
CTO: "He's fired."
Oh, and, incidentally, when your self-administering software becomes proficient enough to keep your big foot from wrapping around the network cable and yanking it out of the wall, then I'd say you really had something worthwhile. At this point, though, I have my doubts.
otherwise -- were the ISP linked from the story, we'd have to see if they can also recover from a Slashdotting within 72 hrs! ;)
Could it actually be that a /. article has an effect close to a tornado?
That's like saying it's unacceptable for a 747 to fly without wings - they're a midsized ISP, certainly large enough that offsite backups would be wise, but be reasonable for a second; the entire physical facility was completely obliterated, electronics, electrics, offices, building...72 hours isn't THAT bad. 24 would be impressive enough to get a headline though, IMHO.
Facts do not cease to exist because they are ignored. - Aldous Huxley
At least I had something constructive to say rather than a "ha ha ISP down means no pr0n" post like I'm seeing.
The Blaster Master Fighting for Truth, Justice, and Evil Pie since 1979
"Though the tape and hard drives were stored onsite at the Jackson location, Hart and Warren figured onsite backup was better than none."
They had to recover the drives from the rubble and after numerous failed attempts, finally found a data extraction company that could retrieve the data.
While their recovery and foresight are impressive, I don't think we should raise them up as the example, when they omitted something as simple as carrying a backup home every once in a while. They got lucky with regard to company data, plain and simple.
I am also a former Aeneas customer.
Unless Aeneas has made some major changes they are quite certainly the worst ISP I have ever worked with. Aeneas has contracts with the Jackson-Madison County School System to provide internet service district wide. The quality of such service is, bar none, the worst I have experienced.
I did some volunteer work at a local elementary school helping teachers work out any lingering computing problems they had (viruses, printer drivers, misconfigured IP settings, file transfer to a new computer, etc.). The internet service I experienced while I was there led me to believe I was on a 128k ISDN line. Not until I went to the server room did I realize that I was, in fact, on a T1. Now this was during the middle of summer; maybe four other persons were in the building, three of whom were in the same room as myself. The service was also intermittent, having several dead periods while I was working. Needless to say, I remained unimpressed by said experience.
When I was an Aeneas dialup customer, in 1998, the service provided by Aeneas was also subpar. The dialup speeds were averaging 21.6kbps, whereas when I switched to U.S. Internet (now owned by Earthlink) my dialup speeds were always above 26.4kbps (except on Mother's Day). There were frequent disconnections, and they had a limit of 150hrs/month.
I'm not surprised how easy it is to restore subpar service. All they had to do was tie together the strings that are their backbone.
Keep up the good work.
sloth jr
Point is, if I had more than just a few thousand dollars worth of equipment, especially if I had a million's worth, I'd want to keep it safe. This is earthquake country (California) so here that means single story building, ground floor, no basement, but in tornado country, that means... putting it in the basement.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
It may be common sense, but so is quality assurance, developing products for quality rather than by schedule only, etc... but none of these things ever happen in real companies it seems.
if I were doing a bunch of levelling by a tornado. I mean, I usually do my levelling in locations such as the Royal Crypt south of Endor (Dragon Warrior IV), the Northern Crater (Final Fantasy VII), or Zeon's Lair (Shining Force II), but certainly not by a tornado.
This is the NFL, which stands for "Not For Long" if you keep making those bulls*** calls.
The company I work for practices disaster recovery once a year on all our major systems.
In the article the writer was talking about how much work it was to migrate the T1 connections, and how they hadn't foreseen that. That is exactly the sort of thing that a practice disaster recovery uncovers.
If you want the model from the place I work it is simple enough:
1. Run the disaster recovery during a 24 hour period
2. Pat yourself on the back for what worked.
3. Ignore what doesn't work.
4. Repeat next year.
Of course next year gets a new step:
3.5 Act surprised that stuff didn't work.
Underpin it, boys! The lameness filter is lame...
So what? It's not the end of the world.
Yep, that's the way it works. I don't crawl around on the floor plugging shit in and getting dirty.
...
They're just added bureaucracy for the computer world, and I work to replace them each and every day with more sophisticated self-administering software.
If you don't know how to crawl around on the floor plugging shit in and getting dirty, you do not have the perspective necessary to write software to replace the people who do. The best programmers are not arrogantly disconnected from the people in the trenches, especially if they're working on software directed towards their field. A good programmer needs at least to know what people commonly need support about in order to address it in future software. If your CTO is as out of touch and disconnected as you, I pity your fellow employees.
You're also a poor team player, which is a liability to you and your career unless you work solo. You're also incredibly stuck up and elitist, which unfortunately probably actually helps your career. You're also way off base: you obviously consider yourself "above" the type of people who enjoyed this article, and your comments have been way more of an advertisment of yourself than anything to do with the issue. Why don't you drop out of this conversation and let the high school kids who spend all day plugging shit in enjoy it. Believe it or not, there are a lot more nerds in high schools than in high-paying programming positions. That being the case, this site should have more stories about them than you.
Oh, and, of course, the tech also has to take care of real work - like fixing the programmer's machine after he installs the latest Webshots and Gator software.
Now you're reaching. I don't know a serious programmer who will let *anybody* else "fix" his/her machine. I *might* let you babysit an OS install, as long as you promise not to touch anything.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
72 hours seems way too long to be out of business. That's 3 days in which the ISP is not pulling in dough. Unless the whole internet is crippled, I'd ditch an ISP that was out for three days. One of the main selling points for an ISP is connectivity in rain, snow, shine, OR rabid squirrels...
The company (ISP/consulting/services hosting) I used to work for had a DR plan to be executed in 24 hours with 75% functionality. Offsite servers and backups of course...
More impressive to me is the World Trade Center folks like American Express and other companies that had DR plans situated across the river. A lot of datacenters and information services were functional again within 18-24 hours. That's PPP PPP (prior planning prevents piss-poor performance).
I write good sigs on my bathroom wall...but this is not a real sig.
While optical fibre may allow travel at the speed of light, copper lines do not.
Someone asked if I had patched against MSBlast; I said yes, I installed Linux.
My point exactly. No serious programmer would allow that. Of course, no serious programmer would install that malware on their system to begin with, would they?
-h-
In defense of the "idiots", many IT people and system administrators are hobbled by the lack of time, money, and equipment. There is the "right way" to do things and the "real world" way to do things. If management isn't willing to spend the money, and doesn't care, what can you do? At my last job, I had to bring a spare CD-RW drive and blank CDs in to work from my home to back up the critical files on my work PC.
Mea navis aericumbens anguillis abundat
1) Implement good disaster-recovery plan
. .data)
2) ??? (aka mad-scramble to initiate plan)
3) Profit (or at least don't go under)
This must have been a pretty in-depth recovery plan though. I mean, even with backups and a redundant connection elsewhere... I think that, for myself, processing the fact that my office had just been bowled over by wind-on-steroids would faze me for a little while (office...tornado...holy...shit...must...recover).
Now they're up and running, but what of their old office? It must be very interesting to have to deal with the stage of "step over rubble, salvage what we can" and the general amazement at nature's fury.
I'm in the process of configuring several of my servers to offload to a remote master. If the town gets levelled we're toast, but if an individual location bites it, then at least critical data (accounting records, home dirs, etc) is saved. This will still be a big bite out of the business.
A big question would be whether insurance covers natural disasters such as tornadoes. A lot of insurance companies don't cover "acts of God", etc.
Oh what a sad day it was when I (being a cable monkey) was asked by the supreme programmer to get his computer back up. When I told him that his HD was dead, he looked at me with shock, as he explained that the last month's worth of his so valuable work was on his disk. I asked him if he backed it up anywhere. He said no. He then asked me if we backed it up. I said no, we don't do that for local drives. We sent the drive off to see if anything could be recovered. Nope, big waste of time. Almost like his own little tornado in his PC. Hope it doesn't happen to you.
I'm only human!
this is exactly why i have my backup tapes stored offsite. they're actually on a two week rotation. the current week is onsite - too frequently i have to get something off yesterday's tape because someone hosed a project file or changed their mind after emptying the trash - and the previous/next week's tapes are stored in my secure, climate-controlled offsite facility.
okay, it's my house, but it counts.
if my house burns down, it's unlikely the office will suffer the same fate, and vice versa - it's a 20 minute commute. of course, there is the possibility of a large nuclear blast that could hit both sites at once, but i doubt i'd survive, or for that matter care about recovering data, considering i'd be too busy killing off the other survivors and eating their brains..
brains... brains!
but on a more serious note, if i ever switched from tape backups (or had too much data to reasonably be able to do them on tapes) to a RAID system, how would i back *that* up and store it offsite? it's not like i can pull the whole shebang out of the rack, throw it in my car and head home each day...
- Entertaining Bits from the Ancient Kernel Tree
I can't remember if it's Springfield or Joplin Missouri, but one of those cities in southwestern Missouri has a municipal ISP service and the data center is located underground in an old mine's caverns.
And the serious programmer then ends up fixing all his own problems, when his attitude makes the support staff decide that he's not worth their time anymore.
Get over yourself, please.
From the article, it looks as if the only thing they had to restore from tape/disk was their customer database, so that they could send out the next month's bills. So, the 72 hours was basically putting in new hardware and turning it on. They probably lost all their users' web sites and other "expendable" data.
How about talking about disaster recovery for a REAL company with tens to hundreds of terabytes of data sitting on disk? The kind of data that you cannot lose and must have back on-line asap?
This article is like congratulating them for putting up detour signs when a road is destroyed, or rerouting power when a power line goes down.
Just about everything that was destroyed was not-unique, manufactured items that could be recreated and repurchased. The only exception was the user data, which was pulled off of a nearly destroyed drive by a data recovery company. (Lucky for them!)
I would like to hear more about companies that lose tons of difficult to replace, unique items, such as TBs of user data, prototype designs, business records, etc.
I would bet that if a company were to permanently lose these types of things, they would nearly go out of business.
My point exactly. No serious programmer would allow that. Of course, no serious programmer would install that malware on their system to begin with, would they?
No serious programmer would be running an OS on which the malware will run ;-)
That was on a Monday. The next Monday was the Northridge quake.
They came into the next meeting a couple of weeks after the quake with a whole new perspective on disaster planning and training:
It can't get much worse than it was. Take a look at an aerial photo of the leveled building: http://www.aeneas.net/news/tornado_aerial/index.html
Critics are calling it "A triumph of the American spirit."
One ISP, a horrific tornado, and the fight for high availability.
"The part where the sysadmin held the hard drive platter in his hand was so gripping. I gave my computer a hug when I finished the movie..."
5 minutes later -
HR: "Hi, Stratjakt? This is Mindy in Human Resources, We've outsourced the programming department to a company in Bangalore. Your replacement, Raj, will be calling you today to discuss transferring over all your existing projects. Thanks for all your hard work-"
We had a similar story from a while ago:
http://slashdot.org/articles/02/11/20/132259.shtml?tid=99
There was a subsequent story that describes how that NOC went back up fairly quickly as well, thanks in large part to selfless sys/netadmins who put the good of the larger community ahead of their own need to sleep or, for that matter, do much of anything else.
Kind of offtopic but maybe funny if you haven't heard them 495,954 times...
You might be a redneck if:
You've been on TV more than 5 times describing the sound of a tornado
A tornado hits your neighborhood and does a $100,000 worth of improvement.
[[ the only 15 letter word that is spelled without repeating a letter is uncopyrightable: it may soon be, however. ]]
I wonder how long it would have taken them if they already had a redundant datacenter that everything was replicated to. In the financial industry, 72 hours passes and the feds come in and shut you down. 72 hours may be acceptable for an ISP, but not for a bank or services like Western Union.
Need Free Juniper/NetScreen Support? JuniperForum
Respect is earned and not given. And I respect the engineer who designed and built the building, and not the janitor who cleans it up.
Have you ever been to a turkish prison?
I used to live near Jackson, TN. If someone thought they could get another ISP there that doesn't have "OL", "arthlink" or somesuch in the title, they've been on the tractor too long.
A beowulf cluster of these could have kept them down for a month or two...
Error 666 - Satanic SCO code found in your Linux kernel.
Given they had BGP running and obviously another set of machines to fail the entire operation over onto, why were they down at all? Admittedly the admin side of things (i.e. new user signups, billing, user queries) would take time to recover while new workstations are commissioned. But the operational side should not have been down more than an hour. 72 hours is a very long time in any business; maybe the lesson they have learnt is how to bring it back up within 72 seconds.
You should have posted a link to the ISP's website.
Then we could've kicked a dog while it was down.
http://jesus.everdense.com/
When you go to a DRP seminar, they make the claim that the majority of businesses that are knocked out for longer than 48 hours go out of business within 1 year.
Why not have the server detect overload from Slashdot, and direct the excess to Goate.sx.
That will kill the effect right then and there.
Customers need to be able to access their porn!
- local schools and universities (faculty only)
- ISWT
- AT&T
- Charter Cable
- DSL (in some areas, not sure what provider)
It's not that backwoods, and Martin's a town of 8,000 or so. Jackson is a good-sized city (about 250,000 people in the Jackson metro area, 87k or so in the city proper) with a decent infrastructure. It's no Silicon Valley, mind you, but it's not anywhere near as bad as that.
120 character sigs suck. Make it 250.
This was from a magazine for managers, after all. Now there's some good news that pointy-haired bosses can understand!
In retrospect I suggest that they should have been doing remote file synchronization with rsync. There would have been no need then to try to recover data from lost hard drives.
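For anyone wondering what that might look like, here is a minimal sketch of a nightly push to an offsite mirror. It only builds the rsync command line; the host name, paths, and the function name build_rsync_command are made up for illustration, and nothing here is taken from the article or from the ISP's actual setup. The flags used (-a for archive mode, -z for compression, --delete, -e ssh) are standard rsync options.

```python
def build_rsync_command(source_dir, remote_host, remote_dir, delete=False):
    """Build an rsync argument list that pushes source_dir to a remote host over SSH.

    All names and paths passed in are the caller's assumptions; this function
    only assembles the command and does not run anything.
    """
    # -a: archive mode (recursive, keeps permissions, timestamps, symlinks)
    # -z: compress data while it is in transit
    cmd = ["rsync", "-az"]
    if delete:
        # Remove files on the mirror that no longer exist at the source.
        # Leave this off if the mirror is also meant to protect against
        # accidental deletions.
        cmd.append("--delete")
    # A trailing slash tells rsync to copy the directory's contents
    # rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    cmd += ["-e", "ssh", src, f"{remote_host}:{remote_dir}"]
    return cmd
```

The resulting list can be handed to subprocess.run(cmd, check=True) from a cron job. Note that a plain mirror copies corruption and deletions too, so it complements dated backups rather than replacing them.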
I wrote such a plan for a small/medium sized business not too long ago. Part of the plan included leasing 3 off-site dedicated servers for nothing but backup. Their databases would be dumped once a day at around 2 AM and then the data uploaded to the three off-site servers. Their database is about 120 MB, not that large, and those dedicated servers have dual SCSI hard drives of between 80 and 120 GB in a RAID configuration, and those are also backed up on-site daily. While not foolproof, this should save their ass in case of an emergency.
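A rough sketch of the "one dump, three copies" step described above, with a checksum check so a bad copy gets noticed the same night rather than on restore day. Local directories stand in for the three off-site servers here, and the names sha256_of and replicate_dump are invented for this example; a real version would swap the copy step for scp, rsync, or whatever transfer the hosts support.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate_dump(dump_path, destinations):
    """Copy dump_path into each destination directory and verify every copy.

    Returns a dict mapping each destination to True if the copy's checksum
    matches the original, or False if the copy failed or does not match.
    One failing destination does not stop the others from being tried.
    """
    dump_path = Path(dump_path)
    expected = sha256_of(dump_path)
    results = {}
    for dest in destinations:
        dest_dir = Path(dest)
        try:
            dest_dir.mkdir(parents=True, exist_ok=True)
            copy_path = dest_dir / dump_path.name
            shutil.copy2(dump_path, copy_path)
            results[str(dest)] = sha256_of(copy_path) == expected
        except OSError:
            # Destination unreachable or unwritable: record it and move on.
            results[str(dest)] = False
    return results
```

Anything that comes back False is worth an alert, since the point of having three destinations is lost if two of them have been failing silently for months.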
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
Did anyone else read "Kroll OnTrack" as "Troll OnKrack"?
Wait, did anyone else even read the article?
Oh, never mind.
There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
did anybody else notice these lines:
Meanwhile, Aeneas CIO and Operations Manager Josh Hart..
'It doesn't even look like there was an office here,'" remembers Hart, 25.
Aeneas launched its contingency plan when it was founded in 1996; since then, CIO Hart has enhanced the strategy gradually almost every year.
Seems to have gone unnoticed that this guy founded the company at 18...before the dot com boom!
My company decided to migrate from $olaris to Linux to save money. They also bought the cheapest Intel boxes they could find to run it on. Not long after, we started having hard drive failures. Gee, there isn't much for system recovery on Linux. We finally found a company called Storix, but not before spending many hours writing scripts around tar. It is bad enough to restore from a tar backup, let alone from a dd if=/dev/sda of=/dev/st0.
This is really sad, and the company could have fired him for being incompetent. He basically destroyed their intellectual property through negligence, wasting all the money they invested in his project, which was almost certainly more than just his salary for that time period.
If a truck driver gets a load and forgets to check his own tie-downs, and as a result loses the load before reaching his destination, whose fault is it?
Besides, as supreme programmer, he should be motivated to work sometimes from home in the middle of the night, and have backups there
Get off my launchpad!
philcrissman.com.
I took two years' worth of C classes in college. However, I don't program in C now. I script in PHP and have worked professionally on many large business sites, but I'm sure that's beneath your respect (which I really don't care about). I have a friend who writes microcode for IBM and he's basically of the same philosophy as myself. And I've "hung out" with many programmers. Not all programmers are pricks.
As for your analogy, it's cute but I don't see how it pertains to the conversation you tacked it onto. I can only imagine you consider the people who keep the programmers' computers, and the services they depend on to do their job, running to be akin to janitors. Network admins usually get paid a little bit more than janitors, so businesses disagree. Secondly, I respect the janitor who does a good job keeping things clean, and there's no reason you shouldn't except for snobbery. You certainly shouldn't act like you know more about cleaning up than a janitor does, because if you're like most programmers I've met (myself included), you sure as hell don't.
"You don't expect me to clean that up, do you?"
"So this is where 'Gone with the Wind' was filmed?"
"I'm sorry. I should have covered my mouth before sneezing."
"Damn! Those were some serious burritos."
"When I said I wanted a skylight, THIS ISN'T WHAT I MEANT!!"
[Sign in front]
"Moving to a new location, with statewide coverage."
"Hey dude. That was some party."
"Can someone help me? I lost a contact lens."
[Another sign]
"Excuse our mess. We're remodeling."
[Sign #3]
"The dynamite factory is that way--->"
"I'll huff, and I'll puff, and I'll blow your building down."
Stan:
"But Ollie, it's not my fault."
And the serious programmer then ends up fixing all his own problems, when his attitude makes the support staff decide that he's not worth their time anymore.
You must not have read my post. I said I don't know any serious programmers who would allow the support staff to touch their machines. By simple logic, this means that all of the serious programmers I know do fix all their own problems.
Get over yourself, please.
This has nothing to do with any sense of personal superiority, it's just about control. I like my machine my way, and I don't want anyone mucking with it.
One of the larger clients of my company has this setup: they have a West Coast and an East Coast information center, presumably with identical hardware and software. At any time, one site is live, and the other is dormant. Data are constantly replicated from the live site to be backed up at the other site. Each Friday evening, they switch roles -- the backup site goes live, and the live site becomes the backup. That way, they know for sure that the backup copy is just as good as the live copy. They rehearse the switchover every week, so it should be no big deal should an emergency happen. With some fancy IP routing, I'll bet that the transition can appear to be transparent, too.
They would still need backups to guard against data corruption. And yes, their hardware and software would cost approximately twice as much. But I have to say, I'm impressed by their idea.
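The weekly role swap can be modelled in a few lines. This is only a toy sketch of the idea, not how that client actually does it: the Site class, the last_applied replication counter, and switch_roles are all invented here, and the real work (draining traffic, repointing routes) is left out. The one rule it encodes is that the standby must have caught up before it is allowed to go live.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    role: str              # "live" or "standby"
    last_applied: int = 0  # sequence number of the newest change this site holds


def switch_roles(live, standby):
    """Swap roles between two sites and return (new_live, new_standby).

    Refuses to switch if the standby is missing changes the live site
    already has, since going live on stale data defeats the exercise.
    """
    if live.role != "live" or standby.role != "standby":
        raise ValueError("expected one live site and one standby site")
    if standby.last_applied < live.last_applied:
        raise RuntimeError("standby is behind; finish replication before switching")
    live.role, standby.role = "standby", "live"
    return standby, live
```

Running the same function every Friday and in a real emergency is the whole appeal: the disaster path is the one that gets exercised 52 times a year.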
Me too !!!!
I, for one, welcome our new Tornado-beating ISP overlords.
Yes, Yes, Yes! Read this line and chant it people!!11!