Grading Telco & ISPs During the Blackout of 2003?
alt_cognito asks: "Our company runs natural gas generators here in Novi MI and when the power went out we didn't miss a beat. Nine hours later, our telco blinked and our T1 service went down despite the lines being run to different locations and ISPs (UUNet, LDMI). Service did not return until power had returned to the upstream offices. I was under the impression that these locations would be run by similar power generation. How did your telco/ISP perform?"
I had both dial-up and DSL through the whole thing.
In flyover country, the only online problem I had was one banking site that had a stuck screen. Everything else worked fine, if not better (since some sites were much less overloaded with Northeasterners).
If the problem originated in Cleveland, I still think the problem had to do with Lewis and Oswald.
Don't blame Durga. I voted for Centauri.
Our generator worked fine (we were powerless till about 5am Friday) and a dedicated line to Motient stayed up the entire time. But our UUNET T1 was down until about 6pm Friday, and the entire time it was impossible to even get a Worldcom person on the phone to find out if they had any idea when they would get their act together. So basically I give Motient an A+ (well, they were outside the blacked-out area, but I still got updates on their network status inside the area) and I give Worldcom a big fat F.
My ISP handled the outage just fine. Here in California we don't have to worry about blackou&*#%@
carrier lost
~~~~~~~
"You are not remembered for doing what is expected of you." - Atul Chitnis
Oh yeah. I've always had minimum pingtime of about 30 seconds to render a damned page. And I'm on DSL.
Seems Scoop isn't all they thought. Guess they should have chosen Slashcode.
Here in Canada, Bell Sympatico was up when I lost power and up when I got it back, so I would just assume they were up all along. All ISPs, I believe, have failsafe setups that were never tested or maintained, and that showed very clearly during the blackout.
At my company the IBM eSeries servers were backed by a smart UPS that showed 17 hours remaining, 15 seconds before shutting down everything in cold blood. It was all scandisks booting back up there on Windows 2000 machines.
I maintain some small servers with no UPS in a few locations. One Solaris server crapped out and needed a manual fsck, while all the FreeBSD servers came back up without a hitch as soon as the power returned. I have to learn how to set up Solaris on SPARC so it runs fsck by itself without asking for input.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
...and without a contract, I doubt the workers will be working very hard to fix the damage caused. I have multiple T-1 lines down all over Manhattan, most on UUNet/WorldCom. The word is Verizon has a couple CO's that are still out, and I hope they know what they're doin....
"... I declare our city to be a free and independent state to be named Tri-Insula!" --Fernando Wood, Mayor of NYC 1861
Since your telco is SBC, that should be a daily occurrence.
--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
It's great that you have two different providers. Eliminating the single point of failure is important. But most people miss the semi-hidden single point of failure: the local telco. The problem is that your two ISPs likely don't own the copper that those T-1s run on. That copper is owned and operated by your local telco. Your ISP just contracts with the telco to provide you their service over the local telco's loop.
It is likely that you will never find out exactly what happened, but from what you describe it sounds like this: the lights went out, the local Central Office (CO) where the local loop for your T-1s terminates went onto UPS backup or generator, and after a few hours the UPS or the generator ran out of juice. Once the CO ran out of juice, your T-1s went dead, so you lost connection with the ISPs. More than likely the ISPs themselves never blinked.
The only way to avoid this problem is to use two different local loop providers, which are usually hard to find unless you are in a large metropolitan area. The other thing to do is get the local loop lines from different COs, which will be like pulling teeth from your local telco.
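To make the hidden-dependency point concrete, here is a rough Python sketch that lists every facility shared by all of your circuits. The inventory format, the facility labels, and the function name are my own assumptions for illustration; the submitter's real loop and CO details are not known.

```python
from collections import defaultdict

# Hypothetical inventory: each circuit lists the facilities it depends on.
# The ISP names come from the question; the loop provider and CO labels
# are made up for illustration.
circuits = {
    "T1 via UUNet": {"isp": "UUNet", "loop_provider": "local telco", "central_office": "CO-A"},
    "T1 via LDMI": {"isp": "LDMI", "loop_provider": "local telco", "central_office": "CO-A"},
}


def shared_dependencies(circuits):
    """Return {(kind, name): [circuits]} for dependencies used by every circuit."""
    users = defaultdict(list)
    for circuit, deps in circuits.items():
        for kind, name in deps.items():
            users[(kind, name)].append(circuit)
    # A dependency that every circuit relies on is a single point of failure.
    return {dep: sorted(names) for dep, names in users.items()
            if len(names) == len(circuits)}


for (kind, name), names in sorted(shared_dependencies(circuits).items()):
    print(f"single point of failure: {kind} = {name} (used by {', '.join(names)})")
```

With two different ISPs the "isp" entries never show up as shared, but the loop provider and the CO do, which is the failure mode described above.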
Planning and preparedness, unfortunately, do not guarantee against failure.
I live here in Quebec and we never lost any power. However, my ISP did not come back online until 24 hours later. I was very disappointed with the results, but I really don't think it was the ISP's fault. It was more that Ma Bell (www.bell.ca) had no backup power other than a 10-minute UPS running their routers.
My company has a T1 with UUNET. Verizon handles the local loop. Verizon STILL hasn't brought our line back up. New York has had power for 3 days now. I already give Verizon an F for past intermittent problems they couldn't fix. Now I guess they get the talk about how high school isn't for everyone and they might want to drop out while they can still get in on the ground floor of a career in day laboring.
I would always get very nasty-looking fsck errors on my Solaris machines whenever they crashed. Although the messages looked nasty, the filesystems seemed to be fine after being repaired by fsck. The problem was that the fscks interrupted the boot process until manual intervention was given.
One day, I discovered a journaling mode for UFS. The journaling feature had been available since Solaris 7. See mount_ufs(1M).
You simply add logging to the mount options of your UFS volumes in /etc/vfstab. Reboot once so that you remount those volumes (presumably including your root partition) with journaling turned on.
That's all! I haven't had to fsck since I did that.
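If you have several boxes to change, the vfstab edit can be scripted. Below is a minimal Python sketch that adds logging to the mount options of every ufs entry in vfstab-style text. It assumes the standard seven-field layout with the mount options in the last field and "-" meaning no options; the function name and the sample device names are made up. It works on a string only, so you would still review the result and back up /etc/vfstab before replacing anything.

```python
def enable_ufs_logging(vfstab_text):
    """Return vfstab text with 'logging' added to the options of each ufs entry."""
    out = []
    for line in vfstab_text.splitlines():
        fields = line.split()
        # Leave comments, malformed lines, and non-ufs entries untouched.
        if line.lstrip().startswith("#") or len(fields) != 7 or fields[3] != "ufs":
            out.append(line)
            continue
        # Field 7 holds the mount options; "-" means none are set.
        options = [] if fields[6] == "-" else fields[6].split(",")
        # Respect an explicit choice that is already present.
        if "logging" not in options and "nologging" not in options:
            options.append("logging")
        fields[6] = ",".join(options)
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


# Hypothetical sample entries, not taken from a real machine.
sample = """\
#device           device              mount         FS    fsck  mount    mount
/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0  /             ufs   1     no       -
/dev/dsk/c0t0d0s1 -                   -             swap  -     no       -
/dev/dsk/c0t0d0s7 /dev/rdsk/c0t0d0s7  /export/home  ufs   2     yes      nosuid
"""

print(enable_ufs_logging(sample), end="")
```

Rewritten entries come out tab-separated, and running the function twice changes nothing further.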
I love the fact marketers and politicians didn't get in on the pre-attack capitalization & exploitation by preplanning it in advance like they usually do these type of "nationwide" events. The Blackout probably saved Americans 2 billion dollars in wasted energy simply by turning off the switch for 5 minutes and giving the sonic bomblastment of current draw back to God, like peace, man. So, this is currently my favorite major catastrophic event this month, because it didn't involve the DOD, CIA, FBI, ADL, AOL, Waco or Janet Ashkraupht.
I'm in southern California, and around 18:00 PDT Monday, our T1 connection to MCI pretty much quit working to 95% of the internet. MCI said there were routing problems due to the NE blackouts... whatever.
Nothing to see here; Move along.
But a funny story none-the-less...
Back during the great Auckland power crisis of '98, my ISP was Binary Brothers, a now-extinct ISP. They were a great ISP, run by a few guys who knew their stuff.
Turns out the owners were physically located on the Coromandel Peninsula (about 4 hrs drive from Auckland), while their servers, modem racks, etc. were located in the heart of the Auckland CBD.
The power in Auckland blinked out, and so did my net connection (I was located outside the blackout area). I rang 'em up and asked if they were out due to the power crisis. They replied that they were currently in the wagon driving from the Coromandel with a generator in the back, and the ISP would be back online in a few hours.
Well, they were right, it flicked back online in a few hours. Gotta hand it to them, they really did a good job. I just have this mental picture of these two guys in a wagon, speeding down the windy roads, one yelling, drive faster, we need more power!
Anyway, that was then. These days I'm lucky if my home connection stays up during a rainstorm, let alone a power blackout.
I used to have a funny sig, but slash cut it off, and I forgot what the punchline was.
I'm on the West Coast, so we suffered through the California blackouts, which we got through without too many problems.
But today our rack at the colo facility, which has battery backup and generators and can last quite some time without city power, went down this afternoon. Why? An electrician turned off our breaker and shut down our entire rack.
Never underestimate human error.
--ajay
As a result, I could have used my laptop for some late night entertainment, but I decided to use my lanterns and read a book instead.
What was really interesting was that you could see the stars in the sky for the first time probably since 1977's blackout. Oh sure, you can see a few, but I mean see a LOT of them.
Never hit your grandmother with a shovel, for it leaves a bad impression on her mind...
Home:
ISP: Cogeco Cable: No interruptions
Power: NiagaraMohawk: No interruptions (I kid you not. My neighborhood hasn't lost power for a second during the outage).
Phone (both landline and cellphone): No interruptions
Work:
ISP: RoncoNet: No interruptions
Power: unsure (prolly NiMo + generators): No interruptions
Have EVDO, will travel.
For the first few minutes things were fine, my ISP was still up when I decided to shut down early, rather than running the UPS battery down hard. Cell phones were fine too, for a couple hours.
Later Thursday night I tried to get back online from my laptop, and my local ISP (who uses Megapop for dialin ports) would answer and connect, I'd get an IP address, but my packets went nowhere. Tracert stopped at hop 1.
So, knowing that the phone system was up, I simply dialed out of the affected area. Calling a POP in another city worked fine, and I was able to touch base with a few friends online before the laptop battery gave out.
Friday morning, having recharged the laptop in the car, I gave it another go. This time, the far-away ISP connected but wouldn't route packets. I decided it wasn't very important, that the fridge wouldn't stay cold much longer, and I gave up on the internet in favor of starting the generator.
Megapop: B for staying up as long as they did, F for eventually failing. Big batteries followed by dead generator? Final grade, low D.
When I signed up with Nextel as a cell phone carrier, I did so with the knowledge that their history had been as a mobile radio provider to law enforcement and public safety agencies. I figured if anyone's network would be well engineered, it would be theirs. Oops.
For the first few hours, my phone said it could see the tower, but circuits were predictably busy. That's understandable. Then some time Thursday evening the tower went off the air, and didn't come back until Friday afternoon, for 2 hours, then it disappeared again. I took to checking my voicemail from a POTS line. During the next day I got and lost signal several times, including one weird moment when the phone's light blinked red (It's only supposed to know solid red or blinking green) and the display said "Ready", instead of "NEXTEL".
Starting Thursday afternoon, my (quite expensive) Packetstream service was useless. Even when I could make voice calls, the internet was unreachable. This has me really baffled. The cell site was up, but something at the back-end was down? If anything, I'd expect their switching facilities to be better protected than the individual sites! I'm cancelling Packetstream in favor of the much cheaper and probably equally unreliable data service from T-Mobile.
Nextel: C- for unreliable service starting immediately, F for getting even worse after that. Shame on you.
POTS remained working perfectly the whole time, despite the fact that I know my line to be carried through a SLIC-hut. Those batteries did exceptionally well. Except for the first busy hour following the blackout when everybody called around to check up on family, I never had problems making calls. After replacing all the cordless phones in the house with corded models, life returned to a sweaty version of normal. Curiously enough, one of our three POTS lines failed midway through the outage, and didn't come back until late Sunday. Coincidental failure?
SBC: A for having dialtone the whole time, B+ for only giving "all circuits busy" messages a few times under what must have been heavy load.