Tier One ISPs Dying
xbmodder writes "Two tier one ISPs are down today. At about 23:30PST both Verio and Level 3 started having problems with routes. According to Level 3 this is a software upgrade gone awry. Is this the end for Level 3?" Many, many reports about this are coming in, and if you're wondering why the stories were rather sparse overnight, it's because it's difficult to post them without internet access. Hope everyone else is back online too.
Maybe I'll get some work done today for a change.
-S
--- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
Is there a term for this kind of intermittent site inaccessibility due to Internet outage -- not the user or the server being offline, but the Internet failing?
That what was all this school was for... to teach us how to solve our own problems. -- janeowit
It's nice to see something explaining why I was paged at 2:30am. And now, to whom from Level3 do I send my bill?
But what is a tier 1 ISP?
Is that like a bandwidth wholesaler or something?
This is the sig that says NI (again)
An ISP's server being down 1 day is unacceptable of course, but to say it is dying already? or is there more to these ISP's? (haven't heard of them before)
Bottles Of Beer On The Wall - Advertising Fun! Get your bottle of beer on the wall today!
See http://scoreboard.keynote.com/scoreboard/Main.aspx?xAxis=Destination&yAxis=Origin&zAxis=Metric&nAxis=Period/
Nico M, London, GB.
Take a look at the scoreboard now. The mentioned problems are gone and Level 3 is no longer in the red.
Why would a software upgrade going wrong be the end of a gigantic Tier-1 ISP?
http://scoreboard.keynote.com/scoreboard/Main.aspx?Login=Y&Username=public&Password=public [coralized]
Seems like a non-event.
What are the odds that some idiot will name his mutex ether-rot-mutex!
my net was down EARLY this morning for a while. my modem and wireless router were working ok and I had an IP address but I couldn't get anything to load. i thought maybe i forgot to pay the bill but it's working again now.
Thanks for that amazing flash of insight. Maybe they could have used the Internet Smoke Signals gateway. Meh.
Today is a good day to code.
Today i was playing world of warcraft and on our raid about 25% of my guild mates lost their internet on and off. Other than that the lag was higher than normal but i wondered what the hell was going on. Anyway we still pwn some dragons in BWL :)
unzip; strip; touch; finger; mount; fsck; more; yes; unmount; sleep
No wonder I couldn't get through to /. through the rss.
I actually had to go directly to the front page for the second time in my life.
I haven't had any problems with 'net access today.
My web based email has been fine, the usual news sites I access have been available, wikipedia and dictionary.com were both around, a couple of message boards and special interest sites have responded ok.
Email hasn't been disrupted for me either. I haven't tried anything other than email and http, but I use other protocols rather less.
Maybe some people in America are having problems. Welcome to the world, it's great out here.
Even though this has nothing to do with the DNS debate with ICANN and the EU, the media will make it look like the US screwed things up and the internet is down because the UN isn't controlling things.
Yep the BSD trolls about BSD is dying are coming true, it is obvious that the ISPs were using BSD, and therefore the internet is dying.
Screenshot of My G5 desktop!
The Internet Health Report cited in TFA shows all green now. It looks like whatever problem they had is solved.
I prefer carrier pigeons.
Noticed this this morning when a customer called, upset about his hosting services being unreachable. A quick traceroute showed one of Level3's IPs to be down. A few minutes later more customers had problems with different routers from Level3. As soon as I saw Level3 I knew enough, shrugged it off and told the customer that it was a routing problem we couldn't fix, but those responsible were most likely already trying to fix it.
It seems fixed now though, so no, this isn't the death of the Internet just yet.
Has Netcraft confirmed this?
You see? You see? Your stupid minds! Stupid! Stupid!
The reports of my death have been greatly exaggerated.
Tier One
Cake or Death? Cake Please!
I'm not sure which amazes me more: that the only people you know are the ones you play WoW with or that "pwn" has become some kind of short-hand for "0wn3d".
Wasn't the internet envisaged so that this specifically could NOT happen?
--- Stop the world! I want to get off!
Hey look, we slashdotted Level 3!
I was up late studying for a German exam, and I was having problems connecting to websites hosted in Germany that I was using to help myself review (dict.leo.org and canoo.net, if you're curious). US websites worked no problem.
Off to the test!
I was up all night last night doing some research and I wasn't able to get to Google or Yahoo at around midnight or so. I ended up browsing some pr0n sites and most were working fine. There was one good set of pictures in particular, but I wasn't able to retrieve the full size images (only the thumbnails).
It had to. So many game servers were being immature, and letting people say "Die Fucker", and "Eat Shit & Die", but if you say "owned", "0wned", you got booted...
Stupid, really.
While this only lasted a few hours, it still caused a mess across the North American Internet during those hours. The point is that a small number of big networks are responsible for over 90% of the traffic on the Internet. If alter.net went down it would be total chaos. If just one of the major peering points went down, sure, the traffic would be rerouted, but it would overload the other points, with latency so high that the net would be almost unusable. You better hope no one destroys MAE-EAST or we'll have a live example of what life without the Internet is like.
pwn is what the kids say these days,
To them own3d is something your dad would say.
Get with the times, daddio =)
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
"At about 23:30PST"
What does that mean in terms of something human-readable like GMT?
I notice the article links back to Slashdot... I wonder if Slashdot is going to get BoingBoing'ed?
Because you can't spell "slaughter" without "laughter"
http://dictionary.reference.com/search?q=Domesday
I wonder, can this outage affect bank ATMs, and if it did, who would be held responsible for people not being able to get their money and all the problems originating from it?
Is this even an issue? I mean, this was probably scheduled maintenance that went a little longer than expected. I have been through this before. It just sounds like Level 3 dropped some core routers for a few minutes to do a code upgrade - it didn't work so hot, so they were down for a few more minutes, OSPF/BGP decided to tell all the clients that they have no routes, Level 3 gets the routers back up, OSPF/BGP tells everyone that they're fine again. Was this like 6 hours, or 45 min?
You create your own reality - Leave mine to me.
Is this the first time this has happened? Is it too early to start talking about re-thinking the way this is put together?
useless sig advice - Read Nabokov.
Now it can't even survive a software upgrade on some of the routers!
Why couldn't this have happened during my business day? For just once when a user calls and asks "is the internet down?" I'd like to be able to say "actually, yes, it is."
this sig deleted by another sig
This should be a wakeup call to keep moving ahead with creating a more distributed and resilient Internet infrastructure. The first step in that direction is wireless neighborhood mesh networks.
People have been able to say something like that at every point in history. And I'd hardly call this nastier than hurricanes, and the Tsunami was worse than either them or this. The sky is not falling.
I am trolling
Insightful? I certainly hope comparing a short Internet outage to large disasters is a joke..
One of the lessons of history is that nothing is often a good thing to do and always a clever thing to say. - Will Duran
Seems to have cleared up. http://scoreboard.keynote.com/scoreboard/Main.aspx?Destination=Verio
*--- Sometimes a majority only means that all the fools are on the same side. ---*
Good point you've got there. But I personally think there's more chance of an "electronic" attack than a nuclear one. Imagine if this could be triggered on a very large scale. The economic and practical damage would be enormous.
"lets go and kill some mobs" doesn't sound right... but I couldn't bring myself to say pwn, so I decided to just say prawn. much nicer.
Domesday is something like a census of Britain circa 1085. It has nothing to do with internet outages, which is more akin to doomsday.
Toronto-area transit rider? Rate your ride.
Is that you, Mozes?
This sort of event provides motivation for overlay routing schemes, which can compensate for major outages along various routes of the backbone:
http://www.usenix.org/events/nsdi04/tech/full_papers/subramanianOver/subramanianOver.pdf
http://www.eecs.umich.edu/~farnam/pubs/2005-hwj-infocom.pdf
An unjust law is no law at all. - St. Augustine
But perhaps what's really meant is:
23:30 PDT = 06:30 UTC = 08:30 CEST ?
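For anyone who wants to double-check that conversion, here is a minimal Python sketch using fixed UTC offsets (PST = UTC-8, PDT = UTC-7, CEST = UTC+2). The calendar date is only a placeholder, since the story does not give one, and the `convert` helper is a made-up name for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, so no timezone database is needed.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")
CEST = timezone(timedelta(hours=2), "CEST")


def convert(hour, minute, src, dst):
    """Return HH:MM in zone dst for a wall-clock time given in zone src."""
    # Placeholder date: only the time of day matters for this comparison.
    local = datetime(2000, 1, 1, hour, minute, tzinfo=src)
    return local.astimezone(dst).strftime("%H:%M")


print("23:30 PDT in UTC: ", convert(23, 30, PDT, timezone.utc))
print("23:30 PDT in CEST:", convert(23, 30, PDT, CEST))
print("23:30 PST in UTC: ", convert(23, 30, PST, timezone.utc))
```

If the summary really meant PST rather than PDT, everything shifts one hour later.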
Hmmm ... was wondering WTF Slashdot wasn't working today. My route to slashdot goes through Level3.
More like the CIA is upgrading their equipment.
Chicken little just wrote another news article I think.
http://scoreboard.keynote.com/scoreboard/Main.aspx?xAxis=Destination&yAxis=Origin&zAxis=Metric&nAxis=Period
There is no outage anywhere.
...and not one Netcraft joke?
Way back in the day when I was a Network Controller at BBN Planet, if we began to have cascading routing outages we'd call it "Flapping"... Visualize a wounded bird squirming around on the ground flapping...
Takes me back... My first night on the job a rat in Berkeley chewed through the wrong cable and got himself fried -- he also happened to take the entire west-coast off the internet for the better part of a day.
Then there was the time an electrical worker got vaporized in a hole near MIT which caused quite a problem too as it overloaded the MIT power station, but the fallout wasn't nearly as bad as the day of the rat...
Be who you are and say what you feel, because those who mind don't matter and those who matter don't mind. - Dr. Seuss
This joke is dead.
It's amazing what these crypto experts can come up with to stay ahead of the game. What's next -- qwned?
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
Imminent death of the net predicted!
Well we seem to know why Level3 went down, but why did Verio go down at the same time?
happened in Detroit in the last 24 hours. Apparently all ingoing/outgoing traffic to other Tier One ISPs had problems in that city. Also, Philadelphia had really slow traffic within Level3 (and slower to all the others), and had major problems connecting to Verio. San Diego also had some problems, especially within the Level3 network. St. Louis was the only area without major problems...
For a breakdown, check out this view of the data.
This sig donated to Pater. Long live
If you look at the image they have posted on the link there are two clues that the image is a bad fake.
1) The red in the Verio column is not reproduced across the Verio row.
2) The response times listed for "ALL" of them, including the Level3 and Verio links are all below 180ms which is the cut off for "Red" alerts on the Internet Health Report.
Yes, that's exactly what this is. You better curl up into the fetal position in the corner and start crying.
Maybe I won't be able to get some work done today ;^)
(/me has an entirely Internet-based job)
It's better to vote for what you want and not get it than to vote for what you don't want and get it.
- E. Debs
Is there any way we can blame Microsoft for this?
Were they upgrading to one of the Beta builds of Windows Vista Home Edition?
I'm not back online
I think what God is trying to say is: "Sorry for the inconvenience."
oooohhh I like that!
*patents*
English is dying.
Can someone who has data on what the perceived "Verio" outage is let me know? Obviously Level(3) customers would have issues reaching Verio and Verio would have trouble reaching them, but reviewing one of our external monitoring systems I only see 3 events and only one that is not customer related. So unless you're in that isolated corner of the network in Europe...
They would... except that they're not accessible..
Level 3 went down at 22:42 pst and was available around 23:50 pst. Verio started having problems right around the same time that Level 3 was coming back up. The Internet Health Report from Keynote showed me what was going on, scary that it was.
No Not Again! Its whats for dinner.
I don't know that they've replaced Sprint yet on my list of most sucktastic internet companies. Time was you lost connectivity to an important piece of the Internet (Like your favorite Quake TeamFortress server) and a traceroute would show the failure somewhere in the Sprint backbone. So far they've been more reliable than Sprint at their worst, at least for me.
If they go under, well Tier 1's don't ever really die. Chances are one of the other Tier 1's will buy their assets and it'll be business as usual. Usually the buyer is MCI.
Of course the true test is pretty easy -- has anyone who works at Level 3 had their paycheck bounce yet? Surely there are a few readers among their employees...
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
I can't tell you how many times I cursed my Windose box yesterday trying to figure this one out
- Turn to the West
- Bow humbly
- Beg forgiveness
No, of course not, you blithering imbecile. L3 had a 2 hour global routing meltdown. Now, it's fixed. Whilst their routes were flapping, other carriers saw transient increases in latency and some problems with reachability, to some sites. However, everything continued to work properly for non-L3 customers. Two hours later L3's routes are back and working properly. End of story, nothing to see here, move along please.
Slashdot editors, do you really expect us to believe that no-one had submitted a more coherent or accurate story than this one? Come on, for heaven's sake.
Anyway, a network engineer's view can be seen in the overnight traffic on NANOG: http://www.merit.edu/mail.archives/nanog/2005-10/ "Tier One ISPs dying" indeed. Worst. Story. EVER.
"None are more hopelessly enslaved than those who falsely believe they are free." -- Goethe
Glad to see that Tier 1 ISPs are joining the ranks of BSD and Apple.
The Anti-Blog
yeah
AccountKiller
Good way of testing your ISP speed: 380k video music feeds: http://www.interactivehuman.com/
....or is it related?
http://www.internettrafficreport.com/asia.htm is another place to check internet traffic - it's a different metric, but it's still useful. Taiwan and India seem to be the only places doing okay.
AccountKiller
I could easily fix this problem. I would just restore it from the Recycle Bin.
Slashdotting the entire internet!
So what , was this some sort of Divine Slashdot effect ...
The only things certain in war are Propaganda and Death. You can never be sure which is which though
I'm sure you know this, but for the rest: "flapping" is the common term for when a router's routing tables rapidly cycle between two invalid states. The dead bird analogy is pretty descriptive, but the term "flapping" has technical and not allegorical origins.
Dewey, what part of this looks like authorities should be involved?
(finishes stuffing last 'Pizza Pop' in gob) (unplugs microwave and removes from telco rack) Thorry.(spitting crumbs)
with your terrible spelling
These are the first salvos of the EU's attack on the Internet, since the US wouldn't give in to their demands.
Good job guys. Now that incredibly fragile IP protocol is completely screwed, along with any chances of my getting onto match.com tonight.
I was playing online last night and it was like the internet had a stroke. WoW went down but I was still on voice chat with my friends. My trace routes were timing out at at about 5 hops down the line.
The good news is, I got more sleep last night.
I got the DTs (delirium telnets) from not getting my fix last night...
Languages change. One of the areas most subject to change is vowels. If people started spelling "confirm" with an E and "conferm" became the new standard, it wouldn't be the first time such a change has occurred. Just compare Old and Middle English to Modern English, or even the orthography in Shakespeare's day or even as recently as the 1800s, and you will see a lot of things spelled differently. Or compare a modern Romance language with Vulgar Latin and then with Classical Latin. These things happen.
Language is defined by usage, not some Slashdotter pretending to be an authority. Orthography and prescribed grammar are both quite arbitrary and change from century to century.
The scary thing is it makes you wonder if some terrorist who has intimate knowledge of how Tier 1 ISPs work is doing a trial run in the middle of the night by knocking out the Level 3 and Verio backbones, so that later they could try to knock out ALL the backbones in a co-ordinated terrorist attack.
It doesn't make me wonder that. Terrorists do not give a shit about this kind of thing. To even invoke the word "terror" in this discussion is ludicrous.
- A.P.
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
We used to call it "tripping over the cable".
Usage: "Outbound and inbound are both down. Bob tripped over the cable. He'll be in the hospital for awhile... We'll be back up once I find the crimper.."
Also, there is "dish tossing".
Usage: "There were some outages earlier, some kids were tossing stuff at the feedhorn on the dish and a foil lined box stuck."
Oh, so THAT's why my daily spam load suddenly dropped by about 35% or so...
Bruce Lane, KC7GR,
Blue Feather Technologies
It isn't just a coincidence that this outage happened on the same day as a gay pride parade somewhere. Clearly, our Lord Jesus hates these sinners and is exacting his revenge.
[a modest proposal: google for "satire" before modding down]
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
more like a grand version of the time the boss "organized" the cables on the routers so they line up nicely
Snowden and Manning are heroes.
"Reality check: An internet outage, no matter how big, is no different than a power outage. Yeah, here in the US we would be talking about loss of power to both coasts with only the middle left running. But after the outage life goes on"
So much like the NYC blackout of 1965. There should be a rise in the birth rate accompanying this outage?
They are doing pretty well, not amazingly so (what telco is?) but they have a lot of cash and a stable recurring revenue base. They also have a pretty good outlook because they are one of the few companies not caught with their pants down when the FCC mandated E911 support - which a lot of people are coming to Level 3 for. If you think VOIP has a future then so does Level 3. The market thinks so; regardless of your outlook the stock has been up quite a bit recently.
To call them a "dot bomb" is really unfair since they were far more financially prudent during the timeframe, which is why they are still around at all in the dark forest of discarded Telco husks.
Disclaimer, I work for Level 3. But on the other hand doesn't that mean that I know more than most people about the real situation here?
I have had my paycheck bounce at companies I've worked for in the past and been told I'd have to wait an extra month or two for pay at said companies (you know the kind, six employees and the owner's mom uses the company AMEX for trips to DisneyWorld while you wait weeks more to get paid). Level 3 is a few billion dollars away from that sad state.
And don't accuse me of drinking Kool-aid either - after going through a lot of layoffs over the years you have a VERY realistic outlook on what the company does well and what it does not.
yeah, i never liked mobs
I bet most players don't even know it means "Monster OBjectS" !
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
It's a face? I never looked at it that close, because my parents told me it would hurt my eyes. Next time I'm outside, I'll make sure to stare directly at it until I can make out the face!
Let me try again:
Your dying bird analogy is OK, because that's pretty much how it acts. But they say a router is "flapping" because its routing tables are flailing about aimlessly like maps to the Superdome.
So, it describes what's actually happening, not how it compares to something your cat's trying to eat.
Dewey, what part of this looks like authorities should be involved?
So everybody with an alternate path around Level 3 should have routed around them properly. And yet, they weren't routed around. That's a concern.
If you went down because of this outage, your provider is totally dependent on Level 3, which is not good. This is a useful warning - if you went down, and your operation is important enough that it needs to stay up, you need to look very hard at your provider's upstream connectivity. Better hosting services have connections to several Tier I providers, just in case something like this happens.
Seems a fiber ISP in Palo Alto says it began in Germany. They have Deutsche Telekom with problems before Level 3. A long link to the NOC updates is at the top of their homepage today.
Don't you get it? They deleted the internet!!
Good thing I downloaded it last night.
Disconnect your television. Do your own research. Draw your own conclusions. They're probably lying. Don't be a sheep.
I was looking for a Linux Virtual Host, blah, blah.
Stumbled upon these pretty pictures (near bottom of page).
Curious, I thought, what happened to Level(3)? I thought for a second it was perhaps because unixshell had a peering with those people that Level(3) were in dispute with.
Nope, just one of those regular outages that make the 99.999% promises sound a little over done.
You think "at least" is one word, but "nowadays" is two? Jesus Harold McChrist! Fucking illiterates. I blame AOL.
My pager went off at 1:48am EDT. Was able to get to my boxes from my Verizon connection, but couldn't get to other hosts via L3. Put in a ticket at 2:15am that L3 was having problems. Stupid L3 sucking woke me up. Grrrr.
TossableDigits.com: Temporary Phone Numb
No you idiot, it's because we block you on sight.
Karma: Meh (Mostly from meh.)
It's spelled "Hear! Hear!"
Nonononono... he was clearly replying to the parent's question. Please everyone, send your bill to SRoberts7758.
"And now, to whom from Level3 do I send my bill?"
HERE HERE!!!
94% of Repubs and 21% of Dems voted to renew the Patriot Act
Mobile OBjects, not Monster OBjects
retrorocket.o not found, launch anyway?
Let's go kill some MOBs sounds great to me. Unfortunately the people I MMO with often don't understand what a MOB is. Damn you Graphical games for changing terms I've been using for years!
That which is done from love exists beyond good and evil
i don't know about you but not being able to connect to google last night scared the shit out of me. slashdot up, but google down? aaaaaah the madness
I know what it was! It was the Europeans trying to wrest control of the Internet! They sabotaged us so they can point fingers, "See! See! They don't know what they're doing!"
Or the global mesh of cooperating, overlapping wireless footprints?
Things are better in both the past and the future.
Last night my connection to work was screwed up too. Our ISP must have been affected by this outage, even though the connection to work shouldn't be affected by this type of outage, since it should stay within a Tier two network or below.
In my previous job a couple of years ago we had Level3 and several other Tier ones installed at our location, and we had no issues with any of the companies sharing the same area in our data center. Level3 had some excellent technicians and engineers, so I was very surprised that Level3 could have caused this. But I assume the management of each of these Tier ones needs to show better numbers for the quarter-end stock price, so they are screwing all of us in the process.
cd /Internet; rm -rf Level3
PHP Developer Virginia this sig sold out!
No one said it would react *quickly* to a nuclear war. It took about two hours for routing to go around Level3 rather than through it.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
I find it ironic that this was the quote at the bottom of the page when I loaded this article:
"The computer is to the information industry roughly what the central power station is to the electrical industry. -- Peter Drucker"
Obviously this has been proven false. A Tier One provider is more analogous to a central power station - if it goes out, the internet dies.
rm -r http:\\*
Dumbass
Click here to watch it. A group of people going crazy during an Internet outage. Perfect timing. :)
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
Maybe their bandwidth is really just saturated by the spammers they harbor.
Actually, sabotaging the railroads would cause more than just late commuters. Same with roads and tunnels. Hazardous chemicals are carried on both. Just imagine the damage if a train carrying chlorine or ammonia derailed due to a bomb in the middle of a major city. As 9/11 taught us, using the enemy's equipment against them, while using little of your own, is best.
Does it make you happy you're so strange?
Now Cogent appears down too.
Now Cogent is having a completely random and unrelated failure. What a total coincidence.
TAGGE: "And what of the Rebellion? If the Rebels have obtained a complete technical readout of this station, it is possible, however unlikely, that they might find a weakness and exploit it."
:-)
You know what happened less than two hours later
Loss of the internet would have a very large financial effect, more than enough to be a juicy target for those that feed on hate.
No matter how subtle the wizard, a knife between the shoulder blades will seriously cramp his style.
I'd rather get an X-ray than have brain surgery.
Some other things people get worked up about but terrorists are unlikely to attempt: sabotaging bridges and tunnels to cause traffic jams; sabotaging electricity distribution to cause blackouts; sabotaging railroad tracks, making commuters late for work!
You have to assume though that any major landmark could be a target. For example, I could see the Golden Gate being a potential target, along with the Space Needle in Seattle, a few other bridges, etc. Also I can imagine that tunnels could be attacked with poison gas or anthrax. Even though the likelihood of damage from such an attack would be low, the fact that it happened would be terrifying to a lot of people. I.e. it is not enough to think only of death; attacks on one's security need not take that form. They can instead be attacks on familiar landmarks and on a general sense of safety.
Now, a large-scale cyber attack seems to me to be economically damaging enough that someone who wished our country harm might try it but it lacks the propaganda capabilities that traditional terrorist attacks have. For example, if Bin Laden destroyed the Golden Gate Bridge, he might be able to drum up some additional support, but if he claims to have disrupted the internet, most of his supporters will probably respond with blank stares. North Korea, OTOH, would have a much more likely motive....
LedgerSMB: Open source Accounting/ERP
YOU confirm the netcraft joke is dead!
The dog ate my router!
So God knows IOS?! I'm starting to believe in Him again.
Used to do contract work at an auto company's plant. The main data center's primary job was to feed test programs to a distributor testing line and collect the stats. It was located in the middle of the plant on the second floor, next to the row of test stands.
Some time after my contract had ended I visited the place and it was a total disaster.
During the model change shutdown (when most of the plant maintenance and rearrangement was done) the millwrights were welding on some cableways on the ceiling of the plant floor below. The fumes from the welding, of course, rose to the ceiling and escaped through the first hole they could find - around the big fire sprinkler pipe that went up through the floor of the computer room and into the space beneath the raised floor.
It tripped one ionization smoke alarm and sounded the warning - but nobody was around during the shutdown to hear it. Shortly thereafter it tripped a second one and the halon system went off. The computer power shut down and $10,000 worth of halon blasted into the computer room. Half of it came out through vents under the floor, throwing the raised floor panels and a decade's accumulation of fine dust (much of it byproducts of metal cutting and annealing) all over the room. And finally sounding an alarm at the guard shack.
The guards came over and found the room in disarray but not the slightest sign of a fire. A couple million bucks worth of computer equipment, slated for replacement in another few months but still critical to the plant's operation, was standing there, covered with dust (likely to cause trouble for the disk drives later) but otherwise intact. So they followed procedure and reset the halon system, switching to the backup cylinder, to protect the computer in case an actual fire made it to the comp room. (Normally that's a good idea, since smouldering that sets off smoke detectors is often followed some time later by an actual fire.)
Of course the welding was still going on - just not at the moment the guard sniffed the comp room. (Welders out to lunch, pulled out due to the alarm, or having decided to come down off the ceiling for a bit after the blast of gas from above.) And they still had work to do. So of course they went back to it.
In less than an hour the situation repeated, dumping the SECOND $10,000 worth of halon on the non-fire. B-(
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
"flapping" is the common term for when a router's routing tables rapidly cycle between two invalid states.
Also known as "route flapping". Also applied when the routes are valid, but a router is still alternately advertising a pair of routes because it keeps changing its mind about which is better.
This, by the way, is partly the result of an internet standard that isn't sufficiently prescriptive. The BGP protocol itself is well defined, but its implementation is left open. Unfortunately, it prescribes a system that has gain, delay, and negative feedback. So a naive implementation leads to oscillation when deployed, and something must be hacked to stop it.
Getting it to stop is a black art of coding workarounds - both to keep yourself from oscillating and to react quickly and appropriately (rather than blindly doing a full recomputation of your routing tables with each received flap) when somebody else starts flapping, so your packets keep going through despite the bad net weather.
If you get something stable among your own machines, you still have the issue of whether it stays stable when they're talking to somebody else's, or to an earlier version of your own. Avoiding breaking your own earlier stuff in a new release is difficult. Especially so since you don't necessarily know WHY your implementation is working (if nothing else, because you don't know what some of the other implementations it's successfully conversing with are up to). And testing before release is very hard, because you can easily get something that works just fine in all your lab test cases but breaks when deployed on the real net.
Since deploying a broken BGP implementation breaks, not just the routers it's deployed on, but large sections of the rest of the net, backbone providers and ISPs are very leery about buying routers from somebody without a proven BGP implementation. This creates a chicken-and-egg problem for companies trying to break into the router business: You can't get customers without a proven implementation, and you can't prove an implementation without customers.
Last I heard (by word of mouth, a couple of years ago) there were only two independently-developed implementations of BGP that had achieved this level of confidence.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Netcraft confirms it. Tier 1 ISPs are dying.
* The IDC confirms once again that market share among Tier 1 ISPs has fallen again.
etc....
I thought about writing a long wandering spoof but don't have the time...
LedgerSMB: Open source Accounting/ERP
Used to be that people would predict "The End of Usenet As We Know It, GIFs at 11", but technology has progressed a lot since then...
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
The MAEs don't really matter much any more, at least for Tier 1 peering, which has almost all moved off of public peering and into private peering. A lot of that private peering takes place in carrier hotels or telco POPs - Equinix has 7-8 big locations, Seattle's Westin building, a bunch of different buildings within a block or two of the main Los Angeles telco POP, and a few others. Some private peering also happens on fibers run between carrier offices.
Most of the Tier 1 providers have lots of excess bandwidth - if the DC area peering were to fail, most of them would have enough spare peering capacity in New York or Atlanta or Chicago to recover without major capacity losses, and BGP would reroute most of the rest reasonably well. The West Coast is in a bit worse shape, just because the distances are longer - SF-LA is only ~350 miles (~3.5ms one-way), but SF-Seattle is a lot farther, and isn't quite as overbuilt, and with many carriers, if you lose the direct route, you take a 2,000-3,000 mile loop through Salt Lake City to recover (unless you've got two central California routes, on I-5 and 101 or railroads).
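To put rough numbers on the distance argument, here is a back-of-the-envelope sketch of one-way propagation delay in fiber. The refractive index of 1.47 is an assumed typical value and the function name is made up for illustration. It only counts propagation over the stated mileage, so real fiber routes (which are longer than the map distance and pass through equipment) will come out somewhat higher.

```python
MILES_TO_KM = 1.609344
LIGHT_SPEED_KM_PER_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47  # assumed typical value for glass fiber


def one_way_delay_ms(route_miles):
    """Propagation delay in milliseconds over a fiber route of the given length."""
    route_km = route_miles * MILES_TO_KM
    speed_in_fiber = LIGHT_SPEED_KM_PER_S / FIBER_REFRACTIVE_INDEX
    return route_km / speed_in_fiber * 1000.0


print(round(one_way_delay_ms(350), 1))    # SF-LA map distance
print(round(one_way_delay_ms(2500), 1))   # a loop in the 2,000-3,000 mile range
```

The map distance for SF-LA works out to a bit under 3 ms, which is in the same ballpark as the ~3.5ms above once you allow for the fiber not running in a straight line. The detour is closer to 20 ms each way.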
European peering architectures have developed much differently - the geography's different, and the carrier relationships were different, so huge fractions of that traffic go through LINX and a few other points like AMS-IX, and losing LINX would be seriously bad.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
If you've got a given prefix, and people are exporting bogus reachability information about it, start advertising two prefixes of half the size. I know one large ISP with a /8 who had to start publishing two /9s because some bozo outfit was doing incorrect route summarization and claiming that their little circuit in South America had a really great route to that /8. It's a cheap trick, and you shouldn't leave it up too long if you can avoid it, but works really well when you need it.
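The trick works because routers prefer the longest matching prefix. A small sketch using Python's standard ipaddress module shows the effect. The 10.0.0.0/8 block and the "bogus"/"legit" labels are placeholders I picked for illustration, not the actual ISP's addresses, and the helper names are hypothetical.

```python
import ipaddress


def split_in_half(prefix):
    """Return the two half-size prefixes covering the same address space."""
    return list(ipaddress.ip_network(prefix).subnets(prefixlen_diff=1))


def best_route(routes, destination):
    """Pick the (network, origin) pair with the longest prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, origin in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net, origin))
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)


# Somebody else is wrongly announcing the whole /8.
routes = [("10.0.0.0/8", "bogus")]
print(best_route(routes, "10.200.1.1"))

# The real owner starts announcing the two /9 halves.
for half in split_in_half("10.0.0.0/8"):
    routes.append((str(half), "legit"))
print(best_route(routes, "10.200.1.1"))
```

Before the split the only match is the bogus /8. Afterwards the /9 from the real owner wins for every address in the block, even though the bad announcement is still out there.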
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Most of the time, ISPs can protect themselves against their neighbors failing, and most of the time they do. A few years back, some random company advertised that their T1 was the best way for half the world to reach MAE-EAST (or some target of that size), and suddenly half the world's internet traffic was trying to get down that wire before it melted, making it difficult for any equipment nearby it to even scream for help. Lots of ISPs started doing a lot more BGP filtering after that, and developing methods to monitor the advertisements the outside world was seeing about them, and things got safer, but it's still possible to screw up.
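As a rough idea of what that kind of filtering amounts to: before accepting an announcement from a neighbor, check it against the list of prefixes that neighbor is supposed to be originating. The sketch below is IPv4-only, and the function name, the /24 length cap and the documentation-range prefixes are all assumptions for illustration rather than anybody's actual policy.

```python
import ipaddress


def accept_announcement(allowed_prefixes, announced, max_prefix_len=24):
    """Accept a route only if it falls inside a prefix the neighbor is allowed to send."""
    route = ipaddress.ip_network(announced)
    if route.prefixlen > max_prefix_len:
        # Too specific: refuse tiny fragments (assumed policy).
        return False
    for allowed in allowed_prefixes:
        parent = ipaddress.ip_network(allowed)
        if route.version == parent.version and route.subnet_of(parent):
            return True
    return False


customer = ["198.51.100.0/24"]
print(accept_announcement(customer, "198.51.100.0/24"))  # their own block
print(accept_announcement(customer, "10.0.0.0/8"))       # somebody else's space
```

A neighbor with one small block registered can then announce that block and nothing else, so a fat-fingered "I have a great route to half the world" gets dropped at the edge instead of propagated.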
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Their direct customers would obviously be affected, so if you've only got one ISP connection and they're down, you're out of service - if losing connectivity in the middle of the night is a problem, you need to arrange for diversity, but during the daytime you're more likely to have a backhoe take out the wires on your street than have your ISP down for more than an hour or two.
L3's a big wholesale provider, so if they're down, it can affect people who didn't know they were using their services; maybe they're using a small ISP that buys half its bandwidth as transit from L3, or maybe their ISP is using L3 to reach specific areas where they don't have geographical coverage or provide specific types of service. So the outage may feel a bit more widespread than it really was, but it's still the middle of the night. The recent L3-Cogent fracas was a bit more visible because they handle different kinds of customers - L3 provides bandwidth to lots of small ISPs with consumer end-users, while Cogent provides big cheap pipes to lots of hosting businesses, so the interference was fairly synergistic, and it lasted a lot longer because it was a Layer 8 / Layer 9 business disagreement, not a Layer 3 technical problem that can be fixed by engineers.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Seems like a good time for Google to acquire the infrastructure of Level 3, drop some more equipment in, use Google's dark fiber ack and pwn the net.
Cheers!
ObT: microfost being a devision of google. Much like a retarted sib.
Oh, I dunno. I think it says a lot about the Slashdot community.
(Yes I despise the phrase "doesn't say much".)
What went wrong in Level3: http://www.merit.edu/mail.archives/nanog/msg13166.html
- Just because we CAN do a thing, does not mean we SHOULD do that thing.
Well, a week later - and the history tab shows that no modifications were done except for the addition of the slashdot warning.