Tier One ISPs Dying
xbmodder writes "Two tier one ISPs are down today. At about 23:30 PST both Verio and Level 3 started having problems with routes. According to Level 3 this is a software upgrade gone awry. Is this the end for Level 3?" Many, many reports about this are coming in, and if you're wondering why the stories were rather sparse overnight, it's because it's difficult to post them without internet access. Hope everyone else is back online too.
Maybe I'll get some work done today for a change.
-S
--- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
Is there a term for this kind of intermittent site inaccessibility due to Internet outage -- not the user or the server being offline, but the Internet failing?
That what was all this school was for... to teach us how to solve our own problems. -- janeowit
It's nice to see something explaining why I was paged at 2:30am. And now, to whom from Level3 do I send my bill?
But what is a tier 1 ISP?
Is that like a bandwidth wholesaler or something?
This is the sig that says NI (again)
An ISP's server being down for a day is unacceptable, of course, but to say it is dying already? Or is there more to these ISPs? (I haven't heard of them before.)
Bottles Of Beer On The Wall - Advertising Fun! Get your bottle of beer on the wall today!
See http://scoreboard.keynote.com/scoreboard/Main.aspx?xAxis=Destination&yAxis=Origin&zAxis=Metric&nAxis=Period/
Nico M, London, GB.
Take a look at the scoreboard now. The mentioned problems are gone and Level 3 is no longer in the red.
Why would a software upgrade going wrong be the end of a gigantic Tier-1 ISP?
http://scoreboard.keynote.com/scoreboard/Main.aspx?Login=Y&Username=public&Password=public [coralized]
Seems like a non-event.
What are the odds that some idiot will name his mutex ether-rot-mutex!
Today I was playing World of Warcraft, and on our raid about 25% of my guildmates lost their internet on and off. Other than that, the lag was higher than normal, and I wondered what the hell was going on. Anyway, we still pwn some dragons in BWL :)
unzip; strip; touch; finger; mount; fsck; more; yes; unmount; sleep
No wonder I couldn't get through to /. through the rss.
I actually had to go directly to the front page for the second time in my life.
The Internet Health Report cited in TFA shows all green now. It looks like whatever problem they had is solved.
I prefer carrier pigeons.
Noticed this this morning when a customer called, upset about his hosting services being unreachable. A quick traceroute showed one of Level3's IPs to be down. A few minutes later more customers had problems with different routers from Level3. As soon as I saw Level3 I knew enough, shrugged it off and told the customer that it was a routing problem we couldn't fix, but that those responsible were most likely already trying to fix it.
It seems fixed now though, so no, this isn't the death of the Internet just yet.
Has Netcraft confirmed this?
You see? You see? Your stupid minds! Stupid! Stupid!
The reports of my death have been greatly exaggerated.
Tier One
Cake or Death? Cake Please!
I'm not sure which amazes me more: that the only people you know are the ones you play WoW with or that "pwn" has become some kind of short-hand for "0wn3d".
Wasn't the internet envisaged so that this specifically could NOT happen?
--- Stop the world! I want to get off!
Hey look, we slashdotted Level 3!
I was up late studying for a German exam, and I was having problems connecting to websites hosted in Germany that I was using to help myself review (dict.leo.org and canoo.net, if you're curious). US websites worked no problem.
Off to the test!
It had to. So many game servers were being immature, and letting people say "Die Fucker", and "Eat Shit & Die", but if you say "owned", "0wned", you got booted...
Stupid, really.
While this only lasted a few hours, it still caused a mess across the North American Internet during those hours. The point is that a small number of big networks are responsible for over 90% of the traffic on the Internet. If alter.net went down it would be total chaos. If just one of the major peering points went down, sure, the traffic would be rerouted, but it would overload the other points and push latency so high that the net would be almost unusable. You better hope no one destroys MAE-EAST or we'll have a live example of what life without the Internet is like.
pwn is what the kids say these days,
To them own3d is something your dad would say.
Get with the times, daddio =)
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
I notice the article links back to Slashdot... I wonder if Slashdot is going to get BoingBoing'ed?
Because you can't spell "slaughter" without "laughter"
GMT -8:00
Religion and politics, without the flame. godgab.org
http://dictionary.reference.com/search?q=Domesday
I wonder, can this outage affect Banks ATMs and if it would, who would be held responsible for people not being able to get their money and all the problems originating from it?
Is this even an issue? I mean, this was probably scheduled maintenance that went a little longer than expected. I have been through this before. It just sounds like Level 3 dropped some core routers for a few minutes to do a code upgrade - it didn't work so hot, so they were down for a few more minutes, OSPF/BGP decided to tell all the clients that they have no routes, Level 3 gets the routers back up, OSPF/BGP tells everyone that they're fine again. Was this like 6 hours, or 45 min?
You create your own reality - Leave mine to me.
Is this the first time this has happened? Is it too early to start talking about re-thinking the way this is put together?
useless sig advice - Read Nabokov.
Now it can't even survive a software upgrade on some of the routers!
Why couldn't this have happened during my business day? For just once when a user calls and asks "is the internet down?" I'd like to be able to say "actually, yes, it is."
this sig deleted by another sig
This should be a wakeup call to keep moving ahead with creating a more distributed and resilient Internet infrastructure. The first step in that direction are wireless neighborhood mesh networks.
People have been able to say something like that at every point in history. And I'd hardly call this nastier than hurricanes, and the Tsunami was worse than either them or this. The sky is not falling.
I am trolling
Insightful? I certainly hope comparing a short Internet outage to large disasters is a joke..
One of the lessons of history is that nothing is often a good thing to do and always a clever thing to say. - Will Duran
Seems to have cleared up. http://scoreboard.keynote.com/scoreboard/Main.aspx?Destination=Verio
*--- Sometimes a majority only means that all the fools are on the same side. ---*
I was thinking this was the "or else" that the EU issued the US.
There's no emoticon for what I'm feeling! -- CBG, "The Computer Wore Menace Shoes"
"lets go and kill some mobs" doesn't sound right... but I couldn't bring myself to say pwn, so I decided to just say prawn. much nicer.
Domesday is something like a census of Britain circa 1085. It has nothing to do with internet outages, which is more akin to doomsday.
Toronto-area transit rider? Rate your ride.
This sort of event provides motivation for overlay routing schemes, which can compensate for major outages along various routes of the backbone:
http://www.usenix.org/events/nsdi04/tech/full_papers/subramanianOver/subramanianOver.pdf
http://www.eecs.umich.edu/~farnam/pubs/2005-hwj-infocom.pdf
An unjust law is no law at all. - St. Augustine
But perhaps what's really meant is:
23:30 PDT = 06:30 UTC = 08:30 CEST ?
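The conversion above can be checked with a short sketch. It assumes fixed offsets (PDT as UTC-7, PST as UTC-8, CEST as UTC+2), and the calendar date is only a placeholder, since the thread gives clock times but no date.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, so no timezone database is needed.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")
CEST = timezone(timedelta(hours=2), "CEST")

def convert(hour, minute, source_tz):
    """Return (UTC, CEST) datetimes for a clock time in source_tz.

    The date 2005-10-20 is a placeholder assumption; only the clock
    time comes from the discussion.
    """
    local = datetime(2005, 10, 20, hour, minute, tzinfo=source_tz)
    return local.astimezone(timezone.utc), local.astimezone(CEST)

pdt_utc, pdt_cest = convert(23, 30, PDT)
pst_utc, pst_cest = convert(23, 30, PST)

print("23:30 PDT =", pdt_utc.strftime("%H:%M UTC"), "=", pdt_cest.strftime("%H:%M CEST"))
print("23:30 PST =", pst_utc.strftime("%H:%M UTC"), "=", pst_cest.strftime("%H:%M CEST"))
```

If the submitter really meant PST rather than PDT, everything shifts an hour later, which is closer to the "07:30 UTC or so" another poster reports.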
Hmmm ... was wondering WTF Slashdot wasn't working today. My route to slashdot goes through Level3.
More like the CIA is upgrading their equipment.
...and not one Netcraft joke?
Way back in the day when I was a Network Controller at BBN Planet, if we began to have cascading routing outages we'd call it "Flapping"... Visualize a wounded bird squirming around on the ground flapping...
Takes me back... My first night on the job a rat in Berkeley chewed through the wrong cable and got himself fried -- he also happened to take the entire west-coast off the internet for the better part of a day.
Then there was the time an electrical worker got vaporized in a hole near MIT which caused quite a problem too as it overloaded the MIT power station, but the fallout wasn't nearly as bad as the day of the rat...
Be who you are and say what you feel, because those who mind don't matter and those who matter don't mind. - Dr. Seuss
This joke is dead.
It's amazing what these crypto experts can come up with to stay ahead of the game. What's next -- qwned?
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
Well we seem to know why Level3 went down, but why did Verio go down at the same time?
happened in Detroit in the last 24 hours. Apparently all ingoing/outgoing traffic to other Tier One ISPs had problems in that city. Also, Philadelphia had really slow traffic within Level3 (and slower to all the others), and had major problems connecting to Verio. San Diego also had some problems, especially within the Level3 network. St. Louis was the only area without major problems...
For a breakdown, check out this view of the data.
This sig donated to Pater. Long live
Yes, that's exactly what this is. You better curl up into the fetal position in the corner and start crying.
Maybe I won't be able to get some work done today ;^)
(/me has an entirely Internet-based job)
It's better to vote for what you want and not get it than to vote for what you don't want and get it.
- E. Debs
Is there any way we can blame Microsoft for this?
Were they upgrading to one of the Beta builds of Windows Vista Home Edition?
I'm not back online
I was in the same boat. It was like 07:30 UTC or so, I was trying to play a game, but my ping was high and I'd intermittently get disconnected from it every few minutes. It was the weirdest outage I'd seen because I could get to some web sites, google, yahoo, excite (at first I thought it might just be cache so I went to excite which I don't think I've ever visited with this computer). But I couldn't get to slashdot, guildwars page, or update my antivirus. It was like my connection was down for anything except the major search engines. Even as of about 12:00 UTC my connection was still slow.
Can someone who has data on what the perceived "Verio" outage is let me know? Obviously Level(3) customers would have issues reaching Verio and Verio would have trouble reaching them, but reviewing one of our external monitoring systems I only see 3 events and only one that is not customer related. So unless you're in that isolated corner of the network in Europe...
They would... except that they're not accessible..
Level 3 went down at 22:42 pst and was available around 23:50 pst. Verio started having problems right around the same time that Level 3 was coming back up. The Internet Health Report from Keynote showed me what was going on, scary that it was.
No Not Again! Its whats for dinner.
I don't know that they've replaced Sprint yet on my list of most sucktastic internet companies. Time was you lost connectivity to an important piece of the Internet (Like your favorite Quake TeamFortress server) and a traceroute would show the failure somewhere in the Sprint backbone. So far they've been more reliable than Sprint at their worst, at least for me.
If they go under, well Tier 1's don't ever really die. Chances are one of the other Tier 1's will buy their assets and it'll be business as usual. Usually the buyer is MCI.
Of course the true test is pretty easy -- has anyone who works at Level 3 had their paycheck bounce yet? Surely there are a few readers among their employees...
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
I can't tell you how many times I cursed my Windose box yesterday trying to figure this one out
- Turn to the West
- Bow humbly
- Beg forgiveness
No, of course not, you blithering imbecile. L3 had a 2 hour global routing meltdown. Now, it's fixed. Whilst their routes were flapping, other carriers saw transient increases in latency and some problems with reachability, to some sites. However, everything continued to work properly for non-L3 customers. Two hours later L3's routes are back and working properly. End of story, nothing to see here, move along please.
Slashdot editors, do you really expect us to believe that no-one had submitted a more coherent or accurate story than this one? Come on, for heaven's sake.
Anyway, a network engineer's view can be seen in the overnight traffic on NANOG: http://www.merit.edu/mail.archives/nanog/2005-10/ "Tier One ISPs dying" indeed. Worst. Story. EVER.
"None are more hopelessly enslaved than those who falsely believe they are free." -- Goethe
Glad to see that Tier 1 ISPs are joining the ranks of BSD and Apple.
The Anti-Blog
yeah
AccountKiller
....or is it related?
http://www.internettrafficreport.com/asia.htm is another place to check internet traffic - it's a different metric, but it's still useful. Taiwan and India seem to be the only places doing okay.
AccountKiller
I could easily fix this problem. I would just restore it from the Recycle Bin.
Slashdotting the entire internet!
So what , was this some sort of Divine Slashdot effect ...
The only things certain in war are Propaganda and Death. You can never be sure which is which though
I'm sure you know this, but for the rest: "flapping" is the common term for when a router's routing tables rapidly cycle between two invalid states. The dead bird analogy is pretty descriptive, but the term "flapping" has technical and not allegorical origins.
Dewey, what part of this looks like authorities should be involved?
(finishes stuffing last 'Pizza Pop' in gob) (unplugs microwave and removes from telco rack) Thorry.(spitting crumbs)
These are the first salvos of the EU's attack on the Internet, since the US wouldn't give in to their demands.
Good job guys. Now that incredibly fragile IP protocol is completely screwed, along with any chances of my getting onto match.com tonight.
Comment removed based on user account deletion
I was playing online last night and it was like the internet had a stroke. WoW went down but I was still on voice chat with my friends. My trace routes were timing out at at about 5 hops down the line.
The good news is, I got more sleep last night.
I got the DTs (delirium telnets) from not getting my fix last night...
The scary thing is it makes you wonder if some terrorist who has intimate knowledge of how Tier 1 ISPs work is doing a trial run in the middle of the night by knocking out the Level 3 and Verio backbones, so that later they could try to knock out ALL the backbones in a co-ordinated terrorist attack.
It doesn't make me wonder that. Terrorists do not give a shit about this kind of thing. To even invoke the word "terror" in this discussion is ludicrous.
- A.P.
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
not surprising, the major search engines are afaict EXTREMELY well connected.
note: i'm known as plugwash most places but i screwed up registering that here somehow in the past and now can't register
Oh, so THAT's why my daily spam load suddenly dropped by about 35% or so...
Bruce Lane, KC7GR,
Blue Feather Technologies
It isn't just a coincidence that this outage happened on the same day as a gay pride parade somewhere. Clearly, our Lord Jesus hates these sinners and is exacting his revenge.
[a modest proposal: google for "satire" before modding down]
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
more like a grand version of the time the boss "organized" the cables on the routers so they line up nicely
Snowden and Manning are heroes.
They are doing pretty well, not amazingly so (what telco is?) but they have a lot of cash and a stable recurring revenue base. They also have a pretty good outlook because they are one of the few companies not caught with their pants down when the FCC mandated E911 support - which a lot of people are coming to Level 3 for. If you think VOIP has a future then so does Level 3. The market thinks so; regardless of your outlook the stock has been up quite a bit recently.
To call them a "dot bomb" is really unfair since they were far more financially prudent during the timeframe, which is why they are still around at all in the dark forest of discarded Telco husks.
Disclaimer, I work for Level 3. But on the other hand doesn't that mean that I know more than most people about the real situation here?
I have had my paycheck bounce at companies I've worked for in the past and been told I'd have to wait an extra month or two for pay at said companies (you know the kind, six employees and the owner's mom uses the company AMEX for trips to DisneyWorld while you wait weeks more to get paid). Level 3 is a few billion dollars away from that sad state.
And don't accuse me of drinking Kool-aid either - after going through a lot of layoffs over the years you have a VERY realistic outlook on what the company does well and what it does not.
yeah, i never liked mobs
I bet most players don't even know it means "Monster OBjectS" !
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
It's a face? I never looked at it that close, because my parents told me it would hurt my eyes. Next time I'm outside, I'll make sure to stare directly at it until I can make out the face!
Let me try again:
Your dying bird analogy is OK, because that's pretty much how it acts. But they say a router is "flapping" because its routing tables are flailing about aimlessly like maps to the Superdome.
So, it describes what's actually happening, not how it compares to something your cat's trying to eat.
Dewey, what part of this looks like authorities should be involved?
So everybody with an alternate path around Level 3 should have routed around them properly. And yet, they weren't routed around. That's a concern.
If you went down because of this outage, your provider is totally dependent on Level 3, which is not good. This is a useful warning - if you went down, and your operation is important enough that it needs to stay up, you need to look very hard at your provider's upstream connectivity. Better hosting services have connections to several Tier I providers, just in case something like this happens.
Seems a fiber ISP in Palo Alto says it began in Germany. They have Deutsche Telekom with problems before Level 3. A long link to the NOC updates is at the top of their homepage today.
Don't you get it? They deleted the internet!!
Good thing I downloaded it last night.
Disconnect your television. Do your own research. Draw your own conclusions. They're probably lying. Don't be a sheep.
I was looking for a Linux Virtual Host, blah, blah.
Stumbled upon these pretty pictures (near bottom of page).
Curious, I thought, what happened to Level(3)? I thought for a second that perhaps unixshell had a peering with those people that Level(3) were in dispute with.
Nope, just one of those regular outages that make the 99.999% promises sound a little over done.
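To put a number on the "five nines" remark, here is a rough sketch of the arithmetic. The two-hour outage length is an assumption taken from other posts in this discussion, not a measured figure, and the year is treated as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

def allowed_downtime_minutes(availability_percent):
    """Downtime per year that a given availability percentage permits."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

def availability_after_outage(outage_minutes):
    """Yearly availability (percent) if this outage were the only downtime."""
    return 100 * (1 - outage_minutes / MINUTES_PER_YEAR)

five_nines_budget = allowed_downtime_minutes(99.999)
two_hour_outage = availability_after_outage(120)  # assumed 2-hour outage

print(f"99.999% allows about {five_nines_budget:.2f} minutes of downtime per year")
print(f"A single 2-hour outage caps the year at about {two_hour_outage:.3f}%")
```

Under those assumptions, one outage of that length spends more than twenty years' worth of a five-nines downtime budget.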
My pager went off at 1:48am EDT. Was able to get to my boxes from my Verizon connection, but couldn't get to other hosts via L3. Put in a ticket at 2:15am that L3 was having problems. Stupid L3 sucking woke me up. Grrrr.
TossableDigits.com: Temporary Phone Numb
No you idiot, it's because we block you on sight.
Karma: Meh (Mostly from meh.)
It's spelled "Hear! Hear!"
Nonononono... he was clearly replying to the parent's question. Please everyone, send your bill to SRoberts7758.
"And now, to whom from Level3 do I send my bill?"
HERE HERE!!!
94% of Repubs and 21% of Dems voted to renew the Patriot Act
Mobile OBjects, not Monster OBjects
retrorocket.o not found, launch anyway?
Let's go kill some MOBs sounds great to me. Unfortunately the people I MMO with often don't understand what a MOB is. Damn you Graphical games for changing terms I've been using for years!
That which is done from love exists beyond good and evil
Or the global mesh of cooperating, overlapping wireless footprints?
Things are better in both the past and the future.
cd /Internet; rm -rf Level3
PHP Developer Virginia this sig sold out!
No one said it would react *quickly* to a nuclear war. It took about two hours for routing to go around Level3 rather than through it.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
Click here to watch it. A group of people going crazy during an Internet outage. Perfect timing. :)
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
Maybe their bandwidth is really just saturated by the spammers they harbor.
Does it make you happy you're so strange?
Now Cogent appears down too.
Now Cogent is having a completely random and unrelated failure. What a total coincidence.
you moron, go to the site and see for yourself.
http://scoreboard.keynote.com/scoreboard/Main.aspx?Period=RH24
right now cogent is having a problem.
http://scoreboard.keynote.com/scoreboard/Main.aspx?Period=RH1
TAGGE: "And what of the Rebellion? If the Rebels have obtained a complete technical readout of this station, it is possible, however unlikely, that they might find a weakness and exploit it."
:-)
You know what happened less than two hours later
Loss of the internet would have a very large financial effect, more than enough to be a juicy target for those that feed on hate.
No matter how subtle the wizard, a knife between the shoulder blades will seriously cramp his style.
I'd rather get an X-ray than have brain surgery.
Some other things people get worked up about but terrorists are unlikely to attempt: sabotaging bridges and tunnels to cause traffic jams; sabotaging electricity distribution to cause blackouts; sabotaging railroad tracks, making commuters late for work!
You have to assume though that any major landmark could be a target. For example, I could see the Golden Gate being a potential target, along with the Space Needle in Seattle, a few other bridges, etc. Also I can imagine that tunnels could be attacked with poison gas or anthrax. Even though the likelihood of damage from such an attack would be low, the fact that it happened would be terrifying to a lot of people. That is, it is not enough to think only of death; attacks on one's security need not take that form. They can instead be attacks on familiar landmarks and on a general sense of safety.
Now, a large-scale cyber attack seems to me to be economically damaging enough that someone who wished our country harm might try it but it lacks the propaganda capabilities that traditional terrorist attacks have. For example, if Bin Laden destroyed the Golden Gate Bridge, he might be able to drum up some additional support, but if he claims to have disrupted the internet, most of his supporters will probably respond with blank stares. North Korea, OTOH, would have a much more likely motive....
LedgerSMB: Open source Accounting/ERP
The dog ate my router!
So God knows IOS?! I'm starting to believe in Him again.
Used to do contract work at an auto company's plant. The main data center's primary job was to feed test programs to a distributor testing line and collect the stats. It was located in the middle of the plant on the second floor, next to the row of test stands.
Some time after my contract had ended I visited the place and it was a total disaster.
During the model change shutdown (when most of the plant maintenance and rearrangement was done) the millwrights were welding on some cableways on the ceiling of the plant floor below. The fumes from the welding, of course, rose to the ceiling and escaped through the first hole they could find - around the big fire sprinkler pipe that went up through the floor of the computer room and into the space beneath the raised floor.
It tripped one ionization smoke alarm and sounded the warning - but nobody was around during the shutdown to hear it. Shortly thereafter it tripped a second one and the halon system went off. The computer power shut down and $10,000 worth of halon blasted into the computer room. Half of it came out through vents under the floor, throwing the raised floor panels and a decade's accumulation of fine dust (much of it byproducts of metal cutting and annealing) all over the room. That finally sounded an alarm at the guard shack.
The guards came over and found the room in disarray but not the slightest sign of a fire. A couple million bucks worth of computer equipment, slated for replacement in another few months but still critical to the plant's operation, was standing there, covered with dust (likely to cause trouble for the disk drives later) but otherwise intact. So they followed procedure and reset the halon system, switching to the backup cylinder, to protect the computer in case an actual fire made it to the comp room. (Normally that's a good idea, since smouldering that sets off smoke detectors is often followed some time later by an actual fire.)
Of course the welding was still going on - just not at the moment the guard sniffed the comp room. (Welders out to lunch, pulled out due to the alarm, or having decided to come down off the ceiling for a bit after the blast of gas from above.) And they still had work to do. So of course they went back to it.
In less than an hour the situation repeated, dumping the SECOND $10,000 worth of halon on the non-fire. B-(
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
"flapping" is the common term for when a router's routing tables rapidly cycle between two invalid states.
Also known as "route flapping". Also applied when the routes are valid, but a router is still alternately advertising a pair of routes because it keeps changing its mind about which is better.
This, by the way, is partly the result of an internet standard that isn't sufficiently prescriptive. The BGP protocol itself is well defined, but its implementation is left open. Unfortunately, it prescribes a system that has gain, delay, and negative feedback. So a naive implementation leads to oscillation when deployed, and something must be hacked to stop it.
Getting it to stop is a black art of coding workarounds - both to keep yourself from oscillating and to react quickly and appropriately (rather than blindly doing a full recomputation of your routing tables with each received flap) when somebody else starts flapping, so your packets keep going through despite the bad net weather.
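One documented workaround of this kind is route flap damping (described in RFC 2439): each flap adds a penalty to a route, the penalty decays exponentially, and the route is ignored while the penalty stays high. The sketch below is only an illustration of that idea, not what any particular router vendor ships; the class name and the threshold values (1000 per flap, suppress at 2000, reuse at 750, 15-minute half-life) are assumptions chosen for the example.

```python
class FlapDamper:
    """Toy per-route flap damping with an exponentially decaying penalty.

    All parameter values are illustrative assumptions.
    """

    def __init__(self, penalty_per_flap=1000.0, suppress_at=2000.0,
                 reuse_at=750.0, half_life_s=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_at = suppress_at
        self.reuse_at = reuse_at
        self.half_life_s = half_life_s
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Halve the penalty once per half-life of quiet time.
        elapsed = max(0.0, now - self.last_update)
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update = now

    def record_flap(self, now):
        """Call whenever the route is withdrawn or re-announced."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def is_usable(self, now):
        """True if the route may currently be used and re-advertised."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_at:
            self.suppressed = False
        return not self.suppressed


damper = FlapDamper()
for t in (0, 10, 20):            # three flaps within 20 seconds
    damper.record_flap(t)

print(damper.is_usable(30))      # suppressed right after the burst
print(damper.is_usable(3620))    # an hour of quiet lets the penalty decay
```

The point of the hysteresis (separate suppress and reuse thresholds) is that a neighbour's flapping route is ignored rather than triggering a full recomputation on every update, and is only trusted again after it has been stable for a while.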
If you get something stable among your own machines, you still have the issue of whether it stays stable when they're talking to somebody else's, or to an earlier version of your own. Avoiding breaking your own earlier stuff in a new release is difficult. Especially so since you don't necessarily know WHY your implementation is working (if nothing else, because you don't know what some of the other implementations it's successfully conversing with are up to). And testing before release is very hard, because you can easily get something that works just fine in all your lab test cases but breaks when deployed on the real net.
Since deploying a broken BGP implementation breaks, not just the routers it's deployed on, but large sections of the rest of the net, backbone providers and ISPs are very leery about buying routers from somebody without a proven BGP implementation. This creates a chicken-and-egg problem for companies trying to break into the router business: You can't get customers without a proven implementation, and you can't prove an implementation without customers.
Last I heard (by word of mouth, a couple years ago) there were only two independently-developed implementations of BGP that had achieved this level of confidence.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Netcraft confirms it. Tier 1 ISPs are dying.
* The IDC confirms once again that market share among Tier 1 ISPs has fallen again.
etc....
I thought about writing a long wandering spoof but don't have the time...
LedgerSMB: Open source Accounting/ERP
Used to be that people would predict "The End of Usenet As We Know It, GIFs at 11", but technology has progressed a lot since then...
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
The MAEs don't really matter much any more, at least for Tier 1 peering, which has almost all moved off of public peering and into private peering. A lot of that private peering takes place in carrier hotels or telco POPs - Equinix has 7-8 big locations, Seattle's Westin building, a bunch of different buildings within a block or two of the main Los Angeles telco POP, and a few others. Some private peering also happens on fibers run between carrier offices.
Most of the Tier 1 providers have lots of excess bandwidth - if the DC area peering were to fail, most of them would have enough spare peering capacity in New York or Atlanta or Chicago to recover without major capacity losses, and BGP would reroute most of the rest reasonably well. The West Coast is in a bit worse shape, just because the distances are longer - SF-LA is only ~350 miles (~3.5ms one-way), but SF-Seattle is a lot farther, and isn't quite as overbuilt, and with many carriers, if you lose the direct route, you take a 2-3000 mile loop through Salt Lake City to recover (unless you've got two central California routes, on I-5 and 101 or railroads.)
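For a feel of the distance figures above, here is a back-of-the-envelope sketch. It assumes light travels through fiber at roughly 200,000 km/s (about two thirds of its speed in vacuum) and ignores router and equipment delay; the 2,500-mile detour is just a midpoint of the "2-3000 mile" range.

```python
KM_PER_MILE = 1.609344
FIBER_KM_PER_MS = 200.0  # assumed: ~200,000 km/s in glass, i.e. 200 km per ms

def one_way_delay_ms(route_miles):
    """Pure propagation delay over a fiber route of the given length."""
    return route_miles * KM_PER_MILE / FIBER_KM_PER_MS

sf_la = one_way_delay_ms(350)     # the direct SF-LA figure from the post
detour = one_way_delay_ms(2500)   # assumed midpoint of the 2-3000 mile loop

print(f"SF-LA direct:  {sf_la:.1f} ms one-way")
print(f"Inland detour: {detour:.1f} ms one-way")
```

Raw propagation comes out a little under the ~3.5 ms quoted, which is plausible since real fiber does not run in a straight line; either way the detour costs several times the latency of the direct route.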
European peering architectures have developed much differently - the geography's different, and the carrier relationships were different, so huge fractions of that traffic go through LINX and a few other points like AMS-IX, and losing LINX would be seriously bad.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
If you've got a given prefix, and people are exporting bogus reachability information about it, start advertising two prefixes of half the size. I know one large ISP with a /8 who had to start publishing two /9s because some bozo outfit was doing incorrect route summarization and claiming that their little circuit in South America had a really great route to that /8. It's a cheap trick, and you shouldn't leave it up too long if you can avoid it, but works really well when you need it.
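The trick works because routers prefer the most specific matching prefix. A minimal sketch of that longest-prefix-match behaviour follows; the 10.0.0.0/8 block and the next-hop labels are placeholders for illustration, not the ISP in the story.

```python
import ipaddress

def best_route(destination, routes):
    """Longest-prefix match over (network, next_hop) pairs; None if no match."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)

# Placeholder block standing in for the ISP's /8.
owner_block = ipaddress.ip_network("10.0.0.0/8")

# Split the /8 into its two /9 halves.
halves = list(owner_block.subnets(prefixlen_diff=1))

# Before: only the bogus /8 announcement is in the table.
before = [(owner_block, "bogus-announcer")]

# After: the real owner also announces both /9s.
after = before + [(half, "real-owner") for half in halves]

print(best_route("10.200.1.1", before))
print(best_route("10.200.1.1", after))
```

With only the /8 present, traffic follows the bogus announcement; once the two /9s appear, every address in the block matches a longer prefix and goes to the real owner. The cost is extra entries in everyone's routing table, which is why it should not be left up longer than needed.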
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Most of the time, ISPs can protect themselves against their neighbors failing, and most of the time they do. A few years back, some random company advertised that their T1 was the best way for half the world to reach MAE-EAST (or some target of that size), and suddenly half the world's internet traffic was trying to get down that wire before it melted, making it difficult for any equipment nearby it to even scream for help. Lots of ISPs started doing a lot more BGP filtering after that, and developing methods to monitor the advertisements the outside world was seeing about them, and things got safer, but it's still possible to screw up.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Their direct customers would obviously be affected, so if you've only got one ISP connection and they're down, you're out of service - if losing connectivity in the middle of the night is a problem, you need to arrange for diversity, but during the daytime you're more likely to have a backhoe take out the wires on your street than have your ISP down for more than an hour or two.
L3's a big wholesale provider, so if they're down, it can affect people who didn't know they were using their services; maybe they're using a small ISP that buys half its bandwidth as transit from L3, or maybe their ISP is using L3 to reach specific areas where they don't have geographical coverage or provide specific types of service. So the outage may feel a bit more widespread than it really was, but it's still the middle of the night. The recent L3-Cogent fracas was a bit more visible because they handle different kinds of customers - L3 provides bandwidth to lots of small ISPs with consumer end-users, while Cogent provides big cheap pipes to lots of hosting business, so the interference was fairly synergistic, and it lasted a lot longer because it was a Layer 8 / Layer 9 business disagreement, not a Layer 3 technical problem that can be fixed by engineers.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
What went wrong in Level3: http://www.merit.edu/mail.archives/nanog/msg13166.html
- Just because we CAN do a thing, does not mean we SHOULD do that thing.