Slashdot Mirror


Electricity Outage Puts Routing to a Tough Test

infofarmer writes "Today at about 11:30 MSD (GMT+4) a major electricity outage in Moscow, Russia brought new meanings to words like "uninterruptible", "redundant" and "uptime" for network administrators, who haven't experienced such harsh and unexpected power failures since the USSR got its Internet connection. Half of the city is totally out of electricity, including the subway and the most important traffic exchange point; half of the top Russian sites went down, including www.mail.ru, www.rambler.ru, www.lenta.ru, some of them haven't been brought up yet. IP packets going from ADSL users in Moscow to some local sites got rerouted to somewhere in London and then back to Scandinavia, where they met their "No route to host" dead end. Other routers found themselves in a loopback, which made many packets get dropped with TTL expired. The point is that most of the popular servers have two or three mainline Internet connections, but lack of BGP/RIP2/whatever configuration resulted in packets losing their way to hosts."

48 of 233 comments

  1. Probably unrelated by SIGALRM · · Score: 5, Funny
    half of the top Russian sites went down, including www.mail.ru, www.rambler.ru, www.lenta.ru, some of them haven't been brought up yet.
    And in other news, spam volumes suddenly and unexpectedly plummeted.
    --
    Sigs cause cancer.
    1. Re:Probably unrelated by jafomatic · · Score: 3, Insightful
      Argh. I was beaten to it and quite badly at that.

      In all seriousness, we (or, y'know, lawmakers somewhere) should really look for the spam volume trending before-during-and-after the outage.

      A surprise for some, no surprise for the rest of us?

      --
      ::jafomatic
    2. Re:Probably unrelated by PenguinBoyDave · · Score: 2, Funny

      That explains why I'm not getting any email today. Without SPAM, I'd never get any email.

      --
      I'm not a troll, but I play one on Slashdot.
    3. Re:Probably unrelated by Chmarr · · Score: 2, Informative

      Well, the wiki on my site was continually being probed with vandalism attempts by various machines around the world for the past couple of days, and it stopped dead right around the time of the power failure.

      So... no prizes for guessing where the control machines for the botnet were.

  2. LOL by infonick · · Score: 3, Funny

    The msk-ix went down, and now that it's back up, you're going to have it slashdotted?

    Can someone give this guy a metal?

    --

    You are confusing me with someone who cares.
    1. Re:LOL by MyLongNickName · · Score: 4, Funny

      Iron? Aluminum? Which one do you prefer?

      --
      See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
  3. In soviet russia... by Winckle · · Score: 5, Funny

    Oh nevermind...

    1. Re:In soviet russia... by ravenspear · · Score: 5, Funny

      Power fails you.

    2. Re:In soviet russia... by PornMaster · · Score: 3, Insightful

      Unfortunately, power has failed Russians for longer than any of them can remember...

  4. no more all off mp3 .com by acomj · · Score: 4, Funny

    no more all off mp3 .com

    Obviously the MPAA/RIAA are to blame..

    1. Re:no more all off mp3 .com by gricholson75 · · Score: 2, Informative

      I checked right away, it's still up.

  5. silly packets by Anonymous Coward · · Score: 2, Funny

    must have been males- didn't stop to ask for directions

  6. Luckily, by gricholson75 · · Score: 3, Funny

    bride.ru is still up.

    1. Re:Luckily, by pegr · · Score: 3, Funny

      Yeah, but what about fark.ru? If that site stays down, Russian productivity will skyrocket! Our economy will collapse! Cats and dogs living together! MASS HYSTERIA!

    2. Re:Luckily, by fafaforza · · Score: 3, Interesting

      Well, it's hosted in Florida.

  7. I have a mirror... by yotto · · Score: 3, Funny

    Good thing I saved all my Russian pr0n.

  8. Wow by cybersaga · · Score: 3, Funny

    That's crazy. These sites were down before they hit Slashdot.

  9. Alternate Headline by cmburns69 · · Score: 3, Funny

    An alternate headline should be:

    Correct router configuration can be difficult!

    --
    Online Starcraft RPG? At
    Dietary fiber is like asynchronous IO-- Non-blocking!
  10. I'm worse than Russia. by Nytewynd · · Score: 2, Funny

    Last night I lost power for about 3 hours. My laptop worked. My cable modem stayed connected on battery backup. My router is plugged in and died.

    Russia's connections at least made a couple of hops before dying. Mine died on 1 hop. It did illustrate the uselessness of a battery-backed up modem on my network, however.

    --
    /. ++
    1. Re:I'm worse than Russia. by Nytewynd · · Score: 2, Interesting

      Maybe I'm oversimplifying, but you could have plugged your modem directly into your laptop. No other computers would have worked, but you would have had internet connectivity.

      If you had any servers running, though... well... time to get a UPS. :-)


      That is obviously an option, but that laptop is set up only with wireless and I really didn't care enough to configure the landline and assign IPs.

      My server has gone down because of the power twice in the past 2 days. I would consider UPS if it mattered. Last night even UPS wouldn't have saved me. Power was out for 3-4 hours. For the trivial web pages and teamspeak server it runs, I'm fine with it being offline.

      --
      /. ++
  11. Yes, but... by rewt66 · · Score: 5, Funny

    Did kremvax stay up?

  12. In all seriousness... by tgd · · Score: 4, Informative

    For the last three or four weeks my gmail account has been POUNDED by 100-200 cyrillic spam messages every day. The filters catch them, but I have to clean out my spam folder pretty often.

    I've gotten none in the last couple hours.

  13. The submitter has to have his priorities checked by arhar · · Score: 3, Interesting

    Knowing how things are done in Russia, you should be a lot more concerned with things OTHER than the Internet. Everything is such a fucking mess over there that I really hope no serious injuries happen. I already read the news that sewer water is being dumped into the Moscow river because of a plant failure. In times like these, who gives a shit about the Internet?

  14. Odd.... by MarkGriz · · Score: 4, Funny

    I can't seem to log into my bank account to update my out-of-date account information.
    Wonder if these are somehow related.

    --
    Beauty is in the eye of the beerholder.
  15. Obviously in need of a tighter setup by MynockGuano · · Score: 3, Funny

    but lack of BGP/RIP2/whatever configuration resulted in packets loosing their way to hosts."

    Those mischievous packets. Unraveling networks where'er they roam.

  16. Perfect opportunity... by Vexler · · Score: 3, Funny

    Someone should blackmail the Moscow electrical grid. "If you ever want to send spam again, fork over $200 and send it to this address..."

  17. In other news... by supabeast! · · Score: 2, Funny

    The RIAA's crack anti-mp3 commando teams are rumored to be cutting a bloody swath through street markets and datacenters across the city.

  18. in times like these, the 'net is a godsend by ChipMonk · · Score: 4, Informative

    I think you need to check your priorities. How do you think geeks all over the world just found out about the power failure?

  19. Not big headline? by antdude · · Score: 2, Interesting

    It was interesting that news.google.com, cnn.com, msnbc.com, etc. do not have this story on its front news page. I guess the outage isn't severe like one in New York a few years ago.

    --
    Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
    1. Re:Not big headline? by Politburo · · Score: 2, Insightful

      It was interesting that news.google.com, cnn.com, msnbc.com, etc. do not have this story on its front news page. I guess the outage isn't severe like one in New York a few years ago.

      Well, first off, you're factually incorrect. The outage 2 years ago affected a large area of the Eastern United States as well as some areas of Canada, not just NYC. Furthermore, the sources you cite are all American sources. It's no surprise that they tend to report American events more than world events.

  20. Internet... works! by Cyberax · · Score: 4, Informative

    I live in Russia, about 1000 km from Moscow. We were hit by a network outage; nothing worked (even Slashdot :( ) for about 30 minutes. The number of routes announced by both of our peers was about 700 instead of the normal 150000.

    But then routes began to appear again! I was amazed: the Internet routed itself around the damaged segments, and packets were routed through Japan (!), Finland and Holland instead of Moscow. The funniest part was when I traced the route to a computer in the next building - it went through Saint-Petersburg :)

    I was able to access Slashdot and most Russian sites (http://newsru.com/, http://ntv.ru/, http://nbc.ru/) not directly affected by the outage.

    1. Re:Internet... works! by lheal · · Score: 4, Interesting
      both of our peers

      That's why.

      TCP/IP and the Internet anticipate cooperation among sites. You and your neighbors should all happily route each other's packets.

      The trouble is that in many places it doesn't work that way. There are rural "leaf" nodes, of course, but there are many more sites which have only one connection because of what I consider to be petty business decisions.

      Two competing ISPs in the same area should share a direct link to each other. If they have different upstream providers, then when one provider goes down the other picks up the slack. In any case local traffic should stay local.

      The fear, of course, is that one ISP will choose a bad provider and take advantage of the other. That has an easy fix: if the other one starts to abuse you, pull the plug.

      Single points of failure are not supposed to exist.
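
      The argument above can be checked on a toy topology. This is only a sketch: the node names (isp_a, isp_b, upstream_1, upstream_2, internet) and the reachable helper are made up for illustration, and a real network involves routing policy that a plain graph search ignores.

```python
from collections import deque


def reachable(links, start, goal, down=frozenset()):
    """Breadth-first search over undirected links, skipping failed nodes."""
    graph = {}
    for a, b in links:
        if a in down or b in down:
            continue  # a link is unusable if either end has failed
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


# Hypothetical topology: two ISPs, each with its own upstream provider.
WITHOUT_PEERING = [
    ("isp_a", "upstream_1"),
    ("isp_b", "upstream_2"),
    ("upstream_1", "internet"),
    ("upstream_2", "internet"),
]
# Same topology plus a direct link between the two ISPs.
WITH_PEERING = WITHOUT_PEERING + [("isp_a", "isp_b")]

failed = {"upstream_1"}
print(reachable(WITHOUT_PEERING, "isp_a", "internet", failed))
print(reachable(WITH_PEERING, "isp_a", "internet", failed))
```

      With upstream_1 down, isp_a is cut off in the first topology but still reaches the rest of the net through isp_b in the second. The same holds for local isp_a to isp_b traffic, which no longer depends on either upstream.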

      --
      Raise your children as if you were teaching them to raise your grandchildren, because you are.
    2. Re:Internet... works! by David Horn · · Score: 2, Funny

      >> We were hit by network outage, nothing worked (even Slashdot :( ) for about 30 minutes.

      Yeah, but Slashdot doesn't work anyway. ;)

      --
      PocketGamer.org - For the gamer on the go!
  21. Spam by vevva · · Score: 2, Funny

    That explains why there was less spam in my inbox today.

  22. Re:No spam for 4 hours! by joeytmann · · Score: 5, Funny

    What's your email address?

    --
    Insert funny smart-ass comment here.
  23. Re:The submitter has to have his priorities checke by keraneuology · · Score: 4, Insightful
    Considering that sewage, power and medical processes could all rely on the internet...

    There's more traffic on the 'net than pr0n, wazrez, mpEs and /.

    Some of it actually matters.

    --
    If the g'vt kept the data on you that google does you'd better believe you'd be calling it "doing evil"
  24. Re:The submitter has to have his priorities checke by josecanuc · · Score: 4, Interesting
    I already read the news that sewer water is being dumped into the Moscow river because of a plant failure.

    This is what is supposed to happen. All (nearly all?) sewage treatment plants have a bypass to send the input straight to the output, which is usually a river or lake.

    They do it because when a treatment plant cannot accept any more sewage, whether due to excessive water input by rain, or by power loss, the customers are better served by *NOT* letting the sewage back up into their houses. The stuff has to go *somewhere* when all their holding tanks are full. This is the last-resort method of dealing with problems at such plants.

  25. Routing by EdMcMan · · Score: 2, Interesting

    Are there any technical reports out on what happened to the network? What is the Russian equivalent of NANOG?

    1. Re:Routing by Cyberax · · Score: 2, Informative

      Yes, but they're not public :(

      Right now poor admins are trying to find stable routes for Russian traffic, which overloaded some international channels.

  26. Re:The submitter has to have his priorities checke by venicebeach · · Score: 4, Insightful

    Yes, but slashdot is concerned with the internet, and so this is an appropriate forum to discuss how an event like this affects the internet. I don't think someone who runs an ISP in Russia should be trying to figure out how to get the sewer working, they should be figuring out how to get the internet up.

  27. Re:The submitter has to have his priorities checke by Anonymous Coward · · Score: 2, Insightful

    Nobody cares that this was in russia, that people can't get their email, or that it was because of a power outage.

    The reason this is on the front page is because the internet is supposed to be able to handle things like this. People will be watching how the routers automatically deal with the outage (there's one response like that already), and what manual intervention it needed. Hopefully this information will be used for training the next generation of router admins.

    Even if this was because of a meteor strike or nuclear bomb, we'd still be interested in how the net took it. We'd be more interested in everything else about the event, but the response of the network would still be interesting.

  28. Fzzzp by lordsilence · · Score: 3, Funny

    Perhaps these guys touched live wires ^^

    Off-topic, but an interesting read :)

  29. UES Management Faces Criminal Investigation by Anonymous Coward · · Score: 4, Informative

    http://mosnews.com/news/2005/05/25/chubaiscriminalcase.shtml

    From the article:

    Russian prosecutors on Wednesday opened a criminal case against the management of power monopoly Unified Energy System (UES) after a major power outage in Moscow, agencies reported Wednesday.

    The case was opened to investigate possible negligence, the Interfax agency quoted the Prosecutor General's Office as saying.

  30. I think this is a "political outage" of some sort by melted · · Score: 5, Interesting

    There's a Russian politician of Yeltsin era, Anatoly Chubais who is in charge of RAO UES Russia (which is an uber-organization controlling production and distribution of energy in Russia).

    While the guy is not as powerful as he was a few years ago, he still poses a significant threat to Putin's third (and fourth, and so on) term presidency, and further concentration of power in Putin's hands.

    So within a few hours of the outage, Putin blamed Chubais directly for this, and the Russian justice dept opened up a criminal case against him. If you know anything about Russia, you know that the Russian DOJ (Prokuratura) doesn't start criminal cases against wealthy and powerful businessmen and politicians unless instructed to do so by Putin.

    So I'd bet dollars against donuts that this outage was caused by folks from Lubyanka (FSB aka KGB) purely to remove Chubais, and if cards play well maybe even give him a lengthy prison term.

  31. let's separate that in two issues by tinkerton · · Score: 2, Insightful

    I'm sure Putin will exploit the power outage to weaken and possibly get rid of Chubais.

    Whether the FSB caused the outage directly, to prompt an attack on Chubais is another matter. Maybe they were working on a plan but it wasn't ready yet. They have a lot to do :)
    Even Putin sometimes just exploits opportunities.

    In any case, the outcome is the same.

  32. lack of BGP/RIP2/whatever configuration by g-san · · Score: 4, Informative

    I doubt it was the lack of RIP2 configuration that caused this. You don't use RIP in the core, you use BGP as the exterior protocol and most likely OSPF or ISIS as the interior protocol.

    UPS: at least in one place in MSK-IX they did have proper UPS backups, you can tell from routing tables that some BGP connections have an uptime of 4 weeks plus. They did bounce (or it had a power failure) one of their core routers as all those peering connections only have an uptime of 8.5 hours. I'd rather not provide a link to this as the last thing they need is their core routers slashdotted with BGP table summary requests.

    Connectivity: it appears MSK-IX is peered with at least 12 other sites that are also peered with another major IX. For example they are connected to three other sites that are also connected to AMS-IX and four other sites that are also peered with LINX, among a few others with only 1 connection to another Internet Exchange. Many of these were thru Informtelecom XXI, so if they also had power problems everything was running on 50% normal capacity. There should have been enough connections to keep things running (i.e. no single point of failure), but that is assuming everything is working/powered, and assuming these guys in the middle could/would handle all the traffic (unlikely).

    BTW, packets don't lose their way, routers lose their routes to destinations. When all the crap started the routes began to "flap", i.e. go up and down as routers were reset, power came back on, routers went back down under the heavy load, admins manually tried to route around the problem, etc. When your peer sees your routes flapping, they usually put a holddown on them for a period of time, meaning they won't readvertise your route updates to other routers on the internet (said flaps propagate all over the world, putting undue stress on other routers). So even once you get everything working again, the internet waits for a little bit to accept your routes. Well, some do and some don't or some wait longer. That's why you see routers still forwarding packets to London: apparently London thinks it can still get to Moscow, so it's still advertising routes. You don't get the count to infinity problem with BGP, but loops are still possible, especially during major outages and route flapping. And routers get "routing loops," not "found themselves in a loopback."

    I provided as much detail as I could; it's lacking in a few places because I can't follow Russian websites.
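
    The holddown described above is commonly implemented as route flap damping: each flap adds a penalty, the penalty decays exponentially, and the route is suppressed while the penalty is high. A rough sketch follows. The class name and the constants (1000 per flap, suppress at 2000, reuse below 750, 15-minute half-life) are illustrative assumptions, not values taken from MSK-IX or any particular router.

```python
from dataclasses import dataclass

# Illustrative parameters (assumed, not read from any real configuration).
PENALTY_PER_FLAP = 1000.0
SUPPRESS_LIMIT = 2000.0   # stop readvertising at or above this penalty
REUSE_LIMIT = 750.0       # start readvertising again below this penalty
HALF_LIFE_S = 900.0       # penalty halves every 15 minutes


@dataclass
class DampedRoute:
    penalty: float = 0.0
    last_update: float = 0.0
    suppressed: bool = False

    def _decay(self, now: float) -> None:
        """Apply exponential decay for the time since the last update."""
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / HALF_LIFE_S)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record one up/down transition observed at time `now` (seconds)."""
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty >= SUPPRESS_LIMIT:
            self.suppressed = True

    def is_advertised(self, now: float) -> bool:
        """True if the route would be passed on to other peers at `now`."""
        self._decay(now)
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False
        return not self.suppressed


route = DampedRoute()
for t in (0, 60, 120):       # three flaps within two minutes
    route.flap(t)
print(route.is_advertised(180))    # still held down shortly after
print(route.is_advertised(1920))   # penalty has decayed below the reuse limit
```

    This is why a site can be fully back up and still unreachable from parts of the net for a while: the peer keeps ignoring its updates until the penalty has decayed.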

  33. Re:The submitter has to have his priorities checke by keraneuology · · Score: 2, Insightful
    If that's true, whoever designed those systems deserves to be punished severely. Why would you put all your eggs in a basket you don't own?

    The same can be said of the electrical grid. And the cellular network. And the water network. And the sewage system. Or the public road infrastructure. Or the food distribution chain. Face it - virtually every aspect of modern life requires you to rely almost completely on infrastructures that you do not own.

    One has to remember what the internet actually is - a system to transport data. For me it has proven to be far more reliable than the power grid - when the lights go out the internet connection at my house remains active. Should a system go down the first time a packet is dropped? Absolutely not. But that isn't the case here. What Russia is seeing is a massive widespread power failure that is probably beyond the designed tolerance. And keep in mind what else -could- be happening. Why was the sewage shunted to the river? Was it to keep it out of the basement of the local hospital?

    --
    If the g'vt kept the data on you that google does you'd better believe you'd be calling it "doing evil"
  34. CIA running Netwar Wargame This Week by carcosa30 · · Score: 2, Interesting

    This is interesting news coming as it does in the week that the CIA is scheduled to run a 3 day netwar exercise called Silent Horizon.

    http://apnews.myway.com/article/20050525/D8AAFUIO2.html

    Am I just blowing smoke here...?

    --
    Intolerance for ambiguity is the mark of the authoritarian personality.