Slashdot Mirror


Akamai DNS Outage Messes up Net

katre writes "Checking all my favorite sites this morning, I saw that about half a dozen seem to be offline. Trying to figure out why, I found an interesting article on the front page at http://isc.incidents.org/. Seems that the problems at Akamai are screwing over Yahoo, Google, Microsoft, Fedex, Xerox, Apple, and others. Whatever happened to my decentralized net with no single point of failure?"

43 of 522 comments

  1. add esignal too by Lawrence_Bird · · Score: 2, Insightful

    provider of real time market data...

    hope al-Qaeda isn't taking notes on this..

  2. Well . . . by Maradine · · Score: 4, Insightful
    Whatever happened to my decentralized net with no single point of failure?


    It's still there, and you're using it. The only organizations affected by this are those who chose to use a service that acts as a single point of failure.

    --

    trustedworlds.net - gaming, security, and the gunk that lives in between

    1. Re:Well . . . by Anonymous Coward · · Score: 3, Insightful

      Yup, 13 nameservers that all do the exact same job located at different places around the world, with different access providers. All fully capable of doing their job without the others.

      Sure sounds like a single point of failure to me.
      </sarcasm>

    2. Re:Well . . . by Bob9113 · · Score: 4, Insightful

      Whatever happened to my decentralized net with no single point of failure?

      It's still there, and you're using it. The only organizations affected by this are those who chose to use a service that acts as a single point of failure.


      You said it, brother (and beat me to the punch). This is a clear talking point for anyone who is attempting to justify avoiding a monoculture. When you bring up Microsoft, around which revolve a number of good examples of the dangers of monoculture, you risk the debate turning political, and you will almost certainly be discounted as a Linux/Apple/Unix zealot by at least some in the listening audience. It is very worthwhile to have other examples besides Microsoft and cotton when explaining the risks.

  3. Re:I'm definitely not a technical guru... by Malc · · Score: 5, Insightful

    How many *think* they can't live without web access? Offline working can be surprisingly productive, and as it often forces more thinking and planning (e.g. in preparation for being back online, and just thinking through what would happen if you could be online) the results end up being better.

  4. Re:I'm definitely not a technical guru... by MindStalker · · Score: 5, Insightful

    You mean decentralized?
    Anyways, putting both DNS records on the same point of failure breaks standards. These companies deserve to be hit hard (PR wise) for not building a robust network.

  5. Re:I'm definitely not a technical guru... by jocknerd · · Score: 3, Insightful

    I actually would probably get work done without web access!

  6. Whatever happened to your decentralized net? by YetAnotherName · · Score: 4, Insightful

    The web happened my dear friend, and it was based on the predominant distributed computing model at the time: client/server. Even DNS, with its highly distributed spread of processing and data, has a set of (overloaded) root servers with the commensurate single points of failure. The solution? Peer-to-peer.

    Too bad even the term P2P raises so many red flags with certain Associations of America. :)

  7. DNS issue... by Tuxedo+Jack · · Score: 3, Insightful

    You would think that the root DNS servers would be kept up to date with critical information. Just what happened, and how did Akamai get knocked around by this? Did they screw with their DNS information and change their nameserver addresses or something?

    --

    Striking fear in the authors of godawful fanfiction, I am here, appearing in darkness, Tuxedo Jack!
  8. Re:decentralized DNS is a pipe dream by southpolesammy · · Score: 2, Insightful

    I am unable to access the server listed above from various server locations spread across the country & using different ISPs.

    That's not the DNS outage problem -- the site is simply slashdotted.

    --
    Rule #1 -- Politics always trumps technology.
  9. Outsourcing too much = Single Point of Failure by CharonX · · Score: 2, Insightful

    The problem, as I understand it, is that Yahoo, Google & co. "outsourced" their DNS service.
    I could have accepted that medium-big sized IT companies don't want to run their own DNS servers, but giants Google & co. should have enough money to do so instead of relying on servers located somewhere else.
    Funnily enough www.google.com still works for me (thanks to DNS caching I guess)

    --
    +++ MELON MELON MELON +++ Out of Cheese Error +++ redo from start +++
  10. I'd like to know by Ricerocket63 · · Score: 2, Insightful

    how they can screw up their entire DNS, and it's still down. It started as far as I can tell right after 8:30 or so; the last outage was due to a software update on their own site. It's now nearly 11am and it's still not working.. Man, I would think you could restore from backup at least in that time frame, and have something up for people.. Wonder if there will be a credit on the account this month...

  11. Akamai is evil! by scovetta · · Score: 3, Insightful

    When I was in grad school at Cornell, my O/S professor went on a rant about the evils of Akamai. No one believed him. Now we know he was right.

    --
    Wer mit Ungeheuern kämpft, mag zusehn, dass er nicht dabei zum Ungeheuer wird. --Nietzsche
  12. Root servers not decentralized? by Otto · · Score: 5, Insightful

    It's not truly decentralized...
    The root nameservers are the most obvious example...


    The most obvious example? The fact is that there are 13 of them, in widely scattered locations across the globe, and it's not decentralized?

    Damn man, what exactly would you consider "decentralized" then?

    Root servers go down all the time. It's not particularly unusual. There's THIRTEEN of the things. Up to 8 have been down at once with no major effects on the network, IIRC.

    --
    - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
    1. Re:Root servers not decentralized? by sys49152 · · Score: 4, Insightful

      I'm sorry, my friend, but thirteen servers does not mean decentralized; it means replicated. The fact that they are geographically dispersed doesn't matter. Furthermore, the root servers just redirect to the authoritative server, so your "company.com" search goes to Verisign for resolution. What happens when Verisign, oh, I dunno, decides to send back the IP address of a cheesy search engine instead of an error code for domain names that don't exist? I tell you what happens: the Internet breaks.

      To be truly decentralized not only do we need more than 13 overloaded root servers, but no one entity should be authoritative. How that's done is left as an exercise to the reader.

    2. Re:Root servers not decentralized? by Anonymous Coward · · Score: 1, Insightful

      Root servers provide "trusted" DNS tables to the other DNS servers out there.

      All of the root DNS servers could go away and the internet would still work. New domains that are added daily would not make it to the DNS list maintained by the root servers.

      Maintaining a geographic separation between root servers makes it so that an earthquake in California or a fiber optic cable cut by a backhoe in Virginia doesn't disable the entire internet.

      The routing of lookups might be slowed, but lookups will happen and only the nodes lost during the incident (cable cut, earthquake) are lost.

      The Internet IS tough and resilient. Akamai is just a company that hosts other content, they are NOT the Internet.

      I live the greatest adventure anyone could desire -- Tosk the Hunted

    3. Re:Root servers not decentralized? by tyler_larson · · Score: 3, Insightful
      I'm sorry, my friend, but thirteen servers does not mean decentralized; it means replicated. The fact that they are geographically dispersed doesn't matter.

      I'm sorry, my friend, but it most certainly does mean decentralized. Here's why:

      Decentralized means "having power or function dispersed from a central to local authorities". Each individual top-level nameserver operates entirely independently of the others to the extent that it is capable of remaining completely operational in the absence of the others.

      DNS is actually the epitome of a decentralized service--as perfect an example as there comes. Assuming it is implemented as prescribed in the RFCs, there is no single point of failure (an incorrectly implemented DNS system is not the result of a poor design, it's the result of poor implementation--you can't blame DNS).

      There are 13 totally and completely independent top-level servers. The only thing that ties them together (in a practical sense) is that they speak the same protocol and synchronize with each other if possible. All top-level domains have at least two nameservers (generally many more), and all second-level domains are required to have at least two authoritative nameservers as well. If any one of these servers in the whole chain fails at any time, the others will pick up the slack--it's part of the protocol.

      Implementing this service correctly such that no failure will take down your own domain is left as an exercise for you. It's your domain and your nameserver. You're responsible for ensuring that it works. The "system" correctly assures that each one of your own nameservers will be queried until one responds. If you take all of your own nameservers offline, there's obviously nothing that the DNS system can do to help you. That's what Akamai's problem was. Don't blame DNS.
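
The "queried until one responds" behaviour described above can be sketched roughly as follows. This is a toy model, not a real resolver: `resolve_with_failover`, `fake_query`, the server names and the 192.0.2.10 address are all made up for illustration, and a real resolver also handles retries, timeouts and caching.

```python
from typing import Callable, Optional, Sequence, Set


def resolve_with_failover(
    name: str,
    nameservers: Sequence[str],
    query: Callable[[str, str], str],
) -> Optional[str]:
    """Ask each nameserver in turn; return the first answer, or None if all fail."""
    for server in nameservers:
        try:
            return query(server, name)
        except OSError:
            # This server is unreachable or timed out; move on to the next one.
            continue
    return None


def fake_query(down: Set[str]) -> Callable[[str, str], str]:
    """Build a stand-in query function (hypothetical) where servers in `down` never answer."""
    def query(server: str, name: str) -> str:
        if server in down:
            raise TimeoutError(f"{server} did not answer")
        return "192.0.2.10"  # documentation-range address, purely illustrative
    return query
```

With one of two servers down the lookup still succeeds; with both down there is nothing left to ask and the function returns None, which is the situation the comment attributes to Akamai.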

      --
      "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
      RFC 1925
  13. Lack of multiple points of failure by bastardadmin · · Score: 5, Insightful

    I can see the logic that went into this plan:
    "Well, Akamai has a few million DNS boxes, if we put everything there we'll be fine! That's not a single point of failure!"
    Yeah, about that... multiple vendors may have been a good idea in retrospect instead of just one monolithic provider.
    Time to re-examine the definition of Single Point of Failure.

  14. Re:I'm definitely not a technical guru... by fish_in_the_c · · Score: 5, Insightful

    you can still get to all those sites. You just have to REMEMBER the ip instead of depending on the computer to look it up for you ;). TCP/IP was designed to have no central point of failure and still does its job well. DNS was not quite designed in such a way.

    --
    "Tolerance applies only to persons, but never to truth. Intolerance applies only to truth, but never to persons."
  15. Re:Good morning, Mr. Gore. by Ralph+Wiggam · · Score: 3, Insightful

    Damn that was funny 4 years ago. Do you have any good "hanging chad" material?

    Al Gore was talking about creating *legislation* that helped foster the Internet.

    Why do Conservatives bitch to high hell when anything they say is taken out of context, but repeat dumb quotes by Liberals out of context for years and years?

    Maybe they should stop worrying so much about people who haven't had a political job in 4 years and worry about the people who do have important jobs now and are doing them so amazingly badly.

    -B

  16. Easy to answer by falcon5768 · · Score: 2, Insightful
    Whatever happened to my decentralized net with no single point of failure?
    Easy: when most websites use some service of just one company, it doesn't much matter how decentralized the web is.

    The way to solve it is to get more companies out there who provide the same services, something not easy after the dot-bust era, when people don't want to take such risks.

    --

    "Slashdot, where telling the truth is overrated but lying is insightful."

  17. Re:Lack of notification by Umrick · · Score: 2, Insightful

    Err.. What are they supposed to do? Spam everyone who ever registered a domain and say, "oops our bad, but by the time you get this, it'll all be over?"

    If it's really that critical, then set up Nagios to monitor those IPs or something.

    I had one person call this morning because they couldn't reach Google. And what was she trying to use it for? She broke a window this weekend and was looking for a dealer who sells her type of window.

    I have a much bigger issue with spams clogging my incoming mail folders than I do with transient DNS issues.

  18. Re:I'm definitely not a technical guru... by AKnightCowboy · · Score: 5, Insightful
    How many *think* they can't live without web access?

    *Live* and *work* are two entirely different things. I could not get any of my work done with network access.

  19. Re:I'm definitely not a technical guru... by bluethundr · · Score: 4, Insightful

    ...how many *think* they can't live without web access? Offline working can be surprisingly productive, and as it often forces more thinking and planning (e.g. in preparation for being back online, and just thinking through what would happen if you could be online) the results end up being better.

    F'real. To think, they did all that even before the Altair was a twinkle in Ed Roberts' jockey shorts!

    --
    Quod scripsi, scripsi.
  20. decentralized net? by ptrangerv8 · · Score: 2, Insightful

    The Net is decentralized... however, if several *LARGE* sites happen to be resolved through one DNS server and it crashes, people think that the 'net is down'... IIRC, Helldesk people bitch about this - people calling up and saying 'I can't get to www.mytimewastingbullshitpage.com, is the net down?' Not realizing that just because one or two or thirty sites are down, the net is still up....

    FWIW, I missed google for all of 10 minutes, and figured it was my work ISP....

  21. Re:Good morning, Mr. Gore. by Anonymous Coward · · Score: 2, Insightful

    The difference is that we know we're joking and just being mean in a school yard sort of way. We don't take it seriously and only keep doing it in the 'little brother poking at big brother because it gets a rise out of him and there's nothing he can do about it' way. It's childish amusement.

    When liberals do it, they're telling The Big Lie and with the help of your liberal dominated media, turn those Big Lies into Pravda-like Truth and then use their own lies as political weapons.

    Your media boosts the left while hurting the right at every opportunity.

    How many times have you read "So-n-so, ultra conservative Congressman from xyz"? When it comes to someone like Kerry who is a top 5 ultra liberal, they never tell you that. They sure as hell never refer to him as "Senator Kerry, ultra liberal Senator from ultra liberal Mass. Junior Senator to Ted Kennedy. ...".

    See the difference now? Probably not, but it was worth a shot.

  22. Single Point of Failure? by stinkyfingers · · Score: 2, Insightful

    It's only a single point of failure if you can't get to *ALL* of your websites, instead of some.

  23. Success considered harmful? by DragonHawk · · Score: 3, Insightful

    I was thinking about this while scrambling to answer the phone, check outage reports, and generally calm down customers.

    If a product or service, such as Akamai, does their job very well, everybody will want to use them. If everybody uses them, you create a single point-of-failure. Any design flaw in that product or service becomes a disaster, simply through volume. Does this mean a successful product or service can actually be a bad thing for people?

    Other examples include just about anything from Microsoft, older versions of Sendmail and BIND (worm-of-the-week problem), and Firestone tires.

    (I'm not trying to advocate communism, excessive government regulation, or anything like that. So fanatical libertarians, conspiracy theorists, etc., can put down the rant-o-matic flamethrowers. :) )

    Comments?

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  24. Correction by PhuCknuT · · Score: 4, Insightful

    Akamai didn't mess up the net. Akamai messed up some web sites that are akamai customers. Remember kids, www is only a subset of the internet, and akamai customers a small fraction of the www.

  25. Re:I'm definitely not a technical guru... by Anonymous Coward · · Score: 2, Insightful

    You're also relying on these random companies to not violate your privacy and equally importantly to keep your data safe from destruction.

    Do you have any idea how poor the data safety & recovery policies are at most of these places?

    You're much better off having your own PC, putting VNC on it behind a firewall with an SSL VPN or even just ssh, and copying your precious data to a CD once a week or so. That's far better than most places are doing for you.

    You know how liable they are when they lose your data? Not at all. Just poof, gone. They might say they're sorry but it is unlikely they'll even admit anything happened. User error, you know?

  26. Re:I'm definitely not a technical guru... by Anonymous Coward · · Score: 2, Insightful

    If they would do their jobs, there would not be an issue.

    If the users, who think they don't need to worry about the net, would stop surfing porn with IE, stop clicking on every goddamned attachment that says "A fun game to play" or "Thought you'd like to see this", would stop signing up for every privacy-violating list on the planet then maybe the network guys would actually have a POSSIBILITY of keeping the network online!

    Oh yeah, and yo momma wears combat boots!

  27. I noticed this problem this morning and 1st thing by aardwolf204 · · Score: 2, Insightful

    I noticed this problem this morning when I was hunting for an updated version of YahooPOPs. I wasn't getting replies from Google. I opened another FirePanda window and my homepage, slashdot, was working fine (Hey look at that on the homepage, Yahoo changed their mail service today, no luck for YahooPOPs). I tried yahoo, altavista, even msn in different tabs but I wasn't getting anywhere.

    I tried pinging google and I was getting a reply, so my first thought was: there is something terribly wrong at verizon DSL. I must make the most of what fragmented connection I have now before it's down all day and I'm stranded actually doing work.

    That's when I started opening every story on slashdot's homepage in different tabs and setting them all to threshold 3, threaded... Just in case.

    Come to think of it, I'm going to change my slashdot bookmark from slashdot.org to 66.35.250.151 just in case of DNS failure.

    Need my SlashCrack

    --
    Im dreaming ofa big bndwdth, That can resist the /.crowd.May ur days b merry & bright & may al
  28. "DNS was not quite designed in such a way" by Ernesto+Alvarez · · Score: 5, Insightful

    you can still get to all those sites. You just have to REMEMBER the ip instead of depending on the computer to look it up for you ;). TCP/IP was designed to have no central point of failure and still does its job well. DNS was not quite designed in such a way.


    DNS was designed to be robust enough. Not one root server but many (ok, that's the weak point, we've all seen many DDoS against them, but it's not THAT bad). All zones are handled by their own servers, and (in theory) multiple servers for each zone. All in all, it's not a bad design.

    If what happened was that someone put all the servers behind one link, it's not DNS' fault, the BOFH there screwed up (and considering it's akamai, they should not have done that).

    (If that's not what happened, sorry, I couldn't RTFA, it's slashdotted or there's some sort of DNS problem there too).
  29. Put up or shut up! Re:Good morning, Mr. Gore. by sharper56 · · Score: 2, Insightful

    If you want to have a true dialogue instead of fingerpointing with "nah-nah" gibes, you'll have to actually state which films you're talking about and what were the quotes that are "out-of-context".

  30. Whatever happened to my decentralized net... by Lord+Kano · · Score: 3, Insightful

    Whatever happened to my decentralized net with no single point of failure?

    Outsourcing and consolidation.

    LK

    --
    "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
  31. novell and dns... by ecalkin · · Score: 3, Insightful

    This was years ago (3? 4)... I set up a novell server and setup dns on it as a forwarder and pointed workstations to my novell server for dns.
    One of the neat things was the log screen that showed dns actions and you could follow the trail of dns requests to see how they were resolved. what makes this not O/T is that i believe that this went into a log.

    The reason that I think about that is, if DNS stopped working, i'm not sure that i have cached numbers that i could easily get to....

    eric

  32. Missed the point... by Otto · · Score: 2, Insightful

    I was only pointing out that his example was bad.

    In this case, Akamai had some sort of major issue. Okay, fine. Fair enough.

    But the root servers themselves are a bad example to point to for a "single point of failure". They're not. The root servers, by themselves, are very robust, widely scattered, and any one of them can, in theory, handle the whole load. Admittedly, for the root, that load ain't a heck of a lot by comparison.

    Now, the DNS system itself has several thousand single points of failure, depending on how you define failure. Like you said, all .com traffic goes to Verisign's control, etc, etc.

    The root servers, however, are not one of these points of failure. They do what they were meant to do.. to be the root DNS servers. Several can fail and the root lives on.

    --
    - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
  33. Re:Lack of notification by Umrick · · Score: 2, Insightful

    A reasonable idea... I however doubt that any service would issue anything alert wise unless it was caused by some sensational event. New nasty worm, terrorism... A simple outage, even on this scale just isn't exciting enough for the newschannels.

    Shame that. Might warrant a blurb tonight on the news, but it certainly won't dislodge the scroller that has the most recent body count in it, and probably no "this just in" by the talking heads.

  34. Doesn't work that way any more by TBone · · Score: 2, Insightful

    Unless the server that lives at IP address W.X.Y.Z only hosts 1 server, and that server has its documents in the server root folder. Most webservers these days use virtual name services to map HTTP requests to the right "web server" and set of documents.

    My personal server runs 7 domains with 12 or 13 sites. Some have real docroot folders, some use the default "you aren't looking in the right place" set of docs. But using an IP address to access a web site probably won't work in these days of many servers per machine.
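
A rough illustration of why a bare IP address often lands on the wrong site: with name-based virtual hosting the server picks a document root from the HTTP Host header, not from the address the connection arrived on. The hostnames, paths and the `docroot_for` helper below are invented for the sketch and are not any particular web server's configuration.

```python
from typing import Optional

# Hypothetical mapping of virtual host names to document roots.
VHOSTS = {
    "example.org": "/srv/www/example.org",
    "example.net": "/srv/www/example.net",
}

# What a visitor gets when the Host header matches no configured site.
DEFAULT_DOCROOT = "/srv/www/default"


def docroot_for(host_header: Optional[str]) -> str:
    """Pick a document root from the Host header, ignoring port and letter case."""
    host = (host_header or "").split(":")[0].lower()
    return VHOSTS.get(host, DEFAULT_DOCROOT)
```

A browser pointed at http://192.0.2.7/ sends "192.0.2.7" as the Host value, which matches none of the configured names, so it receives the default "you aren't looking in the right place" documents.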

    --

    This space for rent. Call 1-800-STEAK4U

  35. first you need to understand dns by Anonymous Coward · · Score: 1, Insightful

    just because these guys use akamai hosted dns, and it broke, doesn't mean the rest of the world cares, or is even affected.

    Can anyone suggest that these guys build in some redundancy into their architecture? Using dns zone servers from only one provider is begging for trouble, since if that provider goes down, your servers no longer resolve.

    This is an architectural problem created by poor planning. Anyone who has a single point of failure in their architecture will eventually go down. Doesn't matter if this SPoF is Akamai, UUNET, or ATT. Regardless of how redundant any one provider is internally, a single provider is a SPoF from the architectural perspective of the website owner.

    That's why we host at UUNET and have a second shop and dns zone servers at a local ISP who is connected via a provider who is not UUNET.

    If UUNET wrecks their network in some massive outage, our backup site (webservers and tertiary dns) kicks in.
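
One crude way to spot the "all zone servers at one provider" problem is to look at the parent domains of a zone's nameserver hostnames. The sketch below assumes the last two labels of the hostname identify the provider, which is only a heuristic (it misjudges names under suffixes like co.uk, and says nothing about shared upstream links). The function names and the sample hostnames are hypothetical.

```python
from typing import Iterable


def provider_of(ns_host: str) -> str:
    """Guess the operator of a nameserver from the last two labels of its hostname."""
    labels = ns_host.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def has_single_provider(nameservers: Iterable[str]) -> bool:
    """True when every listed nameserver appears to belong to the same provider."""
    return len({provider_of(ns) for ns in nameservers}) <= 1
```

For example, a zone served only by ns1.provider-a.example and ns2.provider-a.example would be flagged, while adding ns1.provider-b.example would clear the check.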

  36. Google down? by thenerdgod · · Score: 3, Insightful

    My god... with google down my effective IQ is 12!

  37. Re:I'm definitely not a technical guru... by Malc · · Score: 2, Insightful

    Life without irony would be quite dull!

  38. Re:Uh by Slime-dogg · · Score: 5, Insightful

    It is misleading to refer to the box as a "Linux" box. Was it really the kernel that was at fault for the machine being cracked, or was it a bug in one of the daemons that the machine was running? There are differences between a Linux box that runs BIND and another that runs EZ-DNS (or whatever).

    How about this: Instead of labelling the Akamai boxes that have problems as "Linux" boxes, label them as "BIND" boxes, or whatever DNS server it is that it runs. Perhaps there's a FreeBSD machine in there that is having similar problems.

    It is allowable, though, to refer to a Windows box as just that. MS ships an all-in-one product, and seldom do admins use Windows to run BIND, Apache or other OSS servers.

    All of this hand-wringing in an effort to paint "Linux" as bad, or as "just as bad" is dopey. One might as well point a finger at the administrator of the machine that was hacked, the services that were running on it, etc. Most Windows problems are caused by the same thing too. It is wiser to point at the admin (and the services one chooses to run) than to point at the OS, or the kernel.

    --
    You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.