Akamai DNS Outage Messes up Net
katre writes "Checking all my favorite sites this morning, I saw that about half a dozen seem to be offline. Trying to figure out why, I found an interesting article on the front page at http://isc.incidents.org/. Seems that the problems at Akamai are screwing over Yahoo, Google, Microsoft, Fedex, Xerox, Apple, and others. Whatever happened to my decentralized net with no single point of failure?"
but I believe the centralized concept of the 'net is something that is coming to an end, much to our loss. I'm pretty bothered by the fragility of this system. How many of you can't work without web access?
Don't be a looter...and yes, I know that it's spelled with an "A" instead of an "E".
It's still there, and you're using it. The only organizations affected by this are those who chose to use a service that acts as a single point of failure.
trustedworlds.net - gaming, security, and the gunk that lives in between
Whatever happened to my decentralized net with no single point of failure?
It's there. Get out your old Usenet reader. See, you still have your porn.
Know what I like about atheists? I've yet to meet one that believes God is on their side.
DNS dying on you? Just throw it on the pile of other connection problems
;)
I think everyone has several "single" points of failure -- my cable modem dies at least twice a month and my wireless router conks out at least twice a day
Yahoo is already resolving through scd instead of akamai. I didn't check any of the others.
If you clear your cache, you will probably get the new entries, unless your ISP hasn't caught onto the problem yet.
How ya doin', Al?
...I can't even get to http://isc.incidents.org/
You could still access Slashdot, couldn't you?
Be very, very careful what you put into that head, because you will never, ever get it out. - Cardinal Wolsey
Hmmm.
The web happened my dear friend, and it was based on the predominant distributed computing model at the time: client/server. Even DNS, with its highly distributed spread of processing and data, has a set of (overloaded) root servers with the commensurate single points of failure. The solution? Peer-to-peer.
:)
Too bad even the term P2P raises so many red flags with certain Associations of America.
This should cause some problems for Akamai; they had an outage on May 24th. Once can be overlooked, but twice? These are some big companies, and they are going to be calling them. I bet there are some sweating techs in the cool NOC right now.
War isn't about who's right. It's about who's left.
Do we know if this is at all related to the Linux kernel 2.4.2x/2.6 DoS exploit discovered yesterday?
What ticks me off about these incidents (and I suspect that there have been several in the last 6 months) is that there is absolutely no notification given, either during or after the event. During this outage, some news outlets were still reachable (including Slashdot), and a simple notification would have saved hours (* 10s of thousands of network dudes worldwide) of time and much grief from the big bosses who couldn't reach Yahoo Finance, I mean critical business web sites.
Are these guys so convinced of their omnipotence and indispensability that they don't feel the need to communicate with the world about what is going on?
sPh
that the /.'ers aren't trying to take credit for slashdotting the entire WWW.
"Facts are meaningless. You could use facts to prove anything that's even remotely true!" -- Homer Simpson
Pwned by CNAME to Akamai?
(You can't have CNAME records for the base domain, hence google.com would have had an A record instead, whilst www.google.com would have been a CNAME to Akamai.)
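For anyone who wants to see that rule in action, here is a minimal sketch of a zone check, assuming records are plain (name, type, value) tuples. The zone contents and the akadns.net target names are made up for illustration, and real DNS allows a few DNSSEC record types next to a CNAME, which this ignores.

```python
from collections import defaultdict

def cname_conflicts(records):
    """Return names that hold a CNAME alongside any other record type."""
    types_by_name = defaultdict(set)
    for name, rtype, _value in records:
        types_by_name[name.lower()].add(rtype.upper())
    return sorted(
        name for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )

# Hypothetical zone: the apex carries SOA and NS records, so it cannot
# also be a CNAME. The www name has nothing else, so a CNAME is fine.
zone = [
    ("example.com", "SOA", "ns1.example.com"),
    ("example.com", "NS", "ns1.example.com"),
    ("example.com", "A", "192.0.2.10"),
    ("www.example.com", "CNAME", "www.example.akadns.net"),
]

# Same zone with an (invalid) CNAME added at the apex.
bad_zone = zone + [("example.com", "CNAME", "example.akadns.net")]

print(cname_conflicts(zone))      # no conflicts expected
print(cname_conflicts(bad_zone))  # the apex should be flagged
```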
Well I guess it's back to IP addresses for us!!!
....
I'll be at 127.0.0.1 until this blows over.
May the Maths Be with you!
It's not truly decentralized...
The root nameservers are the most obvious example...
The most obvious example? The fact is that there are 13 of them, in widely scattered locations across the globe, and it's not decentralized?
Damn man, what exactly would you consider "decentralized" then?
Root servers go down all the time. It's not particularly unusual. There's THIRTEEN of the things. Up to 8 have been down at once with no major effects on the network, IIRC.
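A rough sketch of why that works, assuming a resolver that just walks its server list until something answers. The `fake_query` stub and the choice of which eight servers are down are invented for the example; a real resolver also tracks response times and retries.

```python
from typing import Callable, Iterable, Optional

def resolve_with_failover(
    servers: Iterable[str],
    query: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the first answer from any server, or None if all of them fail."""
    for server in servers:
        answer = query(server)
        if answer is not None:
            return answer
    return None

# Thirteen root server names, a through m.
ROOTS = [f"{letter}.root-servers.net" for letter in "abcdefghijklm"]

# Hypothetical scenario: the first eight are down at once.
DOWN = set(ROOTS[:8])

def fake_query(server: str) -> Optional[str]:
    # Stand-in for a real DNS query; servers that are down give no answer.
    return None if server in DOWN else f"referral from {server}"

result = resolve_with_failover(ROOTS, fake_query)
print(result)
```

The lookup only fails outright when every server in the list is unreachable, which is the situation the Akamai customers ended up in.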
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
I can see the logic that went into this plan:
"Well, Akamai has a few million DNS boxes, if we put everything there we'll be fine! That's not a single point of failure!"
Yeah, about that... multiple vendors may have been a good idea in retrospect instead of just one monolithic provider.
Time to re-examine the definition of Single Point of Failure.
Let's see, so far today... We had a report on Yahoo... They're down. A report on a virus linked to Symantec... they are up and down. We always link to Google, they are having problems... wooo. Now we just need another patent from Microsoft to bring them down... which by my records shouldn't be too long.
Hmmm.
Checking all my favorite sites this morning...
Microsoft, Xerox and FedEx are some of my favorite sites too! But due to the outage I'm stuck slumming it here on Slashdot...
Yeah, Google didn't work and we didn't know what to do. We tested and determined the problem was Akamai within a minute. So I used AIM to ask a friend who could still resolve Google what the IP was. He passed it to me over AIM, using gaim encryption no less. We then created an alias for Google on our DNS server: google.ourdomain.com.
We also developed a new DNS protocol in the process. ESEDOIM: Extremely slow encrypted DNS over instant messenger. Who wants to write an RFC?
The GeekNights podcast is going strong. Listen!
It's not like a092156fg.akamai.net is in Seattle and k1039665.akamai.net is in Saskatoon. Instead, all of *.akamai.net goes to whatever cluster is "closest" to the requesting IP (based on BGP, Colonel's Secret Recipe, etc)
So if Akamai's DNS gets screwed up, I would expect major weirdness. And as more sites join EdgeSuite (where you host your entire domain on Akamai's servers & DNS) the effect must magnify. Of course, I could be completely wrong. I'm not a routing god, just a guy who thinks Akamai is a cool hack.
From NANOG:
From here neither www.google.com, nor www.apple.com work. Both seem to return CNAMES to akadns.net addresses (eg, www.google.akadns.net, www.apple.com.akadns.net), and from here all of the akadns.net servers listed in whois are failing to respond.
Akamai didn't mess up the net. Akamai messed up some web sites that are Akamai customers. Remember kids, www is only a subset of the internet, and Akamai customers a small fraction of the www.
The real cost of a web site dropping is a lot more difficult to figure out than you might imagine. Say Amazon goes down for a couple of hours. Are all those potential sales lost forever? I doubt it. Some people will just come back and order later. The firm is unlikely to see any long term impact unless the outage becomes habitual. Non-retail sites probably have even more flexibility. About the only area in which an outage could have a real, long term adverse impact would likely be in financial services. If Schwab goes down for half a day they will suffer big time for a long time. If you're talking "the economy" as in the "big picture" economy suffering - forget it. Web based commerce isn't that important yet.
Seriously we need a *.sht domain.
It appears that, at around 8:30 AM EDT (US Eastern Daylight Time), Akamai's DNS network experienced some kind of major failure. All of their DNS servers (that anybody could find) were not responding to DNS queries. It appears that Akamai started to come back online at around 10:00 AM EDT.
Since a great many big name sites use Akamai, this effectively made large parts of the Internet unreachable. The destination servers themselves were up, but clients were unable to turn names (like www.example.com) into network addresses (like 192.0.2.42).
As Akamai maintains dozens, if not hundreds, of DNS servers across the globe, it is extremely unlikely that this was due to a normal equipment failure or DoS attack. Some kind of internal system trouble is much more likely. Whether it was a deliberate attack or an accident is unknown to me at this time. It could just be that an internal configuration change blew up in a really bad way. Sh*t happens.
I do not know if this was just an Akamai DNS problem, or if other Akamai services were also affected.
Due to the way Akamai is usually implemented, it happened that, in many cases, the second-level domain names (like example.com) worked, but subdomains (like www.example.com and mail.example.com) did not. This is because most organizations put in CNAME records (pointing to names in *.akadns.net) for the subdomains. You cannot use a CNAME record for a domain that has other records, though, so most domains still had traditional A records, on their own nameservers, at the second-level.
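To make the apex-versus-subdomain behavior concrete, here is a toy model of the lookup, assuming a flat table of records and a list of zones whose nameservers are not answering. All names and addresses are illustrative, and this is not how a real resolver is structured.

```python
# Toy records: name -> (type, value). Everything here is made up.
RECORDS = {
    "example.com": ("A", "192.0.2.10"),
    "www.example.com": ("CNAME", "www.example.akadns.net"),
    "www.example.akadns.net": ("A", "192.0.2.42"),
}

def resolve(name, dead_zones=(), max_hops=8):
    """Follow CNAMEs to an address; return None if a needed zone is unreachable."""
    for _ in range(max_hops):
        if any(name == zone or name.endswith("." + zone) for zone in dead_zones):
            return None  # the nameservers for this zone are not answering
        record = RECORDS.get(name)
        if record is None:
            return None  # no such name in the toy table
        rtype, value = record
        if rtype == "A":
            return value
        name = value  # CNAME: restart the lookup at the target name
    return None  # too many CNAME hops

outage = ("akadns.net",)
print(resolve("example.com", dead_zones=outage))      # apex has its own A record
print(resolve("www.example.com", dead_zones=outage))  # CNAME leads into the dead zone
```

The web server behind www.example.com can be perfectly healthy here; the lookup dies one hop before anyone gets an address for it.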
The following sites/organizations are known to use Akamai: Yahoo, Google, Microsoft, Altavista, FedEx, Xerox, Apple
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Not too long after 9/11, I was surfing the net and needed to look up something at the Library of Congress for one of my classes. It wouldn't connect. At first I thought we'd just lost DNS (not so uncommon an occurrence at my university in those days), but found I could still connect to slashdot.org and some other sites.
Being a geek, I thought up a list of about 30 sites to ping, scattered across the US. (.govs and .edus mostly.) The ones that replied, I plotted on a US map based on their DNS LOC. (A project I wrote for a previous class.)
I freaked out a bit when the mid-atlantic seaboard came up missing. I crossed my fingers hoping that it was just some idiot who'd accidentally cut one of the main fibers (which is what it ended up being) and not that Washington DC was now a big hole in the ground.
A preposition is a terrible thing to end a sentence with.
DNS was designed to be robust enough. Not one root server but many (ok, that's the weak point, we've all seen many DDoS against them, but it's not THAT bad). All zones are handled by their own servers, and (in theory) multiple servers for each zone. All in all, it's not a bad design.
If what happened was that someone put all the servers behind one link, it's not DNS' fault, the BOFH there screwed up (and considering it's Akamai, they should not have done that).
(If that's not what happened, sorry, I couldn't RTFA, it's slashdotted or there's some sort of DNS problem there too).
GPG 0x1B479C78
From NANOG mailing list again:
Google pulled references for Akamai's DNS servers a short period ago. They are presently serving their own DNS requests.
Also:
People seem to be getting around this by changing their DNS entries.
E.g. www.yahoo.com always used to be a CNAME for www.yahoo.akadns.net. But now:
# host www.yahoo.com
www.yahoo.com is an alias for www.dcn.yahoo.com.
www.dcn.yahoo.com has address 216.109.118.64
www.dcn.yahoo.com has address 216.109.118.65
www.dcn.yahoo.com has address 216.109.118.66
www.dcn.yahoo.com has address 216.109.118.67
www.dcn.yahoo.com has address 216.109.118.68
www.dcn.yahoo.com has address 216.109.118.69
www.dcn.yahoo.com has address 216.109.118.70
www.dcn.yahoo.com has address 216.109.118.71
www.dcn.yahoo.com has address 216.109.118.72
www.dcn.yahoo.com has address 216.109.118.73
www.dcn.yahoo.com has address 216.109.118.74
www.dcn.yahoo.com has address 216.109.118.75
Which is owned by Yahoo! (via HotJobs.com).
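If you want to check a batch of sites the same way, a small parser for that `host` output might look like the sketch below. It assumes the two line shapes shown above ("is an alias for" and "has address") and nothing else; the function names are mine, and the sample is just the first lines of the paste.

```python
import re

# Line shapes as they appear in the `host` output quoted above.
ALIAS_RE = re.compile(r"^(\S+) is an alias for (\S+?)\.?$")
ADDR_RE = re.compile(r"^(\S+) has address (\S+)$")

def parse_host_output(text):
    """Return (aliases, addresses) parsed from `host`-style output."""
    aliases = {}
    addresses = {}
    for line in text.splitlines():
        line = line.strip()
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
            continue
        match = ADDR_RE.match(line)
        if match:
            addresses.setdefault(match.group(1), []).append(match.group(2))
    return aliases, addresses

def uses_akadns(aliases):
    """True if any alias target sits under akadns.net."""
    return any(
        target == "akadns.net" or target.endswith(".akadns.net")
        for target in aliases.values()
    )

sample = """\
www.yahoo.com is an alias for www.dcn.yahoo.com.
www.dcn.yahoo.com has address 216.109.118.64
www.dcn.yahoo.com has address 216.109.118.65
"""

aliases, addresses = parse_host_output(sample)
print(aliases, addresses, uses_akadns(aliases))
```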
If it weren't slanted, it'd be |.
(Apologies to whomever I'd seen that from before.)
That green slime had it coming.
The problem is that those sites created their own single point of failure by all using Akamai for DNS. When Akamai DNS fails, sites that depend on it for their own DNS fail.
It used to be nearly impossible for this to happen. The original rules for DNS were that you had to have at least 2 nameservers for your domain, preferably 3 or more, and they couldn't be on the same physical networks. With that rule, having a single network go down rarely made any domain unresolvable (backbone networks whose outages could render dozens or hundreds of other networks unreachable being the exception). Maybe we should put the old nameserver-diversity rules back into place.
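A minimal sketch of what checking such a rule could look like, assuming IPv4 nameserver addresses and treating "same /24" as a crude stand-in for "same physical network". That prefix length is my assumption, not part of the old rules, and the addresses are documentation ranges.

```python
import ipaddress

def distinct_networks(addresses, prefix=24):
    """Group addresses by enclosing network; the /24 default is an assumption."""
    return {
        ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        for addr in addresses
    }

def meets_diversity_rule(addresses, min_servers=2, prefix=24):
    """At least min_servers distinct nameservers, spread over 2+ networks."""
    unique = set(addresses)
    if len(unique) < min_servers:
        return False
    return len(distinct_networks(unique, prefix)) >= 2

# Illustrative nameserver address sets.
spread_out = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
same_subnet = ["192.0.2.1", "192.0.2.2"]

print(meets_diversity_rule(spread_out))   # servers on separate networks
print(meets_diversity_rule(same_subnet))  # both servers share one /24
```

Note that a check like this says nothing about who operates the servers. Addresses on different networks still share a fate if one provider's internal systems break.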
I wonder if Microsoft/AdTI will buy the "\." domain? News for Nerds slanted the other way!
It is misleading to refer to the box as a "Linux" box. Was it really the kernel that was at fault for the machine being cracked, or was it a bug in one of the daemons that the machine was running? There are differences between a Linux box that runs BIND and another that runs EZ-DNS (or whatever).
How about this: Instead of labelling the Akamai boxes that have problems as "Linux" boxes, label them as "BIND" boxes, or whatever DNS server it is that it runs. Perhaps there's a FreeBSD machine in there that is having similar problems.
It is allowable, though, to refer to a Windows box as just that. MS ships an all-in-one product, and seldom do admins use Windows to run BIND, Apache or other OSS servers.
All of this hand-wringing in an effort to paint "Linux" as bad, or as "just as bad" is dopey. One might as well point a finger at the administrator of the machine that was hacked, the services that were running on it, etc. Most Windows problems are caused by the same thing too. It is wiser to point at the admin (and the services one chooses to run) than to point at the OS, or the kernel.
You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.