Akamai Having Problems?
A reader writes: "It appears that sometime during the night, Akamai had some problems causing connectivity issues with many hosts throughout the night. Akamai provides a DNS load balancing solution to many major internet companies/sites including (but not limited to) Google, Yahoo, etc. Is it a bad idea to rely so heavily upon one service for our major internet needs?" Not much details - but I can confirm having problems this morning. Thanks to alert readers for pointing out that they were having "DoS related issues" and that service was restored as of 1400 GMT.
Perhaps this is related to the SBC strike?
I can confirm problems accessing the apple.com trailers, but microsoft.com has no problems. I thought they were using Akamai's services as well?
Yes, I noticed it too when I wanted to watch new movie trailers at Apple's QT site, which is apparently and unfortunately hosted by Akamai.
Akamai also hosts files (images, binaries) for many major websites. Seems like they have some pretty insane bandwidth too...
Posting a link to their website on Slashdot should help them out.
Beauty is in the eye of the beerholder.
Terrorism.
And this is *news*???
I don't think it's a problem to go with one company, as long as that company has a distributed solution with many uplink providers. So, basically, redundancy for when something happens, because no matter how good you are there will always be hiccups.
Evolution or ID?
It appears that websites that use Akamai's distribution system are currently not reachable. Security related web sites affected are symantec.com and trendmicro.com. Virus updates may fail as a result. Further details are currently not available and updates will be posted here as they become available. Thanks to Vidar Wilkens for alerting us to this problem.
According to a post to NANOG, the outage may be the result of a DDOS attack. At this point, Akamai has no ETA for a resolution.
Update 09:45 EST: Looks like some of the Akamai hosted sites start to come back."
You gotta love that "Quiet, well kinda quiet". ;)
UNIX? They're not even circumcised! Savages!
- here
- here
- here
- here
Nice bit of bandwidth theft, there.
Is it a bad idea to rely so heavily upon one service for our major internet needs?
We do that already. Remember when verisign introduced Sitefinder, thus effectively making various services (like spam filters etc) unusable because non-existing domains all of a sudden replied with a valid IP.
Underholdning.info
Akamai's NOC says service restored approx 1400GMT. Earlier NOC quotes include: It is a system-wide problem that "looks like it may be a DOS attack".
Their system is supposed to be distributed in such a way that any major outage in a section of the internet would not affect their overall ability to deliver the content, so presumably any outage at an ISP would not hit them too hard.
www.peacefire.org/bypass/Proxy/akamai.html
BTW something interesting:
http://a1.g.akamaitech.net/6/6/6/6/
**FREE** Track and view your phone's via CellID and/or WIFI and/or GPS
Yahoo had trouble for at least an hour or so.
A lot of people I know and I have been having issues with apple.com, specifically the QuickTime trailers section. Download speeds hit rock bottom, at about 200 bytes/second on a 3MB cable connection. As I said, a number of people were experiencing the same speeds.
Blueyonder UK
I spent ages trying to think of sig, but never did
The cleaning lady needed electricity to her vacuum cleaner.
Poor sysadmins.
I demand the Cone of Silence!
NANOG Archive
Rus
Cheap UK and US VPS
Of course it is a bad idea.
However, blame that on the other competing services who haven't become cheaper, faster or better at whatever it is that makes Akamai so popular.
Avantslash - View Slashdot cleanly on your mobile phone.
I love how the first reaction when something goes wrong is to replace it, or introduce competition, or whatever. Yes, there are plenty of times when a service needs competition to encourage it to suck less. But go find me another company that is even remotely prepared to do DNS load-balancing. Verisign? Oh, that's a great idea. Going to start one yourself? Let us know when you have the infrastructure.
The fact is, we have NO idea what caused this. There's no link to any story anywhere - just one reader report. It could be Akamai's fault. It could be their upstream providers. It could be failures elsewhere in the Internet. Could be someone uploaded a bad zone file. Or maybe some over-zealous backhoe operator slashed some fiber somewhere.
It's probably best to reserve judgement until you have all the facts. (And if you're about to hit the reply button, yes, I'd say the exact same thing if MSFT lost their DNS service).
There is no sig, there is only Zuul.
Not many details. Many, dammit! :)
People say that the Internet can't be knocked out. That may be true in the infrastructure sense, but if you're able to knock down Akamai or any other major solution provider, think of the sites that would go down (Google, Yahoo et al), and the repercussions for the global economy. So yes, the domino theory doesn't apply to the Internet, but it becomes exponentially more dangerous when we rely on one domino for a significant share of communications.
I guess this throws a wrench in their claim of 24/7 uptime on their main page. Nice how their marketing team says 100% availability, when people get PhD's by adding more 9's to their 99.99..%'s
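To put the nines in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that a single 90-minute outage (the figure quoted elsewhere in this thread) is the only downtime in a 365-day year; the function names are made up for illustration.

```python
# Assumption: one 90-minute outage is the only downtime in a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def availability(downtime_minutes, period_minutes=MINUTES_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_minutes / period_minutes


def allowed_downtime(nines, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime per period permitted by an N-nines target."""
    return period_minutes * 10 ** (-nines)


if __name__ == "__main__":
    print(f"availability with 90 min down: {availability(90):.5%}")
    for n in range(2, 6):
        print(f"{n} nines allows {allowed_downtime(n):.2f} min/year")
```

Under those assumptions a 90-minute outage lands between three nines (about 526 minutes a year allowed) and four nines (about 53 minutes a year allowed), so a literal 100% claim is gone for the year either way.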
They did manage to serve up the banner ad, but the rest of the page just sat there. Same with cnn.com. Seemed to resolve itself here in Philly about a half hour ago.
OS/X Server :) - No Sasser here...
I hope it's not my fault. I knocked out 3 of Akamai's servers with a router problem =(
War isn't about who's right. It's about who's left.
The outage was apparently related to a DDOS attack against project Gutenberg that started this morning.
At work we lost connectivity to a handful of remote sites located in the Northeast, Midwest, and Southeast. Other sites in the same region but different cities were not affected. I was told it was a fiber cut on AT&T's backbone.. wonder if it has anything to do with this.
A slip of the foot you may soon recover, but a slip of the tongue you may never get over. -Benjamin Franklin
Our Akamai rep tells us that it was an issue with a software version rollout. They flushed all their image caches, and effectively caused a DOS on themselves.
I've had so many problems with akamai as of late...it seemingly has a monopoly over just about any commercial website I'm interested in. I don't see images very often while I'm at work...they just idle. Maybe it's a sign that I should stop shopping when I'm supposed to be working :(
Is it a bad idea to rely so heavily upon one service for our major internet needs?
Yeah. Duh. But, where else can I get a /. fix?
I couldn't get to eBay this morning either. It seems to be resolved now though.
I was unable to get to the sites for the major AV vendors this morning. I chalk it up to Agobot, as it DDoSes their sites. See the following link:
http://www.trendmicro.com/vinfo/virusencyclo/default5.asp?VName=WORM_AGOBOT.GN&VSect=T
Due to a peering problem between ATT and UUNet, a subset of UUNet users may have experienced problems accessing Akamai delivered sites between 8-10pm EDT on Saturday May 22, 2004. The problem has been fully resolved.
I think I'll stop here.
As a small company we have a limited view of the Internet, but it seems to us that there have been DNS and connectivity problems throughout the Internet for the last 90 days or so. I was guessing that there was a DDoS attack against the root DNS servers that wasn't being reported. This would seem to be along the same lines.
sPh
Bittorrent reduces the load on the central server by having everyone who downloads content upload content to other users. Couldn't a similar system be designed for HTTP connections? Obviously it would be designed with much smaller files in mind and with less overhead.
I realise no one gives a shit about some large company's bandwidth, but for small community sites it could really make a difference. They wouldn't have to pay for a company to mirror their site and would save on bandwidth costs.
This wouldn't work for server side scripts (as the HTML output would be different for every user) but for static HTML and images it would be perfect.
I'm not sure if this is related, but last night about 5pm UTC my host http://www.ezzi.net got hit with a DDoS attack. Couple hours later they were up and running though.
Due to a peering problem between ATT and UUNet, a subset of UUNet users may have experienced problems accessing Akamai delivered sites between 8-10pm EDT on Saturday May 22, 2004. The problem has been fully resolved.
Maybe the problem has recurred.
Live your life each day as if it was your last.
Since I've had problems like this with my ISP, I figured it was something local. I guess not.
OK, moderate me redundant because now I see a million other people saw the same thing...
Am I part of the core demographic for Swedish Fish?
maybe they autoupdated all their servers and made them reboot?
let's see what updates they have on their support site.
Live your life each day as if it was your last.
is due within the hour. We're pulling out our SLAs to find out what recourse we have against them. We were down for almost 90 min.
maybe there's a new DNS replace project going on, ... ...
and their test runs are conflicting with the "old" DNS service.
anyway, get a caching DNS server
from my experience TODAY, once the name got resolved
the rest of the data from the site would load like normal
.. with akamai-hosted sites that has an odd effect in Mozilla Firefox 0.8 on Linux. A combination of Firefox doing an unnecessary reverse lookup on the IP that's being connected to (this is in addition to the regular forward lookup to get the IP, and waits until timeout, usually 30 seconds) and akamai's lack of any reverse zones configured for their boxes.
A buddy of mine worked through further diagnosis to reveal this problem and registered a bug report with the MozDev team, however, after he contacted Google to inform them of the problem, they put in a blank in-addr.arpa zone file for their IP's, which resulted in an immediate negative result on that reverse zone lookup. If the rest of akamai would get on the stick and do the same, the problem would be history.
and thought my router was having problems.
Happens every time!
I've noticed that Akamai seems to carry an MX record for www.spamcop.net. As of the last couple of days, I can't seem to resolve bl.spamcop.net -- is this the same issue? Anyone else having this problem?
Roving Web-Teleoperated Robot
seems also to be down. I was trying to access it after the Cannes result, and thought the US government had censored it...
Google passes Turing test : see my journal
I do know that Symantec's site is difficult to reach and they use Akamai.
I worry about my AV updates being incomplete...
As seen on http://alpha.cesmail.net/graphics/spamstats.gif
have you been defaced today?
A guy I spoke with at Akamai this morning said that the problem was NOT the result of any outside attack on the company's servers. Rather, he said, the problem stemmed from a bug within a tool that allows customers to purge old content and update their cache with new content. Akamai said the problem lasted about 90 minutes, and affected numerous Akamai customers. No response, though, as to why this bug suddenly reared its head.
...because you never know who you're dealing with.
Akamai provides a DNS load balancing solution to many major internet companies/sites including (but not limited to) Google, Yahoo, etc. Is it a bad idea to rely so heavily upon one service for our major internet needs?
Don't you see the irony? How much of the internet populace depends on Google for their searching needs?
I suspect the problem here, as there, is that there aren't many who can compete at a service level.
tasks(723) drafts(105) languages(484) examples(29106)
Last night, the problem I saw was that Qwest couldn't connect to Verizon. Verizon in MA was basically hard down because of this. I got the Qwest guys and the Verizon guys working on it. Sucks to have a client in Hong Kong that calls me in the middle of the night when they can't get their email.
this sig has been rated E for Everyone.
Currently akamai is configured as an open proxy. This is obviously bad (for them), since anyone can steal the service they're selling for big bucks.
Try:
http://a40.g.akamaitech.net/7/40/1601/1d/images.slashdot.org/topics/topiclinux.gif
(it works with ANY URL)
Obviously, they noticed it, and tried to fix it. Their fix turned out to block valid customers (like Apple, as has been mentioned), so now they have rolled it back to the free-for-all setup.
They're probably working on a better fix right now.
This is nicely commented on in a recent story over at CFO where it says "Broadly speaking, Akamai needs servers near the consumers of content..[] Akamai, on the other hand, has servers pretty much everywhere."
To trim the facts down a bit: Akamai has servers near most users these days, and the distributed DNS returns the address of the closest content server. If I, who live in Norway, try to access fbi.gov from any computer at an ISP connected to the NIX (Norwegian Internet eXchange), I get a DNS response that leads me to Akamai's servers in Oslo, Norway. I've tried this for some time, just to see what happens, with cnn.com, apple.com and fbi.gov. While on a trip to Sweden I tried this while connecting through a local DSL provider and I got a response from a server located in Sweden, hence even the Swedes have their own Akamai mirror these days.
A DDOS from someone in Norway would, if directed towards a domain or webpage and not an IP address, lead to downtime on that specific local mirror, not Akamai's entire network. We can conclude from this that only events such as a major blackout in Akamai's core network or, like this time, DoS'ing their own network would take out their service.
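A toy Python sketch of that idea, for anyone who wants to see it spelled out. Everything in it is hypothetical: the region codes, the addresses (taken from the documentation-only IP ranges), the preference table and the failover to the next-nearest region are my assumptions for illustration, not a description of how Akamai's DNS actually decides.

```python
# Hypothetical edge servers per region (documentation-only IP ranges).
EDGE_SERVERS = {
    "no": ["192.0.2.10"],
    "se": ["198.51.100.20"],
    "us": ["203.0.113.30"],
}

# Hypothetical preference order for each client region: nearest first.
NEAREST = {
    "no": ["no", "se", "us"],
    "se": ["se", "no", "us"],
    "us": ["us", "se", "no"],
}


def pick_edge(client_region, healthy):
    """Return the nearest healthy edge address for a client, or None."""
    for region in NEAREST.get(client_region, ["us", "se", "no"]):
        for ip in EDGE_SERVERS[region]:
            if ip in healthy:
                return ip
    return None  # every edge is down, e.g. a network-wide failure


if __name__ == "__main__":
    all_up = {ip for ips in EDGE_SERVERS.values() for ip in ips}
    print(pick_edge("no", all_up))                   # local mirror
    print(pick_edge("no", all_up - {"192.0.2.10"}))  # local mirror knocked out
    print(pick_edge("no", set()))                    # everything down
```

Knocking out one mirror only changes the answer for nearby clients; it takes something that hits every edge at once to leave the resolver with nothing to hand out.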
"-Who said sit down?!"
-- S. Ballmer @ MSDC 2003.
Akamai just told me it was a 90-minute glitch (between 8 and 9:30 Eastern time) caused by a software bug. The company says everything's back to normal.
Hiawatha Bray
Tech Reporter
Boston Globe
...for those that don't know, a market where it is unprofitable to be the 2nd company around (usually, you can sell cheap because the major company wants to reap profits). A small "Akamai" competitor is no competitor at all, really. You need to have a similar huge network in order to compete. They would undoubtedly clash and one would come out as the winner.
So well, if it hadn't been Akamai it'd probably be someone else. Of course, one company can still build a helluva redundant network, if they want to... it's just usually not cost-efficient.
Kjella
Live today, because you never know what tomorrow brings
From their webpage
...Akamai routinely delivers 15% of the total Web traffic and pushes on average 40+ GPS (and growing)...
Now if I would know what the hell GPS means.
I don't think they mean Global Positioning System, or is it a misspeld GB/sec?!
>> Had I been going to bed earlier every night? Have I been sleeping later? Has Tyler been in charge longer and l
I spoke with Akamai support. They indicated that it was a far reaching problem, but I have not heard the reason yet.
The customer login to the admin portal was down as well. It was almost like someone had dumped the customer account database.
Akamai has a QOS commitment of 100% uptime based on the idea that not all of the 1,000's of servers could go down at the same time. But... There you go.
This information has been deemed classified. Please remove all information on how to disable Homeland Security notices from this and any other computer that it may reside on.
The FBI counter terrorism task force in conjunction with the FBI computer security teams will be contacting you shortly.
Thank you for your cooperation.
You mean, not many details?
Come on, Hemos. You're always making dumb grammar mistakes. This place looks so unprofessional as a result.
I'm sorry to hear of their trouble. I offer prayers for Akamai.
Akamai is aware of a service interruption earlier today affecting content delivery.
We have identified the root cause and have implemented the fix. Issues retrieving content should be decreasing or resolved. Updates will continue to be posted on the Akamai Edge Control Management Center.
so there is something wrong with their cdn. so much for 100% availability. my guess, all the edge servers were ok but there may be a problem with their noc or software.
Live your life each day as if it was your last.
I was expecting to see lots of jokes about them being slashdotted. Oh well.
... only human.
When it comes down to it...
They're still...
..and I love the fact that they hosted the M$ site on Linux.
Please use [ informative / summarizing ] SUBJECT LINES
Flame me here
It took Microsoft down for DAYS.
All due to a router config bug introduced by Microsoft.. So it was really Microsoft DoSing themselves via Akamai.
And it would be unfair to blame the router config for more than a few hours of outage. The big problem was the complete and utter paralysis of management on the conference calls.
I don't think the details of that outage have been leaked much. It was quite a hoot talking to those involved during the outage. And it wasn't hard, given the duration.
It is my recollection that the problem related to Microsoft's filtering DNS requests from Akamai.
You misspelled the word "misspelled".
Sigs are bad for your health.
No, no, no, The End of the Internet is here
how exactly do you pronounce "Akamai" ? It's one of those words that I use and see only online, but never have to actually speak, or have ever heard anybody speak. Or maybe it was just never meant to be spoken.
one word: Speedera. I've used both Akamai and Speedera, and Speedera has better service, better tools, better people to deal with, and much better prices.
Comcast has been having lots of DNS problems, so I used Posadis (http://www.possadis.org) as what amounts to a caching DNS server on my local network.
The new version works very well, and will make your network connection seem much faster.
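For anyone wondering why a local cache helps, here is a minimal Python sketch of the idea. It is not how Posadis works internally; the `CachingResolver` class and its fixed TTL are assumptions for illustration (a real caching DNS server honors the TTL carried by each record). The only real library call is `socket.gethostbyname` from the standard library.

```python
import socket
import time


class CachingResolver:
    """Remember successful lookups for a fixed time (assumed TTL)."""

    def __init__(self, ttl_seconds=300.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self._ttl = ttl_seconds
        self._lookup = lookup      # upstream resolver, replaceable for testing
        self._clock = clock
        self._cache = {}           # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no upstream query needed
        address = self._lookup(hostname)  # cache miss: ask upstream
        self._cache[hostname] = (address, now + self._ttl)
        return address
```

While an entry is fresh, repeated lookups never touch the upstream servers, which is why a short upstream DNS hiccup can go unnoticed for names you resolved recently.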
You were mistaken. Which is odd, since memory shouldn't be a problem for you
An isolated issue occurred this morning (roughly during the period of 8:00 a.m. - 9:30 a.m. ET), where multiple Akamai customers experienced intermittent performance and availability degradation.
This degradation was the result of a bug within one of Akamai's backend content control management tools, which allows the expiration of content on the Akamai network. The degradation was not a result of any outside interference with Akamai's network (such as Denial of Service or hacking).
Upon identification of the bug, Akamai quickly took corrective action which returned customers to normal service levels. Akamai is currently putting measures in place to return the content management tool to its normal working order and is adding safeguards such that the issue will not occur in the future. In the meantime, Akamai customers are able to serve their content through the Akamai Network normally.
As part of Akamai's normal proactive customer communication policy, Akamai customers will be kept informed of the latest developments through the Akamai portal, the EdgeControl Management Center, https://control.akamai.com. Any further inquiries may be directed at Akamai Customer Care at 1-877-4-AKATEC.
Speedera
;-)
Many companies don't rely solely on these guys.
Check em out
http://www.speedera.com
Akamai's system will cache anything it is asked to (other comments in this thread link to pages that tell you how to use Akamai to get around censorship of sites, or to cache your own material), so I guess all the DOSers would have to do is to tell each local Akamai box in each rack to cache, say, a few thousand large files each that they do not already cache?
In this way a few thousand bytes of HTTP requests could make the Akamai servers *EACH* attempt to fetch terabytes, or more, of data...
My theory of the apparently "random" DDoS attacks we've seen in the past years is that they are tests of new attack strategies, and possibly demonstrations for potential clients / black-mail victims, etc.
If you can bring down Akamai, you can bring down anyone.
Sig for sale or rent. One previous user. Inquire within.
Comment removed based on user account deletion
It seems weather.com images are not coming through as well. Perhaps this is related?
Unfortunately this would be breaking news on the local TV channels in the US. I swear, if I win the lottery I'm going to start my own news program.
Marxist evolution is just N generations away!
...One could wonder if they have an emergency plan if e.g. some evil hacker was to shut down all their servers. Unless they got remote power management it would be a looong day on the phone with a lot of people.
Akamai's competitors have different scaling tradeoffs. The last time I knew numbers was a couple of years ago, and it may have changed, but Akamai had a very large number of mostly small servers located on many carriers' networks, AT&T had a couple hundred very large servers (mostly at peering points, which takes advantage of being a carrier, though they also bought some transit for content distribution), and Speedera was somewhere in between. AT&T's directions included lots of streaming media, and Akamai was doing fancy database things.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
An isolated issue occurred Monday May 24, 2004 (roughly during the period of 8:00 a.m. - 9:30 a.m. ET), where multiple Akamai customers experienced intermittent performance and availability degradation.
This degradation was the result of a bug within one of Akamai's backend content control management tools, which allows the expiration of content on the Akamai network. The degradation was not a result of any outside interference with Akamai's network (such as Denial of Service or hacking).
Upon identification of the bug, Akamai quickly took corrective action which returned customers to normal service levels. Akamai is currently putting measures in place to return the content management tool to its normal working order and is adding safeguards such that the issue will not occur in the future. In the meantime, Akamai customers are able to serve their content through the Akamai Network normally.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Correct, especially when it's DNS-based load balancing. See this excellent document for a full explanation:
http://www.tenereillo.com/GSLBPageOfShame.htm
Am I the only one who has noticed that when you have a society of geeks who communicate mostly by text, there is a great disparity in the way people choose to pronounce things? At least most of us have settled on a pronunciation of Linux and J. K. Rowling's Hermione.
Why start your own news broadcast when there's Naked News?
Between 8:10 AM EDT and 9:30 AM EDT (GMT-4) on Monday, May 24, 2004, Akamai customers using EdgeSuite and other Akamai Services experienced extensive performance and availability issues.
This incident stemmed from processing an invalid command generated from one of Akamai's backend content control management tools. This tool controls the expiration of content on the Akamai network.
Although there are numerous safety checks designed to engage before commands are sent to Akamai's servers, an invalid command sent out by one of the content control management tools bypassed two key safety checks.
Because Akamai's servers are programmed to restart when presented with an invalid request, once the invalid content control had bypassed these safety checks, Akamai servers continuously halted and restarted in an effort to process all of our customers' pending content management commands.
The problem was immediately detected by Akamai's automated monitoring systems, and Akamai personnel had localized the problem and identified a solution by 8:40AM EDT. The solution was immediately deployed on the network and by 9:30AM EDT, the problem had been completely resolved.
We regret any inconvenience this may have caused you or your users. Please contact your customer care representative if you have any questions.
Live your life each day as if it was your last.
Even if it is BSOD'd I guess it is still "UP".
the images aren't loading, and surprise surprise:
$ host i.imdb.com
i.imdb.com is an alias for i.imdb.com.edgesuite.net.
i.imdb.com.edgesuite.net is an alias for a1969.l.akamai.net.
a1969.l.akamai.net has address 65.200.201.6
a1969.l.akamai.net has address 65.200.201.15
a1969.l.akamai.net has address 65.200.201.29
a1969.l.akamai.net has address 65.200.201.5
akamai. I guess they are still having 'issues'
What does Akamai do for me? Hosts lots of annoying popup ads, shockwave ads, and adware. Most of the adware I find hooked in IE has a codebase from akamai. /enjoyed an ad free day
Summary:
Between 8:10 AM EST and 9:30 AM EST on Monday, May 24, 2004, Akamai customers using EdgeSuite and other Akamai Services experienced extensive performance and availability issues.
This incident stemmed from processing an invalid command generated from one of Akamai's backend content control management tools. This tool controls the expiration of content on the Akamai network.
Although there are numerous safety checks designed to engage before commands are sent to Akamai's servers, an invalid command sent out by one of the content control management tools bypassed two key safety checks.
Because Akamai's servers are programmed to restart when presented with an invalid request, once the invalid content control had bypassed these safety checks, Akamai servers continuously halted and restarted in an effort to process all of our customers' pending content management commands.
The problem was immediately detected by Akamai's automated monitoring systems, and Akamai personnel had localized the problem and identified a solution by 8:40AM EST. The solution was immediately deployed on the network and by 9:30AM EST, the problem had been completely resolved.
We regret any inconvenience this may have caused you or your users. Please contact your customer care representative if you have any questions.
More Details:
The Akamai content control system manages the expiration (purging) of content on Akamai's EdgeSuite, EdgeSuite Secure, and Streaming networks. It consists of a collection of tools that validate and authorize content expiration requests, and then generates a set of commands for the content servers. These commands are distributed to the content servers every few minutes.
These commands go through a number of safety checks before being sent to the content servers. As part of Akamai's design philosophy and software engineering practices, all control messages go through multiple levels of safety checks, including a test on the code in the live system. However, this morning, an invalid command was sent out because of defects in two key safety checks.
The first safety check was bypassed because the command triggered a specialized, rarely-used configuration for managing content. The second safety check, which causes content servers to reject malformed commands in the live system, had a defect introduced into it during the release of EdgeSuite 5.0 in April 2004 that effectively had disabled this check.
Processing the invalid command caused Akamai servers to halt and restart. When Akamai servers restart, they process all pending content management commands to ensure that the content they serve remains consistent. During restart, the servers would thus attempt to reprocess the invalid command, which would cause Akamai's servers to again halt and restart repeatedly.
After the problem was resolved at 9:30 AM EST, the content control management tools were then re-enabled at 12:30 AM EST, with the exception that processing is still disabled for the specialized configuration that bypassed the safety checks. The processing for this specialized configuration will be enabled in the near future. Akamai is also conducting an audit of all safety checks associated with the content control management tools.
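The restart loop described above can be illustrated with a toy simulation in Python. This is only a sketch of the failure mode as the statement describes it, not Akamai's code: the command format, the function names and the restart cap are all hypothetical.

```python
class InvalidCommand(Exception):
    """Stands in for a server halting on a command it cannot process."""


def is_well_formed(cmd):
    # Hypothetical validity rule: a purge command must name a path.
    return cmd.get("action") == "purge" and bool(cmd.get("path"))


def apply_command(cmd):
    if not is_well_formed(cmd):
        raise InvalidCommand(cmd)  # the server "halts" here
    return f"purged {cmd['path']}"


def run_server(pending, validate, max_restarts=5):
    """Replay all pending commands on every start; return (restarts, applied)."""
    restarts = 0
    while True:
        applied = []
        try:
            for cmd in pending:
                if validate and not is_well_formed(cmd):
                    continue  # safety check: reject the malformed command
                applied.append(apply_command(cmd))
            return restarts, applied
        except InvalidCommand:
            # The queue is unchanged, so the next start hits the same command.
            if restarts == max_restarts:
                return restarts, []  # give up; a real server would keep looping
            restarts += 1


if __name__ == "__main__":
    queue = [
        {"action": "purge", "path": "/a.gif"},
        {"action": "???"},  # the invalid command
        {"action": "purge", "path": "/b.gif"},
    ]
    print(run_server(queue, validate=False))  # restarts until the cap
    print(run_server(queue, validate=True))   # bad command rejected, no restart
```

With the server-side check disabled, every start replays the same poisoned queue and halts again; with it enabled, the bad command is dropped and the remaining purges go through on the first pass.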