Microsoft Bots Effectively DDoSing Perl CPAN Testers
at_slashdot writes "The Perl CPAN Testers have been suffering issues accessing their sites, databases and mirrors. According to a posting on the CPAN Testers' blog, the CPAN Testers' server has been aggressively scanned by '20-30 bots every few seconds' in what they call 'a dedicated denial of service attack'; these bots 'completely ignore the rules specified in robots.txt.'"
From the Heise story linked above: "The bots were identified by their IP addresses, including 65.55.207.x, 65.55.107.x and 65.55.106.x, as coming from Microsoft."
I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?
This is my sig.
Are we sure this traffic comes from Microsoft? Could it not consist of forged network packets? You don't need a reply if you are running a DDOS. On the other hand, why would anyone, including Microsoft, want to bring down CPAN?
Nae king! Nae laird! Nae yurrupiean pressedent! We willna be fooled again!
For ignoring robots.txt, they don't deserve any more nor less.
You do not have a moral or legal right to do absolutely anything you want.
Why make things worse? Block the ip address or range and notify the admins. This isn't a chan mob.
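For anyone who wants to do the blocking at the application level, here is a rough sketch of the range check. It assumes each "x" in the addresses quoted from the Heise story stands for a whole /24, which the story doesn't actually confirm, and the function name is made up for illustration.

```python
import ipaddress

# Ranges quoted in the Heise story. Reading each "x" as a full /24
# is an assumption, not something the story states.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("65.55.207.0/24"),
    ipaddress.ip_network("65.55.107.0/24"),
    ipaddress.ip_network("65.55.106.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any of the blocked ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)
```

In practice you'd probably do this in the firewall or web server config rather than in application code, but the membership test is the same idea.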
"as we need additional information to be able to track down the problem."
IP addresses aren't enough? You're MS--if you can't fix the problem and IP addresses are given, damn, that's just sad. You're freaking massive multi-billion dollar tech companies, and this is the best you can do?
No wonder Chinese hackers own our asses.
Then again, it took Comcast 9 months to fix a security hole in customer accounts (which would have required adding an s to http to make pages SSL'd), and the only reason it was "fixed" was because they did their annual website makeover and changed their entire system to something Flash based. Then again, I had contacted a VP, VP's security, referred to web security, and talked to web security 3x, talked to a manager. The last 3 groups verified the problem. It was referred to their web applications team by that point, who sat on it.
Lovely world we live in.
Actually, your statement works better with 'INSERT LANG HERE'...
I'm always surprised by how people seem to think that any language has a monopoly of some sort on sloppy and/or lazy coders. Been doing IT a long time, and the one thing that never changes is the sloppy/lazy code issue. It even predates programming, you know - look at infrastructure around the world for examples of "just toss something out there, hope it works".
An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
A quick guess? Identifying unique sites by domain name, rather than by IP address, and either the bot or server not respecting HTTP 301 redirects.
With Rosetta Code, I once had www.rosettacode.org serving up the same content as rosettacode.org. My server got pounded by two bots from Yahoo. I could set Crawl-Delay, but it was only partially effective; one bot had been assigned to www.rosettacode.org, while another to rosettacode.org, and they were each keeping track of their request delay independently. I've since corrected things such that www.rosettacode.org returns an HTTP 301 redirect to rosettacode.org, and was eventually able to remove the Crawl-Delay entirely.
I've since worked towards only serving up content for any particular part of the site on a single domain name, and have subdomains such as "wiki.rosettacode.org" redirect to "rosettacode.org/wiki", and "blog.rosettacode.org" to "rosettacode.org/blog". Works rather nicely, though it does leave me a bit more open to cookie theft attacks.
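Roughly, the redirect logic described above could be sketched like this. It is not the actual server config; the function name and the exact way subdomain paths get mapped onto the canonical host are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "rosettacode.org"

# Alias host -> path prefix on the canonical host.
# The prefix mapping is an assumption based on the description above.
ALIAS_PREFIXES = {
    "www.rosettacode.org": "",
    "wiki.rosettacode.org": "/wiki",
    "blog.rosettacode.org": "/blog",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to send in a 301 Location header, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_PREFIXES:
        return None  # already canonical, or not one of our hosts
    path = ALIAS_PREFIXES[host] + parts.path
    # Keep the query string so old deep links keep working after the redirect.
    return urlunsplit((parts.scheme, CANONICAL_HOST, path or "/", parts.query, parts.fragment))
```

With every alias collapsing onto one host, a crawler that tracks its delay per domain name only has one name left to track.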
YMMV; As I said, that was a quick guess.
tasks(723) drafts(105) languages(484) examples(29106)
They admitted they were powerless to solve their own problems without help from their victims.
Heh. It's another "damned if you do; damned if you don't" scenario. Usually, people criticise Microsoft for developing software without bothering to consult or test with actual customers. Now we have a manager of a MS dev group that actually does communicate (though not exactly with "customers"), and acts on what they say, so he's criticised for needing help from his "victims".
Ya can't win that game.
But the fact is that if you're developing server-side web software, you need to test it against real-world sites, not just the toy sites you've set up in your lab. And we all know the "Sorcerer's Apprentice" sort of bug that produces a runaway test that tries to do something as many times as it can per second until it's killed. Good testers will be on the lookout for such events, but it's understandable that they might fail occasionally.
Among web developers, MS does have a bit of a reputation for hitting your new site with a flood of requests, trying to extract everything that you have (even the content of your "tmp" directory which your robots.txt file says to ignore). There are lots of small sites that block MS address ranges for just this reason.
It should be considered good news that there's at least one MS manager who understands all this, and is willing to talk to the "victims" and fix the problems. Now if only they could fix the next-level problem: this sort of thing happens repeatedly, and their corporate culture seems to have no way to prevent it from happening again.
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
REAL solutions to immediate problems don't depend on the rest of the world changing to suit my needs. Also, the fact remains that there are links out there that point to "http://www.rosettacode.org/w/index.php?something_or_other", not all of those links will (or can) change, and I would be an absolute fool to knowingly break them, if I want people to visit RCo via referral traffic.
tasks(723) drafts(105) languages(484) examples(29106)
There is a spec for robots.txt. If someone's not following it, then it's their fault. Given Microsoft's past history, I know where I'd point the finger absent any more concrete information.
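For reference, honoring robots.txt is not much work for a crawler. A minimal sketch using Python's standard urllib.robotparser follows; the robots.txt content, the bot name "examplebot" and the example.org URLs are all made up for illustration, not taken from the CPAN Testers site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /tmp/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler checks every URL before requesting it...
allowed_tmp = parser.can_fetch("examplebot", "http://example.org/tmp/scratch.txt")
allowed_index = parser.can_fetch("examplebot", "http://example.org/index.html")
# ...and waits this many seconds between requests (None if unspecified).
delay = parser.crawl_delay("examplebot")
```

Note that Crawl-delay is a widely used extension rather than part of the original robots.txt convention, so not every crawler honors it.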
"Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
As said below, never ascribe to malice that which can be adequately explained by stupidity.
Must be really easy to just beat you in the face, and say “Ooops, I’m sorry, I’m so st00pid! *drool*” I call bullshit on that rule.
My rule: Don’t make judgements at all (either way), about things that you just don’t know.
How about: Don't mistake organizational stupidity for individual stupidity. This isn't the case of a single bad coder making a mistake, this is an organization that's chosen how much effort to apply. How much testing and review? What failsafes, logging and active monitoring? Will options for feedback be accessible and responsive? Stupidity and Malice aren't mutually exclusive for an individual, and certainly not for an organization.
tomorrow who's gonna fuss
I've never liked that saying because of the implication that malice and stupidity are exclusive.
Dumb and mean are often found together.
The enemies of Democracy are