Amazon Bots Cause Grief For Associate Web Sites

Redirection Limit for this URL exceeded. by Jan0815 · 2002-12-08 22:52 · Score: 4, Interesting

I am not able to view any of the mentioned links. Keeps on redirecting between login and some other page.

Funny to see that someone complaining about abuse links to pages that do not work with Webwasher filtering.

Re:Redirection Limit for this URL exceeded. by Anonymous Coward · 2002-12-08 22:56 · Score: 0

Not working for me either.

could be by hairmare · 2002-12-08 22:54 · Score: 2, Interesting

.. something about not accepting any cookies? cookie filtering is just great ;)

Re:could be by Jan0815 · 2002-12-09 00:09 · Score: 4, Insightful

Hehe. In fact I am filtering cookies, scripts, popups, referrer, webbugs etc.

So I guess I am not very informative about my habits - which I think is my freedom to do. And if a site doesn't work that way, the site owners clearly indicate that they are not willing to accept me as a visitor - which is their freedom.

At least /. works well that way ;-)
Re:could be by Planesdragon · 2002-12-09 04:20 · Score: 1, Troll

So I guess I am not very informative about my habits - which I think is my freedom to do. And if a site doesn't work that way, the site owners clearly indicate that they are not willing to accept me as a visitor - which is their freedom.

Your logic is wrong.

The owners of a website, especially if they are not aware of your habits, are not rejecting ('not accepting') you as a visitor / customer.

At the worst, they're not taking efforts to accommodate your nonstandard way of browsing the web. YOU were the one who chose to apply filters--hence, the active part in the exchange is you, not the website owner.
Re:could be by Jan0815 · 2002-12-09 07:03 · Score: 2

My logic is wrong?

Explain to me, when exactly did cookies become something I MUST enable? When did web bugs become standardized? At which point did it become ridiculous to send no referrer? And exactly when was it agreed that JavaScript is part of the HTML/XHTML/HTTP standard?

Exactly what is non-standard in my way of browsing the web? If you mean unusual I could agree, but non-standard is wrong.
Re:could be by Planesdragon · 2002-12-09 07:14 · Score: 2, Insightful

Exactly what is non-standard in my way of browsing the web? If you mean unusual I could agree, but non-standard is wrong.

You're confusing technical uses of the word with colloquial ones. Web standards have _never_ been the normal case; there has always been some tweak or extension that makes the web useable in ways that a significant proportion of the decision-makers seem to like.

Your adherence to Standards is a non-standard act (an act against the norm, unusual, et al), and as such it is an unforeseen action on your part and not an action on the part of the website owner & their developers to exclude you.

My point wasn't about standards, it was about whose action caused you to be unable to use the non-standard commercial websites. Since non-standard design is the norm for the industry, it's a case of the website failing to take extraordinary action (making their sites standards-compliant) to keep you as a visitor, and not them taking action to deny you as a visitor.

At best, they're ignorant and you're suffering from the consequences of your choice. At worst, they're guilty of not wanting to expend the effort to accommodate you--but that only happens if they have the ability & opportunity to meet your needs. (i.e., only to those websites that you contact with a request for a toned-down main or alternate version of their web page that you can visit.)
Re:could be by Alsee · 2002-12-09 10:04 · Score: 2

a request for a toned-down main or alternate version

LOL. A website that simply works is "toned down"?

I can't tell you how many times I've come across a website that DOES work perfectly, except they ADDED code to redirect you to a basically blank page saying "this site requires Internet Explorer/cookies/javascript/whatever". I have sometimes disabled this and used the site.

My point wasn't about standards, it was about whose action caused you to be unable to use the non-standard commercial websites.

In many cases they take a perfectly good website and disable it with these tests. Having cookies/javascript/whatever off may cause specific features to not work, but you actually have to go out of your way to design the entire site to fail.

If I have cookies off and it doesn't remember my preferences, fine. If I have javascript off and help windows don't work, or the formatting is lousy, fine. But to fail to display ordinary text? That nearly has to be intentional; even gross incompetence is usually partially functional.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.

Boohoo!! by Ratface · 2002-12-08 22:56 · Score: 5, Interesting

Given that many people still boycott Amazon for their stance on software patents, I guess that they won't be shedding many tears.

One could argue something about watching out for who your bed-partners are! Bear in mind that a company that has such a disregard for even their affiliates has to have a pretty poor respect for anyone else out there! Caveat emptor!

--

A little planning goes a long way...

Re:Boohoo!! by scrutty · 2002-12-08 23:22 · Score: 5, Informative

Looked to me like the latest update on the noamazon.com site that you link to was Valentine's Day, 2001. Hardly the most active looking protest.

--
-- Oh Well
Re:Boohoo!! by Abnormal+Coward · 2002-12-09 00:21 · Score: 1

if you click the link at the bottom of the page for www.amazon-books.org it takes you to amazon.com :).

Looks like amazon managed to get that domain ;-p
Re:Boohoo!! by gl4ss · 2002-12-09 02:22 · Score: 2

i dunno how active you have to be to not use some company's services.

ksuicide2k2

--
world was created 5 seconds before this post as it is.
Re:Boohoo!! by 0x0d0a · 2002-12-09 03:47 · Score: 2

I think the point is that the boycott failed -- amazon didn't pay any attention, and most consumers ignored it.

--
May we never see th
Re:Boohoo!! by Anonymous Coward · 2002-12-09 05:51 · Score: 0

Yes, it failed in the sense that it did not force Amazon to change its practices. However, they still aren't getting any of my money, or that of many other people either. Maybe we can't make them change, but we sure don't have to buy from them either.

Re:IMPORTANT ANNOUNCEMENT - PLEASE READ by crashandburn99 · 2002-12-08 22:59 · Score: 0, Offtopic

My God,

The Nigerian 419 scam. Anyone wishing for a bit of humor should check out the recent UserFriendly strips on the subject. http://www.userfriendly.org

crashandburn99

The Alexa archiver -- you can stop that one. by Deal-a-Neil · 2002-12-08 22:59 · Score: 5, Informative

We've noticed quite a few requests for robots.txt by the Alexa archiver. So a suggestion to boot may be throwing this into your root directory of your domain's web site (in a file called robots.txt)

User-agent: ia_archiver
Disallow: /

And if it's really annoying, bloody hell, just do an active firewall block and put the sharks (lawyers) away with those goofy lawsuits before they start wasting our senators' time and taxpayer cash.
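A quick way to sanity-check a rule like the one above before deploying it is to feed it to Python's standard `urllib.robotparser`. This is only a sketch: the helper name `allowed` and the `example.com` URLs are made up for illustration, and it only tells you what a well-behaved crawler would conclude. A crawler that ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the comment above, as it would appear in /robots.txt.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a robots.txt-respecting crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With this rule, `allowed("ia_archiver", "http://example.com/page.html")` is false, while any user agent not named in the file is still allowed.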

Re:The Alexa archiver -- you can stop that one. by hairmare · 2002-12-08 23:07 · Score: 5, Informative

The problem seems to be the amzn_assoc crawler that Alexa uses on Amazon's behalf to find out about broken links.

the bot was scanning through some kind of cgi script thus generating thousands of requests.

At issue is not only the frequency of page retrievals, but also the duration of the crawl. For example, on Nov 26th, amzn_assoc visited one of my sites 13,406 times over a period of 17 hours, consuming approximately 200 Mbytes of bandwidth via calls to MrRat's CGI script.
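To put the quoted figures in perspective, here is the arithmetic as a small sketch. The only assumption is that "200 Mbytes" means 200 x 10**6 bytes; the function name is made up for illustration.

```python
REQUESTS = 13_406        # visits reported for Nov 26th
HOURS = 17               # reported duration of the crawl
BYTES = 200 * 10**6      # "approximately 200 Mbytes", assuming 1 Mbyte = 10**6 bytes

def crawl_stats(requests, hours, total_bytes):
    """Return (requests per minute, seconds between requests, bytes per request)."""
    per_minute = requests / (hours * 60)
    seconds_between = hours * 3600 / requests
    bytes_per_request = total_bytes / requests
    return per_minute, seconds_between, bytes_per_request
```

Calling `crawl_stats(REQUESTS, HOURS, BYTES)` works out to roughly 13 requests per minute, or one CGI call about every 4.6 seconds for 17 hours straight, at around 15 KB per response.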
Re:The Alexa archiver -- you can stop that one. by D+iz+a+n+k+Meister · 2002-12-08 23:30 · Score: 4, Interesting

Seems like Alexa sold Amazon a whole lotta nothing when they agreed to verify the links on AWS sites.

According to one of the posts here:

Again, I don't get how my links can be broken since Amazon is delivering the content.

--

He painted a unicorn in outer space. I'm askin' ya, what's it breathin'?
Re:The Alexa archiver -- you can stop that one. by rapidweather · 2002-12-09 01:10 · Score: 1

13,406 visits? My page counter says I've only gotten 24:-(
What's wrong with a Marilyn Monroe themed amazon.com page? Otherwise, I have no problems with Amazon, but I only grossed $40.00 this year, with a commission of $2.00:-)
------
It's here:
http://www.geocities.com/rapidweather/geo12.html

--
Rapidweather's Linux Screenshots.
Re:The Alexa archiver -- you can stop that one. by execom · 2002-12-09 02:35 · Score: 2, Funny

Anyway, every webmaster uses the W3C Link Checker (http://validator.w3.org/checklink/) to track their broken links. So this bot has no reason to exist.

--
I need a Sino-Logic 16. Sogo-7 data-gloves, a GPL stealth module...

I've been following that problem... by dagg · 2002-12-08 23:05 · Score: 5, Informative

I'm an Amazon associate, and I've been following this problem. Amazon's web-bots are looking for outdated links to books that don't exist, etc. The reasoning is that if the associate fixes the dead-links, then Amazon (and the associate) will presumably make more money.

The problem is that the bots are way too diligent. They go to every single link on every single page, even if the page is dynamically generated. Many sites have an infinite combination of URLs, and as a result, the bots sit on them trying to download every single variation of query. That means that Joe Amazon Associate's web site is hammered with requests and his bandwidth fees go through the roof.

The simple solution would be to just stop Amazon from sucking up the bandwidth via a robots.txt file. But Amazon says that is not allowed. There's the dilemma.

Amazon.com has been silent on this issue for the last several days. My bet is that the bots won't come back without some heavy-duty tuning.

--Your Sex

--
Sex - Find It

Re:I've been following that problem... by oo7tushar · 2002-12-08 23:10 · Score: 3, Informative

most bots have this problem when they're initially made.

Remember when you could boost your ratings on Google by trapping the bots?

--
internet like monkeys'
Re:I've been following that problem... by cottonmouth · 2002-12-08 23:41 · Score: 1, Informative

Strange thing is I am not an Amazon associate and I get those bots.
Re:I've been following that problem... by Life2Short · 2002-12-09 02:48 · Score: 1

Amazon in general seems as nutty as a fruitcake. I regularly get email messages notifying me that shipment of items has been delayed for stuff I've already received! HELLO! Today I'm informed that a DVD is "beckoning" me from my wish list when I already purchased it and watched it 2 weeks ago. HELLO! HELLO! Somebody over there needs to get the collective act together.
Re:I've been following that problem... by StevenMaurer · 2002-12-09 06:36 · Score: 2

And for the Amazon associate who is really smart, that would be a way out of the problem right now.

You just put in an invisible link at the top of your page and assume anything that follows it must be a robot. But instead of trapping it, you switch to make all your normal links go nowhere.

Voilà! No more robot.
Re:I've been following that problem... by Mike1024 · 2002-12-09 07:12 · Score: 4, Insightful

Hey,

Amazon's web-bots are looking for outdated links to books that don't exist, etc.

Wouldn't a better solution be to modify the software at amazon.com, so that every time there was a book not found/out of date error, it logged the referring affiliate and HTTP_REFERER request header?

I can't see why they would need bots and suchlike for such a simple procedure...
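A minimal sketch of that server-side idea, under loudly stated assumptions: the `/item/<id>?tag=<affiliate>` URL shape, the `catalog` set, and the function name are all hypothetical and are not Amazon's real link format or API. The point is only that the retailer's own request log already contains what the crawler is trying to discover.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

def find_broken_referrals(requests, catalog):
    """Group referring pages by affiliate tag for requests whose item is unknown.

    `requests` is an iterable of (request_url, referer_header) pairs, where the
    URL has the hypothetical form /item/<id>?tag=<affiliate>.
    `catalog` is the set of item ids that currently exist.
    """
    broken = defaultdict(set)
    for url, referer in requests:
        parsed = urlparse(url)
        item_id = parsed.path.rsplit("/", 1)[-1]
        if item_id in catalog:
            continue  # link is fine, nothing to report
        tag = parse_qs(parsed.query).get("tag", ["unknown"])[0]
        broken[tag].add(referer or "(no referrer)")
    return dict(broken)
```

One limitation worth noting: a dead link only shows up once a visitor actually clicks it, and visitors who filter the Referer header (as discussed earlier in this thread) would be logged without the referring page.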

Just my $0.02,

Michael

--
"Goodness me, how unlike the FBI to abuse the trust of the American public." -- The Onion
Re:I've been following that problem... by nstrom · 2002-12-09 08:16 · Score: 2

This assumes the bot uses a depth-first search method. If it uses breadth-first search, for example, it will still hit all the normal links on the page before it gets to the "trap" page.
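The difference in visiting order can be seen in a small sketch. The link graph and page names here are invented for illustration; nothing in the thread says which strategy the real crawler uses.

```python
from collections import deque

def crawl_order(links, start, breadth_first):
    """Return the order in which pages are visited from `start`.

    `links` maps a page to the list of pages it links to, in page order.
    """
    frontier = deque([start])
    seen = {start}
    order = []
    while frontier:
        # A queue (popleft) gives breadth-first; a stack (pop) gives depth-first.
        page = frontier.popleft() if breadth_first else frontier.pop()
        order.append(page)
        children = links.get(page, [])
        # For depth-first, push in reverse so the first link on the page is followed first.
        for child in (children if breadth_first else reversed(children)):
            if child not in seen:
                seen.add(child)
                frontier.append(child)
    return order

# Hypothetical site: the invisible "trap" link is the first link on the home page.
SITE = {"home": ["trap", "a", "b"], "trap": ["trap2"], "a": ["a1"]}
```

Depth-first, the crawler goes `home, trap, trap2, ...` and only later comes back to `a` and `b`. Breadth-first, it visits `home, trap, a, b` before going any deeper, so the normal links it already collected from the home page are still requested.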
Re:I've been following that problem... by Allaria · 2002-12-09 09:56 · Score: 1

The problem with that is that both amazon and the referrer lose money on the broken link.

What would be a better idea is that amazon should keep a db of referrers and books they link to. When a book goes out of print, it sends off an auto email to the referrers of the book, letting them know they have a dead link.

--
If a and b in c, and a can create b, and a can create a, and b can create b, and b cannot create a, then a created c.

Just block it? by jeroenb · 2002-12-08 23:06 · Score: 2, Interesting

The Associates Operating Agreement states:
Therefore, you agree that we and our corporate affiliates may take such actions and that you will not seek to block or otherwise interfere with such crawling or monitoring (and that we and our corporate affiliates may use technical means to overcome any methods used on your site to block or interfere with such crawling or monitoring).

As such, it doesn't say that you agree not to block them or that you're violating their license if you do block them. All you agree to is that they can monitor your site, but if you don't like how they do it, it doesn't state that you have to put up with their crawler. The only thing you do agree to is that they can use "technical means to overcome" your blocking. But so what? Let them waste money on attempting to monitor your site by modifying their crawler :) Does anyone believe they'd actually do that? Most likely they'll just leave you alone.

Re:Just block it? by Anonymous Coward · 2002-12-08 23:18 · Score: 1, Informative

Therefore, you agree that we and our corporate affiliates may take such actions and that you will not seek to block or otherwise interfere with such crawling or monitoring (and that we and our corporate affiliates may use technical means to overcome any methods used on your site to block or interfere with such crawling or monitoring).

Actually, doesn't it say that you aren't allowed to block it, but if you do, they can try and get around it?
Re:Just block it? by Anonymous Coward · 2002-12-08 23:40 · Score: 5, Informative

That is so blatantly wrong how can it be modded up to 4?!

It says exactly that you agree not to block them!

"you agree ... and that you will not seek to block or otherwise interfere with such crawling or monitoring"
Re:Just block it? by Anonymous Coward · 2002-12-09 01:33 · Score: 0

Welp, i just took my links off my site since i actively block bots.

Oh well...
Re:Just block it? by DrXym · 2002-12-09 02:13 · Score: 2

Except how can you agree not to block them if you're one of the vast numbers of associates who run their pages from someone else's server?

If I was running a network being clobbered by Amazon I would put up any barriers I felt like, such as dropping their packets, and there is not a damned thing they could say or do to stop me. I'm not an associate and it's too bad for them that they can't see fit to play nice.

All they have to do by TerryAtWork · 2002-12-08 23:18 · Score: 4, Informative

To make this palatable is lower the request rate to something like 1 per minute.

Most robots do something like that.

Of course - it takes a lot longer....

--
It's Christmas everyday with BitTorrent.

Re:All they have to do by valisk · 2002-12-09 00:00 · Score: 5, Interesting

That's why they should set it to request at most 1 page per minute from any one site, but check out many thousands of sites during that one minute.
Robots have been around since the web started and it surprises me that the designers of this robot haven't looked at previous designs and good practice.
If any of you Alexa numbskulls happen to be reading this, perhaps you could buy yourself a copy of HTTP: The Definitive Guide from O'Reilly, which has a tremendously clear explanation of what to think about to prevent your robots from destroying every site they visit that isn't sat on a T3 and Sun Fire w/ 64 CPUs and 64 GB ram.
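The "one page per minute per site, but many sites at once" idea can be sketched as a scheduler that plans fetch times instead of sleeping. This is an illustration only: the function name, the 60-second default, and the example hosts are assumptions, and a real crawler would also need robots.txt handling, retries, and so on.

```python
import heapq

def schedule(urls_by_site, min_gap_seconds=60.0):
    """Plan fetches so that hits on the same site are at least min_gap_seconds apart.

    Returns a list of (time_in_seconds, site, url) tuples in fetch order.
    Different sites are interleaved, so total throughput stays high.
    """
    pending = {site: list(urls) for site, urls in urls_by_site.items()}
    # Heap of (next time this site may be hit, site).
    heap = [(0.0, site) for site in sorted(pending)]
    heapq.heapify(heap)
    plan = []
    while heap:
        when, site = heapq.heappop(heap)
        queue = pending[site]
        if not queue:
            continue
        plan.append((when, site, queue.pop(0)))
        if queue:
            # Re-queue the site, but not before the politeness gap has passed.
            heapq.heappush(heap, (when + min_gap_seconds, site))
    return plan
```

For example, `schedule({"a.example": ["/1", "/2"], "b.example": ["/x"]})` fetches one page from each site at time 0 and the second page from `a.example` at 60 seconds.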

--

Economic Left/Right: -0.62
Social Libertarian/Authoritarian: -3.69
Re:All they have to do by gnurb · 2002-12-09 02:10 · Score: 1

i read that at first as "To make this *patentable*..."

--
hooray! it's a sex wiki

it says... by hairmare · 2002-12-08 23:28 · Score: 2

we ignore /robots.txt and we'll circumvent every action you take to not let us crawl your cgi-bins

kneel down and I'll spank you, associates...?

Just goes to show by zachusaf · 2002-12-08 23:30 · Score: 4, Insightful

Nothing comes for free. Presumably, they are Amazon Affiliates to get a cut off a sold book. You don't get anything for free. Perhaps an opportune time to do the Barnes and Noble thing?

Re:Just goes to show by Anonymous Coward · 2002-12-09 02:20 · Score: 0

Just cos it's got the word "Noble" in the name doesn't mean it actually is..
Re:Just goes to show by Fly · 2002-12-09 10:34 · Score: 3, Insightful

That was the stupidest reply I've seen yet. (Probably because it's rated so highly.) The issue is not that Amazon is requiring something in return from their affiliates, but that they're inadvertently destroying their affiliates with a broken web-spider design.

--
end of line

Read before you sign. by nuggz · 2002-12-08 23:43 · Score: 5, Informative

It looks like these people are signing agreements they didn't read or understand.

They have a few options that I can see.

Terminate the agreement.
Bill for the bandwidth, or sue for damages.
Various technical measures (which are prohibited by the agreement)
Point out to your contacts at Amazon that this is pointless and dumb in such a manner they actually listen.

Make a mini site for the amazon site/bot, but put the rest of your website in a second location (that they don't have access to)

Why deal with a company like this anyway, they're obviously inconsiderate pricks (at least) move on with your life.

Re:Read before you sign. by Anonymous Coward · 2002-12-09 06:40 · Score: 0

You could also set up bandwidth throttling (e.g., mod_throttle) which will still allow them to crawl your site and protect against this class of DoS attacks.

As a DoS defense, and applied to all clients, this should not run afoul of the agreement.
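For readers unfamiliar with throttling, a generic token bucket is one common way to do it. To be clear, this is a textbook sketch and not a description of how mod_throttle itself works; the class name and parameters are made up. Applying the same bucket to every client matches the "applied to all clients" point above.

```python
class TokenBucket:
    """Allow `rate_per_second` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # timestamp of the previous call

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A crawler that respects the limit is slowed down but never locked out, which is the difference between throttling and blocking.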

amazon... by katalyst · 2002-12-08 23:44 · Score: 3, Insightful

Seems to be going the Microsoft way. They seem to be exploiting their monopoly in their sphere of business. Their recent ploy to patent their click n buy commerce system had attracted lots of attention from the people and the OS community. Many Open-letters were exchanged. But people seem to have already forgotten; the average human, understandably, is worried only about factors that affect him, and that too, immediately. Now this new issue....

--
|/________
|\A|ALYS|

Re:amazon... by CaptainPsyko · 2002-12-09 00:35 · Score: 3, Insightful

Market Leader != Monopoly. Yes, Amazon is the king of online shopping sites. But Amazon is far from a monopoly. Amazon faces a good deal of competition in most markets, not only from other websites, but also from Brick & mortar stores. If you think that Amazon isn't competing with the bookstore down at your local mall, think again. Until that local bookstore closes, along with B&N.com, Amazon will have competitors. Amazon is far from a monopoly - just a very successful store.
Re:amazon... by Freshie · 2002-12-09 02:52 · Score: 1

They might not be a monopoly, but the Canadian Postal Office mail delivery trucks have AMAZON.COM written all over them. Government contracts for cheaper shipping sounds a bit monopolistic to me.

--
'I don't want more choices. I just want better things.' - Edina Monsoon
Re:amazon... by CaptainPsyko · 2002-12-09 02:55 · Score: 1

Actually, it sounds to me like the Post Office is competing with UPS & Fed Ex - in order to compete, they have to be competitive. That happens by offering nice contracts for cheaper shipping. The trade off is they get LOTS of shipping.

Why is everyone so quick to cry monopoly? I'm one of the most anticorporateist types I know, but just because a company is big, has large market share, and deals with government agencies (especially ones that compete directly with private industry) does not make it a monopolist.
Re:amazon... by Freshie · 2002-12-09 03:06 · Score: 1

I understand. That's why I say 'not a monopoly'. It just seems that the big corporations are getting bigger, and the little guys trying to scrounge a few bucks on the side are getting screwed by the companies they advertise and raise revenue for. There was a time when the internet was a free open space. There were ideas to be shared, thoughts to be provoked, and money to be made. Now there's big companies, copyrighting mouse clicks, and image dimensions, deciding where you can go to buy what you want. ISP's deciding what you can and can not see, and all the provocative thought in cyberspace has been relegated to 'nerds' at /. and extremetech. It's just a little disheartening that one day the internet will cease to exist as an entity of its own. It will be an affiliate to CNN-Time-Warner-MicroZon.

--
'I don't want more choices. I just want better things.' - Edina Monsoon
Re:amazon... by Cedric+C.+Girouard · 2002-12-09 04:25 · Score: 2

They might not be a monopoly, but the Canadian Postal Office mail delivery trucks have AMAZON.COM written all over them. Government contracts for cheaper shipping sounds a bit monopolistic to me.

Which in turn means cheaper stamps for us to send mail with. I don't see anything wrong with Canada Post selling otherwise useless space on its trucks to Amazon. And the day you start shipping as much as Amazon does, don't worry. Canada Post will cut you a good deal too.

--
Marriage is considered capital punishment for the theft of a goat in some third world countries...

it's not running at the moment by jkcity · 2002-12-08 23:47 · Score: 5, Informative

http://forums.prosperotechnologies.com/n/mb/message.asp?webtag=am-associhelp&msg=2579.1&maxT=3

ok that is a post from the associates board

in which amazon state

"Hello Associates.

Thank you for providing such valuable feedback. The Alexa crawl (id amzn_assoc) has ceased while we investigate the statements made in this post. We plan to address the following concerns:

1. The impact the crawler may have on bandwidth
2. The number of pages the crawler hits per second
3. How the Alexa crawler might identify and ignore AWS pages or links

Points of clarification:

1. Regarding Archive.org, Alexa has confirmed that material that is crawled by the 'amzn_assoc' crawler is not donated to the internet archive. It is used exclusively for the purposes of the Broken Link Reports.

2. The Alexa crawler 'amzn_assoc' differs from the 'ia_archiver' crawler. The 'ia_archiver' can be excluded by using a robots.txt file and will not violate the Amazon.com Associates operating agreement.

You should expect a response from us by COB Friday as it may take a few days to research your concerns. This issue is important to us and we will get it resolved. Thank you for your patience.

The Amazon.com Associates Program"

I participated in that conversation myself though, and I don't think I saw one person who was happy that the agreement meant we had to let them crawl our sites as often as they like.

cj.com reports error links but they do it from the server end; Amazon's system is just stupid and it was only done to try and give their Alexa company some work to do.

so I guess it's just wait and see now till we know if the bot starts back up again.

Re:it's not running at the moment by jkcity · 2002-12-08 23:50 · Score: 1

here is the link from above

that's the link I was supposed to post, kinda messed it up at the top, sorry :).

1984? by httpamphibio.us · 2002-12-08 23:54 · Score: 2, Funny

I haven't read 1984 in a long time, but I don't remember big brother coming from the amazon.

--
sig.

;-) had to be said by Anonymous Coward · 2002-12-09 00:00 · Score: 5, Funny

Interesting stance from the folks who called on the Senate to prosecute those who degrade the technical quality of service at web sites

Whoah! That'd mean Slashdot would have every senate lawyer after it right?

ia_archiver = wayback machine by cstrom · 2002-12-09 00:05 · Score: 5, Informative

As has been noted elsewhere, the affiliate bot ignores robots.txt. Disallowing ia_archiver will have the effect of removing the site from the wayback machine (http://www.archive.org/), which may not be what you want to do.

It is like that old saying... by Anonymous Coward · 2002-12-09 00:14 · Score: 0

I participated in that conversation myself though, and I don't think I saw one person who was happy that the agreement meant we had to let them crawl our sites as often as they like.

"Sleep with the Devil, get hammered hard in the ass".

Ok, so I just made that one up. But it could have been an old saying. It's not as if it is any worse than some clichés people spew out all the time, after all. :)

Maybe by QQ2 · 2002-12-09 00:22 · Score: 3, Funny

Maybe he's one of the boys from brazil.

Amazon Amazon Amazon.... by Anonymous Coward · 2002-12-09 00:26 · Score: 5, Funny

Seems every other link on the 'net is a link to some book on Amazon. All too often I'll follow an innocent looking link and find myself at Amazon yet another time.

Reminds me of that old horror movie where they try to drive away from a haunted house, but every road they take leads them back up the driveway to the place.

Re:Amazon Amazon Amazon.... by Anonymous Coward · 2002-12-09 03:18 · Score: 0

Seems every other link on the 'net is a link to some book on Amazon.
That's oh so true.

Clarification by jeroenb · 2002-12-09 00:34 · Score: 2, Interesting

If you agree not to block or interfere with crawling or monitoring, you're not telling them they can do whatever they want. You agree they can crawl and/or monitor your site, but not doing that in any way *they* want to.

It's OK if they crawl/monitor my site using a bunch of people surfing my site all day long. I won't attempt to block that. Anything else, I might.

Re:Clarification by Izeickl · 2002-12-09 01:34 · Score: 3, Interesting

There is no clause in the contract saying "You will not block our crawler/monitor as long as you deem it ok"; you quite simply agree to let them monitor it with no restrictions. The added clause "but not doing that in any way *they* want to" is your opinion and addition, but not actually within the contract agreed, so unless you get a private agreement or they change it themselves, it's not written that you have to like the way they do it.

--
Laptop Reviews
Re:Clarification by jeroenb · 2002-12-09 01:40 · Score: 2

Unless they state in their contract *how* they're going to crawl/monitor, I do have the right to block whatever I want without violating this as long as I don't prevent them from crawling/monitoring at all. So yeah this is a pretty useless agreement, but it's mostly very stupid instead of restrictive (although everybody seems to believe the latter.)
Re:Clarification by Tony-A · 2002-12-09 02:16 · Score: 2

"(and that we and our corporate affiliates may use technical means to overcome any methods used on your site to block or interfere with such crawling or monitoring)."
Depending on exactly what "technical means" means, this sounds like:
All your sites are belong to us.

I wonder... by Anonymous Coward · 2002-12-09 00:44 · Score: 3, Funny

"... called on the Senate to prosecute those who degrade the technical quality of service at web sites." Would that include the Slashdot effect?

Amazon figures by slashuzer · 2002-12-09 00:57 · Score: 0

Do as I say, not as I do. I am not surprised by this attitude.

Way around Amazon's partner agreement... by pla · 2002-12-09 01:06 · Score: 5, Insightful

Simple 'nuff...

Just temporarily (perhaps 1 day) block ANY client's class C (not just that of Alexa's crawler) that starts generating more than X hits per second for longer than five minutes.

By doing so, you haven't taken steps to specifically thwart *Amazon's* activity, you have simply enacted a reasonable security measure to block DOS attacks. If Amazon actually dared to sue for blocking them, you'd have a HELL of a countersuit on the grounds that their 'bot triggered your DOS alarm.
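A rough sketch of such a client-neutral rule, with the simplifications stated up front: "more than X hits per second for longer than five minutes" is approximated as "more than `max_hits` requests from one /24 inside a sliding window", IPv4 only, and the class and parameter names are invented for illustration. The one-day block length is a parameter.

```python
from collections import defaultdict, deque
import ipaddress

class BurstBlocker:
    """Temporarily block any /24 that exceeds max_hits within window_seconds."""

    def __init__(self, max_hits, window_seconds, block_seconds):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.hits = defaultdict(deque)   # network -> timestamps of recent hits
        self.blocked_until = {}          # network -> time the block expires

    @staticmethod
    def network_of(ip):
        # "Class C" in the comment's sense: the /24 containing the address.
        return ipaddress.ip_network(f"{ip}/24", strict=False)

    def allow(self, ip, now):
        """Return True if a request from `ip` at time `now` (seconds) is served."""
        net = self.network_of(ip)
        if self.blocked_until.get(net, 0.0) > now:
            return False
        window = self.hits[net]
        window.append(now)
        # Drop hits that have fallen out of the sliding window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        if len(window) > self.max_hits:
            self.blocked_until[net] = now + self.block_seconds
            window.clear()
            return False
        return True
```

Nothing in the rule names any particular crawler or user agent; every client is measured the same way, which is the whole point of the argument above. Whether that satisfies the agreement is a legal question this sketch does not answer.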

Personally, I'd just block their bot and if they complain, tell them where they can stick their partner agreement. No self respecting online retailer needs their own "partners" degrading their QOS. Anyway, When I want to buy something, I use either Google, or a product-specific price-search engine (like PriceWatch). Amazon counts as my LAST choice for finding something (actually not quite true... If I need to use Google to find a product for sale, I often check Amazon first, just to get things like UPC or ISBN numbers to narrow my search).

Re:Way around Amazon's partner agreement... by SpikeSpiff · 2002-12-09 04:36 · Score: 2, Insightful

When I want to buy something, I use either Google, or a product-specific price-search engine (like PriceWatch). Amazon counts as my LAST choice for finding something (actually not quite true... If I need to use Google to find a product for sale, I often check Amazon first, just to get things like UPC or ISBN numbers to narrow my search).
This should be called the fundamental (slashdot) attribution error. Assuming that we are representative of the market.
Reminds me of a VC I know. They were sitting in a conference room back in 1998 hearing a pitch from an online bill presentment company. The partner's first objection was that obviously everyone already had online banking and bill payment. To prove it, he asked everyone in the room if they had online banking. Everyone did.
Out in the real world, fewer than 2% of people had online banking.

--
"All that is required for evil to triumph is for good men to do nothing." - Edmund Burke
Re:Way around Amazon's partner agreement... by Anonymous Coward · 2002-12-09 05:13 · Score: 0

> By doing so, you haven't taken steps to specifically thwart *Amazon's* activity, you have simply enacted a reasonably security measure to block DOS attacks. If Amazon actually dared to sue for blocking them, you'd have a HELL of a countersuit on the grounds that their 'bot triggered your DOS alarm.

The partnership agreement doesn't allow for this. Even though it's effectively a DOS attack, a partner still has to allow it.

Amazon lawyer: "So, you had no idea what the name or IP of the Alexa bot was? You had read no newsgroups or web sessions, you had No Idea what was going on, is that what you're telling this Court, Mr Associate?...."

Associate: "Uhhhhhhh....."
Re:Way around Amazon's partner agreement... by Anonymous Coward · 2002-12-09 09:33 · Score: 0

Associate: I knew, I just didn't go to any extra trouble to let their DOS attack through. They matched the same pattern as the script kiddies, so their packets got treated the same way.

There is an easy fix for this!!!! by shawnwe · 2002-12-09 01:35 · Score: 3, Interesting

Instead of crawling websites, why don't Amazon and other companies just require you to have a formatted index of all the links you provide on your website? Could be amazon.xml in the root. And this file could be dynamic or hand-typed...

http://www.yourwebsite.com/amazon.xml http://www.somewebsite.com/~yoursite/amazon.xml
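As a sketch of what consuming such an index might look like: the comment does not define a file format, so the `<links>`/`<link>` elements, the `page` and `item` attributes, and the item ids below are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical amazon.xml: one <link> per associate link, naming the page it
# appears on and the item it points to. Not a real Amazon format.
SAMPLE_INDEX = """\
<links>
  <link page="http://www.yourwebsite.com/books.html" item="ITEM-1"/>
  <link page="http://www.yourwebsite.com/books.html" item="ITEM-2"/>
  <link page="http://www.yourwebsite.com/music.html" item="ITEM-3"/>
</links>
"""

def links_to_check(xml_text):
    """Return (page, item) pairs listed in the index, in document order."""
    root = ET.fromstring(xml_text)
    return [(link.get("page"), link.get("item")) for link in root.iter("link")]
```

The retailer would then fetch one small file per site and validate the listed items on its own side, instead of crawling every page and CGI variation.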

Re:IMPORTANT ANNOUNCEMENT - PLEASE READ by Anonymous Coward · 2002-12-09 01:46 · Score: 0

They'd only read User Friendly if they were seeking a really, really tiny bit of humor. Like perhaps some kind of trace element of humor.

Ah, so that's what ia_archiver is... by Alioth · 2002-12-09 01:58 · Score: 5, Informative

A while back (when I was still using a CobaltRaQ2 - adequate for the job, but not particularly speedy with cgi scripts) I got DoSSed by ia_archiver (yes, cgi-bin is in robots.txt, no I'm not associated with Amazon, but someone else who links to the cgi-script in question probably was). I thought ia_archiver was another Teleport Pro, and just modified the actual script to display a rejection page if it saw ia_archiver in the HTTP_USER_AGENT.

Finally, I know what it is...

It was trying to crawl *every* available url for the CGI script - and it appeared to be buggy because it got itself into an endless loop changing from one mode to the other.

--
Oolite: Elite-like game. For Mac, Linux and Windows
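
The user-agent check Alioth describes can be sketched roughly as follows. This is not the original script; the function names and the choice of a 403 status are assumptions, and only the `ia_archiver` string and the `HTTP_USER_AGENT` variable come from the post.

```python
import os

# Agent substring named in the post above; extend the tuple as needed.
BLOCKED_AGENT_SUBSTRINGS = ("ia_archiver",)


def is_blocked(user_agent):
    """True if the User-Agent header contains a blocked substring."""
    ua = (user_agent or "").lower()
    return any(s in ua for s in BLOCKED_AGENT_SUBSTRINGS)


def respond(environ):
    """Return (status, body) for a CGI-style request environment."""
    if is_blocked(environ.get("HTTP_USER_AGENT")):
        # Cheap rejection page: no expensive script work is done.
        return "403 Forbidden", "Automated crawling of this script is not allowed.\n"
    return "200 OK", "normal script output goes here\n"


if __name__ == "__main__":
    status, body = respond(os.environ)
    print(f"Status: {status}")
    print("Content-Type: text/plain")
    print()
    print(body, end="")
```

The point of doing the check first is that the rejected request costs almost nothing, which matters on slow hardware. It only works while the crawler keeps announcing itself honestly.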

pwned :-) by Deal-a-Neil · 2002-12-09 02:20 · Score: 2

I think that you've just violated your agreement by sharing your revenue information, and your choice of punishment is either an amzn_crawler DoS or a monetary penalty of 20 times your gross revenue generated.

Blocking... by Orne · 2002-12-09 02:23 · Score: 2

Is it time to add Amazon to the /etc/hosts.deny file?

If you're a member company, employing Amazon's services, then in my opinion you should be responsible for providing Amazon with the links you want Amazon to vend, not that Amazon should crawl through your site for your pricing information...

This is a stupid idea!!!! by tomblackwell · 2002-12-09 02:27 · Score: 4, Interesting

There is no guarantee that the "formatted index of all links" is accurate, or up-to-date. Amazon wants to make sure that every single amazon affiliate link meets their criteria.

Your solution would work only for the intelligent and diligent and lucky. There are many Amazon affiliates who are neither.

Ironic by deanj · 2002-12-09 02:32 · Score: 1

You know, I find it really ironic that the page where they explain how they're looking for broken links has a link to alexa.com/associates that's broken.

Goofballs.

Before you disallow ia_archiver... by Lowca · 2002-12-09 02:35 · Score: 1

... take a look at this comment: http://yro.slashdot.org/comments.pl?sid=47296&cid=4843032

--
Utilizing magnetic schemata since

Re:Before you disallow ia_archiver... by Alioth · 2002-12-09 21:26 · Score: 2

I only disallowed ia_archiver from the cgi-scripts - it wasn't impacting on the rest of the site (and since I thought at first it was another program like Teleport Pro, I didn't want to disallow it for the whole site). My robots.txt also instructs crawlers not to go into my cgi-bin directory.

--
Oolite: Elite-like game. For Mac, Linux and Windows
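
For reference, here is a small sketch of how a well-behaved crawler would interpret a robots.txt rule like the one Alioth mentions, using Python's standard `urllib.robotparser`. The robots.txt contents and the example.com URLs are assumed for illustration; the actual file is not shown in the thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: keep every crawler out of the cgi-bin directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("ia_archiver", "http://example.com/cgi-bin/search.cgi"))  # False
print(allowed("ia_archiver", "http://example.com/index.html"))          # True
```

robots.txt is purely advisory, so this only describes what a compliant crawler should do. The complaint in the post is precisely that the script was crawled anyway.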

Alternatives to Amazon! by Flow · 2002-12-09 02:41 · Score: 5, Informative

If you don't like the tactics of Amazon, there are alternatives. One of the best is BookSense.com. Not only do they offer an affiliate/partner program, you'll also be supporting independent bookstores (rather than the chains or Amazon):

http://www.booksense.com/affiliate/

Re:Alternatives to Amazon! by Anonymous Coward · 2002-12-09 03:52 · Score: 0

And Powell's World Of Books...
Re:Alternatives to Amazon! by azaroth42 · 2002-12-09 06:24 · Score: 3, Insightful

But do they have a web services interface? That's the important part.

--Azaroth
Re:Alternatives to Amazon! by Anonymous Coward · 2002-12-09 11:31 · Score: 0

Oh God! Come on, moderators! This is supposed to be funny, not Insightful. It comes from a quote by some MS big wig talking about web services. It went something like 'but in terms of web services, does amazon have a presence? no.'

The post doesn't even make any sense, much less has any insight.

Can someone clarify? by callipygian-showsyst · 2002-12-09 02:44 · Score: 4, Interesting

What's the deal here? It's hard to believe this is malicious--probably just the result of Amazon hiring the cheapest possible kids to do the Perl hacking/crawling. If they had hired more experienced professionals, they might have been able to crawl their affiliated sites better.

Amazon is crawling these sites so that they can be featured on their website. When you search for an item, Amazon lists the prices and availability from the associates--everyone wins.

It seems that Amazon is searching a bit too often--combined with some affiliated sites that have very s-l-o-w dynamic pages, this is causing some problems. It's hardly a crime that Amazon is committing--after all they want the most accurate, up-to-the-minute information on their website.

--
Best Buy can have you arrested

Timing? Christmas sales? by r2ravens · 2002-12-09 02:47 · Score: 5, Insightful

The timing of this problem is interesting. A few years back, we had the problem of the one-click patent and the fact that Amazon used it to disrupt the christmas sales of Barnes and Noble. It seems that the one-click thing became a less pressing problem on December 26. Although I can't remember the specifics of other events, it sticks in my mind that other ploys used to disrupt competitors businesses have been timed to screw with the christmas season.

I know that the people being DOS'ed by Amazon are defined as 'affiliates', but maybe Amazon perceives 'affiliates' in the same way Microsoft perceives 'partners'; people to use and then buy or destroy. How much you wanna bet that this problem goes away after christmas? Of course, the claim will be that it was brought to their attention and it was fixed, but the timing of the whole thing is very suspicious. Perhaps this was the plan all along.

In these days of slim margins in business, maybe Amazon figures the average internet user is smart enough to figure that if their preferred site is slow, they will go directly to Amazon for their purchase and Amazon would be able to avoid reimbursement of their 'affiliate' for the sale.

Has this problem been going on, but been unnoticed for a while, or did it just start? I'm no conspiracy theorist, but the elements seem to be there for this to have been intentional and the timing is very suspicious. Why couldn't they have done this last month, or the month before if they're just checking for outdated links? Am I out in left field with this idea?

Anyway... just a different perspective and some food for thought.

--
War is Peace. Freedom is Slavery. Ignorance is Strength. - George Orwell or George Bush?

Re:Timing? Christmas sales? by Anonymous Coward · 2002-12-09 04:03 · Score: 0

Nice hypothesis, but it doesn't add up. Without the affiliates, Amazon's google ranking would plummet. And without the affiliates clogging the search engines, it would be easier for websurfers to find real competitors like Barnes&Noble, Powells, Booksense.... That has to be worth far more than the commissions paid to affiliates.
Re:Timing? Christmas sales? by ColdGrits · 2002-12-09 04:08 · Score: 2

"I'm no conspiracy theorist, but..."

You just KNOW that when someone uses that line, then you are in for a nice whacky conspiracy theory that doesn't stand up to more than half a second's scrutiny. And you just confirmed that.

Hint - IF Amazon were deliberately DOSing a site (as opposed to simply running a link-checking robot written by a clueless moron as is the case here), THEN the site would be too slow for people to even GET to the Amazon links, and thus they would not think to go to Amazon directly (why would they go to Amazon if they don't know there is something being recommended in the first place)?

"I'm no conspiracy theorist, but it's all a conspiracy, I tell you!"

--
People should not be afraid of their governments - Governments should be afraid of their people.
Re:Timing? Christmas sales? by Anonymous Coward · 2002-12-09 04:35 · Score: 0

Oh, come on now. There's at least a little bit of sense in this theory. How many other online book retailers can you think of off the top of your head? How many can the average browser? Of course there are the web-stores of the book chains, but Amazon certainly has a reputation for being cheaper than them, especially to the average user. It's not an unreasonable assumption that, should an Amazon affiliate be down, the user would go straight to Amazon and try their search there.

At the same time, this seems a bit underhanded even for a large corporation. Maybe some rogue coder thought it would be funny, but I doubt this is a management-approved strategy (although the timing of the patent lawsuits certainly could have been).
Re:Timing? Christmas sales? by idontgno · 2002-12-09 06:24 · Score: 1

Agreed.
Fundamental Rule #1 of Understanding Conspiracy Theories:
Never attribute to malice that which can be adequately explained by stupidity.

--
Welcome to the Panopticon. Used to be a prison, now it's your home.
Re:Timing? Christmas sales? by pjrc · 2002-12-09 07:05 · Score: 3, Insightful

I'm no conspiracy theorist, but the elements seem to be there for this to have been intentional and the timing is very suspicious.
Like the timing of responding publicly quite promptly?
Like the timing where they disabled the 'bot soon after some people posted concerns about it?
If it really were some sinister plot to rob associates of their referral fees (which could be done much more easily by simply making accounting errors, Enron or RIAA style), don't you think they would have remained silent, or at least kept the 'bot running as a lengthy "thorough investigation" proceeded until the 26th?

--
PJRC: Electronic Projects, 8051 Microcontroller Tools

stop whining :) by Lazy+Jones · 2002-12-09 03:03 · Score: 2

Amazon pays so much in affiliate fees that they can have all the bandwidth they like from us ... I've seen much worse crawlers, from german search engines to broken proxies doing 10 hits/second on dynamic pages to stupid windows users who wanted to make our (very dynamic) website available for offline browsing. If you can't take a few 1000 hits/day because your CGIs are so slow, then what is your site doing on the web anyway? ;-)

--
"I love my job, but I hate talking to people like you" (Freddie Mercury)

Powells Offers More by Anonymous Coward · 2002-12-09 03:39 · Score: 1, Interesting

Powells Books offers a better associate program for web sites. Why even deal with Amazon's crap?

Informed View by peterdaly · 2002-12-09 03:58 · Score: 5, Informative

I am an Amazon Associate who has experience with the Alexa Crawler. I believe the crawl is intended to find broken links, or links to products that are no longer stocked.

The Amazon Associates program has been around long enough for "page rot" to kick in, and I am sure there are many sites out there with links to nonexistent products, such as old editions of books, etc. Historically, associates had to build static links (for the most part) by hand, and embedded them in more or less static pages.

The problem comes in due to the recent introduction of their web services, where sites can build essentially unlimited pages based on dynamic real-time queries to Amazon. I don't believe their intent is to "thrash around" in these sites, which is what is occurring.

A few months ago, I asked to have the Alexa bot crawl my site (StarvingMind.net); I was curious about the reports it was able to generate. The bot ended up in endless loops and had to be manually stopped by someone at Alexa. They spent an impressive amount of time trying to identify and fix the problem my site was creating for their bot. I don't know whether my specific problem was ever resolved, but I have the impression the bug was found and fixed. I also have the impression that the bot is very immature code and buggy.

Based on the personal and public responses I have seen from the Amazon and Alexa people involved, they actually do care about these issues very much, and don't wish to cause harm by the bot's use. I believe their goal is to eliminate the link rot that has accumulated on associate sites over the years, many times with the site owner unaware of the problem.

Web services threw a curve into the mix, and that is where the major problems are occurring. The post I am replying to seems to imply Amazon may want to "use then throw out" the associates. I think that is pure speculation without any knowledge of the facts. Amazon has recently gone from what appeared to be no fulltime staff to a team of people dedicated to supporting and running the associates program. I believe they consider it a very cost effective way of advertising, and I expect it is doing quite well for them. Based on their recent actions, I believe they are trying to build a strong long term relationship with the active ones of us, as we bring them a fair amount of business.

Another post has pointed out they have stopped the crawl while the issues talked about here are looked into. They realize they may have made a mistake, and are trying to figure out how to address the problem. They have been responsive (with me at least) resolving problems like this in the past, they deserve a chance to resolve it this time as well. They have started down the right path, by stopping the crawl.

-Pete

--
Soccer Goal Plans

What's with the pro-active solution... by Boss,+Pointy+Haired · 2002-12-09 04:05 · Score: 2

...to what any sensible software engineering team would have built as a re-active solution?

Problem:

Some of our affiliates have out of date links.

Dumb Solution:

Create stupid high bandwidth consuming spider that endlessly crawls affiliate sites looking for out of date links;

or

Sensible Solution:

When an out of date link comes along to the website, display an apology screen to the visitor (whilst not letting up on any other sales opportunity) and email the affiliate telling them to get their site up to date.

Some people just don't fink.

Re:What's with the pro-active solution... by tony_gardner · 2002-12-09 04:25 · Score: 2

In the mean time, you've just lost at least one sale per broken link. Perhaps they don't think that's acceptable?
Re:What's with the pro-active solution... by Cedric+C.+Girouard · 2002-12-09 04:58 · Score: 3, Insightful

In the mean time, you've just lost at least one sale per broken link. Perhaps they don't think that's acceptable?

Amazon seems to be good at recommending items in relation with what you're searching for... Why not just force-feed another one of these "People who searched for this item also enjoyed these (totally unrelated by the way.) items."
That way you potentially save a sale (dont tell me that every single person who clicks on one of those amazon links actually BUYS the product.) and you manage to annoy the reader with some free ads, and potentially screw the associate out of a sale. Everyone wins. (Ok. except perhaps the associate.)

--
Marriage is considered capital punishment for the theft of a goat in some third world countries...
Re:What's with the pro-active solution... by Alsee · 2002-12-09 10:35 · Score: 1

Just a suggestion, maybe add "Montreal" somewhere in your sig? If it's your website maybe enlarge the text "Montreal's rave community" on the home page, possibly in both english and french.

Man, I live in the New York City region, but it's like outer Mongolia for rave info.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Re:What's with the pro-active solution... by Cedric+C.+Girouard · 2002-12-11 12:56 · Score: 1

Just a suggestion, maybe add "Montreal" somewhere in your sig? If it's your website maybe enlarge the text "Montreal's rave community" on the home page, possibly in both english and french. Man, I live in the New York City region, but it's like outer Mongolia for rave info.
Really offtopic here, but since you're not listing an email addy, I cant get back to you directly.

We're not aiming to stay local. We're really aiming to have local people help us cover their local rave scene. I'd really like to be partying in montreal, toronto, and NYC, but there's only one of ol'lil'me, and just so many hours in a week. If you want to take up the NYC section of the site, just drop me a line, and we'll make you some room. Even get on a bus and go party with you guys sometime.

--
Marriage is considered capital punishment for the theft of a goat in some third world countries...

Why the Alexa crawler is great and awful by Anonymous Coward · 2002-12-09 04:20 · Score: 2, Interesting

Alexa's web crawler is great from one perspective and terrible from another.

On the great side their crawler can easily use an entire T3 with just a stock PC driving the requests.

On the terrible side the crawler is stateless - it has NO IDEA OF WHAT IT'S RECENTLY DONE. It doesn't know when it has hit a particular site 1M times in the last hour.

So when they say "it only crawled each site on average every 4 seconds" that is on average. You know, take total urls divided by total time. Doesn't say anything about how hard they hit aaa.com

The problem is that the crawler is designed in the extreme to be efficient. Keeping site stats and blocking GETs is inefficient.

You generate a list of URLs for it to crawl. It blindly crawls this list in order. To prevent aaa.com from getting hit with the first 100k requests (assuming aaa.com has 100k urls in the list) you randomize the list before crawling.

Problem is the randomization isn't perfect, and also any site with a high % of urls in the list is still going to get hammered.

Now I don't know if this is the crawler Alexa used on the associates. But I wouldn't be too surprised.
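
A toy illustration of the shuffling problem described above, assuming a made-up crawl list in which aaa.com owns 900 of 1000 URLs. The helper names and the hostnames other than aaa.com are invented for this sketch; it says nothing about how Alexa's crawler is actually written.

```python
import random
from collections import Counter
from urllib.parse import urlparse


def host_share(urls):
    """Fraction of the crawl list that belongs to each host."""
    counts = Counter(urlparse(u).netloc for u in urls)
    return {host: n / len(urls) for host, n in counts.items()}


def longest_run(urls, host):
    """Longest streak of back-to-back requests to the same host."""
    best = run = 0
    for u in urls:
        run = run + 1 if urlparse(u).netloc == host else 0
        best = max(best, run)
    return best


# Hypothetical crawl list: one big site plus 100 small ones.
urls = [f"http://aaa.com/page{i}" for i in range(900)]
urls += [f"http://site{i}.example/" for i in range(100)]

random.Random(0).shuffle(urls)  # fixed seed so the run is repeatable

print(host_share(urls)["aaa.com"])   # 0.9, shuffling cannot change this
print(longest_run(urls, "aaa.com"))  # still a long streak of consecutive hits
```

With only 100 other URLs to act as spacers, 900 aaa.com requests must contain a streak of at least 9 in a row no matter how the list is ordered. Without per-host state, shuffling spreads the load only when no single site dominates the list.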

Re:Why the Alexa crawler is great and awful by Perspicacious · 2002-12-09 10:44 · Score: 1

Just a note on your comment about "it only crawled each site on average every 4 seconds": they actually address this: "To eliminate any possibility that a particular site will be hit more than once every two seconds, our crawler records the most recent two minutes of IP addresses and calculates the frequency of a particular IP within that two minute time frame. If the IP has been hit above a programmable threshold set at two seconds, the page is sent back to the cache and not crawled until the frequency is greater than two seconds." Last time I checked, I can surf faster than that.
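
One possible reading of the quoted policy, sketched as a per-IP gate. This is a guess at the behaviour, not Amazon's code: the class name, the method name, and the decision to enforce a strict minimum gap (rather than an average over the window) are all assumptions.

```python
from collections import defaultdict, deque


class PolitenessGate:
    """Per-IP throttle: remember recent fetch times, enforce a minimum gap."""

    def __init__(self, window=120.0, min_interval=2.0):
        self.window = window              # seconds of history kept per IP
        self.min_interval = min_interval  # minimum seconds between fetches
        self.history = defaultdict(deque)

    def may_fetch(self, ip, now):
        """Return True and record the fetch, or False to defer the page."""
        hits = self.history[ip]
        # Forget fetches that have fallen out of the two-minute window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if hits and now - hits[-1] < self.min_interval:
            return False  # too soon: send the page back to the queue
        hits.append(now)
        return True


gate = PolitenessGate()
print(gate.may_fetch("192.0.2.10", 0.0))  # True
print(gate.may_fetch("192.0.2.10", 1.0))  # False, only 1 s since last fetch
print(gate.may_fetch("192.0.2.10", 2.0))  # True
```

If the real crawler instead compares an average rate over the two-minute window against the threshold, short bursts could still get through, which is the concern raised in the reply below.
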
Re:Why the Alexa crawler is great and awful by Anonymous Coward · 2002-12-09 19:45 · Score: 0

By this measure, they could hammer a site for two minutes back off for two and then hammer it again.

Don't laugh. It has happened to their crawler. Especially where a site has an infinite number of URLs due to session ids and becomes a significant fraction of the crawler url list.

OK from here by Animats · 2002-12-09 04:26 · Score: 2

Even though I signed up for the Amazon.com associates program for Downside, I'm not seeing any hits from strange user agents. That's a relief, because I have hundreds of links which change daily. I don't need some 'bot trying to download the entire database of financial statements for every US public company.

Looking at user agents, the browser war is over. IE is #1, and Netscape often isn't even in the top 10; various indexer 'bots generate more traffic than Netscape.

Re:OK from here by Anonymous Coward · 2002-12-09 05:22 · Score: 1, Informative

> Looking at user agents, the browser war is over. IE is #1, and Netscape often isn't even in the top 10; various indexer 'bots generate more traffic than Netscape.

Looking at user agents is incredibly foolish since most browsers' agent strings default to IE and most users don't change that default.

Amazon Interview Experience by Anonymous Coward · 2002-12-09 04:31 · Score: 0

Amazon's hiring practices are questionable at best. I interviewed with them four years ago for a coding job. The interview consisted of a woman in a room asking me if I had any questions. I had a bunch, which she attempted to answer ... but after 5 or 6 minutes, I realized this was all there was to the interview, and that the company had either sent a horrible interviewer, or that this wasn't the place for me.

The one question she did ask was whether I was familiar with Windows. I wasn't sure in what context she was speaking, so I told her I'd done some programming in it, but I was more familiar with programming in UNIX. A frown crossed her face, and she said "I guess you wouldn't know why I can't get my email then". The woman had a dial-up laptop card that wasn't plugged in to the wall.

She promised to follow up with me. Never did. I wasn't upset. Perhaps it was one bad interviewer, but if somebody gave her a job, I'd hate to know what else they have working for them.

Re:Amazon Interview Experience by callipygian-showsyst · 2002-12-09 05:10 · Score: 1

...to go off on a (slight) tangent.
First of all, I do think this is relevant. If a company has poor quality software, then they're probably hiring badly. And Amazon's over-aggressive spider, that gets caught in infinite loops on CGI-BIN directories--certainly is poor quality.
Just last month, I interviewed for two jobs (and got two offers!). I declined both opportunities, in part because neither asked me any technical questions during the interview process! A complete fraud who interviews well could have had the jobs (and one didn't even check references!). I wouldn't want to work for a company that doesn't interview well.

--
Best Buy can have you arrested

Breakin' the law, Breakin' the law! by Anonymous Coward · 2002-12-09 04:38 · Score: 0

From the Amazon User Agreement:
"You are granted a limited, revocable, and nonexclusive right to create a hyperlink to the home page of Amazon.com so long as the link does not portray Amazon.com, its affiliates, or their products or services in a false, misleading, derogatory, or otherwise offensive matter."

So if I write Amazon sucks, I'm no longer allowed to visit or buy stuff from Amazon. Oops, darn.

21 Dog years by Anonymous Coward · 2002-12-09 04:59 · Score: 1, Interesting

This is slightly offtopic, but if you are in the NY area, I highly recommend you see the play "21 Dog Years: Doing Time@Amazon.com" about a guy who went from customer service to bizdev to resignation. It's based on this book; and yes it is very funny that Amazon carries it. They profit from their own critics.

Re:Amazon sucks. by Cedric+C.+Girouard · 2002-12-09 05:02 · Score: 3, Informative

My wife used to work for Amazon. She was attacked by a coworker and forced to quit because the management would not do anything about it. She had to visit the doctor for months after the attack that gave her whiplash and nerve damage. In my mind, Amazon is a very bad company and should be punished.

That sounds weird... Isnt the US "Land of the lawsuit" ? I've read about people suing companies for sexual harrassment, and winning. Now you get physical damage, assault and whatnot, and she has to quit ? Wouldnt one of those late-nite 1-800-SUE-ME lawyers take this case ? Seems pretty much open and shut to me.

--

Marriage is considered capital punishment for the theft of a goat in some third world countries...

from Amazon's submission to Congress... by BonThomme · 2002-12-09 05:02 · Score: 2, Interesting

"Absent from our suggested federal response is a role for the Federal Communications Commission. The reason is straightforward: the distributed denial of service attacks involve coordinated and criminal transmission of content over the Internet. It is hard to see how the FCC has statutory authority over such matters. Yet even if it had, or were given, such authority, the agency currently lacks the resources and expertise to do what is necessary at this point, namely, to fight the criminal activity. Simply put, useful FCC involvement would require statutory changes, additional resources, and additional expertise to succeed. This is work better left to law enforcement agencies."

Okay, note the line "...distributed denial of service attacks involve coordinated and criminal transmission of content over the Internet"

Criminal transmission of content? WTFF??

Note also how it goes on to say the FCC shouldn't get involved since "FCC involvement would require statutory changes..." In other words, let's not waste time with all this analysis and law-making business and just get straight to the enforcement of what we want.

Re:from Amazon's submission to Congress... by Anonymous Coward · 2002-12-10 04:23 · Score: 0

Okay, note the line "...distributed denial of service attacks involve coordinated and criminal transmission of content over the Internet" Criminal transmission of content? WTFF??

Seems pretty straightforward to me. Packets are information (content). Denial of Service attacks are a transmission of this content. Denial of Service attacks are also illegal.

Therefore, Denial of Service attacks are an illegal transmission of content.

The quote you give seems pretty clearly to me to be saying: DoS attacks are already illegal, and the law enforcement personnel are doing a perfectly good job of keeping up with the problem; the law already covers this perfectly well, we don't need any new laws; don't fix what isn't broken, especially when there will be weird side effects (setting a precedent of the FCC regulating the internet based on what is done there, moving the enforcement of anti-DoS investigations to an inexperienced agency). That seems reasonable to me.

evil amzon by Anonymous Coward · 2002-12-09 06:08 · Score: 0

all your bots are belong to us...

Re:Read before you sign (and before you post) by pjrc · 2002-12-09 06:50 · Score: 2, Informative

They have a few options that I can see.

Terminate the agreement.
Bill for the bandwidth, or sue for damages.
Various technical measures (which are prohibited by the agreement)
Point out to your contacts at Amazon that this is pointless and dumb in such a manner they actually listen.

Here's an idea.... How about politely posting a question or two about it in the appropriate forums? Who knows, something crazy might happen, like responsible people at Amazon might respond and turn the bot off while they investigate. Then, they might post a reasonable explanation and take reasonable steps to make sure they're not abusing associates' servers.

Here's another idea.... Try reading the pages that slashdot linked to. I know that's a lot of work, so I'll save you a bit of effort by posting each slashdot link, and a brief summary of what you would have found had you bothered to click on it and ACTUALLY READ it (before posting here with a subject advocating actually reading the terms and conditions).

Amazon Associates and Web Services developers are crying foul over the hammering they're taking - Alan Richmond comments that the bot made 13406 hits in 17 hours on November 26, transferring a total of 200 megs. Many posts precede this, and several follow it. It's all pretty level headed discussion. Many people seem to feel the bot is not designed that well and ought to be improved, but very little of it amounts to "crying foul". Even Alan says he wants an explanation. Nobody is terminating their agreement, attempting to recoup significant losses, threatening to sue, advocating blocking (other than discussion of robots.txt). People in the forum are expressing their concerns "in such a manner they actually listen", which happens to be a polite, level-headed manner... which you would know of had you actually read the forum, rather than blindly posting here that the associates should read the terms and conditions before they "sign".
Amazon fessed up - Amazon explains what they're doing, and why, and the steps they've taken to avoid abusing servers. They claim they've designed the bot to avoid accessing any server more than once every two seconds (Alan's example is 13406 hits in 17 hours, or one hit every 4.56 seconds, on average)
Amazon acknowledged problems exist - They actually say they're investigating, and while they're investigating their bot's impact, they've taken it off-line. They also answer the question that appears frequently in the forum... the purpose of "ia_archiver" vs "amzn_assoc". It's not clear what they'll actually do, but they obviously are trying to respond to people's legitimate concerns
but points to recent Operating Agreement changes - Yes, while Amazon appears to be taking the matter seriously, they also are making it clear that they expect to be able to verify the accuracy of links from associates. They explain the purpose in the agreement (and it's really not that unreasonable, is it?)
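
The averages quoted in these summaries can be checked with a few lines of arithmetic. The function name is made up for this sketch; the hit count, duration, and two-second floor come from the posts.

```python
def average_interval_seconds(hits, hours):
    """Mean gap between requests, in seconds, over the whole period."""
    return hours * 3600 / hits


interval = average_interval_seconds(13406, 17)
print(round(interval, 2))      # 4.57 seconds between hits on average
print(round(1 / interval, 3))  # 0.219 hits per second on average

# Amazon's stated floor is one hit every 2 seconds, i.e. 0.5 hits/second.
print(interval >= 2.0)         # True: the average stays within that limit
```

This is close to the 4.56 seconds quoted above. Keep in mind it is only an average over 17 hours, so it says nothing about how bursty the traffic was within that period.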

This just isn't that sensational of a story. Yet another 'bot that needs some refinement, but it IS designed to avoid more than one hit every 2 seconds (and the evidence posted seems to be consistent with that). They at least did respond to people's concerns and they took the bot off-line while they investigated it. Sounds pretty reasonable. It's not clear what might actually be done, and some of it appears that Amazon is claiming the problem isn't so great... but clearly they are attempting to respond to people's concerns.

Amazon feels they have a right to check the links on associate sites, and they put it in the terms. Again, it's really not that unreasonable.

What is unreasonable is the inflammatory summary appearing on the main slashdot page. Yes, timothy and other slashdot "editors" can claim it's all just editorial from "theodp" who submitted the summary. But what kind of editing is that?

The summary concludes with:

... Amazon and any of its corporate affiliates the right to do so, but also to use unstated technical means to overcome any methods that are used to try to block or interfere with such crawling or monitoring. Interesting stance from the folks who called on the Senate to prosecute those who degrade the technical quality of service at web sites.

The link is to Amazon's position on DDOS attacks... there's really no similarity to a well-intentioned 'bot, which clearly identifies itself, limits itself to 0.5 Hz access rate, AND was responsibly taken off-line and reexamined when some people complained that it used too much bandwidth.

--
PJRC: Electronic Projects, 8051 Microcontroller Tools

So why are they crawling me? by iggymanz · 2002-12-09 06:57 · Score: 2

Alexa is all over my web logs every day....I don't even link to amazon (or any other commercial site, just some basic open source ones...apache, openbsd, sourceforge, etc)

Soon I might just block them....but I would like to know how I got on their list of sites to crawl to excess.

Instead of the spidering... by Xenographic · 2002-12-09 06:57 · Score: 2, Interesting

... why don't they just collect the 404s off the requests to their site? No need for spiders; if someone puts up a bad link, they can find out as soon as someone clicks on it. *sheesh*
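
A rough sketch of that idea: tally 404 responses by referring page from a server access log. The log lines are invented examples in the common "combined" log layout, and the regular expression and function name are assumptions for illustration only.

```python
import re
from collections import Counter

# Matches the request, status, size and referrer fields of a combined-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def broken_links_by_referrer(log_lines):
    """Count 404s per (referring page, requested path) pair."""
    tally = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if not m or m.group("status") != "404":
            continue
        if m.group("referer") in ("", "-"):
            continue  # no referrer sent, so the source page is unknown
        tally[(m.group("referer"), m.group("path"))] += 1
    return tally


# Made-up sample log lines.
sample = [
    '192.0.2.1 - - [09/Dec/2002:06:57:00 +0000] "GET /old-book HTTP/1.0" 404 512 "http://affiliate.example/books.html" "Mozilla/4.0"',
    '192.0.2.2 - - [09/Dec/2002:06:58:10 +0000] "GET /old-book HTTP/1.0" 404 512 "http://affiliate.example/books.html" "Mozilla/4.0"',
    '192.0.2.3 - - [09/Dec/2002:06:59:30 +0000] "GET /new-book HTTP/1.0" 200 2048 "http://affiliate.example/books.html" "Mozilla/4.0"',
    '192.0.2.4 - - [09/Dec/2002:07:00:05 +0000] "GET /gone HTTP/1.0" 404 512 "-" "Mozilla/4.0"',
]

for (referer, path), count in broken_links_by_referrer(sample).items():
    print(count, referer, "->", path)
```

The catch is that this relies on visitors sending a Referer header, and a broken link is only discovered after at least one visitor has hit it.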

Re:Instead of the spidering... by Anonymous Coward · 2002-12-09 12:10 · Score: 0

This would be a bit like Microsoft's new practice of having their software report back to them each time it crashes on a user's computer. Probably better than no reports at all but I'd prefer that they find their own problems before I run into them myself.

premature deployment by Anonymous Coward · 2002-12-09 07:28 · Score: 1, Insightful

Never ascribe to intent what may be accounted for by simply rolling out premature code that has been subjected to very little test. Amazon has a bias toward making schedules at the expense of testing.

Of course its timing by Anonymous Coward · 2002-12-09 07:34 · Score: 0

Amazon's software schedules follow the same seasonal cycles as the rest of the company. It is likely that this happened simply because the software was getting rushed out into production just in time for the holiday rush (so the engineers working on it can be assigned to holiday duty somewhere else).

Re:Amazon sucks. by Anonymous Coward · 2002-12-09 07:43 · Score: 0

Too bad you were too much of a pussy to do anything about it, yourself.

Too busy jerking off to the latest Michael Jackson photos, eh?

Better Thread by Chetmurray · 2002-12-09 07:47 · Score: 2, Insightful

Here amazon admits the issue and how they have stopped the bot until they can investigate the issue.

Amazon is actually very affiliate friendly. They have banned the scumware like wurldmedia, ebates and others that try and hijack affiliate commissions. Unlike affiliate programs by overstock.com, buy.com and others that are so desperate for short term cash they will screw over their current affiliates for some quick cash.

Considering buy.com is so deep in with the scumware people, i am surprised slashdot.org advertises them.

I did read by nuggz · 2002-12-09 07:56 · Score: 2

I did read the links.

Amazon released a bot that negatively affected the affiliate websites.

This is at the very least inconsiderate.

I posted my opinion on how this or similar activities COULD be handled.

You seem quite defensive about it. Were you the one who wrote a buggy bot?

Google hits me more than 1/sec by Anonymous Coward · 2002-12-09 08:11 · Score: 0

Funny though they can do no wrong

Re:Amazon sucks. by rawg · 2002-12-09 10:30 · Score: 1

It's a matter of proof. The only people who saw it happen are on the same side as Amazon because they do not want to lose their jobs or get in trouble. None of the lawyers we have contacted will take the case. It's really retarded. We tried for 6 months to do something, and now we have given up: a) we do not have the money to fight them; b) no free lawyer will go against Amazon. We are suing the attacker, though. I didn't want her to work there anyway.

--
The above is not worth reading.

Re:Amazon sucks. by rawg · 2002-12-09 10:35 · Score: 1

Yeah, I'm not really into picking fights with women. Especially 7' 300 pound black women from the desert. She is being sued though.

--
The above is not worth reading.

Ditto by Anonymous Coward · 2002-12-09 11:24 · Score: 0

I too have steadfastly not used Amazon, and I find the noamazon.com site quite useful.

No they don't by Anonymous Coward · 2002-12-09 12:06 · Score: 0

Oh yes they do. by Anonymous Coward · 2002-12-09 13:29 · Score: 0

I didn't know you read my http logs. Not only do they hit my site that often, they also hit my office mate's homepage that often.

28,000 hits from Googlebot in 1 hour! by Anonymous Coward · 2002-12-09 13:44 · Score: 0

That's right: 28,000 hits to a single DNS entry in one hour, which is about 7.78 hits/second.

Disclaimer! by Mullen · 2002-12-09 14:47 · Score: 2

I used to work for Amazon.com as a Unix admin, and I can tell you Amazon and Alexa are barely related. They are two different companies; it's just that one owns the other. Barely anything between them is integrated at the computer system level. The main offices for Amazon.com are in Seattle and the Alexa offices are in S.F., Ca.
If someone is making a mistake at Alexa, Amazon.com cannot really be held responsible.

--
Linux O Muerte!

Slashdot Mirror

136 comments