Slashdot Mirror


Webmasters Pounce On Wiki Sandboxes

Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."

324 comments

  1. Why just wikis? by GillBates0 · · Score: 4, Insightful

    Why not normal discussion boards and blogs? We, for one, saw how the SCO joke (litigious b'turds) managed to GoogleBomb SCO in first place without a problem.

    --
    An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
    1. Re:Why just wikis? by caino59 · · Score: 5, Funny

      We, for one, saw how the SCO joke (litigious b'turds) managed to GoogleBomb SCO in first place without a problem.

      You forgot the link: Litigious Bastards

    2. Re:Why just wikis? by abscondment · · Score: 3, Interesting

      posting on Wikis doesn't screw up your own blog.

      posts on message boards will be deleted quickly, unless the board is expressly google bombing (as in the current Nigritude Ultramarine 1st placer) / people are stupid

      i think the idea is that wikis make it easier in general for your post to stay up and not affect your blog.

    3. Re:Why just wikis? by nautical9 · · Score: 4, Interesting
      I host my own little phpBB boards for friends and family, but it is open to the world. Recently I've noticed spammers registering users for the sole purpose of being included in the "member list", with a corresponding link back to whatever site they wish to promote. They'll never actually post anything, but they've obviously automated the sign-up procedure as I get a new member every day or so, and google will eventually find the member list link.

      And of course there are still sites that list EVERY referer in their logs somewhere on their site, so spammers have been adding their site URLs to their bot's user agent string. It's amazing the lengths these people will go to spam google.

      Sure hope they can find a nice, elegant solution to this.

    4. Re:Why just wikis? by Anonymous Coward · · Score: 5, Funny

      Why not normal discussion boards and blogs?

      As an employee of JBOSS, I'm shocked and appalled at your suggestion. Fortunately, JBOSS is working on a new JBOSS solution to overcome this problem using JBOSS. We at JBOSS are passionate that our JBOSS technology will prevent even non- JBOSS users from taking advantage of boards this way.

      Frank Lee Awnist
      JBOSS Employee
      JBOSS Inc.

      JBOSS JBOSS JBOSS

    5. Re:Why just wikis? by ichimunki · · Score: 5, Informative

      The real problem with Wikis is that the link will remain there, even after it has been removed from the current page, because most Wikis have a revision history feature. So what's needed is careful set up in the robots.txt file and other HTML clues for the web crawlers to exclude anything but the most current version of a page (and to skip over the other 'action' pages, like edits, etc).

      My wiki got hit by this stupid link, but not in the sandbox. Of course, recovering the previous version of the page is easy... it's wiping out any trace of the lameness that gets trickier. I suppose the easiest way to defeat this would be to require simple registration in order to edit Wiki pages.

      What else can we do? Alter the names of the submit buttons and some of the other key strings involved in Editing?

      --
      I do not have a signature
    6. Re:Why just wikis? by Andy+Mitchell · · Score: 3, Insightful

      I'm not sure this will make you feel better, but this strategy has a limited lifetime.

      The contribution of your page to another page's page rank depends on two factors: first, the page rank of your page, and second, the number of links coming from your page.

      As more people take up this tactic, the return everyone gets from it gets smaller. E.g., when there are hundreds of links on that page, they cease to have any real value. Eventually people should give up on this one.
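The dilution argument above can be illustrated with the simplified PageRank model from the published paper, where a page splits its rank evenly across its outbound links. This is only a sketch: the damping factor of 0.85 is the commonly cited value, the page rank of 5.0 is a made-up number, and Google's production ranking is not public.

```python
def link_contribution(page_rank: float, outbound_links: int, damping: float = 0.85) -> float:
    """Rank passed along a single link under the simplified PageRank model."""
    if outbound_links < 1:
        raise ValueError("page must have at least one outbound link")
    # The linking page's rank is divided evenly among all of its outbound links.
    return damping * page_rank / outbound_links


if __name__ == "__main__":
    # Hypothetical sandbox page with a rank of 5.0 and a growing pile of spam links.
    for links in (1, 10, 100, 1000):
        print(links, link_contribution(5.0, links))
```

Each extra link on the sandbox page shrinks the share every other link receives, which is why the tactic gets less rewarding as more spammers pile in.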

    7. Re:Why just wikis? by Pieroxy · · Score: 4, Funny

      You forgot the link: JBOSS.

    8. Re:Why just wikis? by Andy+Mitchell · · Score: 1

      You missed out on: 1) The opinions expressed in this post are not necessarily those of.... 2) This message is (c) of....

    9. Re:Why just wikis? by athakur999 · · Score: 1

      Most wiki sandboxes will let you modify them without any sort of registration at all, so it's much more time effective than signing up for a bunch of discussion boards, waiting for the validation emails, etc. They also probably have a higher average page rank than most discussion boards and blogs would, so a little goes a long way.

      --
      "People that quote themselves in their signatures bother me" - athakur999
    10. Re:Why just wikis? by Anonymous Coward · · Score: 2, Informative

      Just set your robots.txt to exclude the user list. Or if you don't have many friends and family, send yourself an 'approve member' email. Then start training your spam filter on fake accounts.

    11. Re:Why just wikis? by clarkcox3 · · Score: 5, Funny

      That's just irresponsible. By putting that link there (the one that says Litigious Bastards), you're contributing to the problem.

      Again, responsible people do not put "Litigious Bastards" links in their slashdot posts.

      Think about it. How would you like a google search for Litigious Bastards to point to your company, leading everyone to think that you and your co-workers are nothing but a bunch of Litigious Bastards?

      --
      There are no tiger attacks in my area and it's all because this rock I'm holding keeps the tigers away.
    12. Re:Why just wikis? by boa13 · · Score: 3, Informative

      So what's needed is careful set up in the robots.txt file and other HTML clues for the web crawlers to exclude anything but the most current version of a page (and to skip over the other 'action' pages, like edits, etc).

      It has probably already been done in any wiki software worth its salt. Here's what MoinMoin does for example:

      * It has a regexp of HTTP_USER_AGENTS which should receive a FORBIDDEN for anything except viewing a page. The default setting includes many known bots (including Google) and utilities such as wget.
      * Most pages contain the appropriate robot meta tag, with the relevant noindex and/or nofollow settings.

      In addition to that, the webmaster can of course set up a robots.txt file, and actually should do so because there are tools out there which don't understand the robot meta tags (or they don't want to take a performance hit) and the user agent of which can easily be changed by the user... wget comes to mind.

      Of course, it shouldn't be too hard to add regexps to prevent certain links from being done, or certain hostnames or IPs from altering the site (editing pages, reverting them, deleting them).
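The user-agent check described above might look roughly like the following. This is a hedged sketch of the general technique, not MoinMoin's actual code: the pattern, the action names, and the `status_for` function are all assumptions made for illustration.

```python
import re

# Hypothetical bot pattern; a real wiki would ship its own, longer list.
BOT_PATTERN = re.compile(r"googlebot|wget|slurp|crawler|spider", re.IGNORECASE)

# Hypothetical name for the plain "view this page" action.
READ_ONLY_ACTIONS = {"show"}


def status_for(user_agent: str, action: str) -> int:
    """Return 403 when a known bot asks for anything other than a plain page view."""
    if action not in READ_ONLY_ACTIONS and BOT_PATTERN.search(user_agent):
        return 403
    return 200
```

As the comment notes, a user agent string is trivially changed, so a check like this only keeps well-behaved crawlers out of edit, diff, and history pages.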

    13. Re:Why just wikis? by bcrowell · · Score: 1
      Why not normal discussion boards and blogs?
      Heck, why not the front page of Slashdot? Wouldn't it be cool if you could get an article frontpaged on Slashdot that linked to your Nigritude Ultramarine site (where you talk about spamming wikis)?

      I think a link from http://slashdot.org probably boosts your Google page rank a little more than one from a typical Wiki sandbox :-)

    14. Re:Why just wikis? by M.+Silver · · Score: 1

      Because the sandbox area is specifically for dinking around posting test messages, so nobody particularly maintains it. Normal discussion boards and blogs *do* get hit (Google for "comment spam"), but they're more likely to get cleaned up before Google spiders the page.

      --

      Slashdot's token middle-aged housewife
    15. Re:Why just wikis? by Eivind · · Score: 4, Informative
      It's working almost *too* well. Not only are SCO the number one hit for "litigious bastards", but they're also the number one hit for "litigious" or "bastards" alone.

      Then again maybe that mostly says something about their popularity.

    16. Re:Why just wikis? by Dirk+Pitt · · Score: 0, Troll

      Yeah, you know if 'bastards' brings up your site before a mention of the republicans, you're doing something very wrong.

    17. Re:Why just wikis? by hunterx11 · · Score: 1

      Funnier would be if the original Slashdot article on Nigritude Ultramarine won the contest.

      --
      English is easier said than done.
    18. Re:Why just wikis? by Anonymous Coward · · Score: 0

      So should we all use JBOSS now?
      Or probably something faster than JBOSS?

    19. Re:Why just wikis? by mrtroy · · Score: 3, Funny

      Top 5 reasons that unix > linux, according to SCO

      SCO UNIX® is a Proven, Stable and Reliable Platform
      SCO UNIX® is backed by a single, experienced vendor
      SCO UNIX® has a Committed, Well-Defined Roadmap
      SCO UNIX® is Secure
      SCO UNIX® is Legally Unencumbered

      HAHAHAHAHAAHHAHAHAHAHAHAHA

      That should be a top 10 list, and on letterman's show

      --
      [I can picture a world without war, without hate. I can picture us attacking that world, because they'd never expect it]
    20. Re:Why just wikis? by May+Kasahara · · Score: 1

      Ironically enough, the first-placer that you mentioned has been the one group that has been spamming my wiki's sandbox the most in recent days (see sig)... methinks it's a group effort on the part of the messageboard community -_-

    21. Re:Why just wikis? by Hooded+One · · Score: 1

      So what's needed is careful set up in the robots.txt file and other HTML clues for the web crawlers to exclude anything but the most current version of a page (and to skip over the other 'action' pages, like edits, etc).

      That should be done anyway. It's annoying when I'm searching for information and the Google link goes straight to the Edit page of a Wiki. Yes, I can easily go to the regular page from there, but this will confuse many users, and it just looks tacky.

    22. Re:Why just wikis? by jesterzog · · Score: 1

      Why not normal discussion boards and blogs?

      Do you mean like this one? It would have to be one of the most spammed and neglected discussion boards that I've ever come across.

  2. Cyberneighborhood Not-Watch? by raehl · · Score: 5, Interesting

    In the real world, there are neighborhood watch signs to "deter" criminals.

    Perhaps there could be a command in the robots.txt file which says "Browse my site, but don't count any links here for page ranking"? That would make your site less of a target for spammers, but not prevent you from being ranked at all.

    1. Re:Cyberneighborhood Not-Watch? by lunax · · Score: 3, Insightful

      Why not put the sandbox in its own folder and add an entry to the robots.txt telling it not to browse that folder?
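A quick way to check that such a rule does what you expect is Python's standard `urllib.robotparser`. The `/sandbox/` folder name and the page paths below are assumptions for the sake of the example; substitute whatever layout your wiki uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of a /sandbox/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /sandbox/
"""


def is_crawlable(path: str, agent: str = "Googlebot") -> bool:
    """Report whether a compliant crawler would be allowed to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(is_crawlable("/sandbox/TestPage"))
    print(is_crawlable("/FrontPage"))
```

This only tells you how a crawler that honours robots.txt will behave; it does nothing to stop the spam from being posted.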

    2. Re:Cyberneighborhood Not-Watch? by Random+Web+Developer · · Score: 5, Informative

      There is a robots meta tag for this that you can put in your headers for a single page (robots.txt needs subdirs) but unfortunately most webmasters are too ignorant to realize the power of these:

      http://www.robotstxt.org/wc/meta-user.html

      --
      Artists against online scams http://www.aa419.org/
    3. Re:Cyberneighborhood Not-Watch? by Random+Web+Developer · · Score: 2, Informative

      The problem with wikis is that they use 1 template for all pages, including the sandbox; everything is wiki.pl?PageName or something like that. You would have to dive into the code instead of just "using" the wiki.

      --
      Artists against online scams http://www.aa419.org/
    4. Re:Cyberneighborhood Not-Watch? by naoiseo · · Score: 3, Insightful

      This fails to address the real issue.

      That is, even if you make your links useless (easy with a no-follow meta tag) it won't help: the majority of this spam is AUTOMATED, and will spam your wiki/blog/guestbook based on simple page cues.

      Your best personal defense is to manually remove any page or HTML cues that a spammer would pick up on as being common to a certain type of postable web page or element.

      Bloggers have been creating blacklists (banning both poster ips and destination urls) with some degree of success. This is a deterrent, having a spammer show up on a blacklist whereby webmasters use a distributed file to 'clean' their blogs automatically.

    5. Re:Cyberneighborhood Not-Watch? by Jeff+DeMaagd · · Score: 1

      I think one quick, easy fix is to disallow hyperlinks in the comments / guest book. If it isn't an "a href" then Google's spider won't take it.

    6. Re:Cyberneighborhood Not-Watch? by Random+Web+Developer · · Score: 2, Informative

      As most spam posts have several links in them, WordPress allows setting a threshold: a comment with X links in it gets queued for moderation.

      --
      Artists against online scams http://www.aa419.org/
    7. Re:Cyberneighborhood Not-Watch? by phutureboy · · Score: 4, Interesting

      You can also list robots.txt commands as meta tags in the [head] portion of the document. So, the wiki authors could just put them in the sandbox template, and individual site owners would not even have to know about / monkey with robots.txt to be protected.
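As a rough sketch of what that could look like in a wiki's page template, the function below picks the robots meta tag based on the page name. The page names and the `robots_meta` helper are hypothetical; no particular wiki engine is being quoted here.

```python
# Hypothetical names of the pages that should stay out of search indexes.
SANDBOX_PAGES = {"SandBox", "WikiSandBox"}


def robots_meta(page_name: str) -> str:
    """Build the robots meta tag to emit in the <head> of the given page."""
    if page_name in SANDBOX_PAGES:
        # Ask crawlers not to index the page and not to follow its links.
        content = "noindex,nofollow"
    else:
        content = "index,follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    print(robots_meta("SandBox"))
    print(robots_meta("FrontPage"))
```

Because the decision lives in the template, site owners get the protection without touching robots.txt, which is the point being made above.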

    8. Re:Cyberneighborhood Not-Watch? by Random+Web+Developer · · Score: 1

      having the wiki coders handle this would definitely rule to the umpteenth degree!

      --
      Artists against online scams http://www.aa419.org/
    9. Re:Cyberneighborhood Not-Watch? by jacoplane · · Score: 2, Insightful

      I think the real problem is that spammers aren't likely to look at how you've configured spiders to handle your site. So even if you do this I'm sure it won't get rid of the spammers.

    10. Re:Cyberneighborhood Not-Watch? by chris_mahan · · Score: 1

      best would be an autorevert to a several-months-old version of the page every week or so. This would also flush the histories.

      --

      "Piter, too, is dead."

    11. Re:Cyberneighborhood Not-Watch? by bkhl · · Score: 2, Insightful

      Also, we really need to replace the kludgy robots.txt files and robots meta-tags with headers built in to the HTTP protocol.

      Like it is, it's hell to try to get decent robotic behaviour out of anything other than HTML pages.

    12. Re:Cyberneighborhood Not-Watch? by Tony-A · · Score: 1

      I think the real problem is that spammers aren't likely to look at how you've configured spiders to handle your site.

      Hmmmm, is there any way to indicate that indicated stuff has negative value in computing anything useful? Build booby traps. Catch boobies.

  3. Oh well by SpaceCadetTrav · · Score: 5, Informative

    Google and others will just lower/diminish the value of links from Wiki pages, just like they did to those open "Guest Book" pages on personal sites.

  4. Yes... PLEASE... by Paulrothrock · · Score: 4, Insightful
    Google needs to do something about this. I had to turn off comments on my blog because all I was getting was spam. Two or three a day that I had to go in and delete. I have to now find a system that will keep the bots out.

    What happened to the nice internet we had in 1996?

    --
    I'm in the hole of the broadband donut.
    1. Re:Yes... PLEASE... by ack154 · · Score: 1

      I still haven't really seen a problem with this on my blog. I've had comments enabled for the past two years and have maybe gotten 3 or 4 total spam comments in that time (one today actually).

      Mine has always been set to not allow anon comments, but I know most people have that set as well.

      I have been using MovableType and just haven't really had any problems. Been lucky I guess.

    2. Re:Yes... PLEASE... by lukewarmfusion · · Score: 2, Interesting

      As my site grows, I'm thinking about adding a mechanism to address those issues: when the user requests a page for the first time, he'll get a session value that says he's a valid visitor to the site. When he submits a comment, he has to have that value, or comments aren't allowed. I don't know how you'd write a script to circumvent that. (If someone can tell me, I'd love to know so I can try to prevent it!)

    3. Re:Yes... PLEASE... by karmatic · · Score: 1

      For my blog (which uses WordPress), I added in a redirect page. This page has noindex,nofollow on it - so no pagerank goes out.

      Also, any comments containing more than X links or spammy terms (customizable) automatically require moderator approval.

      So far, no successful spam. Remove the incentive, nobody will bother.

      "What happened to the nice internet we had in 1996?
      AOL. No really (although other "user-friendly ISPs hurt too"). Because of the influx of technically illeterate (or just incompetent) people, "Spammy" techniques work. PR manipulation, bulk mailings, etc. actually make money.

      Where the suckers are, the people who exploit them go. A mandatory proficiency/IQ test to get on the 'net would go a long way towards helping alleviate these problems.

    4. Re:Yes... PLEASE... by n-baxley · · Score: 4, Interesting

      The system was even easier to rig back then. Back in 96ish, I created a web page with the title "Not Sexy Naked Women". Then repeated that phrase several times and then gave a message telling people to click the link below for more Hot Sexy Naked Women which took them to a page that admonished them for looking for such trash. I added a banner ad to the top of both of these pages, submitted them to a search engine and made $500 in a month! Things are better today, but they're still not perfect.

    5. Re:Yes... PLEASE... by happyfrogcow · · Score: 2, Funny

      What happened to the nice internet we had in 1996?

      i blame blogs

    6. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0
      influx of technically illeterate (or just incompetent) people,

      Nice irony there.

    7. Re:Yes... PLEASE... by Paulrothrock · · Score: 1
      I'm using Wordpress, and before that b2. It's only started in the past month, too.

      Unfortunately, my spam comments fill in the email fields, so I can't turn off anonymous comments. Is there any way for me to get the IP addresses of spam comments and forward this to the authorities?

      --
      I'm in the hole of the broadband donut.
    8. Re:Yes... PLEASE... by Paulrothrock · · Score: 2, Insightful

      No, I blame opportunistic bastards who can't see that it's okay to not profit from something. *Thinks about his sledding hill that was destroyed by an upscale minimall.*

      --
      I'm in the hole of the broadband donut.
    9. Re:Yes... PLEASE... by Safety+Cap · · Score: 1
      I had to turn off comments on my blog because all I was getting was spam.
      The simple solution is to require the poster to read a distorted graphic of a random numeric value and enter the value into a field in order to submit his message.
      --
      Yeah, right.
    10. Re:Yes... PLEASE... by Frizzle+Fry · · Score: 1

      Based on your use of bold, you seem to be saying it's ironic that he couldn't spell illiterate, but equally ironic is that his screed against the "technically illeterate" is contained in an improperly closed italics tag.

      --
      I'd rather be lucky than good.
    11. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0

      YEA! Fuck the blind! They have no business on the Web anyway.

    12. Re:Yes... PLEASE... by Talonius · · Score: 1

      Yes, but that solution is a hindrance to the blind and visually impaired. I've helped several people set up screen readers on their computers here in St. Louis -- I know for a fact that they do use the Internet, quite a lot.

      (Consider reading the news online through a screenreader vs. trying to read the daily newspaper.)

      --
      My reality check bounced.
    13. Re:Yes... PLEASE... by ack154 · · Score: 1

      Well, aside from being able to forge IPs and such, my question to that would be...

      What authorities would you be sending them to? It isn't really "illegal" to spam someone's comments, at least, not that I know of.

    14. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0

      I was thinking that, right after I clicked submit instead of preview.

      D'oh!

    15. Re:Yes... PLEASE... by Nasarius · · Score: 1

      Well if you're setting a "session value", you're either using cookies or rewriting the links. So all that the script has to do is handle cookies properly or follow your "post a comment" links, neither of which is very hard.

      --
      LOAD "SIG",8,1
    16. Re:Yes... PLEASE... by lukewarmfusion · · Score: 1

      I didn't realize that a script could handle cookies so easily. Perhaps I should just stick with my first plan to use a "type in these fuzzy-looking letters and numbers" method.

      Of course, I'm also requiring logins for most things.

    17. Re:Yes... PLEASE... by NightRain · · Score: 1
      Just upgrade to Wordpress 1.2. You can set a threshold for number of links before a comment automatically requires moderation. So set it to 3, and if there are more than 3 links, it needs moderation before it appears.

      You can also set up some keywords which make a post require moderation. Given that the idea of comment spam is to raise google rank for certain keywords, blacklisting these keywords is fairly effective.
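The two rules described above (a link threshold and a keyword list) are simple enough to sketch in a few lines. This is not WordPress's implementation; the `needs_moderation` function, the link pattern, and the example keywords are assumptions chosen to mirror the description.

```python
import re

# Counts anything that looks like the start of an http(s) URL.
LINK_RE = re.compile(r"https?://", re.IGNORECASE)


def needs_moderation(comment: str, max_links: int = 3, keywords=frozenset()) -> bool:
    """Hold a comment if it has more than `max_links` links or contains a listed keyword.

    `keywords` is expected to contain lowercase strings.
    """
    if len(LINK_RE.findall(comment)) > max_links:
        return True
    lowered = comment.lower()
    return any(word in lowered for word in keywords)


if __name__ == "__main__":
    spam = "http://a.example http://b.example http://c.example http://d.example"
    print(needs_moderation(spam))
    print(needs_moderation("Nice post!", keywords=frozenset({"casino"})))
```

With the threshold set to 3, a comment with exactly three links still goes straight through; only the fourth link trips the check.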

      Ray

    18. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0

      Perhaps a drivers license, only for the internet. A "Surfers License," so to speak.

    19. Re:Yes... PLEASE... by joggle · · Score: 2, Interesting

      Why not generate an image containing modified text like yahoo and others? Using a little PHP magic, it shouldn't be too hard (see here to get a start).

    20. Re:Yes... PLEASE... by Nurseman · · Score: 1
      " YEA! Fuck the blind! They have no business on the Web anyway."

      Not to be a troll, or feed the trolls, but how does a blind person view the web now? If it is large text based, then there is a way that this can work, or if it is done by voice, can't you add a little MP3 "To leave a comment, first type "Bill Gates is wise"" or something like that?

      --
      Save a Life. Donate Blood. Please.
    21. Re:Yes... PLEASE... by Paulrothrock · · Score: 1

      Thanks! I really hadn't thought about who I'd report the IPs to. Maybe I could sue them or something, maybe for any profits derived from posting to my page without permission...

      --
      I'm in the hole of the broadband donut.
    22. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0

      so are you still a fraudulent asshat, or did you grow out of that?

    23. Re:Yes... PLEASE... by Willard+B.+Trophy · · Score: 1
      I guess your blog just hasn't been found yet. Is it known to Google?

      Before I installed MT-Blacklist, I had over 130 comment spams on my site. Now I get at most a couple a day that get by MT-Blacklist, and these can easily be added to my blocked list.

      For a while, I was getting over 100 blocked spam attempts from an IP address purportedly in Hungary.

    24. Re:Yes... PLEASE... by Idarubicin · · Score: 1
      Then repeated that phrase several times and then gave a message telling people to click the link below for more Hot Sexy Naked Women which took them to a page that admonished them for looking for such trash.

      That was you? You bastard. It's your fault I could never find porn.

      --
      ~Idarubicin
    25. Re:Yes... PLEASE... by Paulrothrock · · Score: 1
      I think I've figured it out. I will create an email address on my server. Then I will write a bookmarklet that I can hit when I'm moderating comments with over three links to send the offending web site an email detailing the fact that I will consider any more spam posts from their site or their business associates harassment and prosecute accordingly. I'll check this account weekly to see if they get the hint. If not, they will have a lawsuit on their hands.

      It's a shame that people can't just do the right thing. I guess that's why I got cussed out by some asshole Roto Rooter guy I parked behind who couldn't get out.

      --
      I'm in the hole of the broadband donut.
    26. Re:Yes... PLEASE... by ack154 · · Score: 1

      Ya, it's been in Google for quite some time. Like I said, I never saw the big influx of comment spam. Saw a couple, but never huge numbers.

      Yet, I know people that have turned comments on as a test for a few days and have just gotten slammed with spam and turned them right back off.

      Who knows. If it becomes a problem for me, I'll look into it more, but right now, it hasn't been an issue, luckily.

    27. Re:Yes... PLEASE... by Fjord · · Score: 1

      If you've set up a page to allow comments to be posted, then you have implicitly given permission for people to post comments. If you don't want random people posting random things, then don't go through all the work to set up a server that allows them to.

      --
      -no broken link
    28. Re:Yes... PLEASE... by Fjord · · Score: 1

      Blind people can blog with us seeing eye folk just fine. They can use a braille reader or have a voice synth read the pages they are on (depends on preference. The only blind guy I knew only used the braille reader). The mp3 thing is a good idea as a fall back (wouldn't want to exclude the deaf or those without a speaker hooked up).

      --
      -no broken link
    29. Re:Yes... PLEASE... by SirPrize · · Score: 1

      curl and wget can both store and use cookies between invocations. A simple three-liner then to request the first page, perform the post to a preview, and then confirm the preview.

    30. Re:Yes... PLEASE... by sampowers · · Score: 1

      Ha. I think if you set up an email address for correspondence with spammers, it's just going to get spammed into a crater.

    31. Re:Yes... PLEASE... by Paulrothrock · · Score: 1

      But if I put a disclaimer saying that this is for personal, non-profit use, they would have to obey it or be sued. I'm sure there are a lot of servers where you're not allowed to use resources for commercial purposes. I could put it in my TOS.

      --
      I'm in the hole of the broadband donut.
    32. Re:Yes... PLEASE... by Paulrothrock · · Score: 1

      Such is the power of unlimited email addresses from my hosting company: I can create one, send them the warning, and then delete it before they have a chance to spam me.

      --
      I'm in the hole of the broadband donut.
    33. Re:Yes... PLEASE... by Yacoubean · · Score: 1

      I think I stumbled onto your page back then. ;) Does that mean you owe me money? Or was I just another one of your pornbait casualties? I really had a chuckle when I saw that though, assuming it was your site.

    34. Re:Yes... PLEASE... by n-baxley · · Score: 1

      Bwahhahahahahahaha!!

    35. Re:Yes... PLEASE... by n-baxley · · Score: 1

      Well, if it makes you feel any better, the company I was serving ads through, Riddler, shut me down after 3 months when I broke into their top 25 list. I told them it shouldn't matter as their ads were still getting seen, but oh well. Glad to bring a little smile to your life. :)

    36. Re:Yes... PLEASE... by Technically+Inept · · Score: 1
      With most blogs, that's swatting a fly with a sledgehammer.

      A friend of mine recently had the insight that while his site was highly ranked enough to attract a ton of blog spam, it certainly wasn't important enough that spammers would attempt circumvention of even the simplest security measures.

      Hence, the comment entry page became:
      Enter the number 3 here:
      Enter your comment here:

      It's been in place for a month and a half and has a 100% success rate so far.
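Server-side, a fixed challenge like that amounts to a single comparison. The sketch below uses hypothetical names (`EXPECTED_ANSWER`, `accept_comment`) and assumes the form posts the challenge field alongside the comment text.

```python
# The fixed answer the form asks for ("Enter the number 3 here").
EXPECTED_ANSWER = "3"


def accept_comment(challenge_answer: str, comment: str) -> bool:
    """Accept the comment only if the challenge field holds the expected answer."""
    answered = challenge_answer.strip() == EXPECTED_ANSWER
    has_text = bool(comment.strip())
    return answered and has_text
```

It only works while nobody bothers to target the site specifically, which is exactly the bet being described; a bot that fills in "3" defeats it at once.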

      --
      Now watch me hit this drive.
    37. Re:Yes... PLEASE... by LiquidCoooled · · Score: 1

      That's like putting up a EULA on your software and expecting the pirates to actually take notice.

      --
      liqbase :: faster than paper
    38. Re:Yes... PLEASE... by J'raxis · · Score: 1
      A cookie is just an HTTP header. An HTTP header is just a line of text. To wit:
      Cookie: name=value
      You can do this in 10 lines of Perl code, using something like LWP::UserAgent, or even drop down to doing it with plain old sockets. Here's a complete request, replete with cookies:
      GET /url HTTP/1.1
      Host: www.example.com
      Cookie: name=value
      Any competently-programmed bot would probably add fake user-agent and referer headers to make it look like plain-vanilla MSIE, and make it look like it was coming from a parent page on your site.
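To show how little is involved, here is the same request assembled as raw bytes in Python, ready to be written to a plain socket. The `build_request` helper is an assumption for illustration; the host, path, and cookie are the placeholder values from the request above, plus a `Connection: close` header added so the server ends the response.

```python
def build_request(host: str, path: str, cookies: dict) -> bytes:
    """Assemble a minimal HTTP/1.1 GET request that carries the given cookies."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if cookies:
        # Multiple cookies share one header, separated by "; ".
        cookie_line = "; ".join(f"{name}={value}" for name, value in cookies.items())
        lines.append(f"Cookie: {cookie_line}")
    lines.append("Connection: close")
    # Header lines end in CRLF, and a blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_request("www.example.com", "/url", {"name": "value"}).decode("ascii"))
```

Since the session value is just text echoed back in a header, requiring it does not separate a script from a browser.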
    39. Re:Yes... PLEASE... by Anonymous Coward · · Score: 0

      No, leave it up and then prosecute under the CAN-SPAM act.

    40. Re:Yes... PLEASE... by Fjord · · Score: 1

      I am doubtful. To actually have a contract, both parties have to agree to it. You can't say "by posting you agree to abide by these terms" and expect that to be binding either. All the TOS can do is make it so that you can block their IP and they can't sue you for hardship on their business.

      --
      -no broken link
    41. Re:Yes... PLEASE... by cgenman · · Score: 1

      What? You mean people behaving in a self-aggrandizing, immature, money grubbing fashion... on the Internet?! Say it isn't so!

  5. like porn by millahtime · · Score: 4, Interesting

    This seems similar to the system all those porn sites used to get such a high rank in google.

    Kind of playing the system, with the content not being quite as desirable.

  6. Naughty behaviour by doodlelogic · · Score: 1

    Just a shame that Google is one of the few search engines that are any good.

    I always use All the Web when looking for any company or organisation I know the name of, but for more general queries I'm looking for a clean, fast, non-buggy alternative to the google giant. Preferably open source.

    Any suggestions?

    1. Re:Naughty behaviour by Anonymous Coward · · Score: 0
      I always use All the Web when looking for any company or organisation I know the name of, but for more general queries I'm looking for a clean, fast, non-buggy alternative to the google giant. Preferably open source.

      WTF does "preferably open source" mean for a search engine? Who cares if it's open source or closed source or something some guy hacked up in his basement out of an old shareware app he wrote? It's a search engine!

    2. Re:Naughty behaviour by Syzar · · Score: 1

      Nutch aims to create an open-source search engine, though they don't have anything yet.

    3. Re:Naughty behaviour by Anonymous Coward · · Score: 0
      I swore off Alltheweb after they started using Yahoo tracking links.

      I have found favorable results with Wisenut.

    4. Re:Naughty behaviour by Doesn't_Comment_Code · · Score: 2, Informative

      I'm looking for a clean, fast, non-buggy alternative to the google giant. Preferably open source.

      Any suggestions?


      The only big one I know of right now is Nutch. It is an open source search engine that is in the later stages of development, but hasn't produced a large, usable site yet.

      nutch.org

      Since it will be open source, you will be able to read the ranking algorithms and change/abuse them as you see fit.

      This one http://search.mnogo.ru/ is also available.

      --

      Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
    5. Re:Naughty behaviour by Anonymous Coward · · Score: 0
    6. Re:Naughty behaviour by Lodragandraoidh · · Score: 1

      Teoma is a fairly decent search engine; not open source, however.

      --

      Lodragan Draoidh
      The more you explain it, the more I don't understand it. - Mark Twain
    7. Re:Naughty behaviour by julesh · · Score: 1

      Since it will be open source, you will be able to read the ranking algorithms and change/abuse them as you see fit.

      Actually, this is an interesting idea. Allowing site users to fiddle with the algorithm used to rank their results... if you're not using the same algorithm as everyone else, SEO tactics might actually decrease results!

  7. You know... by fizban · · Score: 3, Insightful

    ...what Google needs? A "Was this result helpful in your search?" button for each link returned, so that the search itself also influences page ranks. Maybe that will help get rid of this Google bombing mess.

    --

    +1 Insightful, -1 Troll. What can I say, I'm an Insightful Troll.

    1. Re:You know... by Anonymous Coward · · Score: 1, Insightful

      Because obviously bots couldn't mess with that....

    2. Re:You know... by Anonymous Coward · · Score: 1, Insightful

      "...what Google needs? A "Was this result helpful in your search?" button for each link returned, so that the search itself also influences page ranks. Maybe that will help get rid of this Google bombing mess."

      Except spambots can also work to make sure that the most helpful links are the ones linking to spam sites.

    3. Re:You know... by Anonymous Coward · · Score: 4, Insightful

      that button will also get spammed, as bots will click 'yes' for their sites and 'no' for the competitors sites

    4. Re:You know... by goon+america · · Score: 3, Insightful

      Wouldn't that be equally abused?

    5. Re:You know... by gunnk · · Score: 1

      I'm guessing that you are asking:

      "What's to keep Google-bombers from marking down the significance of real links in order to increase the rank of their links?"

      One way to mitigate it is simply to let a given IP address mark a link as good or bad only once. The bomber would have to use a multitude of IP addresses in order to make any significant counter to the huge number of legitimate users that would be marking them down. It would be too labor intensive and therefore cost prohibitive.
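
      A minimal Python sketch of the once-per-IP rule described above. The `VoteTally` class and its method names are hypothetical, made up for illustration; nothing here reflects how Google actually handles feedback, and the addresses are documentation-range placeholders.

```python
from collections import defaultdict


class VoteTally:
    """Counts at most one vote per (IP address, URL) pair. Hypothetical helper."""

    def __init__(self):
        self._seen = set()               # (ip, url) pairs that have already voted
        self._score = defaultdict(int)   # url -> helpful votes minus unhelpful votes

    def vote(self, ip, url, helpful):
        """Record a vote; return False if this IP already voted on this URL."""
        key = (ip, url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._score[url] += 1 if helpful else -1
        return True

    def score(self, url):
        """Net score for a URL (0 if nobody has voted on it)."""
        return self._score[url]


tally = VoteTally()
tally.vote("192.0.2.1", "http://example.com/", True)
tally.vote("192.0.2.1", "http://example.com/", True)   # ignored: same IP, same link
tally.vote("192.0.2.2", "http://example.com/", False)
```

      As other replies in this thread point out, a rule like this does nothing against a bomber who controls many addresses, and it lumps together every user behind a shared proxy.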

      --
      Life is short: void the warranty.
    6. Re:You know... by Nasarius · · Score: 2, Insightful

      Ah, but how long will it take for someone to write a worm with a Google-abusing payload? We've already got spammers using hacked PCs to send mail.

      --
      LOAD "SIG",8,1
    7. Re:You know... by radish · · Score: 1

      There already is one. It's on the toolbar...

      --

      ---- One button is the power button, the other is the Bender voice button "Bite My Shiny Metal Ass"

    8. Re:You know... by Anonymous Coward · · Score: 0

      I'll post this again, since some crack-addled moderator reckons it was overrated.

      When you report a link as being unsuitable, Google doesn't act on it automatically. It's reviewed by a human. The only thing spammers will get if they try to abuse it is a complaint from Google to their provider for wasting their time.

    9. Re:You know... by alanh · · Score: 1

      What about the so-called "mega-proxies" from certain ISPs? They make all of their requests from hundreds or thousands of users look like they're coming from the same IP address.

      --
      - AlanH
    10. Re:You know... by Woogiemonger · · Score: 1

      You know what Google needs? A "Was this result helpful in your search?" button for each link returned, so that the search itself also influences page ranks. Maybe that will help get rid of this Google bombing mess.

      I think the only way for that to work is to charge something like $5 a year per Google user account. Otherwise, soon Google would have to start dealing with scripting that searches for certain web sites and chooses "Yes, this link is useful!" thousands of times per second. Then maybe you can try to restrict it by IP address, but the scripts would turn into some sort of trojan that makes other people do the same thing. Then you'd have them identify some sort of letters/numbers in graphics, which I remember seeing a previous Slashdot story about scripters having a way around as well...etc, etc.

    11. Re:You know... by Xhad · · Score: 1
      You could have a registration process and keep track of how much individual users vote.

      Some would probably object to having to log on every time they want to search for something, and I agree, but the people who would be willing to keep themselves perpetually logged in at home would be able to make a difference on their own.

      There would have to be other measures to prevent abuses, but having one of those "prove you're a human" tests at registration and requiring a valid email address would be a start. Again, seems rather intrusive just to use a search engine, but that's why they'd have to still allow unregistered users to SEARCH but not VOTE if they choose.

    12. Re:You know... by Anonymous Coward · · Score: 0

      If you want to look into this more the proper term is "Relevance Feedback".

      Not that I'm an IR student or anything. (IANAIRS)

    13. Re:You know... by DrJonesAC2 · · Score: 1

      You could combine it with verifying a number from an image AND only letting each IP address vote once per day. That would make it a fairly solid system.

    14. Re:You know... by pseudochaotic · · Score: 1

      Even better, they could add a javascript that sends them back information on which links you clicked. Harder to bot than the existing system, and if you don't like them watching what you click, there's always copy/paste.

      --
      And the l33t shall inherit the 34r7h.
    15. Re:You know... by JohnGalt00 · · Score: 1

      Which doesn't help for us hippie (Linux/BSD/Mac) users...

    16. Re:You know... by nsingapu · · Score: 1

      If you look in google source they do track clicks via javascript.

      They claim* to use this information in a passive manner (i.e. they watch aggregated information collected from certain searches over a short timeframe to evaluate the effectiveness of changes).

      I think (hope) it must be a matter of time until click throughs and lack thereof play an extremely prominent role in their algorithm - why have a handful of programmers make changes when you can utilize the largest collection of searchers out there to make changes for you? The ideal search engine would be one that searches like me - or you - or John Elway...whatever just as long as there is a guy behind the box

      If implemented well (well as in rather intelligently - track not only clicks but multiple clicks on the same search and timeframes between them etc.) then it would have potential to mitigate crap like this. **

      *I believe the source of this claim is a google employee "googleguy" who frequents webmasterworld.com

      **this network has been submitted via 3 separate spam reports starting approximately six months ago and still many sites rank well for competitive one word phrases. This network does not (as a whole) use the techniques relevant to this discussion - they employ hidden images on their respective homepages to link to a number of their pages, a sitemap of hidden pages and a friends page - but really - javascript redirects - that's so 2003

  8. < jab jab > by jx100 · · Score: 2, Interesting

    Well, couldn't have been that successful, for he didn't win.

  9. Some people ... by TheGavster · · Score: 2, Insightful
    It still gets me how the people who are participating in the nigritude ultramarine thing don't see anything wrong with what they're doing. This line particularly got me:
    "Without, as opposed to guestbook spamming, being evil it's a sandbox after all."

    Yes, it's a sandbox; no, it's not your personal playground.
    --
    "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
  10. google works by mwheeler01 · · Score: 3, Informative

    Google does tweak their ranking system on a regular basis. When the problem becomes evident, (and it looks like it just has) they do something about it...that's why they're google.

    --
    Pretty widgets? What pretty widgets?
    1. Re:google works by lommer · · Score: 1

      Agreed, I also think that completely devaluing wikis or not counting them in pagerank at all is a mistake. Unmolested wikis are generally a very accurate source of links to pages that are reputable and useful, and Google knows this. I think they would be more likely to implement a scheme where the longer a link has been in place on a wiki, the more it counts for, thereby eliminating the usefulness of the small, short-lived links that the article discusses.
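
      One way to picture the "older links count for more" idea is a weight that ramps up with link age. This is only a sketch in Python: the linear ramp, the 90-day cutoff, and the function names are all assumptions chosen for illustration, not anything Google is known to do.

```python
def link_weight(age_days, ramp_days=90.0):
    """Weight in [0, 1] that grows linearly with link age and is capped at 1.

    ramp_days is an arbitrary, assumed cutoff after which a link counts in full.
    """
    if age_days <= 0:
        return 0.0
    return min(age_days / ramp_days, 1.0)


def page_score(link_ages_days, ramp_days=90.0):
    """Sum of age-based weights over all inbound links to a page."""
    return sum(link_weight(age, ramp_days) for age in link_ages_days)


# Three sandbox links that each survived one day vs. one link that lasted a year.
spam = page_score([1, 1, 1])
established = page_score([365])
```

      Under these assumed numbers, the three one-day sandbox links together count for far less than a single long-standing link.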

  11. Who's fault is that? by lukewarmfusion · · Score: 4, Insightful

    Google's algorithm isn't the problem. The problem is the availability of easily abused areas such as these "sandboxes."

    Some search engines accept any old site. Others accept sites based on human approval and categorization. Google is a nice combination of the two - by using outside references (counting how often the site is linked) it assumes that the site is more relevant. Because other people have put links on their sites. That's a human factor, without directly using human beings to review and categorize the sites and rankings.

    Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.

    1. Re:Who's fault is that? by Chanc_Gorkon · · Score: 1

      Yes it is. When, with fewer than a million links, searches for "miserable failure" on Google lead to President Bush's biography on the White House web site, that's a problem (leave your political views out of this). The same goes for "Weapons of Mass Destruction" and other Google bombs. Google....fix it now before it gets to be a real problem.

      --

      Gorkman

    2. Re:Who's fault is that? by lukewarmfusion · · Score: 1

      Two issues here:

      1. The problem still exists on the side of the provider with the links. Who coordinated these million links that resulted in the "Google bomb?" Why not complain to them?

      2. Is it really a problem? Google has no public responsibility to report rankings according to the demand of anyone; if they wish to block Linux altogether and replace Linus/OSS searches with Microsoft-sponsored results, they can do so. But it would hurt their business and credibility. I'm confused as to why people think that they have any right to dictate how Google should run their company or rank the search results.

      There was a Jewish group that complained to Google because some searches returned an anti-semitic website. After removing the link for a brief period, they put it back up because it was there as a result of honest ranking through their algorithm. While I don't enjoy seeing such a result, I defend their (the site's) ability to say what they wish and Google's decision to rank sites as they see fit.

      Google is number one because they do things better than anyone else. Your last sentence, "fix it now before it gets to be a real problem" is ridiculous - what "real problem" could there ever possibly be as a result of search engine rankings?

    3. Re:Who's fault is that? by Chanc_Gorkon · · Score: 1

      No wait....the Google algorithm has a hole. Does the president's biography have the words miserable failure in it? Why is the linked text taken as a meaning of what is on the site? Those webmasters who put the link all over their sites are only taking advantage of a hole in the Google algorithm. Google should simply do a text search and make sure that miserable failure is actually ON the web page that that text links to. Then google bombs would have no effect.
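
      The check proposed here could look roughly like the Python sketch below. The function names are hypothetical and the "biography" string is a made-up stand-in, not the real page text; a real crawler would have to deal with markup, stemming, and much more.

```python
import re


def words(text):
    """Lowercased alphanumeric tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


def anchor_text_supported(anchor_text, page_text):
    """True if the anchor's words appear as a consecutive phrase in the page text."""
    phrase = words(anchor_text)
    tokens = words(page_text)
    if not phrase:
        return False
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


# Illustrative sample text only.
biography = "George W. Bush is the 43rd President of the United States."
bombed = anchor_text_supported("miserable failure", biography)                  # False
honest = anchor_text_supported("President of the United States", biography)    # True
```

      A filter like this would ignore the bombed anchor text, at the cost of also ignoring honest labels for pages that carry little or no text of their own.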

      --

      Gorkman

    4. Re:Who's fault is that? by lukewarmfusion · · Score: 1

      That's actually a good recommendation. You could still rank the site because it has the link, but detach the link from the label.

      What if a site does not contain any text - like a blank page with a flash movie on it? You'd still want it to be found through Google, but there would be no way of knowing what search terms should find it without using the label from the linking site.

    5. Re:Who's fault is that? by bcrowell · · Score: 2, Interesting
      Google's algorithm isn't the problem. The problem is the availability of easily abused areas such as these "sandboxes."
      I'm not even convinced Google's algorithm has a problem. One thing a lot of people don't realize about the page rank algorithm is that your page rank goes down if you have lots of outgoing links that aren't reciprocated with links coming back from the site you linked to. It may be that this technique simply leads to a reduction in the page rank of the sandbox, which, after all, is appropriate, since the sandbox isn't something the sandbox's owner even wants people to find by Google searching.

      Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.
      Yeah, the simplest thing to do would be for the sandbox's owner simply to use the robots.txt file to forbid indexing of the sandbox page. That keeps the rest of the web site's page rank from being adversely affected, deters spammers from abusing the sandbox, and does Google's users a service by not directing them to the sandbox, which they don't want to find.

      Spammers aren't stupid -- if I was an Evil Spammer(tm), I'd certainly make sure my script checked the robots.txt and didn't waste time spamming sandboxes that weren't going to be indexed.
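
      For anyone who wants to try this, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `/wiki/Sandbox` path, and the `wiki.example.org` host are assumptions for illustration; the real path depends on the wiki software.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki whose sandbox lives at /wiki/Sandbox.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Sandbox
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse in-memory lines instead of fetching a URL

# A well-behaved crawler asks before fetching each URL.
sandbox_ok = parser.can_fetch("Googlebot", "http://wiki.example.org/wiki/Sandbox")
article_ok = parser.can_fetch("Googlebot", "http://wiki.example.org/wiki/FrontPage")
```

      The same check is what a spammer's script could run to skip sandboxes that will never be indexed. Note that the rule is a prefix match, so it also covers any page whose path starts with `/wiki/Sandbox`.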

    6. Re:Who's fault is that? by patches · · Score: 1

      Well if someone was going to put a blank page up with just a flash movie, then they can use the html meta tags to tell the searchengine what search terms can find that page....

      --
      The worst part of being athiest.... You don't have anyone to talk to during orgasm!
    7. Re:Who's fault is that? by M.+Silver · · Score: 1

      The problem is the availability of easily abused areas such as these "sandboxes."

      One of the first things I did was delink the sandbox on my TWiki install.

      If I really felt the need for one, I'd point to the one on TWiki.org. But a well-written wiki shouldn't really need a sandbox.

      --

      Slashdot's token middle-aged housewife
    8. Re:Who's fault is that? by themusicgod1 · · Score: 0, Troll

      If the internet has associated the ideas "miserable failure" and "bush" together, then I'm sorry, you are just going to have to live with this. Just because you can't see the connection, a connection so strong that it is the number one connection measured at this location currently, doesn't mean that it's not a valid one.

      When I use a search engine, I am NOT looking for a webpage with the words that I enter in. For example, if I were to search for "miserable failure" I would not be looking for a webpage with "miserable failure", but rather the meaning behind those two words, and the best examples of webpages that describe miserable failures, the state of being a miserable failure, or something in relation to the state of being a miserable failure. The internet seems to agree that bush is an example of the above. If people wanted to look for the exact words they type into google, when they use google, google would not provide spelling suggestions when mis-spellings occur...this is yet another example of how the meaning is what is sought, not the words.

      The 'webpage must contain text' / plaintext search idea just plain sucks. This is how search engines used to rank, wasn't it? And didn't google blow them all out of the water, so to speak? Because it worked better, and was less prone to abuse? And what really on the internet, would be a good idea of "miserable failure", other than bush? (feel free to link to it) There is doubtless improvement to be made on google...but it's pretty damn good as it is.

      --
      GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
    9. Re:Who's fault is that? by BlitzPig_Sal · · Score: 1

      I don't see Googlebombing as such a big problem. It would be practically impossible to bomb a commonly searched word or phrase such as "Linux" as the legitimate sites would overwhelm the googlebomb attempts.

      And even in the case of "miserable failure", is there really any harm being done? Just because a site appears in the top ten does not lend it any more credibility, it only indicates popularity.

      Anyone using Google needs to understand what the top search results actually represent. If they are just assuming that they are the top authoritative sources about the search phrase, they really need to get a clue.

  12. ROBOTS.TXT by gtrubetskoy · · Score: 4, Insightful
    The burden is not on Google, but on Wiki sandbox admins, who should provide proper ROBOTS.TXT files to inform Google that this content should not be indexed.

    As a sidenote, I think that with recent Wiki abuse, the issue of open wikis will become a similar one to open proxies and mail relays.

    1. Re:ROBOTS.TXT by sylvester · · Score: 1

      wtf. That's not insightful.

      First of all, while my wiki is mostly personal junk, there's no reason it shouldn't be indexed. And many open source projects use Wikis as a primary source of documentation.

      Secondly, the cat is out of the bag; I doubt these spammers are checking whether the sandboxes are indexed by Google.

      I'm mostly pissed off that the edits to my sandbox have been only from nigritude ultramarine people. Frankly, I think google should stomp on that contest by not allowing the words to be searched for together.

    2. Re:ROBOTS.TXT by Anonymous Coward · · Score: 0
      First of all, while my wiki is mostly personal junk, there's no reason it shouldn't be indexed. And many open source projects use Wikis as a primary source of documentation.

      The source of the problem are sandboxes. With most open source projects, spam in a wiki will be quickly spotted and gotten rid of, but in sandboxen junk can sit for months, long enough for google to make note of it.

      I'm mostly pissed off that the edits to my sandbox have been only from nigritude ultramarine people. Frankly, I think google should stomp on that contest by not allowing the words to be searched for together.

      There is little that can be done about this particular case, but another powerful tool against automated wiki spam is use of captchas. We need more captchas everywhere.

    3. Re:ROBOTS.TXT by sylvester · · Score: 1
      The source of the problem are sandboxes. With most open source projects, spam in a wiki will be quickly spotted and gotten rid of, but in sandboxen junk can sit for months, long enough for google to make note of it.
      Bah! The source of spam is not email. The source of this problem is not the sandbox, it's the wikispammers. I watch the Sandbox page like any other. Moreover, the Sandbox's history is kept, just like any other page, so the spam is still successful in creating links even if it's removed.
      There is little that can be done about this particular case, but another powerful tool against automated wiki spam is use of captchas. We need more captchas everywhere.
      Captchas are a stopgap solution, since you can eventually write software to guess a captcha. Putting up captchas drives an arms race. I recognize that arms races are sometimes inevitable, but they certainly aren't desirable.
    4. Re:ROBOTS.TXT by Anonymous Coward · · Score: 0

      > The burden is not on Google, but on Wiki sandbox admins, who should provide proper ROBOTS.TXT files to inform Google that this content should not be indexed.

      I just had a crazy thought: maybe they actually want the content to be indexed AND don't want spam.

    5. Re:ROBOTS.TXT by drinkypoo · · Score: 1

      If google doesn't find a way to work around this problem, and someone else does, google will become irrelevant.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    6. Re:ROBOTS.TXT by mlk · · Score: 1

      While this is fine for the Sandbox (which should be raked every few days anyway), it's not good for the rest of the Wiki when it is a source of information, like mine, or Wikipedia.

      --
      Wow, I should not post when knackered.
    7. Re:ROBOTS.TXT by CaptainSuperBoy · · Score: 1

      I agree that robots.txt is the correct solution for people who don't want their wiki abused.

      However this is nothing like the issue of open proxies. The wikis aren't spewing any garbage traffic out into the Internet, they aren't actively attacking sites and being abused to send spam. It'd be great if Google fixed their pagerank system to detect weblog comment spam and wiki spam, but nobody should be thinking seriously about throwing "open wikis" on blacklists or cutting off their Internet access.

    8. Re:ROBOTS.TXT by CaptainSuperBoy · · Score: 1

      You can exclude the sandbox from indexing without affecting the rest of the site.

    9. Re:ROBOTS.TXT by mlk · · Score: 1

      I know.
      But the entire site is open to this kind of attack.
      And a good spammer could edit the pages so that the layman would not notice, so unless you had a large number of editors (like Wikipedia), the small Wikis would be hit.

      --
      Wow, I should not post when knackered.
    10. Re:ROBOTS.TXT by CaptainSuperBoy · · Score: 1

      That's just a critique of the wiki in general. Wikis have always been "open" to defacement but those problems have already been solved with pattern matching, rollbacks, page protection, etc. Sandboxes should either be monitored like real wiki pages, or excluded from searches.

    11. Re:ROBOTS.TXT by dysprosia · · Score: 1

      Great. Then spammers will go and peddle their links on the other Wiki pages.

  13. Same site, a few days later: Don't do it. by micha2305 · · Score: 2, Insightful
    Ok, but the same webmaster says:

    I decided to stop posting backlinks in Wiki sandboxes, the SEO strategy previously explained. [...] In the meantime I'm asking developers and those hosting Wikis of their own to please exclude sandboxes from search engine results (via the robots.txt file). Doing so would shield the sandbox from backlink-postings, and there is no need for it to turn up in search results in the first place.

    This sure makes sense, and who knows, maybe future wiki distributions do it by default. (If

    <meta name="robots" content="noindex">
    would work universally...)
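
    To see whether a given wiki page already carries that tag, something like the following Python sketch would do. It uses the standard library's `html.parser`; the class name, helper function, and sample page are hypothetical, and whether a particular crawler honors the tag is a separate question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindex(html):
    """True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Made-up sandbox page for illustration.
sandbox_page = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>Sandbox</body></html>"
)
```

    Calling `is_noindex(sandbox_page)` reports whether the tag is present; a page without it comes back as indexable.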
  14. Complacency by faust2097 · · Score: 5, Interesting
    Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?

    It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.

    But they're too busy making an email application with 9 frames and 200k of Javascript to pay attention to the reason people use them in the first place. It's a little disappointing, I'm an AltaVista alumni and I got to watch them forget about search and do a bunch of useless crap instead, then die. I was hoping Google would be different.

    1. Re:Complacency by koreth · · Score: 1
      But they're too busy making an email application with 9 frames and 200k of Javascript

      Because, of course, if they weren't doing that, every last one of the engineers on that project would be tinkering with the search engine instead. It's not like they have separate engineering teams or people with different areas of expertise there or anything.

    2. Re:Complacency by Carnildo · · Score: 1

      It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.

      Please, if you're going to complain, give a concrete example of the search terms you're using, and what results you're expecting. I haven't had any trouble finding what I want on Google in the years I've been using it.

      --
      "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
    3. Re:Complacency by jdavidb · · Score: 1

      Hear, hear! I think it's pretty clear from his marketing speak ("popular consumer product") that he is trying to sell something that people do NOT find useful or interesting enough to link to and that thus does not come up high enough on Google for his liking.

      Move along; it's obvious from his dialect that this man is not one of us.

      BTW, no marketing or advertising has ever been able to replace or even approach the effectiveness of "word of mouth." This is one reason Google is so great (though of course not perfect); it's like aggregating everyone's opinions to assign a relative weight. The main people we see whining about it are usually those whose product or whatever does NOT have the kind of interest from the public that they want. They should be ranting against the entire public instead (or maybe against the laws of economics); Google is not at fault.

      It would have been in this guy's interest to actually provide the example you ask for; at least then we could see his product. However, he must be scared that we wouldn't be interested. Always be wary of advertisers who seem to fear public opinion.

    4. Re:Complacency by gstoddart · · Score: 1
      It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.


      Hear hear. I find a lot of searches nowadays will hit a bunch of portal pages whose sole job seems to actively get in my way of doing a search. I've hit some of these pages from a bunch of very different searches.

      I've just started using Google's feedback that lets you identify something as a portal page. I don't know if it is actually helping, but I've definitely run into the problem you describe.
      --
      Lost at C:>. Found at C.
    5. Re:Complacency by Anonymous Coward · · Score: 0

      What a pity no one feeds you, assmaster. Try subtlety for a change. You are a shame for the troll community.

    6. Re:Complacency by zerocool^ · · Score: 1


      It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.

      This actually got to me in the past few days. I tried to go online to find a charger for the battery that goes with a Canon ES900 8mm camcorder (because I bought the battery and realized I have no way to charge it). Most of the links on google were of the variety: "Take database of every known camcorder. Make each model number a link to a site".

      I hate google link farms.

      --
      sig?
    7. Re:Complacency by julesh · · Score: 1

      I'm an AltaVista alumni

      I think you mean alumnus. Alumni is plural.

      and I got to watch them forget about search and do a bunch of useless crap instead, then die. I was hoping Google would be different.

      And I think it is. In case you hadn't noticed, gmail is about searching... just searching through e-mail, rather than web pages. Same as Google Groups is about searching newsgroups.

    8. Re:Complacency by lavaface · · Score: 1
      I tried to go online to find a charger for the battery that goes with a Canon ES900 8mm camcorder (because I bought the battery and realized I have no way to charge it).

      Either you don't know how to search or you didn't look very hard. I found this page in less than five minutes. I just did a search for canon es900 "battery charger" and the third link offered a clear hit. The first link offered a purchase for a universal battery charger as one of the accessories. I'm curious, how did you search for this and get lost?

    9. Re:Complacency by lavaface · · Score: 1

      I call bullshit. Searches for ipod, tivo, ethernet dvd player, canon cameras and many others bring immediate results. Obscure pieces can be a little troublesome but then again that doesn't fit the description "popular consumer product." Do you have an example ??? . . .

    10. Re:Complacency by zerocool^ · · Score: 1

      Yeah, I found that page, too. It is a different color hyperlink, which means I have been there. That's another problem with google link farms. That page has the words ES900 on it just to generate more hits. That battery charger is a "universal" charger, which "automatically adjusts for each battery size." It's not specifically for the canon ES900. Not to mention that it's 36 british pounds, times two plus cross-atlantic shipping means that it will cost $85 and get here in August.

      See?

      --
      sig?
    11. Re:Complacency by lavaface · · Score: 1

      I have to agree with you on this subject: batteries are particularly loathsome for link farms. I helped my friend find a battery for an Akai minidisc recorder. It was the most troublesome search I've come across. However, you may just want to buy a universal charger. The first link returns this. If the electric ratings match you'll be fine. Only 43 US dollars. Or get an extra battery for 23 more. Hope this helps.

    12. Re:Complacency by zerocool^ · · Score: 1

      I have been away for a few days, sorry I didn't reply. I did go ahead and order a battery charger that was $49.99 + shipping. It looked from the picture and sounded from the description to be a charger specifically for the canon es900, but when it arrived, it too was a "universal" charger. I was not happy, but it did charge my battery, and it came with a car adapter, so I guess it's cool. But, if it hadn't been able to charge my battery, rest assured that there would have been a charge-back (no pun intended) to my credit card.

      ~Will

      --
      sig?
  15. Well, it's about time this gets some attention by digitalgimpus · · Score: 4, Insightful

    I've noticed that my blog's getting lots of spam from sites that don't seem like typical spam sites....

    From what I can see, it looks like those "search ranking professionals" who "guarantee to raise your google rank in 30 days" are using blog spamming, and perhaps Wiki Spamming as a way to increase their clients' ratings.

    It's not about meta tags, or submitting anymore... it's spamming.

    Perhaps it's time for people to finally be wary of these services. After all, can a third party really guarantee a position in another company's search index?

    IMHO those services are pure evil. They either do nothing, or they do something to increase page rank... what is that "something"? How many options do they have?

    If they are going to use my blog... why can't I get a cut in that business?

    1. Re:Well, it's about time this gets some attention by Lurker+McLurker · · Score: 4, Insightful
      IMHO those services are pure evil.
      No, 9/11 was pure evil, some unwanted comments on a blog is an annoyance. If you have a website that allows anyone to post comments, you will get some you don't like. That's life.
      --
      Mod parent up!
    2. Re:Well, it's about time this gets some attention by sabernet · · Score: 1

      here's an idea. link to some of those services and let the slashdotting begin:)

    3. Re:Well, it's about time this gets some attention by jtwronski · · Score: 0

      I agree completely. I can't count how many times my customers have asked me "What about those companies that guarantee 1st page rankings? What are they doing that you aren't?". It's hard to compete (honestly, anyway) with folks who have sold their souls and annoyed countless thousands by taking unfair advantage of the features that have made google the #1 search engine out there. Link trades, registration, and smart content and meta tags apply less and less to rankings nowadays. At least I can console myself (and hopefully, my customers) in that I can offer rankings without being annoying or stepping on others' toes.

    4. Re:Well, it's about time this gets some attention by sabernet · · Score: 1

      semantic nonsense for the sake of making yourself feel smarter is also annoying. Get a life.

    5. Re:Well, it's about time this gets some attention by 87C751 · · Score: 1
      I've noticed that my blog's getting lots of spam from sites that don't seem like typical spam sites....
      I had a spate of comment spamming too, about a month ago. In fact, that was what inspired me to move from blogware (WordPress) to a full-up CMS (PostNuke). The comment spammers' scripts don't seem to have found PostNuke yet. By the time they do, I'll have anti-bot measures in place (if I haven't simply closed comments to unregistered users).
      --
      Mail? Put "slashdot" in the subject to pass the spam filters.
    6. Re:Well, it's about time this gets some attention by Anonymous Coward · · Score: 0

      After all, can a third party really guarantee a position in another company's search index?

      Here's what Google has to say on the subject:

      Beware of SEO's that claim to guarantee rankings, or that claim a "special relationship" with Google, or that claim to have a "priority submit" to Google.

    7. Re:Well, it's about time this gets some attention by Henry+Stern · · Score: 1

      I beg to differ with you on the matter of it being only "an annoyance." I've had to delete comments on my own weblog that (supposedly) link to underage pornography sites. I'm not a lawyer, but I'm fairly certain that it is illegal to link to child pornography. Assuming that this is true, those SEOs are actually causing you, the innocent weblog/wiki owner, to unwillingly and unwittingly commit a criminal act.

      Is it still just "annoying?"

    8. Re:Well, it's about time this gets some attention by Lurker+McLurker · · Score: 1
      Surely, as slashdot puts it, Comments are owned by the Poster.

      If you knowingly left a link to a child porn site on your own blog up, you might get into legal hot water, though the authorities have their hands full catching the people who operate such sites.

      --
      Mod parent up!
    9. Re:Well, it's about time this gets some attention by Anonymous Coward · · Score: 0

      SEO is not evil in and of itself. The basic idea is to create web pages that load quickly, use a lot of descriptive text and avoid the use of javascript and anything else that might choke the spider.

      This encourages cleaner code, compatibility, and easier to use sites. Most "respectable" SEO's avoid spamming like the plague for fear of getting their client blacklisted by Google.

      I don't really see the problem with the Wikis. If Google is putting too much weight on the links, it sounds like it is Google's problem.

      I don't buy what other people suggest in censoring sandboxes from Google. I think it is completely up to the web site owner, and I don't think it is anyone else's business. Honestly, if Google can't understand a sandbox for what it is, then Google _needs_ to do something about it!

    10. Re:Well, it's about time this gets some attention by Heywood+Yabuzof · · Score: 2, Insightful


      OK, so it's not really fair to get into relative levels of "evil", but let's also not minimize the "evil" that search optimizers do. It's not just a bunch of extra comments on blogs or wikis.

      Their fundamental business model is CONTRARY to my interests as a consumer trying to get product information. They don't wish to let me find the product or the review or the site that MOST PEOPLE FOUND USEFUL, they only want me to find the one that PAID THEM THE MOST MONEY.

      I realize that's just the way things are, but that's obviously counter to my whole purpose for using a search engine like Google. They are intentionally polluting the search results. It's not the methods I find "evil" (although blog comment and wiki spamming are pretty shady) as much as the end result - the loss of helpful web searches.

    11. Re:Well, it's about time this gets some attention by Anonymous Coward · · Score: 0

      I'm not a lawyer, but I'm fairly certain that it is illegal to link to child pornography.

      Keep in mind that these websites may be maintained by people that do not live in the US. BTW, the internet is now owned by America.

    12. Re:Well, it's about time this gets some attention by Boing · · Score: 1

      Howsabout, nothing is pure evil... every human action is a function of multiple motivations, some of which are well-intentioned and some of which are ill-intentioned.

      You managed to score some good moderation for what is essentially a pedantic argument; despite the wording, I highly doubt the original poster literally believed pagerank services to be the absolute worst human creation, and you impressed no one (except, apparently, a few moderators) with your ability to say "oh, yeah? well I can think of much worse things!"

      Especially since you conveniently ignore the fact that human history has seen worse things than 9/11, which (by your logic) would inherently negate your own claim that 9/11 was "pure evil".

  16. This happened to me by JohnGrahamCumming · · Score: 4, Interesting

    This happened on the POPFile Wiki. Eventually I solved it by changing the code of the Wiki itself to have an allowed list of URLs (actually a set of regexps). If someone adds a page which uses a new URL that isn't covered, it won't show up when the page is displayed, and the user has to email me to get that specific URL added.

    It's a bit of an administrative burden, but stopped people messing up our Wiki with irrelevant links to some site in China.

    John.
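For anyone wanting to try the allowed-list idea above, here is a minimal Python sketch. It is not the POPFile Wiki's actual code; the patterns, the `render_with_allowlist` name, and the example domains are all assumptions made up for illustration. It simply drops any URL that no allowlist regexp matches when the page text is rendered.

```python
import re

# Hypothetical allowlist: these patterns are illustrative only,
# not the real list used on the POPFile Wiki.
ALLOWED_URL_PATTERNS = [
    re.compile(r"^https?://([a-z0-9-]+\.)*sourceforge\.net(/|$)", re.IGNORECASE),
    re.compile(r"^https?://([a-z0-9-]+\.)*example\.org(/|$)", re.IGNORECASE),
]

# Rough URL matcher: anything starting with http(s):// up to whitespace or a quote.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def is_allowed(url):
    """True if at least one allowlist pattern matches the URL."""
    return any(p.search(url) for p in ALLOWED_URL_PATTERNS)


def render_with_allowlist(text):
    """Return text with every non-allowlisted URL removed at display time."""
    return URL_RE.sub(
        lambda m: m.group(0) if is_allowed(m.group(0)) else "", text
    )
```

Anchoring each pattern at the start of the URL and requiring `/` or end-of-string after the host keeps lookalike hosts such as `sourceforge.net.evil.example` from slipping through.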

    1. Re:This happened to me by karmatic · · Score: 1

      Just do a google-unfriendly redirect. That at least stops the PR-Spammers.

  17. E2 by mirko · · Score: 1

    I have got the impression that this could work with E2 as well as probably most bbcode powered fora.

    --
    Trolling using another account since 2005.
    1. Re:E2 by proj_2501 · · Score: 1

      E2 doesn't have external links except those posted by gods, and they also have a vicious team of editors just waiting to pounce on things like this.

    2. Re:E2 by generic-man · · Score: 1

      Not likely. Everything2 isn't indexed by Google any more.

      --
      For more information, click here.
    3. Re:E2 by Spudley · · Score: 1

      Yep. My site's forum got hit by a porn spammer a few weeks ago.

      Fortunately, when it comes to forums, they do have the advantage that you have to register and log in before you can spam the forum itself, so it's easy to block them, by IP if necessary.

      The current favorite tactic on forums is just to register, and not bother ever logging in, and putting the spammy web address into the user details so it shows up in the user list.

      Again, once I blocked him and deleted the users he had created, I didn't have a problem. But if need be, it would be easy enough to remove all the web addresses from the member list page.

      It's a real shame it's coming to that, though. :(

      --
      (Spudley Strikes Again!)
  18. I've seen this by goon+america · · Score: 3, Informative
    I just reverted some pages on my watch list on Wikipedia that had been edited with a google spam bot to link all sorts of words back to its mother site.... lots of mistakes, looked like the script they were using hadn't been tested that well yet. (Would post an example, but wikipedia is completely fuxx0red at the moment).

    This may become a big problem for sites like this. The only solution might be one of those annoying "write down the letters in this generated gif" humanity tests.

    1. Re:I've seen this by willCode4Beer.com · · Score: 1

      Even the "Human" tested images don't work anymore.

      Many porn sites have "clustered" humans by having people enter the text to get access.

      We need a way to have a human test that also verifies that it is a human making the post.

      The only thing I can think of is digitally signing everything. But, that would mean giving up the MYTH of being anonymous on the web.

      ---

      --
      ----- If communism is a system where the government owns business, what do you call a system where business owns govern
    2. Re:I've seen this by Chester+K · · Score: 1

      I just reverted some pages on my watch list on Wikipedia that had been edited with a google spam bot to link all sorts of words back to its mother site....

      It's too bad it stays around in Wikipedia's history pages --- meaning the spammer is still getting full value from their links.

      --

      NO CARRIER
  19. apache + search + p2p = distributed search engine by datrus · · Score: 2, Insightful

    Something that would make a nice opensource project would be to include p2p search functionality in apache itself.
    This way all the modified web servers would make a giant distributed search engine.
    Some nice algorithms like koorde or kademlia could be used.
    Anyone thought about starting something like this?

    David

  20. Google. by Rick+and+Roll · · Score: 3, Interesting
    When I search on Google, half the time I am looking for one of the best sites in a category, like perhaps "OpenGL programming". Other times, however, I am looking for something very specific that may only be referenced about twenty times, if at all.

    When I do search in the first category, especially for things such as wallpaper, or simpsons audio clips, the sites that usually turn up are the least coherent ones with dozens of ads. I usually have to dig four or five pages to find a relevant one.

    The people with these sites are playing hardball. Google wants them on their side, though, because they often display Google text ads.

    Right now, my domain of choice is owned by a squatter that says "here are the results for your search" with a bunch of Google text ads. I was going to/may still put a site there that is very interesting, and the name was a key part of it.

    I firmly believe that advertisements are the plague of the Internet. I would like to see sites selling their own products to fund themselves. Google doesn't really help in this regard. The text ads are less annoying than banner ads, but only slightly less annoying.

    Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research. I would just like to see them come out in full force against squatters.

    1. Re:Google. by nsingapu · · Score: 2, Interesting

      Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research. I would just like to see them come out in full force against squatters.

      Google owns oingo.com - perhaps the largest collection of squatter sites out there.

    2. Re:Google. by Boing · · Score: 1
      The people with these sites are playing hardball. Google wants them on their side, though, because they often display Google text ads.
      Ah, but there's the trick... manipulating pagerank precludes these sites from having to buy text ads. By fixing the pagerank algorithms, Google would more than likely increase the text ad purchases by these sites.

      The key to remember here is that Google text ads are only as valuable as Google's popularity. Google's popularity is only as valuable as its ability to fulfill its primary purpose: returning relevant web pages to a search string.

      If Google does an elaborate dance to avoid offending its text ad clients, and consequently loses its searcher base, then the text ad clients are going to disappear anyway.
    3. Re:Google. by lavaface · · Score: 1
      When I do search in the first category, especially for things such as wallpaper, or simpsons audio clips, the sites that usually turn up are the least coherent ones with dozens of ads . . . Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research.

      Is that what they call research these days? ; )

      In all seriousness, it seems your examples are bunk. For example, the first page for simpsons audio clips is a page with dozens of clips. As for wallpaper, well, be more specific. Do you want space wallpaper, cartoons, underwater, nature? If you're looking for something particular (whether it's a favorite simpsons quote or a wallpaper of the sydney harbour), use the keywords, in quotes if necessary, to target what you need. Just don't complain it's Google's fault when your search terms suck to begin with.

  21. Tomorrow today yesterday by boa13 · · Score: 4, Insightful

    But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank.

    The Arch Wiki has suffered several times from such vandals in the past few months. I'm sure other wikis have, too. They create links over single spaces or dots, so that casual readers don't notice them. Attentively watching the RecentChanges page is the most effective way to find and fight them, but this is tiresome. I guess many wikis will require posters to be authenticated soon, which is a blow to the wiki ideal, but not such a major blow. Alternatively, maybe someone will develop heuristics to fight the most common abuses (e.g. external link over a single space).

    So, this is not new, but this is now news.
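A heuristic like the one suggested above (an external link over a single space or dot) could be sketched roughly as follows. This is only an illustration: it assumes the wiki's rendered output uses plain HTML `<a href="...">` anchors, and the function name `suspicious_links` is made up here.

```python
import re
import string

# Assumes rendered HTML anchors with a double-quoted absolute href.
ANCHOR_RE = re.compile(
    r'<a\s[^>]*href="(https?://[^"]+)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def suspicious_links(html):
    """Return external hrefs whose visible anchor text is empty,
    whitespace only, or punctuation only (e.g. a single dot)."""
    flagged = []
    for href, text in ANCHOR_RE.findall(html):
        visible = text.replace("&nbsp;", " ").strip()
        # all() over an empty string is True, so blank anchors are flagged too.
        if all(ch in string.punctuation for ch in visible):
            flagged.append(href)
    return flagged
```

A check like this could run on each saved revision and feed a moderation queue, so nobody has to stare at RecentChanges all day. It will not catch a spammer who uses ordinary words as anchor text.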

    1. Re:Tomorrow today yesterday by Neophytus · · Score: 1

      One to look out for is <div style="display:none;"> if html can be posted. It makes the span invisible to any human reader but I doubt that any current search engine can identify the purpose of such a tag.

    2. Re:Tomorrow today yesterday by yppiz · · Score: 1
      Link spamming has happened on The Metaweb too. But it has not been a serious problem.

      Mediawiki, the code base the Metaweb uses, has several features that minimize the damage done by link spammers. The first is watchlists - anyone who uses the Metaweb can set up a watchlist so that they are notified when there's a change to one of their favorite pages. Link spammers usually hit someone's favorite page, and that someone undoes the damage in a few minutes to a few hours.

      Mediawiki also allows administrators to lock pages from edits (the front page is locked), and to ban IP addresses. And, like any Wiki, anyone can roll back changes to a page.

      The combination - peer review, notification of changes, and tools that make it easier to undo damage than to create it - combined with a (small) community of active users, means that while link spammers have hit us, the damage gets undone quickly enough that we're not worth it.

      --Pat / zippy@cs.brandeis.edu

    3. Re:Tomorrow today yesterday by NoMoreNicksLeft · · Score: 1

      If most of this is being performed by spambots, couldn't some image "what is the word" turing test be added to the system? Hell, there's probably even a perl module.

      Then again, won't stop the spammeisters from dutifully doing this by hand.

      Hmm.

      I propose the equivalent of a turing test for spammers. It would be posed as a single question that they find impossible to answer successfully. Examples:

      Is it ethical to advertise to people without their prior consent?

      Should all commercial email be protected by free speech?

      Have you ever been persecuted by evil, fanatical anti-spam zealots?

      Sure, there might be 1 or 2 out there savvy enough to not be caught by this, but the vast majority would hit the yes button before really giving it any thought.

    4. Re:Tomorrow today yesterday by Spudley · · Score: 1

      One to look out for is <div style="display:none;"> if html can be posted. It makes the span invisible to any human reader but I doubt that any current search engine can identify the purpose of such a tag.

      Of course, blocking tags like that would penalise anyone using that sort of thing to hide page elements that will be shown on a mouse over, or whatever.

      --
      (Spudley Strikes Again!)
  22. Not a big deal by arvindn · · Score: 4, Informative

    Recently the Chinese wikipedia suffered a spam attack, with a distributed network of bots editing articles to add links to some Chinese internet marketing site. In response, the latest version of MediaWiki (the software that runs the wikipedias and sister projects) has a feature to block edits matching a regex (so you can prevent links to a specific domain). Wikis generally have more protection against spamming than weblogs. So I wouldn't worry.
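To show the general shape of "block edits matching a regex", here is a small Python sketch. It is not MediaWiki's implementation; the blocked domains, the `BLOCKED` pattern, and the `check_edit` function are hypothetical names chosen for this example.

```python
import re

# Hypothetical blocklist: both domains are placeholders.
BLOCKED = re.compile(
    r"https?://([a-z0-9-]+\.)*(spam-marketing\.example|cheap-pills\.example)\b",
    re.IGNORECASE,
)


def check_edit(new_text):
    """Return (accepted, offending_match) for a submitted edit.

    The edit is rejected as soon as any blocked domain appears in it.
    """
    match = BLOCKED.search(new_text)
    if match:
        return False, match.group(0)
    return True, None
```

Rejecting at save time means the spam link never reaches the page history, which matters if old revisions are crawlable.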

  23. Hmm by Julian+Morrison · · Score: 3, Interesting

    Leave the links, edit the text to read something like "worthless scumbag, scamming git, googlebomb, please die, low quality, boring" - and lock the page.

    1. Re:Hmm by Saeed+al-Sahaf · · Score: 1
      Leave the links, edit the text to read something like "worthless scumbag, scamming git, googlebomb, please die, low quality, boring" - and lock the page.

      The point is that methods such as this require a lot of time on the part of the wiki webmaster. If you get a whole lot of this crap at your wiki / blog, what are the chances you are really going to want to spend a few hours *every* day to mess with the links by hand?

      --
      "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
    2. Re:Hmm by dustinbarbour · · Score: 1

      If someone can write a bot to automagically edit wikis, I'm sure someone can write a bot to go through the same wiki and modify offending links.

    3. Re:Hmm by Eythian · · Score: 1

      ...or change the link to point to an anti-wiki-spamming site, such as this, so that they get the keywords that the spammer was looking for.

  24. This is a concern for the Google Gorilla? by Mr.Fork · · Score: 2, Interesting

    Wait a minute - a way to spoof Google to get your page ranked better through WiKi? OMFG! Call the internet police, call Dr. Eric E. Schmidt, call out the Google Gorilla goons! I'm sure the good Dr. has a fix like the ones he used at Novell...

    The problem with the whole Google model is that it's biased to begin with. If I'm looking for granny-smith apples, chances are an internet chimp has bought the space with bananas to Google's goons. It becomes obvious when you see a chimp site near the top that has no business at the top. To the experienced googler, it's just an annoying fly on the screen and you just move further down.

    I'm hoping that Google doesn't get too bogged down in becoming that big Ape like Micro$oft and be a little more proactive in protecting their business property. It's bad enough that they're selling top space to companies willing to pay, but here's hoping they don't slip on their own banana peels.

    --
    Management is doing things right; leadership is doing the right things. - Peter F. Drucker
  25. True by Pan+T.+Hose · · Score: 4, Funny

    "Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?"

    I agree. I hope Google will finally put some work into refining their search results. I mean, they are probably the worst search engine ever! Now, Yahoo, MSN, Overture, Altavista... Those are much better. But Google?! Please...

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
    1. Re:True by Jeff+DeMaagd · · Score: 1

      I think it stands to reason that Google shouldn't give ANY opening to competition. If there is a major complaint about how the system works, fix it.

      If Google just sits around then the competition will likely catch up.

  26. Re: by Anonymous Coward · · Score: 0

    yeah but the winner's wiki spamming is all over also:

    http://www.google.com/search?hl=en&lr=&ie=UTF-8&c2coff=1&edition=us&q=merkey.net+wiki&btnG=Search

  27. Sure, that will work by Safety+Cap · · Score: 1

    Because IP addresses can't be forged. Evar!

    --
    Yeah, right.
    1. Re:Sure, that will work by Short+Circuit · · Score: 2, Informative

      I know you're being sarcastic, but one way to prevent forged IP addresses is to require the user to "preview" their comment before posting.

  28. Who cares, web search is "done" by Ars-Fartsica · · Score: 0
    If I were Google, the last place I would be putting more funding into is web search. Any algorithm will be spoofed, so the search nerds will never be satisfied long term. Average users though seem quite enamored with it, so the ROI for a new algorithm isn't clear.

    Compare this to the ROI for music search, non-web search etc and it's pretty clear Google's R and D is better directed to new products. Get used to it folks, when they go public there will be a huge expectation of new products on a regular basis. Web search will get tuned when ad keyword revenue dictates it.

  29. server-supplied meta-info to reduce search weight? by osmethnee · · Score: 1

    A possible solution I've been toying with...

    1. Servers provide a meta-tag for certain pages which search engines interpret as reducing/eliminating that specific page's search weight.
    2. Scripts which allow user-created content (wikis?, guestbooks, weblog comment forms, forums, and so on) can be updated by the content-provider to include this meta-tag.
    3. To encourage spammers to check this tag and move on elsewhere if it's implemented, these same scripts should enforce a longish (5 second?) delay for all user-initiated content changes.

    [and seeing this is slashdot]

    4. ???
    5. Profit!
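The closest existing convention to the proposed meta-tag is the robots meta tag, which asks crawlers not to index a page or follow its links. A rough Python sketch of a script emitting it for user-editable pages follows; the path prefixes and the `robots_meta_for` name are assumptions for illustration, and whether a given search engine honors the tag is up to that engine.

```python
# Hypothetical list of URL path prefixes that accept user-created content.
USER_EDITABLE_PREFIXES = ("/wiki/Sandbox", "/guestbook", "/comments")


def robots_meta_for(path):
    """Return a robots meta tag for user-editable pages, else an empty string."""
    if path.startswith(USER_EDITABLE_PREFIXES):
        return '<meta name="robots" content="noindex, nofollow">'
    return ""
```

The returned string would go in the page's `<head>`. The delay in step 3 is a separate server-side measure and is not shown here.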

  30. The bot can keep the cookies it gets by Anonymous Coward · · Score: 0

    Coding cookie preserving http connection takes about 20 lines of java code. You better think about something better.

    1. Re:The bot can keep the cookies it gets by Anonymous Coward · · Score: 0

      Coding cookie preserving http connection takes about 20 lines of java code.

      Or one line of Perl. Thanks for reminding me why Java sucks.

  31. It just might work! by mcmonkey · · Score: 4, Funny

    'You know what Google needs? A "Was this result helpful in your search?" button for each link returned'

    Yes! Genius! That's it! Google needs some kind of system of rating results to modify future results returned--a system of 'mods' if you will.

    Of course some people will 'mod' stuff down just because they don't like the viewpoint expressed, or they're in a perennial bad mood because their favorite operating system is dead, so we'll need to have a system of allowing people to rate the moderations--'meta-mod' if I may be so bold.

    It sounds crazy, I know, but I think we could do this.

    1. Re:It just might work! by analogduck · · Score: 1

      Or maybe the same manipulative people whose actions you're trying to defeat thru the proposed Google modding *might* just 'mod' their own web sites up with said Google modding system. It opens up a whole new can of worms for people to abuse the ranking system.

      I could just see the next wave of worm infections which infect systems then open up http requests to Google in a distributed assault to perform searches then mod down the competitors and/or mod up the sites of people manipulating Google search results. This way it would appear to Google as though myriads of unique visitors are each independently modding / ranking these search query results.

      I'm not trying to stomp on your idea -- it is a nice idea -- but it has some major exploitation opportunities unless there are appropriate mechanisms put in place to thwart such evil devices.

      --
      ~"If at first you don't succeed, chainsaw juggling is probably not for you."
    2. Re:It just might work! by metamatic · · Score: 1

      Yes, I've seen a web site that uses a system like that. It works incredibly well. You never see anything off-topic, trollish, ill-informed, ignorant or downright moronic posted...

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
    3. Re:It just might work! by mcmonkey · · Score: 1
      You never see anything off-topic, trollish, ill-informed, ignorant or downright moronic posted...

      Of course. Such frivolity has no place on the internet. This is a forum for intellectual discourse and the unfettered sharing of ideas.

      Oh, and I need a place for pr0n when the hotel movies are $9.99 a pop but the net access is free :D

  32. Re:apache + search + p2p = distributed search engi by Bert690 · · Score: 1
    Something that would make a nice opensource project would be to include p2p search functionality in apache itself. This way all the modified web servers would make a giant distributed search engine. Some nice algorithms like koorde or kademlia could be used. Anyone thought about starting something like this?

    We looked into something a lot like what you suggest (and actually have it up and running inside our intranet with 2k or so users). The problem with doing this on the internet is that p2p techniques are MUCH more susceptible to spamming than centralized techniques in general (because, for one, p2p reputation systems are very difficult to get right). Another problem is that most existing p2p search methods work great for finding popular content but not very well for finding that very specific piece of information that maybe only you are looking for at the current moment. Kademlia/Chord are DHTs and do not solve the text search problem on their own. While some p2p networks have adapted DHTs for keyword searching, the results still leave a lot to be desired (IMO).
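To illustrate why a DHT alone does not give you text search, here is a toy Python sketch of Kademlia-style key placement: keys and node names are hashed to 160-bit IDs and a key is stored on the node whose ID is closest by XOR distance. The node names and helper functions are made up for the example, and a real network routes in steps rather than scanning a full node list.

```python
import hashlib


def dht_id(name):
    """Hash a string to a 160-bit integer ID (SHA-1, as Kademlia uses 160-bit IDs)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def closest_node(key, nodes):
    """Pick the node whose ID has the smallest XOR distance to the key's ID."""
    key_id = dht_id(key)
    return min(nodes, key=lambda n: dht_id(n) ^ key_id)
```

Because placement depends on the hash of the exact key, "opengl" and "opengl programming" get unrelated IDs, so a lookup only works when you already know the exact key. Keyword search has to be layered on top, for example by publishing one entry per keyword.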

  33. omg by Anonymous Coward · · Score: 0

    This is a really sad day for news. Get over it and quit crying.

  34. Google may well downrate this by Animats · · Score: 1

    I expect that Google will in time give drastically lower weight to easily-modified pages like "blogs" and "wikis". They're not that hard to recognize.

  35. visual security code for sign-up by Saeed+al-Sahaf · · Score: 4, Informative

    Most BB boards (including phpBB, upgrade!) and blogs (including Slashdot) now feature the visual security code for sign-up. But, of course, this does not prevent hand entry of spam...

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
    1. Re:visual security code for sign-up by stevey · · Score: 5, Insightful

      There was a story about defeating this system on /. a while back.

      Rather than using OCR or anything, people would merely harvest a load of images from a signup site - possible when there is only a finite number of images, or when there is a consistent naming policy.

      Then once the images were collected they would merely set up an online porn site, asking people to join for free, proving they were human by decoding the very images they had downloaded.

      Human lust for porn meant that they could decode a large number of these images in a very short space of time, then return and mount a dictionary attack...

      Quite clever really, sidestepping all the tricky obfuscation/OCR problems by tricking humans into doing their work for them ..

    2. Re:visual security code for sign-up by Enrico+Pulatzo · · Score: 1

      That's really clever. Maybe we do deserve spam after all.

      What am I saying?

    3. Re:visual security code for sign-up by Bitsy+Boffin · · Score: 2, Informative

      Except that the images ("turing numbers" as they are often called) are dynamically generated from random character sequences, and probably with equally random distortions.

      You'd be pretty lucky to hit the exact same image twice.

      --
      NZ Electronics Enthusiasts: Check out my Trade Me Listings
    4. Re:visual security code for sign-up by nautical9 · · Score: 1
      Is this a plugin or something for phpBB - I just upgraded to 2.0.8, and it's not an option I can find anywhere.

      And even using the email activation (which I'm reluctant to do - I could really care less if I have a valid email address from my users), the member list still lists off new unactivated users.

      I did take the AC's suggestion and blocked the memberlist in my robots.txt, so at least their spam attempts won't be fruitful, but it still inflates my database with bogus users. Annoying.
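For anyone wanting to confirm that a robots.txt rule really covers the member list, Python's standard `urllib.robotparser` can check it offline. The rule, the `forum.example` host, and the `crawler_may_fetch` helper below are assumptions for illustration; `memberlist.php` is assumed to sit at the site root.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; adjust the path to where the board is installed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /memberlist.php
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only tells you what a well-behaved crawler would do. It does nothing against bots that ignore robots.txt, and it does not stop the bogus registrations themselves.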

    5. Re:visual security code for sign-up by dprior · · Score: 1

      So the lesson is...

      When signing up for free porn and asked to verify the text created by an image, be sure to put in complete garbage to poison the system.

    6. Re:visual security code for sign-up by chris_mahan · · Score: 1

      You sign up for nr0p?

      --

      "Piter, too, is dead."

    7. Re:visual security code for sign-up by Saeed+al-Sahaf · · Score: 1

      Plug-in. Search the phpBB community board.

      --
      "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
    8. Re:visual security code for sign-up by RazorX90 · · Score: 1
    9. Re:visual security code for sign-up by pen · · Score: 1

      Blocking the page with robots.txt isn't effective against spambots that disregard robots.txt

    10. Re:visual security code for sign-up by CyberKnet · · Score: 1

      If you would take a slight logic step sideways though, you would see that it will stop google from increasing the other sites' pagerank, because google will not index that page.

      So no, it won't stop you having bogus users being created, but yes, you will get a warm fuzzy knowing that your (popular and high pageranked) site is not helping their nefarious techniques.

      Interestingly enough, this explanation was also in a round about way the last line of the OP's comment. Fancy that.

      --
      Video meliora proboque deteriora sequor - Ovidius
    11. Re:visual security code for sign-up by Anonymous Coward · · Score: 0

      I guess I should work on my reading comprehension... ;-)

    12. Re:visual security code for sign-up by NemosomeN · · Score: 1

      Blocking it with robots.txt keeps it from affecting google's pagerank... More of a quick stab back at the people doing it (A stab that they will never likely know about, though).

      --
      I hate grammar Nazi's.
    13. Re:visual security code for sign-up by smagruder · · Score: 2, Informative

      Check out the Visual Confirmation mod in the /contrib folder in your phpBB installation. Read the README.html file for installation instructions.

      --
      Steve Magruder, Metro Foodist
  36. 9/11 was genius ! by Anonymous Coward · · Score: 0


    Landmines that the USA sells to poor nations are evil

    and 3000 people die a month on American roads but I don't see people burning down GMC or Ford

  37. Why are people so surprised? by stubear · · Score: 0

    The web was designed around the concept of trust and this simply does not work anymore. The only way to fix the internet is to eliminate all form of anonymity and temper this with strong legal protection of private information. Until this happens you will always have to deal with spam, viruses, hackers and the like. Once people can be held accountable for their actions online then, and only then, will the internet work as it was intended.

    1. Re:Why are people so surprised? by sweetleaf · · Score: 1

      Anonymity is an important component of free speech. It provides a necessary way to espouse unpopular ideas without punishment or retribution.

    2. Re:Why are people so surprised? by Anonymous Coward · · Score: 0

      And if that doesn't work, then what?

      Build a new network for the Good Citizens?

      Or perhaps a "jail" network for the Bad Citizens?

      Thanks, but no thanks. Leave the 'Net alone.

    3. Re:Why are people so surprised? by eyepeepackets · · Score: 1

      Yes, same thing happened at the end of the BBS days in the late 80s and early 90s -- only brain-dead sysops or unsuspecting noob sysops would allow anonymous logins. Interestingly enough, it was the concept of "liability" which forced sysops to deny the anonymous login: The sysop was responsible for the complete system.

      Personally, I'm rapidly getting to the point where the idea of licensing people for internet access is appealing. Much like getting a drivers license, Joe Schmuck user _must_ be identifiable when on the network and have proven he has the minimum knowledge base to use his computer and have a basic understanding of network etiquette, control and usage.

      Along the same line, companies which insist upon creating grotesquely insecure programs which access the network should face a _very_ steep per-instance fine when their products negatively affect the network. For example, Doors OS passes viruses, trojans, backdoors, bots, worms, etc. and the makers of Doors OS pay $1000.00 for every found instance. Granted, Doors OS is going into the red/toilet very quickly regardless of how many billions they have in the mattress, but this is a good thing, for the example will leave an indelible impression on the other players, most of whom will act responsibly when the gun is pointed at their fat wallets.

      In 1995 when the public "discovered" the internet, the future as we have it today was foretold by many: Excessive advertising everywhere; getting a buck the primary creative motivator for most content creators (hence the overbearing mediocrity of the web and most sites thereupon); standards rapidly being usurped by non-standard lock-in ploys; an endless stream of cons, liars and outright thieves (most of whom pass themselves off as business folk) through which the network user must dodge, weave and wiggle to reach his destination. The network has become more than an information medium, it's become a grotesque game of getting the information you need without getting bounced, trounced and generally fucked over in the process.

      So, all this is to say I agree with you now and have said it before myself: Until the anonymous network login is dead, the network will be a sewer.

      --
      Everything in the Universe sucks: It's the law!
  38. Sandbox persistence by gmuslera · · Score: 2, Insightful
    If it's a test area, does its content need to be stored? Wikis could keep it alive only for the user's current session or test, and when the user logs out or finishes editing, simply delete it or restore it to a default introductory text. It doesn't need to be some kind of collaborative blackboard or graffiti wall, or at least, if it must be, that should be the webmaster's choice (at least TikiWiki lets me disable the sandbox if I want).

    But if the problem is that websites have areas where visitors (even unregistered ones) can post random text and links, even Slashdot is potentially a target of the same thing (maybe there should be a "Spam" mod score?), and so, for that matter, is any site where unregistered visitors can store content one way or another, wiki or not.

  39. "Finally"?? by jdavidb · · Score: 4, Interesting

    Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?

    I take extreme issue with that statement, and I'm surprised no one else has challenged it. Google does in fact put quite a bit of work into making themselves less vulnerable to these kinds of stunts. They even have a link on every results page where you can tell them if you got results you didn't expect, so they can hunt down the cause and refine their algorithm.

    The system will never be perfect, and this is the latest issue that has not (yet) been dealt with. Quit your griping.

    1. Re:"Finally"?? by jdavidb · · Score: 2, Informative

      I checked, and I've got documented evidence of this. On April 25 last year, I reported that earthlink.net was showing up as the top search result for queries involving various religious words, including "Bear Valley Bible Institute." The Church of Scientology (which owns Earthlink) was clearly engaging in something to distort the page rank of earthlink. I had noticed this for a long time before I recorded it.

      On that same day, I reported the problem to Google via their feedback mechanism. I note today that the problem is gone.

      Now if I can just do something about the "Church Of Christ at eBay Low Priced Church Of Christ. Huge Selection! (aff)" ads I keep getting on Google, I'll be happy... ;)

    2. Re:"Finally"?? by jdavidb · · Score: 1

      So at any rate, to sum up, I find the whining about Google "finally" doing something about this to be very unfair, since Google actively works on this kind of problem. It is disingenuous to dismiss their hard work and suggest that they have done nothing.

    3. Re:"Finally"?? by jdavidb · · Score: 1

      Further evidence that Google has not been negligent in dealing with googlebombing and spamming: Google's Spam report page. For those of you who whine that Google is doing nothing about "poor results" (usually those of you hawking junk we don't want to buy, I notice), you might want to reevaluate reality.

    4. Re:"Finally"?? by mithras+the+prophet · · Score: 1

      Jolly little thread you're having with yourself there, eh?

      --
      four nine eighteen twenty-7 thirty-nine forty-7 fiftyeight sixty-nine seventy-9 eighty-8 one-hundred-and-nine one-twenty
    5. Re:"Finally"?? by scrytch · · Score: 1

      > The Church of Scientology (which owns Earthlink)

      They do not. Their founder is a scientologist, and he's probably funnelled some of his share back. This is far from ownership or even control. The vast majority of their shares are in public hands.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    6. Re:"Finally"?? by Anonymous Coward · · Score: 0

      Yup, finally we can test a link free credit report

    7. Re:"Finally"?? by Anonymous Coward · · Score: 0
      whatever

      some of the oldest googlebombs (like this one) still work

      yeah, they've really done a lot to fix it

      why does google get a free pass from so many people? is the "cool" factor really so overpowering?

  40. Why doesn't google by hackstraw · · Score: 1

    simply make a distinction between "I am looking to buy something" searches vs "I am looking for information about something".

    They are clearly different kinds of searches, and I do both of them, yet I get the same results for both kinds of searches. The exception is froogle, which is definitely a step in the right direction, but not quite there.

    Although the interface has gotten a little better on altavista (remember them??), searches like the one for used condoms do not make sense for retail stores at all. I'm sorry guys, there isn't a market for used condoms, but if there were I'm sure someone would be more than willing to supply the demand.

    The google search for used condoms is a little better, but the advertising links on the right hand side do have:

    Used Anything -Dirt Cheap
    at Gov't & Police Auctions Near You
    Seized, Surplus Property. Hot Deals
    www.GovernmentAuctions.org

    And please do not take a tangent on "used condoms", it's just a sick memorable example.

    1. Re:Why doesn't google by MoonChildCY · · Score: 1

      A funny advert from altavista showed up when searching for those used condoms...

      You Can Discover Unique Products on eBay

      You can find used condoms right here. With over 5 million items for sale every day, you'll find the unique items you're looking for at the world's online marketplace - eBay.

  41. Easy solution by lightspawn · · Score: 2, Insightful

    Edit robots.txt to let search engines know they should ignore sandbox pages.
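
    As a rough sketch of what such a rule looks like and how a well-behaved crawler would read it, the snippet below feeds a small robots.txt to Python's standard parser. The /wiki/SandBox path and the example.org host are assumptions for illustration, not taken from any particular wiki.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki whose sandbox lives at /wiki/SandBox.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/SandBox
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if a crawler honoring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for page in ("SandBox", "FrontPage"):
        url = "http://example.org/wiki/" + page
        print(page, is_crawlable(ROBOTS_TXT, url))
```

    This only affects crawlers that choose to honor robots.txt; it does nothing to stop the bots that post the links in the first place.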

    1. Re:Easy solution by advocate_one · · Score: 1

      unscrupulous webmasters would stick their own wikisandbox in a sister site under their control and deliberately not put in a robots.txt file just so they can mess with their own rankings.

      --
      Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
    2. Re:Easy solution by CaptainSuperBoy · · Score: 1

      Why do you need a wiki to do that? A wiki is just a web page. How is that any different than setting up a network of doorway sites? Google already knows to de-emphasize sites that all link to each other.

  42. naked women are trash? i'll take all you got by waspleg · · Score: 3, Funny

    you know what they say about another man's garbage

  43. That's very interesting KEYWORD by Doesn't_Comment_Code · · Score: 1, Funny

    That's a very interesting article.

    Sig
    --
    KEY PHRASE <A HREF=www.my_website.com> KEYWORD KEYWORD KEYWORD <\A>

    --

    Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
  44. Re:apache + search + p2p = distributed search engi by datrus · · Score: 1

    Right, preserving the integrity of the search results in the case of malicious users injecting false information into the system sounds like a challenging problem.
    But a p2p search engine seems to be the only way to go for an open-source search engine implementation.

    David

  45. image based spam control by MaximusTheGreat · · Score: 3, Interesting

    What about using random image based spam control like the one yahoo uses on its new mail signup?
    So, every time you edit/post comment, you would be presented with an image with a random distorted text, which you will have to type in to be able to edit/post. That should take care of automated systems.

    1. Re:image based spam control by JamieF · · Score: 2, Insightful

      Hear, hear. Systems (software or otherwise) that offer something of monetary value for free, and provide no mechanism whatsoever to prevent people from exploiting them, are going to get exploited. Shocking!

      Maybe it wasn't obvious to blog and wiki programmers that the ability to post a comment or edit a wiki page was worth money. It isn't worth a lot per post, but because these are online systems, they are very susceptible to bots that can post in huge volume. All of those posts together can alter a site's placement in Google search results, and that's definitely worth money.

      Instead of whining about Google being influenced by attacks that use your Wiki or blog, how about making it hard for bots to post in the first place? Is that really an important feature that you can't live without?

    2. Re:image based spam control by Blakey+Rat · · Score: 2, Insightful

      I've always wondered why the challenge is always distorted text that is hard to read on a speckled background.

      Why not just show the picture of an object, like an apple or something, and ask the user to type in what it is? I mean, you could have a few hundred of these and it would be nearly impossible for an automated system to guess. (You have a few hundred different items, and like 5-10 images of each item.) I dunno, seems easier to me, but I don't write web software.

    3. Re:image based spam control by themusicgod1 · · Score: 1

      bad idea
      thanks for excluding all blind people from being able to contribute, and seriously pissing off anyone using lynx/elinks. Besides, there are 'automated' ways around these letters, and have been for quite some time.
      Automated systems aren't the problem. It's the people who are using the automated systems.

      --
      GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
    4. Re:image based spam control by metamatic · · Score: 1

      The first time I saw an open "anyone can edit" wiki, I thought "Jeez, that'll last about six months".

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
    5. Re:image based spam control by Anonymous Coward · · Score: 0

      You're right to a point. But you know what, I don't want to spend half my life trying to keep scum bags from ruining a good thing. There is such a thing as respect and honesty, and I expect everybody to have a certain amount of it.

      Sometimes it's not worthwhile to have something if you don't leave it open for honest people. For some reason, if you leave something wide open, the vast majority of people really will respect your wishes and use that service exactly the way you ask them to. Wikis are one thing that work like this. If they become restrictive, they rapidly lose value. Many are left wide open, and for the most part people respect that.

      For once, I want to see something that is protected only by people's honesty. It could work. We just need to get rid of these scumbags.

    6. Re:image based spam control by Anonymous Coward · · Score: 0

      When was that, like 1996?

    7. Re:image based spam control by MaximusTheGreat · · Score: 1

      For the blind, you simply allow them to post/edit after a one time signup and verification of email id. Once they log in they don't have to go through the image verification system.

    8. Re:image based spam control by bshanks · · Score: 1

      My impression was that for a captcha, you want to have the system be able to generate new images and grade them automatically. Otherwise, if there is only a limited number (thousands, millions?) of images, an attacker could download them all and somehow solve them once, then program their bot with the solution. This isn't possible if the captcha is auto-generating a different test each time.

      So, generating a picture of an apple doesn't qualify because a computer can't do it automatically.

      Btw, other posts on this thread say that indeed, spammers are actually mounting the above described attack (d/l all the test images and solve them once). Apparently they are copying the images to porn sites and then requiring human visitors to the porn sites to answer the questions! This way they don't have to spend time answering the questions themselves.

    9. Re:image based spam control by bshanks · · Score: 1

      As developer of WikiGateway, I certainly think that allowing bots to post is important! This is the keystone of a whole range of future features such as local wiki editing clients, alternate wiki frontends which automatically interface with the backend using bots, utilities to copy pages across different wiki sites, proxy services to connect wikis to other protocols such as email, etc.

      Many of these could be implemented if every wiki developer added some feature. But with bots, third parties can write overlays. In other words, bots are the key to decentralizing the addition of new features to the wiki experience; without bots, everything is limited to the pace of the wiki engine developers.

      I suggest that a captcha be implemented by default in the edit box to prevent bots from posting. But, registered and community-verified users should be able to enable bot-posting for bots which can authenticate with their username and password.

      "Community-verified" means that the community, by consensus (the same way wiki pages are edited), adds a flag to your user account that indicates that you are eligible to run bots.

    10. Re:image based spam control by bshanks · · Score: 1

      Here's the link I meant:

      WikiGateway

  46. Simple solution by Anonymous Coward · · Score: 0

    Mark the sandbox pages as non-indexable, non-followable with either meta tags or robots.txt.
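
    For the meta tag variant, a wiki engine would emit a robots meta tag in the head of sandbox pages only. A minimal sketch, assuming hypothetical page names and a hypothetical helper called from the page template:

```python
# Hypothetical names of the pages a wiki treats as its sandbox.
SANDBOX_PAGES = {"SandBox", "WikiSandbox"}


def robots_meta(page_name: str) -> str:
    """Return a robots meta tag for sandbox pages, or "" for normal pages."""
    if page_name in SANDBOX_PAGES:
        # noindex: keep the page out of the index; nofollow: ignore its links.
        return '<meta name="robots" content="noindex,nofollow">'
    return ""


if __name__ == "__main__":
    print(repr(robots_meta("SandBox")))
    print(repr(robots_meta("FrontPage")))
```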

  47. Re:apache + search + p2p = distributed search engi by datrus · · Score: 1

    I guess the spamming problem might be solved if each web-server is crawled by its own built-in search agent and a set of independent search agents that also belong to the p2p network.

    This way, even if the webserver advertises false information, this information isn't taken into account if it isn't verified by information coming from independent agents of the network.

    David

  48. YHBT. YHL. HAND. [Was: Re:Well, ...] by waveclaw · · Score: 2, Insightful

    No, 9/11 was pure evil

    Overuse of absolutes can lead to their deterioration. As an American I couldn't feel more turgid: now when the Europeans get ready to yell HITLER!!!! in IRC, I can just pre-emptively yell 9/11!!!!!!! and lose/end the conversation.

    To be fair, the difference between these 'blog abusing 'minor annoyances' and the large scale deaths/destruction of 9/11 can be seen as just a matter of scale. To some people I know, the economic impact of terrorism keeps them awake at night: the value of human life be damned, watch that bottom line! (Not the most civically minded people, IMHO.)

    Being respected members of polite business society, these people and their defective outlook are just as dangerous to you and me as the wiki 'blog abusers and 9/11 baby killers. To them, you are either a customer, employee or garbage to be taken out by security.

    This, by the way, is how we treat anybody who we have successfully alienated. Look at these 'blog spammers. Would anyone have cried if Al Qaeda had blown up a spammer's house?

    Both sides of this argument stand at the top of a moral mountain with a very slippery slope and are trying to make the other fall off as far and as fast as possible. I'm waiting to see who tumbles first.

    Like they say on bash.org: I will become rich and famous when I invent a device to punch people in the face through the Internet.

    --

    "You cannot have a General Will unless you have shared experiences. You cannot be fair to people you don't know."
  49. Hit them in the wallet by Anonymous Coward · · Score: 0

    What happened to the nice Internet we had in 1996?

    They are playing hardball. We can strike back by building web browsers that have ad blocking enabled by default. Maybe we can drive a few of them out of business.

  50. This drives me insane. by Anonymous Coward · · Score: 0

    I deal with this day in and day out on infoanarchy.org/wiki. The administrator there really has no conception of how to block people who are constantly posting spam and I am trying to find a method to automatically revert pages back when they are changed.

    The problem is - allowing good changes and not ending up in a wrestling match with the spammer (and, yes, a Google contest - to me - is the same as spam).

    I realize this is all part of the constant struggle between spam to be more effective and sneaky and services like Google to only reference relevant results.

    That said, don't introduce yourself to me as someone who does that or you might get pushed down a long flight of stairs. Whoops!

  51. Webserfs? by leandrod · · Score: 1

    You meant webserfs for webmasters, didn't you?

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
  52. Comments Should be Hosted by Poster by Anonymous Coward · · Score: 0

    This whole problem would be moot if the commenting paradigm weren't so ass-backwards. Slashdot and other comment sites shouldn't host the comments, they should link to them. This would have the added benefit of posters getting to keep all their comments in one place.

  53. Because the sandbox... by Anonymous Coward · · Score: 0

    Is the only part of wikis that anybody can post to, right? No, so how about we toss all the "solutions" regarding protecting/limiting the sandbox out. If it needs a solution, it should be a more generalized one.

    I believe some of the blog software was using a google redirect mechanism to prevent links from polluting page ranking. Not sure how well it worked, but perhaps something like that would be useful here.

  54. really simple countermeasure by bcrowell · · Score: 1
    Sorry for the double post, but a really simple countermeasure just occurred to me. Code the wiki so that the sandbox will mung outgoing links. So if I run a sandbox on my wiki site wackywiki.org, people who are using the sandbox to learn how to wiki can enter [http://wackywiki.org/otherpage.html|foo], and it works as expected, but if they enter [http://blog.outer-court.com | Nigritude Ultramarine], it gets transformed into html as a link to an internal error page explaining that outgoing links from the sandbox are disabled.

    Yet another simple countermeasure would be to empty the sandbox if it hasn't been touched in one hour.
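
    A minimal sketch of the link-munging idea, assuming the [url|label] syntax and the wackywiki.org host from the example above. The error page path is hypothetical, and in this sketch only absolute links to the wiki's own host are left alone.

```python
import html
import re
from urllib.parse import urlparse

WIKI_HOST = "wackywiki.org"                   # host from the example above
ERROR_PAGE = "/sandbox-links-disabled.html"   # hypothetical internal page

# Matches [url|label], allowing spaces around the "|".
LINK_RE = re.compile(r"\[\s*(\S+?)\s*\|\s*([^\]]*?)\s*\]")


def render_sandbox_links(text: str) -> str:
    """Render wiki links as HTML, pointing off-site links at an error page."""

    def replace(match):
        url, label = match.group(1), match.group(2)
        host = urlparse(url).hostname or ""
        target = url if host == WIKI_HOST else ERROR_PAGE
        return '<a href="%s">%s</a>' % (
            html.escape(target, quote=True),
            html.escape(label),
        )

    return LINK_RE.sub(replace, text)


if __name__ == "__main__":
    print(render_sandbox_links("[http://wackywiki.org/otherpage.html|foo]"))
    print(render_sandbox_links(
        "[http://blog.outer-court.com | Nigritude Ultramarine]"))
```

    Because the outgoing URL never appears in the rendered HTML, there is nothing for a crawler to count, while people practising the syntax still see a working link.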

    1. Re:really simple countermeasure by lukewarmfusion · · Score: 1

      Great ideas. I have something similar on my site - I'll be sure to try some of those (meanwhile hoping that your email address doesn't end with @sco.com).

  55. Hardly new by Junks+Jerzey · · Score: 1

    Wikis have always been useful for self-promotion in less obscene ways. If you're knowledgeable in a field, and there's a wiki for it, then some tasteful posting and linkage is good for getting your name around. Ditto for Usenet and web-based discussion forums.

    And wikis have always been abusable--by design, really--by people with agendas. Hate Java or Python or Emacs or Perl or Windows? Then go to a popular wiki, delete positive comments about them, add positive comments about your own pet topic, and there you go. There's even a term for this.

  56. It's already been invented. by herrvinny · · Score: 3, Informative
    1. Re:It's already been invented. by Eythian · · Score: 1

      As someone who runs a personal wiki, I don't care if the sandbox gets indexed. I don't like that people want to use it for their own advertising (or rather, I don't like that they do use it for that). The problem isn't the people running wikis with sandboxes, it's the people who spam them. When all people with wikis exclude the sandbox from being indexed, spammers will just use some other page. So right now, my robots.txt block on the sandbox only works because few other people do it.

  57. Originality Filtering by mclove · · Score: 1

    Has anyone ever experimented with the idea of using a page's "originality" to help determine its place in search results? Maybe comparing text on a page with text on other pages in the results and moving any very similar pages down on the rankings. You'd have to add some sort of garbage filter to prevent people from stacking their pages with randomly-generated nonsense, but that's certainly doable. It wouldn't eliminate all of this SEO crap, but it would at least get rid of the fifty zillion nearly identical Amazon or BizRate or other pages that come up on a lot of searches, and would severely handicap some types of SEO techniques as well.

    Admittedly this is a lot more computationally intensive than most current search algorithms (as you'd pretty much have to do this in real time) but I can't imagine it's beyond the abilities of a Google or a Microsoft.
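
    One common way to measure "very similar" is to compare overlapping word sequences (shingles) with Jaccard similarity. The sketch below is only an illustration of that technique, not a description of what any search engine does; the shingle size and the 0.8 threshold are arbitrary assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def demote_near_duplicates(results, threshold=0.8):
    """Keep result order, but move pages similar to an earlier one to the end."""
    kept, demoted = [], []
    for page in results:
        if any(jaccard(page, earlier) >= threshold for earlier in kept):
            demoted.append(page)
        else:
            kept.append(page)
    return kept + demoted


if __name__ == "__main__":
    shop = "buy cheap widgets online today now"
    wiki = "a wiki sandbox is for learning syntax"
    print(demote_near_duplicates([shop, shop, wiki]))
```

    Comparing every page against every earlier result is quadratic in the number of results, which is the computational cost mentioned above.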

  58. Time to reconsider Wikis. by KevinDumpsCore · · Score: 1, Insightful

    > Isn't it time for Google finally to put some work into refining their results...

    Isn't it time to also reconsider the Wiki paradigm? More sites (like this) are requiring logins. "Golden Prose" indeed! IMHO, Wikis are evolving into crude Content Management Systems.

  59. Disallow weblinks by Will2k_is_here · · Score: 2, Interesting

    With regards to just editing the sandbox which nobody monitors anyway, why not just include a rule to deny adding URLs? There is no conceivable reason to allow a user to add a URL in the sandbox.

    And if you're thinking "I want to practise adding links with the required syntax", it's not hard. The only thing you need to use the sandbox for beyond learning how other basic syntax works (and you can apply that to links without practising) is structuring.
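
    A rule like that could be a simple check run before a sandbox edit is saved. This is a sketch under the assumption that anything starting with http://, https://, ftp:// or www. counts as a URL; a real wiki would have to match its own link syntax.

```python
import re

# Rough pattern for things that look like URLs; deliberately simple.
URL_RE = re.compile(r"\b(?:https?://|ftp://|www\.)\S+", re.IGNORECASE)


def contains_url(text: str) -> bool:
    """Return True if the submitted text appears to contain a URL."""
    return URL_RE.search(text) is not None


def accept_sandbox_edit(text: str) -> bool:
    """Reject sandbox edits that contain anything URL-like."""
    return not contains_url(text)


if __name__ == "__main__":
    print(accept_sandbox_edit("just trying out '''bold''' text"))
    print(accept_sandbox_edit("cheap pills at http://spam.example/"))
```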

  60. Clean sandbox daily. by chiph · · Score: 2, Informative

    As any cat owner will tell you, you need to clean the sandbox out periodically. In the case of a Wiki, overnight would probably be a good idea.

    Chip H.
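
    A sketch of such a periodic clean-out, assuming the wiki records when the sandbox was last edited. The default text and the 24-hour limit are placeholders, and the function would be called from whatever scheduled job the wiki already has.

```python
from datetime import datetime, timedelta

# Placeholder introductory text restored after a clean-out.
DEFAULT_TEXT = "Welcome to the sandbox. Edit this page to practise wiki syntax."


def sandbox_after_cleanup(text, last_edit, now, max_idle=timedelta(hours=24)):
    """Return the default text once the sandbox has been idle for max_idle."""
    if now - last_edit >= max_idle:
        return DEFAULT_TEXT
    return text


if __name__ == "__main__":
    edited = datetime(2004, 6, 7, 9, 0)
    print(sandbox_after_cleanup("spam links", edited, datetime(2004, 6, 7, 12, 0)))
    print(sandbox_after_cleanup("spam links", edited, datetime(2004, 6, 8, 9, 0)))
```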

  61. Grow up by scrytch · · Score: 4, Funny

    You know, googlebombing might have some better effect if you did it in reverse, e.g. SCO. Right now the second link for "litigous bastards" after sco.com is ... a page urging people to googlebomb. Gee, how subversive, no one will figure out how that worked... Hell every time you mention SCO come up with a different link for SCO so their google results will be peppered with such commentary after... People search for "SCO", not "litigous bastards".

    "Dumb fucker", "miserable failure", etc ... that was funny. Once. Get over it and take some real action against these, uh, litigous bastards, or at least improve the trick a little.

    --
    I've finally had it: until slashdot gets article moderation, I am not coming back.
    1. Re:Grow up by maxwell+demon · · Score: 4, Insightful

      Well, why not link SCO to something the reader gets real value from? Some page where they can learn something about SCO? After all, since those pages indeed tell something about SCO and therefore contain the word SCO, it should even be more effective.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    2. Re:Grow up by Anonymous Coward · · Score: 1, Insightful

      You forgot the most important SCO link.

  62. just like spam by SethJohnson · · Score: 2, Insightful


    Your suggestion is well-thought-out, but is plagued by two problems.

    1. The bombing bots won't give a rat's ass if you add this to robots.txt. Just like spammers, there's no cost for them to hit your site anyway. Even if Google is instructed to ignore the links.

    2. Your site's google ranking is affected by the quality of the links you feature pointing at other sites. Your solution unbalances this whole matrix.
  63. Another solution besides robots.txt by wamatt · · Score: 3, Interesting

    Spammers are going there because you have a high PR. So cut the PR supply and you're in business: http://www.site.com/~url=http://www.link.com and voila - URL rewriting. No more PR for mr spammer.
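
    A minimal sketch of that kind of rewriting, assuming a hypothetical /redirect endpoint on your own site that takes the real destination as a query parameter (the exact URL shape here is an assumption, not the one quoted above):

```python
from urllib.parse import parse_qs, quote, urlparse

# Hypothetical redirect endpoint on the site hosting the comments or wiki.
REDIRECT_PREFIX = "http://www.site.com/redirect?url="


def wrap_outgoing(url: str) -> str:
    """Rewrite an outgoing link so it points at the local redirect endpoint."""
    return REDIRECT_PREFIX + quote(url, safe="")


def unwrap(redirect_url: str) -> str:
    """Recover the real destination, as the redirect handler would."""
    return parse_qs(urlparse(redirect_url).query)["url"][0]


if __name__ == "__main__":
    wrapped = wrap_outgoing("http://www.link.com")
    print(wrapped)
    print(unwrap(wrapped))
```

    Whether this actually withholds rank depends on how the crawler treats the redirect, so the redirect path would presumably also need to be excluded from crawling.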

  64. Blocking for comment spam by Ian+Bicking · · Score: 1
    This doesn't really apply to Wikis, but for comments you can use redirects to foil this spamming. Well, not to prevent it, but at least to remove the incentive.

    This foils PageRank in some ways (after all, valid comments with links should increase the rank, at least some of the time). It would be a shame to do the same with all Wikis. Otherwise more onerous authentication may be necessary, which is also against Wiki principles (though common in Wiki implementations). Or some vetting, perhaps using this PageRank-fooling measure until the page changes are approved.

  65. Use pages not changed for a few days. by TorKlingberg · · Score: 1

    Perhaps google could index only wikipedia pages that have not been changed for a few days. So as long as people keep removing the spam links, they will have no effect on the pageranks. It won't work for forgotten old wikis, but it will for the big ones. I believe that completely ignoring links on Wikipedia is a bad idea, since the average quality of sites linked from there is very good.

  66. "Webmaster"? by Anonymous Coward · · Score: 0

    A "webmaster" maintains a website. This, however, covers the work of "spammers", anyone over-zealously promoting their website to the detriment of the web.

    Big difference. Let's not soil the good name of productive, non-scum career fields.

  67. Googled for "Wireless" got Pr0n by KidSock · · Score: 1

    Last night I googled for "wireless linux intersil prism" (or something like that) and found myself on a site with a page full of nothing but keywords about wireless stuff surrounded by porn banners. I was very happy^Hupset!

  68. detailed article on spoofing Google by freddyfred89 · · Score: 1

    From The New Yorker, 5/31/2004
    SEARCH AND DESTROY
    By James Surowiecki

    If you go to the Internet search engine Google, type in "miserable failure," and click on the "I'm feeling lucky" icon, you will be directed not to an article about "Ishtar" or the 1962 Mets but, rather, to the White House Web site and the official biography of President George W. Bush. Congratulations. You've been Google-bombed.

    A Google bomb goes off when people conspire to have a particular phrase (in this case, "miserable failure") link to a given Web page, effectively tying the phrase to the page. Other famous Google bombs include one linking "more evil than Satan himself" to Microsoft's home page and, currently, one that links "weapons of mass destruction" to a page that reads, "The weapons you are looking for are currently unavailable. . . . Click the Regime Change button, or try again later." Google bombing may be a party trick, something to amuse office workers as they trudge through the day, but it exemplifies one of the biggest challenges that Google faces as it heads toward its multibillion-dollar I.P.O.

    Google is as much a ranking system as a search engine. It is more efficient than any other site at analyzing information and making decisions about its importance. Google is successful not because if you search for "Enron" it will return 1.75 million pages that contain the word but because, of those 1.75 million, the most relevant are right at the top. In large part, Google does this by relying on the collective intelligence of the Web itself. At the core of Google's technology is a voting system. Every link from one Web site to another is treated as a vote; sites that get more votes are considered more valuable and, in Google's system, are weighted to have more influence. Google also takes hundreds of other factors into consideration, such as font size and the location of words on the page. But, fundamentally, the Web pages that Google says are best are the pages that the Web as a whole thinks are best.

    Google's success has created a problem, though: if you have a voting system, people are going to try to manipulate it. Google bombing is the innocent face of this. Less innocent is the industry dedicated to helping Web sites maximize their Google rankings-the racket known as "search engine optimization." Some American companies have armies of programmers toiling away in Bangalore solely to boost their Google rankings.

    Much of what the "optimizers" do is reasonable, helping companies do a better job of presenting content, using keywords, and building pages to which others will want to link. (These are termed "white hat" tactics.) But there are also plenty of black hats-known as "index spammers"-who have simply adapted the methods and tricks of the old political machines. In the days of Boss Tweed, people were encouraged to vote early and often, dead men were placed on the voting rolls, and citizens were paid for their votes. On the Web, companies "cloak," which means, among other things, that they disguise the real content of their sites, in an attempt to fool Google into thinking that a page is relevant to a search. Deep-pocketed players pay other sites to link to their sites, to foster an illusion of popularity. Some companies set up "link farms"-a host of interconnected Web sites that exist primarily to link to each other. A big company with a major Internet presence, for instance, can buy thousands of domain names, set up Web sites, and effectively create thousands of links out of nothing.

    Google, of course, knows about all this. In its recent I.P.O. filing, it said that the threat from index spammers was "ongoing and increasing," and so it has embarked on a campaign to outsmart them. A couple of weeks ago, for instance, it essentially banned a company called WhenU because of its cloaking tactics. (WhenU's Web site will no longer appear if you search for the company on Google.) To stymie the cheaters, Google issues periodic revisions to its algorithm, and companies breathlessly await the subsequent changes in their rankings. (Th
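
    The "voting system" the article describes can be illustrated with the simplified textbook form of PageRank. This is a sketch only: the damping factor of 0.85 and the tiny three-page graph are assumptions, and Google's real ranking uses many more signals, as the article itself notes.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each outgoing link passes on an equal share of the vote.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links spreads its vote over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # "c" plays the role of a sandbox page that links to "a".
    ranks = pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
    for page in sorted(ranks):
        print(page, round(ranks[page], 3))
```

    In this toy graph the extra link from "c" is enough to put "a" ahead of "b", which is the effect sandbox spammers are after.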

  69. Simple solution for wikis! by Sindri · · Score: 0, Redundant

    Just delete the sandbox every 24 hours.

  70. The question isn't who caused the problem by Solandri · · Score: 1

    The question is who has the problem and what's the best/easiest solution. The wiki admins don't care - their software is doing what it's supposed to be doing. Google is the one that has the problem with it because it degrades their search service. Do they solve it by convincing thousands (millions?) of sandbox administrators to all change their system? Or do they solve it by changing their algorithm?

  71. Weblog Anti Page Rank Boosting Techniques by Laebshade · · Score: 1

    And generally you are right, though I'd like to point out an instance where that isn't the case. WordPress has two very nice plugins available: one that uses Google as a redirect, and another that creates an MD5 hash of the website URL. So, for example, this link will take you to Google, which takes you to my website, which then redirects you to Slashdot.
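
    The comment doesn't say how the hash plugin works internally, so the following is only a guess at the general idea: outgoing links point at a local redirect path keyed by the MD5 of the real URL, so the page never links to the target directly. The "/go/" path and the in-memory table are assumptions, not anything taken from WordPress.

```python
import hashlib

# Hypothetical lookup table: MD5 hex digest -> real destination URL.
REDIRECTS = {}


def register_link(url):
    """Store the URL under its MD5 hex digest and return a local redirect path."""
    # MD5 is used here only as a short identifier, not for security.
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    REDIRECTS[digest] = url
    return "/go/" + digest


def resolve(path):
    """Return the destination for a /go/<digest> path, or None if unknown."""
    prefix = "/go/"
    if not path.startswith(prefix):
        return None
    return REDIRECTS.get(path[len(prefix):])
```

    A redirect handler would call resolve() on the requested path and answer with an HTTP redirect to the stored URL.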

  72. Which is why I thought it was real time by swb · · Score: 3, Interesting

    I thought it was a real-time thing, where the account creation bots passed the image that loaded during the signup process to a porn site and the images were decoded by a real person, and the result passed back to the bot who then signed up for the account.

    To avoid the timing problems with porn signons needing to happen concurrent with account signups, the account generation process was actually initiated by a porn signon. It limits your account generation ability, but only to the extent that you have porn traffic.

    Did I just imagine this, or does it work that way?

    1. Re:Which is why I thought it was real time by allism · · Score: 3, Informative

      You didn't imagine it, but perhaps a clearer understanding of the technique can be achieved by reviewing the previous discussions. Here's a link to the Slashdot article that discussed this last January.

    2. Re:Which is why I thought it was real time by Captain+Splendid · · Score: 1
      It limits your account generation ability, but only to the extent that you have porn traffic.

      So, basically, there's no limits!

      --
      Linux, you magnificent bastard, I read the fucking manual!
  73. good workaround: 'mail all commits' by jmason · · Score: 1

    We've also had problems on the SpamAssassin Wiki.

    Our solution has been to ensure that all changes are emailed to a mailing list, where we can monitor them and remove the spam links within minutes of their arrival.
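
    A rough sketch of what such a notification could look like, using only the Python standard library. This is not the SpamAssassin wiki's actual setup; the function name, subject format, and list address are illustrative assumptions.

```python
import difflib
from email.message import EmailMessage


def build_change_mail(page_name, old_text, new_text, list_address):
    """Return an EmailMessage holding a unified diff of one wiki edit."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=page_name + " (before)",
        tofile=page_name + " (after)",
        lineterm="",
    )
    msg = EmailMessage()
    msg["To"] = list_address
    msg["Subject"] = "[wiki] " + page_name + " changed"
    # The diff makes newly added links stand out as "+" lines.
    msg.set_content("\n".join(diff))
    return msg
```

    The message could then be handed to smtplib's SMTP.send_message() (after adding a From header) from whatever hook the wiki runs on each save.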

    An ideal solution: Google should define an attribute for the A tag, which indicates that a URL should not be used in computing Page Rank. We could then modify our Wikis so that page links from Wikis are not included.

    Same thing would work for weblog comment spamming, too.
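
    On the wiki side, the change could be as small as the sketch below. The comment doesn't name the attribute, so rel="nofollow" is an assumed placeholder here, and the regular expression is a simplification that a real wiki would replace with its own HTML renderer.

```python
import re

# Matches the start of an <a ...> tag that has no rel attribute yet.
_ANCHOR_WITHOUT_REL = re.compile(r"<a\b(?![^>]*\brel\s*=)", re.IGNORECASE)


def mark_links_unranked(html, attribute='rel="nofollow"'):
    """Add the opt-out attribute to every anchor tag that lacks a rel attribute."""
    return _ANCHOR_WITHOUT_REL.sub("<a " + attribute, html)
```

    Applied to user-submitted pages only, this would leave the site's own navigation links counting as normal votes.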

    1. Re:good workaround: 'mail all commits' by xiando · · Score: 0

      Having changes mailed to you may work on a small scale, but personally I would prefer not to have my in-box overfilled with messages telling me someone corrected some spelling error. Less (mail) is more. One mail a day at 20 seconds of reading time works; the 1000+ mails a day that I imagine Wikipedia would receive become a problem.

  74. Re:apache + search + p2p = distributed search engi by Bert690 · · Score: 1
    Right, preserving the integrity of the search results in the case of malicious users injecting false information into the system sounds like a challenging problem. But a p2p search engine seems to be the only way to go for an open-source search engine implementation.

    Agreed... I think it's solvable too -- just difficult, which makes it all the more fun.

    Also I accidentally linked to the wrong paper in my original reply. I meant to link to this one, but that other paper was at least marginally related.

  75. Google's on the right track by Scott+Richter · · Score: 1
    It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.

    That's what Froogle's for. Are you familiar with it? Type in a term and you'll find nothing BUT consumer items. If you're looking for a review site, that's Epinions. And if your complaint is that Google should have a dedicated product review site, that fails your other point of them being all things to all people.

    Also, I have NEVER typed in the model number of a product, with some reasonable attendant keywords, and NOT found good review info as well as the manufacturer's site. A little google prowess is all that's required. As someone else mentioned, do you have an example of a search that failed? With keywords, please.

    But they're too busy making an email application with 9 frames and 200k of Javascript to pay attention to the reason people use them in the first place. It's a little disappointing, I'm an AltaVista alumni and I got to watch them forget about search and do a bunch of useless crap instead, then die. I was hoping Google would be different.

    Please, gmail is a wonderful and necessary idea. Most webmail email clients suck - it's impossible to find your messages, either because they're not indexed or because you had to delete them to keep under your tiny limit. A ton of people, myself included, can't wait for gmail.

    Also, it's not like they aren't actively fixing the problems with abuse, but it's hard to keep up with the entire spamming world. Recall SearchKing - they killed that effectively. And just because they don't publicly detail their changes to PageRank doesn't mean they aren't working on it.

    I would expect the PageRanks of the Wikifarms to decrease within a month.

  76. Blind web surfing. by amber_of_luxor · · Score: 1

    Not to be a troll, or feed the trolls, but how does a blind person view the web now? If it is large text based, then there is a way that this can work, or if it is done by voice, can't you add a little MP3 "To leave a comment, first type "Bill Gates is wise"" or something like that?

    Blind people surf the net using one of the following "solutions":

    • Screen reading software, such as JAWS. These read from the top of the page down. The voice(s) can usually be adjusted somewhat by the user.
    • Braille Display Screens. This displays the text as braille dots, one line of text at a time.

    Audio embeds conflict with the screen reader output. [Ever browsed a website listening to a nice cd, only to also get some webdesigner's idea of what constitutes "good music" to come through your speakers at the same time? Audio embeds are the same problem, for users of screen readers.]

    Audio embeds don't conflict with braille display screens. The "problem" there is the assumption that an audio output is set up.

    Audio embeds work, if the user does not use a screen reading program. The combined audio streams usually result in neither being heard correctly.

    Deaf blind people can only surf the web with braille display screens. They don't install audio output on their systems. Audio embeds won't work for them.

    Amber

    --
    Wind Beneath Thy Wings
  77. Gee, Slashdot! You're Swell! by crashnbur · · Score: 1

    Somewhere, a Google employee is reading this Slashdot article thinking, "Oh shit. So much for next week's vacation."

  78. Just /. the bastards by Lonewolf666 · · Score: 1

    For the people with fast internet connections, it should be easy to do a repeated page mirroring with some tool like HTTrack. Maybe controlled by a little script that keeps repeating it...
    That will drive their amount of traffic through the roof and cost them. Usually, the amount of free traffic included with a webhosting account is limited to something like 100 GBytes/month. Exceed it and get a big bill.

    --
    C - the footgun of programming languages
  79. use robots.txt! by moosesocks · · Score: 1

    I don't understand why the wiki owners don't just put a robots.txt file in the directory of the sandbox telling search engines NOT to index the page containing the sandbox.

    That's why it's there.....

    --
    -- If you try to fail and succeed, which have you done? - Uli's moose
  80. The no-weight tag by IBitOBear · · Score: 0, Redundant

    It would be cool if sites could set a page or page-group with a "google weight" via a meta-tag. The weights would be from 0 to 100, with 100 being normal and 0 being no-value at all.

    Then sites could take things like their sandboxes and tell people that they are zero-weighted. In fact Wiki, blog, and-such software could automatically zero-weight the free non-user and sandbox pages to prevent this kind of abuse.

    Then you put a disclaimer at the top: "These pages are excluded from search engine page rankings."
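
    To make the proposal concrete, here is a sketch of how a crawler might read such a tag. Everything about it is hypothetical: no search engine is known to honor a "google-weight" meta tag, and the name and the 0-to-100 scale come only from the suggestion above.

```python
from html.parser import HTMLParser

META_NAME = "google-weight"  # hypothetical tag name, not a real standard


class _WeightParser(HTMLParser):
    """Collects the page weight from a <meta name="google-weight"> tag."""

    def __init__(self):
        super().__init__()
        self.weight = 100  # the proposal's "normal" value

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != META_NAME:
            return
        try:
            value = int(attributes.get("content") or "")
        except ValueError:
            return  # ignore malformed values and keep the default
        # Clamp to the proposed 0..100 range.
        self.weight = max(0, min(100, value))


def link_vote_factor(html):
    """Return a multiplier in [0, 1] for the votes cast by links on this page."""
    parser = _WeightParser()
    parser.feed(html)
    return parser.weight / 100
```

    A zero-weighted sandbox would return 0.0, so its outgoing links would contribute nothing to the ranking of the sites they point at.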

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  81. Amazing by Anonymous Coward · · Score: 0

    Google is filled to the brim with highly intelligent PhDs. This won't take long to fix, right?

  82. that's not gonna stop bots by Anonymous Coward · · Score: 0

    Sorry. This tag only stops well behaved bots.

    Deliberately written mis-behaving bots will just ignore it.

  83. mod parent up by themusicgod1 · · Score: 1

    seriously though. google's algorythm works.

    --
    GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
    1. Re:mod parent up by Anonymous Coward · · Score: 1, Informative

      Update your spellchecker. It's algorithm.

  84. Free pr0n--and screw over the spammers. I love it by Anonymous Coward · · Score: 0

    Now to get the word out to all the smut connoisseurs.

    gewg_

  85. Re:just need to stop the pagerank bots by Anonymous Coward · · Score: 0

    It will stop the bots that google and other search engines use for page ranking. It won't stop the spambots but it will prevent the spammers from gaining anything by it. If spammers can't raise the page ranking of a site by using those spambots, then the use of the spambots will stop, or at least not spam your site as it would be useless to do so. Making your site use HTTPS for everything would increase the resources needed to spam your site, making it even less useful to spam any site that uses those tags.

  86. Not just wikis by knuth · · Score: 1

    It's not only wikis that are appropriated by these spammers. I had to shut down a discussion board I ran because the spam got to be too much. I was logging on several times every day to delete the junk.

    The point of the article, I think, is that wikis are the new frontier for slimy spamming SEOs. The weasels have used "comment spam" on regular blogs. They have spammed referer logs. Now they are giggling over how they can defecate on wikis.

  87. Wiki_Sandbox/robots.txt by Morosoph · · Score: 1

    User-agent: Googlebot
    Disallow:
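
    One caveat, checked below with Python's standard robots.txt parser: an empty Disallow value places no restriction on the crawler, and crawlers look for robots.txt at the site root rather than inside a subdirectory. The host name and the alternative rule set are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# The rules as posted: an empty Disallow value blocks nothing.
OPEN_RULES = ["User-agent: Googlebot", "Disallow:"]

# A path prefix keeps compliant crawlers out of everything beneath it.
SANDBOX_RULES = ["User-agent: *", "Disallow: /Wiki_Sandbox"]

# Hypothetical wiki host used only for the check.
SANDBOX_URL = "http://wiki.example.org/Wiki_Sandbox"
```

    With these definitions, allowed(OPEN_RULES, "Googlebot", SANDBOX_URL) reports that the sandbox may still be crawled, while the same call with SANDBOX_RULES reports that it may not. Either way, this only affects crawlers that choose to obey the file.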

  88. And How , If I may Ask Do You Abuse Your /. sig by Avishalom · · Score: 1

    I didn't mean for it to happen but after a few posts on /. my page surged on google (when searching for my name)

    here's an example

  89. PigeonRank by BoneThugND · · Score: 1

    While Google's PageRank algorithm does not consider the 'subject' of a page that is linking to another page, their search algorithm is heading that way (sorta like TEOMA). Meanwhile Google-bombing will always work if there isn't a good REAL page about the text that is used for the Google-bomb. As for the META robot tag, don't expect people to use it, if they won't even validate their code (hell, I don't). Google will find ways to sort through the vast amounts of disorganized, unstructured data that is the World Wide Web we know and love.

  90. I am writing a script to clear wiki sandboxes.. by joeldg · · Score: 1

    just go through and clear them.
    easy.. they are almost all spam, even down on page 19 of the google results.
    bah

  91. Thanks for the informative quote. by simoncion · · Score: 1

    I enjoyed it. Please retain the formatting of the source page, or add paragraphs if none exist. Huge blocks are harder to read.

  92. pft by themusicgod1 · · Score: 0

    I don't use one and I'm proud of this fact d: fancy spellin i save fer english papers.

    --
    GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
  93. I discussed this with the Wiki creator by Anonymous Coward · · Score: 0
    a couple years back via e-mail. I asked him "if anybody can edit the pages, what's to stop someone from recursively going through the site and adding a shitload of spam?" His response was something along the lines of "I know. Please don't write a story about it." (I was using an e-mail address from a newspaper's domain.)

    The concept of the wiki was flawed from the beginning. These kinds of naive, utopian communities never work, because there's always going to be someone who is willing to run through the site and ruin it for the rest.

  94. Damn 'em by Owlman · · Score: 1

    The insensitive clods! Wikis help keep the commie side of the web alive.

    1. Re:Damn 'em by Owlman · · Score: 1
      </ /me me uses joke I don't know the history of>
  95. Relink spammers by whitis · · Score: 1

    If the spammers are linking text like " " or "." to hide their activities, google will easily be able to identify those and block those sites but then spammers will start linking words.

    How about we relink any spam we find? Change http://www.spamsite.com/ to http://www.searchenginespammers.net/bb-spammer.cgi/http://www.spamsite.com/ and then 1) click the link (or, better, have a program visit it with the correct referrer string, or report the link via a web form on the CGI) and 2) move the link to your search-engine-accessible spam page. Actually, reporting via a web form is better than clicking the link if you are doing it manually, because you don't increment the site's hit counters and you don't expose your computer to malware.

    Of course, someone would need to register searchenginespammers.net and install a cgi there that would basically display a page describing the criminal practice of bulletin board/wiki spamming, and then lists all the referrer strings that have brought it to this particular page.
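
    The relinking step itself could be sketched as below. The report site is the hypothetical one proposed above (nobody has set it up), and the function names are invented for the example.

```python
import re

# Report URL prefix from the proposal above; the site is hypothetical.
REPORT_BASE = "http://www.searchenginespammers.net/bb-spammer.cgi/"


def relink(spam_url):
    """Return the report URL that should replace a spam link."""
    return REPORT_BASE + spam_url


def relink_page(text, spam_urls):
    """Rewrite every listed spam URL in the page text to its report URL."""
    for url in spam_urls:
        # The lookbehind skips URLs that were already relinked, so running
        # the function twice leaves the page unchanged.
        pattern = re.compile("(?<!" + re.escape(REPORT_BASE) + ")" + re.escape(url))
        text = pattern.sub(lambda match: relink(match.group(0)), text)
    return text
```

    Deciding which URLs count as spam is left to the wiki maintainer; as noted further down, a list like this can itself be abused to smear innocent sites.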

    This will help search engines like google identify the wiki spammers and purge their sites from their search results. In the short term, searches for the keywords they tried to drive to their site would now take them to searchenginespammers.net and once the folks at google take action they can use it to activate a filter mechanism. Other sites besides google can use the information. Someone could start a PICS or DNS based blacklist based on listings at searchenginespammers.net that people could use to prevent patronizing such sites. Email filters could use the list to help identify spam.

    Like any site that lists spam URLs, there is the possibility that people will spam other people's URLs to discredit them, so that needs to be taken into account.

    Also, this thread is a reminder that when mentioning a company we dislike (SCO, MPAA, RIAA, Macrovision, Microsoft, George W Bush, etc.) we should either not link their name or link their name to a site that describes their misconduct; we don't want to help them get better search engine rankings.

    1. Re:Relink spammers by dos_dude · · Score: 0
      Of course, someone would need to register searchenginespammers.net and install a cgi there that would basically display a page describing the criminal practice of bulletin board/wiki spamming, and then lists all the referrer strings that have brought it to this particular page.
      I haven't yet implemented the referrer strings thingy, and I registered chongqed.org instead of searchenginespammers.net, but the idea is the same.
    2. Re:Relink spammers by whitis · · Score: 1

      I haven't yet implemented the referrer strings thingy, and I registered chongqed.org instead of searchenginespammers.net, but the idea is the same.

      Good for you! However, it looks like your current linking strategy would just give chongqed.org a high page rank for those keywords whereas the strategy I was outlining would give a high page rank to a page that named the spammer in the title. It is important that the spammers be readily identified by someone reading the search results not by having to visit an anti-spam site.

      Simulated search results for "viagra":

      • Viagra for less!
        www.cheapviagra.com
        Buy viagra for less! No prescription!
      • Half price viagra!
        www.halfpriceviagra.com
        Viagra only $3.29 per dose!
      • Official Site: VIAGRA (sildenafil citrate) - Information About ...
        Pfizer's official site provides information on the prescription drug Viagra. ... VIAGRA is a prescription drug used to treat erectile difficulties. ...
        www.viagra.com/ - 29k - Jun 6, 2004 - Cached - Similar pages
      • Wiki Spammer: www.cheapviagra.com
        www.searchenginespammers.net/bb-spammer.cgi/http://www.cheapviagra.com/
        www.cheapviagra.com has been spamming Wiki and Bulletin board sites to increase their page rankings for the keyword viagra...
      • Wiki Spammer: www.halfpriceviagra.com
        www.searchenginespammers.net/bb-spammer.cgi/http://www.halfpriceviagra.com/
        www.halfpriceviagra.com has been spamming wikis and/or bulletin boards to promote their site and/or increase their page rankings for the keyword viagra...
      • Wiki spammers using keyword "viagra"
        http://www.searchenginespammers.net/keywords.cgi?viagra
        This is a list of spammers who have been spamming wikis, bulletin boards, blog comments, and guestlogs in order to increase their search engine page rankings or get people to...
      • chongqed.org
        chongqed.org. All your page ranks are belong to us! ...
      Please note that I am making up the names of viagra spammer sites but some real company, spammer or otherwise, has probably actually registered those names. Yep. Both domains are registered. They might not be spammers but they both look sleazy: domain scalpers and a "cialis" search engine site.

      Note that this is how chongqed.org really appears on google. Hardly search engine savvy. The people you are trying to reach have no idea what "chongqed" means and couldn't care less about a reference to a badly translated sega video game. If anything, you look like a sleazy search engine optimization service.

      Note that it is important that the referenced page actually include the keywords being spammed or google will probably not link to them.

      By the way, it wouldn't be a bad idea to talk to someone at google and other ranking search engines about how to best implement this.

  96. Wiki posts by Anonymous Coward · · Score: 0

    Why not an automated reply system? If you become annoyed by a link, simply add it to a list to access on an ongoing, random basis. No need to accept replies. How could they possibly complain? After all, they did post a link to bring you to their address. With a few million people responding I should think they would be delighted.