Slashdot Mirror


Webmasters Pounce On Wiki Sandboxes

Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."

33 of 324 comments

  1. Why just wikis? by GillBates0 · · Score: 4, Insightful

    Why not normal discussion boards and blogs? We, for one, saw how the SCO joke (litigious b'turds) managed to GoogleBomb SCO in first place without a problem.

    --
    An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
    1. Re:Why just wikis? by Andy+Mitchell · · Score: 3, Insightful

      I'm not sure this will make you feel better, but this strategy has a limited lifetime.

      The contribution of your page to another page's PageRank depends on two factors: first, the PageRank of your page, and second, the number of links coming from your page.

      As more people take up this tactic, the return everyone gets from it gets smaller. E.g., when there are hundreds of links on that page, they cease to have any real value. Eventually people should give up on this one.
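
The dilution described in the comment above can be illustrated with a minimal sketch. It assumes the classic published PageRank formula, where a page passes on its own rank divided by its number of outbound links, scaled by a damping factor. The function name, the damping value of 0.85, and the sample rank of 1.0 are illustrative assumptions, not Google's production behaviour.

```python
def link_contribution(page_rank, outbound_links, damping=0.85):
    """Rank passed along ONE outbound link under the classic PageRank formula.

    page_rank      -- rank of the linking page (hypothetical value)
    outbound_links -- total number of links on that page
    damping        -- damping factor; 0.85 is an assumed, commonly quoted value
    """
    if outbound_links <= 0:
        raise ValueError("a linking page needs at least one outbound link")
    return damping * page_rank / outbound_links


if __name__ == "__main__":
    # The same sandbox page is worth less to each spammer as more links pile up.
    for links in (1, 10, 100, 1000):
        print(links, link_contribution(1.0, links))
```

With a fixed page rank, going from 10 to 1000 links on the sandbox cuts the value of each individual link by a factor of 100, which is the "limited lifetime" argument in numbers.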

  2. Yes... PLEASE... by Paulrothrock · · Score: 4, Insightful
    Google needs to do something about this. I had to turn off comments on my blog because all I was getting was spam. Two or three a day that I had to go in and delete. I have to now find a system that will keep the bots out.

    What happened to the nice internet we had in 1996?

    --
    I'm in the hole of the broadband donut.
    1. Re:Yes... PLEASE... by Paulrothrock · · Score: 2, Insightful

      No, I blame opportunistic bastards who can't see that it's okay to not profit from something. *Thinks about his sledding hill that was destroyed by an upscale minimall.*

      --
      I'm in the hole of the broadband donut.
  3. You know... by fizban · · Score: 3, Insightful

    ...what Google needs? A "Was this result helpful in your search?" button for each link returned, so that the search itself also influences page ranks. Maybe that will help get rid of this Google bombing mess.

    --

    +1 Insightful, -1 Troll. What can I say, I'm an Insightful Troll.

    1. Re:You know... by Anonymous Coward · · Score: 1, Insightful

      Because obviously bots couldn't mess with that....

    2. Re:You know... by Anonymous Coward · · Score: 1, Insightful

      "...what Google needs? A "Was this result helpful in your search?" button for each link returned, so that the search itself also influences page ranks. Maybe that will help get rid of this Google bombing mess."

      Except spambots can also work to make sure that the most helpful links are the ones linking to spam sites.

    3. Re:You know... by Anonymous Coward · · Score: 4, Insightful

      That button will also get spammed, as bots will click 'yes' for their sites and 'no' for the competitors' sites.

    4. Re:You know... by goon+america · · Score: 3, Insightful

      Wouldn't that be equally abused?

    5. Re:You know... by Nasarius · · Score: 2, Insightful

      Ah, but how long will it take for someone to write a worm with a Google-abusing payload? We've already got spammers using hacked PCs to send mail.

      --
      LOAD "SIG",8,1
  4. Some people ... by TheGavster · · Score: 2, Insightful
    It still gets me how the people who are participating in the nigritude ultramarine thing don't see anything wrong with what they're doing. This line particularly got me:
    "Without, as opposed to guestbook spamming, being evil it's a sandbox after all."

    Yes, it's a sandbox; no, it's not your personal playground.
    --
    "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
  5. Whose fault is that? by lukewarmfusion · · Score: 4, Insightful

    Google's algorithm isn't the problem. The problem is the availability of easily abused areas such as these "sandboxes."

    Some search engines accept any old site. Others accept sites based on human approval and categorization. Google is a nice combination of the two: by using outside references (counting how often the site is linked), it assumes that the site is more relevant because other people have put links to it on their sites. That's a human factor, without directly using human beings to review and categorize the sites and rankings.

    Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.

  6. ROBOTS.TXT by gtrubetskoy · · Score: 4, Insightful
    The burden is not on Google, but on Wiki sandbox admins, who should provide proper ROBOTS.TXT files to inform Google that this content should not be indexed.

    As a sidenote, I think that with recent Wiki abuse, the issue of open wikis will become a similar one to open proxies and mail relays.

  7. Same site, a few days later: Don't do it. by micha2305 · · Score: 2, Insightful
    Ok, but the same webmaster says:

    I decided to stop posting backlinks in Wiki sandboxes, the SEO strategy previously explained. [...] In the meantime I'm asking developers and those hosting Wikis of their own to please exclude sandboxes from search engine results (via the robots.txt file). Doing so would shield the sandbox from backlink-postings, and there is no need for it to turn up in search results in the first place.

    This sure makes sense, and who knows, maybe future wiki distributions do it by default. (If

    <meta name="robots" content="noindex">
    would work universally...)
  8. Well, it's about time this gets some attention by digitalgimpus · · Score: 4, Insightful

    I've noticed that my blog's getting lots of spam from sites that don't seem like typical spam sites....

    From what I can see, it looks like those "search ranking professionals" who "guarantee to raise your google rank in 30 days" are using blog spamming, and perhaps wiki spamming, as a way to increase their clients' ratings.

    It's not about meta tags, or submitting anymore... it's spamming.

    Perhaps it's time for people to finally be wary of these services. After all, can a third party really guarantee a position in another company's search index?

    IMHO those services are pure evil. They either do nothing, or they do something to increase page rank... what is that "something"? How many options do they have?

    If they are going to use my blog... why can't I get a cut in that business?

    1. Re:Well, it's about time this gets some attention by Lurker+McLurker · · Score: 4, Insightful
      IMHO those services are pure evil.
      No, 9/11 was pure evil, some unwanted comments on a blog is an annoyance. If you have a website that allows anyone to post comments, you will get some you don't like. That's life.
      --
      Mod parent up!
    2. Re:Well, it's about time this gets some attention by Heywood+Yabuzof · · Score: 2, Insightful


      OK, so it's not really fair to get into relative levels of "evil", but let's also not minimize the "evil" that search optimizers do. It's not just a bunch of extra comments on blogs or wikis.

      Their fundamental business model is CONTRARY to my interests as a consumer trying to get product information. They don't wish to let me find the product or the review or the site that MOST PEOPLE FOUND USEFUL, they only want me to find the one that PAID THEM THE MOST MONEY.

      I realize that's just the way things are, but that's obviously counter to my whole purpose for using a search engine like Google. They are intentionally polluting the search results. It's not the methods I find "evil" (although blog comment and wiki spamming are pretty shady) as much as the end result - the loss of helpful web searches.

  9. Re:Cyberneighborhood Not-Watch? by lunax · · Score: 3, Insightful

    Why not put the sandbox in its own folder and add an entry to the robots.txt telling it not to browse that folder?
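
A rough sketch of the robots.txt suggestion above, checked with Python's standard `urllib.robotparser`. The folder `/wiki/sandbox/` and the host `wiki.example.org` are made-up placeholders; a real wiki would use whatever path its sandbox actually lives under.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki that keeps its sandbox in its own folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/sandbox/
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if a well-behaved crawler may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sandbox pages are excluded, the rest of the wiki stays indexable.
    print(is_crawlable("http://wiki.example.org/wiki/sandbox/PlayHere"))
    print(is_crawlable("http://wiki.example.org/wiki/FrontPage"))
```

This only affects crawlers that honour robots.txt. As other comments in the thread point out, it does not stop a bot from posting to the sandbox in the first place.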

  10. apache + search + p2p = distributed search engine by datrus · · Score: 2, Insightful

    Something that would make a nice opensource project would be to include p2p search functionality in apache itself.
    This way all the modified web servers would make a giant distributed search engine.
    Some nice algorithms like koorde or kademlia could be used.
    Anyone thought about starting something like this?

    David

  11. Tomorrow today yesterday by boa13 · · Score: 4, Insightful

    But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank.

    The Arch Wiki has suffered several times from such vandals in the past few months. I'm sure other wikis have, too. They create links over single spaces or dots, so that casual readers don't notice them. Attentively watching the RecentChanges page is the most effective way to find and fight them, but this is tiresome. I guess many wikis will require posters to be authenticated soon, which is a blow to the wiki ideal, but not such a major blow. Alternatively, maybe someone will develop heuristics to fight the most common abuses (e.g. external link over a single space).

    So, this is not new, but this is now news.
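
The heuristic suggested in the comment above (flagging an external link placed over a single space or dot) might look roughly like the sketch below. It assumes the check runs on rendered HTML rather than raw wiki markup, and the regular expression and function name are illustrative only. A production filter would want a real HTML parser.

```python
import re

# Matches <a ... href="http(s)://..."> ... </a> and captures the URL and link text.
ANCHOR_RE = re.compile(
    r'<a\s[^>]*href\s*=\s*["\']?(https?://[^"\'\s>]+)[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def hidden_external_links(html):
    """Return external URLs whose visible link text is empty, a space, or one dot/comma."""
    suspicious = []
    for url, text in ANCHOR_RE.findall(html):
        # Drop nested tags and non-breaking spaces, then surrounding whitespace.
        visible = re.sub(r"<[^>]+>|&nbsp;", "", text).strip()
        if visible in ("", ".", ","):
            suspicious.append(url)
    return suspicious


if __name__ == "__main__":
    page = (
        'See <a href="http://good.example/doc">the docs</a>'
        '<a href="http://spam.example/"> </a> for details'
        '<a href="http://spam2.example/x">.</a>'
    )
    print(hidden_external_links(page))
```

A wiki could run such a check on each saved revision and hold flagged edits for review, which would take some load off whoever is watching RecentChanges.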

  12. Sandbox persistence by gmuslera · · Score: 2, Insightful
    If it's a test area, does it need to be stored? Wikis could just keep it alive for the user's current session or test, and when the user logs out or finishes editing, simply delete it or restore it to a default introductory text. It doesn't need to be some kind of collaborative blackboard or graffiti wall, or at least, if it must be, that should be the webmaster's choice (at least TikiWiki lets me disable the sandbox if I want).

    But if the problem is having areas on websites where visitors (even unregistered ones) can post random text and links, then even Slashdot is potentially a target of the same thing (maybe there should be a "Spam" mod score?), as is any site where unregistered visitors can store content in one way or another, wiki or not.

  13. Re:visual security code for sign-up by stevey · · Score: 5, Insightful

    There was a story about defeating this system on /. a while back.

    Rather than using OCR or anything, people would merely harvest a load of images from a signup site - possible when there is only a finite number of images, or when there is a consistent naming policy.

    Then once the images were collected they would merely set up an online porn site, asking people to join for free by proving they were human, which meant decoding the very images they had downloaded.

    Human lust for porn meant that they could decode a large number of these images in a very short space of time, then return and mount a dictionary attack...

    Quite clever really, sidestepping all the tricky obfuscation/OCR problems by tricking humans into doing their work for them ..

  14. Easy solution by lightspawn · · Score: 2, Insightful

    Edit robots.txt to let search engines know they should ignore sandbox pages.

  15. Re:Cyberneighborhood Not-Watch? by naoiseo · · Score: 3, Insightful

    This fails to address the real issue.

    That is, even if you make your links useless (easy with a no-follow meta tag) it won't help: the majority of this spam is AUTOMATED, and will spam your wiki/blog/guestbook based on simple page cues.

    Your best personal defense is to manually remove any page or HTML cues that a spammer would pick up on as being common to a certain type of postable web page or element.

    Bloggers have been creating blacklists (banning both poster IPs and destination URLs) with some degree of success. Showing up on a blacklist is a deterrent for a spammer, because webmasters use the distributed file to 'clean' their blogs automatically.
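
As a sketch of the blacklist approach mentioned above, the check below rejects a post if either the poster's IP or any linked domain appears on a shared list. The list contents, the function name, and the idea of keeping both lists in memory are assumptions for illustration. The IPs and domains come from ranges reserved for documentation.

```python
import re
from urllib.parse import urlparse

# Hypothetical entries that would normally come from a shared, distributed file.
BLOCKED_IPS = {"203.0.113.7"}
BLOCKED_DOMAINS = {"spam.example"}

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def is_blacklisted(poster_ip, comment_text):
    """Return True if the poster IP or any linked domain is on the blacklist."""
    if poster_ip in BLOCKED_IPS:
        return True
    for url in URL_RE.findall(comment_text):
        host = (urlparse(url).hostname or "").lower()
        # Match the blocked domain itself and any of its subdomains.
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            return True
    return False


if __name__ == "__main__":
    print(is_blacklisted("198.51.100.1", "Cheap pills at http://www.spam.example/buy"))
    print(is_blacklisted("198.51.100.1", "Nice post, see http://example.org/ too"))
```

Matching on the destination domain rather than the poster's IP is what makes the list useful against bots that rotate addresses, since the site being promoted changes far less often.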

  16. YHBT. YHL. HAND. [Was: Re:Well, ...] by waveclaw · · Score: 2, Insightful

    No, 9/11 was pure evil

    Overuse of absolutes can lead to their deterioration. As an American I couldn't feel more turgid: now when the Europeans get ready to yell HITLER!!!! in IRC, I can just pre-emptively yell 9/11!!!!!!! and lose/end the conversation.

    To be fair, the difference between these 'blog abusing 'minor annoyances' and the large scale deaths/destruction of 9/11 can be seen as just a matter of scale. To some people I know, the economic impact of terrorism keeps them awake at night: the value of human life be damned, watch that bottom line! (Not the most civically minded people, IMHO.)

    Being respected members of polite business society, these people and their defective outlook are just as dangerous to you and me as the wiki 'blog abusers and 9/11 baby killers. To them, you are either a customer, employee or garbage to be taken out by security.

    This, by the way, is how we treat anybody who we have successfully alienated. Look at these 'blog spammers. Would anyone have cried if Al Qaeda had blown up a spammer's house?

    Both sides of this argument stand at the top of a moral mountain with a very slippery slope and are trying to make the other fall off as far and as fast as possible. I'm waiting to see who tumbles first.

    Like they say on bash.org: I will become rich and famous when I invent a device to punch people in the face through the Internet.

    --

    "You cannot have a General Will unless you have shared experiences. You cannot be fair to people you don't know."
  17. Time to reconsider Wikis. by KevinDumpsCore · · Score: 1, Insightful

    > Isn't it time for Google finally to put some work into refining their results...

    Isn't it time to also reconsider the Wiki paradigm? More sites (like this) are requiring logins. "Golden Prose" indeed! IMHO, Wikis are evolving into crude Content Management Systems.

  18. Re:image based spam control by JamieF · · Score: 2, Insightful

    Hear, hear. Systems (software or otherwise) that offer something of monetary value for free, and provide no mechanism whatsoever to prevent people from exploiting them, are going to get exploited. Shocking!

    Maybe it wasn't obvious to blog and wiki programmers that the ability to post a comment or edit a wiki page was worth money. It isn't worth a lot per post, but because these are online systems, they are very susceptible to bots that can post in huge volume. All of those posts together can alter a site's placement in Google search results, and that's definitely worth money.

    Instead of whining about Google being influenced by attacks that use your Wiki or blog, how about making it hard for bots to post in the first place? Is that really an important feature that you can't live without?

  19. Re:image based spam control by Blakey+Rat · · Score: 2, Insightful

    I've always wondered why the image is always distorted text that is hard to read on a speckled background.

    Why not just show the picture of an object, like an apple or something, and ask the user to type in what it is? I mean, you could have a few hundred of these and it would be nearly impossible for an automated system to guess. (You have a few hundred different items, and like 5-10 images of each item.) I dunno, seems easier to me, but I don't write web software.

  20. just like spam by SethJohnson · · Score: 2, Insightful


    Your suggestion is well-thought-out, but is plagued by two problems.

    1. The bombing bots won't give a rat's ass if you add this to robots.txt. Just like spammers, there's no cost for them to hit your site anyway, even if Google is instructed to ignore the links.

    2. Your site's google ranking is affected by the quality of the links you feature pointing at other sites. Your solution unbalances this whole matrix.
  21. Re:Cyberneighborhood Not-Watch? by jacoplane · · Score: 2, Insightful

    I think the real problem is that spammers aren't likely to look at how you've configured spiders to handle your site. So even if you do this, I'm sure it won't get rid of the spammers.

  22. Re:Grow up by maxwell+demon · · Score: 4, Insightful

    Well, why not link SCO to something the reader gets real value from? Some page where they can learn something about SCO? After all, since those pages indeed tell something about SCO and therefore contain the word SCO, it should even be more effective.

    --
    The Tao of math: The numbers you can count are not the real numbers.
  23. Re:Cyberneighborhood Not-Watch? by bkhl · · Score: 2, Insightful

    Also, we really need to replace the kludgy robots.txt files and robots meta-tags with headers built into the HTTP protocol.

    As it is, it's hell to try to get decent robotic behaviour out of anything other than HTML pages.

  24. Re:Grow up by Anonymous Coward · · Score: 1, Insightful

    You forgot the most important SCO link.