Slashdot Mirror


Splogs Clog Blog Services

SuperWebTech writes "A new generation of spam has emerged lately in the form of automatically-created spam blogs, or "splogs." One wily programmer manipulated Blogger's API to create a "spamalanche" of thousands of blogs whose sole purpose was to increase their real sites' PageRank. This clogged search engine results while filling RSS feed services with useless listings. Though Google, Blogger's owner, is doing its best to fix the problem, in the meantime several services have stopped listing any site Blogger hosts. So far nobody has found a solution."

17 of 241 comments

  1. Username trend? by sethadam1 · · Score: 4, Interesting

    Anyone else notice that every username in the video is [letters]-[numbers].blogspot.com?

    Maybe start by disabling new blogs.
    Flag all usernames that meet that basic regex criteria.
    Hand filter that bunch.
    Add the same captcha you have on your comment system to the posting system.
    Re-enable registration.

    Seems kind of elementary, doesn't it? Why not try it?
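
The flagging step suggested above could look roughly like this. It is a minimal sketch: the pattern is only a guess at the "[letters]-[numbers].blogspot.com" shape, and `SUSPECT_HOST`, `flag_for_review` and the sample hostnames are made up for illustration.

```python
import re

# Assumed pattern for the "[letters]-[numbers].blogspot.com" shape;
# the exact format used by the spam run is not known from the thread.
SUSPECT_HOST = re.compile(r"^[a-z]+-[0-9]+\.blogspot\.com$", re.IGNORECASE)

def flag_for_review(hostnames):
    """Return the hostnames that match the suspect pattern, for hand filtering."""
    return [h for h in hostnames if SUSPECT_HOST.match(h.strip())]

# Hypothetical sample data.
sample = [
    "cheap-48213.blogspot.com",
    "mycatphotos.blogspot.com",
    "Loans-7.blogspot.com",
    "road-trip-2005.blogspot.com",
]
print(flag_for_review(sample))
```

Only the first and third sample names are flagged. A legitimate name such as road-trip-2005 slips past this particular pattern, but a blog genuinely called cheap-48213 would not, so the flagged set still needs a human look.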

    1. Re:Username trend? by De+Lemming · · Score: 5, Insightful

      Flag all usernames that meet that basic regex criteria.

      With all the effort spammers put into evading Bayesian filtering on e-mail, don't you think they will change their username format to something else half an hour after you implement this regex? Probably to something more variable (and dictionary-based).

      Hand filter that bunch.

      And hand filtering thousands of blogs which are created automatically does not seem feasible...

  2. Splogs? Seriously wtf by ponds · · Score: 5, Funny

    With the Splogosphere maturing, we can expect to see Splogcasts in the near future.

  3. Word verification? by badasscat · · Score: 4, Insightful

    Wouldn't a simple word verification requirement when creating a blog cure this? I don't think many people would bother creating "thousands" of new splogs if they knew they needed to manually enter in user data for each one... why should you even be able to start up a blog using an API?

    Blogger already requires word verification for posting comments (if the blog admin turns it on) - am I missing something or would this also work to at least alleviate the splog problem too?

    1. Re:Word verification? by Bogtha · · Score: 5, Interesting

      Wouldn't a simple word verification requirement when creating a blog cure this?

      Yes and no. CAPTCHAs solve the problem for things like Slashdot, where you just have to worry about trolls with too much time on their hands. But when it comes to spam, there's a value to beating them, so what some enterprising spammers do is set up porn sites that tell people "enter the word you see here and get free porn!". Lots of horny geeks do the spammers' work for them. The difference between the two scenarios is that the spammers are willing to pay minute amounts to beat the CAPTCHAs, but the trolls aren't.

      --
      Bogtha Bogtha Bogtha
  4. This is what Google Blogs is for... by michaelzhao · · Score: 4, Insightful

    Google has recently announced an idea that would benefit bloggers. The idea is to have a separate blog search similar to sites like "Technorati". At first glance, this benefits bloggers. However, it benefits Google even more. By having Blog searches separate, they can significantly cut down on Google-Bombing. Google-Bombing really screws with their search algorithms.

    I think this may be the beginning of a wholehearted launch of "Google Blog". This issue has also been reported on the "TWiT Podcast" hosted by Leo Laporte. I can't remember which episode number it is, but if you search the iTunes podcast database, you should be able to find it.

    An example of Google-Bombing: go to Google, search for "Miserable Failure" and hit "I'm Feeling Lucky". Regardless of what your opinions are, that type of behavior is still wrong.

  5. Charitable donation by Honkytonkwomen · · Score: 5, Interesting

    Simple: Just require a small donation to charity (through Paypal?) before they can create a blog. A dollar or two shouldn't matter to anyone who's putting up a real blog, but will deter sploggers.

  6. Couple of solutions? by keraneuology · · Score: 4, Interesting
    How about a spider-readable timestamp for blogs? If 5,000 new blogs pop up within 12 hours of each other linking to the same web page it is an obvious red flag.
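
A rough sketch of that red-flag check, assuming the spider can collect a creation time and a link target for each blog. The `(created_at, target_url)` record shape, the function name and the example URLs are all hypothetical. The 12-hour window and 5,000 threshold come from the suggestion above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_link_bursts(blogs, window=timedelta(hours=12), threshold=5000):
    """Return targets linked from at least `threshold` blogs created within `window`.

    `blogs` is an iterable of (created_at, target_url) pairs; this record
    shape is an assumption, not something a real spider is known to emit.
    """
    by_target = defaultdict(list)
    for created_at, target in blogs:
        by_target[target].append(created_at)

    flagged = set()
    for target, times in by_target.items():
        times.sort()
        start = 0
        # Slide a window over the sorted creation times.
        for end, t in enumerate(times):
            while t - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(target)
                break
    return flagged

# Tiny illustration with a lowered threshold.
base = datetime(2005, 10, 1)
spam = [(base + timedelta(minutes=i), "http://example.com/pills") for i in range(6)]
normal = [(base + timedelta(days=i), "http://example.org/") for i in range(6)]
bursts = find_link_bursts(spam + normal, threshold=5)
print(bursts)
```

Here only the target linked from the six blogs created minutes apart is flagged. The six blogs created a day apart never put more than one creation time inside the same window.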

    On top of this, once again the hosting services need to be held responsible: if a site is hosting an obviously spamvertised site then give them 24 hours to remove the site or be blocked from future indexing activities - and have current rankings deleted.

    --
    If the g'vt kept the data on you that google does you'd better believe you'd be calling it "doing evil"
  7. Damn Blog Hogs, Go swim in a Bog by slicer622 · · Score: 5, Funny

    I feel like I'm in a fog, without a seeing eye dog. What a sog! Burninate, Trog! Jeremiah was a bullfrog, but there was a server backlog. And that was just the prologue. Later we took a jog to get some egg nog. Just make sure to oil the cog. I know it's a slog, but it's better than smog. That's the end of this log.

  8. Crap Search... by Saeed+al-Sahaf · · Score: 5, Funny

    The trick is to figure out which are "splogs" and which are "real" blogs, because both are usually crap.

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
  9. Re:Splogs? Seriously wtf by biryokumaru · · Score: 4, Funny

    On a similar note, I think "Splogs Clog Blog Logs" would be a much better title.

    There should be an annual Seuss day where all article titles must be tongue twisters, and all summaries must be done in nonsensical rhyme.

    --
    When you're afraid to download music illegally in your own home, then the terrorists have won!
  10. splogs aren't the problem... by ianmassey · · Score: 4, Informative

    The problem surfaces when the "splogs" are used to comment spam and trackback spam legitimate blogs. It's through these links that PageRank is increased. If everyone starts proactively dealing with spam on their own sites, this problem will solve itself. MovableType users can upgrade to 3.2, which has spam blocking features, or use the great plugin MT-Blacklist. Either will eliminate this problem. An AC mentioned that WordPress has a similar set of options. I know that TypePad does. The only major blog service provider left to come up with a solution is Blogger, and in the interim you can require registration to post comments on your Blogger site or turn comments off entirely. LiveJournal and all the clones are blocked from trackback by 90% of normal blog sites already, so they don't even count.

    Another poster suggested that we ignore this problem, and it will go away. Untrue. Ignoring the 600 spam comments a day is exactly what the spammers would prefer you do, so that they can stink up every site on the internet with their crap. We are fortunate that in the case of this "new" form of spam, the tools necessary to get rid of it are already there and effective, we just need to get them all turned on.

  11. Re:Well let's get old fashioned by aussie_a · · Score: 4, Interesting

    P.S. stop relying on google so much, PageRank is obviously flawed if it can be so easily manipulated by spamtards.

    Do you have any alternate search engines (preferably with examples to prove that they're actually better) to use instead of google? I've tested out all the big names, and the results I get are almost always near-identical, with the small differences in the results returned not being that important.

    It is extremely frustrating when Google returns nothing useful, but I've yet to find a search engine that works better. Google's level of results seems to be the best anyone can achieve at the moment (and it's not really google that's setting the level of excellence).

  12. PageRank's fatal assumption by G4from128k · · Score: 4, Insightful
    PageRank appears to assume that each link is made independently of the target site. These splogs and other SEO tricks violate that assumption when commercially linked entities create links to each other's sites. Biasing the vote of a link based on some site credibility measure only helps slightly, as automation lets sploggers create massive numbers of spurious links. With PageRank, it's too easy to buy votes.
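
To make the vote-buying point concrete, here is a toy version of the textbook power-iteration PageRank. It is not Google's production algorithm, and the page names, the damping factor of 0.85 and the 20-splog farm are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a dict of page -> outbound links."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # Pages with no outbound links spread their rank over every page.
            targets = links.get(node) or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy web: one hub links to a shop and to its rival.
honest_web = {"hub": ["shop", "rival"], "shop": ["hub"], "rival": ["hub"]}

# The same web plus 20 auto-generated splogs that all link to the shop.
spammed_web = dict(honest_web)
spammed_web.update({f"splog{i}": ["shop"] for i in range(20)})

before = pagerank(honest_web)
after = pagerank(spammed_web)
print(before["shop"] / before["rival"])  # shop and rival start out level
print(after["shop"] / after["rival"])    # the farm pushes the shop ahead
```

The splogs have no inbound links of their own, yet each still passes along its baseline share of rank, which is why sheer numbers are enough to move the target.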

    Google needs some mechanism for judging whether a link is a fair link (made by an independent person/process) or a "bought" link created by or on behalf of the same site that is being linked to. I'd bet if Google analyzed these splogs and other SEO-generated sites, they'd find an excessive number of links from the splog to the target (or other in-network splogs) but few links from the splog to other relevant sites. Perhaps Google should reweight sites that seem to focus too many links in one direction. Of course, this is only a temporary solution as SEOs/sploggers could just use Google to find a set of random, but relevant, links to add to their splog.
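
One way the "too many links in one direction" measure might be computed, as a sketch only: the function name, the example URLs and the choice of "share of links going to the single most-linked host" as the metric are assumptions, not anything Google is known to use.

```python
from collections import Counter
from urllib.parse import urlparse

def link_concentration(outbound_urls):
    """Fraction of a page's outbound links that go to its single most-linked host."""
    hosts = [urlparse(url).hostname for url in outbound_urls]
    hosts = [h for h in hosts if h]  # drop links with no parseable host
    if not hosts:
        return 0.0
    top_count = Counter(hosts).most_common(1)[0][1]
    return top_count / len(hosts)

# Hypothetical outbound links for a splog and for an ordinary blog.
splog_links = ["http://pills.example/buy"] * 9 + ["http://news.example/story"]
blog_links = [
    "http://news.example/story",
    "http://photos.example/cat",
    "http://wiki.example/Spam",
    "http://friend.example/post",
]
print(link_concentration(splog_links), link_concentration(blog_links))
```

The splog scores far higher than the ordinary blog. As the comment notes, a splogger could lower the score by padding the page with unrelated links, so a cutoff on this number alone would only be a stopgap.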

    The deeper problem is that no matter what Google does, some clever SEO will find a way around it. And since sites seeking to be at the top of the search outnumber Google engineers by a wide margin, the SEOs would seem to have the advantage. The only group with greater numbers than the SEOs are Google users. I suspect the ultimate solution will mean social ranking systems where each Google user gets to rank pages and have a reputation for page ranking. The user reputation system would mitigate attempts by SEOs to either up-rank their pages or down-rank competitors' pages.
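
A minimal sketch of how a user reputation could dampen ballot stuffing, assuming votes in the range -1 to +1 and a small default weight for unknown accounts. The scoring rule, the names and the numbers are all invented for illustration.

```python
def weighted_score(votes, reputation, default_reputation=0.1):
    """Reputation-weighted mean of per-user votes (each vote between -1 and +1)."""
    total = 0.0
    total_weight = 0.0
    for user, vote in votes.items():
        # Accounts with no track record get a small default weight.
        weight = reputation.get(user, default_reputation)
        total += weight * vote
        total_weight += weight
    return total / total_weight if total_weight else 0.0

# Hypothetical page: two established users vote it down, three new accounts vote it up.
votes = {"alice": -1, "bob": -1, "sock1": 1, "sock2": 1, "sock3": 1}
reputation = {"alice": 5.0, "bob": 3.0}

plain_mean = sum(votes.values()) / len(votes)
score = weighted_score(votes, reputation)
print(plain_mean, score)
```

A plain head count favors the three fresh accounts, while the weighted score stays negative. How reputations would be earned, and kept from being farmed in turn, is the hard part this sketch leaves out.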

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:PageRank's fatal assumption by Richard+W.M.+Jones · · Score: 4, Interesting
      Google wrote a paper about TrustRank which is designed to evaluate the trustworthiness of a page, independent of number of links.

      (Disclosure: I work in "white hat" SEO, where we try to actually make sites more friendly, fast and useful for end users; this black hat SEO stuff doesn't do us any favours at all, so I'm keen to see these spammers wiped out by any means).

      Rich.

  13. Word verification is obsolete by Animats · · Score: 5, Informative
    Word verification is obsolete.
    • Programs have been written that can successfully decode CAPTCHAs most of the time. It turns out not to be too hard to modify OCR programs to do this.
    • Word verification can be outsourced to third world countries at low cost.
    • Most cleverly, word verification can be outsourced to users of your porno sites, who have to type in someone else's CAPTCHA to get free pictures.

    All these approaches are in active use.

  14. Green Eggs and Spam by Comboman · · Score: 5, Funny
    Splogs clog blog logs.

    Spam jams Stan's LAN.

    Guy's WiFi goes awry.

    CERN confirms worm, firms squirm.

    Forget cassette and diskette, USB key snazzy.

    Nimrods applaud iPods abroad, while tightwads called slipshod clawed screen fraud.

    One Phish, Two Phish.

    Red Phish, Blue Phish.

    --
    Support Right To Repair Legislation.