Slashdot Mirror


The Ham and Spam of Weblogs

An anonymous reader submits "Will the blogosphere become just as spammy as Usenet? There may be over 10M weblogs out there, but most of them seem to be fake spam blogs created to manipulate the search engines. Scott Johnson, CTO at Feedster, complained that "at times we see upwards of 90% of the traffic from Blogspot being spam," and the problem is likely to only get worse. Can blog search engines like Technorati, Feedster, and PubSub filter the signal from the torrent of noise? Or will we have to seek new approaches such as the social filtering used by Del.icio.us or collaborative filtering used by Findory to separate the ham from the spam?"

14 of 192 comments

  1. Re:Quick Fix by seneces · · Score: 2, Insightful

    Who is going to pay $1 to read about how your boyfriend dumped you last week and you're still crying in bed? Blog comment spam isn't all that hard to get rid of (filter links, filter content, or if you're just worried about search engines, use rel="nofollow").

    Anyone who has a blog that you have to pay to comment on (or to see) isn't going to get much traffic.
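
    The rel="nofollow" option mentioned above can be sketched roughly as follows. This is a minimal illustration, not any blog engine's actual code: the function name add_nofollow is hypothetical, and it uses a regular expression rather than a real HTML parser, so it assumes reasonably well-formed comment markup.

```python
import re

# Matches the opening tag of an anchor, e.g. <a href="...">.
# The \b keeps tags such as <abbr> from matching.
_ANCHOR_OPEN = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Detects an existing rel attribute inside the tag's attribute text.
_REL_ATTR = re.compile(r"\srel\s*=", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Add rel="nofollow" to every anchor that has no rel attribute yet.

    Hypothetical helper for illustration only; a production filter
    should use a proper HTML sanitizer instead of a regex.
    """

    def _rewrite(match: re.Match) -> str:
        attrs = match.group(1)
        if _REL_ATTR.search(attrs):
            # Leave anchors that already declare rel untouched.
            return match.group(0)
        return f'<a{attrs} rel="nofollow">'

    return _ANCHOR_OPEN.sub(_rewrite, comment_html)
```

    Search engines that honor the attribute then give the spammer no ranking credit for the link, which removes most of the incentive even if the comment itself slips through.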

  2. Software is Not Social by lux55 · · Score: 5, Insightful

    I just wanted to point out that so-called "social software" is not social. Person-to-person communication through computers is mediated and indirect. Technology is a barrier to communication as much as it is an enabler. I agree that it is an enabler in situations where it is used to help overcome disabilities and things of that nature; however, technology is used more so by people who are actually avoiding being social. Email is often preferable to a telephone because it creates an additional barrier between ourselves and the "recipient" (aka person).

    A prime example of software in a "social" context is the chatter that accompanies networked video games. This does not form real relationships between people. I heard a teenager recently say that his gaming buddies, who he doesn't even know by name, are like family to him. Technology has helped a whole generation and then some to fail to learn what real relationships are. When a teenager can't distinguish between somebody he's only ever witnessed virtually shoot ze Germans and the people who nurtured him before he was able to take care of himself, we have a problem, Houston.

    And it's only getting worse. Now we've begun adding "social" in front of all kinds of new web applications. Anything that lets other users see your profile and the items you post and comment on them is seen as a valid replacement for real human contact.

    There was a line from a movie I saw recently called Crash, where Don Cheadle's character says to his girlfriend "It's the sense of touch. Any real city you walk, you know. You brush past people, people bump into you. In L.A., nobody touches you. We're always behind this metal and glass. I think we miss that sense of touch so much, that we crash into each other just so we can feel something." The next time we use the word "social" to describe a new type of web application, I think we should give that some thought first.

    1. Re:Software is Not Social by aftk2 · · Score: 3, Insightful
      You raise valid points, but what would you have to say to these people?
      The passing of a forum god.
      For the people who are mourning the loss described in the link, is their grief less meaningful than that of those who knew the person directly, face-to-face? Perhaps, but perhaps not: I know a bunch of people, some of whom I see regularly, but with whom I haven't had as meaningful a relationship as some people I've spoken to online, but have never met in person. Is there a qualitative difference between the two types of social interaction? Probably - but I think it's too easy to say "the way we always used to do things is right" and "This is new, and less personal, and hence, wrong."
      --
      concrete5: a cms made for marketing, but strong enough for geeks.
  3. Re:To me (most) blogs ARE spam by aftk2 · · Score: 4, Insightful

    I'm just curious - exactly _what_ would you include, if not blogs? Certainly, I can understand not including those parked domain search sites: they're gaming the system, completely unhelpful, and filled with bogus content.

    But blogs? Sure, much of the content is poorly written, or not applicable to what most people - or, well, rather, 90% of a given population - are interested in. But in searches especially, doesn't it make sense to list results that include those normal people so interested in a particular topic that they blog about it?

    For example, blogs can be very helpful when facing computer troubles, provided you're dealing with bloggers who know how to write for Google. This is a good example. I mean, this surely has to be more worthy of inclusion in Google than the lion's share of those web-based bulletin boards that get indexed - you know the ones, with the "Next in thread" and the replies that are typically out of date, or altogether unrelated to your original query.

    Everyone's quick to dismiss things lately. Don't dismiss blogs, just because sometimes their content seems insular and not applicable to what you've searched for. That's a problem with the search engines, not the sites they index.

    --
    concrete5: a cms made for marketing, but strong enough for geeks.
  4. Re:To me (most) blogs ARE spam by bigman2003 · · Score: 3, Insightful

    The blogs don't bother me nearly as much as "those stupid parked domain search sites."

    I don't know how many times I have done a Google search, and the 3rd or 4th result comes back with my exact phrase..yay!

    Then I go to some stupid, totally lame site advertising domain names, or listing other sites, or something like that.

    I never have figured out how they get listed in Google the way they do, though, because my search phrase is not listed on the page...so evidently they know something I don't.

    --
    No reason to lie.
  5. Re:Welcome to Slashdot. by croddy · · Score: 2, Insightful
    The vast majority of blog-style web sites are written and controlled by a single user. Slashdot has several editors, and all of the stories are contributed by visitors.

    Make no mistake: Slashdot is not what people are talking about when they complain of the spam that blogs have dumped into Google.

    Slashdot represents thousands of voices.

    Most blogs represent one voice only.

  6. Re:Welcome to Slashdot. by The One and Only · · Score: 5, Insightful

    God forbid that one voice be allowed to speak without needing to ask the consent of thousands of others.

    --
    In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
  7. Re:Personal blogs compete directly with spam blogs by ciroknight · · Score: 2, Insightful

    I'd say blogs are more than just what you've said. Hear me out.

    Blogs are a new form of communication. Before, we had "editorials" which were published in newspapers, where someone of stature made their opinion well known, simply to spark debate and interest in the public's mind. Now this is a turn for everyone to have their own editorial, and to foster debate and discussion. Welcome to Slashdot, by the way.

    Secondly, they offer a form of sympathy to the author; normally someone either says "I like your book" or "I don't like your book". This gives people a chance to say "Well, I liked your book, but the ending could be better. I don't think Saffron shoulda died when she fell into the swimming pool" or something like that. Sometimes it's rewarding to write something, but you never know how other people relate to it, and this is just a great opportunity to get that feedback, instantly.

    Lastly, it's an insight into the person. It shows what that person values by what they write about often. It shows how educated the person is by word choice and by sentence structure. It shows how thoughtful the person is when they ask questions. It shows how we're different, as people.

    Honestly, I think the problem is that nobody thought about the problem before it existed. When we thought of the Internet, we thought of it as a number of infinitely flexible services accessible by port interfaces. When we sat down and thought of the way we wanted to put the web together, we wrote a common interfacing language, and ways of accessing that information, by a standard, over the internet. But what we didn't think of was how different the kinds of media transported over the internet would become. Had we thought of it, we might be using blog:// to access blogs today, instead of a certain http address, just as we might be using images:// or video://. Honestly, it shows how well the original system was designed, but then again it also shows how we pretty much stopped designing the system after it solved our problem (same with email, IMO).

    --
    "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
  8. Re:Welcome to Slashdot. by ciroknight · · Score: 3, Insightful

    Take it up with them personally, then. Oh, or you could just use a search engine that actively removes blogs from its indices. Or you could make your own and remove them personally. Or you could subtract out the sites you don't want in existing search engines.

    As I can see it, choice is on your side. They have the choice of posting or not posting. You have the choice of how you want to deal with it.
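
    The option of subtracting unwanted sites from an existing search engine can be sketched like this. It is only an illustration: the helper name exclude_sites is hypothetical, and it assumes the engine accepts a "-site:" exclusion operator in the query string, as Google does.

```python
from typing import Iterable


def exclude_sites(query: str, blocked_domains: Iterable[str]) -> str:
    """Append a -site: exclusion for each blocked domain to a search query.

    Hypothetical helper; assumes the target search engine supports the
    "-site:" operator.
    """
    exclusions = " ".join(f"-site:{domain}" for domain in blocked_domains)
    # strip() drops the trailing space left when nothing is blocked.
    return f"{query} {exclusions}".strip()


if __name__ == "__main__":
    print(exclude_sites("linux wifi driver", ["blogspot.com", "livejournal.com"]))
```

    The resulting string can be pasted into the search box as-is, so the filtering stays on the reader's side rather than the publisher's.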

    --
    "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
  9. Re:Human validation by MochaMan · · Score: 2, Insightful

    you can spend 3 seconds typing in the word "uNFsaQ" to prove you're human.

    Unless you happen to be a blind blogger. With all the effort people have put into accessibility there's got to be a validation method that can work for the blind as well.

    Just mentioning this because I've seen this complaint several times from blind users on Slashdot.

  10. Re:Welcome to Slashdot. by AngryElmo · · Score: 2, Insightful

    Freedom of speech is not a guarantee of audience.

  11. Re:Welcome to Slashdot. by ciroknight · · Score: 1, Insightful

    For fuck's sake, neither is the Internet. Just because I put something on the Internet, doesn't mean anyone's ever going to see it!!!!

    If Google brings it up as a top match, USE ANOTHER SEARCH ENGINE. The problem is you think services should cater specifically to you, while the company that runs the services is trying to think of the greater good of everyone.

    Freedom of Speech is all the Internet is. Audience is you. If you want to look at the site, go right on ahead, and if you don't, then you know how to avoid it, assuming you are an intelligent human being, and care enough to do the work to avoid it. If not, then use a better service that caters more specifically to you. It's as simple as that.

    --
    "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
  12. Re:2 years and no one will care by Absentminded-Artist · · Score: 5, Insightful

    I have been coding webpages since March of 1995. I have learned HTML 1.0, 2.0, 3.0, 4.0 and now CSS1.0 and CSS2.0 and... As exciting as all that can be, sometimes I just want to post my thoughts and be done with it. There's nothing wrong with efficiency. Blog sites can be great time savers. I used to have a web journal, wrote entries in my Palm Pilot, hotsynced the data to my Mac and ftp'd it onto my server using Applescript - all the while snorting at all the newbies using blog sites. Then I decided I valued my time more. I opened up a blog in January of this year (http://thesplinteredmind.blogspot.com/) and have had a blast. I post once a week.

    Now, my blog isn't going to be popular. I cover mostly neurological problems and how to deal with them. But I've had some fascinating discussions with complete strangers because of my blog and I'll continue blogging into the foreseeable future. Because of Google many people find my blog despite it being a small fish in a big and noisy blog sea. Google is a great tool and I'm glad they index blogs. Now, I'm as upset as the next guy about spam blogs, but "crap" blogs are relative. You may read my blog and find it lame. Others, including myself, would disagree with you. But if you don't find the subjects I write about interesting or valuable, so what?

    Slashdot cracks me up sometimes. What is it to some of you guys if somebody wants to blather on and on about their breakfast or their boyfriend? If the site is a bore move on, but you could tell that from the Google search, right? Seriously, I haven't found many blogs that come up in my searches that aren't related to my searches. Not as much as parked domain sites and adsense whores at any rate.

    Not all bloggers can't be bothered to code a web page. In fact, because I do code I'm able to personalize my site. Every month I tinker and tinker with the code when I find some time. Blogging may be an exercise in vanity, but then so is hosting your own website. In fact, the whole web publishing scene is about personal expression, and what's wrong with that?

    --
    The Splintered Mind - Overcoming
  13. Re:Human validation by MochaMan · · Score: 2, Insightful

    No, I think you're right on there. From my understanding of it, most blind users are using screen readers to navigate the web, so enhancing whatever software it is that generates the images to also produce a sound file would probably suffice.

    I'm not entirely sure how many blind *and* deaf users there are browsing the web unassisted, but I suppose a broader solution would depend on what technology they're using to browse the web. Some form of braille reader, perhaps? If anyone knows the answer to this, I'd be very curious to know.

    If that were the case, I suppose a more universal, text based solution would be required -- this makes more sense to me anyway. Plus, Lynx users can be happy :)
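
    A text-based check of the kind suggested above might look something like the sketch below. It is an assumption-laden illustration, not a known implementation: the names make_challenge and check_answer are hypothetical, and a plain arithmetic question is far easier for a targeted bot to solve than a distorted image. What it does show is a challenge made only of text, which a screen reader, a braille display, or Lynx can present without extra work.

```python
import random

# Spelling the operands out as words is a small obstacle for naive bots.
_NUMBER_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def make_challenge(rng: random.Random):
    """Return (question_text, expected_answer) for a plain-text challenge."""
    a = rng.randrange(1, 10)  # 1..9 inclusive
    b = rng.randrange(1, 10)
    question = (
        f"What is {_NUMBER_WORDS[a]} plus {_NUMBER_WORDS[b]}? "
        "Answer in digits."
    )
    return question, a + b


def check_answer(expected: int, submitted: str) -> bool:
    """Return True if the submitted text is the expected number."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input counts as a failed attempt.
        return False


if __name__ == "__main__":
    question, expected = make_challenge(random.Random())
    print(question)
    print(check_answer(expected, input("> ")))
```

    The server would keep the expected answer (for example in the session) and compare it against the submitted form field before accepting the comment.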