The Ham and Spam of Weblogs
An anonymous reader submits "Will the blogosphere become just as spammy as Usenet? There may be over 10M weblogs out there, but most of them seem to be fake spam blogs created to manipulate the search engines. Scott Johnson, CTO at Feedster, complained that 'at times we see upwards of 90% of the traffic from Blogspot being spam,' and the problem is likely to only get worse. Can blog search engines like Technorati, Feedster, and PubSub filter the signal from the torrent of noise? Or will we have to seek new approaches such as the social filtering used by Del.icio.us or collaborative filtering used by Findory to separate the ham from the spam?"
With email spam filtering you have to consider each email separately. A blog has a persistent identity and reputation. In theory, this should make it easier to filter blog spam than email spam. One result of this type of filtering is that it will penalize new blogs in search results, both spammy and real.
Blog comment spam will remain a problem, of course.
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
Slashdot is a blog, created in the context of a news site, where we all come to bitch about what we want out of technology, what we think is cool, and what we hate and want everyone to know why.
That being said, Google (along with other large search engines) has already taken a stance on blogging, and each is actively pursuing its own. For most, this means creating their own blog service and doing some shifting in their code to make sure blogs don't come out on top. But this isn't an absolute truth.
If you want these things, and Google doesn't offer them, make your own search engine, and do it better. No, seriously, don't look at me like I'm crazy; there have been over a dozen "major" search engines created after Google, some are only in serious use by geeky populations (AlltheWeb, as far as I can tell, fits this), some by the trendy, some by the "I hate Google"ites, etc. etc. It's as simple as that.
One reason I think Google has strayed from taking such a hard line on blogs is simply ease of use. Google doesn't want to complicate life with a million more search options, especially ones you can deal with yourself by subtracting out the majorly offensive sites (-livejournal -blogger -blogspot, etc).
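For anyone who does this a lot, the subtraction trick can be scripted. This is only a rough sketch: the helper name `exclude_terms` is made up for illustration, and it just builds the query string with a minus sign in front of each unwanted term. It doesn't talk to any search engine.

```python
def exclude_terms(query, unwanted):
    """Append a '-term' exclusion for each unwanted term (hypothetical helper)."""
    exclusions = [f"-{term}" for term in unwanted]
    return " ".join([query] + exclusions)


# Example terms taken from the comment above; the query itself is arbitrary.
filtered_query = exclude_terms(
    "concert review", ["livejournal", "blogger", "blogspot"]
)
print(filtered_query)
```

Paste the resulting string into the search box as usual.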
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
People are using blogs and forums to post links to their own sites. These links show up as backlinks to Google, and due to Google's ranking procedure that determines which website is the most relevant to each search, each extra backlink pointing to a website can effectively make that website more relevant in the searches.
Luckily, Google is one step ahead of the spammers, and has allowed only one link from each forum to contribute as a valid backlink. Therefore, having 100 forum signatures linking to www.spamdomain.com will no longer give credit for 100 backlinks; only one backlink will be credited towards www.spamdomain.com. The problem is, a lot of people have not yet realised that Google has done this, and as a result, people are still adding 8+ forum signature links in their posts, hoping to cheat the search engine ranking system.
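To make the "one link per forum" rule concrete, here is a minimal sketch. It is not Google's actual ranking code, which isn't public; it simply assumes that "one forum" means "one hostname" and counts each linking host once. The function name `credited_backlinks` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def credited_backlinks(linking_urls):
    """Count at most one backlink per linking host (assumed rule, for illustration)."""
    hosts = {urlparse(url).netloc.lower() for url in linking_urls}
    hosts.discard("")  # ignore malformed URLs that have no host part
    return len(hosts)


# 100 signature links from the same forum, plus one link from a second forum.
signature_links = [f"http://forum.example.com/thread/{n}" for n in range(100)]
signature_links.append("http://otherforum.example.org/post/7")

print(len(signature_links))                 # raw number of links
print(credited_backlinks(signature_links))  # links that would actually count
```

Under that assumption, the 100 signatures are worth exactly as much as one.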
Valkyrie is about to die! Wizard needs food -- badly!
In the Wired article (I know this isn't about spam, but what the hell):
"Lately, it seems like almost every time you tune into your favorite Blogger-hosted blog to catch up on the latest gossip, meme, political diatribe or cybersnark, you find that the site is frozen in time. Or, there are multiple posts with identical content."
Uh, no, not as far as I can tell. "Frozen in time," perhaps, after someone decided to stop blogging, but I used Blogger for six months and never had a single hitch. Apparently, googling "blogger sucks" gives you thousands of sites bitching about Google's service.
Sometimes there are outages, when you can't get in to alter a post or something similar, but those were few and far between (at least, they happened less than half a dozen times in six months, and they only lasted a few hours).
I guess this is a sign of how popular Blogger is. I mean, the only way to balance my experience (zero fatal errors in six months) with thousands of complaints is to assume that there are a HELL of a lot of bloggers out there.
Oh, and to those bitching in general about blogs: please shut up. Yes, there are annoying vanity blogs, but blogger -- and the blogging concept -- has been a godsend to specialists, as well as to political organizing.
Protect your liberties. Donate to the ACLU
Okay, great, so 90% of it is crap. It's a given, call it whatever you want. My personal favorite is the "long tail effect".
Built into blogs is a way to tell the crap from the good stuff -- they're linked together intelligently by people who can tell crap apart, and the people who don't write crap don't link to crap. So find one good blog, and you've found a hundred or more good ones just three levels deep in links. Go one more level, and there are a thousand. It's exponential. And chances are, most of them will be of the same caliber as the root blog, with that chance decreasing slightly as you follow deeper links.
So who cares if there's spam. It's not the same with blogs as with email -- the spam is intelligently filtered automatically, just by the normal process of each writer.
Now, if only Google could figure out a good algorithm to track it. It wouldn't be that hard. Just rate a few blogs by hand with a content value and a link value, and automatically give all their descendants (pages they link to, directly or indirectly) a rating of (the parent site's rating)-[(number of levels deep)*(some adjustment factor)] where the adjustment factor is somewhere around maybe .75 so that links lose value the further away from the parent they are. It could be tweaked, but I think it'd work.
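Here's roughly what that could look like, as a sketch only. It assumes the link graph is already available as a dict mapping each page to the pages it links to, uses a single rating per blog rather than separate content and link values, and stops after a few levels. The names `propagate_ratings`, `links`, `seed_ratings` and `max_depth` are all made up for the example. When a page is reachable from a seed by several paths, the shortest one is used, and hand-rated blogs keep their hand rating.

```python
from collections import deque


def propagate_ratings(links, seed_ratings, adjustment=0.75, max_depth=3):
    """Rate pages as seed_rating - depth * adjustment (illustrative sketch)."""
    ratings = dict(seed_ratings)  # hand-rated blogs keep their ratings
    for seed, seed_rating in seed_ratings.items():
        queue = deque([(seed, 0)])
        seen = {seed}
        while queue:
            page, depth = queue.popleft()
            if depth == max_depth:
                continue  # do not follow links any deeper
            for child in links.get(page, []):
                if child in seen:
                    continue
                seen.add(child)
                child_depth = depth + 1
                if child not in seed_ratings:
                    score = seed_rating - child_depth * adjustment
                    # Keep the best score if several seeds reach this page.
                    ratings[child] = max(ratings.get(child, score), score)
                queue.append((child, child_depth))
    return ratings


# Toy link graph: "a" is the one blog rated by hand.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
ratings = propagate_ratings(links, {"a": 10.0})
for page in sorted(ratings):
    print(page, ratings[page])
```

The breadth-first walk matters here: it guarantees a page is first reached at its shallowest depth, so it gets the highest rating the formula allows from that seed.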
"Is there a qualitative difference between the two types of social interaction?"
If I didn't believe in communication through mediation, I wouldn't be here on /. right now. ;)
While you did answer your own question ("Probably..."), I do like your response. You raise good questions. I definitely don't believe that only face-to-face communication is real social interaction, but I could have been clearer on this point. I'm not an absolutist, and I'm not pining for the dark ages or anything like that.
Anyway, my real point is that these online substitutes are serving more and more people as substitutes for the real thing, to the point where young'uns are being brought up not knowing that there is a difference. Instead of getting together (in cases where they are actually able to) they go online and "chat". Mediated communication inherently encourages more mediation because we as human beings form habits. And while mediation can still produce relationships (I can't deny that), they are less rich than direct unmediated ones. And technology is inherently a mediator, no getting around it (pun slightly intended).
To be perfectly honest though, most face-to-face relationships are just as mediated as those maintained through technology. Real-world mediators include our political and religious views, our egos, etc. which inhibit our ability to relate directly and honestly with one another just as much as the inability to see facial expressions on a forum.
I definitely use technology where appropriate to augment relationships at distances. I only see my family twice a year, but I keep in touch via telephone all the time, and I post photos to flickr for them to see. My sisters email me once in a while, which is great too. These things definitely have value, but they are no substitute for being able to see and hug my family. They simply help make the time between visits bearable.
Cheers,
Lux
Separating them might be nice, true, but I thank Google for every time I've found exactly what I was looking for on a blog, especially when it was something really obscure that needed a human opinion, like a stupid setting in Windows I'm looking for, or some review of a concert that I missed. Blogs are information too; often better information than you can get anywhere else. I think what you're really angry at are "those stupid parked domain search sites", which are a little different. Just a bit.
Perhaps.
But being able to program and being able to program well are two different things. And even if they become an expert programmer, that doesn't mean they will know how to use it properly.
Someone had to design those Javascript butterflies that follow my cursor around.
And seriously. HTML is not hard at all to learn. Or at least not so hard to learn to be able to put up a web page.
404 Not Found. Way to go! You found one very effective way to take down spam blogs: Slashdot 'em!
Still, I wish I could have studied that page for comparison. I found http://bobthebuilder123.blogspot.com/ one day in my blog referrer logs. I wondered why people interested in Bob the Builder had linked to me. They hadn't. The whole page is nothing but spam - all posted on one sunny day this month. If you can help me see what gonorrhea has to do with Bob the Builder I'd be very much obliged.
At any rate, I'd pointed this site out to blogger.com but would like to know what was different with the page you linked to from "Bob's" page because it's still up. Interesting side note, though. With the changes Google made to their page ranking system recently these stupid blogs may fade away. I can't find http://bobthebuilder123.blogspot.com/ in Google. No page ranking, no purpose for existing.
The Splintered Mind - Overcoming
I wonder if Google is letting it remain easy to make spam blogs w/ Blogger in order to get more data samples, to fine-tune their filters? i.e. replicate the internet problem in the small, with controllable parameters.
After all, why run through the entire gamut of blog styles and presentation formats, when you can just examine content-only from your own servers?
Google for "new idria, ca"
The first link *is* relevant, and maybe 2 more on the first Google page are as well.
The rest? PURE CRAP. Lawyers in New Idria, CA? Job listings? Home appraisals? All just SPAM.
(FYI, New Idria, CA is a ghost town. It has a population of 3. There are no homes being sold, and thank god, no lawyers there either.)
So, I was looking for further history & photos and I was flooded with marketing garbage. Take a look at some of the URLs. It's clear that they're trying to boost their rank based on city names and not actually relevant content.