The Ham and Spam of Weblogs
An anonymous reader submits "Will the blogosphere become just as spammy as Usenet? There may be over 10M weblogs out there, but most of them seem to be fake spam blogs created to manipulate the search engines. Scott Johnson, CTO at Feedster, complained that 'at times we see upwards of 90% of the traffic from Blogspot being spam,' and the problem is likely to only get worse. Can blog search engines like Technorati, Feedster, and PubSub filter the signal from the torrent of noise? Or will we have to seek new approaches such as the social filtering used by Del.icio.us or collaborative filtering used by Findory to separate the ham from the spam?"
I wish Google had an option to exclude blogs from my search. Considering many blogs use b2evolution, phpBB, or whatever, Google could easily determine what IS a blog and what IS NOT and filter it accordingly. Google IMHO would be a much better place if I could exclude blogs and those stupid parked domain search sites from my queries.
I'm not trying to be flamebait; it would be a nice option though.
90% of EVERYTHING is crap. It just happens that weblogs trend toward a specific TYPE of crap -- SPAM. I mean you may think JeffK is crap, but some of us find him funny, so anything with actual content has to be not crap to somebody (if only the creator). That means all the crap must be content-free.
Mal-2
How is the Riemann zeta function like Trump rallies? Both have an endless number of trivial zeros.
The guy makes a good point...human validation via captcha. If you're going to spend 10 minutes complaining, whining, bragging and/or loathing about something then you can spend 3 seconds typing in the word "uNFsaQ" to prove you're human.
If it takes you less than 10 minutes to write in your dear diary--I mean blog--then it's probably a 1 liner to the effect of "i think she likez me omglolbbq!!!" and you need to get off my internet.
Problem solved. Next?
"blogosphere"? Considering that blogs are probably the dumbest form of communication possible (a linear log of rambling bullshit) I can only hope that the Blogosphere is destroyed by the Vogon Constructor Fleet to make way for a colonic bypass.
With email spam filtering you have to consider each email separately. A blog has a persistent identity and reputation. In theory, this should make it easier to filter blog spam than email spam. One result of this type of filtering is that it will penalize new blogs in search results, both spammy and real.
Blog comment spam will remain a problem, of course.
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
Who is going to pay $1 to read about how your boyfriend dumped you last week and you're still crying in bed? Blog comment spam isn't all that hard to get rid of (filter links, filter content, or if you're just worried about search engines, use rel="nofollow").
Anyone who has a blog that you have to pay to comment on (or to see) isn't going to get much traffic.
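For what it's worth, the rel="nofollow" option mentioned above could be sketched roughly like this. It is a minimal illustration, not any particular blog engine's code: the function name add_nofollow is made up, and the regex handling is deliberately naive (a real implementation would use a proper HTML parser).

```python
import re

# Deliberately simple patterns, for illustration only.
# _A_TAG matches an opening <a ...> tag; _REL_ATTR detects an existing rel attribute.
_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
_REL_ATTR = re.compile(r"\brel\s*=", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" added to links that lack a rel attribute.

    Hypothetical helper: the name and behavior are assumptions for this sketch.
    """

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        if _REL_ATTR.search(attrs):
            # Leave tags that already declare rel untouched.
            return match.group(0)
        return f'<a{attrs} rel="nofollow">'

    return _A_TAG.sub(fix, comment_html)
```

Running every submitted comment through something like add_nofollow before storing it means links in comments are marked so that search engines honoring nofollow give them no ranking credit, which removes most of the incentive for comment spam aimed at search rankings.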
I just wanted to point out that so-called "social software" is not social. Person-to-person communication through computers is mediated and indirect. Technology is a barrier to communication as much as it is an enabler. I agree that it is an enabler in situations where it is used to help overcome disabilities and things of that nature; however, technology is used more so by people who are actually avoiding being social. Email is often preferable to a telephone because it creates an additional barrier between ourselves and the "recipient" (aka person).
A prime example of software in a "social" context is the chatter that accompanies networked video games. This does not form real relationships between people. I heard a teenager recently say that his gaming buddies, who he doesn't even know by name, are like family to him. Technology has helped a whole generation and then some to fail to learn what real relationships are. When a teenager can't distinguish between somebody he's only ever witnessed virtually shoot ze germans and the people who nurtured him before he was able to take care of himself, we have a problem Houston.
And it's only getting worse. Now we've begun adding "social" in front of all kinds of new web applications. Anything that lets other users see your profile and the items you post and comment on them is seen as a valid replacement for real human contact.
There was a line from a movie I saw recently called Crash, where Don Cheadle's character says to his girlfriend "It's the sense of touch. Any real city you walk, you know. You brush past people, people bump into you. In L.A., nobody touches you. We're always behind this metal and glass. I think we miss that sense of touch so much, that we crash into each other just so we can feel something." The next time we use the word "social" to describe a new type of web application, I think we should give that some thought first.
putfwd.com - 1GB Free file storage with a twist
It was a bit unintuitive how you add sites to the filter list though -- just cut and paste "http://*.whatever.com/*" into your extensions list and any search results from whatever.com will then be greyed out.
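A rough sketch of how a wildcard entry like that can be matched against result URLs, assuming simple shell-style globbing. How the extension actually interprets its patterns is not stated here, and the names FILTER_PATTERNS and is_filtered are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern copied from the comment above; the list name is an assumption.
FILTER_PATTERNS = ["http://*.whatever.com/*"]


def is_filtered(url: str, patterns=FILTER_PATTERNS) -> bool:
    """Return True if url matches any shell-style wildcard pattern in patterns."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)
```

Under this interpretation a pattern starting with "*." only matches subdomains such as www.whatever.com, so the bare domain would need its own entry.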
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Slashdot is a blog, created in the context of a news site, which we all come to and bitch about things we want out of technology, think is/are cool, and/or hate and want everyone to know why.
That being said, Google (along with other large search engines) has already taken a stance on blogging, and each is actively pursuing its own stance. For most, this means creating their own blog service, and doing some shifting in their code to make sure blogs don't come out on top. But this isn't an absolute truth.
If you want these things, and Google doesn't offer them, make your own search engine, and do it better. No, seriously, don't look at me like I'm crazy; there have been over a dozen "major" search engines created after Google, some are only in serious use by geeky populations (AlltheWeb, as far as I can tell, fits this), some by the trendy, some by the "I hate Google"ites, etc. etc. It's as simple as that.
One reason I think Google's strayed from taking such a hard line on blogs is simply ease of use. Google doesn't want to complicate life with a million more search options, especially ones you can deal with yourself by subtracting out the majorly offensive sites (-livejournal -blogger -blogspot, etc).
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
People are using blogs and forums to post links to their own sites. These links show up as backlinks to Google, and due to Google's ranking procedure that determines which website is the most relevant to each search, each extra backlink pointing to a website can effectively make that website more relevant in the searches.
Luckily, Google is one step ahead of the spammers, and has allowed only one link from each forum to contribute as a valid backlink. Therefore, having 100 forum signatures linking to www.spamdomain.com will no longer give credit for 100 backlinks; only one backlink will be credited towards www.spamdomain.com. The problem is, a lot of people have not realised that Google has done this yet, and as a result, people are still adding 8+ forum signature links in their posts, hoping to cheat the search engine ranking system.
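To make the rule described above concrete, here is a minimal sketch of crediting at most one backlink per linking site. It only illustrates the rule as stated in this comment; it is an assumption for illustration and not a description of how Google actually ranks pages. The function name credited_backlinks and the forum.example host are made up.

```python
from urllib.parse import urlparse


def credited_backlinks(links):
    """Count backlinks per target host, crediting each source host at most once.

    links is an iterable of (source_url, target_url) pairs.
    Hypothetical helper illustrating the one-link-per-forum rule described above.
    """
    seen = set()   # (source_host, target_host) pairs already credited
    credit = {}    # target_host -> number of credited backlinks
    for source_url, target_url in links:
        source_host = urlparse(source_url).hostname
        target_host = urlparse(target_url).hostname
        pair = (source_host, target_host)
        if pair in seen:
            continue  # repeat link from the same site: no extra credit
        seen.add(pair)
        credit[target_host] = credit.get(target_host, 0) + 1
    return credit


# 100 signature links from one forum count as a single backlink.
signature_links = [
    (f"http://forum.example/post/{i}", "http://www.spamdomain.com/")
    for i in range(100)
]
print(credited_backlinks(signature_links))
```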
Valkyrie is about to die! Wizard needs food -- badly!
In the Wired article (I know this isn't about spam, but what the hell):
"Lately, it seems like almost every time you tune into your favorite Blogger-hosted blog to catch up on the latest gossip, meme, political diatribe or cybersnark, you find that the site is frozen in time. Or, there are multiple posts with identical content."
Uh, no, not as far as I can tell. "Frozen in time," perhaps, after someone decided to stop blogging, but I used blogger for six months and never had a single hitch. Apparently, googling "blogger sucks" gives you thousands of sites bitching about google's service.
Sometimes there are outages, when you can't get in to alter a post or something similar, but those were few and far between (at least they happened less than half a dozen times in six months, and they only lasted a few hours.)
I guess this is a sign about how popular blogger is. I mean, the only way to balance my experience (zero fatal errors in six months) with thousands of complaints is to assume that there are a HELL of a lot of bloggers out there.
Oh, and to those bitching in general about blogs: please shut up. Yes, there are annoying vanity blogs, but blogger -- and the blogging concept -- has been a godsend to specialists, as well as to political organizing.
Protect your liberties. Donate to the ACLU
I'd say blogs are more than just what you've said. Hear me out.
Blogs are a new form of communication. Before, we had "editorials" which were published in newspapers, where someone of stature made their opinion well known, simply to spark debate and interest in the public's mind. Now it is everyone's turn to have their own editorial, and to foster debate and discussion. Welcome to Slashdot, by the way.
Secondly, they offer a form of sympathy to the author; normally someone either says "I like your book" or "I don't like your book". This gives people a chance to say "Well, I liked your book, but the ending could be better. I don't think Saffron shoulda died when she fell into the swimming pool" or something like that. Sometimes it's rewarding to write something, but you never know how other people relate to it, and this is just a great opportunity to get that feedback, instantly.
Lastly, it's an insight into the person. It shows what that person values by what they write about often. It shows how educated the person is by word choice and by sentence structure. It shows how thoughtful the person is when they ask questions. It shows how we're different, as people.
Honestly, I think the problem is that nobody thought about the problem before it existed. When we thought of the Internet, we thought of it as a number of infinitely flexible services accessible by port interfaces. When we sat down and thought of the way we wanted to put the web together, we wrote a common interfacing language, and ways of accessing that information, by a standard, over the internet. But what we didn't think of was how different the kinds of media transported over the internet would become. Had we thought of it, we might be using blog:// to access blogs today, instead of a certain http address, just as we might be using images:// or video://. Honestly, it shows how well the original system was designed, but then again it also shows how we pretty much stopped designing the system after it solved our problem (same with email, IMO).
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
If you have a few minutes, click on the randomizer button at the top of the screen that reads "Next Blog" a couple of times. I'd be willing to say that at least 2 out of every 10 blogs is a spam farm.
It's just fucking sad.
Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
Actually, Usenet is doing quite well. The spam battle has been won; there's very little spam in the technical groups. Serious workers in difficult fields are on there. Check out, say, "comp.games.development.programming.algorithms", where the people who write physics engines discuss how to do it. Or "comp.std.c++.moderated", where proposed changes to C++ are discussed. Usenet has far lower advertising content than the Web, where, today, "content" seems to be a little box in the middle of the page, surrounded by blinking ads.
Perhaps.
But being able to program, and being able to program well, are two different things. And even if they become an expert programmer, that doesn't mean that they will know how to use it properly.
Someone had to design those Javascript butterflies that follow my cursor around.
And seriously. HTML is not hard at all to learn. Or at least not so hard to learn to be able to put up a web page.
I have been coding webpages since March of 1995. I have learned HTML 1.0, 2.0, 3.0, 4.0 and now CSS1.0 and CSS2.0 and... As exciting as all that can be, sometimes I just want to post my thoughts and be done with it. There's nothing wrong with efficiency. Blog sites can be great time savers. I used to have a web journal, wrote entries in my Palm Pilot, hotsynced the data to my Mac and ftp'd it onto my server using Applescript - all the while snorting at all the newbies using blog sites. Then I decided I valued my time better. I opened up a blog in January of this year (http://thesplinteredmind.blogspot.com/) and have had a blast. I post once a week.
Now, my blog isn't going to be popular. I cover mostly neurological problems and how to deal with them. But I've had some fascinating discussions with complete strangers because of my blog and I'll continue blogging into the foreseeable future. Because of Google many people find my blog despite it being a small fish in a big and noisy blog sea. Google is a great tool and I'm glad they index blogs. Now, I'm as upset as the next guy about spam blogs, but "crap" blogs are relative. You may read my blog and find it lame. Others, including myself, would disagree with you. But if you don't find the subjects I write about interesting or valuable, so what?
Slashdot cracks me up sometimes. What is it to some of you guys if somebody wants to blather on and on about their breakfast or their boyfriend? If the site is a bore move on, but you could tell that from the Google search, right? Seriously, I haven't found many blogs that come up in my searches that aren't related to my searches. Not as much as parked domain sites and adsense whores at any rate.
Not all bloggers can't be bothered to code a web page. In fact, because I do code I'm able to personalize my site. Every month I tinker and tinker with the code when I find some time. Blogging may be an exercise in vanity, but then so is hosting your own website. In fact, the whole web publishing scene is about personal expression, and what's wrong with that?
The Splintered Mind - Overcoming
Like email spam, these sites will continue to exist so long as people click on the links, thus supporting the business model.
RichM
Data Center Knowledge
Google for "new idria, ca"
The first link *is* relevant, and maybe 2 more on the first Google page are as well.
The rest? PURE CRAP. Lawyers in New Idria, CA? Job listings? Home appraisals? All just SPAM.
(FYI, New Idria, CA is a ghost town. It has a population of 3. There are no homes being sold, and thank god, no lawyers there either.)
So, I was looking for further history & photos and I was flooded with marketing garbage. Take a look at some of the URLs. It's clear that they're trying to boost their rank based on city names and not actually relevant content.