95% of User-Generated Content Is Bogus
coomaria writes "The HoneyGrid scans 40 million Web sites and 10 million emails, so it was bound to find something interesting. Among the things it found was that a staggering 95% of User Generated Content is either malicious in nature or spam." Here is the report's front door; to read the actual report you'll have to give up name, rank, and serial number.
Animals shit in ~95% of their habitat...
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
I got ripped in 2 weeks. learn how with secret juice formula.
That is so untrue. There is value in what I write.
We know.
You see? You see? Your stupid minds! Stupid! Stupid!
The fact is that there are millions of old blogs, unused forums, ancient guestbooks, etc. that are easy to spam automatically. While it might very well be true that 95% of comments on the internet are spam of some sort, they're probably read by a tiny fraction of internet users. People tend to stick to about a dozen big sites that get very little rubbish posted on them at all.
Car analogy: 95% of cars are rusty old heaps of crap that can't move. Thankfully they're in scrapyards and not on the roads.
http://twitter.com/onion2k
And in addition, the report itself doesn't even explain the result. It's a bullet point at the beginning of the report, but there's no explanation or analysis.
Breakfast served all day!
...95% probability actually. So I didn't bother.
These posts express my own personal views, not those of my employer
I guess that goes hand in hand with 95% of kdawson's submissions being crap and not worth the time.
Be seeing you...
Every single hour the Internet HoneyGrid scans some 40 million websites for malicious code as well as 10 million emails for unwanted content and malicious code.
So 40 million sites per hour is 960 million sites per day. Wikipedia says that there are over 25 billion pages, but can that number be accurate?
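For anyone who wants to check the arithmetic, here is a rough sketch. It uses the 40 million per hour figure from the report and the 25 billion page figure quoted above. The big assumption (mine, not the report's) is that every scan hits a different page and that "sites" and "pages" can be compared at all, so treat the result as a back-of-the-envelope number only. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the scan-rate numbers in this thread.
# Assumption (not stated in the report): each scan covers a distinct page.

SITES_PER_HOUR = 40_000_000     # figure quoted from the report
HOURS_PER_DAY = 24
TOTAL_PAGES = 25_000_000_000    # page count quoted in the comment above


def scans_per_day(per_hour=SITES_PER_HOUR):
    """Return the number of scans in one day at a constant hourly rate."""
    return per_hour * HOURS_PER_DAY


def days_to_cover(total=TOTAL_PAGES, per_hour=SITES_PER_HOUR):
    """Return how many days it takes to scan `total` pages once each."""
    return total / scans_per_day(per_hour)


if __name__ == "__main__":
    print(f"Scans per day: {scans_per_day():,}")
    print(f"Days to cover all pages once: {days_to_cover():.1f}")
```

Under those assumptions it would take roughly 26 days to touch every page once, so the two numbers are at least not contradictory.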
The subtext of this article is that you should forget about letting users create content on the Internet, because all they do is create junk and try to scam good honest people. Just leave the content creation to the institutions, and media conglomerates who know how to do it. It's safer that way, and you'll like it.
Well, I don't care if 99% of user-generated content is crap; people need to be free to create it, because some individual in the other 1% may just come up with the cure for cancer, and despite whatever it does to Big Pharma's profits, everyone needs to be able to hear about it.
"95% of User Generated Content is either malicious in nature or spam"
"Never attribute to malice that which can be adequately explained by stupidity"
So I read it as "95% of User Generated Content is stupid." I agree, count me in.
In human terms, the majority of computers have AIDS. And we all know where they caught it.
Your mom?
"Ninety percent of everything is crud."
http://en.wikipedia.org/wiki/Sturgeon's_Law
I would say that 95% of email is commercial in nature, and not "user generated content". To me "UGC" is something that people who are actually active users (consumers as well as creators) of a service generate... not something injected into the service from outside by predators.
Out of the 5% that are not generated by spambots, 99% is still generated by idiots.
... a staggering 95% of User Generated Content is either malicious in nature or spam.
Considering 95% of internet users are malicious (see GIFT), it's hardly staggering that 95% of user generated content is malicious too. :p
"Convictions are more dangerous enemies of truth than lies."
We've seen this before, with Usenet, BBSes, MUDs, and email. The advertisers, and the trolls, find it easy to spew their material across many thousands of targets, and get enough money or gratification from doing so that it funds their efforts. It doesn't even have to make money: they just have to believe that it _can_ make money, and the professionals will simply continue.
Whatever would make anyone think that "User Generated Content" forums would be any different?
It matters a lot how they get their "sample": honeypots, honeyclients, reputation systems and "advanced grid computing systems" (whatever that is). What is feeding information to that sample? Not old sites with legitimate content that have been sitting around for years, but in good part spammers, botnets, and people who want your PC to become part of one. And email is already known to be 95% spam. The sample is just too rigged to bear any relation to what is really on the internet or what you have some chance of seeing.
Email spam aside, I would say that most of that is Google's fault. The other 95% of content created on the internet is an attempt to SEO web sites in the other 5% of the internet that people do potentially read or visit. Google encourages webmasters to get inbound links, thus the whole industry of spamming sites, directories, blog feed sites, and so on that have one purpose and one purpose only: getting as many anchor text links pointed to sites as possible so they will rank higher in Google for key terms.
Living in Chile
I take it that means there is a 95% chance that this report is bogus, or malicious?
Insightful and funny are really the same thing, except one has a punch line.
I'll have to change it from "Everything" to "95% of everything". :-(
Fact: Everything I say is fiction.
First, here's the actual report, without any form to fill out. (Backup copy at WebCitation.) Amusingly, the report is clearly written for a target audience who prints out PDF files on paper. It contains charts in tiny type.
The report covers the usual email issues, which will be familiar to Slashdot readers. New issues for 2009 are the following:
The report identifies Google's weak security in their search engine as a problem. Microsoft's Internet Explorer remains a problem, of course, but Google is now the attack target of choice to drive traffic to a site that can attack the browser. Google still, apparently, hasn't figured out a good way to prevent link farms from driving up search position.