95% of User-Generated Content Is Bogus
coomaria writes "The HoneyGrid scans 40 million Web sites and 10 million emails, so it was bound to find something interesting. Among the things it found was that a staggering 95% of User Generated Content is either malicious in nature or spam." Here is the report's front door; to read the actual report you'll have to give up name, rank, and serial number.
It seems that, as well as anyone can estimate, the current population really is about 5% of all the humans who have ever lived.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
It is no different from domain names. Type a random sequence of 4 characters followed by .com, and the vast majority of the time you will get some fairly innocuous spam site with no real content, e.g. dneo.com (picked at random).
But it doesn't interfere much with most people's use of the web.
How they get their "sample" matters a lot: honeypots, honeyclients, reputation systems, and "advanced grid computing systems" (whatever those are). What is feeding information into that sample? Not old sites with legitimate content that have been sitting around for years, but in good part spammers, botnets, and people who want your PC to become part of one. And mail is already known to be 95% spam. The sample is just too rigged to say anything about what is really on the internet or what you have some chance of seeing.
Sorry to hijack this, but http://securitylabs.websense.com/content/Assets/WSL_ReportQ3Q4FNL.PDF seems to be the direct link to the paper.
"Anonymous could not immediately be reached for further comment." - International Business Times