Google's Research on Malware Distribution
GSGKT writes "Google's Anti-Malware Team has made available some of their research data on malware distribution mechanisms while the research paper[PDF] is under peer review. Among their conclusions are that the majority of malware distribution sites are hosted in China, and that 1.3% of Google searches return at least one link to a malicious site. The lead author, Niels Provos, wrote, 'It has been over a year and a half since we started to identify web pages that infect vulnerable hosts via drive-by downloads, i.e. web pages that attempt to exploit their visitors by installing and running malware automatically. During that time we have investigated billions of URLs and found more than three million unique URLs on over 180,000 web sites automatically installing malware. During the course of our research, we have investigated not only the prevalence of drive-by downloads but also how users are being exposed to malware and how it is being distributed.'"
Three million out of billions is not bad, assuming randomness (only, say, a 1 in 1000 chance of hitting a bad URL), but it is a lot worse than 180k out of billions.
However, not all URLs are used equally. Bad URLs linked from some popular pron site, for instance, will get hit a lot more than Joe Sixpack's Facebook page.
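To put rough numbers on the "not used equally" point, here is a minimal Python sketch. It assumes "billions" means 3 billion (which is what makes the 1 in 1000 guess work out), and the toy web with its visit counts is entirely hypothetical, not data from Google's research. The helper names `uniform_risk` and `weighted_risk` are made up for illustration.

```python
# Figure quoted in the story summary.
bad_urls = 3_000_000

# ASSUMPTION: "billions" taken as 3 billion, matching the 1-in-1000 guess.
total_urls = 3_000_000_000


def uniform_risk(bad, total):
    """Chance of landing on a bad URL if every URL is equally likely."""
    return bad / total


def weighted_risk(pages):
    """Chance of landing on a bad URL when visits follow traffic.

    pages: list of (visits, is_bad) pairs, one per URL.
    """
    total_visits = sum(visits for visits, _ in pages)
    bad_visits = sum(visits for visits, is_bad in pages if is_bad)
    return bad_visits / total_visits


# HYPOTHETICAL toy web: 1 bad URL in 1000, but the bad one is popular.
toy_web = [(50_000, True)] + [(100, False)] * 999

print(f"all URLs, uniform: {uniform_risk(bad_urls, total_urls):.3f}")
print(f"toy web, uniform:  {uniform_risk(1, len(toy_web)):.3f}")
print(f"toy web, weighted: {weighted_risk(toy_web):.3f}")
```

In the toy web the bad URL is still only 1 in 1000 by count, but it draws about a third of all visits, so the per-visit risk is far higher than the URL count suggests. The real ratio depends on traffic data that the summary does not give.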
Engineering is the art of compromise.
Searchers won't use your engine if it does not give them what they want.
The problem with that is the number of sites that happen to host malware without meaning to. Too often the malware comes in through advertising services or sneaks through in user-generated content that would be fine if not for a browser vulnerability. Google does a lot as it is; outright blocking the sites goes too far (unless distributing malware is the only thing the site is made for, which is rare and would probably mean that the site is ranked low in the first place).