Google's Research on Malware Distribution

Posted by Soulskill on Sunday February 17, 2008 @11:45AM from the making-a-mountain-out-of-a-really-big-piece-of-rock dept.

GSGKT writes "Google's Anti-Malware Team has made available some of their research data on malware distribution mechanisms while the research paper [PDF] is under peer review. Among their conclusions are that the majority of malware distribution sites are hosted in China, and that 1.3% of Google searches return at least one link to a malicious site. The lead author, Niels Provos, wrote, 'It has been over a year and a half since we started to identify web pages that infect vulnerable hosts via drive-by downloads, i.e. web pages that attempt to exploit their visitors by installing and running malware automatically. During that time we have investigated billions of URLs and found more than three million unique URLs on over 180,000 web sites automatically installing malware. During the course of our research, we have investigated not only the prevalence of drive-by downloads but also how users are being exposed to malware and how it is being distributed.'"

23 of 83 comments

Now then... by Bluewraith · 2008-02-17 11:51 · Score: 3, Funny

Where is the page listing each of the bad sites? I'd like to get started on my Virus Aquarium
Google itself? by XanC · 2008-02-17 11:53 · Score: 3, Interesting

Did Google consider itself to be a source of malware? http://blog.opendns.com/2007/05/22/google-turns-the-page/
1. Re:Google itself? by _merlin · 2008-02-17 13:16 · Score: 2, Interesting
  
  I'd say it falls into the same category as WGA: borderline malware. The name "Browser Error Redirector" doesn't make its purpose clear to a non-technical user; it sends information to a third party without user confirmation; it is installed without user consent. The information it sends to a third party may be innocuous, and it may be possible to uninstall, but it's still far from respectable.
2. Re:Google itself? by moderatorrater · 2008-02-17 13:56 · Score: 3, Insightful
  
  "The name 'Browser Error Redirector' doesn't make its purpose clear to a non-technical user"
  
  I would argue that there is no way to make its purpose clear to the non-technical user without using at least a full sentence, probably a paragraph. For those who are familiar with the concept of error page redirection in the first place, it's a very adequate description, very honest, and the first thing I would suspect once I realized there was a problem. If it had been "Browser Helper" or "DNS Accelerator" or "Bonzi Buddy" then arguing that the name wasn't clear would be applicable; as it is, it's a specific name for a specific condition that doesn't hide what it is.
3. Re:Google itself? by moderatorrater · 2008-02-17 14:16 · Score: 2, Interesting
  
  I read that article, and honestly, it comes off as someone trying to sound smart who really isn't. "Spyware" (used in the article) isn't the term for something that changes the behavior of the computer; it would be applicable if the software reports back to google about the browsing habits, but this isn't what's described in the article. It should be considered "malware" or "adware."
  
  Further, the argument about the name seems frivolous. Expecting a non-technical user to even realize that their error pages are being changed in the first place is stretching it; to suggest that the program could somehow name itself in such a way that a non-technical user would know what it did is ridiculous. If you know about the problem, the name is as good as any I could come up with, and certainly better than anything that could properly be called "spyware".
  
  Finally, the article could be 1/3 the length if he weren't so busy talking about how morally superior he is. Granted, OpenDNS is an awesome service that I recommend far and wide, but the fact that he's fixing the problem is enough to show most people that.
Read it again by EmbeddedJanitor · 2008-02-17 12:03 · Score: 4, Insightful

There are three million bad URLs being served off 180,000 web sites.
Three million out of billions is not bad, assuming randomness (only, say, a 1 in 1000 chance of hitting a bad URL), but it is a lot worse than 180k out of billions.
However, not all URLs are used equally. Bad URLs linked to some popular pron site, for instance, will get hit a lot more than Joe Sixpack's Facebook site.

--
Engineering is the art of compromise.
1. Re:Read it again by Anonymous Coward · 2008-02-17 14:13 · Score: 2, Insightful
  
  Also, it would likely be inaccurate to assume uniform randomness for the appearance of those pages in search results. They are likely optimized to turn up for very popular queries with every SEO trick available. So it's still 3 million out of billions, but those 3 million likely get significantly more traffic than an average page.
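
A rough Python sketch of the weighting argument in this sub-thread. The function names are made up for illustration, "billions" is taken as 3 billion purely so the uniform case matches the "1 in 1000" guess above, and the 10x traffic multiplier is an invented number, not something from the paper.

```python
def uniform_exposure(bad_urls: int, total_urls: int) -> float:
    """Chance that one uniformly random URL is a bad one."""
    return bad_urls / total_urls


def weighted_exposure(bad_urls: int, total_urls: int, traffic_multiplier: float) -> float:
    """Chance that one visit lands on a bad URL when each bad URL draws
    `traffic_multiplier` times the traffic of an average good URL."""
    bad_weight = bad_urls * traffic_multiplier
    good_weight = total_urls - bad_urls
    return bad_weight / (bad_weight + good_weight)


# Assumed inputs: 3 million bad URLs out of 3 billion, and a hypothetical
# 10x traffic advantage for SEO-optimized malicious pages.
BAD, TOTAL = 3_000_000, 3_000_000_000

print(f"uniform:  {uniform_exposure(BAD, TOTAL):.4%}")
print(f"weighted: {weighted_exposure(BAD, TOTAL, 10.0):.4%}")
```

With a multiplier of 1.0 the two functions agree; any multiplier above 1 pushes the per-visit risk above the naive URL-count ratio, which is the point both posters are making.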
Re:And what platform does the malware run on? by grcumb · 2008-02-17 12:13 · Score: 5, Interesting

I found it quite interesting that the methodology of the research doesn't even bother to check sites with Mac OS X or Linux operating systems. But on the server side, Apache websites running outdated versions of PHP were singled out for comment.

In all there were twice as many compromised IIS servers as Apache, but fully 50% of all compromised Apache servers were running some version of PHP.

It was also interesting to note that computer-related websites ranked second only to social networking sites as most likely to be compromised with redirections to malware sites. Seems we might want to tone down our holier-than-thou rhetoric. 8^)

--
Crumb's Corollary: Never bring a knife to a bun fight.
Maybe Google should delist a few sites. by budgenator · 2008-02-17 12:25 · Score: 4, Interesting

It occurred to me that if Google started delisting sites that tried to implant malware into visitors' computers, then webmasters would be much more diligent about keeping the crap off their sites, or at least it would keep a few more hapless victims out of harm's way.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds
1. Re:Maybe Google should delist a few sites. by moderatorrater · 2008-02-17 14:24 · Score: 3, Insightful
  
  The problem with that is the number of sites that happen to host malware without meaning to. Too often the malware comes through advertising services or sneaks through in user-generated content that would be fine if not for a browser vulnerability. Google does a lot as it is; outright blocking the sites goes too far (unless that's the only thing that the site is made for, which is rare and would probably mean that the site is ranked low in the first place).
2. Re:Maybe Google should delist a few sites. by budgenator · 2008-02-17 15:00 · Score: 2, Interesting
  
  That first one might not be true, since hosting servers in China is very cheap, so perhaps some entities host sites in China intended for non-Chinese audience in order to cut costs.
  I remember years ago that hosts used to have a "no porn" clause in their service agreements, for fear that their IP block might get blacklisted. Now we often run into the same thing due to virtual hosting: blocking one IP address might knock 100 websites off the internet. Of course, with China some of it may be the government trying to implant surveillance Trojans.
  
  --
  Apocalypse Cancelled, Sorry, No Ticket Refunds
zero script policy for serious web use by Anonymous Coward · 2008-02-17 12:25 · Score: 3, Interesting

The problem is with the client software. I can understand the danger of sites that try to fool you into downloading and running an application, or infected media that harnesses an exploit in an application - but automatically infecting the machine just by visiting the site is beyond belief. There's a serious problem with what the "web" has become, forced upon us by reckless and naive developers. The WWW and HTML were never meant to be something that runs active code on the client. Period. Most of us realise there is no way this problem can ever be solved without revising exactly what a browser is supposed to be; as long as browsers run code instead of interpreting data, there will always be malicious sites set up to exploit this.

I have to observe a cast iron policy in my work. It means that quite a few sites on the internet are unavailable, but since they are mostly entertainment based it isn't a serious loss. No Javascript, no ActiveX, no Macromedia Flash. My activities are limited to viewing HTML and PDFs, even animated GIFs are blocked. In many years we have had no malware incidents (that I know of). Sometimes it's absolutely necessary to view a site containing potentially insecure content, so there is a "dirty machine" which is not allowed to connect to anything else and is wiped and reinstalled weekly.

The problem is that even serious academic and scientific sites (that should know better) are starting to add Flash plugins and heavy scripting, so it's getting hard for conscientious users to maintain security even where they want to. Insecure technology is being forced upon us by the site developers.

It would be nice if Google could display whether a site needs JavaScript, Flash or whatever and be able to search for HTML only content. The difficult way is to use Google Cache in text only mode of course.
Re:Search engine ranking by calebt3 · 2008-02-17 12:33 · Score: 5, Insightful

Searchers won't use your engine if it does not give them what they want.
Be careful what you ask for by davidwr · 2008-02-17 12:36 · Score: 2, Interesting

You'll start seeing people use H1 for everything. If you are lucky they'll override it with a style sheet so it doesn't look obnoxious.

I wonder if Google has ever considered a moderation system, allowing logged-in Google users to rank the results of their searches on a random and infrequent basis. It would be easy enough to have the "click here to open" link change to a "click here to open, and open survey in new tab/window" if the user said they were willing to moderate search results.

If a page got a bad "reputation" for a given search, its rank would go down for that particular search.
If a page got a bad "reputation" as a malware haven, link farm, or other abusive page, that page would be punished.
If a page got flagged as "illegal content" Google would drop the comment with a note saying "We are not the police, but please contact your local or national police. Click here for a list of national police web sites worldwide."
If a page got flagged as a copyright violation, Google would drop the comment with a note saying "We are not in the business of enforcing private court actions. To find a copyright attorney, click here."

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
1. Re:Be careful what you ask for by onepoint · 2008-02-17 14:57 · Score: 3, Interesting
  
  They have the vote for this on the toolbar, which to my knowledge works rather well if you are a heavy user and consistently vote on pages for which you do a search. I do about 40 to 80 searches per day and I am sure that I vote on 90% of them. I have come back to the same topics to search and have seen changes which were major improvements (lag time about 4 to 6 weeks).
  
  --
  if you see me, smile and say hello.
Google Malware team. by csk_1975 · 2008-02-17 13:31 · Score: 2, Interesting

Having first been unable to use Google Translate and now Google search due to the "Error - Your request appears to be virus related please scan your computer for malware" message, I do wonder how sound any Google analysis of malware is. If they have problems distinguishing between my computer, which is not malware infected, and the transparent port 80 proxy for my home cable ISP, which is shared by 100,000s of computers, some of which are obviously malware infected, then what hope is there for a useful analysis of the much more devious and murky world of drive-by installers?
Re:And what platform does the malware run on? by mrxak · 2008-02-17 13:48 · Score: 2, Informative

There's a lot more servers out there running old versions of PHP than the very latest.

--
-mrxak
Onions Will Kill You
Actually they do add a warning for infected sites by Slur · 2008-02-17 15:07 · Score: 5, Informative

One site I work on got hit by a phpBB SQL injection attack and had a tiny iframe inserted into the forum header that pointed to a well-known malware site, hightstats.net (and if you're curious the malicious script is in the strong/044 folder). Google picked up on the iframe's contents being a malicious script and added the malware warning to the search results pertaining to the forums section of our website.

I just wonder how it is that hightstats.net can still be in existence when it contains known malicious stuff that hackers are inserting into unwary websites?!

--
-- thinkyhead software and media
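
For anyone who wants to check their own templates for the kind of tiny injected iframe described above, here is a minimal Python sketch using only the standard library's `html.parser`. It is not how Google detects these pages; the class name, the 2-pixel threshold, and the sample markup (with `example.invalid` standing in for the injected host) are all assumptions for illustration.

```python
from html.parser import HTMLParser
from typing import List


class SmallIframeFinder(HTMLParser):
    """Collects the src of every iframe whose width or height is tiny."""

    def __init__(self, max_pixels: int = 2):
        super().__init__()
        self.max_pixels = max_pixels  # assumed threshold for "tiny"
        self.suspicious: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        attributes = dict(attrs)
        for dimension in ("width", "height"):
            value = attributes.get(dimension) or ""
            # Only plain pixel counts are checked; CSS sizing is ignored here.
            if value.isdigit() and int(value) <= self.max_pixels:
                self.suspicious.append(attributes.get("src") or "")
                return


def find_small_iframes(html: str) -> List[str]:
    finder = SmallIframeFinder()
    finder.feed(html)
    return finder.suspicious


# Hypothetical forum header with one injected 1x1 iframe and one normal one.
page = (
    '<div id="header"><h1>Forum</h1>'
    '<iframe src="http://example.invalid/x" width="1" height="1"></iframe>'
    '<iframe src="/chat" width="400" height="300"></iframe></div>'
)
print(find_small_iframes(page))
```

This only looks at `width` and `height` attributes, so an iframe hidden with inline CSS or injected by script would slip past it; it is a first-pass check, not a scanner.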
Nice plug for Google: by olddoc · 2008-02-17 15:33 · Score: 2, Interesting

The underlying problem is that advertising space is often syndicated to other parties who are not known to the web site owner. Although non-syndicated advertising networks such as Google Adwords are not affected...

Did you catch the above line in their article?

--
Power tends to corrupt, and absolute power corrupts absolutely.
Key points to take from the paper by The+Master+Control+P · 2008-02-17 15:41 · Score: 4, Informative

2/3 of all malware distribution sites & sites that link to them are hosted in China.
The next worst offender is the US with 1/6.
About 3.5M websites attempt to send you to exploits from 180K distribution sites.
63% of the 180K malicious sites are IIS, 33% are Apache, and a handful are other.
80% of malware not delivered through ads (e.g. via iframes) was within 4 redirects of the malware distributor.
80% of malware from ads was more than 4 redirects from the distributor.
3/4 of distribution sites and 1/2 of landing sites are in 2 blocks occupying 6.5% of IPv4.
Among drive-by downloads, 1/2 alter your startup, 1/3 attack your security, 1/4 corrupt your preferences, and 7% install BHOs.
87% of outbound connections the malware initiates are HTTP, 8.3% are IRC.
The three AV engines tested against malware retrieved by the study had detection rates of about 35, 50, and 70%.

The part I find scariest is the 3.5M malware fronts. I mean, there are only about 70M active hosts on the entire Internet - that's 5 percent! Since I think that trying to make programmers these days write secure code is a lost cause, we should focus on breaking up the software monoculture. This kind of shit really starts to lose its efficacy if only 1/4 or 1/5 of attempts even attack the right browser...
This can be fixed, but impacts ad revenue model by Animats · 2008-02-17 18:21 · Score: 3, Informative

The paper points out that most of the attacks involve redirection of some portion of page content. That's a useful piece of information, because, other than for advertising purposes, redirection of IFRAME items and images is quite rare. A useful blocking strategy would be to block all redirects below the top level page. Many ads will disappear; no great loss.
Checking for hostile full web pages is already being done. McAfee SiteAdvisor was the first to do that, then Google copied them. Our "bottom feeder filter", SiteTruth, does some of that too, although it throws out far more sites than McAfee or Google do, just by insisting that some identifiable business stand behind any page that looks commercial.
Google's revenue model depends, to some extent, on those "bottom feeder" sites: all those anonymous "landing pages", "directory pages", "made for AdWords pages", and similar junk. Those things bring in substantial AdWords revenue, although they don't usually generate much in the way of sales for advertisers. Throwing them out of the "Google Content Network" would cut Google's ad income. This is where "don't be evil" collides with Google's profitability.
This looks like a solvable problem, but the solution will come from the security companies, not the search companies. The search companies can't afford to fix it.
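
The "block all redirects below the top level page" idea can be sketched as a small policy function. This is a hedged Python illustration, not any browser's or filter's real API: the `Response` type, its fields, and the `allow` function are hypothetical, and it only covers HTTP status-code redirects, not script or meta-refresh redirects.

```python
from dataclasses import dataclass

# Standard HTTP redirect status codes.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


@dataclass(frozen=True)
class Response:
    """Hypothetical record of one fetched resource."""
    url: str
    status: int
    is_top_level: bool  # True for the page itself, False for iframes, images, scripts


def allow(response: Response) -> bool:
    """Let redirects through only for the top-level page; drop them for embedded content."""
    if response.status not in REDIRECT_STATUSES:
        return True  # not a redirect, nothing to block
    return response.is_top_level


# Hypothetical traffic for one page load.
responses = [
    Response("http://example.invalid/", 302, is_top_level=True),
    Response("http://example.invalid/ad-frame", 302, is_top_level=False),
    Response("http://example.invalid/logo.png", 200, is_top_level=False),
]
for r in responses:
    print(r.url, "allowed" if allow(r) else "blocked")
```

The trade-off is the one already described: syndicated ads that bounce through several hosts would be blocked along with the malicious chains.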
The choke point: distribution sites by quux4 · 2008-02-17 18:54 · Score: 2, Interesting

In the 10 months of data the researchers used, Google found 9,340 distribution sites. The other 180,000 sites simply redirect you to the distribution site, which is where you download the malware.

It gets better - those 9340 distribution sites are under the aegis of only 500 autonomous systems. Which means Google could send their list to those 500 AS's - and each would have (on average) around 20 malware sites to clean up. After this, Google could keep notifying AS's of the distribution sites found (less than a thousand a month).

Looks like a very measurable and approachable problem now! I can't wait for Google's spam report. (They are working on one, aren't they?)
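
The back-of-the-envelope numbers above work out as follows in Python. The only inputs are the figures quoted in this comment; the even split across autonomous systems is an assumption, and the real distribution is probably skewed toward a few of them.

```python
# Figures quoted in the comment above.
distribution_sites = 9_340
autonomous_systems = 500
months_of_data = 10

# Average cleanup load per AS, assuming sites are spread evenly.
sites_per_as = distribution_sites / autonomous_systems

# Average rate at which new distribution sites were found.
new_sites_per_month = distribution_sites / months_of_data

print(f"{sites_per_as:.1f} distribution sites per AS on average")
print(f"{new_sites_per_month:.0f} new distribution sites per month")
```

That gives just under 19 sites per AS ("around 20") and 934 new sites a month ("less than a thousand").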
Re:Search engine ranking by darthflo · 2008-02-18 00:07 · Score: 3, Informative

The GoogleBot doesn't execute JavaScript. Google listing any content from a given site means it does, to a certain point, degrade gracefully.

Also, what's your problem with JavaScript? If you ever used the Google front page (instead of your browser's quick search function or /search?q=your+query), you probably didn't mind not having to click into that textbox, now did you? JavaScript can cause some problems, but implemented sensibly (by the browser devs) it is no security threat and used responsibly (by web devs) has great benefits.