Slashdot Mirror


IE Should Use Google's Malware List

Frequent contributor Bennett Haselton writes with an idea that he thinks could help keep browsing on Microsoft's browser more secure for users -- and benefit Microsoft as a result. "Tests show that IE's malware filter performs well against other browsers that use the Safe Browsing blacklist from Google. But wouldn't IE's filter be even more effective if it used both filter lists at the same time? And are the political obstacles to that really so insurmountable?" Read on for the rest of a plan that seems a lot more than half-baked.

Most major browsers now come with a built-in blacklist of malware-infected or phishing websites, and display a warning if the user tries to access one of those sites. Internet Explorer 8 uses Microsoft's SmartScreen filter, while Firefox, Safari and Chrome all use Google's Safe Browsing API. Recent tests from NSS Labs reported that IE's filter blocked 81% of "socially engineered malware sites" from the lab's sample, while Firefox, in second place, blocked only 27%, and other browsers trailed even further behind. When NSS Labs ran a test of the different browsers' efficiency at blocking phishing sites, IE and Firefox scored about the same, both blocking about 80% of the sites in the sample.

These results left a lot of unanswered questions, such as: Why did Firefox, Safari and Chrome get such different scores in both tests (since they supposedly all use the Safe Browsing blacklist)? And why was there such a huge gap between IE's and Firefox's performance in the malware test, but such close scores for the two browsers in the phishing test (the Google Safe Browsing API page says that the database is an attempt to list both malware and phishing sites, after all)?

But I had a different question: Since Google allows anybody to use the Safe Browsing API, why doesn't Internet Explorer use it as well, in conjunction with their own blacklist, so that a site will be blocked by IE if it's present on either list? This would almost certainly increase the block rate for IE (unless the set of sites blocked by Safe Browsing was entirely a subset of the sites blocked by SmartScreen, which is extremely unlikely). Google's Terms of Use for the Safe Browsing API do require parties to obtain written permission for any usage that will result in more than 10,000 users sending "regular requests" to the API, which would obviously include Internet Explorer. But Google already serves requests for all Firefox users who have the SafeBrowsing API turned on, so for them to process requests for all Internet Explorer users might require four or five times as much computing power, not orders of magnitude more. It's impossible to guess what kind of deal Microsoft and Google would make for the right to have IE do lookups on the Safe Browsing API, but if Microsoft placed a dollar value on increasing the protection for their users, and that dollar value exceeded the cost to Google of running the servers to process the additional queries, then in theory they should be able to agree on a price between those two amounts. Google might well offer to service the queries for free, just for the prestige of being able to say that the Safe Browsing database provided protection for almost all major browsers on the market.
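The "blocked if it's present on either list" rule can be sketched in a few lines of Python. This is only an illustration: the two sets are hypothetical stand-ins for the real SmartScreen and Safe Browsing databases, the URLs are invented, and neither service is actually queried this way.

```python
from typing import Tuple

# Hypothetical stand-ins for the two blacklists; the URLs are invented.
smartscreen_list = {"http://bad-a.example/", "http://bad-b.example/"}
safe_browsing_list = {"http://bad-b.example/", "http://bad-c.example/"}


def is_blocked(url: str) -> bool:
    """Block a URL if it appears on either list."""
    return url in smartscreen_list or url in safe_browsing_list


def combined_rate_bounds(rate_a: float, rate_b: float) -> Tuple[float, float]:
    """Bounds on the block rate of the union of two lists over one sample.

    Lower bound: one list's catches are a subset of the other's.
    Upper bound: the two lists catch entirely different sites.
    """
    return max(rate_a, rate_b), min(1.0, rate_a + rate_b)


print(is_blocked("http://bad-c.example/"))   # True
print(combined_rate_bounds(0.81, 0.27))      # (0.81, 1.0)
```

Plugging in the NSS Labs malware figures quoted earlier (81% for IE, 27% for Firefox) gives bounds of 81% and 100%, so those test results alone cannot say how much the combined list would help, only that it could not do worse on blocking.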

(Microsoft's SmartScreen team declined to comment on the record about their reasons for not using the Safe Browsing list in addition to their own database. I couldn't get an official response from Google about what position they would have on Internet Explorer using the Safe Browsing list, although unofficially an employee said the team would probably be "delighted" if IE were to use it.)

It's worth underlining what a strong statement Microsoft is making by not using the Safe Browsing list. They're not just saying that their own list is better. They're saying that the Safe Browsing list is of such low quality that adding it to their own product would actually make the product worse.

This is different from, for example, what McAfee and Symantec might say about each other's anti-virus lists. Consider the set of all viruses that McAfee blocks and the set of all viruses that Symantec blocks. Let List X be the overlap — the huge swath of viruses that are blocked by both McAfee and Symantec. Then let List Y be the set of all viruses that are blocked by McAfee but not blocked by Symantec, and let list Z be the set of all viruses that are blocked by Symantec but not by McAfee. (So McAfee blocks viruses in the set X+Y, and Symantec blocks viruses in the set X+Z.) Now, representatives from McAfee and Symantec will each say that their list is the better one, which they may or may not believe. But even McAfee is not claiming that List Z — that portion of the list that is blocked by Symantec but not by McAfee — is so worthless that McAfee wouldn't incorporate it into their own product if they could get it for free. If Symantec allowed any anti-virus maker to download Symantec's anti-virus signature database, then presumably McAfee would scratch their heads a bit about why Symantec would do this, but if they cared about giving their users maximum protection, they would incorporate it into their product as well (so that McAfee would then be blocking all viruses in the set X+Y+Z, instead of just the set X+Y as they were before). But Symantec doesn't make it available for free, so McAfee doesn't have the option of using it and the issue doesn't come up. Other than each company claiming their product is the better one (which is par for the course for competitors), the two companies' positions are not contradicting each other.

But consider the analogous situation for anti-malware lists, where X is the set of all sites blocked by both IE's SmartScreen and by the Google Safe Browsing API, Y is the set of all sites blocked by SmartScreen but not by the Safe Browsing API, and Z is the set of all sites blocked by the Safe Browsing API but not by SmartScreen. When Microsoft says that they don't want to use the Safe Browsing list in addition to their own — that they would rather block just X+Y than block X+Y+Z — they're saying that they're estimating that the list Z is of such poor quality (too much risk of containing too many false positives) that it would be better not to block it at all.
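For readers who prefer to see the X, Y and Z lists spelled out, here is a small sketch using Python sets. The site names are invented for illustration; the real lists are far larger and are not distributed in this form.

```python
# Invented example data standing in for the two blacklists.
smartscreen = {"a.example", "b.example", "c.example"}
safe_browsing = {"b.example", "c.example", "d.example"}

x = smartscreen & safe_browsing   # List X: blocked by both
y = smartscreen - safe_browsing   # List Y: blocked only by SmartScreen
z = safe_browsing - smartscreen   # List Z: blocked only by Safe Browsing

ie_today = x | y                  # what IE blocks now (X+Y)
ie_with_both = x | y | z          # what IE would block using both lists (X+Y+Z)

print(sorted(z))                        # ['d.example']
print(sorted(ie_with_both - ie_today))  # ['d.example']
```

The only thing that changes when IE adopts both lists is List Z, which is why the whole disagreement comes down to the quality of that one set.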

In this case, Microsoft's position really is contradicting that of Google, Firefox, Safari, and others who use the Google Safe Browsing API. To achieve the best tradeoff between user safety and convenience, should the sites on List Z — the set of sites on the Safe Browsing API blacklist but not on the SmartScreen blacklist — be blocked, or not? If the answer is Yes, then IE should use the Safe Browsing API in addition to their own SmartScreen list. If the answer is No, then Google should take the URLs in the Safe Browsing API list, run them through IE using some automated script, and then remove all the URLs that weren't blocked by IE — in other words, remove all the URLs on List Z from the Safe Browsing blacklist. But I can think of no consistent set of assumptions that would lead one to recommend that both companies continue doing what they're doing now — that IE should continue not to use the Safe Browsing API, and that Google should continue publishing the Safe Browsing API without trimming URLs that aren't also blocked by IE. Microsoft is saying that the URLs on List Z should not be blocked; Google is saying that they should be.

(Note that this argument is independent of the relative weights that you assign to the benefit of blocking a genuinely malicious site, versus the cost of accidentally blocking a site which is not malicious. Different users might assign different values to these costs and benefits, and depending on what values they assign, those users would want different thresholds to be used in deciding whether to block a site or not. And Microsoft and Google have picked default thresholds that they estimate will meet the needs of the average user. But no matter what values you assign to the benefit of blocking a malicious site and the penalty for blocking a false positive, it's still the case that blocking the sites on List Z either increases the total cost/benefit score, in which case IE should block sites on the Safe Browsing list in addition to its own, or it doesn't, in which case Google should remove sites from the Safe Browsing list that aren't blocked by SmartScreen.)
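One way to make that parenthetical concrete is the sketch below. It assumes a very simple scoring model that is not in the original argument: a fraction of List Z is genuinely malicious, each correct block earns a fixed benefit, and each false positive incurs a fixed cost. All three numbers are hypothetical inputs.

```python
def net_value_of_blocking_z(p_malicious: float, benefit: float, cost: float) -> float:
    """Expected net value per site of blocking List Z.

    p_malicious: assumed fraction of List Z that is genuinely malicious.
    benefit:     value assigned to blocking a malicious site.
    cost:        penalty assigned to blocking a harmless site.
    """
    return p_malicious * benefit - (1.0 - p_malicious) * cost


def recommendation(p_malicious: float, benefit: float, cost: float) -> str:
    """Whichever weights are chosen, exactly one of the two actions follows."""
    if net_value_of_blocking_z(p_malicious, benefit, cost) > 0:
        return "IE should also block List Z"
    return "Google should drop List Z"


# Same weights, different assumed quality of List Z.
print(recommendation(0.9, 1.0, 5.0))  # IE should also block List Z
print(recommendation(0.5, 1.0, 5.0))  # Google should drop List Z
```

Changing the weights moves the break-even point, but for any fixed weights and any fixed quality of List Z the function returns one answer, not both.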

I suspect, of course, that the answer is the former: that the sites on List Z, those which are blocked by the Safe Browsing API but not blocked by SmartScreen, are probably approximately as likely to be malware as the rest of the sites on the list, and that it would make Internet Explorer safer if Microsoft augmented SmartScreen to use the Safe Browsing API as well. So why don't they?

The answer is probably what people have been shouting out from the back of the classroom since the first paragraph: That for political reasons, Microsoft doesn't want to be seen incorporating anything from Google into their own flagship application. It's not news that a company would prefer to promote its products over its rivals'. But this goes beyond, for example, Microsoft bundling Internet Explorer with Windows instead of Google's Chrome browser. Chrome and Internet Explorer do virtually the same thing, so it would look positively odd for Microsoft to promote Chrome over IE. But IE's SmartScreen list and Google's Safe Browsing list can be used simultaneously, providing more protection than either one by itself.

Still, Microsoft has already calculated that it would be an unwise move politically to use Google's Safe Browsing list. So I'm not trying to second-guess the calculation that they made, based on data that was available to them at the time. Rather, I think that if some publicity can increase the political benefit that they could get from using Google's Safe Browsing list in conjunction with SmartScreen (and increase the political cost of not using it), that might lead them to recalculate and make a different decision. To that end, let me raise up a banner that people can gather under if they want to:

Microsoft, we will not think any less of you if you use the Google Safe Browsing API in Internet Explorer in conjunction with the SmartScreen filter! We'll give you credit for setting aside petty rivalries and using the technology of a competitor in order to make users safer.

The IE team's blog post about the initial success of the SmartScreen filter, from March 2009, cited statistics showing 10 million malware blocks in the previous six months, and asked readers to think about those numbers in terms of their impact on real humans and the grief it saved them: "These are BIG numbers -- each malicious download blocked helps prevent compromise of that user's computer." Since then, Microsoft has released new statistics showing that SmartScreen has delivered about 70 million blocks since IE8 was officially released. Of course, not every one of those blocks made the difference between infecting a machine with spyware and keeping it clean (many users wouldn't have downloaded or installed the software that the website was trying to send them), but the IE team is right to be proud anyway. However, that also means that if adding Safe Browsing support to IE resulted in only a small percentage increase in the filter's effectiveness, it would mean several million additional malware blocks over the same period, and cumulatively tens of millions more in the years ahead. Isn't that worth Microsoft forming an alliance with Google, especially if doing that would make them look good?
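As a rough check on the "several million additional malware blocks" figure, the arithmetic can be sketched as follows. The 70 million total is the number quoted above; the improvement percentages are hypothetical, since no measured value for the gain from adding Safe Browsing is available.

```python
SMARTSCREEN_BLOCKS = 70_000_000  # figure quoted in the article


def extra_blocks(total_blocks: int, percent_increase: int) -> int:
    """Additional blocks implied by a given percentage gain in effectiveness."""
    return total_blocks * percent_increase // 100


# Hypothetical improvement percentages, chosen only for illustration.
for pct in (2, 5, 10):
    print(f"{pct}% more effective -> {extra_blocks(SMARTSCREEN_BLOCKS, pct):,} additional blocks")
# 2% more effective -> 1,400,000 additional blocks
# 5% more effective -> 3,500,000 additional blocks
# 10% more effective -> 7,000,000 additional blocks
```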

14 of 109 comments

  1. The Real Question by Anonymous Coward · · Score: 4, Insightful

    is why shouldn't Firefox, Opera, et al. use IE's list as well, if it's so much better?

    1. Re:The Real Question by clang_jangle · · Score: 4, Insightful

      Because that wouldn't have the same sensational ring to it?
      But honestly, I think The Real Questions are, "Why does Bennett Haselton have to blog every silly thought that pops into his brain, and why does slashdot have to put them all on the front page?"

      --
      Caveat Utilitor
    2. Re:The Real Question by TheRealMindChild · · Score: 3, Insightful

      Why does Bennett Haselton have to blog every silly thought that pops into his brain...

      Isn't that, by the very definition, what a blog is?

      --

      "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    3. Re:The Real Question by thePowerOfGrayskull · · Score: 2, Insightful

      Because perhaps this is in fact news for nerds. Why a web browser, which is where most of us spend most our time, would not want to implement safety features is a great topic which merits debate. Seems like a cut and dried example of front page material to me.

      If it were a concise, well-written article, then I would agree. As it stands, it's rambling, repetitive, and just a bit painful to read.

  2. broswing by snarfies · · Score: 3, Insightful

    "Frequent contributor Bennett Haselton writes with an idea that he thinks could help keep broswing on Microsoft's browser more secure for users -- and benefit Microsoft as a result."

    I have an idea that I think could help keep Slashdot from embarrassing itself even more than failing to ask Blizzard about bnetd - use a spellchecker.

  3. Google's worst feature... by Anonymous Coward · · Score: 2, Insightful

    You mean they should use that obnoxious Google feature that tries to stop one visiting crack sites? At least they could provide a link to continue, after the user is informed of the risks - to not include one is simply irritating.

  4. SPF by dword · · Score: 2, Insightful

    I've recently heard about a concept called single point of failure, maybe you should look into it. If anything goes wrong and Google goes down with its malware list or they simply choose to block IE, we'll be completely defenseless.

  5. This Google-worship really has to end by Anonymous Coward · · Score: 1, Insightful

    To paraphrase: "blah blah blah bllah bllah blah everyone should use Google blah blah blah."

    Look, monoculture wasn't a good thing the last time and it isn't a good thing this time either. Multiple, competing sources of data please. I don't want a mistake in Google's data to mean it will automatically get propagated to MS' products, nor do I want a mistake in MS's list to automatically propagate to Google.

    As for Microsoft having calculated this politically, I'll bet it never gave the matter a moment's thought. MS have to be answerable for their own product - sticking a reliance on a competitor and changeable competitor APIs in there just doesn't make any sense at all.

  6. How about other browsers use the MS list? by alanjstr · · Score: 2, Insightful

    This advocates MS also using the Google list. How about Firefox, etc, also access the Microsoft API?

  7. Someone is Assuming something. by Icegryphon · · Score: 2, Insightful

    I think you are assuming Microsoft cares about customer security.
    If that were really the case then this would have already been implemented or in the works to be.
    Better yet, why should Microsoft care?
    Most people don't fix computers and just go out and buy a new one every few years.
    Sounds like another Microsoft fee for a new computer to me.
    Maybe I am just too cynical?

  8. Re:results may be biased by eln · · Score: 1, Insightful

    Sure, the results could be biased. On the other hand, NSS is a supposedly independent lab with no apparent connection to MS other than that MS commissioned this particular study. Unless there's a pattern of pro-MS bias in NSS-run tests, it's likely that this test was as evenhanded as any such test can be.

    The fact that MS marketing is touting this result is not evidence of bias, it's just evidence that the test results favored MS. If the test were completed and showed Google's list performed better, MS would have simply not published the result at all and we never would have heard about it.

    Rather than crying about bias, perhaps the OSS community should be spending their time figuring out how to make their own lists better.

  9. Re:microsoft basher by eln · · Score: 2, Insightful

    The focus of the story is colored by the blogger's own bias. Rather than focusing on why MS isn't doing better than 81%, the focus should be on why Google's product performs so abysmally in comparison to Microsoft's. Sure, MS could in theory make marginal improvements, but Google is the one that really ought to be taken to task for their poor results.

    I know the conventional wisdom is MS == bad, and Google == good, but trying to find an MS-bashing angle to every bit of news is counterproductive and tiresome.

  10. Re:Recursive by westlake · · Score: 2, Insightful

    Shouldn't IE itself and microsoft.com be on any decent malware list?

    I read this as Troll.

    It contributes absolutely nothing useful to the discussion - but instead simply feeds on the modder's visceral hatred of everything Microsoft.

       

  11. Re:results may be biased by speedtux · · Score: 2, Insightful

    The fact that MS marketing is touting this result is not evidence of bias,

    We don't have to show bias, NSS Labs has to convincingly show absence of bias. Their experiments are not peer reviewed and they are not reproducible, which means that they aren't worth the paper they are written on.

    If the test were completed and showed Google's list performed better, MS would have simply not published the result at all and we never would have heard about it.

    That alone means there is bias: selection bias. They can simply commission enough studies under enough different conditions and then select the (possibly tiny) subset of studies that show what they want.