Slashdot Mirror


Schooling Microsoft On Random Browser Selection

Rob Weir got wind that a Slovakian tech site had been discussing the non-randomness of Microsoft's intended-to-be-random browser choice screen, which went into effect on European Windows 7 systems last week. He did some testing and found that indeed the order in which the five browser choices appear on the selection screen is far from random — though probably not intentionally slanted. He then proceeds to give Microsoft a lesson in random-shuffle algorithms. "This computational problem has been known since the earliest days of computing. There are 5 well-known approaches: 3 good solutions, 1 acceptable solution that is slower than necessary and 1 bad approach that doesn’t really work. Microsoft appears to have picked the bad approach. But I do not believe there is some nefarious intent to this bug. It is more in the nature of a 'naive algorithm,' like the bubble sort, that inexperienced programmers inevitably will fall upon when solving a given problem. I bet if we gave this same problem to 100 freshmen computer science majors, at least 1 of them would make the same mistake. But with education and experience, one learns about these things. And one of the things one learns early on is to reach for Knuth. ... The lesson here is that getting randomness on a computer cannot be left to chance. You cannot just throw Math.random() at a problem and stir the pot and expect good results."

5 of 436 comments (clear)

  1. Re:Good enough by EvanED · · Score: 4, Informative

    Given that each user is only going to see this screen once per computer, I'd say simply using the seconds of the current minute as a random seed should be OK

    This problem has nothing to do with how the PRNG is seeded.

    The word "seed" doesn't even appear in TFA at all.

  2. Re:What? Why not? by Tapewolf · · Score: 5, Informative
    I thought it was originally going to be that they forgot to seed the random number generator or something.

    What the problem actually seems to be is not that they're using random, but how they're using it.

    What they essentially did was use something like qsort() to sort the list, but in their comparator function, instead of returning which of the two strings comes first, they return some random crap instead. qsort() or whatever they used was designed to have results that are consistent, and with input like that it could potentially abort and leave the list entirely unsorted. Or if you're really unlucky it could sit there sorting them forever.

  3. Re:damned faintly praising? by sopssa · · Score: 4, Informative

    Is picking a worse random number generation function (the default one in C and JS) really fucking up?

    And btw, it looks like their choice promotes all other browsers than IE almost 2x more!

    Position I.E. Firefox Opera Chrome Safari
    1 1304 2099 2132 2595 1870
    2 1325 2161 2036 2565 1913
    3 1105 2244 1374 3679 1598
    4 1232 2248 1916 590 4014
    5 5034 1248 2542 571 605

    I can already see all the comments how MS would be favoring IE with this (summary conveniently left that one out), but as it is they're promoting the other browsers almost double more.

  4. Re:damned faintly praising? by cgenman · · Score: 5, Informative

    The relevant code is in their Javascript:

    aBrowserOrderTop5.sort(RandomSort); (they repeat this twice for some reason) ...
    function RandomSort (a,b)
    {
            return (0.5 - Math.random());
    }

    This takes the browser's built-in sorting function, tells it to sort by an essentially random criteria, and hopes that it all works out. Unfortunately, this is highly dependent on the implementation of the built-in sort function, and that's up to the browser designer to create. The only constraint on the structure of sort is that it must successfully order comprehensible data, which does not mean that it will properly randomize data when provided. Essentially, they overloaded a black-box function that wasn't designed for randomization in the hopes that it would work.

    For an instance of why this wouldn't work, consider the case of the last item. Say that you're sorting a list of 5 letters. Now say that you're most of the way through the list, having properly sorted the first 4 letters into "A, B, D, E", with just the 5th letter C left. So you step through.
    Does C come before E? Yes.
    Does C come before D? Yes.
    Does C come before B? No.
    C must go between B and D, and the list will look like "A , B , C , D , E." It will be sorted correctly every time.

    Now let's throw that randomization into the middle there. Let's start again with the list, though since we're randomizing let's call them item 1, 2, 3, and 4. If we're properly randomly sorting the last item 5 into the list, it should have equal chances of showing up everywhere. But remember, we're still using the sorting algorithm from above, we're just flipping a coin at each question instead of actually comparing. So what we get is:

    Does 5 come before 4? 50% yes, or stop
    Does 5 come before 3? 50% yes, or stop
    Does 5 come before 2? 50% yes, or stop

    etc. But because it's iterative, those 50% chances stack. You only get to the second question half of the time, so you only get to the third question half of that half of the time. And essentially what you wind up with is a % chance of the last number being sorted into each of the slots as: 3%, 6%, 12%, 25%, 50%. This is obviously not a random distribution curve.

    This is not necessarily the sort algorithm that Microsoft uses in I.E. (The 50% chance of staying as the last element is a bit suspicious, though, as is running their code twice). But it does point out unequivocably that you can't overload an off-the-shelf sort algorithm with a randomizing comparator and expect the outcome to follow a genuinely random distribution curve. They really ought to have an in-house random sort algorithm that their developers can pull from.

    (Thanks to another poster for finding the first google hit that describes this method.).

  5. Re:damned faintly praising? by asaz989 · · Score: 4, Informative

    Are you running Firefox? One of the things that the article points out is that the specific type of non-randomness that sort gives in this case is implementation-dependent (meaning browser-dependent). IE being pushed to the end is what happens in the Internet Explorer implementation of Javascript; the version of Firefox that he tested disproportionately pushes IE to the front, and presumably other browsers would give a different distribution.