Schooling Microsoft On Random Browser Selection

Posted by kdawson on Sunday February 28, 2010 @07:33AM from the let-me-show-it-you dept.

Rob Weir got wind that a Slovakian tech site had been discussing the non-randomness of Microsoft's intended-to-be-random browser choice screen, which went into effect on European Windows 7 systems last week. He did some testing and found that indeed the order in which the five browser choices appear on the selection screen is far from random — though probably not intentionally slanted. He then proceeds to give Microsoft a lesson in random-shuffle algorithms. "This computational problem has been known since the earliest days of computing. There are 5 well-known approaches: 3 good solutions, 1 acceptable solution that is slower than necessary and 1 bad approach that doesn’t really work. Microsoft appears to have picked the bad approach. But I do not believe there is some nefarious intent to this bug. It is more in the nature of a 'naive algorithm,' like the bubble sort, that inexperienced programmers inevitably will fall upon when solving a given problem. I bet if we gave this same problem to 100 freshmen computer science majors, at least 1 of them would make the same mistake. But with education and experience, one learns about these things. And one of the things one learns early on is to reach for Knuth. ... The lesson here is that getting randomness on a computer cannot be left to chance. You cannot just throw Math.random() at a problem and stir the pot and expect good results."

LAST by DamonHD · 2010-02-28 07:36 · Score: 5, Funny

Hmm, there's a nice shuffle implementation in Java that Microsoft could use... Oh, wait...
Rgds
Damon

--
http://m.earth.org.uk/
Re:What? Why not? by EvanED · 2010-02-28 07:50 · Score: 5, Insightful

> Why not? Is the author suggesting that random functions in use today are somewhat deficient? What is his solution?
You know, it's really too bad that the author of the article the summary linked to didn't write up an article answering exactly that. Then maybe Slashdot could have linked to it.
(In a nutshell, the answers are, respectively: "because plopping a 'rand()' into your code doesn't mean that what you'll get out is uniform", "no", and "use a shuffling algorithm that works.")
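For anyone wondering what "a shuffling algorithm that works" looks like, here is a minimal sketch of the Fisher-Yates (Knuth) shuffle that the summary alludes to with "reach for Knuth". The names `randomInt`, `shuffle` and `browsers` are made up for illustration and are not taken from Microsoft's page or from the article; the sketch also assumes `Math.random()` is uniform enough for the purpose.

```javascript
// Fisher-Yates (Knuth) shuffle: each of the n! orderings is equally likely,
// provided randomInt() is uniform. All names here are illustrative.
function randomInt(n) {
  // Uniform integer in [0, n), built on Math.random().
  return Math.floor(Math.random() * n);
}

function shuffle(items) {
  const result = items.slice(); // copy, so the caller's array is untouched
  for (let i = result.length - 1; i > 0; i--) {
    const j = randomInt(i + 1); // pick among the not-yet-fixed slots 0..i
    [result[i], result[j]] = [result[j], result[i]]; // swap it into slot i
  }
  return result;
}

const browsers = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
console.log(shuffle(browsers));
```

It makes exactly n - 1 swaps and never calls a sort routine, so the result does not depend on how any particular browser implements `sort`.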
Re:Adding randomness... by K. S. Kyosuke · 2010-02-28 07:52 · Score: 4, Funny

Exactly, as there is always a chance for you to get a better browser due to a stray alpha particle.

--
Ezekiel 23:20
Re:Good enough by EvanED · 2010-02-28 07:53 · Score: 4, Informative

> Given that each user is only going to see this screen once per computer, I'd say simply using the seconds of the current minute as a random seed should be OK
This problem has nothing to do with how the PRNG is seeded.
The word "seed" doesn't even appear in TFA at all.
The top hit on Google... by jmtpi · 2010-02-28 07:54 · Score: 5, Interesting

A Google search on:

javascript array sort
gives exactly the bogus answer that Microsoft used in the top hit.
Unfortunately for Microsoft, a bing search gives the same top hit.
Re:What? Why not? by Tapewolf · 2010-02-28 07:55 · Score: 5, Informative

I thought it was originally going to be that they forgot to seed the random number generator or something.
What the problem actually seems to be is not that they're using random, but how they're using it.
What they essentially did was use something like qsort() to sort the list, but in their comparator function, instead of returning which of the two strings comes first, they return some random crap instead. qsort() or whatever they used was designed to have results that are consistent, and with input like that it could potentially abort and leave the list entirely unsorted. Or if you're really unlucky it could sit there sorting them forever.
He's just bitching by Sycraft-fu · 2010-02-28 08:02 · Score: 4, Insightful

It is probably a combination of two things:
1) Hate for MS. MS is doing what some have said they've needed to do in giving users browser choice, and they've done so in a way that tries not to promote any given one. While that makes proponents of choice happy, it makes MS haters mad. The more MS does to try and accommodate users and play fair, the less there is to hate on them for legitimately. As such, haters are going to try to find nitpicks to bitch about.
2) General geek pedantry. Many geeks seem to love to be exceedingly pedantic about every little thing. If a definition isn't 100% perfect, at least in their mind, they jump all over it. I think it is a "Look at how smart I am!" kind of move. They want to show that they noticed that it wasn't 100% perfect and thus show how clever they are.
Doesn't matter, it is what it is and as you said, random enough. This guy can whine all he likes.
1. Re:He's just bitching by magsol · 2010-02-28 08:06 · Score: 4, Insightful
  
  On the other hand, the devil is in the details, and one would think that a company such as Microsoft that has been owning the software market for decades now would know how to implement a randomizing algorithm correctly.
  
  --
  "I'd just like to emphasise that taking a million years isn't a metaphor here..." -Rich Bradshaw
2. Re:He's just bitching by TerranFury · 2010-02-28 08:22 · Score: 4, Insightful
  
  Whatever. They offloaded what looked like a menial task to some low-level programmer, who ran it a few times, saw it was "random" (without doing any statistical tests), and went home happy. He probably should have known the Knuth shuffle algorithm -- I remember studying it in high school CS, even -- but honestly it's not that huge a deal.
3. Re:He's just bitching by maxwell demon · 2010-02-28 08:37 · Score: 4, Interesting
  
  > Spending the extra programmer time and effort to turn a "99.99% random" process into a "100% random"
  I don't know what you consider "99.99% random", but the difference between 20% (probability of IE turning up last in a real random shuffle) and ca. 50% (probability of IE showing up last in the implemented "random shuffle") is certainly significant enough that you can't call it "99.99% random." You might argue that it is "random enough for this," but that's of course a matter of opinion, and therefore debatable (there's no objective definition of "random enough").
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
4. Re:He's just bitching by cgenman · 2010-02-28 09:11 · Score: 5, Insightful
  
  While in Microsoft's native browser (which would happen the first time), Internet Explorer is given a full 64% chance of receiving one of the coveted 2 edge positions. Considering that antitrust courts were involved in the creation of this screen, you'd think that getting "random" right would be a development priority, especially considering it should have taken a competent programmer exactly the same amount of time to do it right as to do it wrong. If this takes even one hour of lawyer time to ponder, it would have been much cheaper to send the programmer back to fix it.
  A 50% chance of getting a particular slot that should be 20% is not "99.99% random." It's just wrong. And when you're talking about the cost of antitrust regulation, it's really, really wrong.
  I'm glad this is being brought up on Slashdot. There is a lot of misunderstanding about how to create randomness in systems. Even on a basic level, people frequently ask for "random" when they actually want jukebox random. In this case, though, it just seems like a basic misunderstanding of statistics, which is not surprising given the moderate code complexity and likelihood this screen was given to an intern or junior programmer.
  
  --
  The ______ Agenda
5. Re:He's just bitching by John Hasler · 2010-02-28 09:21 · Score: 4, Funny
  
  > ...the manager ignored it...
  Or he decided that it was so important that he had to do it himself.
  
  --
  Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
6. Re:He's just bitching by noidentity · 2010-02-28 12:17 · Score: 5, Interesting
  
  You know, your post just made me realize that Microsoft has made a good entry for perhaps next year's Underhanded C Contest: write some innocent-looking code that is supposed to randomize a selection, but fails to do so fairly and favors certain selections over others.
Re:damned faintly praising? by sopssa · 2010-02-28 08:02 · Score: 4, Informative

> Is picking a worse random number generation function (the default one in C and JS) really fucking up?
And btw, it looks like their choice promotes all other browsers than IE almost 2x more!
Position   I.E.   Firefox   Opera   Chrome   Safari
1          1304   2099      2132    2595     1870
2          1325   2161      2036    2565     1913
3          1105   2244      1374    3679     1598
4          1232   2248      1916     590     4014
5          5034   1248      2542     571      605
I can already see all the comments how MS would be favoring IE with this (summary conveniently left that one out), but as it is they're promoting the other browsers almost double more.
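As a rough check on how far the counts in the comment above are from a fair shuffle, here is a sketch of a chi-square goodness-of-fit calculation on them. Each browser's column sums to 10,000, so a uniform shuffle would put about 2,000 in every position. The function and variable names are invented for this example, and 9.488 is the standard chi-square critical value for 4 degrees of freedom at p = 0.05; this is an illustration, not necessarily the test used in the article.

```javascript
// Counts copied from the comment above: rows are positions 1-5,
// columns are IE, Firefox, Opera, Chrome, Safari.
const names = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
const counts = [
  [1304, 2099, 2132, 2595, 1870],
  [1325, 2161, 2036, 2565, 1913],
  [1105, 2244, 1374, 3679, 1598],
  [1232, 2248, 1916, 590, 4014],
  [5034, 1248, 2542, 571, 605],
];

// Chi-square statistic for one browser: sum of (observed - expected)^2 / expected
// over the five positions, where "expected" is an equal share of that column.
function chiSquareForColumn(table, col) {
  const observed = table.map((row) => row[col]);
  const total = observed.reduce((a, b) => a + b, 0);
  const expected = total / observed.length;
  return observed.reduce((sum, o) => sum + (o - expected) ** 2 / expected, 0);
}

const CRITICAL_DF4_P05 = 9.488; // chi-square critical value, 4 d.o.f., p = 0.05

names.forEach((name, col) => {
  const stat = chiSquareForColumn(counts, col);
  const verdict = stat > CRITICAL_DF4_P05 ? "not uniform" : "consistent with uniform";
  console.log(name, stat.toFixed(1), verdict);
});
```

Every column lands well above the critical value, so the skew is not sampling noise, whichever browser it happens to help or hurt.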
Re:What? Why not? by Anonymous Coward · 2010-02-28 08:07 · Score: 5, Insightful

No, Math.random is not the problem; the problem is how it is used. They used it as random input to a sorting algorithm without considering how the sorting algorithm works. The assumption that feeding any sorting algorithm an inconsistent, random comparison produces a uniformly random order is wrong. If they had assigned a random value to each element and sorted by that value, the result would have been truly random, as the value associated with each element would have been consistent.
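The "assign a random value to each element and sort by that value" idea can be sketched as follows. `shuffleByRandomKeys` is a hypothetical name chosen for this example. The point is that each item's key is drawn once and then stays fixed, so the comparator gives the sort consistent answers.

```javascript
// Shuffle by sorting on random keys: draw one key per item up front,
// sort on those keys, then strip the keys off again.
function shuffleByRandomKeys(items) {
  return items
    .map((item) => ({ item, key: Math.random() })) // one fixed key per item
    .sort((a, b) => a.key - b.key)                 // consistent comparator
    .map((entry) => entry.item);
}

console.log(shuffleByRandomKeys(["IE", "Firefox", "Opera", "Chrome", "Safari"]));
```

This costs a full O(n log n) sort where a direct shuffle needs only n - 1 swaps, and it assumes two keys essentially never collide, but it does not depend on how the sort routine is implemented.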
Re:Good enough by mangu · 2010-02-28 08:08 · Score: 4, Insightful

> Given that each user is only going to see this screen once per computer
Given that each person will only lose one cent per lifetime, I propose to move $0.01 from each bank account in the world to my own account.
Re:damned faintly praising? by EvanED · 2010-02-28 08:08 · Score: 4, Insightful

> Is picking a worse random number generation function (the default one in C and JS) really fucking up?
There's no problem with the function they're using; the problem is how they're using it. If 'rand()' were perfect, their technique would still suck.
> I can already see all the comments how MS would be favoring IE with this (summary conveniently left that one out), but as it is they're promoting the other browsers almost double more.
I do think the summary should have mentioned that bias, but I don't think it's quite as good a position as you convey. I bet the far right position is better than #3 and #4 at least.
(If I wanted to put on my conspiracy hat -- which I don't, I don't really believe this -- I'd say that MS wanted to bias it towards them and decided that biasing it toward #1 would be too blatant, but that #5 was "good enough".)
You can't artificially put down competition by SuperKendall · 2010-02-28 08:16 · Score: 5, Insightful

Here's the problem - consider the results again. Safari will almost always (almost 50% of the time) be put in the bottom two elements. In fact depending on the algorithm used it's 40-50% chance of being put in one exact slot (either choice four or five).
When the whole point of the list is promote browser competition, it makes no sense to accept a list which is that skewed for ANY browser result from the list. You need to have it properly shuffled so that no one browser has a statistical advantage or disadvantage - if you are going to claim it doesn't matter then why not let Microsoft set an arbitrary fixed order for the list?
That is not what the legal injunction against them says they can do, therefore the randomness of the results DO matter. Just as in most things in life, correctness of results is actually important.

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley
1. Re:You can't artificially put down competition by pushing-robot · 2010-02-28 08:34 · Score: 4, Funny
  
  > Safari will almost always (almost 50% of the time) be put in the bottom two elements [out of five].
  And how well did you do in statistics class?
  
  --
  How can I believe you when you tell me what I don't want to hear?
Re:What's the problem? by Anonymous Coward · 2010-02-28 08:34 · Score: 5, Funny

You are missing the point. As the author of the article pointed out, this technique can cause an infinite loop.
Re:damned faintly praising? by beelsebob · 2010-02-28 08:37 · Score: 5, Insightful

No, the point was that no one browser got unfairly pushed to the top all the time. This algorithm does push a certain browser higher more often than not, and hence is not fit for its job.
Re:damned faintly praising? by 644bd346996 · 2010-02-28 08:37 · Score: 5, Insightful

Even with a very high quality entropy source, the algorithm Microsoft used will result in a very non-uniform distribution.
Clearly, Microsoft didn't care about this enough to assign one of their experienced coders to it, which is odd given the legal involvement. Either the technical side of MS ignored the legal department's explanation of the importance of the browser ballot to MS's ability to do business on a particularly profitable continent, or someone powerful in MS decided to spite the EU by assigning low quality programmers to the project.
Bad Article, Bad Summary by alphabetsoup · 2010-02-28 09:20 · Score: 4, Interesting

Both the article and the summary mix up the concepts. Randomness and bias are related but different things. Think of a biased coin loaded in favor of heads - the heads may appear twice as often as the tails, but the distribution is still random. Here too, contrary to the summary's claim of "far from random", the results are random, just biased, and biased against IE, if I may add, which is an important fact the summary omitted.
Re:damned faintly praising? by cgenman · 2010-02-28 10:12 · Score: 5, Informative

The relevant code is in their JavaScript:

    aBrowserOrderTop5.sort(RandomSort); // (they repeat this twice for some reason)
    ...
    function RandomSort (a,b)
    {
        return (0.5 - Math.random());
    }

This takes the browser's built-in sorting function, tells it to sort by an essentially random criterion, and hopes that it all works out. Unfortunately, this is highly dependent on the implementation of the built-in sort function, and that's up to the browser designer to create. The only constraint on the structure of sort is that it must successfully order comprehensible data, which does not mean that it will properly randomize data when provided. Essentially, they overloaded a black-box function that wasn't designed for randomization in the hopes that it would work.
For an instance of why this wouldn't work, consider the case of the last item. Say that you're sorting a list of 5 letters. Now say that you're most of the way through the list, having properly sorted the first 4 letters into "A, B, D, E", with just the 5th letter C left. So you step through.
Does C come before E? Yes.
Does C come before D? Yes.
Does C come before B? No.
C must go between B and D, and the list will look like "A , B , C , D , E." It will be sorted correctly every time.
Now let's throw that randomization into the middle there. Let's start again with the list, though since we're randomizing let's call them item 1, 2, 3, and 4. If we're properly randomly sorting the last item 5 into the list, it should have equal chances of showing up everywhere. But remember, we're still using the sorting algorithm from above, we're just flipping a coin at each question instead of actually comparing. So what we get is:
Does 5 come before 4? 50% yes, or stop
Does 5 come before 3? 50% yes, or stop
Does 5 come before 2? 50% yes, or stop
etc. But because it's iterative, those 50% chances stack. You only get to the second question half of the time, so you only get to the third question half of that half of the time. And essentially what you wind up with is a % chance of the last item being sorted into each of the slots of roughly 6%, 6%, 12%, 25%, 50% (the two front slots tie, because an item that makes it past slot 2 has nothing left to be compared against). This is obviously not a uniform distribution.
This is not necessarily the sort algorithm that Microsoft uses in I.E. (The 50% chance of staying as the last element is a bit suspicious, though, as is running their code twice). But it does point out unequivocally that you can't overload an off-the-shelf sort algorithm with a randomizing comparator and expect the outcome to follow a genuinely random distribution curve. They really ought to have an in-house random sort algorithm that their developers can pull from.
(Thanks to another poster for finding the first google hit that describes this method.)

--
The ______ Agenda
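The coin-flip walk described in the comment above can be worked out exactly. The sketch below covers only that simplified model (the last of n items moves left one slot at a time, and a fair coin decides whether it keeps going), not any real browser's sort. The function name is invented for this example.

```javascript
// Exact odds for the simplified model: the last of n items is walked
// leftward, and a fair coin decides at each step whether it keeps moving.
function lastItemSlotProbabilities(n) {
  const probs = new Array(n).fill(0);
  let stillMoving = 1; // chance the item has reached the current slot at all
  for (let slot = n - 1; slot > 0; slot--) {
    probs[slot] = stillMoving / 2; // coin says "stop here"
    stillMoving /= 2;              // coin says "keep going"
  }
  probs[0] = stillMoving; // front slot: nothing left to compare against
  return probs;
}

console.log(lastItemSlotProbabilities(5));
// slots 1 through 5: 0.0625, 0.0625, 0.125, 0.25, 0.5
```

The two front slots come out equal at 1/16 each, since an item that gets past slot 2 has nowhere to stop except slot 1. A fair shuffle would give 0.2 for every slot.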
Re:damned faintly praising? by asaz989 · 2010-02-28 13:40 · Score: 4, Informative

Are you running Firefox? One of the things that the article points out is that the specific type of non-randomness that sort gives in this case is implementation-dependent (meaning browser-dependent). IE being pushed to the end is what happens in the Internet Explorer implementation of JavaScript; the version of Firefox that he tested disproportionately pushes IE to the front, and presumably other browsers would give a different distribution.
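Since the skew depends on the engine, the easiest way to see it is to tally positions on whatever JavaScript implementation you have at hand. The sketch below reuses the idea of the random comparator quoted elsewhere in the thread; `randomSort`, `tallyPositions` and `labels` are names invented for this example. It assumes the engine's `sort` terminates and returns a permutation even when handed an inconsistent comparator, which common engines do in practice but which is not guaranteed in general.

```javascript
// A comparator that ignores its arguments and answers at random.
function randomSort(a, b) {
  return 0.5 - Math.random();
}

// tally[pos][i] = how often labels[i] ended up at index pos.
function tallyPositions(labels, trials) {
  const tally = labels.map(() => labels.map(() => 0));
  for (let t = 0; t < trials; t++) {
    const order = labels.slice().sort(randomSort); // fresh copy each trial
    order.forEach((label, pos) => {
      tally[pos][labels.indexOf(label)]++;
    });
  }
  return tally;
}

const labels = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
console.table(tallyPositions(labels, 10000));
```

With a fair shuffle every cell would hover around 2,000. Which cells stick out, and in which direction, will differ from one engine to the next.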
Re:damned faintly praising? by Phantasmagoria · 2010-02-28 14:20 · Score: 5, Insightful

If you had bothered to read the article, you'd see that the author has done JUST that. Not only did he prove (using proper statistical methods) that the results are significantly not random, he also dug up the exact JavaScript source code that does the shuffling and explained why it is faulty. RTFA!

--
Loban Amaan Rahman ==> Anagram of ==> Aha! An Abnormal Man!