Internet Explorer 9 Caught Cheating In SunSpider


Posted by CmdrTaco on Wednesday November 17, 2010 @01:43AM from the well-that's-not-nice dept.

dkd903 writes "A Mozilla engineer has uncovered something embarrassing for Microsoft – Internet Explorer is cheating in the SunSpider Benchmark. The SunSpider, although developed by Apple, has nowadays become a very popular choice of benchmark for the JavaScript engines of browsers."

Do not attribute to malice ... by Tar-Alcarin · 2010-11-17 01:52 · Score: 5, Insightful

what can be attributed to stupidity.
1) Microsoft cheated by optimizing Internet Explorer 9 solely to ace the SunSpider Benchmark. To me, this seems like the best explanation.
2) Microsoft engineers working on Internet Explorer 9 could have been using the SunSpider Benchmark and unintentionally over-optimized the JavaScript engine for the SunSpider Benchmark. This seems very unlikely to me.
I see no reason why explanation number one is more likely than explanation number two.
Really? You're Going with That? by eldavojohn · 2010-11-17 01:53 · Score: 5, Insightful

Next we're going to be shocked that 8th grade history students try to memorize the material they think will be on their test rather than seeking a deep and insightful mastery of the subject and its modern societal implications.
Some things to consider: 1) I'm not doing business with the 8th grader. Nor am I relying on his understanding and memorization of history to run Javascript that I write for clients. 2) You are giving Microsoft a pass by building an analogy between their javascript engine and an 8th grade history student.

Just something to consider when you say we shouldn't be shocked by this.

--
My work here is dung.
Three explanations FTFA by davev2.0 · 2010-11-17 01:57 · Score: 5, Insightful

There are three possible explanations for this weird result from Internet Explorer:

1) Microsoft cheated by optimizing Internet Explorer 9 solely to ace the SunSpider Benchmark. To me, this seems like the best explanation.
2) Microsoft engineers working on Internet Explorer 9 could have been using the SunSpider Benchmark and unintentionally over-optimized the JavaScript engine for the SunSpider Benchmark. This seems very unlikely to me.
3) A third option (suggested in Hacker News) might be that this is an actual bug and adding this trivial code misaligns cache tables and such, throwing off the performance entirely. If this is the reason, it raises a serious question about the robustness of the engine.
Everything in italics is unsupported opinion by the author, yet is treated as fact in the summary and title by CmdrTaco and Slashdot. Perhaps if Slashdot would stick to actual news sites (you know, NEWS for nerds and all that), this would be a balanced report with a good amount of information. Instead, it is just another Slashdot-supported hit piece against Microsoft.
Re:Benchmarks by TheRaven64 · 2010-11-17 02:24 · Score: 5, Informative

There is a difference between optimising for a benchmark and cheating at a benchmark. Optimising for a benchmark means looking at the patterns that are in a benchmark and ensuring that these generate good code. This is generally beneficial, because a well-constructed benchmark is representative of the kind of code that people will run, so optimising for the benchmark means that common cases in real code will be optimised too. I do this, and I assume that most other compiler writers do the same. Cheating at a benchmark means spotting code in a benchmark and returning a special case.
For example, if someone is running a recursive Fibonacci implementation as a benchmark, a valid optimisation would be noting that the function has no side effects and automatically memoising it. This would turn it into a linear-time, rather than exponential-time, function, at the cost of increased memory usage. A cheating optimisation would be to recognise that it's the Fibonacci sequence benchmark and replace it with one that has precalculated the return values. The cheat would be a lot faster, but it would be a special case for that specific benchmark and would have no impact on any other code - it's cheating because you're not really using the compiler at all, you're hand-compiling that specific case, which is an approach that doesn't scale.
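A minimal sketch of that distinction, assuming plain JavaScript. The helper name `memoise` and the table `FIB_TABLE` are hypothetical and only illustrate the idea; no real engine is claimed to work this way. The naive version recomputes the same subproblems many times, the memoised version computes each argument once, and the "cheat" only works for inputs someone wrote down in advance.

```javascript
// Naive recursive Fibonacci: the same subproblems are recomputed repeatedly.
function fibNaive(n) {
  return n < 2 ? n : fibNaive(n - 1) + fibNaive(n - 2);
}

// General optimisation: cache the results of any pure one-argument function.
// `memoise` is a hypothetical helper, not part of any engine.
function memoise(fn) {
  const cache = new Map();
  return function (n) {
    if (!cache.has(n)) {
      cache.set(n, fn(n));
    }
    return cache.get(n);
  };
}

// The recursive calls go through the memoised wrapper,
// so each value of n is computed only once.
const fibMemo = memoise(function (n) {
  return n < 2 ? n : fibMemo(n - 1) + fibMemo(n - 2);
});

// "Cheating": a precomputed table that helps only this exact benchmark.
const FIB_TABLE = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
function fibCheat(n) {
  return FIB_TABLE[n]; // undefined outside the precomputed range
}

console.log(fibMemo(30)); // 832040
console.log(fibCheat(11)); // undefined: the table does not generalise
```

The memoising wrapper speeds up any side-effect-free function passed to it, whereas the table says nothing about code it was not written for.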
The Mozilla engineer is claiming that this is an example of cheating because trivial changes to the code (adding an explicit return; at the end, and adding a line saying true;) both make the benchmark much slower. I'm inclined to agree. The true; line is a bit difficult - an optimiser should be stripping that out, but it's possible that it's generating an on-stack reference to the true singleton, which might mess up some data alignment. The explicit return is more obvious - that ought to be generating exactly the same AST as the version with an implicit return.
That said, fair benchmarks are incredibly hard to write for modern computers. I've got a couple of benchmarks that show my Smalltalk compiler is significantly faster than GCC-compiled C. If you look at the instruction streams generated by the two, this shouldn't be the case, but due to some interaction with the cache the more complex code runs faster than the simpler code. Modify either the Smalltalk or C versions very slightly and this advantage vanishes and the results return to something more plausible. There are lots of optimisations that you can do with JavaScript that have a massive performance impact, but need some quite complex heuristics to decide where to apply them. A fairly simple change to a program can quite easily make it fall through the optimiser's pattern matching engine and run in the slow path.

--
I am TheRaven on Soylent News
Re:Benchmarks by Nutria · 2010-11-17 02:41 · Score: 5, Informative

Fear not, for I have RTFA and the original article that the digitizor article is based on.
Fortunately for the ethics of Mozilla, the named Mozilla engineer (Rob Sayre) never claimed that IE9 cheated. Instead, he diplomatically refers to it as an "oddity" and "fragile analysis" and filed a bug w/ MSFT.
http://blog.mozilla.com/rob-sayre/2010/09/09/js-benchmarks-closing-in/M
http://blog.mozilla.com/rob-sayre/2010/11/16/reporting-a-bug-on-a-fragile-analysis/
So, blame Digitizor and ycombinator for putting words in Rob Sayre's mouth.

--
"I don't know, therefore Aliens" Wafflebox1
Re:Embarassing? by Anonymous Coward · 2010-11-17 02:47 · Score: 5, Insightful

Optimisations done purely for use only on a benchmark to achieve far better results than normal is the exact definition of cheating. Benchmarks are meant to test the browser with some form of real performance measure and not how good the programmers are at making the browser pass that one test. If the thing is getting thrown off by some very simple instructions to the tune of 20 times longer then it is seriously broken. Optimization or not.

It is like when ATI/Nvidia made their drivers do some funky shit on the benchmarks to make their products seem way better; this was also called cheating at the time.
Re:Benchmarks by BZ · 2010-11-17 02:53 · Score: 5, Informative

1) If you actually read the article, you may have noticed that the engineer is named. It's right there at the beginning of paragraph 2: "While Mozilla engineer Rob Sayre"
2) The "cheating" stuff is all from the Hacker News thread and the fucking article. I suggest you further read item 1 under "Further Readings" on the fucking article, which is what Rob actually wrote. The link is: http://blog.mozilla.com/rob-sayre/2010/11/16/reporting-a-bug-on-a-fragile-analysis/
Just to save you the trouble of reading it, if you don't want to, it's pretty clear that IE9 is eliminating the heart of the math-cordic loop as dead code. It _is_ dead code, so the optimization is correct. What's weird is that very similar code (in fact, code that compiles to identical bytecode in some other JS engines) that's just as dead is not dead-code eliminated. This suggests that the dead-code-elimination algorithm is somewhat fragile. In particular, testing has yet to turn up a single other piece of dead code it eliminates other than this one function in SunSpider. So Rob filed a bug about this apparent fragility with Microsoft and blogged about it. The rest is all speculation by third parties.
Re:Embarassing? by chrb · 2010-11-17 03:50 · Score: 5, Informative

Did you look at the diffs? The addition of the "true;" operation should make absolutely no difference to the output code. It's a NOP. The fact that it makes a difference indicates that either something fishy is going on, or there is a bug in the compiler that fails to recognise "true;" or "return (at end of function)" as being deadcode to optimise away, and yet the compiler can apparently otherwise recognise the entire function as deadcode. Just to be clear, we are talking about a compiler that can apparently completely optimise away this whole function:

function cordicsincos() {
    var X;
    var Y;
    var TargetAngle;
    var CurrAngle;
    var Step;

    X = FIXED(AG_CONST);         /* AG_CONST * cos(0) */
    Y = 0;                       /* AG_CONST * sin(0) */

    TargetAngle = FIXED(28.027);
    CurrAngle = 0;
    for (Step = 0; Step < 12; Step++) {
        var NewX;
        if (TargetAngle > CurrAngle) {
            NewX = X - (Y >> Step);
            Y = (X >> Step) + Y;
            X = NewX;
            CurrAngle += Angles[Step];
        } else {
            NewX = X + (Y >> Step);
            Y = -(X >> Step) + Y;
            X = NewX;
            CurrAngle -= Angles[Step];
        }
    }
}
but fails to optimise away the code when a single "true;" instruction is added, or when "return" is added to the end of the function. Maybe it is just a bug, but it certainly is an odd one.
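For anyone who wants to try the comparison themselves, here is a rough sketch of a harness that builds the function three ways (unchanged, with a trailing "true;", and with a trailing "return;") and times each. Everything beyond the loop body is an assumption for illustration: `makeVariant`, `timeIt`, the run count, and the angle table (recomputed here as atan(2^-i) in degrees rather than copied from the benchmark). Building the variants with `new Function` and calling them through a wrapper may itself change what an engine optimises, so treat the numbers as indicative only.

```javascript
// CORDIC set-up. FIXED and AG_CONST follow the names used in the function above;
// the angle table is recomputed as atan(2^-i) in degrees (an assumption).
function FIXED(x) { return x * 65536.0; }
const AG_CONST = 0.6072529350;
const Angles = [];
for (let i = 0; i < 12; i++) {
  Angles.push(FIXED(Math.atan(Math.pow(2, -i)) * 180 / Math.PI));
}

// Loop body shared by all variants; it has no observable effect.
const LOOP = `
  var X = FIXED(AG_CONST), Y = 0, NewX;
  var TargetAngle = FIXED(28.027), CurrAngle = 0, Step;
  for (Step = 0; Step < 12; Step++) {
    if (TargetAngle > CurrAngle) {
      NewX = X - (Y >> Step); Y = (X >> Step) + Y; X = NewX;
      CurrAngle += Angles[Step];
    } else {
      NewX = X + (Y >> Step); Y = -(X >> Step) + Y; X = NewX;
      CurrAngle -= Angles[Step];
    }
  }
`;

// Build one variant; `suffix` is the trailing statement under test.
// makeVariant is a hypothetical helper for this sketch.
function makeVariant(suffix) {
  const fn = new Function("FIXED", "AG_CONST", "Angles", LOOP + suffix);
  return function () { return fn(FIXED, AG_CONST, Angles); };
}

const variants = {
  original: makeVariant(""),
  withTrue: makeVariant("true;"),
  withReturn: makeVariant("return;"),
};

// Call fn `runs` times and return the elapsed milliseconds.
function timeIt(fn, runs) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) fn();
  return Date.now() - start;
}

for (const name of Object.keys(variants)) {
  console.log(name, timeIt(variants[name], 25000), "ms");
}
```

All three variants do the same work and return undefined, so a large timing gap between them points at the optimiser rather than at the code.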
This shows the dangers of synthetic non-realistic benchmarks. I was amused to read Microsoft's comments on SunSpider: "The WebKit SunSpider tests exercise less than 10% of the API’s available from JavaScript and many of the tests loop through the same code thousands of times. This approach is not representative of real world scenarios and favors some JavaScript engine architectures over others." Indeed.
btw the Hacker News discussion is more informative.
Re:I'm sure there's no hyperbole in this article by pla · 2010-11-17 06:09 · Score: 5, Insightful

Why don't you try reading it before you make that claim? The article is a few simple benchmark results and mild speculation as to what caused them. The summary may be inflammatory, the article goes out of its way not to be.

1) Microsoft beats everyone else by a factor of 10.
2) Making any of a number of effectively cosmetic changes to the function results in Microsoft taking twice as long as everyone else.
3) Making the inner loop 10x longer makes everyone else take 10x longer, except MS, who takes 180x longer.

Sorry, but if that counts as an optimization "bug", I have a bridge to sell you.
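The scaling check in point 3 can be sketched roughly as follows. This is not the article's test: `workload`, `elapsedMs`, `scalingRatio` and the iteration counts are all made up for illustration. The idea is only that if the loop is really being executed, ten times the iterations should cost roughly ten times the time, so a ratio far from 10 suggests the short version was not doing the work.

```javascript
// Hypothetical workload: `iterations` rounds of cheap integer arithmetic.
// The result is returned so the loop is not trivially dead code.
function workload(iterations) {
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    acc = (acc + (i >> 1)) | 0;
  }
  return acc;
}

// Run fn(arg) `runs` times and report the elapsed milliseconds.
// `sink` keeps the results observable so the calls cannot be dropped.
function elapsedMs(fn, arg, runs) {
  const start = Date.now();
  let sink = 0;
  for (let r = 0; r < runs; r++) sink ^= fn(arg);
  return { ms: Date.now() - start, sink: sink };
}

// Ratio of the time at 10x the loop length to the time at the base length.
function scalingRatio(base, runs) {
  const small = elapsedMs(workload, base, runs).ms;
  const large = elapsedMs(workload, base * 10, runs).ms;
  return large / Math.max(small, 1); // guard against a 0 ms reading
}

console.log("10x loop length took",
  scalingRatio(100000, 50).toFixed(1), "times as long");
```

Timer resolution and JIT warm-up make single readings noisy, so repeated runs would be needed before reading much into the ratio.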