Mindcraft Fun Continues
LinuxOnEveryDesktop sent us Mindcraft's comments on a
third benchmark that
will be open to a wider array of Linux Experts (the second
benchmark took tips from Linus which raised a lot of eyebrows: my
favorite being 'did they require Bill to be involved too?')
There are quite a few restrictions, but overall it seems
like a solid chance to show what Linux can really do. Check
it out.
And unlike in the previous hatchet job, we are going to be right there watching as they trample over our OS and publicly defame it.
We are going to be standing there with our mouths open saying, "What happened?"
Folks, this is a battle we can only lose. Bruce has his LIVELIHOOD at stake here. NT is not going to lose no matter how open it may seem. Whatever it was that happened in that second test Mindcraft did gave them the confidence to do this test so they could appear to the public as "fairminded" and "open".
It ain't so. What can we do to prevent this? I hope no one important falls into this trap. It could be a black day for us all as the press trumpets "NT prevails in open tests where Linux gurus try their best!"
I'd like to point out a few issues with this: 1) Motivation; in point #1, Mindcraft says the purpose of the test is to verify the results of the second tests. No Linux experts ever even had the opportunity to view the second test's results. Why are we rerunning test #2 which no one has had the opportunity to see or critique? Why not retest #1, the one that everyone says is false and inaccurate? 2) Machine used; this machine will not be the Dell PowerEdge that was used for the first test. It may be a machine that was specifically chosen for its weak Linux drivers or other reasons. 3) Mindcraft configuration; why must the Linux experts use the configuration that Mindcraft used? 4) Microsoft's ability to use the results in press releases; I admit, this part is the reason why the tests are run in the first place; but why is this starting to look like a press campaign? Microsoft has incredibly tricky attorneys; why should we play their game? 5) Will the Linux experts have the opportunity to edit the 'joint' press release that includes quotations, possibly out of context, from the Linux experts? --Curious in Atlanta
It's a little obvious when you think about it. Mindcraft says that the hardware configuration is "set in stone". And, in fact, when they posted messages on the newsgroup, people told them that they needed to change their hardware configuration... but they won't. The key to this entire test IS the hardware.
Where did this specific hardware configuration come from? Microsoft. And Microsoft didn't just pull this configuration out of thin air. They've been doing all sorts of internal benchmarks with Linux systems to see what kind of numbers they could get. In their tests, I'm sure they've come across many favorable Linux comparisons.
But, as expected, they've found a few sub-optimal configurations for Linux which NT does well with. Microsoft has run many tests, and found the hardware that works the worst with Linux but well with NT. This is what explains the email from a Microsoft email address regarding the Xeon configuration... the email that Mindcraft said they did not send. It was done by Microsoft's internal testing.
Once they have the bad configuration, they need to send it off to a third party for "independent verification". So I believe it isn't Mindcraft that was responsible for the low numbers... they were handed a benchmark that Microsoft ran ahead of time and already knew the results for. Microsoft just needed an outside party to go discover it for themselves.
This creates a simple smoke-and-mirrors effect. Microsoft isn't blamed for this... Mindcraft is. And it disguises the real issue... it isn't software/OS tuning. It is hardware de-tuning done in advance by Microsoft.
Consider that each piece of the system is pretty much a non-optimal configuration for linux. CPU, RAID disk, and probably the network. Probably one of these pieces (say, the 4x100base-t instead of a single gigabit ether) is really sub-optimal and replacing it would probably yield incredible results, but you've still got the additional handicaps.
But the Linux community has walked right into this one, thinking they can tune it out. Short of rewriting some kernel and driver code, this piece of hardware probably isn't going to fly on anything *but* NT. (*BSD will also compare unfavorably.)
It probably is too late now to point this out -- they'll have the claim of "sour grapes" to use.
Looks like Linux did not come out on top in their second test, and Mindcraft is willing to bring in Linux experts to verify the results.
My biggest fear with this is not that Mindcraft will try some underhanded trickery, it's that the Linux community will not accept the results unless linux comes out on top. That is just as bad as if Mindcraft tailored its benchmarks to favor NT.
This open benchmark is a wonderful opportunity for the Linux community to benchmark itself. AFAIK, no controlled experiments on high-end servers have been done to see how Linux stacks up against the heavyweights. If Linux comes out on top, great. But if not, we as a community have to accept that and learn from the process.
This is the best chance yet to discover the bottlenecks in the kernel and several critical pieces of software. IMHO, the Linux experts should not go in with the goal of beating NT. They should go in with the goal of squeezing every last bit of performance out of the machine and using the resulting data to fix the problems.
Mindcraft and Microsoft (!) are donating resources to the Linux community in an effort to help us improve the OS. Let's grab the opportunity!
...that "benchmarking" product A and product B, while getting paid by product A's manufacturer, using product A's consultants, and using product A's own testing facilities...IS INHERENTLY BIASED!!! Regardless of whether product A is actually better than product B or not. My God, it's so simple yet they can't figure it out!
Mindcraft has no credibility left. Worse, now they have an axe to grind because of how badly they were beaten up in the mainstream press (ABCnews.com, Salon, Slashdot, etc). Any "benchmarking" they do from here on in should be ignored completely. If we can learn something from this, such as the need for better documentation, then at least that's a positive for us.
Alan Cox brought up a few points about the tests here:
http://linuxtoday.com/stories/5631.html
He mentions that we really should be benchmarking Zeus or a faster web server under Linux versus IIS if we want to find out how fast the OS can serve static web pages. IIS is the fastest web server for NT, so we should be able to use the fastest web server for Linux for the tests. If you want to compare NT versus Linux, then get the fastest web server for both.
If you want to compare capabilities, then use Apache, etc. Use the best product for the job.
Additionally, he does mention that we should use NT clients or at least a mixture of both, for the tests. At the company I did some consulting for, they are standardizing on NT. Microsoft's roadmap is all NT for the future. Why would you want to benchmark against a dead-end technology like Win9X?
Somehow I think these tests are rigged. Notice, Mindcraft only offered to rerun the tests after they ran a second one, which they didn't release the results of.
I think we need another party, that's neutral, to do some benchmarks, and take Mindcraft out of the picture altogether. They already admitted they fouled up, and they shouldn't be trusted to do the benchmarks again, because we will be giving them credibility they don't deserve.
Just my $.02
Ben
Mindcraft is participating in this third round of testing at their own expense, as they point out in bold text on the invitation page. They are doing this, I believe, to recover some of the credibility they lost by conducting their first benchmark in an extremely sloppy and biased manner.
And why are they doing this? Because for a company like Mindcraft their credibility is their cash cow -- if their test results can't be believed, no one's going to pay for them. So before we linux advocates get ourselves all worked up over the opportunity to prove what linux can do, we must ask ourselves: "what's the ultimate goal of this test?". Or perhaps that should be phrased "who is the ultimate audience of this test?".
The answer, I believe, is that the ultimate audience, the target, of this 3rd benchmark is Mindcraft's collective future customers, including, we must presume, those customers from which Mindcraft might expect repeat business. In a word, Microsoft.
So, while I'm encouraged by the news that linux will get another run at the benchmark, I'm not entirely satisfied that this will be a completely unbiased test. Although it's encouraging that Mindcraft has opened the test to tuning by linux experts, they still have dictated the structure of the test, and it seems to me that there's room there for bias.
BTW, I visited the Mindcraft web site shortly after the publication of the initial test results. Their home page included some text that read something like (paraphrasing here) we work with the customer to identify their test goals, then design a test to produce the desired results. In other words, Microsoft got what they paid for. It seems interesting now, in the aftermath of the Mindcrap Affair, that those rather damning words seem to have disappeared from their site.
--JT
There is a lot of talk about the Hardware RAID controller, and the drivers for it.
What about setting a $$ limit on the HW and letting both "sides" choose the best HW for their own platform?
This is nothing more than Mindcraft going into butt-covering mode...
Mindcraft isn't interested in an honest test, they just want to show
(to the media) that they know what they're doing.. it's a PR game
plain and simple.. they want Linux people to put their stamp of
approval on something that they have no real control over...
Why the time restrictions? The Linux experts aren't allowed to use
any patch that came out after April 20th... One of the main points
about the original test was the unsupported RAID card used... so if
someone were to magically release a patch tomorrow that made that card
run 3x as fast, they wouldn't be able to use it.
By the terms in the paper, as soon as someone from the Linux camp
joins, they're bound to put their name on the PR sheet. (which is
essentially just a confirmation of their second test, which they have
already run and won't show to anybody.) Since Mindcraft has STILL not
levelled the playing field, I strongly urge a boycott of this 'test'.
If Mindcraft REALLY wanted to have an unbiased test, they would invite
Redhat and Microsoft to sit down and draw up a mutually agreed-upon
hardware list; that way no side is at a disadvantage.
When you're at war, you don't allow your opponent to choose the
battlefield unless you have no other option. We have another option,
which is not to fight. By allowing someone else to choose the
hardware, the Linux side is at a disadvantage. Don't give them more
ammunition against us.