Casting a Jaundiced Eye On AnTuTu Benchmark Claims Favoring Intel


Posted by timothy on Saturday July 13, 2013 @08:14AM from the surely-there's-a-perfectly-innocent-explanation dept.

MojoKid writes "Recently, industry analysts came forward with the dubious claim that Intel's Clover Trail+ low power processor for mobile devices had somehow seized a massive lead over ARM's products, though there were suspicious discrepancies in the popular AnTuTu benchmark that was utilized to showcase performance. It turns out that the situation is far shadier than initially thought. The version used in testing with the benchmark isn't just tilted to favor Intel — it seems to flat-out cheat to accomplish it. The new 3.3 version of AnTuTu was compiled using Intel's C++ Compiler, while GCC was used for the ARM variants. The Intel code was auto-vectorized, the ARM code wasn't — there are no NEON instructions in the ARM version of the application. Granted, GCC isn't currently very good at auto-vectorization, but NEON is now standard on every Cortex-A9 and Cortex-A15 SoC — and these are the parts people will be benchmarking. But compiler optimizations are just the beginning. Apparently the Intel code deliberately breaks the benchmark's function. At a certain point, it runs a loop that's meant to be performed 32x just once, then reports to the benchmark that the task completed successfully. Now, the optimization in question is part of ICC (the Intel C++ compiler), but was only added recently. It's not the kind of procedure you'd call by accident. AnTuTu has released an updated "new" version of the benchmark in which Intel performance drops back down 20-50%. Systems based on high-end ARM devices again win the benchmark overall, as they did previously."

But still... by sunking2 · 2013-07-13 08:26 · Score: 2, Insightful

It is the suite of tools, not just the processor. If Intel offers a better processor/compiler package than is available for ARM, why shouldn't they tout it? I'm not saying they are presenting it in the correct way, but I do think they have a valid point to make: with Intel you get more than a CPU, you get a heck of a lot of tool expertise. And for some people that is worth something.
1. Re:But still... by Anonymous Coward · 2013-07-13 08:36 · Score: 3, Insightful
  
  If you use ICC instead of GCC for x86, then you should use ARMCC, Keil, or one of the other compilers for ARM.
2. Re:But still... by gnasher719 · 2013-07-13 10:07 · Score: 2
  
  >It is the suite of tools, not just the processor. If Intel offers a better processor/compiler package than is available for ARM, why shouldn't they tout it? I'm not saying they are presenting it in the correct way, but I do think they have a valid point to make: with Intel you get more than a CPU, you get a heck of a lot of tool expertise. And for some people that is worth something.
  Absolutely correct, you should judge the combination of processor + commonly used compiler. For example, if Apple built an iPad with an Intel processor, then any iPad app would be built with Clang for ARMv7, Clang for ARMv7s, and Clang for x86_64, and you could directly compare all three versions.
  
  However, you must be careful. You need to check real-life code. If you run identical code 32 times and an optimising compiler figures out you need to do it only once, that's not real-life. If this is what your benchmark does, then your benchmark runs 32 times faster, but nobody cares how fast benchmarks run. People care about real applications, and the benchmark now fails at its purpose, which is to give an indication of how real applications will behave.
Yet it does not make this "benchmark" honest. by boorack · 2013-07-13 08:36 · Score: 5, Insightful

The compiler was one of many skews in this "honest" benchmark, alongside deliberately "fixing" the benchmark code for Intel and deliberately breaking the ARM benchmark by disabling NEON. In my opinion they should run identical code, trying to maximize its performance on both platforms, and in the case of Intel use both compilers and post both results. This would lead potential customers to correct conclusions, as opposed to the bunch of lies and misinterpretations AnTuTu actually posted.
1. Re:Yet it does not make this "benchmark" honest. by dryeo · 2013-07-13 17:11 · Score: 2
  
  You run configure (with various options, such as --enable-gpl for FFmpeg) && make for each platform. For benchmarking I guess you could do make check for Cairo, but that is not a very good test, as make check needs exactly the right versions of ghostscript, various fonts and I don't know what else. For FFmpeg you could run make fate after downloading the samples and time it. This would be a fairly good C benchmark for various CPUs because, as you stated, there are code paths for a hell of a lot of CPUs. The OS and libc are still going to affect the results. Examples of results of this, without timings, are at fate.ffmpeg.org (also similar at fate.libav.org).
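A minimal sketch of the "time it" step, in Python purely for illustration. The helper names `time_command` and `summarize` are made up for this example, and the placeholder workload would be swapped for something like `["make", "fate"]` inside an already configured FFmpeg tree with the samples downloaded.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError on failure, so a broken build
        # is never mistaken for a fast one.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (best, median) of the measured durations."""
    return min(durations), statistics.median(durations)


if __name__ == "__main__":
    # Placeholder workload (an assumption) so the sketch runs anywhere.
    best, median = summarize(time_command([sys.executable, "-c", "pass"]))
    print(f"best {best:.3f}s, median {median:.3f}s")
```

Reporting both the best and the median run is a design choice here, not a standard: the best run approximates an undisturbed machine, while the median shows how noisy the measurements were.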
  
  --
  https://en.wikipedia.org/wiki/Inverted_totalitarianism
Re:Duplicate? by Molochi · 2013-07-13 08:40 · Score: 2

Just the controversy. The news, buried at the bottom of the article, is that AnTuTu has a newer version that drops Intel performance back to where it was before.

--
"The Adobe Updater must update itself before it can check for updates. Would you like to update the Adobe Updater now?"
Fixed, apparently by edxwelch · 2013-07-13 08:44 · Score: 3, Informative

In fairness to AnTuTu they released a new version which tries to rectify the problem:
http://www.eetimes.com/author.asp?section_id=36&doc_id=1318894&
Re:Linpack bullshitting by Trepidity · 2013-07-13 09:03 · Score: 2

At least Linpack performs actual linear algebra, so coding to that particular test will help some people with real workloads (i.e. scientific software that uses Linpack). It's definitely not representative of everyone's workload, though.

--
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Anyone remember the days... by gTsiros · 2013-07-13 09:12 · Score: 2

...where companies used to rig benchmarks?
Oh right, we're still not past them.
AND WE'LL NEVER BE!
Always use real world applications, in actual, real usage. Never benchmarks.

--
Looking for people to chat about multicopters, coding, music. skype: gtsiros
Re:Benchmarks, trustworthy? by hairyfeet · 2013-07-13 09:43 · Score: 5, Informative

Look up "Intel cripples compiler" and you'll see it's MUCH worse than merely tilting the benchmarks in favor of Intel: this bullshit means that ANY chip that doesn't have a CPUID vendor string of "GenuineIntel" gets royally fucked by ALL SOFTWARE that is compiled with the Intel compiler.
If you look up the above in Google you'll find a researcher who has done studies, and if that doesn't deserve antitrust I don't know what does. He started looking into it when he found that his code would run faster on an old P4 than on a new AMD, and it is soooo nasty that if you take a VIA chip, the only chip that lets you change the CPUID, and change it from "CentaurHauls" to "GenuineIntel", it jumps nearly 30% in the benches!
So do NOT buy chips based on the benches, they are as rigged as the old "quack.exe", but this is a thousand times worse because ANY program that is compiled with this is crippled and WILL run slower on ANY non-Intel chip. So please programmers, use GCC, use AMD's compiler (which is based on GCC and doesn't favor one chip over another), and for those looking for a system, DO NOT buy Intel if you can help it, since you are supporting this kind of market rigging bullshit. After seeing the results and seeing just how badly Intel is rigging, I went exclusively AMD in my shop and even in my family with NO regrets; at least this way I'm supporting a company that isn't bribing OEMs and rigging markets.
Seriously guys, don't take MY word for it, look it up. They have even rigged it in the past to push shittier chips over better ones: the guy doing the tests found that even though the early P4 was a slow-as-hell chip, when you ran a program compiled with ICC on both the P3 and P4, surprise! P4 would win. Same program compiled with GCC? P3 won by over 30%.

--
ACs don't waste your time replying, your posts are never seen by me.
Time for ARM to invest in GCC by citizenr · 2013-07-13 10:17 · Score: 2, Insightful

ARM looks like a sore loser here.
>GCC isn't currently very good at auto-vectorization, but NEON is now standard on every Cortex-A9 and Cortex-A15 SoC
So the conclusion is to remove Intel optimizations instead of improving ARM ones?

--
Who logs in to gdm? Not I, said the duck.
1. Re:Time for ARM to invest in GCC by RemAngel · 2013-07-14 05:52 · Score: 2
  
  ARM's business is IP, so they care about GPLv3 and the constraints it puts them under. Alternative open-source compilers such as LLVM, with less onerous licensing, are therefore more likely to be contributed to.
Re:Benchmarks, trustworthy? by Macman408 · 2013-07-13 12:20 · Score: 4, Insightful

To be fair, any use of a benchmark to judge which system to buy is pretty silly. The best benchmark you can make is something that is identical to your intended workload; e.g., play a game or use an application on several systems, and see which feels better to you.
Taking some code written in a high-level language and compiling it for a platform is a great benchmark - if that's what you're going to be doing with the system. But you'd better be using the compiler you'll be using on the system. If you need free, you should test GCC on both. If you are considering buying Intel's compiler (it's not free, is it?), then add it in as another test to see if it's worth the extra outlay of cash. Intel puts a lot of work into making compilers very good on its systems, so if you're going to use the Intel compilers for Intel systems, it's perfectly valid to compare against using GCC on an ARM platform, if that's what you'd be using on ARM.
But if most of what you're running will be compiled in GCC for either platform, yes, you should absolutely test GCC on both.
That said, much of what's noted isn't necessarily intentional wrongdoing. For the example of breaking functionality, it's quite possible that the compiler made a perfectly valid optimization to get rid of 31 of the 32 loop iterations. One of my professors once told a story about how he wrote a benchmark, and upon compiling it, found that he was getting some unbelievably fast results. As in literally unbelievable - upon investigation, he discovered that the main loop of the benchmark had been completely optimized away, because the loop was producing no externally visible results. (As an example, if the loop were to do "add r3 = r2, r1" 32 times, a good compiler could certainly optimize that down to a single iteration of the loop; as long as r2 and r1 are unchanging, then you only need to do it once. Similarly, even if r1 and r2 are changing on each iteration, you need to use the result in r3 from each iteration of the loop, otherwise you could optimize it to only perform the final iteration, and the compiler could pre-compute the values that would be in r2 and r1 for that final iteration.)
So perhaps it's a bad benchmark - but I wouldn't default to calling it malicious, just that the benchmark isn't measuring what you might want it to measure. And quite frankly, most users aren't going to be doing anything that even vaguely resembles a benchmark anyway, so they really have little justification to make a buying decision based on them.
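A rough model of the loop-folding argument, written in Python only to make the reasoning concrete. Python itself does not perform this optimization, and `kernel`, `repeat_invariant` and `repeat_chained` are hypothetical names, not anything taken from AnTuTu. The first loop has the shape a compiler may legally collapse to one pass; the second makes every pass depend on the previous one and folds every result into a checksum, so no pass can be dropped without changing the output.

```python
def kernel(a, b):
    """Stand-in for the benchmark's work: a pure function of its inputs."""
    return (a * 31 + b) % 1_000_003


def repeat_invariant(a, b, passes=32):
    """Same inputs on every pass and only the last result kept.

    Because kernel() is pure, 32 passes are indistinguishable from 1,
    which is exactly what lets an optimizer skip the other 31.
    """
    result = None
    for _ in range(passes):
        result = kernel(a, b)
    return result


def repeat_chained(a, b, passes=32):
    """Each pass consumes the previous result, and every result is used."""
    checksum = 0
    for _ in range(passes):
        a = kernel(a, b)   # next pass depends on this one
        checksum ^= a      # every intermediate value is externally visible
    return checksum
```

A benchmark would print or verify the checksum at the end, so the work stays observable from outside the loop.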
Re:Benchmarks, trustworthy? by OneAhead · 2013-07-13 16:47 · Score: 3, Informative

If by fixed you mean "Intel put a disclaimer on its compiler saying [ICC] may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors", then yes, it is fixed. Otherwise, not so much. I happen to have tested ICC performance against other compilers not too long ago, and it refuses to generate AVX instructions that are reachable when running on an AMD CPU. The -xO flag didn't help; all it did was turn off AVX altogether. Adding flags that prevent it from generating other execution paths than the AVX one didn't help either; when started, the binary would just generate a clean (but false) error message that the processor doesn't support its instructions, and exit immediately. From this, I concluded that after all these years, they still check for "GenuineIntel" instead of looking at the actual capability flags. In the end, we found absolutely no way to make ICC generate AVX instructions that would be executed on an AMD processor.
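To make the distinction concrete, here is a toy model in Python of the two dispatch strategies. It is not Intel's dispatcher code; the `Cpu` class, the path names and both functions are invented for illustration, and the vendor-gated version only mimics the behaviour described in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    vendor: str           # CPUID vendor string, e.g. "GenuineIntel", "AuthenticAMD"
    features: frozenset   # capability flags the CPU reports, e.g. {"sse2", "avx"}


# Code paths from fastest to slowest, each with the flag it requires.
PATHS = [("avx", "avx"), ("sse2", "sse2"), ("baseline", None)]


def dispatch_by_features(cpu):
    """Pick the fastest path whose required flag the CPU actually reports."""
    for path, required in PATHS:
        if required is None or required in cpu.features:
            return path


def dispatch_by_vendor(cpu):
    """Hypothetical vendor-gated dispatcher: fast paths for one vendor only."""
    if cpu.vendor != "GenuineIntel":
        return "baseline"  # capability flags are never consulted
    return dispatch_by_features(cpu)
```

With identical feature sets, `dispatch_by_features` treats an "AuthenticAMD" and a "GenuineIntel" CPU the same, while `dispatch_by_vendor` sends the former to the baseline path regardless of what it supports.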
Re:Benchmarks, trustworthy? by hairyfeet · 2013-07-13 23:09 · Score: 2

I'm sorry dude, but while you started out well, you quickly ran into bullshit. Look up what I said to look up, "Intel cripples compiler", and there you WILL see the smoking gun: the Pentium 3. If your argument were valid, that it's JUST that Intel knows its own chips well, then the P3 wouldn't get penalized by ICC... but it does. And again, ANY compiler that uses the CPUID vendor string to judge what a chip can do instead of the feature flags? Bullshit. But switch from "CentaurHauls" to "GenuineIntel" and tada! Your chip will "magically" score 30% HIGHER than it did before the switch, and the ONLY thing being changed is the CPUID.
The guy who did the tests tore down his code to see what it was doing, and what Intel has done with their cripple compiler is that ANY chip they don't want pushed, including their own P3, that doesn't report a P4 or better CPUID gets thrown into x87 mode. That's right, NO SSE, even though BOTH Intel and AMD have had SSE for over a decade now and ANY code could just check the CPU flags and know this. But this isn't about using what the chip has, it's about making sure Intel chips score higher no matter what.
So I'm sorry dude, but if that isn't grounds for antitrust then nothing is. Intel ignores CPU flags on ALL chips but their own and instead uses CPUID to make sure that any non-Intel chip gets thrown into slow mode so they can win. Again, it's quack.exe all over again, only worse, because plenty of companies compile with ICC and they are helping Intel rig the market against competition.

--
ACs don't waste your time replying, your posts are never seen by me.
Re:Benchmarks, trustworthy? by Runaway1956 · 2013-07-14 00:29 · Score: 2

I've been all AMD almost forever, for this reason among others.
http://forums.pcper.com/showthread.php?470102-Intel-s-compiler-cripples-code-on-AMD-and-VIA-chips 2010
http://www.theregister.co.uk/2009/12/16/intel_ftc/ 2009
http://techreport.com/news/8547/does-intel-compiler-cripple-amd-performance 2005
I found those three on the first page of my search results, and quit looking. Different search terms and a more determined search will find hits as old as about 1999, maybe even older. Hard to remember, but I think I first became aware of compiler cheats by Intel around 2000 or 2001. Prior to that, I naively thought that a compiler was a compiler.

--
"Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br