Slashdot Mirror


Proposal For Open-Source Benchmarks

nd writes: "Van Smith from Tom's Hardware has written a proposal that calls for open source benchmarking. He talks about the need for increasing the objectivity of benchmarking. The proposal is basically to develop a suite of open-source benchmarking tools and new methodologies. It's a rather dramatic column, as he discusses Transmeta and bias towards Intel, among other things." Well, once you get through the initial umpteen pages of preamble, the generically named A Modest Proposal is the actual point. Interesting idea - but I shall weep for the passing of bogo-MIPs as the definitive measure of system performance. *grin*

42 of 118 comments

  1. "Open source" ideology by Anonymous Coward · · Score: 3

    You know I'm getting somewhat sick of the whole open source thing. At first I thought it was a Good Thing, a way to allow people to collaborate on code and to keep it from being stolen. But gradually I am becoming more and more cynical about it - not so much the concept, but more the zealotry that surrounds it.

    Just look at the title of the article linked in this story - "A Call to Arms - A Proposal for Open-Source Benchmarks". WTF? Why is this a call to arms? Isn't this just a bit rabid for what is, after all, just an article about benchmarks. Benchmarks may be important, but they're not worth getting worked up over.

    And then the first page of the article is a rambling piece of tabloid "cyber"-journalism far worse than even Jon Katz has ever managed. Why is this diatribe necessary? Surely we all know what open-source is, and we all realise that the net has changed a lot of things. No, it's the same thing I see again and again - the zealotry of the open source proponent who feels the need for grand rhetoric and buzzword-filled arguments.

    There is an ideology behind open source, and a good one, but it has been taken too far. Richard Stallman is not the best person to represent such a diverse group of people - his radical politics and hatred of commercialism make him quick with the denunciation of anything he disagrees with, like the name Linux - after all, he'd rather it was "GNU/Linux" or even worse "Lignux". This kind of ideological zeal is certainly putting me off the idea, and others too I'm sure, but there seems to be a never-ending parade of people willing to subscribe to his beliefs and zealotry.

    Anyway, what I'd like to see is a return to what open source is about - writing good, free code for the use of all. There's no need for flaming attacks on closed-source software or whatever - that shouldn't be the point of open source, and is just a waste of time better spent coding. Unfortunately /. seems to provoke this kind of hysteria, but even with this I'll still read it :)

    If you disagree, feel free to reply. Nicely :)

  2. Funny... by Anonymous Coward · · Score: 4

    Tom: "Open Source Babble Transmeta Crusoe Linux Ramble Internet Cyber-World Paradigm Revolution"
    Slashdot Multitudes: Yay! (clapclapclapclap)

    Jon Katz: "Open Source Babble Transmeta Crusoe Linux Ramble Internet Cyber-World Paradigm Revolution"
    Slashdot Multitudes: Windbag! Parasite! Media Whore!(boooo, hissssss)

    1. Re:Funny... by bgarcia · · Score: 4

      He has a point about... wait a sec... Jon, is that you?

      --
      I'm a leaf on the wind. Watch how I soar.
  3. Re:Uhhh....Yeah, but who will use it? by adamsc · · Score: 2
    they'd be really embarrassed to sell a coupla million worth of mainframe with a benchmark figure lower than that of an alpha costing a couple of grand.
    Even though the mainframe processors have been getting pretty fast, I think this clashes with your earlier statement about mainframes being I/O beasts (which is completely accurate). What embarrassment is there in the fact that a system doesn't do well in a scenario completely different from what it was designed for? That'd be like complaining that the diesel locomotive you just bought sucks on 0-60 performance.

    Besides, any IBM salesdroid worth his commission would mention the RS/6000 line.
    __

  4. Read the HOWTO by pb · · Score: 3

    Anyone remember the Benchmarking HOWTO?

    There are *lots* of open-source benchmarks, and of course we can make new and better ones, and get a test suite together.

    For starters, the LBT (Linux Benchmarking Toolkit):
    Run the BYTEmarks (and the old UNIX ones too, they're funny), Whetstone, XBench... oh, and compile a stock kernel (and don't fiddle with the options, 2.0.0 was recommended then.)

    Personally, I'd also suggest bonnie, it's a good benchmark for disk performance, but you'd have to have a range of options here. (testing disk performance and cache, so you'd really want a large number here too, just to be fair. 2*RAM?)

    Also, when RedHat boots up, it has those RAID checksumming tests, those are good. They test different implementations of the same algorithm, so they say a lot about the individual chip. (whether it likes MMX, works well with different optimizations, and whatnot)
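    The boot-time approach described above (run several implementations of one algorithm, time each, keep the fastest) can be sketched in a few lines. This is only an illustration in Python, not the kernel's code: it assumes the checksum is a plain XOR parity, and the names `xor_loop`, `xor_reduce` and `pick_fastest` are made up for the example.

```python
import time
from functools import reduce
from operator import xor


def xor_loop(blocks):
    """XOR parity computed with an explicit loop."""
    parity = 0
    for block in blocks:
        parity ^= block
    return parity


def xor_reduce(blocks):
    """The same parity, computed with functools.reduce."""
    return reduce(xor, blocks, 0)


def pick_fastest(implementations, blocks, repeats=5):
    """Time every implementation on identical data.

    Returns the name of the fastest implementation and a dict holding
    the best (lowest) wall-clock time seen for each one.
    """
    timings = {}
    for name, func in implementations.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(blocks)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    fastest = min(timings, key=timings.get)
    return fastest, timings


if __name__ == "__main__":
    data = list(range(1, 10001))  # stand-in for disk blocks
    impls = {"loop": xor_loop, "reduce": xor_reduce}
    winner, times = pick_fastest(impls, data)
    print("fastest:", winner, times)
```

    Because every candidate must return the same parity, the timings compare implementations rather than algorithms, which is what makes the result say something about the chip.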
    ---
    pb Reply or e-mail; don't vaguely moderate.

  5. BogoMIPS by Nate+Fox · · Score: 2
    dmesg | grep -i bogo
    Calibrating delay loop... 897.84 BogoMIPS

    ...coming from a K6-2 450. Now everyone has fast boxes these days...what's the LOWEST BogoMIPS you've ever seen, and what CPU was it running? :)

    -----
    If Bill Gates had a nickel for every time Windows crashed...

    1. Re:BogoMIPS by technos · · Score: 2

      The lowest correct value I have ever seen was 4.81 BogoMips on a 386SX-18. (AMD Elan, hardware controlled variable clock, 2-18 Mhz) I did however run into the odd failure on a buggy Sunnylab MediaGX board that caused the BogoMips to be reported as 0.01 on one pass (after a lengthy hang) and 172.xx on the next with 2.2.4.

      --
      .sig: Now legally binding!
  6. bogo-MIPS explained by Dougal · · Score: 2

    Until I decided to look it up just there, I had no idea what bogo-MIPS was. Enlightenment can be found either here or here.
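    For the curious, the idea behind the number can be sketched roughly as follows: time a do-nothing delay loop, work out how many iterations fit in a second, and divide by 500,000. This is a loose Python imitation, not the kernel's calibration code; `delay_loop` and `estimate_bogomips` are hypothetical names, and an interpreted loop will report far lower figures than dmesg does on the same box.

```python
import time


def delay_loop(loops):
    """Busy loop that does no useful work, standing in for the kernel's delay loop."""
    while loops > 0:
        loops -= 1


def estimate_bogomips(min_seconds=0.05):
    """Double the loop count until one run takes at least min_seconds,
    then convert loops per second into a BogoMIPS-style figure."""
    loops = 1024
    while True:
        start = time.perf_counter()
        delay_loop(loops)
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            break
        loops *= 2
    loops_per_second = loops / elapsed
    return loops_per_second / 500_000


if __name__ == "__main__":
    print("%.2f BogoMIPS (interpreted, so take it even less seriously)" % estimate_bogomips())
```

    The figure only measures how fast one particular empty loop spins, which is why it is "bogus" as a measure of system performance.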

    -- Michael

  7. Re: Tom's Hardware by Matt+Lee · · Score: 2

    Umm, Tom's Hardware is not all that reputable, especially after that fiasco a while back about his involvement with NVIDIA. I think he's apologized, but the suspicion is still there. Frankly, I don't really trust ANY website with accurate benchmarks - I trust my own judgement after I read all of the benchmarks on all sites.

    Tom and his hardware aside, I think that open benchmarking tools are a good idea. However, we might see a different set of problems, in that if the hardware company knows exactly what code is going to be executed to benchmark their product, they can optimize/cheat for that code.

  8. Re:Uhhh....Yeah, but who will use it? by IntlHarvester · · Score: 3

    Back in the old days, Cadillac shipped cars with 472 and 500 cubic inch engines (about 8 liters in modern terms). These things put out nearly 400 HP and buttloads of torque. With the exception of some muscle cars and the Corvette, Cadillacs were the fastest cars GM built.

    But, nowhere in their advertising did they mention the size of the engine or the amount of power or anything about "performance". Back in those days everyone just knew Cadillacs had plenty of power. I suspect it's the same with IBM and their mainframes - just too much reputation to even advertise.
    --
    Business. Numbers. Money. People. Computer World.
  9. Re: Tom's Hardware by BrianH · · Score: 3

    If a benchmark could be written that would accurately simulate real world applications, then I'd say let them optimize their hardware/drivers for it. If the benchmark is good enough, then any optimizations made for the benchmark should also cause a performance increase in your genuine applications. Of course, therein lies the trick. Can you make a benchmark that realistic?

    --

    There is nothing so pathetic as seeing a beautiful young theory roughed up by a tough gang of facts.
  10. Another idea - do benchmarks need to be portable? by Roy+Ward · · Score: 2

    I think that one possible use for open source benchmarks is benchmarks which _can_ be tweaked for individual processors.

    To explain: define a set of tasks (this could include some of the same set of tasks as some of the current synthetic benchmarks), but define them by the algorithm that must be used rather than by the implementation. Then write a standard version in C (or whatever) that implements that algorithm as well as possible, to use as a base. _Then_ allow the proponents of particular platforms to modify a version of the code (possibly using #ifdefs or whatever to keep it in one code base) as long as they use the same algorithm.

    One possible test (I'm only using it as an example, not suggesting it) would be to calculate a certain portion of the Mandelbrot Set down to a depth of 10000 and put the results in an array of a certain structure, where it must be done using brute force with a precision of at least 40 binary significant digits (i.e. 64-bit longs or doubles) ... edge following not allowed. Part of doing the whole benchmark is doing the test n times, where the position of the result array keeps moving. With that, we'd start with some base code that does the job fairly well, then people can add #ifdef PPC_G3, #ifdef AMD_K6_2 and write pieces of code (using assembler if they like) to speed it up for their favourite architecture. A little bit of competition could be fun :-).
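    A rough sketch of how such a test could be organised, written in Python for brevity rather than the C-plus-#ifdef layout described above. Everything here is illustrative: the `TUNED` registry is a hypothetical stand-in for the per-platform #ifdef blocks, the region and grid size are arbitrary, and Python floats are doubles (53-bit significand), which meets the 40-bit requirement.

```python
import platform


def mandelbrot_depth(cr, ci, max_depth):
    """Brute-force escape-time iteration for one point, in double precision."""
    zr = zi = 0.0
    for depth in range(max_depth):
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
        if zr * zr + zi * zi > 4.0:
            return depth
    return max_depth


def render_reference(width, height, max_depth):
    """Portable base implementation: every point computed, no edge following."""
    rows = []
    for y in range(height):
        ci = -1.25 + 2.5 * y / (height - 1)
        row = []
        for x in range(width):
            cr = -2.0 + 3.0 * x / (width - 1)
            row.append(mandelbrot_depth(cr, ci, max_depth))
        rows.append(row)
    return rows


# Hypothetical registry of platform-tuned variants, keyed by machine type.
# Advocates would register their own function here, e.g. TUNED["ppc"] = render_g3.
TUNED = {}


def render(width, height, max_depth):
    """Use a tuned variant for this machine if one is registered."""
    impl = TUNED.get(platform.machine(), render_reference)
    return impl(width, height, max_depth)


def verify(candidate, width=16, height=12, max_depth=200):
    """A tuned variant is only admissible if its results match the reference."""
    return candidate(width, height, max_depth) == render_reference(width, height, max_depth)
```

    The `verify` step is where the open source part earns its keep: anyone can check that a tuned variant still produces the reference results before its timings are accepted.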

    The current distributed.net RC5-64 could be considered an example of such a benchmark - using processor tweaks is fine as long as you solve the problem.

    Open source can be used to prevent cheating, in that it can be seen that everyone is following the correct algorithm (or strict review by a trusted organization, as in the case of RC5-64). It also means that people can look over the tweaks for other platforms and see if any of them are applicable.

    The rationale for this approach:
    (1) change the rules so that what is currently 'cheating' becomes part of the process - it becomes very difficult to cheat.

    (2) A lot of 'real world' applications like Photoshop and Quake are presumably using these sorts of tweaks for their inner loops anyway, so this is mirrored by allowing the same tweaks in the tests.

    This idea has several downsides:
    (1) it can only provide synthetic benchmarks, and on fairly small examples (so optimizing it for particular architectures doesn't require huge resources)

    (2) it only tests the speed that can be got using assembler ... how good the compilers are doesn't really get factored in.

    (3) It requires each platform to have some advocates good enough and willing to put time into optimizing code so every platform gets a fair go.

    (4) because the tests are so small, it needs a moderately large number of individual benchmarks - for instance RC5-64 on its own is useless since it doesn't test memory speed, and PowerPC and x86 architectures have the huge advantage of having rotate instructions.

    (5) rather than give a single number (which is what people tend to want), resulting benchmarks would give a set of results for various aspects of the chip - this would make the results of more interest to technically oriented people.

    I'd be willing to put a little work into PowerPC G3 and possibly G4(Altivec) optimization in such a project.

    A more extreme version of this idea is to allow algorithm optimization too ... like do the Mandelbrot example (allowing edge following etc.) as fast as you can as long as the precision of the results is up to standard. I think that this would require too much time on part of the optimization writers though.

  11. Prior "Modest Proposal" by Robotech_Master · · Score: 2
    I personally prefer the original A Modest Proposal.

    (...gee, where did all my Karma go?)

    --
    Editor Emeritus and Senior Writer, TeleRead.org
  12. I'd love to contribute codes by PeterM+from+Berkeley · · Score: 2


    Our plasma simulation group has several simulation codes which would be pretty good as part of an open-source floating-point benchmark suite--*provided* this benchmark suite is distributed under the GPL or Berkeley license.

    We considered giving our codes to SPEC, but SPEC wants to be able to *sell* their benchmark suite for $500 a copy. This caused us legal headaches so rather than deal we didn't try to participate in SPECfp2000.

    We can offer C and C++ codes which exercise the FPU and memory subsystem heavily: they tend to be cache friendly though.

    PeterM

  13. Uhhh....Yeah, but who will use it? by Bowie+J.+Poag · · Score: 5

    In an industry where hard disks capacities are still measured in 1,000,000 bytes per megabyte, and 19" monitors are still 17.9" viewable, what makes you think that any company would adopt a benchmarking standard that was actually impartial to their product? The whole point of benchmarking your own product is to give the marketing department something to crow about. So, logically, they gear their hardware (and choose their benchmarks) accordingly.
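    To put a number on the disk complaint: a "megabyte" of 1,000,000 bytes is about 4.6% smaller than one of 1,048,576 bytes. A tiny sketch of the arithmetic, with made-up function names:

```python
def decimal_to_binary_mb(advertised_mb):
    """Convert megabytes of 1,000,000 bytes into megabytes of 2**20 bytes."""
    return advertised_mb * 1_000_000 / 2**20


def shortfall_percent():
    """How much smaller a 1,000,000-byte megabyte is than a 2**20-byte one."""
    return (1 - 1_000_000 / 2**20) * 100


if __name__ == "__main__":
    print("1000 advertised MB = %.1f binary MB" % decimal_to_binary_mb(1000))
    print("shortfall: %.2f%%" % shortfall_percent())
```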

    Sure, its a great thing for the rest of us, because we dont have anything we're trying to sell. Just dont expect anyone on the outside to hop on the bandwagon.

    Yours In Science,

    Bowie J. Poag
    Project Founder, PROPAGANDA For Linux (http://metalab.unc.edu/propaganda)

    --
    Bowie J. Poag

    1. Re:Uhhh....Yeah, but who will use it? by technos · · Score: 2

      The ultimate example is IBM. Ask a IBM rep how fast the new mainframe model is. S/he'll try to buy you off with a relative performance index, or will tell you that it is X% faster than last years, or twice as fast as their model with 1/2 as many processors, or that it is directly comparable with Sun's model. No mention of actual performance, no 'We're running eight PowerPC processors at X Mhz, and each delivers a raw X FLOPS'. Sure, they won't stop you from publishing your own benchmarks, but they're not forthcoming either..

      --
      .sig: Now legally binding!
    2. Re:Uhhh....Yeah, but who will use it? by technos · · Score: 2

      Lincoln did much the same thing until the late sixties.. Cadillac started claiming specs after the song 'Little Nash Rambler' hit the AM airwaves in 1956 or so (Rambler pulls along side a proud Caddy owner topped out at 120 and asks how to get out of second gear)

      Unlike comparing the Ford 460 to the GM 427, now IBM uses commodity processors for most of its machines; You can directly compare, by virtue of the speed and number of processors,(minus the OS and microcode fudge factor) an IBM mini to a SMP PIII, or a Altivec-enabled Mac, or a Alpha. Big Blue's mainframes still use somewhat in-house powerplants, but knowing that the 2001 390 is 1.25 times faster than the last revision isn't going to help you make a purchasing decision between it and a small Alpha cluster..

      --
      .sig: Now legally binding!
  14. Re:Slowest BogoMIPS I've seen... by AJWM · · Score: 2

    The 486dx2-66 beside me here is 33.18 bogoMIPS. It's running my webserver and a few other network services. (Heck, even X Windows isn't too bad on it anymore since I upgraded from 16M to 24M, but that's not usually running.)

    --
    -- Alastair
  15. Re:Benchmarks should not be Open Source by nd · · Score: 2

    Nothing's stopping them from doing that now with the current benchmarking tools.

    Having the source code for it will only make this trick slightly easier (less reverse engineering needed). Besides, if information leaked out that actual HARDWARE cheated on benchmarks, they would be under a LOT of criticism and I suspect they'd be caught rather quickly.

  16. Re:Benchmarks by nature are subjective by angst_ridden_hipster · · Score: 2
    What defines a benchmark? Is it not a measurement of the performance of one aspect of a system?
    Benchmarks should be open sourced, the community that uses the system(s) at large should define what the tests
    (torturous as they should be) actually test. That will determine the difference between fluff and actual fact.

    Of course, it's also Standard Operating Procedure to optimize products to perform well on Benchmarks specifically (I hear stories about compilers that seek out "Whetstones" or "Dhrystones" and will substitute hand-optimized machine code for 'em rather than just compile the code).

    Bottom line, is you can't trust 3rd party benchmarks. You need to test a system for your specific application. This, though, is prohibitively expensive for most applications. So you gotta rely on benchmarks.

    Therefore, make your benchmark as close to real-world use as possible! Especially if you're open-sourcing it. Then, optimizing for the benchmark is actually optimizing for real-world use.

    (The problem with this, of course, is that your real-world use may be dramatically different than mine. If I'm rendering 3D graphics, I have different needs than someone running, say, a web server. So this then requires a family of benchmarks, reflecting real-world usage in different domains of endeavor.)

    --
    Eloi, Eloi, lema sabachtani?
    www.fogbound.net
  17. The Good, The Bad, and the Ugly... by Silverpike · · Score: 4
    Ol' Tom has a good point. Sysmark really isn't the right solution for comparing processors. What he proposes is a realistic, achievable goal, but you have to define the playing field first.

    The Good:

    There already is a great benchmark for processors, and it's called SPEC. Yes, it's not open source, but it's really quite reliable for comparing CPUs of any architecture. As slashdot user "cweber" pointed out in his post, they have been doing this for 11 years, and they periodically revise their benchmark suite to stress CPUs more uniformly.

    The open-source method. This is really good to ensure that there are no cheaters at the benchmark level.

    Tom's interesting ideas on Crusoe. This stems from the fact that SPECmarks don't quite approximate the real usage that Crusoe depends on to use its hotspot optimizations. However, we are interested in the raw sustained speed of the processor (in this case), not the speed of the OS or its task swap latency. Tough problems to solve.

    Open-source means that the benchmark code will be able to take advantage of the best compiler available for the target CPU (see comment at end).

    The Bad:

    Anyone who has done benchmarks knows that even small variations in system config can have strange or harmful effects on the benchmark results. This open-source effort is going to have to have a database of hardware configs in order for this to be useful.
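    One way to read the "database of hardware configs" requirement is that no score should ever be stored without a description of the machine that produced it. A minimal sketch of such a record, assuming only what the Python standard library can report; the field names and the `make_record` helper are hypothetical, and a real suite would need much more detail (memory, chipset, drivers, compiler flags).

```python
import json
import os
import platform


def describe_system():
    """Collect the configuration details the standard library exposes."""
    return {
        "machine": platform.machine(),
        "processor": platform.processor(),
        "system": platform.system(),
        "release": platform.release(),
        "cpu_count": os.cpu_count(),
        "python": platform.python_version(),
    }


def make_record(benchmark, score, units):
    """Bundle one benchmark result with the configuration that produced it."""
    return {
        "benchmark": benchmark,
        "score": score,
        "units": units,
        "config": describe_system(),
    }


if __name__ == "__main__":
    # "example-test" and its score are placeholders, not real results.
    print(json.dumps(make_record("example-test", 1.0, "seconds"), indent=2))
```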

    The Ugly:

    Vendors are going to oppose this (at least not support it). Why? Because plain and simple they have an interest in promoting the most favorable statistics possible about their products. They want to keep feeding you "polygon fill rates" and "texels per second" because their card may not stand up in a direct test program comparison. Plus, they are just dying to convince you that they have new BogusMarketingAcronym (tm) technology and their competitor does not. Nevermind that SSE and 3Dnow do pretty much the same thing -- companies have an interest in differentiating themselves as much as possible.

    If this benchmark actually takes off (and gets widely accepted), we might get cheaters at the firmware or hardware level. This has happened before -- although which company it was and which benchmark they cheated I can't remember. I can't find it on the net or remember to save my life (sigh)...

    I also need to say something to the people who think a processor should be judged independently of a compiler. This is just plain dumb. Why? Because a processor and its compiler are a team. You can't use one without the other. When a chip is designed, there is a direct information dependence between the chip architects and the compiler writers. They are designed as a pair (ideally), and they should be tested as such. If a given compiler has great optimizations, then great! That means the compiler understands its target real well. It is a win for both the CPU and the compiler for pulling it off. This compiler is going to do the same kinds of optimizations when vendors use it to write programs, so that helps the comparison between benchmark code and apps.

    However, I can see the need to compare not only the best compiler, but GCC as well, because of its broad acceptance. But if you are serious about performance, and want to get every ounce of juice out of your chip, you use the vendor provided compilers, not GCC. Don't get me wrong, GCC is great for compliance and portability, but it usually doesn't compare well with vendor compilers for generated code speed (with the possible exception of IA-32).

    Ars Technica also published, a while back, some good information regarding CPU benchmarks. Check it out if you are interested in SPEC or CPU benchmarks in general.

    --
    The opinions I post here have nothing to do with my employer.
  18. what about SPEC? by cweber · · Score: 2

    I know, SPEC isn't open source in the strict sense, but it IS a broadly accepted benchmark suite of which source is available, and it has served us well for the past 11 years.

    As well as a generic benchmark can serve, anyway. There is of course no substitute for checking out a box with your own apps and workload.

  19. Re:This is DEFINATELY a good idea.. and here is wh by cweber · · Score: 2

    That is why open source benchmarks are a good idea -- not only does it allow people to improve on the code directly, but it lets people see exactly what is going on behind the scenes.

    That is a very bad idea, indeed. If the code base of the benchmark changes at all, none of the numbers are comparable between releases. This is exactly why tightly controlled benchmarks like SPEC have been successful. SPEC only changes every few years, there are clear rules of what you can do and what not while compiling and running the benchmarks and there are rules about how to report the resulting numbers.

    Inasfar as one can trust generic benchmarks, SPEC has held up nicely and allowed us to superficially compare systems from Unix vendors with different CPUs, different architecture and different OS. Even with infrequent updates, the transition from one version of SPEC to the next gets in the way sometimes. I can only imagine how bad a true open source solution without additional rules would be.

  20. Cheating by Foogle · · Score: 2
    What happens when someone alters the source code? I mean, that's what open source is about. So somebody fsck's around with the source code so that it works better with their product than anyone else's. It was still an "Open Benchmark", right? I think this would be even worse than the current situation.

    -----------

    "You can't shake the Devil's hand and say you're only kidding."

  21. Re:Benchmarks should not be Open Source by leiz · · Score: 2

    ATI has done it before: they took the ATI Rage Pro and optimized the drivers on it... creating the ATI Rage Pro Turbo video card, which is the exact same card as the Rage Pro but with a different driver optimized for WinBench 3D. It got much higher scores in WinBench 3D but no performance increase in real world apps like Quake 2. See this link on Tom's website for more info.


    _______________________________________________
    There is no statute of limitation on stupidity.

  22. Re:Slowest BogoMIPS I've seen... by Mr.+Slippery · · Score: 2
    49.87 BogoMIPS i486dx2-100 btw
    My Pentium-90 says 36.04 BogoMIPS; my dual P75 says 29.90. Maybe I should upgrade to a 486. B-)
    --
    Tom Swiss | the infamous tms | my blog
    You cannot wash away blood with blood
  23. Re:Give Transmeta a little more wiggle room?? by Mr.+Slippery · · Score: 2
    (i.e. group and user attributes in the filesystem are a real waste on a handheld device)?
    If I borrow your handheld, do you want me to be able to do the same stuff you can, or to be restricted to a guest account?

    What if an organization maintains a pool of handhelds, and you grab a different one every day? Or for each task?

    Even if you never loan it out and it's yours forever, having different users for administrative tasks is a Good Thing; you don't want to be root all the time.

    --
    Tom Swiss | the infamous tms | my blog
    You cannot wash away blood with blood
  24. This is DEFINATELY a good idea.. and here is why. by citizenc · · Score: 2

    I've been reading computer magazines for years, and I am always amazed at the way that hardware companies will report bench mark results. Instead of explaining it in a simple, easy-to-understand manner, they will play on the ignorance of the average consumer, and throw terms like "triangles per second", "refresh rate", "maximum colours", and "frames per second" around, hoping to sound good!

    Several times, I have read ads for hardware that proclaim that they are faster than the competitor. They even have pretty bar graphs! And of course, their bar is much longer. However, here is a great place to repeat the long-unbelieved-until-now phrase "size doesn't matter."

    What is stupefyingly bizarre is that I can turn the page, find the competitor's ad, and they proclaim exactly the same thing! It's mind boggling, to say the least.

    That is why open source benchmarks are a good idea -- not only does it allow people to improve on the code directly, but it lets people see exactly what is going on behind the scenes.

    Or maybe, we shouldn't let companies do benchmarks for their own products. After all, you can make stats say pretty much whatever you want them to. Or maybe we should just ignore ads completely. (I know that I don't trust magazine ads for sure. Rather, I go and find MULTIPLE reviews of hardware, both online, and word-of-mouth, to get an accurate picture.)


    ,-----.----...---..--..-....-
    ' CitizenC
    ' WebMaster, PlanetQ3F
    `-----.----...---..--..-....-

  25. Slowest BogoMIPS I've seen... by zorgon · · Score: 2

    % dmesg | grep Bogo
    Calibrating delay loop.. ok - 49.87 BogoMIPS

    i486dx2-100 btw

    Got a friend who claims to have a linux caching secondary dns server on a 386 -- see if I can get him to give up the bogomips figure...

    --

    I am quite civilized, and I should be brought a beer immediately. -- Bruce Sterling

  26. Benchmarks by nature are subjective by zerodvyd · · Score: 3

    to be truly objective, the actual benchmark code should be written in a cross platform capacity. I question the reliability of benchmarking software in general, go ahead and call me a skeptic or whatnot...but I stand by that claim. What defines a benchmark? Is it not a measurement of the performance of one aspect of a system? Benchmarks should be open sourced, the community that uses the system(s) at large should define what the tests (torturous as they should be) actually test. That will determine the difference between fluff and actual fact.

    ...just as long as they keep the BogoMIPS around I'm okay with it :) lol

    zerodvyd

  27. The REAL problem with Benchmarks by _Mustang · · Score: 2

    is translating theory into the practical. With the exception of some *very* specific-to-use benchmarks that I've seen, everything else has always been a very poor approximation of what someone "Thinks" actually is a practical sample of what DOES happen when a computer is used.
    As the old saying goes, "There are lies, damn lies, and statistics", and benchmarks are the most advanced form of statistics. Draw your own conclusions..

  28. I'll tell ya who. . . by xant · · Score: 3
    Well . . . kind of the POINT of this whole exercise is to take the ability to perform referenceable benchmarks out of the hands of the interested parties (those who make money from them). Closed-source, commercial benchmarks are inherently flawed for some of the same reasons closed-source, commercial security is flawed. The difference is that those interested in finding and exploiting these flaws aren't crackers, but hardware companies.

    So to answer your question: Tom's Hardware, and other reputable benchmarking authorities, would use it. TH has rapidly become one of the highest-integrity, best-respected hardware/computing sites around, even (indeed especially) for the Windows crowd. (After all, Win32 is still the dominant gaming platform.) If such a thing as open benching became popular, then commercial entities would be FORCED to use the open benchmarks or be accused of marketing skewed numbers, whether those accusations had merit or not.

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  29. We do need it... I suppose.. by affegott · · Score: 2

    I don't know why so many people believe so strongly in processor benchmarks. Knowing how many integer operations a second a processor can do is nice, but it doesn't give you an overall picture. I have seen VERY few _good_ overall system benchmarking tools... maybe this will lead to more.
    You would think getting a non-biased test would be fairly easy... or perhaps it should be up to the manufacturers to make the benchmarks... that way they could bend the truth as much as possible... but if everyone is bending the truth, then it will all even out. :-)

    Yup.

  30. Re:Generic benchmarks are useless by Signail11 · · Score: 2

    Where do you get this from? For web server applications, I would want insane memory bandwidth, good SMP capabilities, and efficient cache handling/IO subsystems. For scientific applications, I wouldn't usually consider an x86 processor except under very specific cases, SSE or no SSE. The FP register stack just about kills parallelism and SSE only offers 32 bits of precision. I really don't want to save 3 days on computation time to spend 3 weeks using numerical analysis to hunt down pesky instabilities. To decode PPV signals, standard integer math is more than adequate (I believe that it uses variable line-based rotations and offset phase shifting). Spec, TPC, and other benchmarks are *very* useful when buying high-end computer systems. Of course, in the end, what counts is performance on *your* application; that's why aggregate system benchmarks (WinBench 2000, etc.) are sometimes useful for home/office users who want to get a gauge for performance on typical tasks. Then again, if a user is going to be sending email or writing 2-page memos, the computer he/she is using won't matter much.
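    The worry about 32-bit precision is easy to demonstrate with a toy accumulation. The sketch below is an illustration only: it emulates single precision in Python by rounding through `struct` after every addition, and the helper names are invented for the example. It says nothing about any particular SSE code path.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(value, count):
    """Accumulate with every intermediate result rounded to 32 bits."""
    total = 0.0
    v = to_float32(value)
    for _ in range(count):
        total = to_float32(total + v)
    return total


def sum_double(value, count):
    """Accumulate in ordinary double precision."""
    total = 0.0
    for _ in range(count):
        total += value
    return total


if __name__ == "__main__":
    n = 100_000
    exact = 10_000.0  # 0.1 added 100,000 times, in exact arithmetic
    print("single-precision error:", abs(sum_single(0.1, n) - exact))
    print("double-precision error:", abs(sum_double(0.1, n) - exact))
```

    Even this trivial loop drifts visibly in single precision, and iterative scientific codes compound the effect over many more operations.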

  31. Re:Benchmarks should not be Open Source by Signail11 · · Score: 2

    This comment deals with scientific programming; YMMV wrt game or graphics benchmarks.
    I *want* compiler writers and microarchitecture designers to optimize for reasonably well-designed benchmarks, such as Spec. I *want* compilers to recognize critical code fragments, idioms, and kernels in the Spec benchmarks and emit perfectly scheduled code. I don't care if the compiler can't make the optimization in the general case; when I write scientific code, I take care to use the standard style of writing certain common transformations, such as dot products so that compilers (Compaq's ccc and SGI's compilers are excellent in this regard) that target SpecFP pattern match the code and produce good code. I want microarchitecture designers to include elements that make their chips run Spec fast, since if a benchmark in Spec runs quickly and my computational task is similar, it will most likely benefit from any architecture changes as well. Thus, selecting good benchmarks in a suite is utterly critical if the benchmark number will have any value at all; moreover, there are many incidental benefits to selecting benchmarks that represent commonly used tasks or programs.

  32. Re:Here are some suggestions... by Signail11 · · Score: 2

    That wasn't quite my point. Open source programs constitute a minority of the SpecINT and SpecFP suites; each of the individual benchmarks is designed to be representative of the workload of a typical *scientific workstation*. Spec cares not about your 3D video card, your CD-ROM transfer rates, or your 3D sound card. There are no empty loops in the Spec benchmarks, and instruction dispatch speed is not tested as a discrete benchmark; it will factor into the overall score. As for compilers optimizing specially (via pattern matching) for Spec idioms or code fragments, all the better luck for them! If the compiler can, say, sense the standard form for DAXPY or another common Spec kernel and emit inline hand-scheduled code for the fragment, I will be all the happier, since I can use the same fragment in my code and tempt the compiler into emitting specially optimized asm (this works especially well on the good compilers: Intel's reference compiler, Compaq's ccc, SGI's sgi-perflib/compiler suite).
    With regard to your comment about realistic computer usage: those tests that you suggest can be done in such a minute duration of time on any modern computer that they are _not worth testing_! It's essentially "fast enough" for any possible user; let's face it, the CEO's secretary does not need a P-III 800 or an Athlon, to say nothing of an Origin 2000, Starfire, RS/6000, or AlphaCluster.
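
    For reference, DAXPY is the BLAS kernel y = a*x + y on double-precision vectors. A minimal sketch of its usual loop form is below; this is an illustration rather than the Spec or BLAS source, and the argument order here is an assumption (the real BLAS routine also takes stride arguments).

```c
#include <stddef.h>

/* Illustrative unit-stride DAXPY: y[i] = a * x[i] + y[i].
   The real BLAS routine additionally accepts increments for x and y. */
void daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

    Writing the loop in exactly this shape is what gives a pattern-matching compiler its best chance of substituting a tuned sequence.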

  33. Here are some suggestions... by Signail11 · · Score: 4

    I suggest basing an open-source benchmark suite on the existing Spec benchmarks, as most of the code (or functionally equivalent code) is relatively freely available. Of the 12 SpecINT 2000 benchmarks, 5 (gzip, gcc, crafty, perlbmk, and bzip2) already exist as open-source programs. The combinatorial optimization (181.mcf) benchmark's code is also on the Internet at www.zib.de, free for academic use. I'm sure someone could make a cleanroom interpretation of something similar. 175.vpr (a place and route program) can be found at http://www.eecg.toronto.edu/~vaughn/vpr/vpr.html. 197.parser is essentially a CS student's problem about parsing and extracting strings. 252.eon is a raytracer (we can use POVRay instead). 254.gap is a general purpose math library (Victor Shoup's NTL library exercises most of the same functions). 255.vortex is a standard RDBMS; MySQL or an equivalent could be used here. 300.twolf seems rather similar to 175.vpr; as circuit designing is really far removed from my field, I'll leave this to someone else.

  34. QuakeIII? by jallen02 · · Score: 2

    Don't correct me if I'm wrong!

    QuakeIII is the most 31337 Benchmark in existence. Don't you guys realize what Carmack really was doing when he did a Linux port of QIII?

    You need nothing other than QIII for reliable benchmarking of a system.. and any other benchmarks just don't matter! Nuff said..

    :)

  35. Bogomips do rock, but... by Ron+Harwood · · Score: 2

    ...I do want a way to compare different processors/operating systems/video cards/etc. objectively without having to obtain a 3rd party's tools and pay for them... (Which I think you have to do with SPECint, right?)

    It would be great to have tools like that, and create a repository of the results.

  36. Benchmarks by Alexius · · Score: 2

    Wouldn't It Be A Bit More Helpful To Have Some Benchmarks That No One Knew What Instructions Were Used To Make Them? This Way, You Could Optimize Hardware For The Benchmarks At The Expense Of The Rest Of The Workload. And An Open Source Benchmark Would Be Constantly Improving, Thus A Reading From One Year Could Be Completely Different Than One From The Next Year, Making It Tough To Compare New Technology To What You Would Already Have.

    --
    `Lex - Find Me Here: Text Appeal

  37. Benchmarks should not be Open Source by bonzoesc · · Score: 2

    As soon as hardware vendors learn they can make their hardware helluva fast by looking for the sequence of instructions present in the open source benchmark program, they will just make their crappy old ATI Rage 2MB video cards that are really expensive and wait for the '3Dbenchmark.start();' command, at which point they will say '3Dbenchmark.finished("only .000000001 second!");' This is not good.

    Unrelated note: A Modest Proposal was an essay written by Jonathan Swift that proposed that poor people should sell their babies for food. It was satirical and shocking, but most of all, very entertaining.

    "Assume the worst about people, and you'll generally be correct"

  38. Useful benchmarks by grue23 · · Score: 2

    The only benchmarks that I've ever found useful are the ones that measure the performance of commonly used applications (for example all the great stuff on video cards at Tom's Hardware Page). That's all I really care about. Benchmarks of arbitrary performance are only useful to most home users for whacking off to.

    Most companies that develop systems that might require some form of benchmarks are likely going to have to develop their own with prototypes of their application; I can't see anything arbitrary being very helpful in predicting how a particular system will perform in comparison with any other system.