A Rubric for IT Analysis

Posted by timothy on Sunday June 12, 2005 @09:07AM from the useful-heuristics dept.

Aredridel writes "Zed A. Shaw has an insightful article on how analyses of software systems should be performed, and how they're often done wrong. It should be required reading for all IT journalists, and all readers of IT journals."

How to Lie with statistics by alanw · 2005-06-12 09:15 · Score: 4, Insightful

When you have read that article, go and buy a copy of the 1954 classic How to Lie with Statistics by Darrell Huff, ISBN 0393310728.

Red / Green (Bad graph examples) by mister_llah · 2005-06-12 09:17 · Score: 4, Insightful

The usage of red and green determines the meaning; if the higher statistic were red, it wouldn't have the "bad" effect he is describing.

The statement that green is good, red is bad, is not really true. Red is an attention getter; green is an easy, unobtrusive color (relaxing, generally).

While it is easy enough to make the leap that 'red' is bad because red is often an 'alert' color, the reason red is an alert color is that it is an attention getter, not that it means bad.

Why else do you think so many people drive red sports cars? If red was bad, why wouldn't they drive green ones? ... and the graphs aren't necessarily misleading in the aspect of spacing, the graph seems to be trying to show the ratio of difference, not the difference amount. ... aside from what looks like a bad example of bad examples... there are some good points in the article...

--
MoM++ - A Classic Expanded - [Master of Magic 1.5]
http://mompp.sourceforge.net/

INSIGHTFUL?!?! by imsabbel · 2005-06-12 09:36 · Score: 2, Insightful

"Also look at the axes and their layout. The first graph has the y-axis (left side) going in 50 increments, and the second graph has the y-axis going in 100 increments. This distorts the graphs to make it look like they are the same results, but actually they look very different when graphed properly. What's worse is that the x-axis for both graphs is the same which means they are changing one scale (y-axis) without adjusting the other scale (x-axis). This creates a distorted graph."

Well, no, idiot. When graphed properly, they look the same. Both tests show an absolutely comparable performance ratio. What does it matter that the faster machine runs both OSes faster? How does this skew anything? Is the concept of relative speed increases a new concept for the creator of the article?

A REAL loaded graph would suppress the y-axis or something to push the lower graph further down, or to skew the proportions.
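
For what it's worth, the effect of a suppressed y-axis is easy to put a number on. A minimal sketch in Python, assuming two made-up bar values (100 and 175, not taken from the article's graphs) and a hypothetical helper `drawn_height_ratio`:

```python
def drawn_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the y-axis starts at `baseline` instead of zero."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)

# Hypothetical values, chosen only for illustration.
honest = drawn_height_ratio(100.0, 175.0)        # axis starts at 0
loaded = drawn_height_ratio(100.0, 175.0, 90.0)  # axis starts at 90

print(f"axis from 0:  taller bar is {honest:.2f}x the shorter one")
print(f"axis from 90: taller bar is {loaded:.2f}x the shorter one")
```

With the axis starting at zero the bars are drawn in the true 1.75:1 proportion; starting it at 90 makes the same data look like 8.5:1. That would be a loaded graph. Changing the scale between two graphs does nothing of the sort.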

Man, is today really shit article day on slashdot?

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?

Re:Lying with statistics by Otter · 2005-06-12 10:22 · Score: 2, Insightful

1) Yes, the grandparent got the x- and y-axis confused in his post.
2) The point of the graphs is that the Windows server has a roughly 75% performance advantage over Samba on both systems. The different y-axes are used because one system is twice as fast as the other and using the same scale on both graphs would leave half of one graph empty. I would say the choice of scales is entirely correct.

3) The x-axis is labelled in numbers, not intervals. Excel graphs place tickmarks between the labels. You can complain about them, but the author didn't place them there. In any case, Samba doesn't look any better no matter where you put them.

4) Sorry, the whining about the red is just weak. In any case, this is another case of an Excel default being used, not a malevolent anti-Lunix conspiracy...
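
Point 2 can be checked in a few lines of Python. The throughput figures here are invented for illustration (they are not read off the article's graphs); the only things carried over are the roughly 75% advantage and one machine being twice as fast as the other. `relative_advantage` is a hypothetical helper name.

```python
# Hypothetical throughput numbers, NOT taken from the article's graphs.
slow_machine = {"samba": 100.0, "windows": 175.0}
# Assume the second machine runs everything exactly twice as fast.
fast_machine = {name: 2 * value for name, value in slow_machine.items()}

def relative_advantage(results, winner="windows", loser="samba"):
    """Fractional advantage of winner over loser (0.75 means 75% faster)."""
    return results[winner] / results[loser] - 1.0

slow_adv = relative_advantage(slow_machine)
fast_adv = relative_advantage(fast_machine)

print(f"slow machine: {slow_adv:.0%}, fast machine: {fast_adv:.0%}")
```

The advantage comes out the same on both machines, which is why rescaling the y-axis to fit each machine's range leaves the shape of the graph unchanged.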

--
What I'm listening to now on Pandora...

This is a useful enumeration by eclewis · 2005-06-12 11:57 · Score: 2, Insightful

Naturally, despite Zed's assertion, not all graphs need to avoid his enumeration of pitfalls. For example, if the target audience is mathematically unsophisticated, most statistics (save perhaps the mean) are inappropriate. Or perhaps red is meaningful in some contexts. Or maybe the presenter wants to show a very small delta, so an axis range is chosen to illustrate this.

Nevertheless, Zed's enumeration can be extremely valuable in helping a discerning reader (who doesn't already know it all!) to critically interpret graphs in order to decide what s/he may conclude. For example, if system X appears to outperform system Y, but the difference may be within the (unpresented) deviation, one should not accept the assertion that X is superior. Instead, one may conclude that X may be better than or comparable to Y.
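
To make the X-versus-Y example concrete, here is a rough sketch in Python. The benchmark runs are hypothetical, and the check (is the gap between the means smaller than the larger sample standard deviation?) is a crude rule of thumb of my own choosing, not a proper significance test and not something taken from Zed's article.

```python
from statistics import mean, stdev

# Hypothetical benchmark runs (requests/sec); not data from the article.
runs_x = [102.0, 98.0, 110.0, 95.0, 105.0]
runs_y = [100.0, 97.0, 104.0, 99.0, 100.0]

def difference_within_deviation(a, b):
    """True when the gap between the two means is no larger than
    the larger of the two sample standard deviations."""
    gap = abs(mean(a) - mean(b))
    return gap <= max(stdev(a), stdev(b))

if difference_within_deviation(runs_x, runs_y):
    print("X may be better than or comparable to Y")
else:
    print("X and Y differ by more than the run-to-run spread")
```

Here X averages 102 against Y's 100, but X's runs vary by several requests per second, so a graph showing only the two means would overstate the case for X.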

Zed's article can help some of us tell the difference between lies and truth. That's a good thing. The unfortunate weakness of the article is that the example is not particularly compelling. It simply doesn't illustrate the most important pitfalls.