Why Programmers Need To Learn Statistics

93% of Programmers Think You're Wrong by Greyfox · 2010-01-09 11:37 · Score: 3, Interesting

Everything I needed to know about statistics I learned playing poker.

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Re:93% of Programmers Think You're Wrong by Anonymous Coward · 2010-01-09 11:49 · Score: 5, Interesting

The only statistics book you'll ever need
Re:93% of Programmers Think You're Wrong by ShakaUVM · 2010-01-09 12:01 · Score: 5, Insightful

A manga statistics book, eh?
I just realized I was a nerd. I looked at the table of contents and closed it down, then realized I hadn't even looked at the short skirt-wearing protagonist.
Sigh...
But to answer the article's point, elementary statistics are very easy. Advanced statistics are very hard. It's kind of like how people think "knowing the difference between circles and squares" is geometry and so analytical geometry must be just more of the same, right? It's quite possible the programmers think they know statistics because they know they're vaguely supposed to do a run multiple times, and maybe average the results or something.
It's also possible the author of the article is a know-it-all douchebag who tries to solve problems with overwrought solutions.
From TFA: "Zed: Fuck! Fuck! I have eyes! You do not! See!? No?! Exactly! Because you can't fucking see because you have no fucking eyes! Arrggh!"
Just throwing that theory out there.
Re:93% of Programmers Think You're Wrong by jo42 · 2010-01-09 12:38 · Score: 1, Funny

"Lies, damn lies and statistics" is all you need to know about statistics.
Re:93% of Programmers Think You're Wrong by Daniel+Dvorkin · 2010-01-09 13:20 · Score: 5, Insightful

"Lies, damn lies and statistics" is all you need to know about statistics.
This is right up there with "'click on the big blue e' is all you need to know about the internet."
Speaking as both a statistician and a computer scientist, I've seen the statistics-vs.-CS argument play out many times before, and the lack of knowledge on both sides is really striking, but not all that surprising -- both are hard subjects which take a lot of work to master. The lack of mutual respect is both infuriating and pathetic, and there's no excuse for it.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:93% of Programmers Think You're Wrong by Anonymous Coward · 2010-01-09 14:20 · Score: 1, Interesting

If you think that then answer the following problem:
If I flip two coins and one of them is heads, what are the odds the other one is also heads?
Re:93% of Programmers Think You're Wrong by srwood · 2010-01-09 14:27 · Score: 1

That's probability not statistics.
Re:93% of Programmers Think You're Wrong by Devout_IPUite · 2010-01-09 14:39 · Score: 4, Insightful

"It's also possible the author of the article is a know-it-all douchebag who tries to solve problems with overwrought solutions."

That was kinda what I got from this. Sure, my powers-of-ten runs to determine performance aren't statistically sound. Did I say they were? No. Why don't I care? Because my samples are cheap. Spiking vs non-spiking is something pretty easy to see when you glance at the data.

I mean, he said we're going to die if we don't learn statistics, but he never gave a compelling argument for it.

The best example was users, but even that was lacking. If you design a script that's as aggressive on a system as a high use user and your system supports as many 'users' as students, you're safe, if it supports less you work on qualifying the problem better then.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 14:57 · Score: 2, Informative

I'm sure it's not 50%, and not 25%
heads=1, tails = 0
0-0 0-1 1-0 1-1
so if one of them is 1, there's a 33.33% chance the other is 1 too.
i can work it out that way for 2 binary possibilities. couldn't generalize it to x coins with y sides :-/
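For what it's worth, the same counting can be brute-forced for x coins with y sides. This is only a sketch under one assumed reading of the generalized question: given that at least one coin shows face 0, what is the chance that every coin shows face 0? The function names and the closed form 1 / (y^x - (y-1)^x) are illustrative additions, not something from the thread.

```python
from fractions import Fraction
from itertools import product


def prob_all_match(n_coins: int, n_sides: int) -> Fraction:
    """P(every coin shows face 0 | at least one coin shows face 0), by enumeration."""
    # Every ordered outcome is equally likely for fair, independent coins.
    outcomes = list(product(range(n_sides), repeat=n_coins))
    # Keep only the outcomes that satisfy the condition "at least one shows face 0".
    at_least_one = [o for o in outcomes if 0 in o]
    # Of those, count the outcomes where all coins show face 0.
    all_match = [o for o in at_least_one if all(face == 0 for face in o)]
    return Fraction(len(all_match), len(at_least_one))


def prob_all_match_closed_form(n_coins: int, n_sides: int) -> Fraction:
    """Same quantity without enumeration: one favourable outcome out of
    (all outcomes) minus (outcomes with no face 0 at all)."""
    return Fraction(1, n_sides ** n_coins - (n_sides - 1) ** n_coins)


if __name__ == "__main__":
    for coins, sides in [(2, 2), (3, 2), (2, 6)]:
        print(coins, sides, prob_all_match(coins, sides))
```

For 2 coins with 2 sides this reproduces the 1/3 above; other readings of "the other one" would need a different filter.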

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 14:58 · Score: 1

true too, trickster !

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by markov_chain · 2010-01-09 15:04 · Score: 1

*facepalm*

--
Tsunami -- You can't bring a good wave down!
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 15:09 · Score: 1

he said "one of them", not "the first one" nor "the second one", so, really, "any one of them"... what do YOU come up with ?

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by Hurricane78 · 2010-01-09 15:26 · Score: 1

Thing is: You can only be expert in ONE of them. Period.
I for one, choose CS. Waaayy more interesting, and compared to the nerdiness level of statistics, we look like Joe Sixpack coming to the club in his sports car, with two girls in the back. ;)
If I want to do statistics, I can always hire someone.

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Re:93% of Programmers Think You're Wrong by Seedy2 · 2010-01-09 15:39 · Score: 1

"Lies, damn lies and statistics" is all you need to know about statistics.
This is right up there with "'click on the big blue e' is all you need to know about the internet."
Speaking as both a statistician and a computer scientist, I've seen the statistics-vs.-CS argument play out many times before, and the lack of knowledge on both sides is really striking, but not all that surprising -- both are hard subjects which take a lot of work to master. The lack of mutual respect is both infuriating and pathetic, and there's no excuse for it.
Statistics are important, and people with MBAs want them, but it is highly unlikely that anyone with an MBA will know how or why they matter, at least not as they relate to CS.
So in most cases the person you replied to is essentially correct. Lies, damn lies and statistics.
But I know a lot of research depends on people being able to accurately utilize statistical methods.
I also understand that people who are good at it are rare.

--
Nothing to say here... move along
Re:93% of Programmers Think You're Wrong by ShakaUVM · 2010-01-09 16:03 · Score: 2, Interesting

>>Spiking vs non-spiking is something pretty easy to see when you glance at the data.
Yeah, in fact, the way that he presents it is bad statistics. =)
If the problem is that one out of 1000 queries is taking a minute to return instead of 0.1 seconds, then using the std deviation to describe the problem is nonsense. It is not a Gaussian distribution!
But of course someone who "has spent his life studying statistics and even R language" would know that, right? :p
Instead, as you point out, any programmer who did the same testing would see that one out of a thousand queries were taking far too long, and come to the same conclusion as him, without making the ghost of Gauss cry.
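A rough illustration of the point, using only the numbers from the example above (999 queries at 0.1 seconds and one at 60 seconds). The 1-second "slow" threshold is an arbitrary assumption made for this sketch.

```python
import statistics

# 999 fast queries and a single one-minute outlier, as in the example above.
latencies = [0.1] * 999 + [60.0]

mean = statistics.mean(latencies)
stdev = statistics.stdev(latencies)    # sample standard deviation
median = statistics.median(latencies)

# Hypothetical threshold: anything over 1 second counts as "slow".
SLOW_THRESHOLD_SECONDS = 1.0
slow = [t for t in latencies if t > SLOW_THRESHOLD_SECONDS]

if __name__ == "__main__":
    print(f"mean={mean:.4f}s stdev={stdev:.4f}s median={median:.4f}s")
    # Read as a bell curve, "mean minus one stdev" would be a negative latency.
    print(f"mean - stdev = {mean - stdev:.4f}s")
    print(f"slow queries: {len(slow)} of {len(latencies)}, worst={max(latencies)}s")
```

The mean and standard deviation are both dominated by the single outlier, while the median and a plain count of slow queries describe what actually happened.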
Re:93% of Programmers Think You're Wrong by somebody1 · 2010-01-09 16:09 · Score: 2, Insightful

Flipping a fair coin is always independent (50%) regardless of whether you flip one or a million of them. Same reason, why martingale in roulette doesn't work.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-09 16:15 · Score: 1

Your logic is wrong.
You assume that flip A and flip B are related, which they aren't. Thanks to statistics, we can prove that the outcome of the first flip does not affect the outcome of the second. If one of them is 1, there is still a 50% chance of the other being 1. Same chance of the other being 1 as when the other is 0.
Your logic is based on probabilities after the coins are already flipped. Of course, after the coins are already flipped, the probability of them being other than what they already landed is 0.
Re:93% of Programmers Think You're Wrong by jellyfrog · 2010-01-09 16:30 · Score: 1

0.
If only one of them is heads, the one that isn't heads is... not heads.
Unless of course you meant "the first one is heads" in which case the second one has a 50% chance of being heads.
Or if you meant "At least one is heads" the answer is 1/3
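The three readings can be checked by counting the four equally likely outcomes. A minimal sketch, assuming fair coins; the helper name prob_both_heads is made up for this example.

```python
from fractions import Fraction
from itertools import product

# The four equally likely ordered outcomes: HH, HT, TH, TT.
OUTCOMES = list(product("HT", repeat=2))


def prob_both_heads(condition) -> Fraction:
    """P(both coins are heads | condition), by counting outcomes."""
    kept = [o for o in OUTCOMES if condition(o)]
    both = [o for o in kept if o == ("H", "H")]
    return Fraction(len(both), len(kept))


# Reading 1: "exactly one of them is heads"
exactly_one = prob_both_heads(lambda o: o.count("H") == 1)
# Reading 2: "the first one is heads"
first_is_heads = prob_both_heads(lambda o: o[0] == "H")
# Reading 3: "at least one of them is heads"
at_least_one = prob_both_heads(lambda o: "H" in o)

if __name__ == "__main__":
    print(exactly_one, first_is_heads, at_least_one)
```

The arithmetic is the same every time; only the filter changes, which is why the wording of the question decides the answer.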
Re:93% of Programmers Think You're Wrong by Hal_Porter · 2010-01-09 16:58 · Score: 2, Insightful

He has got a point that Computer Science graduates do value logic and reason (or less charitably bullshit) over evidence and observation.
In fact one of the best CS books I've ever read was "Computer Architecture. A Quantitative Approach" by Hennessy and Patterson precisely because all the rules of thumb in it were backed up by measurements.
Then again most CS types have realised that with a bit of Google assisted cherry picking of the statistics they can pretty much prove any of their preconceptions to be true, i.e. their favourite ultra high level language just happens to be "potentially just as fast or faster than C++, the problem is that most people don't have the skills to do it". Sigh. It's one thing to say you like a language subjectively and are more productive in it, quite another to claim it is fast when most measurements say it just isn't.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:93% of Programmers Think You're Wrong by tconnors · 2010-01-09 17:16 · Score: 1

"Lies, damn lies and statistics" is all you need to know about statistics.
If you get fooled by politicians who lie by abusing statistics, then that's a pretty good sign you don't understand statistics and need to learn more about it.
Re:93% of Programmers Think You're Wrong by Dwonis · 2010-01-09 17:21 · Score: 4, Insightful

Thing is: You can only be expert in ONE of them. Period.
Hundreds of cryptologists prove you wrong.
Re:93% of Programmers Think You're Wrong by LBt1st · 2010-01-09 17:29 · Score: 1

//If I flip two coins and one of them is heads, what are the odds the other one is also heads?
if (coin_1 == flipped && coin_2 == flipped) {
    if (coin_1 == heads || coin_2 == heads) {
        //Odds of other coin being heads is 50%.
    } else {
        //Neither coin was heads.
    }
}
Re:93% of Programmers Think You're Wrong by fj3k · 2010-01-09 17:33 · Score: 1

Using your example you should have interpreted the #-# as the first number is the one you know and the second number is the one you don't know. Thus you cross out the 0-0 and 0-1 because the known number is 1 and those have the known number as 0. So then you're left with the 1-0 and 1-1 as the possible ones so the probability is 50%

--
Two men claimed to have walked into a bar. Only one had the bruises to prove it.
Re:93% of Programmers Think You're Wrong by genner · 2010-01-09 17:37 · Score: 2, Funny

then realized I hadn't even looked at the short skirt-wearing protagonist.
That sound you just heard was a million slashdotters clicking on that link at the same time.....
except me since I'm familiar with the book in question and realized long ago that she has sharp knees.
Re:93% of Programmers Think You're Wrong by genner · 2010-01-09 17:39 · Score: 1

Everyone knows that 98.2% of all statistics are made up on the spot.
Duh, 74.2% of all people already know that.
Re:93% of Programmers Think You're Wrong by Nazlfrag · 2010-01-09 17:50 · Score: 2, Funny

Oh and by the way he's a hit with the ladies! He never has problems with them (well he is a dashing 6'2" *swoon*) and he's just such a nice guy too.
Re:93% of Programmers Think You're Wrong by donaldm · 2010-01-09 17:50 · Score: 1

Everything I needed to know about statistics I learned playing poker.
Err no! what you are talking about there is "Probability" not "Statistics". There is a difference :)

--
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
Re:93% of Programmers Think You're Wrong by donaldm · 2010-01-09 18:19 · Score: 2, Insightful

Thing is: You can only be expert in ONE of them. Period.
You can easily be expert or well informed in more than one field.

I for one, choose CS. Waaayy more interesting, and compared to the nerdiness level of statistics, we look like Joe Sixpack coming to the club in his sports car, with two girls in the back. ;) If I want to do statistics, I can always hire someone.
I suppose if I really want programming done I can hire someone. The problem you have here is trusting the person you hired to have done their job properly so you want to have some understanding of what is actually required. :)

If you are a consultant you have to have an understanding of all the fundamentals that are required to get the job done. You don't have to be an expert in all fields but you have to be able to communicate with the people that are giving input and if that requires learning what can sometimes be a difficult field then so be it.

Any type of computing requires knowledge of "Numerical Analysis", "Statistics and Probability", "Logical thought" and surprisingly "Art". You also should be open to input from a wide variety of sometimes conflicting ideas and have the ability to determine what is the correct solution rather than just a solution as well as having the ability to reason and sometimes compromise with all parties. This is actually called human communication (sometimes diplomacy) and no one would say this is an easy thing to do.

--
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
Re:93% of Programmers Think You're Wrong by Daniel+Dvorkin · 2010-01-09 18:23 · Score: 3, Insightful

You can only be expert in ONE of them. Period.
[shrug] Depends on how you define "expert," I suppose. I have one MS in CS and another in biostatistics, and am currently working on a PhD in bioinformatics, where I use the knowledge I've gained in both fields pretty much every day. If you think CS is "waaayy more interesting," that's fine for you; personally I find them equally interesting and valuable.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 18:33 · Score: 1

the problem would then read "if I flip 2 coins and the first one is heads, what is the probability the second one will be heads too ?", not "If I flip two coins and one of them is heads, what are the odds the other one is also heads ?".

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 18:41 · Score: 2, Informative

I'm not assuming anything, just reading the question correctly: The question is NOT "if I flip two coins and THE FIRST ONE is heads..." (answer would then indeed be 50%), but "If I flip two coins and ONE OF THEM is heads..."
I'm listing all 4 combinations for 2 flips, and out of the 3 that satisfy the prerequisite ("one of them is heads") counting how many combinations turn up with the other one also being heads. There's one out of 3 possibilities, so that's 33%.

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 18:55 · Score: 1

2 coins, 2 states for each coin... it's very easy to build the combinatory tree I built.
I think you guys are misreading the question.

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 19:13 · Score: 1

You naysayers made me check my answer on the interweb. I'm right. Which is kinda funny: your horrified reactions... apply to yourselves :-)

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by cskrat · 2010-01-09 19:33 · Score: 1

The problem with developers and statistics isn't that we don't appreciate how complex statistics can be, it's that we don't know what to do with the pretty charts and graphs statisticians produce for us. Our primary concerns usually have nothing to do with the standard deviation of transaction times. We're concerned, first, with the correctness of the response and, second, whether or not it fails gracefully. There are, of course, exceptions in real time applications.
Most performance graphs are not going to have a nice bell curve to them anyway. They're going to bias heavily at the minimum time to follow the shortest logic path and work their way up. There will be spikes and plateaus where different paths are taken through the code logic. There will be outliers where something failed in the code. There will be outliers where something random happened between point A and point B ( a packet collision on the way to the database, a bad cache miss that will correct itself, garbage collection triggering, etc.). And there will be situations like what TFA's author is actually looking for, load based failures. Standard deviation is useless if you're not working with samples that have a normal distribution.

If the problem is that one out of 1000 queries is taking a minute to return instead of 0.1 seconds, then using the std deviation to describe the problem is nonsense. It is not a Gaussian distribution!
Outliers like that indicate to me that something broke for that sample. If possible, I'll investigate whatever request caused that outlier so that I know whether I'm facing a code/logic bug, a known low performance request, a load based failure or just some random influence outside of my control. Just to clarify, a known low performance request would be something like a database query calling a stored procedure to generate a large report.

--
My God! It's full of eval()'s.
Re:93% of Programmers Think You're Wrong by Casual+Maritime · 2010-01-09 19:41 · Score: 1

Congratulations on being an idiot.
Re:93% of Programmers Think You're Wrong by mr_walrus · 2010-01-09 19:55 · Score: 1

hundreds out of a population of how many?
that's, um, statistically insignificant! :) :)
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-09 20:06 · Score: 1

that one, too.

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by tsalmark · 2010-01-09 21:17 · Score: 2, Insightful

I think the conversation has devolved into a language issue: a. what is the chance of two events happening. b. after some trigger, what is the chance of one event happening. 33.3%, 50%. can I go to bed now?
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-09 21:58 · Score: 1

No, I am correct.
If you have a prerequisite that one of them is heads, then you can't include in the population options that include no heads. In this case, you have either heads-tails or heads-heads. As I pointed out, it's 50%, not 33%.
Note that order is irrelevant, but if you include order, it's heads-heads, heads-heads, tails-heads, or heads-tails, which is also 50%. There are 2 sets of heads-heads because the prerequisite is that one of them is heads, and you have to account for order; i.e. if the first one is heads, the second is either heads or tails, and if instead the second one is heads, then the first flip is either heads or tails. Anyone who modded you up is an idiot.
Re:93% of Programmers Think You're Wrong by something_wicked_thi · 2010-01-09 22:19 · Score: 1

Idiot he may be, but only because he's so busy trying to explain the right answer to people like you. I initially misread the question as specifying the first coin was heads. If you read the question closely, his answer is obviously and intuitively correct. If you can't see it, try it yourself. It's easy enough to do so and then come back and apologize when you've found him correct.
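For anyone who would rather not flip real coins, a quick simulation does the same job. This is a sketch that assumes fair coins and reads "one of them is heads" as "at least one is heads"; the function name, trial count, and seed are arbitrary choices for the example.

```python
import random


def simulate(trials: int, seed: int = 0) -> float:
    """Estimate P(both heads | at least one heads) by flipping two fair coins."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    kept = 0         # flips where at least one coin came up heads
    both_heads = 0   # of those, flips where both came up heads
    for _ in range(trials):
        first = rng.random() < 0.5
        second = rng.random() < 0.5
        if first or second:
            kept += 1
            if first and second:
                both_heads += 1
    return both_heads / kept


estimate = simulate(200_000)

if __name__ == "__main__":
    print(f"estimated probability: {estimate:.4f}")
```

The estimate should land near 1/3 rather than 1/2; changing the filter to `if first:` answers the "first coin is heads" version of the question instead.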
Re:93% of Programmers Think You're Wrong by Filip22012005 · 2010-01-09 22:20 · Score: 1

No, but it can be irrelevant.

--
When the policeman of the tie, rule you violate, hello punishment of the kitty?
Re:93% of Programmers Think You're Wrong by tenco · 2010-01-09 23:26 · Score: 1

Thing is: You can only be expert in ONE of them. Period.
You can easily be expert or well informed in more than one field.
You think it's possible to be an expert in physics and a second field? Most physicists I know are experts in a sub-field of physics and only well informed about other sub-fields. I guess this is also true about CS, where experts in applied CS may be only well informed about theoretical CS. There are surely some who are experts in all of physics or CS, but I don't think this comes easily.
Re:93% of Programmers Think You're Wrong by bruce_the_loon · 2010-01-09 23:36 · Score: 1

Spiking vs non-spiking is something pretty easy to see when you glance at the data.

And you are one of the programmers he would be happy to work with. You understand looking at spiking in the data. He's moaning about those idiots who take =avg(A1:A1000) and say it is fine based on that and that alone.

--
Trying to become famous by taking photos. Visit my homepage please.
Re:93% of Programmers Think You're Wrong by ShakaUVM · 2010-01-10 00:07 · Score: 1

>>He has got a point that Computer Science graduates do value logic and reason (or less charitably bullshit) over evidence and observation.
Do they? In my experience, you tend to both have a logic model for how your performance should behave, and you test extensively to see how it works out in practice. If you just have observation and no theory, you'll never be able to understand how sometimes when you add more CPUs to a system, you get better than linear (superlinear) speedup. It's supposed to be impossible!
And likewise, if all you have is theory, then all sorts of things will come back to bite you in the ass later, as ignoring constant time factors is fine for theory, but can be really really bad in practice.
Every time I'd finish a program or block of code, I'd run it hundreds or thousands of times with different parameters and configurations. Did I pull the number of runs out of my ass (as the article complains about)? Yes. Would the author be able to do better? No. He's assuming he knows a priori what all the timing measurements are, which is exactly what we're trying to figure out!
You can't use the "stddev" (which is bad stats anyway for non gaussian curves) to estimate the number of samples you need when you don't know this number yet.
Re:93% of Programmers Think You're Wrong by Cassius+Corodes · 2010-01-10 00:17 · Score: 1

I think you need to think about that more carefully - try it in real life and see if the statistics match up with observation.

--
Control is an illusion, order our comforting lie. From chaos, through chaos, into chaos we fly
Re:93% of Programmers Think You're Wrong by Cassius+Corodes · 2010-01-10 00:29 · Score: 1

Took my own advice - turns out you are right - kudos.

--
Control is an illusion, order our comforting lie. From chaos, through chaos, into chaos we fly
Re:93% of Programmers Think You're Wrong by jakuaii · 2010-01-10 00:56 · Score: 1

You, Sir, are a troll. Here, have some food.
Assumed you have fair coins, then the chance that a toss of a coin is head or tail is exactly 50 % or 1/2, no?
So, if I toss 2 coins, then I have the following probabilities:
p_HH = 1/2 * 1/2 = 1/4
p_HT = p_TH = 1/2 * 1/2 = 1/4
p_TT = 1/2 * 1/2 = 1/4
So, basically, each of the four outcomes of the two-coin experiment has an equal chance of occurring, at 25%.
My assumption is that you don't have fair coins, if you actually flipped coins in reality. Most aren't, I seem to have heard.
Re:93% of Programmers Think You're Wrong by jakuaii · 2010-01-10 01:31 · Score: 1

After some meditation, I'd like to apologize. You are not a troll, I think you are just expressing yourself imprecisely.
From the probabilities given above, it's obvious that the probability in total that for two tosses, at least one of the two coins is a head is 75%.
So, if we only look at the cases where at least one coin is a head, what is the probability that both coins are head? It's of course 33% ! (of the 75% of the total, which is 25% of the total). I guess this was what you were aiming at.
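Written out as a conditional probability, using only the four 25% outcomes listed above, the step from 75% to 33% is roughly:

```latex
\[
P(HH \mid \text{at least one } H)
  = \frac{P(HH)}{P(HH) + P(HT) + P(TH)}
  = \frac{1/4}{1/4 + 1/4 + 1/4}
  = \frac{1}{3}
\]
```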
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 01:32 · Score: 1

you're saying that in the "tails-heads" situation, you don't satisfy the "one of them is heads" prerequisite ?
and that in the "tails-tails" situation, you do ?

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by jakuaii · 2010-01-10 01:39 · Score: 1

And another apology to the apology. I re-read your post I commented at and saw that you were expressing yourself quite clearly, I was just confused from the previous posts.
However, I hope that the posts are useful anyway, as I tried to state my process of thought.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 01:46 · Score: 1

copied from the web:
The correct expression is:
P(HH | at least one H) = P(HH)/(P(HH)+P(HT or TH)) = (1/4)/((1/4)+(1/2)) = 1/3.
also:
There are four equally likely outcomes: HH, HT, TH, TT. The last must be excluded as we know that at least one is a head. That leaves HH, HT, TH. So there is a 1/3 chance of HH.
or
The real question, which is disguised within the scenario, is "What is the probability getting two heads?" The statement, "one of the coins came up heads," means ONLY that TT did not happen.
This leaves three possibilities: HT, TH, HH.
HH has a probability of .3333

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by jvin248 · 2010-01-10 02:56 · Score: 1

Statistics is a difficult field, 'to get right', and it does take years to master...

Since people do spend years at it, and have varying levels of understanding all along that path, there is a lot of room to argue minutia. Now mix that with years in a Manufacturing Quality function and you get into all kinds of fun stuff with all levels of people and all kinds of opinion.

In the end you have to have Functional Statistics to convince the regular herd of non-statisticians. That means simple comparisons .. "this feature will only kill 3% of the users". People understand that.

Confidence intervals, X-bar charts, sufficient sample sizes, and so on get non-specialists confused - if it's killing 2.5% or 3.5% of the users you've still got a basic problem with the features and someone better get it fixed fast.

So go with Functional Statistics and find a copy of the book 'how to lie with statistics' so you can understand who's slanting a news article one way or the other ("look, 97% of our users of this feature survive!").
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 04:01 · Score: 2, Funny
Everyone knows that 98.2% of all statistics are made up on the spot.
From this we can see that 98.2% of that statistic was made up on the spot, meaning only 1.8% of all statistics are really made up on the spot. By repeated application of this we can conclude that either:
- A: statistics made up on the spot asymptotically reaches zero
- B: my skills in statistics are woefully inadequate.
My god, TFA is right!
--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by redalien · 2010-01-10 04:05 · Score: 1

He's right, it's a classic problem usually phrased as the gender of two children.
The question is NOT flip two coins, the first is heads, what's the second. You're effectively using the two coins to generate a random number between one and four, then your information about at least one being heads narrows it down to exclude the case of tails on both.
Given tails on both is excluded, there are 3 possibilities left: two that include a tails and one with heads on both.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 04:15 · Score: 1

There are four possible combinations of heads and tails. If one coin is specified, only one combination is ruled out. The remaining combinations are: first coin is heads, second coin is heads, or both are heads. And that is the answer to the question that was asked, rephrased here as: out of the combinations of two tossed coins, how many include at least one heads.
Note that although they might be tossed at the same time rather than in sequence, it doesn't say the coins are interchangeable. Always be careful with probability quizzes. The difficulty is often in understanding the question itself. The precise wording is important!

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 04:21 · Score: 1

Correct, but not relevant. The question AC asked is a trickster question, meant to trip people up.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by BrokenHalo · 2010-01-10 04:37 · Score: 1

Trouble is, you can say anything you want with the simplest of statistics. Like this:
20% of all traffic accidents are caused by drunk drivers. Therefore 80% must be caused by sober drivers. Therefore, you're safer if you drive drunk than sober. ;-)
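What the joke leaves out is the base rate: how much of all driving is done drunk. A sketch with a made-up figure follows. The 20% comes from the line above, while the 2% share of driving done drunk is purely hypothetical and chosen only to show the arithmetic.

```python
from fractions import Fraction

# From the example above: share of accidents caused by each group.
accident_share_drunk = Fraction(20, 100)
accident_share_sober = 1 - accident_share_drunk

# HYPOTHETICAL base rate, invented for illustration: share of all driving done drunk.
driving_share_drunk = Fraction(2, 100)
driving_share_sober = 1 - driving_share_drunk

# Accidents per unit of driving, up to a common constant factor.
risk_drunk = accident_share_drunk / driving_share_drunk
risk_sober = accident_share_sober / driving_share_sober

relative_risk = risk_drunk / risk_sober

if __name__ == "__main__":
    print(f"relative risk, drunk vs sober: {float(relative_risk):.2f}x")
```

Under that assumed base rate, drunk driving comes out about twelve times riskier per unit of driving, even though sober drivers cause most of the accidents.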
Re:93% of Programmers Think You're Wrong by pyronordicman · 2010-01-10 04:39 · Score: 1

In all seriousness, I took a great statistics course in grad school from the author of this book. The book covers much about statistics that is relevant to computer scientists & engineers, namely statistically valid benchmarking and performance measurement.
Re:93% of Programmers Think You're Wrong by jbengt · 2010-01-10 04:54 · Score: 1

By another possible reading, if "one of them" is heads, then not "two of them" are heads, so there is 0.00% probability that the "other" one is heads.
Really, the problem in comprehension here is in the potentially ambiguous or unclear expression and interpretation of the natural language explanation of the problem, not in the understanding of the statistics.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 04:57 · Score: 1

Of course not. Are you illiterate?
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 04:59 · Score: 1

That's not the question.
This was the question:
If I flip two coins and one of them is heads, what are the odds the other one is also heads?
Try reading back through the posts next time. The wording is important, and you got it wrong.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 05:21 · Score: 1

"one of them" != "one and only one of them", plus that's the only way the question is of any interest, otherwise the answer would be "well, zero, duh".

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 05:29 · Score: 1

I'm trying to figure out your reasoning, even wrong trains of thought are interesting sometimes. Even your combinations tree does NOT make sense to me, 2 x "HH" but 0 x "TT", and one each "TH" and "HT"... I'm bemused by how one can reach that conclusion.
The four possibilities are, again
Tails-Tails
Tails-Heads
Heads-Tails
Tails-Tails
We discard the first one, because it doesn't satisfy the "one of them is Heads" criteria. We're left with 3 combinations in which "one of them is Heads". Out of these 3 combinations, only in one is the other one also Heads. One out of three is 33%, QED.
Are you Immathemacitulate ?

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 05:30 · Score: 1

Oops, the last one should read "heads-heads", not "Tails-Tails". Sorry.

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 05:40 · Score: 1

There are two possibilities here
1 - You are a troll, in which case, no one loves you, not even your mother
2 - You are the dumbest person to ever post on /., and you need a 3rd grader to explain to you how to read.
If 2 is the case, please notice that the prereq is that one of the flips must be heads. Therefore, tails-tails is not a viable choice, because neither is heads, and you know that one of them must be heads. I have already explained this on 3 occasions.
If you respond again, it better be an apology for wasting my time and recognition of how daft you are, along with proof that you have signed up for a statistics course and a remedial logic course at your local community college.
Re:93% of Programmers Think You're Wrong by Arthur+Grumbine · 2010-01-10 05:43 · Score: 1

I would sooooo pay money to see the look on your face when you finally realize the error of your ways (especially with how vehement you have become). Here it is, as clear as I can make it for you:
All possible scenarios when flipping two coins:
Case #1
Coin #1: Heads
Coin #2: Heads
or
Case #2
Coin #1: Heads
Coin #2: Tails
or
Case #3
Coin #1: Tails
Coin #2: Heads
or
Case #4
Coin #1: Tails
Coin #2: Tails

Cases #1, #2, and #3 all satisfy the condition "one of them is heads". Of these cases, which are all equally likely, only #1 satisfies the second condition. One out of three equally probable cases which together account for all possible cases. That's 33%. I do not know how to make this any more clear.
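
The same count can be checked mechanically. A minimal sketch in Java, assuming two fair, independent coins; the class and method names are made up for illustration:

```java
// Enumerate the four equally likely outcomes of two fair coins and count
// how many satisfy each condition. All names here are illustrative.
final class TwoCoinCases {
    /** Returns {cases with at least one heads, cases with both heads}. */
    static int[] countCases() {
        boolean[] faces = {true, false}; // true = heads, false = tails
        int atLeastOneHeads = 0;
        int bothHeads = 0;
        for (boolean coin1 : faces) {
            for (boolean coin2 : faces) {
                if (coin1 || coin2) {
                    atLeastOneHeads++;   // cases #1, #2 and #3
                    if (coin1 && coin2) {
                        bothHeads++;     // case #1 only
                    }
                }
            }
        }
        return new int[] {atLeastOneHeads, bothHeads};
    }

    public static void main(String[] args) {
        int[] c = countCases();
        System.out.println(c[1] + " of " + c[0] + " qualifying cases are heads-heads");
    }
}
```

Because the four raw outcomes are equally likely, counting cases is enough here; no random numbers are needed.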

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.
Re:93% of Programmers Think You're Wrong by Arthur+Grumbine · 2010-01-10 06:06 · Score: 1

> For every time that one of the flips of the two is heads, and the other is tails, I will pay you $1. For every time that one of the flips is heads, and the other is tails, you will pay me $1.25.
Cognitive dissonance much? Or do you actually expect me to pay you $1.25 at the same time you pay me $1?

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 06:12 · Score: 1

I was deliberately paraphrasing, which I indicated. But just in case the sibling comment is not enough, I'll add this final explanation.
Two coins are flipped. In essence, this is the same as two separate coins placed in a random orientation on a table and then covered. What you're probably thinking of is that one of these is uncovered, revealing a heads. In that case, the probability of the other one being heads is 50%.
But that was not the question at all. Instead, both coins are left covered and the questioner merely states that one of them is heads, which is not the same situation at all, because you are still in the initial position of not knowing the particulars of any one specific coin.
And if you really want to get to the bottom of this, try programming the problem and run it a few times. Consider the central loop body:

if (coin_X == tails && coin_Y == tails) continue;
flips++;
if (coin_X == heads && coin_Y == heads) success++;

If the coins are randomly flipped, and remembering to completely discard any case where neither coin is heads (as specified), it's easy to see that the second if-statement will be true in about 1/3 of the counted cases.
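
To put the two situations side by side (one specific coin uncovered versus merely being told that one of them is heads), here is a rough Monte Carlo sketch. It assumes fair, independent coins; the class name, method name and seed parameter are all invented for this example:

```java
import java.util.Random;

// Estimates the chance of heads-heads under the two readings of the question.
// Names are illustrative; the seed only makes runs repeatable.
final class TwoReadings {
    /**
     * Returns {estimate of P(both heads | at least one is heads),
     *          estimate of P(both heads | coin X specifically is heads)}.
     */
    static double[] estimate(int trials, long seed) {
        Random random = new Random(seed);
        int atLeastOne = 0, bothGivenAtLeastOne = 0;
        int xIsHeads = 0, bothGivenX = 0;
        for (int i = 0; i < trials; i++) {
            boolean x = random.nextBoolean(); // true = heads
            boolean y = random.nextBoolean();
            if (x || y) {                     // "one of them is heads", coin unspecified
                atLeastOne++;
                if (x && y) bothGivenAtLeastOne++;
            }
            if (x) {                          // coin X was uncovered and shows heads
                xIsHeads++;
                if (y) bothGivenX++;
            }
        }
        return new double[] {
            (double) bothGivenAtLeastOne / atLeastOne,
            (double) bothGivenX / xIsHeads
        };
    }

    public static void main(String[] args) {
        double[] r = estimate(1_000_000, 42L);
        System.out.println("told at least one is heads: " + r[0]);
        System.out.println("coin X uncovered as heads:  " + r[1]);
    }
}
```

The first estimate should land near 1/3 and the second near 1/2, which is the whole disagreement in this thread: the two numbers answer two differently worded questions.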

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by InterStellaArtois · 2010-01-10 06:12 · Score: 1

That's how it seems to me. It's the difference between asking "what is the probability of flipping 2 heads in a row", and "what is the probability of flipping heads, given that I just flipped heads on a previous trial".

The first case involves the combination of 2 coins (4 possibilities), the second involves the state of only one (who cares what the previous one was?)
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 06:19 · Score: 1

That is not true.
Run your program, and you will see that it is 50%. For the same reasons that I have pointed out repeatedly. You are also wrong. This was not the original question. However, if you know that one of the coins hits heads by default, you don't know which one, and order in this case would matter.
Therefore, reread any of my posts, run your program, and apologize.
Re:93% of Programmers Think You're Wrong by obarthelemy · 2010-01-10 06:26 · Score: 1

I give up... methinks you'll never get it

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 06:41 · Score: 1

When calculating probabilities, language and specification are always an issue.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 08:13 · Score: 1

Consider what probability is asked for. The probability that is asked for is for the situation where two coins are flipped and one is heads. Original wording is thus: "If I flip two coins and one coin is heads...". One coin being heads is part of the assumption, a "default" as you put it, and we indeed don't know which one.
Furthermore, this bit of Java, containing the copy-paste from before, spits out roughly 0.33:

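java.util.Random random = new java.util.Random();
int coin_X, coin_Y, flips = 0, success = 0;
for (int i = 0; i < 10000; i++) {
    coin_X = random.nextInt(2); // 0 = tails, 1 = heads
    coin_Y = random.nextInt(2);
    if (coin_X == 0 && coin_Y == 0) continue; // neither coin is heads: discard
    flips++;
    if (coin_X == 1 && coin_Y == 1) success++;
}
System.out.println("success ratio: " + ((double) success) / (double) flips);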

Feel free to run it.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 08:32 · Score: 1

Swing and a miss.
This is the original wording: "If I flip two coins and one of them is heads, what are the odds the other one is also heads?" It does help to understand the question before you try to solve the answer.
If you find that coin X is tails, you don't know that one of the flips is heads. You can't use the data unless you already also know that coin Y is heads. You have to start with either coin X or coin Y is known to be heads. Then you flip the other and see if it's heads or tails.
Sorry, but you are wrong.
Fix your code, you will see that it is 50% chance of two heads and 50% chance of 1 heads, 1 tails.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 08:51 · Score: 1

> If you find that coin X is tails
But we know one of the coins is heads, it says so in the question. The master of the game says that one coin is heads, and then that's the way it is. Otherwise it wouldn't be relevant to the question.
Also, you don't have to start with any coin at all. You simply ask yourself while looking at two covered coins: what is the probability that, if one is heads, the other is heads too. The question does not mention looking at any one coin first, and then deciding.
Regardless, how would you fix the code to match the question?

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by adamdoyle · 2010-01-10 09:07 · Score: 1

> If you think that then answer the following problem: If I flip two coins and one of them is heads, what are the odds the other one is also heads?
I'm pretty sure you don't have to flip a coin in poker...
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 09:15 · Score: 1

Realize that flipping 2 coins is the same thing as flipping one coin twice. The problem with your logic is that one of the coins has to be heads. If you flip coin X and find that it is tails, you can't claim that coin Y must be heads unless you flip it. This would mean that the results of one flip affect another, which is not true.
Run coins X and Y separately, and keep the values of each flip as a trial. This way, the outcome of coin X doesn't affect the outcome of coin Y, even though we assume that at least one of them will land on heads (it just can't occur because the other does or does not). Search each trial with an if statement, keeping a running total with the following score modifiers:
Start each trial by searching by X:
if X = 1, Y = 0 == fail (-1)
if X = 1, Y = 1 == success (+1)
repeat for Y:
if Y = 1, X = 0 == fail (-1)
if Y = 1, X = 1 == success (+1)
Thus, your results will show
X = 1, Y = 0 will equal (-1)
X = 1, Y = 1 will equal (+2)
If X = 0, Y = 0 will equal 0 *no heads means no trial*
if X = 0, Y = 1 will equal (-1)
Your running score should not be statistically significantly different from 0 (depending on your confidence level) to show that in a continuous trial, the odds of both coins landing on heads provided that at least one of them lands on heads is 50%.
To simplify this for you, one of the coins has to be heads. Therefore, you start with one coin heads and don't worry about its outcome (because it has to be heads). It can't change. It doesn't matter if you start with the first coin being heads or the second coin being heads, as long as one of them is predetermined to be heads, and you flip the other. The simplified problem is simply what are the probabilities associated with the flipped coin (that doesn't necessarily have to be heads). It's the same question as: if the next car that comes down the street is primarily red, what is the probability that the next coin flip is heads? You are trying to claim that flipping the coin and getting heads or tails will change the color of the car coming down the street, when we already prefaced that we know what color it is.
If you don't get this, I can't help you.
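
For what it's worth, the scoring scheme in this post can be tabulated directly. A small sketch, assuming fair coins and reading the rules literally (1 = heads, 0 = tails); the class and method names are invented for illustration:

```java
// Tabulates the per-trial score from the scheme above for all four outcomes.
final class ScoreCheck {
    /** Score of one trial: "search by X", then "repeat for Y". */
    static int score(int x, int y) {
        int s = 0;
        if (x == 1) s += (y == 1) ? +1 : -1; // searching by X
        if (y == 1) s += (x == 1) ? +1 : -1; // repeating for Y
        return s;
    }

    /** Sum of the scores over the four equally likely outcomes. */
    static int totalOverAllOutcomes() {
        int total = 0;
        for (int x = 0; x <= 1; x++) {
            for (int y = 0; y <= 1; y++) {
                total += score(x, y);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (int x = 0; x <= 1; x++) {
            for (int y = 0; y <= 1; y++) {
                System.out.println("X=" + x + " Y=" + y + " -> " + score(x, y));
            }
        }
        System.out.println("total over all outcomes: " + totalOverAllOutcomes());
    }
}
```

The total does come out to zero, but only because heads-heads is scored twice (+2) against -1 for each of the two mixed outcomes. A running score near zero therefore means heads-heads turns up half as often as the mixed outcomes combined, that is, in 1 of the 3 qualifying cases. If heads-heads really made up half of the qualifying trials, the average score per qualifying trial would be 2(1/2) - 1(1/2) = +0.5 rather than 0.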
Re:93% of Programmers Think You're Wrong by ahabswhale · 2010-01-10 09:18 · Score: 1

You can't be an expert in CS and something else. For starters, there are so many topics in CS that merely asserting you're a CS expert is laughable. Second, just focusing on the one thing I do for a living as a programmer is more than a full time job. I keep up with the latest trends and technologies and this is very time consuming. I can't even do all the research (and I use the term loosely) I would like in the limited subjects I'm interested in and I don't even have a wife or kids to deal with. So, to claim multiple subjects of expertise is impossible unless your definition of "expert" is very weak and therefore has no meaning.

--
Are agnostics skeptical of unicorns too?
Re:93% of Programmers Think You're Wrong by redalien · 2010-01-10 09:39 · Score: 1

Do you know which is which?
Re:93% of Programmers Think You're Wrong by Arthur+Grumbine · 2010-01-10 09:48 · Score: 1

What I was saying (and I suggest you look at the bolded words from your post) is that you said the exact same thing twice. You said, if [x] happens you will pay me $1. Then in the very next sentence you say if [x] happens I will pay you $1.25. There was no statement about if "one flip is heads and the other is heads" which you claim happens as often as "one flip is heads and the other is tails". So let me propose a setup that makes logical sense (and makes it very worth your time if you're right):

If you simultaneously flip 2 coins 500 times in succession in a continuous unedited video, I hereby agree to pay you $20 for every flip where both coins are heads, if you agree to pay me $16 for every flip in which one coin is heads and the other is tails. No one pays either person when both are tails. Since you believe that these likelihoods (heads-heads vs heads-tails/tails-heads) are equal, this should net you (in the long run) $4 for every two flips (disregarding the flips where both are tails). This gives you the opportunity to make $1000 by doing this experiment (which shouldn't take more than a 1/2 hour to set up and 1 1/2 hrs to film). The video must be posted to youtube to allow the /. community to witness the results and ensure the integrity of the video.
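
The expected value of this bet can be worked out under both views. A back-of-the-envelope sketch, assuming fair coins; the payouts are the ones stated above, while the class and method names are made up:

```java
// Expected outcome of the proposed bet for the person flipping the coins.
final class CoinBet {
    static final double PAYOUT_BOTH_HEADS = 20.0; // received by the flipper on heads-heads
    static final double PAYOUT_MIXED = 16.0;      // paid by the flipper on one heads, one tails

    /** Expected net winnings of the flipper per double flip. */
    static double flipperNetPerFlip(double pBothHeads, double pMixed) {
        return PAYOUT_BOTH_HEADS * pBothHeads - PAYOUT_MIXED * pMixed;
    }

    public static void main(String[] args) {
        int flips = 500;
        // Fair coins: heads-heads 1/4, mixed 1/2, tails-tails 1/4.
        double actual = flips * flipperNetPerFlip(0.25, 0.5);
        // The disputed belief: heads-heads and mixed equally likely,
        // splitting the 3/4 of flips that are not tails-tails.
        double believed = flips * flipperNetPerFlip(0.375, 0.375);
        System.out.println("expected net with fair coins:  " + actual);
        System.out.println("expected net under the belief: " + believed);
    }
}
```

With fair coins the flipper expects to lose $3 per double flip, about $1500 over 500 flips. The $1000 figure in the offer counts all 500 flips; once the roughly one in four tails-tails flips are left out, even the "equally likely" belief would put the expected gain nearer $750.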

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 10:02 · Score: 1

That would be incorrect for multiple reasons, never mind that you can't have a continuous youtube video be an hour and a half in length.
We have to assume that one of the coins is heads, as is prefaced in the original question, and then flip the other. Under that condition, just send me the $1000.
Just acknowledge that you are wrong, and move on with your life.
Re:93% of Programmers Think You're Wrong by Devout_IPUite · 2010-01-10 10:16 · Score: 1

I'm quite sure we would hate each other actually. He sounded pretentious without proving to me that he was right. Then I would brush him off and say "I don't fucking care, it's a user, make it click some shit and we'll see how it does with a thousand of these at once".
Re:93% of Programmers Think You're Wrong by Arthur+Grumbine · 2010-01-10 10:37 · Score: 1

The original question:

If I flip two coins and one of them is heads, what are the odds the other one is also heads?
There is no "and then flip the other" in the original question. English may not be your first language so I'll break the original question down for you:
The first clause "If I flip two coins" is proposing the completion of an action. Note that the proposed action is not "I flip a coin twice" (although this would still result in the same 33% likelihood) nor is it "I flip a coin after a previous coin flip that resulted in heads". The action is the flipping of "two coins".
The second clause provides a condition, "one of them is heads". This means that only those flips (of two coins, remember) will be considered in which at least one of the coins is heads.
So you see, my proposal/experiment reflected exactly the original question. Your "I'm gonna flip a single coin twice, but since one of the results has to be heads I'm only gonna start counting second flips after a first flip of heads." is actually completely different from the original question.

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 10:43 · Score: 1

No.
You are calculating the odds of the other one being also heads, predicated on the first being heads. The order doesn't matter.
You flip 2 coins. One of them is heads. There is no probability of it not being heads. Therefore, the odds of the other one being heads are not affected.
You should try learning English; it's a great language once you get the hang of it.
Re:93% of Programmers Think You're Wrong by Arthur+Grumbine · 2010-01-10 11:31 · Score: 1

> You flip 2 coins.
Good so far. Let's call them A and B.

> One of them is heads.
Still going good. Now this could be A or B and we don't know which it is. If both A and B are heads then this "one of them" could be either of them. However if A or B is tails, then this "one of them" will be "the one that is not tails". And this will happen 2/4 of the overall flips, and 2/3 of the flips which result in (at least) one of the coins being heads.

> There is no probability of it not being heads.
Right.

> Therefore, the odds of the other one being heads are not affected.
Not quite, but thanks for trying the game of "logic". Seriously, though, just get two coins and flip them 30 times. If neither is heads then don't count it. However, if they are not both tails (i.e., "one of them is heads"), pick one that is heads and mark down what the other one is. Seriously, dude. Just do it. 30 times. It'll take less time than your next post.

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.
Re:93% of Programmers Think You're Wrong by fbjon · 2010-01-10 11:47 · Score: 1
> Realize that flipping 2 coins is the same thing as flipping one coin twice. The problem with your logic is that one of the coins has to be heads. If you flip coin X and find that it is tails, you can't claim that coin Y must be heads unless you flip it. This would mean that the results of one flip affect another, which is not true.
But there's the rub. I don't claim that coin flips would ever affect each other directly, but the phrasing of the question causes them to indirectly affect each other if they're both tails. Let's flip one coin and see what happens:
- Coin 1: tails - no heads yet
- Coin 2: heads - one heads, but the first one was tails
- Coin 1: heads - Bingo!
- Coin 2: tails - damn
- Coin 1: heads
- Coin 2: heads - both heads, success
- Coin 1: tails - no heads yet
- Coin 2: tails - no heads at all, so no trial
> Therefore, you start with one coin heads and don't worry about its outcome (because it has to be heads). It can't change. It doesn't matter if you start with the first coin being heads or the second coin being heads, as long as one of them is predetermined to be heads, and you flip the other.
I agree.

> The simplified problem is simply what are the probabilities associated with the flipped coin (that doesn't necessarily have to be heads)
Right. So there are two possibilities, either one is heads or the other. As a consequence, there are two possible ways that one coin is heads and the other is tails (as you've said). But there is only one possibility for both coins to be heads.
Now this all obviously assumes the coins to be distinct. If they're completely interchangeable, then it wouldn't make sense to talk about one or the other, and the probability collapses to 50%.

> You are trying to claim that flipping the coin and getting heads or tails will change the color of the car coming down the street, when we already prefaced that we know what color it is.
It wouldn't change the car, but it would change the trial. Besides, that simplification's not quite the same. You'd have to say "if the car is red or the coin is heads", for it to be the same.
--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 12:12 · Score: 1

You are still wrong. It doesn't matter if it is coin A or coin B, one of them is already heads. We are asking about the probability of the other coin. Stop trying to answer a different question, because that is not the issue. You are wrong, I am done with you.
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 12:20 · Score: 1

No. Just no. You are wrong.
How do you not get this and still have figured out how to use a computer?
Dear G-d, it's a simple question. You flip a coin, it's either heads or tails with equal probability. The other coin is already heads. Move on.
If you insist on screwing it up and claiming that the second coin was the one that had to be heads, then the first coin still had a 50% chance of being heads or tails while the second coin had a 100% chance of being heads. If you don't get it now, don't post again.
Re:93% of Programmers Think You're Wrong by Gendou · 2010-01-10 12:44 · Score: 1

I think I can help clarify this for you since you seem to be the only one to still be having trouble understanding this.
Two coins are flipped. In the absence of any other information, there are four possibilities:
Heads, Heads: 25%
Heads, Tails: 25%
Tails, Heads: 25%
Tails, Tails: 25%
Then we receive some new information: at least one of the coins is Heads. That rules out the last option. Let's recalculate the odds based on the new information:
Heads, Heads: 33.3%
Heads, Tails: 33.3%
Tails, Heads: 33.3%
Now, let's look at the question (reworded slightly to hopefully make it less confusing for you): "Two coins are flipped. At least one of the two coins lands Heads. What are the odds that both coins landed Heads?"
In the first instance (33.3%), both coins landed heads. In the second and third instances (combined 66.7%), both coins did not land heads.
So the answer is 1/3 (33.333...%)
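
The same result written as a conditional probability, assuming fair and independent coins (HH and TT are shorthand introduced here for "both heads" and "both tails"):

```latex
P(HH \mid \text{at least one heads})
  = \frac{P(HH \cap \text{at least one heads})}{P(\text{at least one heads})}
  = \frac{P(HH)}{1 - P(TT)}
  = \frac{1/4}{3/4}
  = \frac{1}{3}
```

The second step uses the fact that heads-heads always has at least one heads, so the intersection is just HH.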
You can verify this with some actual coins. Flip two coins, then if either coin is heads, check to see if the other coin is heads. Keep a tally of how often the other coin is or isn't heads. If you haven't actually flipped coins, you're just talking out your buttocks.
I don't know how else to help you if you're still struggling.
Re:93% of Programmers Think You're Wrong by Gendou · 2010-01-10 13:00 · Score: 1

Unfortunately, I have some bad news for you -- this is actually a well-documented mathematical puzzle, and there's even a Wikipedia article on it.
Similar to the Monty Hall Problem, almost everybody assumes 50% at first, since it seems natural and intuitive. When the question is stated unambiguously (the version at the top of this thread was admittedly not very clear), the answer really is 33%, provable both by basic math and by actual testing. The purpose of the problem is to see if someone can admit that he's wrong when he's confronted with logical and empirical evidence. This is often used during job interviews. Needless to say, you wouldn't be getting the job.
See also Bertrand's Box Paradox or the Three Prisoners Problem for similar puzzles.
Re:93% of Programmers Think You're Wrong by Idiomatick · 2010-01-10 13:05 · Score: 1

"I mean, he said we're going to die if we don't learn statistics, but he never gave a compelling argument for it."

Sure he did, the article was entitled: 'Programmers Need To Learn Statistics Or I Will Kill Them All' ... I felt compelled that he was a loon.
Re:93% of Programmers Think You're Wrong by Idiomatick · 2010-01-10 13:13 · Score: 1

In some schools statistics are a commonly taken math credit to go with a CS degree. When the CS people do them they are hilariously easy, most CS majors ace them. I think that leads programmers to think that stats are easy. May or may not be true, maybe it picks up in 3rd year, maybe my school is an aberration. Maybe stats are just easier than CS in the general case.

Likely this guy is just a douche though.
Re:93% of Programmers Think You're Wrong by Hal_Porter · 2010-01-10 17:47 · Score: 1

Well maybe I've just worked with some awful programmers. Don't get me wrong - a minority of people do do things properly. Another minority will stubbornly stick with an awful solution. Most lie somewhere in between, but they are too loaded with work to do much in the way of analysis.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:93% of Programmers Think You're Wrong by kramerd · 2010-01-10 17:56 · Score: 1

Nope, you are also wrong, because the situation you describe is not the original question. Read it again, I'm getting depressed at how many people on slashdot don't get simple things.
The question is 1 coin is heads, what is the probability that the other coin is heads. In other words, your girlfriend is pregnant. What are the odds that my girlfriend is also pregnant? Since we have never met each other, it doesn't matter if you never use birth control, or if you and your girlfriend enter into as many orgies as you can find, and you sleep with homeless people in the hopes of getting knocked up. It doesn't affect my girlfriend if you never sleep with her. Since one coin is heads, it doesn't matter that tails-tails exists. Tails-tails doesn't exist in the question, because one coin is heads (just like one of our girlfriends is pregnant, and by definition of different coins, we are dating different people). After one coin is heads, the other coin is random, just like flipping any coin (don't be pedantic, you aren't flipping a double sided coin, except for the first coin, which could be coin #2, because one coin is guaranteed to be heads, it's not random, whereas the other is random). If you think that a coin that is required to be heads could be tails, then while every life is precious, your death would not be a loss to humanity.
While I have made it as stupidly simple as it should get, let's look another way.
You have 2 coins. One of them is heads. Let's call the coins coin 1 and coin 2. If coin 1 is heads, then coin 2 is not affected by coin 1, and has a 50% chance of being heads or tails. This makes the overall outcome either heads-tails or heads-heads. If coin 2 is guaranteed to be heads, then the outcomes are either tails-heads or heads-heads. Oh look the overall odds of heads-heads are 50%, like I have explained on 6 occasions in the past 24 hours, you (most insulting thing you can think of here).
How can all of you be this stupid? Read, think, and then don't post, because dammit, you are wrong.
Answer this: If you flip one coin, what are the odds of heads? If you say anything other than 50%, let's meet in person so I can make money off you.
I'm done with this, I'm not responding to anyone else, and for fuck's sake, all of you better hire an accountant, because if you can't figure this shit out, you sure as hell can't figure out your taxes.
Re:93% of Programmers Think You're Wrong by MadUndergrad · 2010-01-10 18:54 · Score: 1

http://xkcd.com/169/ is rather relevant here
Re:93% of Programmers Think You're Wrong by LBt1st · 2010-01-10 20:58 · Score: 1

Please see the replies to your own link.
Re:93% of Programmers Think You're Wrong by Gendou · 2010-01-10 23:30 · Score: 2, Informative

I'm not sure why I'm wasting time responding to a troll but whatever.
> The question is 1 coin is heads, what is the probability that the other coin is heads. In other words, your girlfriend is pregnant. What are the odds that my girlfriend is also pregnant?
No, you read it wrong. What it's actually asking is (if we pretend all girlfriends have exactly a 50% chance of being pregnant): "two girlfriends exist. At least one of the two is pregnant. What are the odds that both girlfriends are pregnant?"
You just read it wrong and you're too stubborn to admit that you could ever be wrong, even though this puzzle is FIFTY YEARS OLD and is well documented all over the internet. Just see the Wikipedia article on it.
Re:93% of Programmers Think You're Wrong by Gendou · 2010-01-10 23:32 · Score: 2, Informative

Please see this -- this is a well-known puzzle over 50 years old, and I'm surprised that there are people on Slashdot who weren't familiar with it already.
Re:93% of Programmers Think You're Wrong by ericlondaits · 2010-01-11 01:47 · Score: 2, Insightful

> Standard deviation is useless if you're not working with samples that have a normal distribution.
Why? No. You can measure the standard deviation of any distribution, normal or not. And it is what it is, independent of distribution, it tells you how much you should expect samples to deviate from the average.
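
As a quick illustration, a fair six-sided die is about as non-normal as it gets (flat, discrete, bounded), yet its standard deviation is perfectly well defined. A minimal sketch with made-up names, treating the six faces as the whole population:

```java
// Population standard deviation of a small set of equally likely values.
final class DieSpread {
    static double standardDeviation(double[] values) {
        double mean = 0.0;
        for (double v : values) mean += v;
        mean /= values.length;
        double sumSquares = 0.0;
        for (double v : values) sumSquares += (v - mean) * (v - mean);
        return Math.sqrt(sumSquares / values.length); // divide by n: full population
    }

    public static void main(String[] args) {
        double[] die = {1, 2, 3, 4, 5, 6};
        System.out.println("standard deviation of a fair die: " + standardDeviation(die));
    }
}
```

The result is sqrt(35/12), about 1.71, around the mean of 3.5. What the normal assumption buys is the familiar "68% within one standard deviation" reading, not the ability to compute the number.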

--
As a Slashdot discussion grows longer, the probability of an analogy involving cars approaches one.
Re:93% of Programmers Think You're Wrong by LBt1st · 2010-01-11 08:05 · Score: 1

That is insightful, but I still agree with myself and the above poster that one of the two coins/children is constant before the question is even posed. Thus the other has a 50% probability.
"If I flip two coins and one of them is heads, what are the odds the other one is also heads?"
I just don't see any other way to interpret that question.
Re:93% of Programmers Think You're Wrong by ksp · 2010-01-11 09:59 · Score: 1

I've read it, and it's OK if you have statistics-phobia. There are lots of great books that cover software performance, how to measure it and describe the results. For any relational database, I recommend "Forecasting Oracle Performance". Covers Oracle, but you could use the theory on other brands as well. For statistics in general (and with fewer cartoons), "Statistics Without Tears" is a classic - a gentle introduction to the subject.
When you deal with intensely multi-user systems such as web applications and relational databases, I feel just like Zed Shaw sometimes... People seriously need to grasp basic statistics.

--
What is the sound of one hand clapping?
cat /dev/null > /dev/audio
Re:93% of Programmers Think You're Wrong by Matumio · 2010-01-14 22:44 · Score: 1

> because they know they're vaguely supposed to do a run multiple times, and maybe average the results or something.
Taking the minimum of multiple runs is often better than the average.
That's what programmers really should know about statistics. And if you don't think this applies to you, then take both average and minimum for your usual number of runs. Repeat 10 times and check which number is more stable (that is, has the smaller variance).
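
The experiment can be tried without a real benchmark. The sketch below simulates timings rather than measuring anything, and it rests on a loud assumption: each timing is a fixed true cost plus non-negative, exponentially distributed interference. The 100 ms cost, the 5 ms mean noise and all names are invented for the example:

```java
import java.util.Random;

// Compares how stable the minimum and the mean of several runs are,
// using SIMULATED timings (fixed cost + exponential interference).
final class MinVsMean {
    /** Sample variance (n - 1 in the denominator). */
    static double variance(double[] xs) {
        double mean = 0.0;
        for (double x : xs) mean += x;
        mean /= xs.length;
        double sumSquares = 0.0;
        for (double x : xs) sumSquares += (x - mean) * (x - mean);
        return sumSquares / (xs.length - 1);
    }

    /** Returns {variance of per-session minimum, variance of per-session mean}. */
    static double[] compare(int repetitions, int runs, long seed) {
        Random random = new Random(seed);
        double[] mins = new double[repetitions];
        double[] means = new double[repetitions];
        for (int r = 0; r < repetitions; r++) {
            double min = Double.POSITIVE_INFINITY;
            double sum = 0.0;
            for (int i = 0; i < runs; i++) {
                // Assumed noise model: exponential with mean 5 ms, never negative.
                double noise = -Math.log(1.0 - random.nextDouble()) * 5.0;
                double timing = 100.0 + noise; // assumed true cost: 100 ms
                min = Math.min(min, timing);
                sum += timing;
            }
            mins[r] = min;
            means[r] = sum / runs;
        }
        return new double[] {variance(mins), variance(means)};
    }

    public static void main(String[] args) {
        double[] v = compare(10, 20, 1L);
        System.out.println("variance of minimum: " + v[0]);
        System.out.println("variance of mean:    " + v[1]);
    }
}
```

Under this noise model the minimum hugs the true cost, so it should vary far less than the mean. With only 10 repetitions the variance estimates are themselves noisy, and a different noise model (for example, interference that can also make runs faster) could change the picture.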
Re:93% of Programmers Think You're Wrong by ShakaUVM · 2010-01-15 04:40 · Score: 1

>>Repeat 10 times and check which number is more stable (that is, has the smaller variance).
Yeah, it'd probably be less susceptible to interference from load or other confounding factors. But sometimes we do need to see those effects in the timings... in a large parallel system, for example, if one computer occasionally bugs out and runs 10x as slow, and you have a barrier in your code, meaning everyone has to wait on the slowpoke, we want to capture that, so we can deal with the problem, whereas a minimum timing would ignore it.

Percent probability that Zed Shaw is a jerk by Anonymous Coward · 2010-01-09 11:38 · Score: 5, Funny

110%.

Re:Percent probability that Zed Shaw is a jerk by kandela · 2010-01-09 13:48 · Score: 4, Funny

And by that you mean 110% +/- 10% (95% confidence interval) right?

--
Conservation of angular momentum makes the world go round.
Re:Percent probability that Zed Shaw is a jerk by Bourdain · 2010-01-09 14:01 · Score: 1

with a P 0.05 too I hope?
Re:Percent probability that Zed Shaw is a jerk by tsalmark · 2010-01-09 21:23 · Score: 1

is not "P = 0.05" equivalent to a "95% confidence interval"?
Re:Percent probability that Zed Shaw is a jerk by Anonymous Coward · 2010-01-10 00:45 · Score: 1, Informative

Imaginary conversation with statistician:
- So, you say that my measurements are phony because I do not have confidence intervals plotted?
- Yes, it is very unscientific!
- And what does a confidence interval of 95% mean? It means that there is a 95% probability that the value is in that interval?
- Well, no, there is only 95% likelihood.
- And what does likelihood mean?
- Er... It is... It is a function you know, f(y) = P(Y=y|X=x) for random variables X and Y
- So it is not a probability?
- No it is not.
- So it does not guarantee anything, as it is pretty meaningless from a practical viewpoint. I would need prior probabilities to be able to interpret this number as a probability.
- Well...
- Does the confidence interval assume that my error distribution is gaussian?
- Yes, of course, that is pretty standard.
- And what if it is not?
- It is unlikely.
- Why? What guarantees existing moments and nicely behaving distributions in real world? I do not see the axioms of probability prohibiting this.
- No, but these are degenerate cases.
- Are there methods to make sure that my distribution is not some heavy tail one, but a light tail one?
- No.
- Is it true that gaussians are frequent?
- Yes because of central limit theorem.
- Then Cauchy distribution must be frequent also, as the ratio of two gaussians is Cauchy, isn't it?
- Yes.
- So can you still assert that gaussian is that likely as a generating pdf?
- ...
- So, we have a meaningless number, called likelihood, we have a meaningless section, called confidence, and a phony assumption of gaussian error.
- ...
- And you call my measurements unscientific.

correlation != causation by Hognoxious · 2010-01-09 11:40 · Score: 5, Funny

Correlation != causation. Just repeat that and you don't need to know statistics.

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."

Re:correlation != causation by Anonymous Coward · 2010-01-09 12:06 · Score: 1, Insightful

I think many programmers/managers would be better off with less statistics. I cannot tell you the number of times I have seen major (i.e., crash the damn app, corrupt data, etc.) bugs go into an application because 'statistically no one will ever do that'. It is 100% predictable that someone will do it. Oh, and then you are allowed to fix it.
Misapplied statistics are worse. Like my previous example: sure, out of say 10 million transactions, 1 failed. But guess what? That 1 is just as important as ALL the other ones. So what if it is 1 out of 10 million that it will happen. That's just math masturbation.
Re:correlation != causation by jc42 · 2010-01-09 13:15 · Score: 2, Interesting

So what if it is 1 out of 10 million that it will happen.
When I hear this sort of reasoning, I like to point out that with modern computers, something that happens only 1 time out of a million can very easily mean thousands of occurrences per day, each of which will get us a support call. This usually ends the discussion really fast, and they agree to properly implementing the "unlikely" edge cases.
I've also heard the observation that in computing, statistical behavior is generally referred to as "bugs".
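
The arithmetic behind the "one in a million" argument is plain multiplication. Here is a minimal sketch, where the operation rates are hypothetical numbers picked for illustration and failures are assumed to be independent:

```python
SECONDS_PER_DAY = 86_400

def expected_failures(operations, failure_rate):
    """Expected number of failures among independent operations."""
    return operations * failure_rate

def prob_at_least_one(operations, failure_rate):
    """Probability that at least one of the operations fails."""
    return 1.0 - (1.0 - failure_rate) ** operations

rate = 1e-6  # "one in a million"
for ops_per_second in (10, 1_000, 100_000):  # hypothetical workloads
    ops_per_day = ops_per_second * SECONDS_PER_DAY
    print(ops_per_second, "ops/s ->",
          round(expected_failures(ops_per_day, rate), 1), "expected failures/day")

# The "1 in 10 million transactions" case from earlier in the thread:
print("P(at least one failure in 10 million transactions at 1e-7):",
      round(prob_at_least_one(10_000_000, 1e-7), 3))
```

Whether that turns into "thousands of occurrences per day" depends entirely on the volume, which is the number worth bringing to the meeting.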

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Re:correlation != causation by magsol · 2010-01-09 13:50 · Score: 1

Arguably, the anecdotes you've mentioned are reasons why programmers and managers would be much better off with more statistics. You're absolutely right when you say that the 1 failure in 10 million transactions is what matters; programmers and managers who overlook that as "statistically insignificant" have a very warped and naive grasp of statistics.

--
"I'd just like to emphasise that taking a million years isn't a metaphor here..." -Rich Bradshaw
Re:correlation != causation by gomiam · 2010-01-09 13:51 · Score: 1

I have had to deal with this too. While designing some web applications at the Faculty I used to work at, I had to explain _yet once again_ that, when programming, no unexpected behaviour should really be unexpected. I think that is called defensive programming.
Re:correlation != causation by JWSmythe · 2010-01-09 14:28 · Score: 2, Insightful

You forgot to mention that the 9,999,999 transactions are normal billing transactions, and the one that fails is the batch that actually charges their credit cards. :)

--
Serious? Seriousness is well above my pay grade.
Re:correlation != causation by tsalmark · 2010-01-09 21:33 · Score: 1

I once started work at a company that generated login credentials completely randomly for 100's of thousands of users. I lost every verbal argument with management about the need for a rewrite until I ran a simple SQL query and found three pairs of users sharing the same identity. (I just rewrote the rand routine to loop until unique and gave them the projected failure date, sometime well in the future)
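
This is the birthday problem. A rough sketch of the estimate follows; the post gives neither the user count nor the size of the credential space, so the 300,000 users and the 32-bit space below are purely hypothetical, and the formula is the usual approximation that assumes uniformly random, independent draws.

```python
import math

def expected_colliding_pairs(n_users, space_size):
    """Expected number of user pairs that share the same random value."""
    return n_users * (n_users - 1) / (2 * space_size)

def collision_probability(n_users, space_size):
    """Birthday-problem approximation of P(at least one duplicate)."""
    return 1.0 - math.exp(-expected_colliding_pairs(n_users, space_size))

# Hypothetical figures, not taken from the post.
users = 300_000
space = 2 ** 32

print("expected colliding pairs:", round(expected_colliding_pairs(users, space), 1))
print("P(at least one collision):", round(collision_probability(users, space), 6))
```

The point for management is that collisions scale with the square of the user count, so "completely random" stops being "unique" much sooner than intuition suggests.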
Re:correlation != causation by Anonymous Coward · 2010-01-10 00:06 · Score: 1, Informative

Statistics is like Photography. A subject hard to master, yet taking for instance a photograph of a poor man will not solve poverty
Re:correlation != causation by stygianguest · 2010-01-10 01:47 · Score: 1

I think many programmers/managers would be better off with less statistics.
That's a good point. Perhaps the half-baked attempts at universities and colleges to teach CS students statistics only give the poor graduates the misguided idea that they know statistics.
I for one had a single statistics course, which was barely enough to get acquainted with the concept of chance. Yet I still fail to grok what a 25% chance of rain tomorrow is really supposed to mean. Yes, a chance of one fourth of rain, or in one out of four cases it will rain. That's very nice, but should I take my umbrella?
Re:correlation != causation by Rallion · 2010-01-10 04:20 · Score: 1

Actually, missing something that simple would indicate that they have a very warped and naive grasp of multiplication.
Re:correlation != causation by siride · 2010-01-10 04:39 · Score: 1

The percent chance of rain problem isn't a problem on your end. The weather folks never define it clearly for the public, so it is really just meaningless. So far as I know, at the NWS, the percent chance is a combination of the amount of forecast area (land) that will expect to see the given precipitation and the forecaster confidence. There are some other guidelines that go into picking any of the 10 categories (0%, 10%, 20%, etc.). In the end, I find it to be meaningless, and instead rely on forecast discussions and model output to determine the nature of the storm and from there extrapolate what "100% chance of rain this afternoon" really means (it might mean 20 minutes of rain for everybody as a thin band of thunderstorm passes through, or it might mean lighter rain all afternoon, or it might mean a lot of showers, but actually no guarantees that any particular location will get rain).
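
Taking the description above at face value, the number is often written as confidence times areal coverage. The sketch below assumes exactly that model (it is an assumption based on this post, not an official definition), and the inputs are made up; it only shows why very different forecasts can end up in the same category.

```python
def pop(confidence, area_fraction):
    """Chance of precipitation as confidence times areal coverage (assumed model)."""
    return confidence * area_fraction

def to_category(p):
    """Round to the nearest 10%, mimicking the 0%, 10%, 20%, ... categories."""
    return int(round(p * 10)) * 10

# Two hypothetical forecasts that land in the same bucket:
certain_but_patchy = pop(1.0, 0.4)      # sure it will rain, but over 40% of the area
unsure_but_widespread = pop(0.5, 0.8)   # coin flip, but over 80% of the area

print(to_category(certain_but_patchy), to_category(unsure_but_widespread))
```

Under this reading the published percentage alone cannot tell you which of the two situations you are in, which matches the complaint that it is not much use without the forecast discussion.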
Re:correlation != causation by jc42 · 2010-01-10 05:23 · Score: 1

... I had to explain _yet once again_ that, when programming, no unexpected behaviour should really be unexpected.
This is something that programmers generally understand. But this wording is far too abstract for most managers. It sounds like a vague, feel-good cliche from a self-help book. You need to find words that get across how it affects them.
To a human, "one in a million" sounds very unlikely. But one of the major reasons we use computers is that they are capable of doing millions of operations per second without getting bored and losing their concentration. If you can program a routine operation, a computer can do it exactly the same way endlessly without mistakes. But computers aren't intelligent (despite the efforts of the media and movie industries to convince us they are). You have to program every tiny detail, or they'll get it wrong in the unusual cases. Managers and other non-programmers usually don't understand this.
Even a fairly complex operation in a computer can easily happen thousands or millions of times per day, so "once every million times" could mean many times per day. If you can get across the concept that failures inside a computer at a mere "one time per million" rate could flood them with hundreds or thousands of failures per day, you might convince people that it's worthwhile to pay you to make the "unlikely" cases work right.

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.

Your argument is dead, Zed by BadAnalogyGuy · 2010-01-09 11:42 · Score: 5, Insightful

Maybe the problem is in your presentation. Even here, you tell programmers that you want to kill them for not understanding a topic that even you are unwilling to acknowledge mastery of. Then you tell us how hard the topic is to understand, even though you've spent so much time trying to learn it.

Is it any wonder that no one takes your suggestions seriously? You are practically sabotaging yourself with self-effacement.

These aren't homework problems you're tackling here. They are business problems and you need to sell yourself and your ideas if you want to get any traction. Do you have any evidence that your methods are better than the SOP thus far? Do you have any case studies that show how effective statistical analysis is in *any* of your projects?

Or are you simply taking something that seems like a data point and extrapolating it to cover a vast swath of applications?

Re:Your argument is dead, Zed by Krishnoid · 2010-01-09 11:57 · Score: 4, Funny

Or are you simply taking something that seems like a data point and extrapolating it to cover a vast swath of applications?
Well yeah, that's what he was saying -- statistics!
Re:Your argument is dead, Zed by ihavnoid · 2010-01-09 12:13 · Score: 1, Offtopic

Well, I think this would be the article Zed needs to read:
http://www.joelonsoftware.com/articles/fog0000000332.html
Basically, many programmers feel that everybody else around him(or her) is a stupid asshole. However, if you want to succeed (e.g. have everybody around you learn statistics), you should never, ever, ever make enemies.
Be productive, work hard, listen to others, and try to do the work in the *right way*. Gain respect from your colleagues, and then they will get interested.
Re:Your argument is dead, Zed by superdana · 2010-01-09 12:14 · Score: 4, Insightful

Maybe the problem is in your presentation.

Meet Zed Shaw.
Re:Your argument is dead, Zed by dbIII · 2010-01-09 12:58 · Score: 2, Insightful

It's just the "beige box is the hard drive and the screen is the computer" problem over again. People pretend they know what they are doing and make stuff up and pretend that they are confident that it is real. This really annoys those that do know what they are doing but don't want to appear to be overconfident because they haven't written the textbooks themselves.
Re:Your argument is dead, Zed by arendjr · 2010-01-09 13:24 · Score: 4, Insightful

I don't know Zed Shaw yet, but I think you're right.
The whole problem he is describing sounds like a big ego problem. He himself has a huge ego, and has problems when he runs across the programmers, who often have huge egos as well.
Now, I think he does make a point though. The programmers he is ranting about indeed do sound like assholes, just like he himself is. In order to be a really good programmer (or a good statistics expert) you should also know when to put aside your ego.
Re:Your argument is dead, Zed by Surt · 2010-01-09 13:49 · Score: 1

I bet Joel is around a lot of people.
But my colleagues all seem pretty smart and talented.

--
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
Re:Your argument is dead, Zed by Hurricane78 · 2010-01-09 15:35 · Score: 5, Funny

I just found a very old hard disk. Double height. MFM/RLL. And after a “strings -n 32 /dev/hdd”, I got the following old saying, carved in the bytes of the disk:

Computer science
Statistics
Social skills
Choose one.
;)

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Re:Your argument is dead, Zed by pwolk · 2010-01-09 21:34 · Score: 1

I find the essence in Zed's: "I try to show them how ...".
The problem is getting the message across, not the statistics.
Zed, try harder, and most of all, try other approaches. People are funny creatures.
Re:Your argument is dead, Zed by lena_10326 · 2010-01-09 21:44 · Score: 3, Insightful

Basically, many programmers feel that everybody else around him(or her) is a stupid asshole
That's one of the reasons working in IT is not all that satisfying. Many problems have multiple solutions which for the most part are equivalent in function but vary in what they're attempting to optimize for (* see below), yet developers seem to latch onto the solution they thought of and become downright rude and nasty when evaluating a teammate's solution. When every developer assumes he is the smartest of the bunch and all others are morons, it fosters an environment where everyone is unwilling to compromise and a 3rd person usually has to step in to break the tie. That leads to a hostile workplace where thought battles frequently occur. Losing a battle causes a teammate to become afraid of undue criticism in the future, so the next time around they over-engineer the code trying to cover all bases. This leads to large systems that solve fairly simple problems with overly complex implementations. After a few cycles of this, the software is unmanageable, which becomes evidence proving to the developer that his teammates and the ones who came before are idiots with no clue, and now it is up to that lone hotshot to bitch about fixing the mess, which of course is accompanied by many nasty critiques and insinuations.
I am a developer with a fairly open mind and I strive to eliminate ego from the workplace by staying on the positive, helpful side, but honestly I'm getting sick of working with people who don't try to do the same.
* Example, solutions can be optimized to target maintainability, readability, CPU/IO performance, availability, reliability, correctness/precision, recovery, automation, reduction of complexity, extensibility, cross platform, resilience to change, parallelism, security, partitioning, modularization, popular design idioms. The list is nearly endless.

--
Camping on quad since 1996.
Re:Your argument is dead, Zed by jbatista · 2010-01-09 22:32 · Score: 1

You're bound to make enemies at some point, simply because not everyone has that high a standard of work ethic. All it takes is one day when someone is feeling like s*** (because everyone else has a girlfriend, because he didn't get the raise, whatever) and you come up to him and point out how he should do his work better, and you've created a grudging enemy. The "be productive, work hard, listen to others" advice is good because it tends to create allies, not because it tends to avoid creating enemies.

--
My sig is better than your sig.
Re:Your argument is dead, Zed by tool462 · 2010-01-10 08:25 · Score: 1

Hey, stop trying to pigeonhole him as some kind of asshole. He's just looking at the statistics. I mean, based on his own observation, his solutions are correct 100% of the time.
Now let's see you try to argue your way out of this one using "logic" and "reason".

Or, how about... by halivar · 2010-01-09 11:43 · Score: 5, Insightful

Statisticians need to learn programming or I will kill them all.

Re:Or, how about... by Max(10) · 2010-01-09 12:23 · Score: 1

"Statisticians need to learn programming or I will kill them all."
No, please don't, leave at least half a dozen so they can do the statistics on your killing the others and then we'll use the Pearson correlation coefficient on their results to find the most incompetent statistician of the bunch whose future work we'll then use to seed our PRNGs.
Re:Or, how about... by SiggyTheViking · 2010-01-09 12:32 · Score: 1

How about you just kill them all?
Right after all the lawyers.
Re:Or, how about... by ruyon · 2010-01-09 12:35 · Score: 1

Better yet, how about "Zed needs to learn manners or I will rip his mouth apart."
Please excuse my English.
Re:Or, how about... by JWSmythe · 2010-01-09 14:47 · Score: 1

The problem with thinning the herd is, despite that it was your idea, and it seemed like a good one, it can likely be expanded to include yourself.
Kind of like the argument of "kill all stupid people". Ok evaluate it based on IQ, and assume that it is your decision that it happens. Take the bottom 10%, and you're clearing out the "unwanted" "stupid" people. That may eliminate everyone with an IQ under 90. The "smart" people may see that there are still "dumb" people, and again want to eliminate the bottom 10% of the population, which may raise the minimum IQ to 100. As subsequent rounds happen, where those who believe they are superior decide that the lessers should die, you will find that there is a subset of the original group who is smarter than you, and you'll find that your head is on the chopping block.
But hey, others have considered ethnic cleansing of various sorts. Those have generally been frowned upon.
For some reason, I can't argue against the lawyers choice though. There may be a few to salvage, but they will be statistically irrelevant.

--
Serious? Seriousness is well above my pay grade.
Re:Or, how about... by vadim_t · 2010-01-10 04:42 · Score: 1

Pedantic: IQ 100 is supposed to be the population's average.
If you kill everybody below a set IQ, the scores will have to be adjusted. The average person will now be smarter, so somebody with formerly an IQ of 100 may now have one of 90.
Re:Or, how about... by JWSmythe · 2010-01-10 06:33 · Score: 1

Well....
The score of 100 was to be the statistical median. The score value remains until the test is renormalized to resume "100" as the median score.
If everyone with a score below 100 were eliminated, the person with a score of 101 would continue to have a score of 101 until the test was renormalized, and only then would they realize they have a score of, say, 80.
If there was such a cleansing, it would seem to be advantageous to not renormalize the scores, otherwise you would end up in an endless loop until your sample set were reduced to 1. It's very lonely being at the top, but I guess someone could be a bit egomaniacal at that point. Just because you're smart doesn't mean that you aren't crazy. :)

--
Serious? Seriousness is well above my pay grade.

Mathematicians just need to shutup. by HornWumpus · 2010-01-09 11:44 · Score: 4, Insightful

We know as much statistics as we need to know.

Some know more, some less. Each has traded off hours vs. knowledge in many fields.

For example: Why would a programmer who's job is to automate bean counting need to know more then basic statistics? (s)he rightfully focuses his efforts on accounting.

One post calculus statistics course gives me enough grounding to know what I don't know and punt to experts when I need to.

Fucking specialists forget all the things they don't know and only look at the world through one lens.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'

Re:Mathematicians just need to shutup. by gardyloo · 2010-01-09 12:02 · Score: 2, Interesting

We know as much statistics as we need to know.
Some know more, some less.
That's either the most honest, insightful comment I've ever seen, or the most useless. I'm 92% sure, with an uncertainty of about +/-5%, that it's the latter.
Re:Mathematicians just need to shutup. by __aasqbs9791 · 2010-01-09 12:14 · Score: 5, Insightful

One post calculus statistics course gives me enough grounding to know what I don't know and punt to experts when I need to.
That's actually his argument (though I'm pretty sure he doesn't realize it, having met him a few years ago at a conference). People need to know their limits, and the strengths (and weaknesses) of others, and defer to them when they know what they're talking about, rather than talking out of their asses. As you point out, you can't know everything, but you'll defer to others who know more when you need to. I'm pretty sure Zed would like working with you based upon that fact alone (I know I value that trait and try to express it myself). Far too many people think they aren't allowed to have any weaknesses (and we all do in some area or another) so they talk a big game, and when push comes to shove, they will actively block people who actually know more than they do about the subject at hand. Working with too many people like that has driven Zed insane (IMHO) and I know I've been close to it at a couple of work places before (and really loved the one that wasn't like that hardly at all).
Re:Mathematicians just need to shutup. by Toonol · 2010-01-09 12:17 · Score: 5, Insightful

But statistics is one of those fields that benefits everybody; it's a bit like probability, logic, or (further afield) history. Lack of a fundamental understanding of statistics can lead you astray in a near-infinite number of ways.

I have sat in business meetings hundreds of times where I've seen decisions made on completely meaningless and irrelevant data, because the people involved don't understand statistics. The same holds true in your personal life; decisions with purchasing products, investing money...

Now, I'll bet that most slashdot readers have the minimum amount of knowledge of statistics to avoid the most egregious errors; but more knowledge is certainly helpful. It will help you in a myriad of ways.
Re:Mathematicians just need to shutup. by Anonymous Coward · 2010-01-09 12:32 · Score: 2, Insightful

Being socially adept is also a skill that benefits everybody, but many programmers just aren't. I hardly know anything about statistics, but I'm not afraid to ask questions. I'm sure there's stuff that other programmers know and think equally fundamental to success that Zed doesn't. It's fantastic that he's passionate about statistics. That skill certainly comes in handy, but how much more important is it than helping everyone on the team get their job done, for example?
Re:Mathematicians just need to shutup. by HornWumpus · 2010-01-09 12:38 · Score: 1, Interesting

Statistics does not benefit everybody equally.
I'd say that if someone has not completed calculus then any statistics in their reach is simply memorize and regurgitate.
Put things in the correct order. Finish calculus then study stats.
The business majors understanding of statistics is the most dangerous.
They don't even know what they don't know.
They can regurgitate the definition of standard deviation but don't remember what normal distribution means.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Re:Mathematicians just need to shutup. by HornWumpus · 2010-01-09 12:41 · Score: 1, Insightful

And your 'one lens' is clueless grammarian.
Condolences to your mother for having spawned such a moron.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Re:Mathematicians just need to shutup. by LostCluster · 2010-01-09 14:44 · Score: 2, Insightful

The stats book I used in college had a table where they computed out the normal distribution equation so that those who didn't know calculus could look the values up. Of course, that means the table had to be distributed on finals day.
Now, there's a funny thing when you write out a table of values. You have to make an intentional mistake, or you're not able to have an effective copyright because the infringer could claim they did the work themselves.
I wrote a computer program to check the values to four digits (because that was the precision of the table) and found the one mistake. Funny thing, there were people who believed everything in the book had to be perfect... they also seemed to each have a favorite religion book, but the people didn't agree on the same one. The professors were alarmed... they had a problem planned for the final that was about to use that value...
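
A check like that takes only a few lines today. This is a sketch, not the original program: the "printed" table is a made-up stand-in with one deliberately wrong entry, and the cumulative normal distribution is computed from the standard library's error function.

```python
import math

def phi(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def find_mismatches(printed_table, digits=4):
    """Return (z, printed, computed) for entries that disagree at the given precision."""
    mismatches = []
    for z, printed in sorted(printed_table.items()):
        computed = round(phi(z), digits)
        if abs(computed - printed) > 1e-9:
            mismatches.append((z, printed, computed))
    return mismatches

# Hypothetical excerpt of a textbook table; the entry for z = 1.0 is wrong on purpose.
printed_table = {0.0: 0.5000, 0.5: 0.6915, 1.0: 0.8431, 1.96: 0.9750}

for z, printed, computed in find_mismatches(printed_table):
    print("z =", z, "printed", printed, "computed", computed)
```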
Re:Mathematicians just need to shutup. by fermion · 2010-01-09 16:40 · Score: 1

There is no doubt that we each have our area of expertise. That is not, IMHO, the question. The question is what a programmer working in a contemporary setting needs to know. There are many things most of us do not need to know. We do not need to know how to write an efficient search or sort routine. We do not need to know how to manage memory. We don't even need to know how to manually debug a program.
Since so much is done for us by the languages and IDEs we use, I think it is reasonable to ask us to know something about the process we program. Programming is deterministic, and this is why many of us do not know much about statistics. OTOH, much of what we are asked to program has a statistical nature. Searches do not always call for exact matches. In word processing and texting, predictive typing does not return exact results. In finance, we want stochastic predictors concerning where the market probably will be tomorrow. Exactness is so 2000's.
And then there is the issue that software developers should be able to, on some level, research, understand, analyze, and create a policy-based solution to a problem. Ignore the fact, as stated in the previous paragraph, that not all these problems are going to have an exact or trivially reproducible solution, and we are still left with understanding the problem. This involves some knowledge of statistics and its vagaries. Lack of knowledge can lead to massively incorrect understanding. For instance, late last year a paper was published comparing subjective and objective measures of happiness. In this paper it was shown that if, on average, a state in the US expressed subjective happiness, there was a good chance that state would be happy using objective data. Even my understanding of this is not great, and the explanation is oversimplified, but the basic idea is there. In fact, I look at the data and say that the correlation is not all that great, but I will admit the variables do show at least some limited correlation. The problem is that the popular media takes this graph, which compares two techniques of measuring a variable and does not order that variable or imply the variable has any inherent meaning, and uses the data to say that some states are "happy" and some states are "not happy". Clearly we don't expect journalists to have a sufficient grasp of math or science to understand why what they did was unethical, but we should expect that anyone above the level of code monkey would have such an understanding. Otherwise we are going to have programs that claim to give us valid or otherwise reliable results, when in fact what we have is simply someone's faith that it is a good result, without any well-known and well-regarded method to back it up.

--
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Re:Mathematicians just need to shutup. by Anonymous Coward · 2010-01-09 17:05 · Score: 1, Informative

I'd say that if someone has not completed calculus then any statistics in their reach is simply memorize and regurgitate.
Put things in the correct order. Finish calculus then study stats.
Horseshit. You can use a distribution without having to integrate it in a great many scenarios that would benefit many people. Also, for discrete statistics - which is probably of more immediate use to most people - you can replace that nasty integral with addition.
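
To make the "addition instead of integration" point concrete, here is a small sketch using a binomial distribution; the coin-flip and die numbers are arbitrary examples chosen for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k): a plain sum where a continuous distribution would need an integral."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def expected_value(outcomes):
    """Expected value of a discrete distribution given (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)

# At most 5 heads in 10 fair coin flips.
print(binom_cdf(5, 10, 0.5))

# Expected value of one fair six-sided die.
print(expected_value([(face, 1 / 6) for face in range(1, 7)]))
```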
You don't have to know everything about a field to use parts of it. As parent said, I think many, many people would benefit from common-sense concepts combining statistics and logic, just so you can make good decisions about purchases and such. Read a book called "Innumeracy" to see the level of stupid I'm talking about - calculus is so far beyond that kind of dumb it's sad. I would settle for people being able to intuitively understand the implications of Bayes' rule. Understanding why prior probabilities are important would be a big start, and there's no calculus in that.
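
Bayes' rule really is just multiplication and division. A minimal sketch of why the prior matters, with a made-up test (1% prevalence, 99% sensitivity, 5% false positive rate) chosen only for illustration:

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive result) by Bayes' rule."""
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1.0 - prior))
    return true_positive_rate * prior / evidence

# Hypothetical numbers: rare condition, fairly accurate test.
rare = posterior(prior=0.01, true_positive_rate=0.99, false_positive_rate=0.05)
common = posterior(prior=0.50, true_positive_rate=0.99, false_positive_rate=0.05)

print("posterior with 1% prior: ", round(rare, 3))
print("posterior with 50% prior:", round(common, 3))
```

The same positive result means very different things depending on the prior, and no calculus was needed to see it.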
For reference, I have had calc and stats as part of my math minor. In my job, I use statistics daily. I use things I learned in calculus alone a lot more seldom.
The business majors understanding of statistics is the most dangerous.
Oh yeah. Are you a six sigma black belt? ;)
Re:Mathematicians just need to shutup. by snowgirl · 2010-01-09 19:44 · Score: 1

I have sat in business meetings hundreds of times where I've seen decisions made on completely meaningless and irrelevant data, because the people involved don't understand statistics. The same holds true in your personal life; decisions with purchasing products, investing money...
Investing money... isn't that why I'm paying $400 a month into the lottery? I mean, I'm going to win it eventually, right?

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:Mathematicians just need to shutup. by St.Creed · 2010-01-10 00:27 · Score: 1

Zed, is that you?

--
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
Re:Mathematicians just need to shutup. by wtfudgecakes · 2010-01-10 03:51 · Score: 1

And you need to learn to spell the word than.
Re:Mathematicians just need to shutup. by AthleteMusicianNerd · 2010-01-13 05:02 · Score: 1

Far too many people think they aren't allowed to have any weaknesses (and we all do in some area or another) so they talk a big game, and when push comes to shove, they will actively block people who actually know more than they do about the subject at hand.
In my experience, the size of the game they talk is inversely proportional to the amount of real knowledge they have. In response to the title of the thread, George Boole - Mathematician.

Title fail. by girlintraining · 2010-01-09 11:44 · Score: 5, Funny

Programmers Need To Learn Statistics Or I Will Kill Them All

Okay, two things: First, threatening programmers never works. Management's been trying that for years. Second -- don't you mean 'kill -9' them all, or maybe demalloc(), or cast them to void*, or one of a dozen other witty things you could do besides the mundane answer of threatening stabby bits on them because you have a case of intellectual snobbery?

--
#fuckbeta #iamslashdot #dicemustdie

Re:Title fail. by rolando2424 · 2010-01-09 12:36 · Score: 1

demalloc()
Don't you mean free()?

--
Okay seriously I've just run out of pointless things to say.
Re:Title fail. by bipbop · 2010-01-09 12:39 · Score: 1

There's a surprising number of google hits for demalloc! (Typing random crap at the end of my entry to pass the time until Slashdot's 18-second-or-whatever lower bound is exceeded. Meh.)
Re:Title fail. by girlintraining · 2010-01-09 12:48 · Score: 3, Informative

Don't you mean free()?

#include <stdlib.h>
#include <stdhumor.h>

void demalloc(void *ptr);

void demalloc(void *ptr)
{
    /* I meant to say */
    free(ptr);
}

--
#fuckbeta #iamslashdot #dicemustdie
Re:Title fail. by Anonymous Coward · 2010-01-09 13:44 · Score: 5, Funny

or firefox's implementation:

void demalloc(void *ptr)
{
/* noop */
return;
}
Re:Title fail. by croux · 2010-01-10 01:51 · Score: 1

or Windows implementation :

void demalloc(void *ptr) { ptr = NULL; }

and IE implementation :

void demalloc(void *ptr) { free(ptr + 4); }

Really? by Anonymous Coward · 2010-01-09 11:45 · Score: 1, Funny

Zed Shaw says: "I've been studying it for years and years and still don't think I know anything"

Don't you think this might be telling you something, like... perhaps statistics are too hard for you? Leave the real work to the people who do know what they are doing and do know something about the field: programmers.

Re:Really? by Anonymous Coward · 2010-01-09 11:48 · Score: 1, Insightful

Statistics is "just" applied measure theory. Which means, among other things, that its language is Turing complete. There is infinitely much to know.
Re:Really? by Sulphur · 2010-01-09 12:24 · Score: 1

Statistics is "just" applied measure theory. Which means, among other things, that its language is Turing complete.
Can you give a traveling salesman analogy for that?

My advice: take a statistics class as an undergrad by j1m+5n0w · 2010-01-09 11:49 · Score: 1

I never took a statistics class as an undergrad. In retrospect, I think it would have been very useful, probably more so than the calculus I took (which I think is also a very good thing to know, but stats tend to be used more often).

The funny thing is he's doing exactly the same by Rix · 2010-01-09 11:52 · Score: 4, Insightful

He's just as arrogantly claiming that he's right and they're wrong. Now, he may very well in fact be right, but he's taking the same obstinate position the people he criticizes do.

It's important to know when your input is not desired. Even if you think it should be.

Re:The funny thing is he's doing exactly the same by DMiax · 2010-01-09 14:32 · Score: 1

Maybe, but even then he is taking an arrogant stance in his field of expertise, while they are doing the same outside theirs. I assume he would not do the same when talking about programming, since that would obviously be hypocritical.
Re:The funny thing is he's doing exactly the same by Mr0bvious · 2010-01-09 16:15 · Score: 1

Oh look, we'll have to agree to disagree, regardless of how wrong you are.

--
Never happened. True story.

The reason people ignore you Zed.. by Anonymous Coward · 2010-01-09 11:54 · Score: 5, Insightful

is not because they don't understand statistics. It is because you are a dick.

Re:The reason people ignore you Zed.. by dbarclay10 · 2010-01-09 14:05 · Score: 2, Interesting

Your comment ("the reason people ignore you is because you're a dick") is clearly a troll, but it was also moderated Insightful ... which might also be a troll :)

Nevertheless, assuming for a moment that you're being truthful in your expression, then I have this to say:

This is what is wrong with the world today. Billions upon millions of morons who don't know what they're doing, and people trying to show them how to (or, hell, what the fuck - people trying to beat them into) do(ing) it the right way.

You want these assholes who can't even figure out how to correctly measure something to build the bridge you drive over twice a day? How about the building you work in?

Or I dunno, maybe you'd prefer having _only_ people who will point out errors when they see them working on it? How about your doctor? You want your operating room filled with maybe one smart guy who recognizes an error and six people who don't know any better? And you're saying that, when the smart guy recognizes the error and tries to point it out (no matter HOW he does it, though I'm betting the original poster isn't that much of an asshat at work), he's being a dick?

Christ, what's wrong with you? Seriously?

--

Barclay family motto:
Aut agere aut mori.
(Either action or death.)
Re:The reason people ignore you Zed.. by Anonymous Coward · 2010-01-09 14:38 · Score: 4, Insightful

Claiming that the author is a dick is not mutually exclusive with him having a good point. The author is right in his claims that people who don't know what they're talking about often think they do and get pissy when someone claims otherwise. But the author presents this viewpoint in a really stupid manner. It is dickish to say, essentially, "Hey idiot, you're wrong", even if the person is wrong.
Note how your response is dickish, but probably right in claiming that the world is filled with arrogant/stubborn people.
Re:The reason people ignore you Zed.. by Jedi+Alec · 2010-01-09 17:18 · Score: 2, Insightful

Ah, the good old clash between the real world and the way you(we?) think it should be.
Pointing out that people are wrong is a sensitive process. If you do it the wrong way, you provoke an emotional response that stops the person you're trying to convince from absorbing what you're trying to tell them.
It doesn't matter if you're right or wrong, if you convey your information in a way that is perceived as "being a dick" it will never reach its destination. That sucks, but it's just the way most human beings work. And I very much doubt that this is a root cause of "what is wrong with the world today", unless people getting pissed off because some know-it-all jackass is telling them they're a moron is a recent development.

--

People replying to my sig annoy me. That's why I change it all the time.
Re:The reason people ignore you Zed.. by Anonymous Coward · 2010-01-09 18:41 · Score: 1, Interesting

No. You cut out a large portion of my comment.
I said people ignore Zed
a) because he's a dick.
b) not because they don't understand statistics.
So what I am saying is Zed is not the one single 'smart guy' surrounded by a whole lot of incompetent assholes, he is one of those guys who thinks he knows better than everyone else about everything.
He can't understand, and can't be made to understand why most of the crap he likes to bang on about is irrelevant to the problem at hand and so he is largely ignored. If he thinks women listen to him more, it's simply because women are generally better at *seeming* to listen to dickheads in order to make them shut up.
One of my degrees is in experimental physics. I have a strong understanding of the application of statistics to measured data. I often take performance measurements without considering deviation, because deviation is not relevant to what I am doing at the time. Apparently this makes Zed want to kill me.
Re:The reason people ignore you Zed.. by freedomlinux · 2010-01-09 18:51 · Score: 2, Insightful

I think all statisticians should have to learn written communication skills.
Zed sure embarrasses himself by writing such an atrocious piece of garbage.

Maybe people would listen to Zed if he didn't:
a.) Depend on vulgar language to emphasize an argument (and subsequently)
b.) Prove himself to be a huge douchebag.
Re:The reason people ignore you Zed.. by adamchou · 2010-01-10 00:01 · Score: 1

Since you don't seem to mind how people convey a correct message, even if they are complete assholes, I hope the next time someone explains to you how to do something that you did wrong, they punch you in the face.
Re:The reason people ignore you Zed.. by khallow · 2010-01-10 04:05 · Score: 1

You want these assholes who can't even figure out how to correctly measure something to build the bridge you drive over twice a day? How about the building you work in?
How about people who can't communicate? What's the point in having someone who can do it right, but can't convince anyone else? An ineffectual Cassandra? "Being a dick" means poor communication and negotiation skills.
Re:The reason people ignore you Zed.. by mdwh2 · 2010-01-10 08:55 · Score: 1

Your comment ("the reason people ignore you is because you're a dick") is clearly a troll
TFA contains gems such as "they dont know shit", and "their confidence in their lacking knowledge is only surpassed by their lack of confidence in their personal appearance".
So if that's anything to go by, trolling shouldn't mean you get modded down, it should mean you get a front page Slashdot article...
Or I dunno, maybe you'd prefer having _only_ people who will point out errors when they see them working on it? How about your doctor? You want your operating room filled with maybe one smart guy who recognizes an error and six people who don't know any better? And you're saying that, when the smart guy recognizes the error and tries to point it out (no matter HOW he does it, though I'm betting the original poster isn't that much of an asshat at work), he's being a dick?
I wouldn't want to rely on someone who makes errors such as generalisations based on anecdotes, nor am I going to ever be persuaded by an argument that relies on ad hominems rather than reason.

Statistics is HARD by omb · 2010-01-09 11:54 · Score: 4, Informative

Statistics is HARD, for two reasons:

(a) Probability theory, on which all practical Statistics is based, is both (i) counter-intuitive and (ii) difficult

(b) The very Mathematics on which it is based is obscure

And, worst of all, it is uniformly badly taught, even in good universities, and the Statistics for XXX are uniformly awful, blind leading the blind.

Lastly, it is very hard to get a straight answer from a mathematical Statistician.

Re:Statistics is HARD by codewarren · 2010-01-09 12:09 · Score: 3, Funny

Statistics for XXX are uniformly awful, blind leading the blind.
They have statistics for porn? (!!)
What could be wrong with that? And blind on blind action? Strange, but interesting.
Re:Statistics is HARD by digitalhermit · 2010-01-09 12:12 · Score: 1

Can't agree with that.
Basic statistics as taught in a beginning stats class is counter-intuitive because they don't teach the calculus behind it. But it's actually quite simple to use. The tough part is figuring out what statistic to apply to a given problem. It's not difficult. There's a reason that it satisfies the "basic math requirements" for a business major and physical therapy major.
The mathematics behind statistics is Calculus 2 which is hardly obscure. The Statistics with Calculus class in fact only requires a Calc 1 understanding; i.e., knowledge of limits, differentiation and integration. What the statistics course teaches is how to apply those tools and not the reasoning behind how they work.
And yes, statistics is often badly taught, but I can say that about almost every undergrad math course that I ever took.
Re:Statistics is HARD by LostCluster · 2010-01-09 12:23 · Score: 1

I didn't have much trouble with statistics in college after having studied physics the year before in high school, and I firmly believed that formulas are taught because they've been proven true, so you just need to remember the steps to get something done, and the numbers just fill in the variables. More numbers involved, but there are still formulas.
I had such an easy time with the course, and had trouble hiding that, that I would regularly be visited by students asking for help on Sunday on the homework that I had completed after class on Friday. Doing the homework within minutes of it being taught helped greatly. It led me to be totally free of work over the weekend while others put it off, and some waiting for me to return from my hometown.
Re:Statistics is HARD by radtea · 2010-01-09 12:28 · Score: 4, Insightful

Statistics is HARD, for two reasons:
I'd argue that probability theory isn't as hard as people make it seem, but statisticians are wankers. Most of what we think of as statistics was developed by people who were intimately engaged with empirical research, but modern statisticians are mathematicians, many of whom have never actually performed an experiment. They think the statistics are real, whereas experimental scientists know the truth: God made the Probability Distribution Functions. All else is the work of man.
Furthermore, modern computing has made a lot of the conceptual apparatus of conventional statistics irrelevant, as it is designed to deal with the problem of reducing problems to something that can be computed by hand and finished off with a single table lookup. Today it's a rare case that we can't get at the PDFs directly, bypassing much of conventional statistics. But due to how badly the stats are taught, and how poorly probability theory is understood, we are still living in a world where p-values are the exception, not the norm, and when they are quoted they are frequently unrealistic because they are based on statistical assumptions that are not warranted given the non-idealities of the data.
So I'd argue that statistics is basically a dead field populated by zombies who are dedicated to infecting as many students as possible. If we taught thermodynamics or mechanics with equally outmoded concepts they would be really hard too.

--
Blasphemy is a human right. Blasphemophobia kills.
Re:Statistics is HARD by wfolta · 2010-01-09 12:33 · Score: 1

You hit the nail on the head. Statistics is counter-intuitive and badly taught. But extremely important.
The worst grade I got in undergraduate studies was in Probability, and in graduate studies I've been exposed to statistics now for about the 4th time and it's finally sinking in... mostly... a lot.
That said, there is need for statistics in any programming endeavor where you are trying to come up with a new algorithm or trying to improve the performance of an existing one. I can think of the kind of pitiful "ran it several times and this one's faster" testing I would have done in the past, and all the logical hand-waving I would have done if questioned, "Can we be SURE it's faster?", and it's embarrassing. If you're just coding, perhaps no need, though a good feel for how real statistics and scientific experimentation is done is very helpful in programming.
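A minimal sketch of what answering "Can we be SURE it's faster?" might look like, assuming Python and nothing beyond the standard library. The timings are made-up numbers, not measurements of anything, and `welch_t` is a hypothetical helper name. It computes Welch's unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom for two sets of run times.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 in the denominator), scaled by sample size.
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical run times in milliseconds, invented for illustration.
old_ms = [102.0, 98.0, 105.0, 110.0, 95.0, 101.0, 99.0, 108.0]
new_ms = [97.0, 103.0, 94.0, 100.0, 106.0, 92.0, 99.0, 96.0]

t, df = welch_t(old_ms, new_ms)
print(f"mean old = {statistics.mean(old_ms):.2f} ms")
print(f"mean new = {statistics.mean(new_ms):.2f} ms")
print(f"t = {t:.2f}, df = {df:.1f}")
```

With these invented numbers the new version averages about 4% faster, yet the t statistic stays below the usual two-sided 5% cutoff (roughly 2.15 near 14 degrees of freedom), so eight runs each would not be enough to claim a real speedup.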
Re:Statistics is HARD by thesandtiger · 2010-01-09 12:35 · Score: 5, Interesting

I don't think it's hard - I just think it requires a different way of thinking than most programmers usually take to maths.
As a programmer/developer who went into research (in social sciences, so it's really soft), I can say that in my experience stats is really closer to a programming language than it is to other maths. Here's why:
1) You have a LOT of tools to pick from. What kind of analysis do you want to do? What kind will give you the most useful result? What kind is your data amenable to?
2) You don't always have a clear choice as to which is the best for a given situation. Sometimes you need multiple different types of analysis to really get the full picture.
3) Just because it's math doesn't always mean it's right. There's some crazy ass black-box magic stats stuff we use for one project of ours that, in theory, will let us figure out the demographic composition of an unknown target population. Maybe. Sometimes. If the wind is right. Or not.
4) At the advanced levels, it's fucking insane. People who hack stuff like ultra optimized 3d engines with large quantities of assembler or whatever always wigged me out because my brain just doesn't work that way. With the really complex stats stuff it's the same way - I can plug and chug with the formulas, but I honestly have about as much comprehension of why some of the more advanced stuff works as my dog has of CPU design.
5) If you know the basics, you know just enough to be dangerous and really piss off people who know what they're doing. Being able to run an anova or determine correlation makes some people think they actually know what's going on because, hey, it's math. But a lot of people who just do the basic stuff think their results are more meaningful than they actually are - falling prey to the whole "it's statistically significant therefore it must be IMPORTANT" fallacy (when you can certainly have things that are "statistically significant" but actually have virtually no impact on the outcome).
6) Even when people know their shit, they disagree. A fine example of this would be the Space Shuttle failure rate - you had people saying that the shuttle would suffer a critical failure from everywhere between 1 in 5 and 1 in 50,000 launches. And depending on what tools they used to do their analysis, they were correct. Same as with programming languages - depending on the problem, equally skilled programmers might pick entirely different languages to use because they think one part or another is more critical.
Honestly, I really enjoy stats - if I had to do it all over again I would probably have spent a LOT more time working with stats than I did as a programmer in my younger years - but I won't pretend that it's totally clear what tools to use when. The author of TFA should do well to realize that even fellow statisticians would probably slap the shit out of him over some of his beliefs about how to properly go about utilizing stats toolsets.
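The "statistically significant therefore IMPORTANT" fallacy from point 5 can be sketched in a few lines. This is an illustration under stated assumptions, not a recipe: it assumes a two-sample z-test with a known, shared standard deviation, equal group sizes, and invented numbers. `two_sample_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sigma, n):
    """Two-sided z-test for a difference of means (known sigma, n per group).

    Returns (z, p_value, effect_size), where effect_size is the difference
    expressed in units of the standard deviation.
    """
    standard_error = sigma * math.sqrt(2.0 / n)
    z = (mean_a - mean_b) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    effect_size = (mean_a - mean_b) / sigma
    return z, p_value, effect_size


# Invented example: a 0.1-point difference on a scale with sigma = 15,
# measured on a million subjects per group.
z, p, d = two_sample_z(100.1, 100.0, sigma=15.0, n=1_000_000)
print(f"z = {z:.2f}, p = {p:.2g}, effect size = {d:.4f} standard deviations")
```

With a million subjects per group the tiny difference clears any conventional significance threshold, while the effect size stays under a hundredth of a standard deviation. Whether that matters is a judgment the test cannot make for you.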

--
Since I can't tell them apart, I treat all ACs as the same person.
Re:Statistics is HARD by Anonymous Coward · 2010-01-09 12:37 · Score: 2, Insightful

The mathematics behind statistics is _not_ "Calculus 2". It is measure theory and analysis.
Re:Statistics is HARD by omb · 2010-01-09 13:56 · Score: 2, Insightful

Sorry, the replies indicate just how correct what I wrote was:

1. It is not about formulas, or Calculus xxx, it is about really understanding what you are doing, and how all the formulas were derived, and some of that is really heavy Pure Mathematics in particular Algebra and Analysis, so that, if necessary, you can work out the probability theory in new situations.

2. In addition to the Math, there is Logic, Philosophy and Science in Experimental Design.

The big problem is that people who just know the formulae misapply them to the wrong experimental situations.

The most topical current example is the AGW controversy where some Climatologists, HAD-CRU, eliminated (perceived outlier) data not realising that would mean that confidence estimators on their data were thereby faulty, so all that work must be re-done.
Re:Statistics is HARD by Sycraft-fu · 2010-01-09 15:21 · Score: 1

A good one for demonstrating that "statistically significant" doesn't mean "useful" is to run an analysis on the number of days since a project has started. You will, of course, find that the data is a perfect fit for a line and thus "significant." There is virtually no chance this was random. Of course that means nothing, all you've done is find a fancy way to say "Each day, the number of days since the project started increases by one," which is a useless statement.
It's a useful way of showing people that just because you have data that gives a "significant" result, doesn't mean that result is useful at all. Of course in real analysis it can be more subtle, but this shows it in an obvious way.
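For anyone who wants to see the "days since the project started" example in numbers, here is a rough sketch in Python using only the standard library. The 30-day span and the helper name `fit_line` are assumptions made for illustration. It does an ordinary least-squares fit and reports the slope, intercept and r^2.

```python
import statistics

days = list(range(1, 31))            # day index: 1, 2, ..., 30
elapsed = [float(d) for d in days]   # "days since the project started"


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(days, elapsed)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r^2 = {r_squared:.3f}")
```

The fit is perfect because the second column is the first column by construction, so the result restates the definition and tells you nothing about the project.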
Re:Statistics is HARD by kramerd · 2010-01-09 16:33 · Score: 1

You have got it so wrong it hurts to read. The number of days since the project started on day 4 will always be 4. If it turns out that it is 5, then this finding is statistically significant. If it turns out to be 4, then it is not.
If something is statistically significant, it simply means that it was unlikely to have occurred by chance. In your example, there would be no deviation from expectations, as r^2 = 1, so there is no statistically significant deviation, nor would the findings be significant. It is almost certain that your example would have such an r^2 value because it occurred by chance, as there was no possibility of deviation. This does not mean that in a separate example (such as one where the population contains deviations from the mean), finding an r^2 value of 1 and no deviation would not be a significant finding; in fact, for most studies, finding no deviation from expectation would indicate that you did your study incorrectly.
Next time, try understanding the words you use before posting.
Re:Statistics is HARD by snowgirl · 2010-01-09 19:55 · Score: 1

Honestly, I really enjoy stats - if I had to do it all over again I would probably have spent a LOT more time working with stats than I did as a programmer in my younger years - but I won't pretend that it's totally clear what tools to use when. The author of TFA should do well to realize that even fellow statisticians would probably slap the shit out of him over some of his beliefs about how to properly go about utilizing stats toolsets
If I had it to do over again, I'd probably actually go to my stats class, rather than not even show up for half of the semester, then study for 24-hours before learning it all, and then collapsing asleep after the test.
Funny thing was, I ended up with a B in the class, because I aced the final. It's really the only test I ever studied for, and of course if you didn't know this, cramming for a test means that you won't remember any of it later... same with me, I'm at a complete loss for doing any of the stuff from my stats class.

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:Statistics is HARD by snowgirl · 2010-01-09 20:19 · Score: 1

You have got it so wrong it hurts to read. The number of days since the project started on day 4 will always be 4. If it turns out that it is 5, then this finding is statistically significant. If it turns out to be 4, then it is not.
If something is statistically significant, it simply means that it was unlikely to have occurred by chance. In your example, there would be no deviation from expectations, as r^2 = 1, so there is no statistically significant deviation, nor would the findings be significant. It is almost certain that your example would have such an r^2 value because it occurred by chance, as there was no possibility of deviation. This does not mean that in a separate example (such as one where the population contains deviations from the mean), finding an r^2 value of 1 and no deviation would not be a significant finding; in fact, for most studies, finding no deviation from expectation would indicate that you did your study incorrectly.
Next time, try understanding the words you use before posting.
Metapwn coming. Let's conduct a thought experiment. Every day, you ask a person for a number, and they give you a value, which you then record. You then evaluate those numbers, and find that each day that you ask the person for a number, they gave the number of days that had elapsed since you started asking them that number.
It is statistically significant to say that the number given by the person is equal to the number of days since the beginning of the questioning. It is, however, unsurprising in the case where you knew that the person was going to do so after the fact.
Statistical significance has nothing to do with expectations, it simply states that what you're showing is unlikely to be because of chance... NOT that results are interesting, practically significant, or unexpected.
Another thought experiment. We sample a population for Y chromosomes, and their sex. We find a statistically significant relationship between having a Y chromosome and the sex being male. Namely, it is not just chance that the sex of the individual is male, and they have a Y chromosome.

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:Statistics is HARD by dookiesan · 2010-01-09 21:02 · Score: 1

Much research in statistics is focused on very applied problems in computational biology. You are right that statisticians do not perform the experiments; it's unrealistic to expect them to have the lab experience necessary. The mathematical statisticians are working on problems such as multiple testing (in many studies there are hundreds of thousands of hypotheses being tested) and inferences of very high dimensional data. These are relevant topics and the work is motivated by the recent shift in the types of data we collect in experiments.
Some interesting subsets of computer science (machine learning and AI) are now focused on statistical models. You have pointed out that much statistical work isn't developed by statisticians. That doesn't mean the field is dead -- it is thriving!
Some Bayesians who rely on MCMC may not care as much, but generally it is still important to find models with closed form likelihoods and optimization updates. Applied statisticians work hard to keep things in closed form precisely because it matters in practice (but not in theory).
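To make the multiple-testing problem mentioned above concrete, here is a small simulation sketch in Python (standard library only). It assumes every one of the 100,000 hypotheses is truly null, so each p-value is uniform on [0, 1), and it uses the Bonferroni correction only because it is the simplest one to write down, not because it is what anyone in computational biology necessarily uses. The seed and the test count are arbitrary choices.

```python
import random

random.seed(0)      # arbitrary seed so the run is repeatable

m = 100_000         # number of hypotheses tested
alpha = 0.05        # per-test significance level

# Under a true null hypothesis the p-value is uniformly distributed,
# so we can simulate m "no real effect" tests by drawing uniform numbers.
p_values = [random.random() for _ in range(m)]

# Naive approach: call anything below alpha a discovery.
naive_hits = sum(p < alpha for p in p_values)

# Bonferroni: divide alpha by the number of tests.
bonferroni_hits = sum(p < alpha / m for p in p_values)

print(f"uncorrected 'discoveries': {naive_hits} of {m}")
print(f"after Bonferroni:          {bonferroni_hits} of {m}")
```

Without correction you should expect on the order of alpha * m = 5,000 false "discoveries" even though nothing is there. The price of Bonferroni is lost power, which is part of why the subject is still being worked on.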
Re:Statistics is HARD by kramerd · 2010-01-09 22:01 · Score: 2, Informative

Thought experiment 1 - this would be a significant finding, provided that you did not ask how many days have passed as the number you are asked for; if you ask for a number and they respond in a pattern, that is a significant finding. It would be statistically significant, however, that patterns are used if you instruct the person to follow such a specific pattern, because you removed the opportunity for variability by your instruction. You don't have a random sample if there is no variability in a population. I already covered this.
A sample, on the other hand, would not be one person responding with a number day after day after day; rather it would at least be hundreds (if not thousands) if you wish to extrapolate to the population (either people or numbers). You can't do it with 1 data point.
Your third paragraph is confusing, because it is a paraphrase of what I said, only it doesn't fit with everything else you say in your reply.
Second thought experiment - Sex isn't male (neither are y-chromosomes), you are thinking of gender. Regardless, you would find a statistically significant relationship between males having a y-chromosome, because it's part of the definition of how you define gender. This would be like sampling a population of red cars to determine if they are red.
Try coming up with a population that has variability, so that taking a sample makes sense, and you will see that statistical significance matters.
Re:Statistics is HARD by janwedekind · 2010-01-10 00:06 · Score: 1

Do you want a straight answer?
Re:Statistics is HARD by shic · 2010-01-10 00:42 · Score: 1

5) If you know the basics, you know just enough to be dangerous and really piss off people who know what they're doing. Being able to run an anova or determine correlation makes some people think they actually know what's going on because, hey, it's math. But a lot of people who just do the basic stuff think their results are more meaningful than they actually are - falling prey to the whole "it's statistically significant therefore it must be IMPORTANT" fallacy (when you can certainly have things that are "statistically significant" but actually have virtually no impact on the outcome).
Here, I think, you've mentioned a critical issue with statistics. The field of statistics is treated as if it were an omniscient black box by the majority of the population, and the more complex the calculation and the less people understand about the process, the more weight the statistic tends to have on decision making. One idea that interests me is that this might be valid for some 'real life' situations... especially where the people are utterly bamboozled by the statistics but use them to guide their behaviours in the absence of credible alternatives... essentially turning arbitrary choices into self-fulfilling prophecies. I am intrigued as to whether or not such subtlety would allow one to devise statistics that have subversive systemic effects.
I would love to receive a recommendation for a solid statistics reference book. All the books I can find infuriate me - they talk down to the reader and appear to be written for someone who is innumerate... labouring the trivial and obvious - while skimming the non-obvious dismissively. Another thing that frustrates me about statistics texts is that they frequently focus on 'case' studies - encouraging the reader to assume that they can re-apply techniques when they encounter similar-looking situations in real life. This is utterly bonkers - since, without a full understanding of the mechanics of the calculations, it is impossible to determine if the similarity between two situations is an irrelevant detail or crucial to the technique. I'm not interested in 'learning by example' - so I can copy a bunch of people whose views I distrust... I'd like a reference that allows me to use a typical approach in my statistical reasoning rather than re-inventing the wheel myself. I'd like it to be comprehensive and compact - I don't want it to aim to indoctrinate me to think in a particular way in my analyses.
(Can anyone recommend such a book?)
Re:Statistics is HARD by jozmala · 2010-01-10 02:16 · Score: 1

statistics is EASY, at least in university.
Just look at a dozen previous exams
spend one hour with someone who really knows their math to solve some of the problems.
Go to exam.
Repeat until passed.
Sometimes statistical analysis gives you enough exact questions that are coming in next exam.
If it doesn't then you can continue collecting more data, until you get it right ;)
And it doesn't even matter if it's the anomalous data point the time when you pass.

--
©God :Copyright is exclusive right for creator to determine the use of his creation.
Re:Statistics is HARD by thesandtiger · 2010-01-10 03:20 · Score: 1

To be honest, you'd probably be best served with a text book - I found that most of the ones used at varying levels when I returned to school for my psych degree were of the sort that taught like this: "Here is a statistical concept, here are some strong sides to it, here are some weaknesses. Here are 3 or 4 situations and some data about them - use this new concept to analyze those situations" and then the classes would largely be about what the results "showed" and whether or not those results were even remotely solid and reasonable and if so, why, if not, why not.
Many non-text stats books are written with an agenda - the authors want to promote their tool as the best one for any job - so they take the approach you decry. Text books that cover a variety of methods tend to be a lot less preachy and encourage you to recognize the strengths and limits of many methods.
Right now in my lab, we have one guy who is on an HLM (hierarchical linear modeling) kick, and he just will not shut the fuck up about how wonderful it is. Any time we have event based data (as in, something happened to an individual several different times and we have some info about each event) he just goes CRAZY with this stuff. And it is useful... For some of his stuff... but he's trying to use it in situations that don't apply because it's just neat.

--
Since I can't tell them apart, I treat all ACs as the same person.
Re:Statistics is HARD by philosiphus · 2010-01-10 04:14 · Score: 1

Zed does not distinguish between probability and statistical inference. He describes measurements that have been taken -- making inferences from data. He only mentions statistics in the context of inferring what happened rather than model the probability of an event happening in the future. That is only part of the problem in programming where, ideally, you should have 100% certainty, in the absence of exceptional situations (machine loses power, runs out of memory), that given the same inputs your program will produce the same outputs.
I agree with an implicit part of his rant: too many people (men and women) tend to rationalize intellectual laziness. Based on my own experience with people holding doctoral degrees in physics and mathematics, I have to say Zed sounds like he is in an "enterprise" situation where people can be lazy. Perhaps he would be happier in another workplace. Some of those mathematicians have been sloppy programmers and would even take offense if you (as a programmer) try to show them how to correct their program so it does not destroy your system. That does not mean all mathematicians are bad programmers and I have met many who wrote programs I would be proud of.
Which brings me to his comment on women. If some women in the population of people around Zed listen to him it does not mean all women will. In fact, the population of a single workplace (Zed's) does not provide enough data to support Zed's proposition that most women would listen to him. I have met (and enjoyed working with) many men who conscientiously try to do the best job they can and are more interested in mastering the intellectual problems at hand than proving that they are the best or most knowledgeable. So fuck you, Zed, you sexist bastard. Thanks for giving the feminists and leftists one more argument to rationalize their sexual discrimination and force men to be contractors, expendable and generally unemployed, while all the women get to keep the full-time employment.
Re:Statistics is HARD by toddestan · 2010-01-10 05:58 · Score: 1

They have statistics for porn? (!!)
Yes, and it's HARD.
Re:Statistics is HARD by shic · 2010-01-10 09:10 · Score: 1

I'm certain I'd be best served by "a text book" - the question, of course, is which one? Browsing at my local bookshops has turned up nothing worth buying - and I get no further with Amazon et al.
I'd really like a definitive reference book on statistical methods... one that assumes its reader is mathematically able, but only occasionally engaged in statistical work. :)
Re:Statistics is HARD by thesandtiger · 2010-01-10 10:34 · Score: 1

I can't say the one I'd recommend is definitive by any means, but it was useful in that it presented statistics in a way that made the concept of "these are tools that can be used in a variety of ways, but you'll have to learn over time what methods work for your various projects" very obvious - it's basic (undergrad stats for behavioral sciences level) but it can be useful. The book is:
ISBN-13: 9780471509820
Lawrence Grimm, Statistical Applications for the Behavioral Sciences

--
Since I can't tell them apart, I treat all ACs as the same person.
Re:Statistics is HARD by snowgirl · 2010-01-10 12:56 · Score: 1

Second thought experiment - Sex isn't male (neither are y-chromosomes), you are thinking of gender. Regardless, you would find a statistically significant relationship between males having a y-chromosome, because it's part of the definition of how you define gender. This would be like sampling a population of red cars to determine if they are red.
Actually, a Y chromosome is NOT guaranteed to produce a male. For instance, if there is a mutation in the SRY gene that makes it inactive, then the person will develop entirely as if they had a nominal X chromosome (they will develop ovaries). Next, the SRY gene only guides the development of the gonads.
Once the gonads have been established, they secrete hormones which influence the further development of what we consider to be "primary sexual characteristics". If there is a mutation in the androgen receptor gene, or in the androgen gene itself, then the person will respond ineffectively to the hormones produced by the testicles, thus resulting in the person developing female primary sexual characteristics. But, even then, there are at least two different kinds of primary sexual characteristics.
Androgens affect the development of the external genitalia between being a penis and scrotum, to being a clitoris, and labia with a shallow vagina. As well, Anti-Müllerian Hormones (AMH) affect the development of the Müllerian ducts (the upper vagina, cervix, uterus, and fallopian tubes). And so a mutational error in the processing of AMH can result in a person with testicles, a penis and a uterus.
THIS is why I'm talking about "expectations have nothing to do with statistics"... we have stupid built-in assumptions that we take for granted, and we make assumptions about the odds and everything that exist.
While you like to think this second thought experiment contains no variation, it's more accurately like sampling a population of Model Ts for what color they are. The vast majority of them are black, but some are not.

Try coming up with a population that has variability, so that taking a sample makes sense, and you will see that statistical significance matters.
Funny, because I did; you just lacked the biological understanding about what ACTUALLY causes sex, because you were drawn into the assumption that f(x) = x with a p = 0.0. In actuality, the p is not 0.0, and never will be, because the system is a fuckload more complicated than you thought it was, which... brace yourself for this...
IS EXACTLY WHAT THE ARTICLE WAS TALKING ABOUT!!!

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:Statistics is HARD by kramerd · 2010-01-10 17:35 · Score: 1

I didn't say a Y chromosome was guaranteed to produce a male (never mind that I explicitly stated that Y chromosome does not = male); I said it was statistically significant. If you haven't been paying attention, that just means that it's unlikely to have occurred by chance. Outliers exist, but that doesn't make them your expectation. For example, the average history major at UNC for class of 1983 starts out at over 200k per year, but not if you take Michael Jordan out of the sample. On the other hand, while a red car could end up actually being a waffle house, we don't assume that the average objective viewer is high on crystal meth and shrooms.
Biologically, sex is caused by hormones, not f(x). It's also not caused by a fuckload of a complicated system. Also, no one reads the article because it's almost always wrong, which is why we have a forum in the first place (so we can argue about why it's wrong).
By the way, most Model Ts are Georgia Tech Old Gold, not black (if they aren't, I don't care, most that still drive are). Try again if you want, but please actually read my response beforehand.
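To make the outlier point above concrete, here is a minimal sketch of how one extreme value drags the mean while barely moving the median. The salary figures are invented purely for illustration and are not real UNC data.

```python
from statistics import mean, median

# Hypothetical starting salaries (USD/year); every number here is made up.
typical = [28_000, 30_000, 31_000, 33_000, 35_000]
with_outlier = typical + [1_000_000]  # one superstar contract added

def summarize(salaries):
    """Return the mean and median of a list of salaries."""
    return {"mean": mean(salaries), "median": median(salaries)}

print("without outlier:", summarize(typical))
print("with outlier:   ", summarize(with_outlier))
```

The mean jumps by a factor of about six once the outlier is included, while the median moves by only a thousand dollars, which is why the median is usually the safer summary for skewed samples.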
Re:Statistics is HARD by snowgirl · 2010-01-11 14:41 · Score: 1

Ok, to quote you exactly:

Sex isn't male (neither are y-chromosomes), you are thinking of gender. Regardless, you would find a statistically significant relationship between males having a y-chromosome, because it's part of the definition of how you define gender. This would be like sampling a population of red cars to determine if they are red.
You are taking the assumption that "male" means "gender", and that the y-chromosome is "part of the definition of how you define gender."
No, GENDER is a social construct that people build regardless of any actual physical properties of the individual. That men and women divide themselves into two separate genders and we thus define one as "masculine" and the other as "feminine" is pretty much just a social construct. We desire to conform with people, and we adapt to the gender roles presented to us by our peers.
SEX however is biologically defined. One's CHROMOSOMAL SEX will always be male (not necessarily masculine) if they have a Y chromosome. One's gonadal sex will always match the fact of if they have testes or ovaries. One's genital sex will always match the physical appearance of their genitalia. However, one's hormonal profile will not always match what would be expected as a result of looking at any of the possible definitions for their sex.
Before we got on this stupid tangent because you're wrong about sexual development of humans, my point was that with a p = 0.0, you can prove STATISTICAL SIGNIFICANCE, regardless of whether you jury-rigged it in or not. That's one of the CRITICISMS of statistical significance: it doesn't really say jack shit to someone who doesn't understand statistics, yet SEEMS to say a lot to people who THINK they understand statistics.

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:Statistics is HARD by kramerd · 2010-01-11 17:45 · Score: 1

No, we don't define gender as feminine vs masculine, despite what you think you learned in your intro to feminist philosophy class in community college. Gender is a defined class of replacement nouns, pronouns, etc; like to replace "the man" with "he." Gender is the construct difference between male and female. In other words, male and female are definitions of gender. So you seem to have it backwards: my assumption was that gender means male, not the other way around. I correctly use the assumption that the Y-chromosome is part of the definition of how you define gender, because it is true. Granted, it doesn't fit when I refer to "my ship" as "she", but I would refer to "the person with the Y chromosome" as "him," even in a statistical sample of Y-chromosomes for gender. Then again, in the quote you have ignored twice now even though you reproduce it, I also explicitly state that y-chromosome does not necessarily = male [Sex isn't male (neither are y-chromosomes)].
You won't have a very good career in debating, because the first rule of debating is: don't quote people and then come up with improper assumptions about what those quotes mean, because if the person you are debating isn't retarded, you will embarrass yourself.
At any rate, we got on this tangent because you brought it up.
Here is the breakdown:
- You brought up the example
- I explained why your example didn't work
- You went off on a tangent
- I briefly pointed out why you were wrong
- You quoted me, reached an impossible conclusion regarding my assumptions, misinterpreted the result of those false assumptions regarding my assumptions, and then pointed out that you don't know what means, complaining that we somehow got off on a tangent
- I responded again, and now am waiting to see how you screw this one up
On a tangent here, if you randomly CAPITALIZE WORDS in the middle of a TYPED sentence, it doesn't strengthen or emphasize points of your argument, it just makes you look immature and unsure of your points. Certainly, no one is going to take anything you say more seriously for it, especially because that behavior generally means that you are wrong. Stop doing that if you want to be taken seriously.
Re:Statistics is HARD by snowgirl · 2010-01-11 21:18 · Score: 1

For the record, I have not taken any "feminist philosophy" classes, and have studied linguistics extensively, and I understand why it's entirely normal and natural for the German language to insist that the word for "girl" is neuter, despite having a feminine gender.
As well the word "Computer" has the masculine gender in German. So, when talking about my computer, I could say, "I took him apart, and then put everything back together again, and he still worked."
If you want to mix linguistic definitions with biological definitions and equivocate everything, then I have no interest in communicating with you, because it's like trying to have a conversation with a kitchen table. Even though you both are masculine in the German language, I still don't think either of you have anything meaningful to say.
For the record, there exist a statistically significant number of people in the world, whom you would call "she" even though they have a Y-chromosome... and they would be this way from birth, and if you called them "he", everyone would look at you like you were crazy. The Y-chromosome plays no necessary (but is often significant) role in which pronoun you're going to use for any particular individual.
Quoting your debate opponent, and seeking to explain how you understand what they're claiming is part of the proper method of debate. It is absolutely vital in a debate to consider that your opponent is using words in a different way than yourself, and seek reconciliation of terms before proceeding to have any debate worth any meaning.
Your worldview seems to contain too many false premises, and you seem unwilling to consider that your premises are incorrect, your definitions are flawed, or that someone could misinterpret your words, because you've made them unclear. Therefore, there is no point in arguing with you any further.
I say, Good day, sir.

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS

bloggers need to learn to write or ... by Lazy+Jones · 2010-01-09 11:54 · Score: 1

... something inside me wants to flame him for being a rude twat who wasted 1 minute of my lifetime, even though he has some valid points. I'd be surprised if he didn't get some responses along the lines of "cry me a river" etc.

--
"I love my job, but I hate talking to people like you" (Freddie Mercury)

Go ahead and try it by thetoadwarrior · 2010-01-09 11:55 · Score: 3, Insightful

I know enough about statistics to know that, statistically, I'm safe from his threats. I suspect if I were a bag of Cheetos the odds would be against me, but that's not the case.

It's not just statistics by im_thatoneguy · 2010-01-09 11:56 · Score: 2, Insightful

I've found that more than just about any other degree, Computer Science and to a lesser extent Medical Degrees imbue the recipient with an unnatural ego when it comes to subjects with which they are unfamiliar. I propose we remove the word Science from CS degrees and call it what it is: "Computer Programming and Troubleshooting". There are far too many CS graduates who think they are actually scientists.

Re:It's not just statistics by radarsat1 · 2010-01-09 12:06 · Score: 4, Insightful

I disagree that CS is just "programming and troubleshooting", but I do agree that Computer Science is a complete misnomer. It's extremely misleading, and difficult to explain to people, "I'm a computer scientist, but no I'm not actually a scientist, instead I understand how to describe formal languages in terms of strict grammar rules and transform abstract syntax trees from one representation to another."
It shouldn't be called Computer Science, it should be called Computational Mathematics, because that's what it is.
(On the other hand, there is a whole branch of CS that extends very deeply into statistics called Machine Learning, but at the core I'd say it is still more mathematics than science. There is also human-machine interaction, which often goes under CS but is actually more like psychology... so it's not so cut and dried.)
Re:It's not just statistics by Dahamma · 2010-01-09 12:18 · Score: 2, Insightful

Maybe wherever you went to school they taught "computer troubleshooting" as a degree, but some of us actually got a solid foundation in the various theoretical and practical foundations of computer software engineering.
Though I do agree that "Computer Science" is a stupid name. They already have Mechanical Engineering, Chemical Engineering, Electrical Engineering, etc - why not just call it "Software Engineering"? [I'd say "Computer Engineering", but since that was my major and I also had to do transistor physics and VLSI design, I guess it does need to be separate...]
Re:It's not just statistics by Dahamma · 2010-01-09 13:30 · Score: 1

Again, I don't know where you went to school, but physics, calculus, and diff eq (not chemistry, but I took 2 years anyway because I was also a bio major) were all required for our Computer Science and Computer Engineering programs.
Re:It's not just statistics by Dahamma · 2010-01-09 13:36 · Score: 1

So what? Same with Electrical Engineering. It's a profession and a major, and there is no law that you have to complete an Electrical Engineering major to get an EE job, as long as you know what you are doing.
For an EE major, you have to take physics, electronics, calculus, etc - and then take a bunch of classes on practical applications. When I went to school, you had to take similar prerequisites for a Computer Science or Computer Engineering degree. It's all naming and semantics at this point.
Re:It's not just statistics by RobinEggs · 2010-01-09 13:52 · Score: 1

I've found that more than just about any other degree, Computer Science and to a lesser extent Medical Degrees imbue the recipient with an unnatural ego when it comes to subjects with which they are unfamiliar.
+5 insightful

In the last couple years I've come to my own theory that economists, physicists, and computer "scientists" all suffer some delusion that they can solve all problems in all other fields with slightly esoteric applications of the standard methods from their discipline. See Freakonomics, etc.
Re:It's not just statistics by Dragonslicer · 2010-01-09 14:12 · Score: 1

I'm not the Anonymous Coward, but I'll toss in my experience. I went to the University of Maine, which is an above average engineering school (it ain't MIT, but it's one of the better state universities in the northeast). For my Computer Science degree, we were required to take three semesters of calculus (I got AP credits for the first two semesters), one semester of linear algebra, one semester of probability/statistics, two semesters of introductory physics (one mechanics, one electricity/magnetism), and two more physical science classes. I took introductory modern physics (basic relativity and quantum mechanics) and nuclear physics for extra science classes, and I took differential equations as an extra class because I needed it for the nuclear physics class and other high-level physics classes (I had been considering picking up a physics minor). I don't think differential equations is required for most Computer Science majors because it isn't used nearly as frequently as calculus or even linear algebra or statistics.
Re:It's not just statistics by Dahamma · 2010-01-09 16:04 · Score: 1

Ironic that in responding to an article bemoaning a misuse of statistics you would infer so much from a sample consisting only of yourself, isn't it? :-)
Not sure if you were referring to me or you in that comment, but to quote the article...
"I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.'"
Ie I'm not all that sure the article had anything to do with statistics in the first place...
Having attended a school from which a surprisingly large (statistically?) number of grad students dropped out to found the most successful companies in the Bay Area without finishing their degree, I'd have to say I am solidly in the camp that licensing, certification, degrees, etc mean exactly jack and shit compared to motivation and intelligence in the software industry.
Re:It's not just statistics by Rangataua · 2010-01-09 22:55 · Score: 1

I normally prefer the term computer engineer or software engineer, as I think it better acknowledges that most software is an engineering problem.
Re:It's not just statistics by xiong.chiamiov · 2010-01-09 23:32 · Score: 1

Though I do agree that "Computer Science" is a stupid name. They already have Mechanical Engineering, Chemical Engineering, Electrical Engineering, etc - why not just call it "Software Engineering"?
Some of us do.

--
have you read the Moderation Guidelines Addendum?
Re:It's not just statistics by adamchou · 2010-01-10 00:12 · Score: 1

If you've got issues with the semantics of a field of study, I say you pick on political science and social science first
Re:It's not just statistics by newcastlejon · 2010-01-10 02:02 · Score: 1

And economics. Economics is not a science, and never will be!

--
If God forks the Universe every time you roll a die, he'd better have a damned good memory.
Re:It's not just statistics by cetialphav · 2010-01-10 04:44 · Score: 1

And economics. Economics is not a science, and never will be!
Why not? People making economic decisions are (mostly) rational. They have reasons for spending money the way that they do. When you aggregate all of these decisions, you get an economy. Economics is just trying to understand how people make their decisions (microeconomics) and what the results of these decisions are at a larger scale (macroeconomics). This certainly is something that can be rigorously studied and analyzed. Whether you call it science or not depends on your precise definition of science, I guess.
The main problem with economics is in its misuse. People want economics to predict interest rates, revenue growth, etc, but that is impossible. The entire economic system is complex and self-modifying and the mathematical models are a not-very-good approximation of that.
Economics is terrible at predicting the future, but it is great at understanding the past. What other field can help us understand the economic collapse in the Great Depression? By studying these types of cycles, economists can identify some patterns that we can apply to policies to help reduce these kinds of cycles. And when those policies prove to be imperfect, economists can study why and offer further improvements, ad nauseam.
Re:It's not just statistics by smallfries · 2010-01-10 05:00 · Score: 1

If you pointed out that Theoretical Physics didn't involve much experimentation would you think that you had proven that Physics was not a science?
Why do you think that describing one part of a theoretical field within CS is sufficient to say that it is not a science? Strictly speaking what you have described is Program Transformation, and even within this area in CS there is lots of experimental work in compiler optimisation.

--
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
Re:It's not just statistics by ahabswhale · 2010-01-10 13:50 · Score: 1

"Economics is terrible at predicting the future, but it is great at understanding the past."

It's not even good at that. Different people have different opinions as to why the depression was as bad as it was. But as a rule of thumb you can never get people to agree on history, so no surprise.

--
Are agnostics skeptical of unicorns too?
Re:It's not just statistics by alexo · 2010-01-11 12:09 · Score: 1

It shouldn't be called Computer Science, it should be called Computational Mathematics, because that's what it is.
"Computer science is no more about computers than astronomy is about telescopes."
-- Edsger Dijkstra

Stats are only as good as the data by spiffmastercow · 2010-01-09 11:58 · Score: 1

I was tasked recently with developing stat reports that would be used to give the best workers the most important tasks. I used their desired metric, and modified the numbers to show on a 0-100 scale where 75 is average and each standard deviation is 10 points. The result? The sample sizes were too small, and some groups had widely varying scores when every group member's performance was nearly identical. Then again, maybe I'm doing something wrong.
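For anyone wondering how a tiny group can produce wildly different grades from nearly identical performance, here is a minimal sketch of the scaling described above (75 is average, 10 points per standard deviation, clamped to 0-100). The function name, the use of the population standard deviation, and the sample numbers are all assumptions made for illustration; the real report may differ.

```python
from statistics import mean, pstdev

def grades(raw_scores, center=75.0, points_per_sd=10.0):
    """Map raw metric values to 0-100 'grades' using z-scores within one group.

    Assumes the population standard deviation; that choice is a guess.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        # Everyone is identical, so everyone is exactly average.
        return [center for _ in raw_scores]
    result = []
    for x in raw_scores:
        z = (x - mu) / sigma  # distance from the group mean in std devs
        result.append(max(0.0, min(100.0, center + points_per_sd * z)))
    return result

# Hypothetical three-person group whose raw metrics differ by about 1%.
tiny_group = [10.0, 10.1, 10.2]
print([round(g, 1) for g in grades(tiny_group)])
```

Because the z-score divides by the group's own spread, a 1% difference in the raw metric turns into roughly a 12-point gap on either side of 75 here. The rescaling manufactures the variation; it is not in the data.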

Re:Stats are only as good as the data by Anonymous Coward · 2010-01-09 12:25 · Score: 1, Funny

Crash course in statistics... The result you got is not 'refined', you get the 'vital' variables like who has a mustache, ugly shirt... things that might be more likely to group people together. Ugly people hang out with ugly people and vice versa, you get the point.
Then you go around asking them to lend you some money, if they don't, their stats go down pretty quick, it's also a plus here if their memory is bad or maybe suffering from early onset alzheimers.
When that's well and done you use a differential algorithm with the other stats and this 'noise' you have gathered to get a nice graph.
And finally you put the values into excel, and make a nice pie chart which you copy paste into Power Point.
Present it to your superiors and tell them how much work you put into it. Also if there is a glitch in the presentation, like odd values or discrepancies, tell them the IT sector screwed up.
Re:Stats are only as good as the data by spiffmastercow · 2010-01-09 14:16 · Score: 1

The 75 average corresponds to the American school grade system, making the scoring accessible to users. The max score was 2.5 standard deviations higher, and the minimum grade was 7.5 std. devs lower. Also, it was a requirement that workers only be scored against workers in their area of expertise, because scores vary widely from one group to another. It worked great for the group of 150 workers, not so well for the group that only had 3 workers. None of those things should cause a problem, except the low sample size. I could explain further, and you could stick your foot in your mouth, but there's a high statistical probability that you're just an asshole who doesn't listen and simply makes assumptions.
Re:Stats are only as good as the data by rmm4pi8 · 2010-01-10 02:52 · Score: 1

Think of normal curves with standard deviations as something like the curve of adult human heights. So if you have some other kind of data which clusters very differently (say, incomes, where the high end of the curve goes on for a long time) or like your data, where you might have a team where everyone is very nearly at the median (what we call "narrow-tailed") and you just define the distribution as normal (which is what you're doing when you look at 'standard' deviations) then you're basically using statistics to remap small differences as if they're as large as the difference in human heights. And yes, for normal distributions you generally want sample sizes of 40 or so depending on the size of the effect (but in my experience that's reasonable for organizational study planning, where you obviously don't know the effect size beforehand). If what you're trying to do is just rank people in a statistically robust way, then you want something rank-based like Spearman's rho, which only assumes a rank ordering, not a normal distribution, and is much more robust to small sample sizes. Of course, this will not result in grade-like scoring, but I'm not sure what can be done about that. Hope that helps a bit.
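As a rough sketch of what a rank-only score could look like (this is one possible approach under my own assumptions, not a standard recipe), you can replace each raw value with its mid-rank percentile inside the group. The function name and sample data are hypothetical.

```python
def percentile_ranks(raw_scores):
    """Return each value's mid-rank percentile (0-100) within its group.

    Only the ordering of the values matters; their spacing is ignored.
    Ties share the same percentile.
    """
    n = len(raw_scores)
    ranks = []
    for x in raw_scores:
        below = sum(1 for y in raw_scores if y < x)
        equal = sum(1 for y in raw_scores if y == x)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks

# Hypothetical long-tailed data: the last value is a huge outlier.
skewed = [20, 22, 25, 30, 1000]
print(percentile_ranks(skewed))
```

The outlier lands at the top of the ordering but gets no extra credit for how far out it is, and a group where everyone ties ends up at 50 across the board instead of being stretched apart.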

--
U.S. War Crimes blog. Email for free Mandriva support.
Re:Stats are only as good as the data by spiffmastercow · 2010-01-10 05:38 · Score: 1

The actual "final score" is recorded in the DB as a floating point value indicating the person's distance from the mean in standard deviations (i.e. a value of 1 is one std dev above the mean, -2.5 is two and a half std devs below). I just display it as a grade so that our salesmen and managers can more easily comprehend it. I learned this little trick in class. Where I went to school, the math and science professors all seemed to like grading on a bell curve, so that the average student was actually a C student.
Re:Stats are only as good as the data by spiffmastercow · 2010-01-10 05:42 · Score: 1

Also, I guess I should finish by explaining that we decided to only use this system with groups that had a big enough sample size, so it all worked out.
Re:Stats are only as good as the data by rmm4pi8 · 2010-01-10 09:14 · Score: 1

It's not just sample size. If the distribution isn't approximately normal, it's still not going to work. Your professors were basically deciding that the distribution of grades *ought* to be normal, whether the distribution of learning was or not. You may be able to do something like that in a work environment too, obviously, but it's important to recognize that it's a normative assumption not driven by the data.

--
U.S. War Crimes blog. Email for free Mandriva support.
Re:Stats are only as good as the data by rmm4pi8 · 2010-01-10 09:18 · Score: 1

Right, but that's really nothing to do with it. The point is that you're treating a bunch of data that's probably ordinal and non-normal as if it's ratio and normal (http://en.wikipedia.org/wiki/Level_of_measurement). With the distribution of your data, you can either go with a high-effort approach like estimating an appropriate parametric distribution or subsampling, or you can just use some stock non-parametric tests. It's not just the "grades" -- the whole concept of "standard deviation" is undefined with regard to distributions that are substantially different than Gaussian. Basically you've got a divide-by-zero error in your statistics.

--
U.S. War Crimes blog. Email for free Mandriva support.
Re:Stats are only as good as the data by spiffmastercow · 2010-01-11 08:54 · Score: 1

Very true, and I'm aware of this.. I just don't know of a better way to do it. I use this method when I have a large sample size and a distribution that's more or less consistent with a bell curve. There is actually one group that had a standard deviation so small that I had to throw the data out because it was useless. I'm not overly worried about it, since the metric to determine the initial score is BS to begin with (useful BS, but still BS), so it's no surprise that the end result is BS. Nobody gets killed or fired (directly) over these scores, it's more of a system to determine who to watch more closely. If you know of a better way to statistically determine a "score" without a large sample size or a normative distribution, please let me know what it is.

Is Zed insane? by greg_barton · 2010-01-09 11:58 · Score: 1

Seriously.

Re:Is Zed insane? by perry64 · 2010-01-09 12:09 · Score: 1

Zed's dead, baby. Zed's dead. We're riding his chopper.

sounds impossible to please? by v1 · 2010-01-09 12:01 · Score: 3, Insightful

I've been studying it for years and years and still don't think I know anything.

And yet you're expecting someone whose expertise is in a different field to know more about it than you?

We can't all be experts in everything. If you're the expert in the field of discussion, get used to educating your coworkers on the topic, or find another job where you're surrounded by people with the same education and expertise as you.

The average person is an expert in no more than two or three related areas. That's why people work in teams, to cover each other's blind spots.

--
I work for the Department of Redundancy Department.

Re:sounds impossible to please? by Anonymous Coward · 2010-01-09 14:03 · Score: 1, Interesting

The issue is that the OP has a more realistic evaluation of his skills in the field. It's fine for a programmer to say "I'm not too good with statistics, could someone give me some advice?" It's not fine to say "I'm good at stats, I don't need your help" IF you're wrong. Going further, it's even worse if you overestimate your abilities and then ignore good advice, or offer bad advice as a result.
At a minimum, a lesson is that it's fine to ask for help. You impress people more with timely, working results than with a hacked together system because a bad understanding of the problem resulted in a lot of last minute changes when things didn't work.

Zed Shaw needs some serious meds by optikos · 2010-01-09 12:03 · Score: 1

He cannot even write a logical, rational thought supporting why programmers need to know more than a casual level of statistics. He just rants about blue sunsets and writes the f-word a lot.

Zed Shaw is a tosser. by toby · 2010-01-09 12:03 · Score: 2, Informative

Nothing new to see here.

--
you had me at #!

Re:Zed Shaw is a tosser. by mhelander · 2010-01-09 13:48 · Score: 2, Insightful

Plus that hallmark observation of wise, old men in any profession: Whenever you see a power of ten, chances are the number is completely made up.

Re:Bitch, while you were writing all that jive by Anonymous Coward · 2010-01-09 12:04 · Score: 1, Funny

That's ODBC, Junior. Details matter.

(And I'll bet you a thousand dollars that I earned more than you this month.)

Stats? Fuck that. by delysid-x · 2010-01-09 12:05 · Score: 2, Informative

Statistics is WAY beyond what a programmer cares about. Logic is all that matters. Statistics->logic is the problem of the software engineer, not the programmer.

Re:Stats? Fuck that. by Daniel+Dvorkin · 2010-01-09 13:07 · Score: 1

You just did an excellent job of proving TFA's point.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.

He makes some good points... by SanityInAnarchy · 2010-01-09 12:07 · Score: 5, Insightful

...unfortunately, they are mostly lost in the irony of statements like this:

I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.

I doubt I've seen anyone more thoroughly entrenched in a pissing contest than Zed Shaw, of the website formerly known as "Zed's So Fucking Awesome".

--
Don't thank God, thank a doctor!

Re:He makes some good points... by snowgirl · 2010-01-09 20:30 · Score: 1

...unfortunately, they are mostly lost in the irony of statements like this:

I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.
I doubt I've seen anyone more thoroughly entrenched in a pissing contest than Zed Shaw, of the website formerly known as "Zed's So Fucking Awesome".
Actually, the statement makes complete sense... Because he believes that women are less likely to engage in pissing contests, he finds that he wins all those contests, which placates his personality.
My older sister got mad at me because she couldn't shop for clothes the same way she did with her friends. I straight up blew off some of her ideas, because they weren't my style... but her friends just kind of fold, and listen to her. My sister has a better time shopping with her friends than with me... because I threaten her perceived superiority.
As a woman, I must say, there are some pissing contests that happen between women, but most of them take place entirely outside of the view of men (mostly because we're culturally expected to back down to men), and even then it's mostly two faced stuff. Sure there won't be a scene where we both whip it out and measure, but it's kind of likely that the woman who backs down first is plotting to be passively aggressive to the "winner".

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
Re:He makes some good points... by dreamchaser · 2010-01-09 21:03 · Score: 1

(mostly because we're culturally expected to back down to men)
I wish someone would tell my wife that!
Sorry. Couldn't resist. I really don't wish that. I prefer a partner to chattel.
Re:He makes some good points... by snowgirl · 2010-01-10 12:59 · Score: 1

(mostly because we're culturally expected to back down to men)
I wish someone would tell my wife that!
Sorry. Couldn't resist. I really don't wish that. I prefer a partner to chattel.
Well, we attract and marry the partner that best fits our expectations, so that you like a partner to chattel is a pretty good indicator of why she won't back down.
Also, I suppose it would be better to say, "predisposed to backing down." It's culturally ok, and has always been pretty ok, for a wife to conflict with her husband's opinions in private. Jewish law, for instance, provides a lot of interesting "progressive" rights to women, even while declaring them little more than property.

--
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS

Famous last comments by HTH+NE1 · 2010-01-09 12:11 · Score: 1

Zed Shaw writes an impassioned plea to programmers: Programmers Need To Learn Statistics Or I Will Kill Them All.

// This will never happen

--
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?

Re:Famous last comments by thegrassyknowl · 2010-01-09 12:22 · Score: 1

// This will never happen
if (Zed.killed(this) == true)
{
Universe.instance() / 0;
}

--
I drink to make other people interesting!

... or know when to defer to an expert by jamesh · 2010-01-09 12:11 · Score: 1

I certainly suffer from a feeling of being an expert in all fields. Deep down I guess I know I'm not, but I'd probably rather just muddle my way through it assuming I know everything there is to know. The trick is knowing when something is sufficiently out of your field that you need to defer to someone who is an expert in that field. Statistics is just one example. Certainly a little bit of knowledge in a lot of fields is a good thing, but when you have to choose between 4 years of study vs consulting someone who's already done 4 years of study, the choice should be obvious... (assuming you aren't going to spend the rest of your programming life doing heavily statistics related programming :)

For me the frustration is taking the word of an expert without understanding why and how they have arrived at that answer. I guess statistics is one field where the answer that 'feels right' is often not the answer that is right. The number of people who buy lottery tickets is a good example of that :)

stfu! by AlgorithMan · 2010-01-09 12:12 · Score: 1

I don't know how educated your colleagues are, but if they have studied computer science, then you should just shut your dumb mouth, because we learn how to analyze running times WITHOUT actually running it. Even without actually programming it, just by analyzing the problem itself. That is called "complexity theory" and (in that case) you are the one who doesn't have any clue about what you don't understand.

and go away with "tuning". You might improve running times a bit, but no little tuning hack can defeat the improvements you get by better algorithm design by an expert on algorithmics (I mean that e.g. some XOR AX AX might speed up your program by factor 2, but replacing simple backtracking with techniques to keep branching vectors small gets you exponential speed ups!)

--
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes

Re:stfu! by NoOneInParticular · 2010-01-10 19:37 · Score: 1

Yes, I'm pretty sure that when you have an application server talking to a database on a single key-lookup with index (log(N)), and you can only serve 100 concurrent users, that complexity theory will spit out the answer (you need to replace the lan+firewall between the DB and the server).
Re:stfu! by AlgorithMan · 2010-01-12 02:50 · Score: 1

sorry, I thought we were talking about programmers, not network admins...

--
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes

ah.... by KZigurs · 2010-01-09 12:13 · Score: 1

95% confidence in understanding statistics when applied to a business setting is often just as good as 95% confidence in actual measurements. Yes, the last 5% are the trickiest bit, but rest assured that if there is the slightest indication that a proper application is required, I won't be afraid to ask someone who knows more. It's just that it is quite rare.

For example: Performance testing systems. You care way more about the degradation mode than about a statistical model of sustainable load.

Logic and reason superior? by TranceThrust · 2010-01-09 12:13 · Score: 1

Those two things are what statistics is based on in the first place as well. Evidence etcetera comes second. If you can't blow logical counterarguments away you're probably wrong and you're indeed lacking in understanding.

Statistical analysis of the summary by mmmmbeer · 2010-01-09 12:13 · Score: 2, Interesting

Let's see, we have one guy complaining about how none of his programmer coworkers understand statistics, and we have X coworkers who undoubtedly disagree with him. Since we do not know him or any of his colleagues to any meaningful degree, we have to assign equal weight to each of their opinions. Statistics then tells us there is a 1/(X+1) chance of his being right, and an X/(X+1) chance of their being right. We can assume that X >= 2 based on his ranting, therefore resulting in the odds favoring them by at least 2/3, and probably much more. Therefore it is only rational to assume they are correct.

Re:Statistical analysis of the summary by Ian_Mi · 2010-01-09 13:45 · Score: 2, Funny

I think your statistics are flawed. To give equal weight to each person's opinion we should assume that each person has an independent probability, p, of being right. Then the probability of Zed being right and the others being wrong would be p (1-p)^10 while the probability of the others being right and Zack being wrong would be p^10 (1-p). Since these events are disjoint the probability of Zack being right given that one of these two events occurred would be p (1-p)^10 / (p (1-p)^10 + p^10 (1-p)) = (1-p)^9 / ((1-p)^9 + p^9) while the probability of the others being right would be p^9 / ((1-p)^9 + p^9). Thus if p is less than 1/2 then Zed is more likely to be correct.
Re:Statistical analysis of the summary by Ian_Mi · 2010-01-09 13:47 · Score: 1

sorry, Zed not Zack, should have read this more carefully...
Re:Statistical analysis of the summary by Ian_Mi · 2010-01-09 13:51 · Score: 1

To generalize this from 10 others to n others would give conditional probabilities of (1-p)^(n-1) /((1-p)^(n-1) + p^(n-1)) and p^(n-1) /((1-p)^(n-1) + p^(n-1)). So this result holds for any n greater than 1.
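A rough Python sketch of the formula above, for anyone who wants to plug in numbers. The function name and the sample values of p are made up for illustration; the model is the same assumption as in the post (each person independently right with probability p), which may or may not be a sensible model of Zed's office.

```python
def prob_lone_dissenter_right(p: float, n_others: int) -> float:
    """P(lone dissenter is right | the split is one against n_others).

    Assumes every person is independently right with probability p,
    with 0 < p < 1.
    """
    lone_right = p * (1 - p) ** n_others      # dissenter right, all others wrong
    others_right = (1 - p) * p ** n_others    # dissenter wrong, all others right
    return lone_right / (lone_right + others_right)


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        print(p, prob_lone_dissenter_right(p, 10))
```

With a single other person the two terms are equal and the answer is always 1/2, which is why the result only says something for n greater than 1.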
Re:Statistical analysis of the summary by brian_tanner · 2010-01-09 13:56 · Score: 5, Informative

Wow. What class did you take that says if you don't know something you should assume equal probability?

I don't know if there is an invisible elephant in my kitchen, so I guess I should assign equal probability to both outcomes. I also don't really know how Baccarat works, I guess my odds are 50/50.

Without knowing something about him or his coworkers, you by definition cannot make any statistical statements. To make any statements, you would first need to make some observations. This is how statistics is different from logic. Statistics is grounded in data.

I don't agree with Zed, but you may have just proved his point.
Re:Statistical analysis of the summary by aurelianito · 2010-01-09 14:09 · Score: 1

Please mod parent insightful.
Re:Statistical analysis of the summary by cjHopman · 2010-01-09 17:05 · Score: 1

You are basically saying that the two possibilities are that zed is right, or that all of the other 9 are right. But, we want the probability that zed is right vs the probability that at least one of the others are right, since they all disagree with him. On a side note, I find it amusing that Zed takes his experiences with a nonrepresentative sample of a group to determine that the group itself doesn't understand statistics. As he is a member of the group himself, do you think his rant's self-reference was intentional or is he as oblivious as he claims others are?
Re:Statistical analysis of the summary by Paradigma11 · 2010-01-09 23:13 · Score: 1

Maybe he took one in http://en.wikipedia.org/wiki/Bayesian_inference . I think the principle was introduced by Laplace.
Re:Statistical analysis of the summary by jstults · 2010-01-10 02:20 · Score: 1

To make any statements, you would first need to make some observations
Or you could be a Bayesian, make some assumptions, include a priori info in the analysis (which you should probably do anyway even if you have data); before you get up to check if there is an elephant in the kitchen assigning equal priors to the two hypotheses is a sound maximum entropy sort of method. You can then update your 50/50 state of knowledge after observing zero or many elephants in your kitchen.
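A minimal sketch of that update in Python, assuming a binary hypothesis and made-up likelihoods (the 0.01 and 0.99 below are illustrative guesses, not measured elephant-detection rates):

```python
def posterior(prior_h: float, like_h: float, like_not_h: float) -> float:
    """Bayes' rule for a binary hypothesis H after one observation.

    like_h     = P(observation | H)
    like_not_h = P(observation | not H)
    """
    evidence = prior_h * like_h + (1 - prior_h) * like_not_h
    return prior_h * like_h / evidence


# H = "there is an elephant in the kitchen"; start from the 50/50 prior.
belief = 0.5

# Assumed likelihoods of looking and seeing no elephant:
# 0.01 if one is there, 0.99 if none is there.
for _ in range(3):
    belief = posterior(belief, like_h=0.01, like_not_h=0.99)

print(belief)
```

The 50/50 prior only matters until data arrives; after a few looks at an empty kitchen the belief in the elephant is tiny.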
Re:Statistical analysis of the summary by chunkyq · 2010-01-10 03:10 · Score: 1

Bravo, sir. If I had mod points, they would be used to mod you insightful.
Re:Statistical analysis of the summary by brian_tanner · 2010-01-10 03:20 · Score: 1

Bayesian statistics are great because they let you specify your prior information about the quantities of interest. They don't give you prior information for free when you have none :)
Re:Statistical analysis of the summary by WaZiX · 2010-01-10 04:56 · Score: 1

Let's see, we have one guy complaining about how none of his programmer coworkers understand statistics, and we have X coworkers who undoubtedly disagree with him. Since we do not know him or any of his colleagues to any meaningful degree, we have to assign equal weight to each of their opinions. Statistics then tells us there is a 1/(X+1) chance of his being right, and an X/(X+1) chance of their being right. We can assume that X >= 2 based on his ranting, therefore resulting in the odds favoring them by at least 2/3, and probably much more. Therefore it is only rational to assume they are correct.
Euh no... The Variable (Zed being right or not) is not stochastic. The "probability" of him being right is either 1 or 0.
Re:Statistical analysis of the summary by fishexe · 2010-01-10 18:15 · Score: 1

To give equal weight to each person's opinion we should assume that each person has an independent probability, p, of being right.
How can you treat them as independent when they all believe the same thing (namely, that Zed is full of shit)? By definition, if co-worker X were right then co-worker Y would also be right, and so forth, so there is only one event with two possible outcomes, Zed is right and Zed is wrong. The equation you listed is appropriate where any combination of Zed being right and some or all of his co-workers being right is possible, but that's not the case. You can't treat it as a binomial distribution unless the "events" are actually independent and they are obviously all dependent (or equivalently, all just one event).

btw I have to wonder if Zed named himself after the villain from Mighty Morphin Power Rangers...just a thought...

--
"I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
Re:Statistical analysis of the summary by Ian_Mi · 2010-01-11 16:07 · Score: 1

I'm saying that each person was free to choose one side or the other independently from every other person. Here I am calculating the probability that they came to either conclusion (either n - 1 right and 1 wrong or vice versa) by chance. This is without regard to the knowledge we have about them agreeing. Then I calculate the conditional probabilities based on the event that they agree (i.e. the event that either n - 1 are right and 1 is wrong or vice versa).
Re:Statistical analysis of the summary by Ian_Mi · 2010-01-11 16:24 · Score: 1

In my analysis I assumed that they were disagreeing over a proposition so that all those who disagreed with Zed were agreeing.

Who is Zed Shaw? by Coward+Anonymous · 2010-01-09 12:15 · Score: 1

What has Zed Shaw done for humanity?

Can't happen is always fixed twice by RobertLTux · 2010-01-09 12:15 · Score: 1

you fix it once to handle when some Anti-Mensa card carrying twit actually makes it happen
then you fix it a second time to prevent it from happening

every time you get data from a user/outside process you should be able to handle values that make you go Eh WOT?? and then chuck those values out (and emit the correct error code)

--
Any person using FTFY or editing my postings agrees to a US$50.00 charge

Re:Can't happen is always fixed twice by LostCluster · 2010-01-09 12:34 · Score: 1

every time you get data from a user/outside process you should be able to handle values that make you go Eh WOT?? and then chuck those values out (and emit the correct error code)
Computers aren't very good at generating data, just analyzing it. You've got to get your data from somewhere.
So, when something unlikely comes up for a report... the question isn't just whether the number is accurate, but also why did it happen?
I was once working at a catalog outfit where there was a question as why some days there were massive return numbers, others where there was a zero, and usually it stayed within the acceptable range. I looked into it... there was one guy who specialized in returns. When he took a day off for any reason, nobody stepped up to take his place. So that's where the zero-return days came from. Following any time off, there was a backlog which he quickly processed, creating the big days. The stat was accurate... there were just some irregularities in the data.
Re:Can't happen is always fixed twice by RobertLTux · 2010-01-09 13:30 · Score: 1

yes but there should be a normal/average value and then there should be a reasonable maximum value
(in this case you should not be getting return numbers above say 200% of "normal" unless THAT GUY got run over by a bus (and then you have a reason for the numbers ))
if the normal input is say 30 values input in the 3000 range should be chucked (or if you somehow get "fred" input)
out of bounds inputs do not come "from mars" they have reasons (or are invalid)
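roughly what I mean, as a Python sketch. the "normal" of 30, the 200% ceiling and the error code names are all made-up placeholders, pick your own for real data:

```python
from typing import Optional, Tuple

NORMAL = 30.0      # hypothetical "normal" value for this input
MAX_FACTOR = 2.0   # reject anything above 200% of normal


def check_input(raw: str) -> Tuple[Optional[float], str]:
    """Return (value, "OK") for a sane input, or (None, error_code)."""
    try:
        value = float(raw)
    except ValueError:
        return None, "ERR_NOT_A_NUMBER"      # e.g. "fred"
    # Written as a negated range test so that NaN is rejected too.
    if not (0 <= value <= NORMAL * MAX_FACTOR):
        return None, "ERR_OUT_OF_RANGE"      # e.g. 3000 when normal is 30
    return value, "OK"


if __name__ == "__main__":
    for raw in ("30", "3000", "fred"):
        print(raw, check_input(raw))
```

the rejected value still gets reported with a code instead of being silently dropped, so somebody can go find the reason for it.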

--
Any person using FTFY or editing my postings agrees to a US$50.00 charge
Re:Can't happen is always fixed twice by Evil+Shabazz · 2010-01-09 13:41 · Score: 1

And statistics would have told you nothing about what was really important in your example. :)

--
Down with the career politician! SUPPORT TERM LIMITS
Re:Can't happen is always fixed twice by LostCluster · 2010-01-09 14:35 · Score: 1

When we found out there was only one return guy... he also informed us that he was about to take a one week vacation. So, we were about to get five days of zeros, followed by a double-week that wouldn't have been evenly distributed, plus a customer service impact of people waiting an extra week to get their money back.
I had already become the guy he'd ask when a tech product came back, so we found a team to try to do his job (not well) in a reasonable amount of time while doing our normal jobs too.
Re:Can't happen is always fixed twice by quanticle · 2010-01-09 17:00 · Score: 1

Ah, but what is the correct error handling strategy? Do you ignore the value and silently carry on? Do you return a "reasonable" value? Do you stop execution and wait for the user to re-enter data? Choose wrongly and you'll at best reduce the users' productivity, and at worst end up killing someone.
In fact, one of the reasons for the Three Mile Island disaster was a temperature sensor that would only report temperatures below the "rated maximum", since higher temperatures were "obviously erroneous". The operators then did not realize that the core temperature had risen beyond the rated maximum because the gauge was programmed to ignore readings that were higher than expected.

--
We all know what to do, but we don't know how to get re-elected once we have done it
Re:Can't happen is always fixed twice by tepples · 2010-01-10 01:03 · Score: 1

And statistics would have told you nothing about what was really important in your example.
I disagree. Correlate the dates of troughs against the dates of crests and see that each crest happens the day after a trough. Sure, correlation by itself does not imply causation, but it does raise a red flag that causations are worth investigating.
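A sketch of that check in Python, using made-up daily return counts (not the real catalog data) and a hand-rolled Pearson correlation so nothing beyond the standard library is assumed:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative numbers only: zeros are days the returns guy was out.
returns = [30, 28, 0, 61, 31, 29, 0, 58, 30, 27, 0, 64, 33]

# Pair "was today a trough?" with "how many returns tomorrow?".
trough_today = [1 if r == 0 else 0 for r in returns[:-1]]
count_tomorrow = returns[1:]

r = pearson(trough_today, count_tomorrow)
print(r)
```

A strongly positive r here is only the red flag; finding the one guy who handles returns is still legwork.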
Re:Can't happen is always fixed twice by Evil+Shabazz · 2010-01-10 03:47 · Score: 1

I was speaking of the mathematical kind of statistics, not the kind of obvious logical connections someone without any statistics training at all could make.

--
Down with the career politician! SUPPORT TERM LIMITS

Re:There is a spell check in the comment box... by Hognoxious · 2010-01-09 12:18 · Score: 1

Meh, that's what compilers are for.

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."

Re:Show them you're the Boss by KZigurs · 2010-01-09 12:18 · Score: 1

three door problem? What about poker! ;)

Maybe you suck? by Gothmolly · 2010-01-09 12:18 · Score: 1

You know, studying stuff in college for years doesn't make you smart. Maybe these are clever, practical people, and you're just not a good communicator?

--
I want to delete my account but Slashdot doesn't allow it.

Not just programmers by famebait · 2010-01-09 12:19 · Score: 1

Everyone needs to learn statistics. All of us who understand one iota of it are in a constant state of depression over how everyone keeps on making the most banal mistakes. But just a general gripe is not very helpful. Getting everyone to take advanced degrees in statistics is simply not going to happen. Most engineering courses include some basics, but that only helps a bit. What is needed is to teach it (to the "masses", i.e. the ones who really ought to know better) in terms of the pitfalls first, and then what to understand about the workarounds. Those who have no interest in pursuing it further might still gain some insight about where to be careful, and those with potential might more easily see the point in investing in some real knowledge.

--
sudo ergo sum

Translation by Opportunist · 2010-01-09 12:19 · Score: 1

I studied it for years, so my e-peen is bigger. It worked in school, so it has to work in reality and thus they are wrong when they tell me it does not, despite them having experience with real applications while I have not.

Ok, snideness aside. Statistics is a wonderful tool (hey, my degree is in statistics actually), but I wouldn't want to impose my metrics on real applications without first looking whether they measure anything sensible. I turned to programming because, well, it's more suitable to me. But when I look at the metrics some of my superiors designed, cringing is all I can do.

Example: A metric that measures how much code you produce. Which is in theory nice. Who creates more code has done more work. Right? From a statistician's point of view, yes. But any programmer will tell you that it's trivial to write lots of lines or few, and they will do the same work. Most programming languages support that just fine. Does the statistician know? Probably not, unless he is a programmer too.

Example: A metric that measures the amount of code you alter. Which is in theory nice. You check out, change and check in code, and who checks out and checks in more (and does alteration in between) does more work than others. Right? No. For reference, see the Wikipedia game.

The reason why programmers scoff at metrics is that we've all seen our share of really, really crappy metrics that led to less instead of more productivity because everyone started gaming the system. Had to do that, because if you actually did sensible work, you fell behind in the metric against those that gamed (i.e. those that didn't produce in the first place).

--
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.

Re:Logic and Reason *ARE* superior to evidence and by AnotherUsername · 2010-01-09 12:21 · Score: 2, Insightful

I prefer logic and reason mixed with evidence and observation.

If you just have logic and reason, then you get religion. Logically, it worked out when it was created. There is no evidence to counter it, so it must be true. Religion was created with logical reasoning. Some may say it was incorrect reasoning, but it was reasoning nonetheless.

On the other hand, if you just have observable evidence, with no logical reasoning, you can have all the data in the world, but you will have nothing to use it with. True, you can see it, but you cannot understand why it is the way it is.

Having all of one or the other is useless.

--
I don't like Linux. This doesn't make me a troll.

Re:Reply from a programmer that knows no statistic by not-quite-rite · 2010-01-09 12:24 · Score: 1

Best. Troll. Ever.

You know nothing about statistics, yet want to tell us how it is a phony science?

You couldn't have taken a few minutes on wolfram, or even wikipedia to even TRY to know a little of what you are talking about?

Yes, I do think you are a lunatic.

In other news... by MoeDrippins · 2010-01-09 12:25 · Score: 1

....Zed wants everyone to be just like him.

--
Before you design for reuse, make sure to design it for use.

*Somebody* on the team might need to know stats by digitig · 2010-01-09 12:28 · Score: 1

Unless they're actually programming statistical applications, most programmers probably don't need to know statistics. As long as somebody on the testing team does, all the programmer needs to understand is that function X sometimes fails to meet its timing spec (perhaps "often fails..." or "occasionally fails..." might add some value) or whatever. Then they know they need to do some optimisation. There's a natural human tendency to think that everybody should be doing what we're doing. In reality, they don't have to, because we're doing that; they need to be doing something else.

--
Quidnam Latine loqui modo coepi?

Maybe as important as any math for programmers by mpsmps · 2010-01-09 12:28 · Score: 1

http://slashdot.org/comments.pl?sid=1499856&cid=30673056

Superior? by VinceVulpes · 2010-01-09 12:29 · Score: 1

"I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation." Both are superior to statistics.

Re:Superior? by Anonymous Coward · 2010-01-09 12:42 · Score: 1, Insightful

Both, put together, are statistics.

lies, damned lies... by yalap · 2010-01-09 12:31 · Score: 3, Funny

Lies, damned lies and statistics. Us programmers are too busy dealing with the first two to ever reach the third..

Re:Reply from a programmer that knows no statistic by digitig · 2010-01-09 12:32 · Score: 1

Bridges that fail, fail predictably. It is usually just a question of collecting some data.

Good luck demonstrating that an aircraft instrument landing system is fit for purpose, then. Semiconductors might fail predictably when they're being observed under an electron microscope, but it's a bit harder in a hut by the side of an airfield.

--
Quidnam Latine loqui modo coepi?

Burn in flames by Rivalz · 2010-01-09 12:33 · Score: 1

" I have taken a bunch of math classes, studied statistics in grad school, learned the R language, and read tons of books on the subject. Despite all of this I'm not at all confident in my understanding of such a vast topic." I'm presented with 1 of 2 scenarios. Either he is smart and I should not bother studying statistics because it is vast and complicated and should only do research on a as needed basis. Or He is stupid. And I should just ignore the guy completely.

Re:Reply from a programmer that knows no statistic by doublegauss · 2010-01-09 12:35 · Score: 2, Informative

You probably still think I am a lunatic, but hear me out.

You don't qualify as a lunatic; just as someone who has no idea of what he's talking about. Absolutely no idea. Your post, my friend, is so full of ideas you obviously misunderstood that I won't even attempt to make a list.

And yes, I do statistics for a living.

Getting panties in a knot over nothing by foolish_to_be_here · 2010-01-09 12:38 · Score: 1

ditto!

--
Please mod me 1 or troll. It's where the truth is these days, even on Slashdot. Beware the power of moderators everywh

Summarized for people who don't want to read Zed by SanityInAnarchy · 2010-01-09 12:41 · Score: 4, Insightful

So, since so many people don't seem to want to actually read Zed's stuff -- and I honestly don't blame you -- I'll try to summarize:

Eventually, every major science adopted an empiricist view of the world. Except Computer Science of course.

He tends to bitch a lot about computer scientists. I'm just starting a CS degree, and there is a Statistics class in the curriculum. Is he working with people with good degrees, people from a technical college with a "programming" degree, people from a diploma mill, or high school students with no degree at all?

Of course, he seems to be implying it's everyone, and doing so in a typically Zed-like way.

"All you need to do is run that test [insert power-of-ten] times and then do an average." Usually the power-of-ten is 1000...

I don't know that I've ever heard that particular statement. But it's a good point:

How do you know that 1000 is the correct number of iterations to improve the power of the experiment?

Generally because it was probably closer to a million, so I'm erring on the side of taking more, rather than fewer, measurements. But without careful consideration, I could be way off.

How are you performing the samplings?

I think this is vastly less important than how you are dealing with the data, but it is also a good point. For example, his complaint is that an average isn't enough; with detailed enough logging, he could easily go back into my data and figure out min, max, standard deviation, histograms...

How do you know that 1000 is enough to get the process into a steady state after the ramp-up period?

Not a huge deal -- the "steady state" will almost certainly be faster than the "ramp-up" period. Worst case, I'm over-optimizing.

What will you do if the 1000 tests takes 10 hours?

Either ctrl+c, or try it 10 times.

How does 1000 sequential requests help you determine the performance under load?

Very good point here. It's still a useful statistic, but you still need to measure things like 1000 simultaneous requests, not just 1000 all in sequence.

On the other hand, if your performance is acceptable with them all in sequence, you could just run it through something like Event Machine, so it's all sequential on production, too.

The most troubling problem with these single number “averages” is that there’s two common averages and that without some form of range or variance error they are useless. If you take a look at the previous graphs you can see visually why this is a problem. Two averages can be the same, but hide massive differences in behavior...

So yes, always make sure you can record enough statistics so that someone else can come along and use your data to give you something meaningful.
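A quick Python sketch of "same average, very different behavior". The two lists of response times (in ms) are invented for illustration; only the standard library `statistics` module is used.

```python
import statistics

# Two made-up sets of response times in milliseconds with the same mean.
steady = [100, 101, 99, 100, 100, 102, 98, 100]
spiky = [40, 45, 42, 41, 43, 44, 40, 505]

for name, samples in (("steady", steady), ("spiky", spiky)):
    print(
        name,
        "mean:", statistics.mean(samples),
        "median:", statistics.median(samples),
        "stdev:", statistics.stdev(samples),
        "min:", min(samples),
        "max:", max(samples),
    )
```

Both means come out at 100, but the median and standard deviation make it obvious that the second service is usually fast and occasionally terrible. Keeping the raw samples around is what lets someone compute these after the fact.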

The moral of the story is that if you give an average without standard deviations then you’re totally missing the entire point of even trying to measure something. A major goal of measurement is to develop a succinct and accurate picture of what’s going on...

It doesn't have to be statistically accurate. It just has to be close enough.

Ah, confounding. The most difficult thing to explain to a programmer, yet the most elementary part of all scientific experimentation. It’s pretty simple: If you want to measure something, then don’t measure other shit.

This is both a very good and a very bad idea. It ties into the peeve he had before -- ramp-up time. For example:

If we want to take one single line of code and test it then we can. If we want to only verify one single query on a database then what’s stopping us?

What's stopping us is that our applications don't actually work like that.

--
Don't thank God, thank a doctor!

Re:Reply from a programmer that knows no statistic by viking80 · 2010-01-09 12:41 · Score: 1

Best. Troll. Ever.
Yes, I do think you are a lunatic.

Thanks, I am honored.
I actually have degrees in mathematics, and I have a sister with a ph.d. in statistics. We have had this discussion most Yules we get together, and it is fun to get some /.ers into it too...

--
don't cut it off www.mgmbill.org

They probably were told... by highways · 2010-01-09 12:43 · Score: 1

"... since they were probably told in college that logic and reason are superior to evidence and observation.'"

Oh, so they were taught Bayesian rather than Frequentist statistics?

Very good (from someone who's taken BOTH)... apk by Anonymous Coward · 2010-01-09 12:44 · Score: 1, Interesting

"Statisticians need to learn programming or I will kill them all." - by halivar (535827) on Saturday January 09, @06:43PM (#30710618) Homepage

Well put, Halvar! Now, I'll add to it, as I have backgrounds in both areas he "bitches here" about.

First of all:

I'm in possession of degrees from both the business world (where I took STAT 1 & STAT 2 & "aced" both w/ A grades no less) & also Comp. Sci. & CIS concentration/minor (where you get exposure to a good deal of "higher mathematics" such as Calculus, & Discrete Math to name only a couple possibles)...

LOL! Man... I "just loved" (not) his "logic & reasoning is inferior to evidence & observation"...

(Especially since I know 1 VERY important thing: That stat teaches you 1 extremely IMPORTANT concept: It's ALL BASED ON SAMPLE SETS...)

As to "sample sets"? Well, those are USUALLY either:

----

1.) EASILY SKEWED (as in "4/5 dentists chew trident", oh "sure, sure", especially when they're on the corporate payroll (or paid off to say so by said corporation so their "evidence & observation looks good")

and

2.) IS THE SAMPLE SET LARGE & COMPREHENSIVE ENOUGH? (most?? Most are not, period)...

----

Simply because you cannot:

A.) Sample EVERYONE

B.) Nor can you judge the veracity & accuracy of who you are sampling!

----

E.G. #1 - Let's say I had a poll question of "Are Democrats better than Republicans?" & I sampled from a PRIMARILY REPUBLICAN AREA - So, that all "said & aside"??

What kind of answers do you think I'd get???

Would THAT be a "good/fair & representative sample set"????

Answer = Hell no!

Math people sometimes make me laugh... especially when they *THINK* they "know it all".

Life's a BALANCE people, & there are very few "absolutes", because people are not "binary". Human beings have a LOT of "shades of grey" (or, is it "gray"?? Inquiring minds, want to know, lol!)

APK

P.S.=> Personally - I feel that life's REAL answers & REAL problems, in my estimation & opinion, aren't going to even be answered by "hard sciences" alone...

I actually tend to think that the REAL ANSWERS (for the REAL problems) will come from philosophers really!

(E.G. #2 - The serious questions to answer, like "why is man unjust to man" for example).

Yes, THAT coming from me may sound weird, especially coming from someone with fairly extensive classical education in the business sciences & computer sciences here in myself, but I do hold to that (and, all the math that comes with them like STATS, CALC, DISCRETE MATH, etc. et al, from the 'hard sciences'? They're JUST TOOLS that others should definitely use, but not "base all" on them, either, because they too can be misused, as in the examples above I note from stats itself))... apk

Re:My advice: take a statistics class as an underg by HornWumpus · 2010-01-09 12:46 · Score: 1

Stats before calculus are just memorize and regurgitate.

Take stats as an undergrad but after you finish calculus so you have grounding to understand.

Not just puke formulas back out onto an exam paper.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'

Re:Logic and Reason *ARE* superior to evidence and by line-bundle · 2010-01-09 12:46 · Score: 2, Funny

No, Logic and Reason are superior to Cubase.

It's a music joke, laugh.

Knowledge isn't the problem by NitWit005 · 2010-01-09 12:48 · Score: 2, Informative

From his complaints, I can tell knowledge isn't the real issue. Testing performance takes a huge amount of time. You need to simulate other programs running, multiple users and make sure the test matches what real users might do. Generally, this requires writing completely independent test programs and charting the logging from them. People just don't want to go to that kind of effort. It can take weeks just to create proper tests for complex programs like web servers.

oh please. by timmarhy · 2010-01-09 12:52 · Score: 1

this guy's an idiot. he admits to not knowing the subject matter well but still wants to chastise programmers for not being experts?!! that's his first epic fail, his 2nd is that programmers aren't meant to be experts in every area, only at programming. people that have double degrees and years of experience in a field are the only ones who should be, and they will be in lead roles. his 3rd fail is how he makes his argument, it reminds me of a child throwing itself on its back and kicking its legs till it gets its own way.

--
If you mod me down, I will become more powerful than you can imagine....

Re:oh please. by countach · 2010-01-09 13:01 · Score: 1

Programmers aren't meant to be experts at everything, but statistics is probably one of the more important things a Programmer should learn. I mean they probably teach a bunch of stuff at uni like how to sort a list in (n) time, that nobody will ever use, but knowing how to apply statistics, whether it be to financial problems or computer science problems, is one of the main tools a well rounded programmer should have. My knowledge of statistics is pretty basic, but it's enough to wow my bosses with useful analysis of our data.

Wrong tool? by meburke · 2010-01-09 12:56 · Score: 1

The use of statistics is a means to an end that never ends. It has its uses in specific situations, and programmers trying to reach these ends in those specific situations would be well-off to know statistics? OK, I agree. If you are programming a data-mining application, then knowledge of probability and statistics seems pretty important. If you are programming a plane to land automatically on a runway, or a robot to place a chip on a board, then I want precision, not probability. (Although precision is probabilistic in itself.)

What Zed is describing is a situation where statistics could greatly improve the performance of the whole system, and he looks to be right. And that may be the real problem: He's more committed to being right than to resolving the problem.

I would say this is more a "people problem" than a programming problem. Placing blame, telling people they are ignorant, hostile language and the like are not leadership qualities.

There is another aspect here that interests me; the type of programming methodology. If this type of project were approached as a monolithic project, the scope, means and tools would be apparent before the project got to the argument stage. In an "agile" environment, the lack of pre-defined methodology would show up as part of the tweaking/improvement process. Picking the right method might be very important to alleviating the problem of the project with the "long tail" (i.e., the project that seems almost finished but there are a million little things to finish to make it deliverable).

--
"The mind works quicker than you think!"

Re:Wrong tool? by Daniel+Dvorkin · 2010-01-09 13:12 · Score: 1

If you are programming a plane to land automatically on a runway, or a robot to place a chip on a board, then I want precision, not probability. (Although precision is probabilistic in itself.)
Your parenthetical statement is kind of the whole point here. There are a whole lot of problems which look deterministic, but aren't. And if you assume that the problem is deterministic, when the result you're actually dealing with is a spread of probability outcomes, you will be dramatically wrong. In the examples you gave, the uncertainty in the second case may be small enough that treating the problem as deterministic is reasonable; but in the first case, if you don't account for uncertainty in your inputs, the result will be a runway covered with corpses.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:Wrong tool? by meburke · 2010-01-09 13:39 · Score: 1

Yeah, what you said. I used an off-the-cuff example that didn't clearly distinguish the differences in necessary tools. This actually parallels my experience with off-the-cuff programming; it seldom includes clear enough standards or specs. I should simply have left it at: Projects that require statistics should have someone involved who understands statistics, and other projects don't. The original article by Zed clearly shows cases where a real knowledge of statistics is valuable, but the claim that implied all programmers need to know statistics fails. The gist of his article seems to be: "There's a tool for that!"

--
"The mind works quicker than you think!"

maybe you are the problem by sams67 · 2010-01-09 12:58 · Score: 1

Given your exposé of the facts on Slashdot, and the way you describe your colleagues and your own understanding of stats, I would say there is a 90% chance you are wrong and they are right. Or maybe 95%.

Thanks for the tip bro! by c4t3y3 · 2010-01-09 13:01 · Score: 1

I've been doing J2EE apps for 10 years and now that we are sending a rocket to Mars on our next project, I'm so sorry I didn't spend my whole life learning statistics.

Um . . . by SlappyBastard · 2010-01-09 13:02 · Score: 1

" I've been studying it for years and years and still don't think I know anything."

Exactly, dumbass.

The first rule of programmers: whatever is the most expeditious path to the most usable solution is the one a programmer will take. The great skill a programmer has is the ability to assimilate and apply new information in as short a span of time as possible. If it takes years in order to not use and apply something, you can forget about a programmer ever bothering.

--
I scream. You scream. I assume that means we're both acquainted with the problem. We proceed.

Re:Bitch, while you were writing all that jive by cheftw · 2010-01-09 13:02 · Score: 2, Funny

I can vouch for this. You might think AC just spends all his time on /. but the reality is that he's a real big-shot who can afford to make ridiculous claims.

--
Always back up, never back down. ---- Think you're cool 'cos your uid is prime? Take mine, modulo the one digit integers

definitely a conspiracy by StickANeedleInMyEye · 2010-01-09 13:06 · Score: 1

This is a vast right wing conspiracy backed by Fox news "Fair and Balanced".

90% of the programming game by presidenteloco · 2010-01-09 13:07 · Score: 2, Funny

is one half mental.

of course that explains why 90% of all programs written are CRUD.

-with apologies to Yogi Berra, Theodore Sturgeon, and a 20% apology, as a matter of principle, to a guy called Pareto.

--

Where are we going and why are we in a handbasket?

Stick to it by vikstar · 2010-01-09 13:10 · Score: 1

Despite all of this I'm not at all confident in my understanding of such a vast topic.

Some people are a little slow, but stick to it, you'll get there eventually.

--
The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.

This reads like the 'edu sysadmin' article by CAIMLAS · 2010-01-09 13:12 · Score: 1

This reads a bit like the thread on the college sysadmins running the shop. Think: along the lines of over-education and not enough experience coloring one's view of the situation. See also: when you've got a hammer, everything looks like a nail.

I'd say odds are that, with someone (anyone) who's highly educated in a specific field, they tend to try to apply that discipline to everything in their lives. The welder who has metal tables and chairs, the woodworker with an oak-everything house, and the mechanic with a V8 lawn mower/snow blower are all good examples of this. Managers who think something is a "morale problem" (and not a management one) or programmers/geeks who see a social problem as one that can be fixed with computing are also examples of this.

This doesn't necessarily mean these specialized-discipline people are necessarily wrong, but it does mean they're contentious and self-righteous assholes. Statistics might help. A wireless computer in your fridge might help. So might a V8 lawn mower (that'd be fucking cool!). But chances are such things are impractical, expensive, and/or coming from an over-extension of assumption.

And sometimes, a gut feeling is as good as (or better than) a well-reasoned and thoroughly informed opinion.

Life's a crap shoot. Sometimes you can't reduce everything to numbers.

--
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers

It's the Zed Effect by greg_barton · 2010-01-09 13:13 · Score: 3, Interesting

The Zed Effect: Whether you're right or wrong people will disagree with you just to piss you off.

Re:Very good (from someone who's taken BOTH)... ap by Slotty · 2010-01-09 13:15 · Score: 1

Life's a BALANCE people, & there are very few "absolutes", because people are not "binary". Human beings have a LOT of "shades of grey" (or, is it "gray"?? Inquiring minds, want to know, lol!)

The answer to this important question is grey. I read it in a book so it has to be true

Everyone should learn statistics by jackchance · 2010-01-09 13:17 · Score: 4, Informative

Before computers, stats involved using parametric tests (t-tests, ANOVA, etc.) which made assumptions like "the data comes from an underlying normal distribution". BTW, in stats terms "normal" means "Gaussian".

Now, with cheap and fast computers, we can actually compute the confidence intervals non-parametrically through permutation tests and bootstrapping without assuming anything about underlying distributions. In most cases, this non-parametric test is the "right thing to do". Most of the time, the results are the same as using a parametric test.
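As a rough illustration of the bootstrapping idea above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the Python standard library. The function name `bootstrap_mean_ci`, the exponential test data, and the choice of 2000 resamples are all assumptions made for the example, not anything from the original post.

```python
import random
import statistics

def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Skewed, clearly non-normal data, generated here purely for illustration.
_rng = random.Random(42)
sample = [_rng.expovariate(1.0) for _ in range(200)]

low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

No distributional assumption enters the calculation; the cost is the 2000 passes over the data, which is the trade-off raised in the reply below.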

However, a HUGE disaster in empirical science has been the problem of multiple comparisons. With computers it is so easy to compute correlations and significance tests between every possible slice of your data set. Many "scientists" don't have good statistical knowledge and pray at the altar of "p < 0.05". They don't know about or understand the problem of multiple comparisons. So they do 20 tests, find one that comes out p < 0.05 and write a paper about it. They don't get that if you do 20 tests you are very very very likely to find one that comes out p < 0.05.

Anyone who has access to excel or matlab can do this little experiment.

samp = randn(50,1);   % 50 normally distributed random numbers (mean=0, var=1)

for x = 1:100
test = randn(50,1);   % a fresh 50-point sample from the same distribution
sig(x) = ttest(samp, test);   % 1 if the paired t-test rejects at p < 0.05
end

now look at the sig vector. OMG, 5% of the tests came out significant!!!

Now you are writing a paper all about how x is linked to y. But you are essentially throwing dice and then writing a paper about why it came up '3-3'.
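For anyone without MATLAB, the same point can be sketched in Python with just the standard library. This is not the experiment above verbatim: it swaps the t-test for a two-sample z-test with known variance (so no t distribution is needed), and it draws both samples fresh on every iteration. The function names are made up for the example.

```python
import random
from statistics import NormalDist, fmean

def family_wise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive in n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

def z_test_p_value(a, b, sigma=1.0):
    """Two-sided two-sample z-test, assuming both samples have known std dev sigma."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_tests=2000, n=50, alpha=0.05, seed=1):
    """Fraction of 'significant' results when both samples come from N(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_p_value(a, b) < alpha:
            hits += 1
    return hits / n_tests

print(f"P(at least one p < 0.05 in 20 tests) = {family_wise_error(20):.3f}")
print(f"simulated false positive rate = {false_positive_rate():.3f}")
```

With 20 independent tests at the 0.05 level, the chance of at least one spurious hit is 1 - 0.95^20, or about 64%, even though nothing real is going on.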

--
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765

Re:Everyone should learn statistics by Daniel+Dvorkin · 2010-01-09 13:57 · Score: 5, Interesting

Resampling-based statistics haven't replaced parametric models, and I doubt they ever will, for one very simple reason: as the available processing power grows, so does the amount of data. In my field, bioinformatics, the size and complexity of the data sets follows a Moore's Law of its own, and I don't think bioinformatics is unique in this. "Just bootstrap it" is easy to say, and certainly there have been many times when dealing with an analytically intractable distribution when I've done just that, but if the analytical solution takes minutes and the bootstrap solution takes weeks, you have to take this into account.
Of course, resampling isn't the only way to look at problems non-parametrically. Often a good compromise is to go with rank-based statistics, which are fast and easy to calculate -- and you may not have an analytically tractable model for the distribution of the original data, but you don't have to, since by working with ranks you can define a distribution with good analytical properties. You still need to do some reality-checking exploratory data analysis, of course, but this is an approach that generally works well in practice.
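To make the rank-based idea concrete, here is a minimal sketch of one such statistic, the Mann-Whitney U test (chosen for this example; the post does not name a specific test). It uses the large-sample normal approximation with no tie or continuity correction, which is an assumption made to keep the code short, so treat the p-value as approximate for small or heavily tied samples.

```python
from statistics import NormalDist

def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5  # no tie correction
    z = (u1 - mu) / sigma
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))

u, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(f"U = {u}, approximate p = {p:.4f}")
```

The whole computation is one sort plus a sum, which is why it scales to data sets where resampling would not.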

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:Everyone should learn statistics by jstults · 2010-01-10 02:09 · Score: 1

"the data comes from an underlying normal distribution"
You mean we often assume the residuals are normal; the data could be any distribution at all, that's why we fit models. There's plenty of parametric stuff you can do with different distributions on the residuals too; Google "R glm Poisson", "R glm binomial", or "R glm family".
You might already know all this, but ever since that black swan book came out there's a bunch of statistical-illiterates running around saying, "the whole world's not normal", without understanding that everybody who understands the world and statistics understands that already too.
Re:Everyone should learn statistics by jackchance · 2010-01-10 08:09 · Score: 1

"the data comes from an underlying normal distribution"
You mean we often assume the residuals are normal; the data could be any distribution at all, that's why we fit models. There's plenty of parametric stuff you can do with different distributions on the residuals too; Google "R glm Poisson", "R glm binomial", or "R glm family".
Yes, I am aware there are other distributions. All I meant was that for parametric statistics there is an assumption about an underlying distribution either of the residuals or the distribution itself. I was just giving an example of an assumption. I could have said "the underlying distribution is [Poisson, binomial, exponential, Weibull, Gamma, Cauchy...]."

--
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
Re:Everyone should learn statistics by anaesthetica · 2010-01-13 06:48 · Score: 1

Can you recommend any good books on non-parametric statistics for people who are not stats or math majors?

--
The Rise and Fall of Online Community

Wiki by Anonymous Coward · 2010-01-09 13:18 · Score: 1, Interesting

I'll be honest, I didn't know who Zed Shaw was, so I fired up google. His wikipedia entry reads thus:

Zed A. Shaw is a troll[1][2][3][4], writer, software developer, and musician, most commonly known for creating the Mongrel web server for Ruby web applications, as well as his controversial opinion pieces on technology, business, and technical communities. He is frequently referred to simply as 'Zed'.

Hm...

Statisticians are like designers... by stimpleton · 2010-01-09 13:20 · Score: 1

Statisticians are like designers....they should stick to designing(or statistics as it were).

I.e., do what they are good at. At my work we hand off these parts as modules. Designers push back a form design. The statistician pushes back some algorithms written in a high level language. I really do treat them like library calls.

--

In post Patriot Act America, the library books scan you.

Re:Summarized for people who don't want to read Ze by vadim_t · 2010-01-09 13:20 · Score: 1

Not a huge deal -- the "steady state" will almost certainly be faster than the "ramp-up" period. Worst case, I'm over-optimizing.

No, not necessarily. Running "time cp file.dat copy", with a file.dat of 88MB takes 0.083 seconds on my computer.

Would your conclusion then be that my computer has a disk capable of copying files at 1060MB/s? (It should be even faster when it ramps up!)

That would be complete nonsense of course. What happens is that the entire file goes into the write cache, cp returns almost immediately, and the kernel writes the 88MB over several seconds in the background.

If I copied a DVD image instead, it'd take a much longer time, and the "size/time" would be much closer to reality, because the file wouldn't fit in the cache.

So here you have an example where the steady state is much slower than the rampup, and where measuring too little would lead you to believe there's no performance issue at all, even if the disk is dog slow.
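One way to see the effect described above is to time the same write twice, once letting the OS buffer it and once forcing it to storage with `os.fsync`. This is only a sketch: the helper name `timed_write` and the 8 MB payload are arbitrary choices for the example, and the gap between the two numbers depends heavily on the filesystem (on a RAM-backed temp directory there may be no gap at all).

```python
import os
import tempfile
import time

def timed_write(path, data, sync):
    """Return the seconds taken to write `data` to `path`."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()
            # Block until the OS reports the data has reached storage.
            os.fsync(f.fileno())
    return time.perf_counter() - start

data = os.urandom(8 * 1024 * 1024)  # 8 MB of incompressible bytes
with tempfile.TemporaryDirectory() as d:
    cached = timed_write(os.path.join(d, "cached.dat"), data, sync=False)
    synced = timed_write(os.path.join(d, "synced.dat"), data, sync=True)

mb = len(data) / 1e6
print(f"buffered: {mb / cached:.0f} MB/s apparent, fsync'd: {mb / synced:.0f} MB/s")
```

The buffered figure mostly measures how fast memory can absorb the data, not how fast the disk can take it.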

Re:Reply from a programmer that knows no statistic by Improv · 2010-01-09 13:22 · Score: 2, Interesting

In practice, statistics is an attempt to quantify messy, uncertain events into a figure. We can even measure the extent to which this works, roughly speaking. Your hard drive has a rough time-to-failure, based on analyses of the things that tend to go wrong in that system. Sure, any time it fails, it's not statistics that broke it; it's one of the kinds of problems captured in the statistical analysis. And sure, you could break it down further for disks and note that the controller has a different failure rate than some other component, just as a bridge has a number of possible failures. Problem is, for any of those, you could break it down further and get failure rates for subcomponents, regions, etc. So what? It's still useful to have statistical measures - the real world is complex, and statistics helps us capture things we otherwise couldn't.

Programmers (particularly but not only young programmers) might not like to acknowledge any field but their own has any depth ("Everything is simple! Just do it my way", hence Ron Paul/Ayn Rand fanboyism and all sorts of other stupidities) - I don't know if there's a lot we can do but hope they grow out of it (It took me awhile to do it, as did a number of people I knew when I was younger, but I made it out).

Basically, if your worldview doesn't wed empiricism and a reasonably flexible practical philosophy, your worldview is (if you err on the pro-logic end) too inflexible and you're going to miss out on standing on the shoulders of giants. Neither the logician nor the mystic understands the world.

--
For every problem, there is at least one solution that is simple, neat, and wrong.

Re:There is a spell check in the comment box... by fast+turtle · 2010-01-09 13:29 · Score: 3, Funny

What Spell Check? I didn't know I was writing a Spell. Is it a good or evil spell?

Damn it's evil. Now I've got to listen to da da da de - da da de bop all the time.

--
Mod me up/Mod me down: I wont frown as I've no crown

Re:Bitch, while you were writing all that jive by Anonymous Coward · 2010-01-09 13:34 · Score: 1, Insightful

The reality is that a programmer who screws up the ODBC acronym probably makes less than the everyday Joe. So the challenge, offered by this 30-something everyday Joe, still stands.

Re:Very good (from someone who's taken BOTH)... ap by LSD-OBS · 2010-01-09 13:34 · Score: 3, Insightful

Yup. Also, for a guy who claims to know so much about statistics and measurement, it's weird how he judges programmers so sweepingly on the sole basis of his anecdotal experiences.

--
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson

Well, this finally answers my question. by superslacker87 · 2010-01-09 13:34 · Score: 1

I'm majoring in MIS at a university where Statistics is a required core course for every major, including the computer programmers. All along, I didn't get why I have to take it. I am now, and hopefully will get through it. I'd like my degree.

--
I run Ubuntu skinned to look like a Mac on a PC. Go figure.

Statistics may or may not = math by Binder · 2010-01-09 13:38 · Score: 1

Statistics in its purest sense is simply math. Very few people know very much about this.

Statistics in the wild is generally bullshit! You should not be able to give two equally qualified people the same data set and receive two different answers!

As for statistics for performance measurement? If you are doing something important then analyze worst case performance. Statistics doesn't come into play in this case.

bullshit by unity100 · 2010-01-09 13:38 · Score: 1

logic and reason are the enemy of religion. the whole age of enlightenment and the demise of religion and the advent of scientific age has been moving on those two. and they have never stopped moving on their momentum up till now.

--
Read radical news here

Re:bullshit by rynoski · 2010-01-09 14:26 · Score: 1

You missed the point completely.
Logic and Reason *alone* did nothing.
Scientists using Logic and Reason backed up with Evidence and Observation!

When the scientists started observing the planets, they provided evidence that the sun is the centre of our solar system.
It was reasonable, even logical that the Earth was the centre of the universe. Evidence and Observation pushed our thinking forward.

--
There are two types of people in the world: 1) those that can extrapolate from incomplete data.
Re:bullshit by unity100 · 2010-01-12 05:03 · Score: 1

you need to read history of science and enlightenment.
the whole science started from the rational thought process. scientific method is just a derivative of rationalism.

--
Read radical news here

Understanding statistics is hard... by mario_grgic · 2010-01-09 13:43 · Score: 1

Speaking as someone with postgraduate degree in pure math, I'll be the first to admit that the subject is very hard to really understand well. Statistics is founded on probability theory, which in turn is based on measure theory, which is based on generalized integral theory and mathematical analysis. It takes 4 - 6 years of continuous hard study to cover this material and really know it all. And only people who devote their professional life to it can do that.

At most one could hope that one develops a sense for high level statistics, but that also takes several years of exposure to concrete examples, since intuition often fails miserably when it comes to even discrete probability theory.

Statistics is really useful as a scientific/theoretic method of reasoning, but convincing business people or even practicing scientists with it is futile in my opinion.

--
As the island of our knowledge grows, so does the shore of our ignorance.

Re:Understanding statistics is hard... by oldhack · 2010-01-09 15:04 · Score: 1

I've only done one semester of one of them "stats for engineering and science", and even that little made me realize how hairy stats are, and that was decades ago. Just try to define precisely what "probability" means.
The sorriest thing is the social scientists trying to make their case using stats they have little understanding of, applying to their subjects which are way, way more complex than those of physical science. Even worse is the medical science.

--
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Re:Understanding statistics is hard... by upuv · 2010-01-09 19:33 · Score: 1

You're correct in your statement "convincing business people or even practicing scientists with it is futile". It is if you try and show them everything.
You sorta have to pull out metrics and measures that have a practical analogy in the "business" at hand. Something so well defined that it is hard to misinterpret and abuse. This is the PowerPoint slide you show them. You can have the hairy scary slide if you wish. But no one will look at it. As a matter of fact the meeting is over once you do show the statistical meat slide. It's like an off switch in people's heads. They see that and they will respond to nothing other than phone calls and the meeting end time buzzer.
K.I.S.S. == Keep It Simple Stupid. Must always be applied to any stats presentation or explanation to ANYONE ELSE.

Zed Shaw sounds like a douche. by Evil+Shabazz · 2010-01-09 13:53 · Score: 2, Informative

So I read through his article. Yes, the whole mindless rant. The conclusion that one should REALLY draw from it is: Zed Shaw is a douche with Asperger's who clearly feels like his own personal area of expertise is underappreciated. Hey Zed, get over it.

--
Down with the career politician! SUPPORT TERM LIMITS

Re:Zed Shaw sounds like a douche. by shallot · 2010-01-10 03:27 · Score: 1

The first part of the article is an annoying rant, but once you get past the first dozen paragraphs it's fairly factual and grounded in reasoning rather than emotion. Whereas your comment is equally judgmental yet has too little factual content to be rated +4, informative (which is the rating I see on it now).

Wikipedia on Zed Shaw by Selfbain · 2010-01-09 13:55 · Score: 2, Informative

I like how the first part of his Wikipedia article says "Zed A. Shaw is a troll" with four citations.

--
Well, it has never been successfully tested.

Re:Probability = shit in programming. by mauddib~ · 2010-01-09 14:01 · Score: 1

Your point does not hit the mark at all. It only takes a simple expected value measurement to bring statistics back into play. If every customer brings in 1 dollar but the jerk who hacks into your application will cost you 1 million dollars instead, then a 1 out of 10 billion chance is a chance to take. Statistics is a very valuable asset to any developer/systems architect.
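The expected value argument above can be written out as a few lines of Python. This is one reading of the numbers in the comment ($1 of revenue per customer, a $1 million loss with probability 1 in 10 billion per customer); the function names are made up for the example.

```python
def expected_value_per_customer(revenue, loss, p_loss):
    """Expected gain per customer: earn `revenue`, or lose `loss` with probability p_loss."""
    return (1 - p_loss) * revenue - p_loss * loss

def break_even_probability(revenue, loss):
    """Loss probability at which the expected value per customer drops to zero."""
    return revenue / (revenue + loss)

ev = expected_value_per_customer(revenue=1.0, loss=1_000_000.0, p_loss=1e-10)
p_star = break_even_probability(1.0, 1_000_000.0)

print(f"expected value per customer: ${ev:.6f}")
print(f"break-even loss probability: {p_star:.3e}")
```

Under these assumptions the expected value stays within a hundredth of a cent of the full dollar, and the bet only turns negative once the per-customer chance of the loss rises to roughly one in a million.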

And before you come up with an even more contrived example, I suggest you take a quick glance at fields such as decision theory, game theory or other utilitarian techniques to reassess your obvious lack of understanding of the subject.

--
This is a replacement signature.

Acknowledged Difficulty is a Good Sign by weston · 2010-01-09 14:03 · Score: 2, Interesting

not understanding a topic that even you are unwilling to acknowledge mastery of.

Personally, I think that little acknowledgment increases his credibility quite a bit. It suggests to me that he's actually spent some real time coming to grips not just with the glossy overview you get in a high school or college course but with some of the devilish subtleties of actually using the stuff.

The funny thing about knowledge... the more it grows, the bigger you realize the frontier is. So, how good of a heuristic is apparent confidence?

--
Tweet, tweet.

Re:Acknowledged Difficulty is a Good Sign by Hurricane78 · 2010-01-10 03:51 · Score: 2, Funny

No, it doesn’t. If someone spends years and years on a topic, and still has the feeling he understands nothing at all, then clearly, he’s just too dumb for it.
It’s like high voltage without high current. The result is a not very bright and maybe even destroyed lamp.

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.

Re:Reply from a programmer that knows no statistic by Surt · 2010-01-09 14:03 · Score: 1

And yes, I do statistics for a living.

Do you work with the statistics porn guy?
http://developers.slashdot.org/comments.pl?sid=1504756&cid=30710812

--
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking

Obligatory Stats Joke by frank249 · 2010-01-09 14:05 · Score: 4, Funny

"I construct two sets of n=100 random samples from the normal distribution. Now, if I just take the average (mean or median) of these two sets they seem almost the same."

So it's true. The n's justifies the means.

--

Today's vices may be tomorrow's virtues.

Re:Obligatory Stats Joke by fishexe · 2010-01-10 18:33 · Score: 1

The n's justifies the means.
Way to rip off Dogbert there, pal.
http://www.dilbert.com/strips/comic/1989-09-08/

--
"I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009

yea by unity100 · 2010-01-09 14:08 · Score: 2, Insightful

please tell me whether you would like to rely on decision theory, game theory or utilitarian techniques to handle life chances of your children or their sensitive private/critical information in a database.

--
Read radical news here

Re:yea by Anonymous Coward · 2010-01-09 21:26 · Score: 1, Informative

Unfortunately, your posts have been modded down, even though it is a valid discussion point.
Obviously, you want this discussion to handle PoV on political views and the accompanying philosophies. In my profession I'm a pragmatist: one should view vexing problems from different perspectives (categorical, logical, libertarian) and choose 'wisely' (meaning, that which will eventually reach the common good, even though it means a personal gain at the start). Also, my methods of measuring the merit of a solution are not the test for morality, although they could be used for this.
Since the original discussion was dealing with logic versus statistics, I would like to stress that even very basic math operators, such as big-oh (O) are a fundament to statistical methods as well, and in modelling often have an identical measure. Thus, when using computational theory, you could imply you are using statistical theory as well.

90% effort + 90% effort == done by Zero__Kelvin · 2010-01-09 14:10 · Score: 1

Well I can tell you that when I tell my boss that the project is 90% complete and I just have to finish the other 90% he, and every other SE I have said this to, knows exactly what I mean. This guy actually thinks that at times the sunset is a brilliant blue. He clearly doesn't get that how he perceives things is not the same as them actually being the way he perceives them, and so he freaks when smarter people than him don't care what he has to say. Luckily I learned from the available data I have that 100% of people named Zed Shaw want to kill me, so at least I have that going for me now ;-)

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Troll by jibjibjib · 2010-01-09 14:11 · Score: 1

"I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation."

Maybe Slashdot should have editors, so crap like this doesn't end up on the front page.

Re:Bitch, while you were writing all that jive by Rogerborg · 2010-01-09 14:15 · Score: 1

You forget to take into account that I'm way drunk on all the money that I made from writing the... uh... whatever the cunting fuckjizzle you kids are calling database apps these days.

--
If you were blocking sigs, you wouldn't have to read this.

Re:Summarized for people who don't want to read Ze by SanityInAnarchy · 2010-01-09 14:24 · Score: 1

Would your conclusion then be that my computer has a disk capable of copying files at 1060MB/s?

No, because you're not measuring disk at that point. That's confounding.

But it's a good point -- I suppose "ramp up" is a kind of confounding, anyway. I was just considering it mostly in terms like VM warm-up.

--
Don't thank God, thank a doctor!

Re:Uhm, I'm a software engineer (so you know)... a by LSD-OBS · 2010-01-09 14:36 · Score: 1

Perhaps your reading skills are not so good. I was agreeing with *you*, and pointing out the lack of validity of the OP's generalisations.

Better luck being trolled next time.

--
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson

Re:Uhm, I'm a software engineer (so you know)... a by LSD-OBS · 2010-01-09 14:45 · Score: 1

Or perhaps you aren't the AC I was replying to, but rather Zed? In which case:

You can happily go and suck a fuck for the breathtaking amount of swollen, tumorous ego and self-importance you're throwing about here. You do know that an "appeal to authority" is rather a logical fallacy, no? And you do realise that, even if the above list of positions and titles were valid in this argument, they are still anecdotal evidence, right? Your above diatribe contains nothing more compelling than a reactionary ad hominem attack of no argumental worth.

--
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson

Re:CLICK PARENT IN YOUR POST, see who you replied by LSD-OBS · 2010-01-09 14:57 · Score: 1

Seriously. Seek help. You do know the meaning of the word "Yup", right?

Seriously. SEEK HELP. You have some serious people and communication issues.

--
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson

I usually read rtfm or google it by garyisabusyguy · 2010-01-09 15:05 · Score: 1

i.e. Chart1.DataManipulator.Statistics.InverseFDistribution(.05, 3, 4)

See, that was easy!

But seriously, I have supported a fair amount of statistical analysis in life sciences. Most programmers deal with processes that run against each one of a series of things. IMHO statistics is more like report queries where you perform groupings based on features to find favorable conditions or data falling outside of expected norms.

Could I use a solid statistician to keep me from making errors? Sure. Do I need an overbearing 'keeper of the keys' telling me I'm wrong without offering any real help? Hell no

--
Wherever You Go, There You Are

Re:Very good (from someone who's taken BOTH)... ap by JWSmythe · 2010-01-09 15:13 · Score: 5, Informative

1.) EASILY SKEWED (as in "4/5 dentists chew trident", oh "sure, sure", especially when they're on the corporate payroll (or paid off to say so by said corporation so their "evidence & observation looks good")
and
2.) IS THE SAMPLE SET LARGE & COMPREHENSIVE ENOUGH? (most?? Most are not, period)...

You know, that particular citation has made me wonder in the past, but not enough to actually research it. So, I went off looking for more information and found it.

The statistic was generated from a July 1976 survey.

The sample group for this statistic was 1,200 dentists. These dentists were hand picked by the research company, probably with good reason.

They were asked, what advice would they give gum-chewing patients

1) sugared gum
2) sugarless gum
3) no gum at all.

Sugarless gum got 85% of the vote. Not terribly surprising. I'd be fairly confident that their time had been paid for, or at very least they were told "This survey is being done for Trident Sugarless Gum." That is only speculation, so hush up.

17/20 doesn't really sound very good. It just doesn't stick in your head. 4/5 is close enough, even though it reduces your answer to 80% (ahhh, a lie). Since these are marketing folks, I'm sure they pushed all kinds of values past focus groups, until "4 in 5" was accepted as most favorable.

As the link cites, they're fairly confident that the "sugared gum" answer got at least one response. There's always someone that'll take the obvious wrong answer. If you don't believe that, look at any Slashdot poll. :)

What they don't say is how many of the 1,200 samples were dropped. I'm sure there were non-responses, and they could have easily added any number of unfavorable answers in as non-responses. Of course, they couldn't have 100% in their favor, so they had to keep some.

--
Serious? Seriousness is well above my pay grade.

Isn't this old news? by DrShoe · 2010-01-09 15:14 · Score: 1

This looked familiar, then I remembered that I read this years ago.
http://haduken.com/board/viewtopic.php?t=934&sid=ccd988ac3fa9146e94124c1228c4ac35

Dude, maybe the problem is... by Hurricane78 · 2010-01-09 15:22 · Score: 1

..that you’re just too dumb.

Know nothing after years and years? So what’s the point then?
Sorry... I can think of several millions of more efficient, more useful and more fun things to do with my life.

I hear you, about people acting like they are experts, but actually knowing shit. Like someone having read a book about HTML, who now thinks he’s a cool programmer. Or someone who clicks together a default database front-end type application, and acts as if he could compete with someone who designs hard math algorithms in Haskell or writes an OS in C/Assembler.

But I think you put way more importance on statistics, than is needed for programming. Because it’s your lovechild (nothing wrong with that). We programmers need to be good programmers. There’s only so much time in a day, to keep up-to-date with all the crazy stuff going on in CS. There are few non-science jobs where you have to keep up so much. There’s simply no place for also becoming an expert in hardware design, graphics design, usability, physics, all the areas of mathematics, including statistics, etc, etc, etc.

If I need good statistics, I’ll hire you. As soon as you know that you know them. Because there is nothing more valuable, than someone who is in love with his work. Happy? :)

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.

Maybe you're preaching instead of suggesting? by zullnero · 2010-01-09 15:29 · Score: 1

I'll propose a measurement technique and they'll scoff at it. I try to show them how to properly graph a run chart and they're indignant. I question their metrics and they try to back it up with lame attempts at statistical reasoning.

Look, programmers tend towards the egotistical at best most of the time. They like to argue, even about marginally different concepts. I've watched guys argue about things like for loops and while loops and ifs and switches so many times in my career that I can only try and block as much of that inanity out. When you approach developers by TELLING them how to do something using statistical analysis, you've got to first convince their supervisor/manager/etc. of the value of it and why it's better. THEN you approach them and tell them that's how you're doing it. Otherwise, you better believe they'll argue about that...everyone has their own way of doing things, and you can bet they don't care for someone else telling them that the way they've done things in the past is all wrong. The only way to make programmers learn is to do something first, have it become successful, and be able to demonstrate the value in doing things that way first. I've been on very, very few teams with developers who were constantly open to different ways of doing things. Very few colleges even bother to put emphasis on statistics...some will even let you dodge the course entirely and take an equivalent. CS and software engineering professors generally fall in line and focus on logic. Obviously, it's a comfort level thing, and you can't get through to people unless you can demonstrably prove your approach.

Re:Statisticians need to learn Art, or i will kill by jjohnson · 2010-01-09 15:34 · Score: 1

If you'd RTFAed, you'd have realized that Shaw isn't talking about quantifying programmers at all. Seriously, not one bit. Your whole... I don't know what the fuck it is... misses the point. And Shaw's point as well, which kind of just proves his.

--
Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.

I have talked with plenty of by McNihil · 2010-01-09 15:42 · Score: 1

so called statisticians too that have no idea what they are doing... They barely know how to define a proper sigma field so that they can use statistics on their sample set correctly.

Very few people really grasp it... maybe as bad as one per major stats bureau.

So it's not just programmers.

Not saying here that I know all of it but it sure is simple to poke holes in a lot of stuff.

In other news, Zed Shaw has been arrested... by dasqua · 2010-01-09 15:50 · Score: 1

... Isn't threatening to kill someone a crime in itself?

--
tihs isg mead fmro rcecydle tpyos

Statistics is more than probability... by dylannika · 2010-01-09 15:51 · Score: 1

From what I've read, most of the responders here seem to have a poor grasp of what the field of statistics encompasses. Statistics is not just probability (in the form of flip a coin, choose a door, and poker hands), but can also be used to effectively design an experiment, and reduce the variation in a production line among other things. Personally, I find statistics to be a rewarding field of study and that it is easily applicable in the real world. Just don't tell that to my classmates who stare at me as if I have sprouted extra appendages when I tell them I am not graduating with them because I'm extending my engineering degree with an option in statistics...

Programmers don't know by generalSocial · 2010-01-09 15:52 · Score: 1

Programmers don't know statistics. Programmers don't know quantum mechanics.... Programmers don't know aerodynamics....

1/million = lots by bobbuck · 2010-01-09 15:54 · Score: 1

I wish I had mod points to give you...

That's exactly it by Rix · 2010-01-09 15:58 · Score: 1

They're not rejecting statistics as a field, they're rejecting his claimed expertise in it.

He's not claiming they are wrong - they are unset by SuperKendall · 2010-01-09 15:59 · Score: 3, Interesting

He's just as arrogantly claiming that he's right and they're wrong.

No he doesn't.

He claims that programmers need to understand statistics more. The people he is talking about are therefore not wrong - they are ignorant.

But that term is loaded with negative meaning, it's more accurate to say they are like a variable named "statistics" with a value that has never been set. Basically, they don't know what they are missing.

It's like when programmers try to argue about how a language is bad when they've never used it. How would they know? Yet many without understanding of statistics are saying the same thing, they don't need to know any more.

I know enough to know statistics can be a valuable tool. Why would you not want another tool that could help you? The people who refuse to do so are less than they could be (as a programmer).

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley

This guy is missing out. by Metasquares · 2010-01-09 15:59 · Score: 1

Machine learning is the logical place to take a combined knowledge of programming and statistics. It's a much rarer skill and commands a much higher salary, plus you're doing the closest thing we currently have to predicting the future for a living - and you generally still get to code plenty.

In other words, statistical knowledge can be a significant career advantage in addition to enhancing development and debugging.

Re:For a guy who tries to use LOGIC? Adhonimem aga by Fwipp · 2010-01-09 16:02 · Score: 1

AC: Nah, this guy didn't screw up. He (LSD-OBS) replied to you (AC) because he (LSD-OBS) was agreeing with you (AC). That's why he said 'Yup.' Because he (LSD-OBS) was agreeing with you (AC). When replying to a post, many people (present company included) use the word 'you' to refer to the person they are replying to. Having exhausted the second person, third-person pronouns (such as he, she, or it) are used to refer to third parties. In this case, LSD-OBS' use of the word 'he' indicated that he found the author's (Zed Shaw's) sweeping generalizations strange. Honestly, I am a little concerned that you are getting so worked up over this.

Re:Per Logic: Adhominem isn't valid either (lmao) by Fwipp · 2010-01-09 16:03 · Score: 1

... somehow, I expect that my previous post will fall on deaf ears.

WTF? by gbutler69 · 2010-01-09 16:18 · Score: 1

I think you just proved this guy's point! Holy Shit!

--
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.

Re:WTF? by obarthelemy · 2010-01-09 18:44 · Score: 1

You can try it at home: flip 2 coins, and when you find one that is heads, look if the other one is also heads. Tally. Repeat a bunch of times for the results to be significant.
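The experiment above can also be sketched in Python instead of with real coins. This is only an illustration under one reading of the instructions: keep every trial where at least one of the two coins is heads, then check whether both are. The function name, trial count, and seed are made up for the example.

```python
import random

def both_heads_given_one_head(trials=100_000, seed=1):
    """Estimate P(both heads | at least one head) by simulation."""
    rng = random.Random(seed)
    kept = 0   # trials where at least one coin came up heads
    both = 0   # of those, trials where both coins are heads
    for _ in range(trials):
        a = rng.random() < 0.5   # first coin is heads
        b = rng.random() < 0.5   # second coin is heads
        if a or b:
            kept += 1
            if a and b:
                both += 1
    return both / kept

print(both_heads_given_one_head())
```

Under this reading the estimate should settle near 1/3 rather than 1/2. Picking one specific coin first and then checking the other would give about 1/2, so the exact wording of the question matters.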

--
The Cloud - because you don't care if your apps and data are up in the air.
Re:WTF? by fbjon · 2010-01-10 04:17 · Score: 1

Nope, he's right. Make sure you read the original question as written.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:WTF? by Arthur+Grumbine · 2010-01-10 05:27 · Score: 1

I think you just proved this guy's point! Holy Shit!
If irony were strawberries, we'd all be having smoothies right now.

--
Now that I think about it, I'm pretty sure everything I just said is completely wrong.

Re:Summarized for people who don't want to read Ze by SanityInAnarchy · 2010-01-09 16:45 · Score: 1

You would be amazed how FEW samples you need with good sampling to get a good estimate,

Well, actually, I'm counting on that when I just use a "power of ten".

Sampling and results is a classic garbage in garbage out scenario. If you don't sample right your results are at best meaningless at worst they give you a completely wrong impression.

That's why it's important to record as much information as possible from each sample -- at the very least, we'd know whether it's garbage. For example:

If you wanted to know the average income of a household in the US you wouldn't just sample from people in Silicon valley just before the bust, if you did that it wouldn't matter what kind of tricks you did to your data your results would be bad.

Well, no, one obvious trick is to say, "Hey, all of this is from people in Silicon Valley just before the bust." The next obvious trick is to then combine those samples with the same people after the bust, and with other people elsewhere -- then you not only correct the error, but you get a sense of the difference between Silicon Valley and elsewhere.

My point here is that it's a hack for a programmer like me, who doesn't understand statistics (much), to make it easier to work with someone who does.

I believe the general principle here is called "data porn".

--
Don't thank God, thank a doctor!

Why is this being posted as news? by coreb · 2010-01-09 17:02 · Score: 1

I read this post a couple of years ago. Why is it just now making Slashdot? According to the wayback machine, this essay must have been written in May of 2006.

... Anti-Mensa ... by MitchAmes · 2010-01-09 17:42 · Score: 1

... some Anti-Mensa card carrying twit ...

The word you are looking for is Densan.

Is it dumbed down enough for management? by upuv · 2010-01-09 17:49 · Score: 3, Interesting

I hear you, I do performance engineering of web based systems. The developers, the managers, the testers, the architects all have no clue. You are correct here.

However if you can not present your "theory" of how to do something in a dumbed down enough format then who cares. Because the pretty graph is pointless. It will be mis-interpreted, mis-understood, and mis-used.

All the stats theory on the planet will not get you past the dumb manager or developer. Don't lose sleep over this. There is no point. Simply find metrics in your analysis procedure that do mean something to these people. They may not be the total picture but they are something. Build a reputation for being correct by starting with simple things. You are always going to butt heads with a know-it-all developer / architect / manager. Fine, let them go off and waste money and time. They will be found out as morons in time. You do your thing and simply become the guy to ask about performance and how to do this.

Being understated and consistently showing above average results for your work is how you will rise up. Being an A-hole about it is not going to help anyone. As a matter of fact I would can your butt for being a D#ck.

other way around by icepick72 · 2010-01-09 18:36 · Score: 1

You can find a reason why a programmer needs to learn anything and everything - but that's not practical. I have no qualms about hiring a statistician for special programming work - any one worth their weight is somewhat familiar with tools and languages. As a programmer I'd rather find a reason for: Why Statisticians Need To Learn Programming! The statistician has much less to learn.

It is a statistical fact that... by uvajed_ekil · 2010-01-09 18:40 · Score: 1

...Zed Shaw is a cranky, irrelevant whiner 96.3% of the time, at least according to the lambda standard deviation of the probability factor. Or so the graph shows, when enough data points are confabulated by the denominator of the sigma variation. And he thinks HE knows statistics.

--
This is a hacked account, for which the owner can not be held responsible.

Re:Reply from a programmer that knows no statistic by upuv · 2010-01-09 19:26 · Score: 1

Degrees or Degree?
And how does your sister's education reflect on you? She's the stats person, not you.

Statistics by emmenjay · 2010-01-09 20:05 · Score: 1

Am I the only one who found that article hilarious?

A 6'2" "Good Looking" graduate whose extensive research into programmers has discovered that all males are innumerate neanderthals and only women really understand him.

Sigh. He's so sensitive. :-)

If only there was some other profession where people were trained in test coverage and such. We could call them "testers". Maybe I'll patent that idea.

What it really is about by Z00L00K · 2010-01-09 20:59 · Score: 1

Is the fact that most people program software by theories and think that they will get best performance when they apply their pet theories to a development project.

But what he really is saying is that in order to verify that the solution actually works it's also important to measure how well it works and time each stage in a process. That can actually yield some very surprising results and reveal that you lose a kiloton of performance on something that you never expected to be a problem.

I have several times encountered those kinds of problems - network lag, missing database indexes, stupid compiler, horrible third party database libraries, slow disks... All revealed by timing the process.

So it's actually only part of the statistics process - the part where it comes to sampling data and understanding it. There is often no need to do standard deviations and things like that when analyzing a software package. Many performance improvements from tuning are far larger than 10%; you can get a tenfold improvement on some operations. But of course there are those that are small too, but those are usually not worth the effort.

And sampling of data can be done with things as simple as print statements or by using a package like Purify Plus.

And no - Zed Shaw isn't a total jerk, that's wrong. But he is a pain in the ass for some people. Especially for project managers and programmers.

He is right about the importance of analyzing a software, but it's not really necessary to plow into the realm of standard deviation and small differences when it comes to analyzing software. But it may be a good knowledge to have when developing a software package since you may not be able to throw your data into Excel for further processing.

And you should also beware of trying to optimize too much because one optimization may actually result in worse performance somewhere else. Just check where it will be most efficient from the overall perspective.

--
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.

How about teamwork ? by golodh · 2010-01-09 22:12 · Score: 1

To a certain extent I agree with the article, even if it strikes a needlessly shrill tone. Programmers are Software Engineers at best, not Statisticians or Mathematicians, and they just shouldn't mess in areas they're not schooled for. And then lots of programmers aren't even Software Engineers, but just "Code Monkeys".

Each of the above 3 professionals has their own area of expertise. And Statistics (such as needed in performance estimation or dimensioning of processing capacity) simply isn't part of the average software engineer's background (let alone that of a code monkey). You wouldn't want a Statistician to code up a decent interpreter, right? I mean: just look at the R interpreter. How about letting a Mathematician design and code your GUI? No takers?

By the same token you wouldn't want a programmer to design a Markov Chain Monte Carlo simulation. That's because programmers know nothing about Markov chains, the length of startup periods, periodicity of a chain, absorbing states, or invariant distributions. Worse yet, they have no way of knowing if their code spouts nonsense or the right answer with a lot of noise. It's not their area of expertise. You also don't want a mere programmer to set up a numerical approximation. I mean: just look at the jackasses that coded up the Patriot timer and made the most elementary mistake in the book of numerical analysis by using a floating-point value as a loop counter and allowing it to accumulate roundoff error. That's a mistake first-year undergraduate engineering and maths students make before they are marked down for it.

So what does that mean? Well, one approach would be to shout: "HECK Programmers Don't Know Jack About Statistics And Need To Be Educated In A Hurry". That's the approach the author of the article takes. I don't believe that's a very fruitful approach though.

Another approach (the one I prefer) is to note that some engineering projects are of necessity TEAM efforts. Where you have a project lead who knows where the problem areas are, who is qualified to solve them, and how the team effort must be managed.

And yes, that means that sometimes programmers get to work under the direction (as in "are told what to do") of a specialist like a Mechanical, Electrical, Chemical, or Civil Engineer. Or a Statistician or a Mathematician for that matter.

On the other hand those specialists needn't be heard when it comes to things like database design, semaphores, inter-process communication, communication protocols, pre- and post-conditions, latency, cache filling, access control and the need for encryption and suchlike.

On still other aspects you may expect specialists and programmers to work together and talk to each other.

So, while the problems mentioned in the article are recognizable (and indeed well known), they don't necessarily mean that programmers should get educated. They should be part of a team, and be professional enough to realize that they are members of the team, not in charge of it.

TLDR by BlackHawk-666 · 2010-01-09 22:42 · Score: 1

Zed's a total asshole. No wonder the programmers don't like him and won't listen to him. Maybe if he spent some of that stats time working on people skills he'd find office life much more enjoyable.

--
All those moments will be lost in time, like tears in rain.

Zed is full of crap by RzUpAnmsCwrds · 2010-01-09 22:45 · Score: 1

Zed is full of crap. At least in my CS undergraduate program, we were required to take a "performance analysis" class that answered basically all of Zed's questions, plus a whole lot more. Effectively, it covered basic statistics as applied to performance analysis, simulations, measurement techniques, and some basic queuing theory.

There are published CS papers that lack statistical validity - that's inexcusable. Anyone publishing a paper that deals with performance should either know enough statistics to publish a valid paper or have their paper reviewed by someone that does.

Expecting all programmers to understand statistics well is not reasonable. "Programmer" can include everything from someone who hacks PHP pages together for a living to someone who does research into new ML techniques or designs complex software systems. For the person hacking PHP pages together, statistical validity isn't a huge issue since the primary goals are getting a system that works and doing so quickly and with minimal cost.

Opposite problem here by Kludge · 2010-01-09 22:59 · Score: 2, Interesting

I question their metrics and they try to back it up with lame attempts at statistical reasoning. I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.

I work with a number of statisticians and I have the opposite problem. They look at the data, apply mathematical transforms to it, and come to a conclusion, whether that conclusion makes any sense or not. They make little attempt to reason that the data may be flawed (which experiments often are), or does not really represent what we are trying to measure, or that they are using the wrong statistic to summarize the effect. It is very frustrating.

HR Really needs to learn Statistics by mjwalshe · 2010-01-10 00:27 · Score: 1

So they might realise the whole house of cards that stack ranking and HR’s beloved PRM systems are is flawed and invalid.

Re:Summarized for people who don't want to read Ze by janwedekind · 2010-01-10 00:32 · Score: 1

What will you do if the 1000 tests take 10 hours?

Either ctrl+c, or try it 10 times.

Why 10 times? Maybe 5 times is enough or at least 20 times is required?

It doesn't have to be statistically accurate. It just has to be close enough.

How do you know that you are close enough?

One can do a benchmark a couple of times to see whether the results are more or less the same. A more sophisticated approach is to measure the standard deviation as well. However there are situations where accuracy is critical. In that case one makes a distribution assumption (e.g. Normal distribution) and then a statistical estimator is used to give a confidence interval for the estimated parameter. I.e. the confidence that the parameter will be within that interval is 95%.
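A rough Python sketch of that last step, using only the standard library. It assumes the timings are independent and roughly Normal, and it uses the Normal quantile (about 1.96 for 95%); with only a handful of runs a Student's t quantile would give a wider, more honest interval. The function name and the sample timings are invented for illustration.

```python
import math
import statistics

def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate spread")
    mean = statistics.fmean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided quantile of the standard Normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem

# Hypothetical benchmark timings in seconds.
timings = [10.0, 12.0, 11.0, 9.0, 13.0]
low, high = mean_confidence_interval(timings)
print(f"mean is probably between {low:.2f}s and {high:.2f}s")
```

If the interval is too wide to tell two versions apart, that is the signal to run more iterations rather than to pick a number of runs in advance.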

It's not statistics or CS, but communication by St.Creed · 2010-01-10 00:43 · Score: 1

This reminds me greatly of my previous assignment, where I had to work with (yet another) "difficult user". He had a Ph.D in statistics and sounded a bit like Zed. He had also done some work in datamining and data warehouses, so he started our first conversation by declaring himself an expert in my field. Great start :)

Of course, as it turned out he was just very frustrated with his colleagues because he couldn't explain his ideas. No surprise there: he tried to explain very advanced mathematics with formulas, to people who barely managed to get a highschool education. After I provided an interface between the parties involved (my CS study came with a course in probability calculus so I could actually understand what he was doing) things went pretty smoothly from there on. My advice to this user when I left was "get a good communications training". He said his manager had been saying the same for about a year, but now that it was coming from me (a techie) he'd actually think about it :)

People who can communicate are paid lots of money. You can have all the skills, but if you can't access them, or combine them, you're not getting much use out of that expertise. Zed's article being a case in point.

--
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)

OOPSLA papers on statistics for Java performance by Itkovian · 2010-01-10 00:51 · Score: 1

We did some work involving statistics to correctly report results, see http://www.itkovian.net/base/statistically-rigorous-java-performance-evaluation (OOPSLA 2007) and http://www.itkovian.net/base/java-performance-through-rigorous-replay-compilation (OOPSLA 2008).

--
I am the Shield Anvil. And I am not yet done.

Testing by turgid · 2010-01-10 00:58 · Score: 1

Statistics are very important when testing a system. You really need to know (especially if the bug was intermittent) what the probability is of NOT seeing the error per test run iteration.

It's not good enough to say, "It happens one in ten times, so if I run it 11 times I will definitely see the bug if it's still there."

The probability of not seeing the bug per test is 9 in 10 i.e. 90% or 0.9. These probabilities multiply, so if you perform the experiment (do a test run) 10 times, the probability of NOT seeing the bug (with the unfixed code) is 0.9^10 i.e. 0.349 or about 35%.

Would you be confident with that?

If you wanted a 1% probability (0.01) of not seeing the bug (in the unfixed code) how many runs would it take? Well, do your logs.

0.01 = 0.9^x

x=43.7

So you would need to run the test 44 times to have a 99% confidence that you'd fixed the bug.
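The same arithmetic as a small Python sketch. It assumes every test run is independent and that the bug shows up with a fixed probability per run (10% here, the figure used above); the function name is made up for the example.

```python
import math

def runs_needed(bug_rate, max_miss_probability):
    """Smallest number of runs so that an unfixed bug stays hidden
    with probability at most max_miss_probability."""
    # Solve (1 - bug_rate) ** x <= max_miss_probability for x.
    x = math.log(max_miss_probability) / math.log(1.0 - bug_rate)
    return math.ceil(x)

# Chance of missing a 1-in-10 bug in ten straight runs.
print(round(0.9 ** 10, 3))
# Runs needed to push that chance down to 1%.
print(runs_needed(0.1, 0.01))
```

If the real failure rate is lower than the one assumed, the required number of runs grows quickly, so the estimate of "one in ten" deserves as much scrutiny as the fix itself.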

--
Stick Men

Zed Shaw posted a response by lena_10326 · 2010-01-10 01:04 · Score: 1

Zed fired off an angry post yesterday after noticing he was slashdotted. It looks like some sort of retaliation swing for the onslaught of pissed off programmers gunning for Zed. http://zedshaw.com/blog/2010-01-09.html

My first thought was is Zed on some heavy duty medication? He seriously has some sort of anger problem going on and a deep seeded hatred toward his idealized concept of the "programmer". Maybe a programmer made him feel bad so now he's got a vendetta. Programmers surely can be dicks. I know because I work with them, but Zed is coming off like a dick programmer times 1000. (I chose 1000 because it's a power of 10.)

If he wants programmers to listen to him and actually change their ways, why doesn't he go with the educator approach instead of going with the approach of flame the world, stomp my feet, and call everyone stupid until they pay attention to me? The best way to get someone to ignore everything you say is to call them an idiot jackass who can't remember anything after 2 minutes. They will kindly oblige by living up to your expectation.

This Zed character may be good at some things like stats but he's damned awful at communication and demonstrating tact. I wonder if he behaves this way on the job, because I would not want to work with such a caustic person. Maybe at work he keeps the anger under wraps and behaves like a great guy, but if I were his coworker I'd lose all respect for him after reading those 2 posts.

--
Camping on quad since 1996.

Re:Zed Shaw posted a response by jjohnson · 2010-01-10 06:11 · Score: 1

deep seeded
*ahem* deep seated.

--
Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.

it's not just programmers... by FrozenGeek · 2010-01-10 01:51 · Score: 1

Check out your local weather forecast. "The normal high for today is..." But what's the standard deviation? If they tell you that the normal, or the average, is 15C and today's high is 25C - wow - that's way above normal. Must be global warming. Quick, send money to AlGore. But what if they also told you that the standard deviation for today is 12 degrees? Oh. Hmm. 25C ain't that significant. Cancel the cheque to Al.
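To put numbers on that example, here is a short Python sketch. It assumes daily highs are roughly Normally distributed, which real temperature records may not be, and it uses the hypothetical figures from the paragraph above (normal 15C, standard deviation 12 degrees, today's high 25C). The function names are invented for illustration.

```python
from statistics import NormalDist

def z_score(value, mean, std_dev):
    """How many standard deviations the value sits above the mean."""
    return (value - mean) / std_dev

def upper_tail(value, mean, std_dev):
    """Probability of a reading at least this high under a Normal model."""
    return 1.0 - NormalDist(mean, std_dev).cdf(value)

z = z_score(25, 15, 12)
p = upper_tail(25, 15, 12)
print(f"z = {z:.2f}, chance of a day at least this warm = {p:.0%}")
```

With a spread that wide, a 25C day is less than one standard deviation above normal, so under this model it would happen roughly one day in five.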

Statistics are worse than meaningless if you don't understand how to use them correctly.

--
linquendum tondere

In 1976... by alispguru · 2010-01-10 02:48 · Score: 2, Informative

... I ran into a professor of statistics who said that computers were going to be a passing fad in his field.

--

To a Lisp hacker, XML is S-expressions in drag.

probability by zugedneb · 2010-01-10 03:06 · Score: 1

theory is what is needed, otherwise statistics does not mean much to anyone...
With probability theory one models, while statistics is used to estimate the parameters of a model.

Re:3 doors / Monty Hall by raynet · 2010-01-10 03:24 · Score: 1

I wrote a little script that simulated this competition and on 10000 runs, if I didn't switch I won the prize 3295 times and if I did switch I won it 4997 times.
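For anyone who wants to try it, a minimal Python sketch of such a simulation follows. It is not the script described above; it assumes the standard rules, where the host always opens a door that hides no prize and is not the player's pick. The function names and seed are made up for the example.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player ends up with the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a door that is neither the player's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=10_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Under these rules staying should win about a third of the time and switching about two thirds. A simulation that reports something else is usually modelling a different host behaviour or counting rounds differently, which is worth checking before trusting either number.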

--
- Raynet --> .

Dear Zed (I know you are reading the forum) by Zarf · 2010-01-10 03:47 · Score: 1

Just because you are perfectly right ... doesn't mean you aren't a complete and total asshole.

As a reformed asshole myself I can tell you that condescendingly pointing out the failures of your colleagues will not get you what you want. Specifically (and I'm assuming here that your goal is the same as mine) getting your colleagues to stop acting like self-righteous fucktards. Most programmers are convinced they are geniuses. This is crucial to understand if you wish to work with them and wish to get them to do anything at all.

I am ostensibly in a senior role in my day job and I do find many things these other programmers do ... well ... fucktarded. That is they are beyond retarded since a retard would know they are a retard or at least not entertain the delusion of superiority that a fucktard does. No my friends we need to call them fucktards because they are fucking arrogant in their belief of superiority. So I can't tell these geniuses to do anything. Nope. Not at all.

You need to use psychology on these fucktards. What you need to do is something Socrates used to do with his little fucktards that he taught. Ask questions. Since the genius/fucktard seems to know so much start by asking leading questions that will do one of two things... it will lead the fucktard down a road that will show you both how stupid he is (and you can pretend they figured it out themselves they love to take credit). Or it will show you where you were wrong... and that you were the fucktard.

Remember we are after end results. So we put aside lesser things (like pride) in the search for a greater goal which should be better software and the ability to make more of it. If you can psychologically manipulate an army of fucktards you will become fucking powerful. Much more fucking powerful than you fucking are on your fucking own. I wish you good fucking luck as I can tell by the response to your post that you are a fucking powerful personality and will definitely lead your own army of fucktards one day.

Hopefully when we meet on the field we can be allies and not enemies.

--
[signature]

damn lies = MBA statistics by AliasMarlowe · 2010-01-10 03:47 · Score: 3, Interesting

Statistics are important; it is highly unlikely that anyone with an MBA will know how or why, but they want them.

In fact, it is almost a certainty that any given MBA will either lack statistical expertise or will misapply it unthinkingly in a cook-book style. The pseudo-statistics behind Six Sigma comes immediately to mind.

I had repeated theoretical discussions with the four MBA experts who "trained" us (a group of six PhDs in Physics & Engineering doing R&D) in the ways of Six Sigma. There were problems with the statistical theory they presented right from the start - and they were clearly unaccustomed to being contradicted along the lines of "that's not right/applicable in this case, and here's why". For instance, they failed to acknowledge that non-Gaussian distributions could exist, then refused to accept that procedures should be adapted to the data if it was non-Gaussian. Next, they adamantly refused to believe that the 1.5 Z shift hypothesis was supported only by a few studies, all relying on a single dataset from the 1950s for die-based manufacture, and totally irrelevant to most other processes. The Six Sigma books all say "many studies" over decades support the Z shift hypothesis, but fail to cite them, and our MBA experts could not cite any such studies either. Thirdly, they refused to accept that an additional mode of variability (not in the Six Sigma beliefs) existed in processes with feedback (such as recycle lines or controllers). In many cases, this mode guarantees non-Gaussian variability in the process output.

Their advice was that to pass the course, we should ignore our knowledge of statistics (which they acknowledged was far better than theirs) and of process variability, and just "apply the documented methods". We did, and we all passed the course. Then we ignored the Six Sigma bogus statistics bullshit and got on with our jobs using proper statistics to analyze and solve problems in variability with the products we were developing.

MBAs seem to want statistics, but the vast majority appear to lack the training in how to generate proper statistics, or how to use them competently if someone else supplies them. Most MBAs appear to think the world is described adequately using Gaussian distributions, and a few "experts" know the Weibull distribution or the t-distribution. Other distribution types (Poisson, discrete/categorical, etc.) are totally foreign, and methods of inference beyond simple unconditional analyses are also quite alien to them.

I also understand that people who are good at it are rare.

Perhaps not as rare as you might think. But those who have some aptitude in statistics know enough to keep their mouths shut when the data tells them to. MBAs on the other hand, ignorant of their own ignorance, are as verbally promiscuous as politicians...

--
Those who can make you believe absurdities can make you commit atrocities. - Voltaire

Re:damn lies = MBA statistics by neurospyder · 2010-01-16 07:51 · Score: 1

That sounds like a communication issue. Did you go to your marketing guy and tell them you could solve this shit?
FAIL?

Troll by redalien · 2010-01-10 03:51 · Score: 1

Zed Shaw trolling? What a complete *fucking* surprise.

is this really necessary? by Goldsmith · 2010-01-10 04:14 · Score: 2, Insightful

I'm a physicist, I know plenty of statistics. The kinds of statistics he's talking about are not hard. If you can do algebra, you can do things like calculate the standard deviation and variance of a set of measurements.

Was this rant really necessary? I run into people in physics who don't take care of these details. I find that a simple "can you put a standard deviation on that number?" or "can you repeat the experiment?" generally gets the job done. If you want to be more scientific, just start with those questions, and see where it takes you... you could even add "please" if you wanted to be nice. I find threatening people with death and belittling their intellect while talking about trivial calculations doesn't generate useful data.

To be fair, it sounds like Zed has been working as staff at a university. This has nothing to do with statistics, but it's probably the real reason he's in such a bad mood.

I don't need to stinking statistics by corecaptain · 2010-01-10 04:25 · Score: 1

Sorry, Zed I don't need statistics to do my job. Zed jumped the shark years ago - isn't he the Rails guy? That is so 2005. This story is like having deja-vu of a bad hangover.

Just go away.

He's in good company by grizdog · 2010-01-10 05:17 · Score: 1

Leaving the author's lack of social skills aside, the powers-that-be in computer science education agree with him, at least for now. The Computer Science Accreditation Board lists a course in probability and statistics among its criteria (sorry, I couldn't find an online link to the latest criteria) and has for at least 20 years. I don't know how influential those criteria are outside the US (though I'd be curious, if any slashdotters can help me out), but here they are pretty important, especially for the vast majority of programs that are not at the top schools, and need the credibility that accreditation can bring them.

Not everyone is happy, though. At the 2005 OOPSLA there was a panel discussion where one thing they all could agree on was that the CS curriculum was way too mathematical. They favored something more like a software apprenticeship where "projects" were replaced with "products". That point of view does not appear to be in the ascendant in computer science yet, but it might catch on in the information science departments that are often found in business colleges.

Personally, I don't think the CS departments are likely to get less mathematical as long as there is strong demand for their graduates. There are certainly a lot of students who don't major in computer science because it is too mathematical for them, and I'm sure some of them wind up as programmers through some other route, and others find some other career. Moreover, I'd say that with one probability and statistics course that follows calculus, the students do get enough to "know what they don't know", which was what the author wanted.

Re:Summarized for people who don't want to read Ze by SanityInAnarchy · 2010-01-10 05:51 · Score: 1

1) the quality of your future coworkers

I base this on the quality of my past coworkers. I was probably lucky, though.

2) the quality of commonly held CS degrees

I'm at Iowa State University right now. It seems to be an exceptionally-good CS program. Depending on the kinds of friends I make here, I'll probably end up in a job with some of my classmates.

3) how much of their education you or anyone else remembers five to ten years after leaving college

The parts you use.

It's also much easier to re-learn something than to learn it from scratch -- thus, Zed could've said "brush up on your statistics", not "learn statistics".

--
Don't thank God, thank a doctor!

Re:Reply from a programmer that knows no statistic by scamper_22 · 2010-01-10 07:47 · Score: 1

ummm, where are you coming from here?

Everything is complex. That's the basis of every libertarian ideology. Life is too complex for a group of politicians or 'experts' to manage.
As a result of this complexity, the reasonable thing to do is to allow people to try different approaches to solve their problems... hence looking down on things like central planning.
If you think you have a solution, you are free to prove to the world that it is correct. That is freedom... the freedom to do things to solve the problem.

The alternative is the belief that some group of experts and politicians can capture all the information in the world and formulate working policies to dictate how society should behave.
Their track record? Dismal... communism, fascism, corporatism, theocracy... They all seem to fail empirically. For one, it is rare to have such experts actually know everything. Secondly, you have to count on the experts actually having 'good will' towards the populace and not becoming corrupt or obsessed with their own power and money. Again not a trivial task.

We agree that life is complex and problems are deep. A free society demands those with solutions implement them and prove they are the best... and people will gravitate to the best (or at least good enough) solutions. Think you have a better way to run a school? Open up the school and bring in students and show people that your way is better. That is freedom.
The alternative which is what we have now? Have a bunch of experts think they can devise the best education policy, implement it within the public school system where people are taxed even if they don't attend it.

Empirically it is shown to work. School choice for example is available in many countries and places. Society does not collapse (Sweden, Chile, Alberta, British Columbia...). Yet the 'experts' who actually tend to deny empirical evidence tend to go against it in favor of theoretical arguments that society will divide if our kids don't learn together...

I used to be a socialist. Until I looked at the empirical evidence. Now I favor freedom.

Re:He's not claiming they are wrong - they are uns by Xphile101361 · 2010-01-10 08:00 · Score: 1

The people he is talking about are therefore not wrong - they are ignorant.

I'm sure this goes against everything you've been taught, but right and wrong do exist. Just because you don't know what the right answer is - maybe there's even no way you could know what the right answer is - doesn't make your answer right or even okay. It's much simpler than that. It's just plain wrong.

Dr. Gregory House

Re:Reply from a programmer that knows no statistic by Improv · 2010-01-10 08:17 · Score: 1

You misunderstand the alternative. Societies have been a mix of planning and autonomy for all of human civilisation, and *that* is what has worked well. It is not perfect, but it by-and-large works. Societies that overstress planning or autonomy have never been workable. No system in the world is laissez-faire, nor is any system entirely planned, and all systems have their failures. It is not hard to find these for the systems that are closer to laissez-faire, and you'd do this if you were really interested in a fair comparison.

The invisible hand, even to the extent that it supports the public good, is not always optimal. Often it doesn't even try to and is off optimising something else.

Experimentation is good, and certain amounts of competition can be worked into state structures to allow that. If there are better ways to run schools, we should find them and implement them in the public schools. We are, however, going to insist that the schools be public, that everyone pays for them, and that everyone goes to them. It's otherwise too easy for one person who earns privilege (to whatever extent the degree of that privilege is just is another question) to turn it into a privilege passed, unearned, throughout many generations. Universal, public, mandatory, integrated schools help prevent that. They also help prevent racism by forcing people to rub shoulders, and they help prevent idiocy by preventing religious nuts from being the only people to educate their kids.

Formal freedoms are not the only ones worth considering - if you "allow" something in a system, but that same system effectively prevents you from enjoying it, then that allowance is very shallow. Having justice but having finances result in some people being unable to hire (any or a good) lawyer results in very shallow justice. Similarly with any other social good.

If you believe in the tangled libertarian notion of liberty as the only good, your philosophy might work. If you believe in any other goods, to cling tightly to libertarian traditions and hope to pick up reasonable amounts of these other goods will prove most unsatisfactory.

--
For every problem, there is at least one solution that is simple, neat, and wrong.

Performance Statistics are Often Low Priority by CodeBuster · 2010-01-10 08:27 · Score: 1

In a world where many programmers are lucky to even finish the project with working code (software projects have very high failure rates in the real world), performance tuning of the type where statistics would be useful is often an unaffordable luxury. Most programmers make a genuine effort to avoid the more obvious performance sinks with some knowledge of Big O Notation and known antipatterns, but in a world populated by demanding managers and slashed budgets that is really the best that most of us can do. If Zed wants programmers at his company to become experts on statistics and do detailed performance benchmarking then he can pay them himself for the privilege (hint: programmer cycles are vastly more expensive than processor cycles); otherwise he can, with all due respect, shove it.

He's the one making generalisations... by mdwh2 · 2010-01-10 08:50 · Score: 1

He claims that programmers need to understand statistics more. The people he is talking about are therefore not wrong - they are ignorant.

And this applies to all programmers?

He's the one making generalisations based on anecdotal experiences, which is itself a poor practice in terms of statistics.

It's a perfectly fair point to say that many people need to understand statistics better (and it can be done without sounding like a snob), but there is no reason for him to target his rant at programmers. My degree was in mathematics, and I now work as a programmer in which I use mathematics - where do I fit into his box?

A programmer could just as easily write a pompous rant about "How statisticians need to understand computers better", based on a handful of anecdotes and generalisations.

I don't know why we're giving time to someone whose level of argument is "they dont know shit", and resorts to childish ad hominems of "their confidence in their lacking knowledge is only surpassed by their lack of confidence in their personal appearance".

Statisticians need to learn about logical fallacies or I will kill them!

Re:He's the one making generalisations... by SuperKendall · 2010-01-10 18:02 · Score: 1

And this applies to all programmers?
Yes of course. Because knowing statistics makes anyone a more valuable programmer. As I said, it's another tool.
A programmer could just as easily write a pompous rant about "How statisticians need to understand computers better"
Actually that would be true as well, for the same reason. Are you saying there are statisticians against using computers? This seems unlikely.
Not seeing the problem here.
I don't know why we're giving time to someone whose level of argument is "they dont know shit", and resorts to childish ad hominems of "their confidence in their lacking knowledge is only surpassed by their lack of confidence in their personal appearance".
Because despite all that his point is sound, once you brush off all the crap.

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley

Re:He's not claiming they are wrong - they are uns by Almahtar · 2010-01-10 08:53 · Score: 1

it's more accurate to say they are like a variable named "statistics" with a value that has never been set.

That doesn't sound anything like a car. You must be new here.

Re:Show them you're the Boss by dindi · 2010-01-10 09:26 · Score: 1

I thought everyone on /. is using rock-paper-scissors-lizard-spock already.

Bullshit database stats in the article by lordlod · 2010-01-10 12:59 · Score: 1

From TFA:

Almost all of the queries performed great, except one query that had sub-second response on average, but a 60 second standard deviation!

Pause and reflect on this for a moment. The average is poor and occasionally it stuffs up so severely that the stddev is pulled out by sixty seconds.

I managed to reproduce this (mean of 1.07s, stddev of 58.4). 3000 results of 1e-30s, one of 3200s (almost 1 hour).

If you need statistics to interpret the above results then you have bigger problems.

If you ACTUALLY get the above results you don't complain about the outlier and get them to rework it. Thank $DEITY, time out at a nanosecond and re-request.
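
For anyone who wants to check the reproduction above, here is a minimal sketch using only Python's standard library. The dataset is exactly the one described (3000 results of 1e-30s plus one of 3200s); the choice of the population standard deviation is an assumption, since the comment does not say which form was used, though the sample form rounds to the same figure here. It also prints the median and maximum, which expose the outlier without any further analysis.

```python
import statistics

# 3000 near-instant results plus one pathological outlier, as described above.
timings = [1e-30] * 3000 + [3200.0]

mean = statistics.fmean(timings)
# Assumption: population standard deviation; the comment does not specify.
stddev = statistics.pstdev(timings)
median = statistics.median(timings)
worst = max(timings)

print(f"mean={mean:.2f}s stddev={stddev:.1f}s median={median:.0e}s max={worst:.0f}s")
```

The mean and standard deviation match the figures quoted (about 1.07s and 58.4), while the median stays at the near-zero value and the maximum is the single 3200s result.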

Mod parent up by jvonk · 2010-01-10 14:32 · Score: 1

As someone who holds two B.S. degrees {computer engineering, computer science}, I take issue with the GP's statement. The typical CS student does not learn about transistor fanout, CMOS logic, VLSI, etc.

CS is derived from St. Turing and his universal machine. CE covers how to make (and use) one of those.

Re:Summarized for people who don't want to read Ze by Asian+Freud · 2010-01-10 18:41 · Score: 1

Because I am a genius and lazy and don't need to study much in order to get an A.

Until the third year when I almost failed a math course ;)

--
Excellence is an attitude.

Why argue as opposed to cooperating by LostMyBeaver · 2010-01-11 01:04 · Score: 1

Personally, I love spouting statistics, but those are the ones based on logic such as "this won't matter in 99% of all circumstances". I admit that mathematical statistics just don't interest me as the stats I use don't need to be all that accurate. I do use them for real time protocol development, but for those, a cook book is good enough. No point learning the math on them. Takes too long and I don't gain enough from the effort to justify it. My math learning brain capacity is better spent focusing on differential equations and linear algebra. I don't have the brain cells left for a 3rd discipline :)

I love programming, but I despise trying to implement algorithms written by math geeks. They're typically sloppy and depend heavily on background information that I just don't care about. Write some pseudo code instead of using 30 pages describing the variables in an equation. When I had to start working with wavelet transforms, I had to learn some weird French notation for math I've never seen before that looked like Polish, not Greek (and I mean Polish the language; I'm not making Polish jokes).

I'm a strong believer that programmers should have at least better than generalized math skills, but I also believe that stats geeks and math geeks should be at least able to write in Mathcad or R or something. Then at least a programmer can do something with it.

If a stat geek and a code geek are expected to work with one another, they should at least have some way of speaking with one another and I genuinely believe that the stat geek can learn to program enough to make an example a lot easier than a code geek can learn to read their math.

I work in a company made up entirely of developers who have learned that instead of saying "Hmm... nope, that's not my thing, cya!" they instead say "It's not my thing, let's see if we can sort it out though." We help each other out and we solve problems. If you happen to be a math or a stats geek, we'll work with you to try and understand the garble that you're attempting to communicate, but it'll take far more than just "here's the math, cya" because then we'll just interpret it however it seems to make sense to us. And I promise you, it'll be wrong :)

Teamwork solves these problems.

Re:3 doors / Monty Hall by PPH · 2010-01-11 06:44 · Score: 1

If I'm the good Monty Hall:

When the contestant has selected the donkey, I open another door and offer them the opportunity to switch. If they've picked the prize, I open that door.

If I'm the evil Monty Hall:

When the contestant has selected the Ferrari, I open another door and offer them the opportunity to switch. If they've picked the donkey, I open that door.
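
The two hosts described above can be compared with the standard game by simple enumeration. This is only a sketch under stated assumptions: three doors, one prize, a uniformly random first pick, and a host who always opens a losing door when offering a switch. The function name `win_probability` and the host labels are made up for illustration.

```python
from fractions import Fraction

def win_probability(host, switch_when_offered):
    """Exact chance of winning for a contestant with a fixed switching policy."""
    total = Fraction(0)
    # The first pick is the prize with probability 1/3, a losing door with 2/3.
    for picked_prize, weight in ((True, Fraction(1, 3)), (False, Fraction(2, 3))):
        if host == "standard":
            offered = True                # always offers a switch
        elif host == "good":
            offered = not picked_prize    # offers a switch only to a losing pick
        elif host == "evil":
            offered = picked_prize        # offers a switch only to a winning pick
        else:
            raise ValueError(f"unknown host: {host}")
        if offered and switch_when_offered:
            # With three doors and one losing door opened, switching flips the outcome.
            won = not picked_prize
        else:
            # No switch: the host opens the chosen door, or the contestant stays put.
            won = picked_prize
        if won:
            total += weight
    return total

for host in ("standard", "good", "evil"):
    print(host, "switch:", win_probability(host, True), "stay:", win_probability(host, False))
```

Under these assumptions, always switching wins 2/3 of the time against the standard host, every time against the good host, and never against the evil host, while staying wins 1/3 in all three cases. The usual 2/3 answer therefore depends on knowing the host's policy.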

--
Have gnu, will travel.

Stats by Stormcrow309 · 2010-01-11 07:35 · Score: 1

I find TFA pretty clueless when it comes to a real understanding of what is needed for performance testing and tweaking. A statistical analysis is nice, especially with Monte Carlo type analysis, like Bungie running Halo 3 on numerous Xboxes simulating load and player interactions. However, I find that what is lacking with programmers is a basic understanding of the high levels of process analysis, such as network analysis, CPM, and PERT. Knowing a process has high levels of variance is nice, but not useful for understanding the why. Where is Zed's example of multivariate linear regression or ordered probit? Discussion on hypothesis testing? Anyone, anyone?

As a side note, Statistics in a Nutshell is the only book programmers really need on stats.

--

In God we trust, all others require data.

Smooth moves by flabordec · 2010-01-11 08:42 · Score: 1

From TFA:

I never have this problem with female programmers. Maybe it’s because I’m tall (6’2”), or nicer to them, but they always speak rationally and are really keen to learn. If they disagree, they do so rationally and back up what they say. I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.

I'm also good looking and know a lot of statistics. Ladies, I really respect you and I think highly of you. If you would like some private statistics lessons call me at (123) 456-7890.

Smooth move, Zed Shaw, smooth move.

--
"I see undead people" Warcraft III - Necromancer

NetMBA by mahadiga · 2010-01-12 17:22 · Score: 1

I suggest programmers learn management as well: http://www.netmba.com/

--
I'd like to buy homeland for our 10 million people. http://twitter.com/mahadiga

Murphy's Law is a statistical aberration by tinkwink · 2010-01-13 18:34 · Score: 1

I generally think in programming it's the exceptions that cause the problems. I usually only look at averages and maximums, however it must be said many performance problems are caused by an exponential increase in execution time with a linear increase in load/dataset size. I don't really know stats but it's pretty easy to see when this is the case. There are many things that stats will never predict, i.e. when you are going to hit a wall without an underlying knowledge of where the walls are and how close you are to them and what/how you move towards them. It's all pipes and data in the end. You should know what's going to break it (exceptions to your assumptions) and where your bottlenecks are, and what path is going to get followed in what situations. That can get tricky in database queries, say Oracle, with stats determining your execution plan. How often does the full table scan in a loop seem to cause a query to never return? Google oracle stats execution plan. I guess it keeps DBAs in a job.
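
One rough way to "see when this is the case" is to increase the input by a fixed step and look at the ratio between consecutive costs: a ratio that stays roughly constant and well above 1 suggests exponential growth, while a ratio that drifts down toward 1 suggests polynomial growth. The sketch below uses call counts rather than wall-clock time so it is repeatable; `naive_fib_calls` and the quadratic series are hypothetical stand-in workloads, not anything from the article.

```python
def growth_ratios(costs):
    """Ratio of each cost to the previous one, for inputs that grow by a fixed step."""
    return [later / earlier for earlier, later in zip(costs, costs[1:])]

def naive_fib_calls(n):
    """Number of calls made by a naive recursive Fibonacci (stand-in exponential workload)."""
    return 1 if n < 2 else 1 + naive_fib_calls(n - 1) + naive_fib_calls(n - 2)

sizes = range(10, 21)                                   # input grows linearly, one step at a time
exponential = [naive_fib_calls(n) for n in sizes]       # cost multiplies at each step
quadratic = [n * n for n in sizes]                      # cost grows polynomially

exp_ratios = growth_ratios(exponential)
quad_ratios = growth_ratios(quadratic)

print("exponential:", [round(r, 2) for r in exp_ratios])
print("quadratic:  ", [round(r, 2) for r in quad_ratios])
```

With real timings the ratios will be noisy, so this is only a first look; it says nothing about where the wall is, which is the commenter's point.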

NO! by gbutler69 · 2010-01-18 14:18 · Score: 1

You are wrong even though you think you are right. The question is, what are the odds that "if 1 is heads, the other one is also" - not what are the odds that both will be heads.

The latter corresponds to the analysis given. The former (the question asked) does not. In the actual question, the first two possibilities are either ruled out, because you've already stated that the first coin is 1, or, you are allowing for either coin to be one, then what are the odds of the other to be one as follows: 0 - 0 = Ruled out by the given 0 - 1 and 1 - 0 - ....

Woops! Never mind. I was reading the question incorrectly. I read "if one is heads" as "if the first is heads". Need to work on my readin comprehension, not my odds skilzzz.
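
The two readings mentioned above ("if one is heads" versus "if the first is heads") give different answers, which a short enumeration makes concrete. This is a sketch assuming two fair coins and four equally likely outcomes; the helper name `conditional` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All outcomes for two fair coins: HH, HT, TH, TT, each equally likely.
outcomes = list(product("HT", repeat=2))

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    matching = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in matching if event(o)), len(matching))

def both_heads(o):
    return o == ("H", "H")

# Reading 1: "at least one of the two is heads".
p_at_least_one = conditional(both_heads, lambda o: "H" in o)
# Reading 2: "the first coin is heads".
p_first = conditional(both_heads, lambda o: o[0] == "H")

print("given at least one heads:", p_at_least_one)
print("given the first is heads:", p_first)
```

Under the first reading three outcomes remain and one of them is two heads; under the second only two remain, so the answers are 1/3 and 1/2 respectively.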

--
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
