Why Programmers Need To Learn Statistics
David Gerard writes "Zed Shaw writes an impassioned plea to programmers: Programmers Need To Learn Statistics Or I Will Kill Them All. Quoting: 'I go insane when I hear programmers talking about statistics like they know s*** when it's clearly obvious they do not. I've been studying it for years and years and still don't think I know anything. ... I have taken a bunch of math classes, studied statistics in grad school, learned the R language, and read tons of books on the subject. Despite all of this I'm not at all confident in my understanding of such a vast topic. What I can do is apply the techniques to common problems I encounter at work. My favorite problem to attack with the statistics wolverine is performance measurement and tuning. All of this leads to a curse since none of my colleagues have any clue about what they don't understand. I'll propose a measurement technique and they'll scoff at it. I try to show them how to properly graph a run chart and they're indignant. I question their metrics and they try to back it up with lame attempts at statistical reasoning. I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.'"
Everything I needed to know about statistics I learned playing poker.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
110%.
Correlation != causation. Just repeat that and you don't need to know statistics.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
It is just another way for the majority of programmers to jstify their shortcuts and shortcommings as correct. If they were to really study statistics, they would finally realize they know nothing and a chorus of millions of programmers heads would explode simultaneously.
Maybe the problem is in your presentation. Even here, you tell programmers that you want to kill them for not understanding a topic that even you are unwilling to acknowledge mastery of. Then you tell us how hard the topic is to understand, even though you've spent so much time trying to learn it.
Is it any wonder that no one takes your suggestions seriously? You are practically sabotaging yourself with self-effacement.
These aren't homework problems you're tackling here. They are business problems and you need to sell yourself and your ideas if you want to get any traction. Do you have any evidence that your methods are better than the SOP thus far? Do you have any case studies that show how effective statistical analysis is in *any* of your projects?
Or are you simply taking something that seems like a data point and extrapolating it to cover a vast swath of applications?
Statisticians need to learn programming or I will kill them all.
We know as much statistics as we need to know.
Some know more, some less. Each has traded off hours vs. knowledge in many fields.
For example: Why would a programmer whose job is to automate bean counting need to know more than basic statistics? (S)he rightfully focuses his or her efforts on accounting.
One post calculus statistics course gives me enough grounding to know what I don't know and punt to experts when I need to.
Fucking specialists forget all the things they don't know and only look at the world through one lens.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Damm geek. Take your fancy math and get off my lawn.
Programmers Need To Learn Statistics Or I Will Kill Them All
Okay, two things: First, threatening programmers never works. Management's been trying that for years. Second -- don't you mean 'kill -9' them all, or maybe demalloc(), or cast them to void*, or one of a dozen other witty things you could do besides the mundane answer of threatening stabby bits on them because you have a case of intellectual snobbery?
#fuckbeta #iamslashdot #dicemustdie
Zed Shaw says: "I've been studying it for years and years and still don't think I know anything"
Don't you think this might be telling you something, like... perhaps statistics are too hard for you? Leave the real work to the people who do know what they are doing and do know something about the field: programmers.
statisticians need to stfu or I will kill them all.
observation in some circumstances. In social sciences, where you generally can't classify phenomena by observable evidence, you have to rely on them by assuming others think as you do, so that you have "observations" (ie others' perceptions or classifications as related) to work with.
I never took a statistics class as an undergrad. In retrospect, I think it would have been very useful, probably more so than the calculus I took (which I think is also a very good thing to know, but stats tend to be used more often).
as opposed to strawmen and insults, scroll to the power of ten syndrome heading on the linked page.
He's just as arrogantly claiming that he's right and they're wrong. Now, he may very well in fact be right, but he's taking the same obstinate position the people he criticizes do.
It's important to know when your input is not desired. Even if you think it should be.
is not because they don't understand statistics. It is because you are a dick.
Statistics is HARD, for two reasons:
(a) Probability theory, on which all practical Statistics is based, is both (i) counter-intuitive and (ii) difficult
(b) The very Mathematics on which it is based is obscure
And, worst of all, it is uniformly badly taught, even in good universities, and the Statistics for XXX are uniformly awful, blind leading the blind.
Lastly it is very hard to get a straight answer from a mathematical Statistician.
... something inside me wants to flame him for being a rude twat who wasted 1 minute of my lifetime, even though he has some valid points. I'd be surprised if he didn't get some responses along the lines of "cry me a river" etc.
"I love my job, but I hate talking to people like you" (Freddie Mercury)
I know enough about statistics to know that, statistically, I'm safe from his threats. I suspect if I were a bag of Cheetos the odds would be against me but that's not the case.
I've found that, more than just about any other degree, Computer Science and to a lesser extent Medical Degrees imbue the recipient with an unnatural ego when it comes to subjects with which they are unfamiliar. I propose we remove the word Science from CS degrees and call it what it is: "Computer Programming and Troubleshooting". There are far too many CS graduates who think they are actually scientists.
I was tasked recently with developing stat reports that would be used to give the best workers the most important tasks. I used their desired metric, and modified the numbers to show on a 0-100 scale where 75 is average and each standard deviation is 10 points. The result? The sample sizes were too small, and some groups had widely varying scores when every group member's performance was nearly identical. Then again, maybe I'm doing something wrong.
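For what it's worth, the rescaling described above (average pinned at 75, each standard deviation worth 10 points) can be sketched in a few lines of Python. The function name, parameter names and sample numbers are made up for illustration; this is a guess at the calculation, not the poster's actual report code. It shows how three workers whose raw numbers differ by about one percent still land roughly 10 points apart from each other once scores are standardized within a tiny group.

```python
from statistics import mean, stdev

def scaled_scores(values, center=75.0, points_per_sd=10.0):
    """Map raw metric values onto a scale where the group mean is `center`
    and one sample standard deviation is worth `points_per_sd` points."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation; needs at least 2 values
    if s == 0:
        # Everyone is identical, so everyone sits exactly at the center.
        return [center for _ in values]
    return [center + points_per_sd * (v - m) / s for v in values]

# Hypothetical group of three with nearly identical raw performance.
tight_group = [100.0, 100.5, 101.0]
print(scaled_scores(tight_group))
```

The spread in the output comes entirely from dividing by a very small standard deviation, which is one plausible reason near-identical groups showed widely varying scores.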
Seriously.
I've been studying it for years and years and still don't think I know anything.
And yet you're expecting someone whose expertise is in a different field to know more about it than you?
We can't all be experts in everything. If you're the expert in the field of discussion, get used to educating your coworkers on the topic, or find another job where you're surrounded by people with the same education and expertise as you.
The average person is an expert in no more than two or three related areas. That's why people work in teams, to cover each other's blind spots.
I work for the Department of Redundancy Department.
I never thought I'd read lame crying on slashdot, but now i have. Man up and cut your own wrists.
Everyone knows smoking is the leading cause of statistics.
He cannot even write a logical, rational thought supporting why programmers need to know more than a casual level of statistics. He just rants about blue sunsets and writes the f-word a lot.
Nothing new to see here.
you had me at #!
That's ODBC, Junior. Details matter.
(And I'll bet you a thousand dollars that I earned more than you this month.)
Statistics is WAY beyond what a programmer cares about. Logic is all that matters. Statistics->logic is the problem of the software engineer, not the programmer.
...unfortunately, they are mostly lost in the irony of statements like this:
I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.
I doubt I've seen anyone more thoroughly entrenched in a pissing contest than Zed Shaw, of the website formerly known as "Zed's So Fucking Awesome".
Don't thank God, thank a doctor!
It is easy to convince your colleagues that you are better than them in statistics. Just play some statistical games with them. I recommend the "Three Door Problem" which is sometimes called the Monty Hall problem. Those people who don't know statistics will be doomed.
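If a colleague refuses to believe the answer to the Monty Hall problem, a quick simulation tends to settle it. The sketch below is one way to write it in Python; the function name, the trial count and the seed are arbitrary choices for illustration. It assumes the standard rules: the host always opens a door that hides no car and is not the contestant's pick.

```python
import random

def monty_hall_win_rate(switch, trials=20_000, seed=0):
    """Estimate the probability of winning the car when the contestant
    always switches (switch=True) or always stays (switch=False)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a door that is neither the pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print("stay:  ", monty_hall_win_rate(switch=False))
print("switch:", monty_hall_win_rate(switch=True))
```

Staying should come out near 1/3 and switching near 2/3, which is the counter-intuitive part.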
When it comes to statistics it becomes like religion - you either believe that they are generating a truth or you don't
(It's all about assuming that you are accounting for all the variables)
jstify = justify
shortcommings = shortcomings
programmers = programmers'
If the word is underlined in red, you spelled it incorrectly. Just a thought. I can only hope that you are more careful when you write programs.
Zed Shaw writes an impassioned plea to programmers: Programmers Need To Learn Statistics Or I Will Kill Them All.
// This will never happen
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
I certainly suffer from a feeling of being an expert in all fields. Deep down I guess I know I'm not, but I'd probably rather just muddle my way through it assuming I know everything there is to know. The trick is knowing when something is sufficiently out of your field that you need to defer to someone who is an expert in that field. Statistics is just one example. Certainly a little bit of knowledge in a lot of fields is a good thing, but when you have to choose between 4 years of study vs consulting someone who's already done 4 years of study, the choice should be obvious... (assuming you aren't going to spend the rest of your programming life doing heavily statistics related programming :)
For me the frustration is taking the word of an expert without understanding why and how they have arrived at that answer. I guess statistics is one field where the answer that 'feels right' is often not the answer that is right. The number of people who buy lottery tickets is a good example of that :)
I don't know how educated your colleagues are, but if they have studied computer science, then you should just shut your dumb mouth, because we learn how to analyze running times WITHOUT actually running it. Even without actually programming it, just by analyzing the problem itself. That is called "complexity theory" and (in that case) you are the one who doesn't have any clue about what you don't understand.
and go away with "tuning". You might improve running times a bit, but no little tuning hack can defeat the improvements you get by better algorithm design by an expert on algorithmics (I mean that e.g. some XOR AX AX might speed up your program by factor 2, but replacing simple backtracking with techniques to keep branching vectors small gets you exponential speed ups!)
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes
95% confidence in your understanding of statistics, when applied in a business setting, is often just as good as 95% confidence in the actual measurements. Yes, the last 5% is the trickiest bit, but rest assured that at the slightest indication that a proper application is required, I won't be afraid to ask someone who knows more. It's just that this is quite rare.
For example: performance testing systems. You care far more about the degradation mode than about a statistical model of sustainable load.
I'm really starting to fucking hate science. I finally, after 6 years, decided I should go back to university and get a stupid piece of paper on something. So I start down the two things I loved best in school: Biology and Computers. But oh, it's not that simple, because to learn computers "right" I have to take algebra, calculus, statistics and physics (just in case I'm ever lost in the desert and need to build an iPhone out of sand and snake shit). To learn Biology I have to know chemistry, physics, sociology and psychology (which requires fucking statistics anyway). Stupid!
I shoulda just been like every other lemming and got a business degree so I could earn six figures a year with my thumb up my ass and my brain on a tropical island.
Those two things are what statistics is based on in the first place as well. Evidence etcetera comes second. If you can't blow logical counterarguments away you're probably wrong and you're indeed lacking in understanding.
Let's see, we have one guy complaining about how none of his programmer coworkers understand statistics, and we have X coworkers who undoubtedly disagree with him. Since we do not know him or any of his colleagues to any meaningful degree, we have to assign equal weight to each of their opinions. Statistics then tells us there is a 1/(X+1) chance of his being right, and an X/(X+1) chance of their being right. We can assume that X >= 2 based on his ranting, therefore resulting in the odds favoring them by at least 2/3, and probably much more. Therefore it is only rational to assume they are correct.
83% of programmers know that 67% of statistics are made up on the fly anyway.
What has Zed Shaw done for humanity?
you fix it once to handle when some Anti-Mensa card carrying twit actually makes it happen
then you fix it a second time to prevent it from happening
every time you get data from a user/outside process you should be able to handle values that make you go Eh WOT?? and then chuck those values out (and emit the correct error code)
Any person using FTFY or editing my postings agrees to a US$50.00 charge
This is somewhat tangential to the discussion but I recommend the
MANGA GUIDE TO STATISTICS
http://www.amazon.com/Manga-Guide-Statistics-Shin-Takahashi/dp/1593271891
You know, studying stuff in college for years doesn't make you smart. Maybe these are clever, practical people, and you're just not a good communicator?
I want to delete my account but Slashdot doesn't allow it.
Everyone needs to learn statistics. All of us who understand one iota of it are in a constant state of depression over how everyone keeps on making the most banal mistakes. But just a general gripe is not very helpful. Getting everyone to take advanced degrees in statistics is simply not going to happen. Most engineering courses include some basics, but that only helps a bit. What is needed is to teach it (to the "masses", i.e. the ones who really ought to know better) in terms of the pitfalls first, and then how to understand the workarounds. Those who have no interest in pursuing it further might still gain some insight about where to be careful, and those with potential might more easily see the point in investing in some real knowledge.
sudo ergo sum
I studied it for years, so my e-peen is bigger. It worked in school, so it has to work in reality and thus they are wrong when they tell me it does not, despite them having experience with real applications while I have not.
Ok, snideness aside. Statistics is a wonderful tool (hey, my degree is in statistics actually), but I wouldn't want to impose my metrics on real applications without first looking at whether they measure anything sensible. I turned to programming because, well, it's more suitable to me. But when I look at the metrics some of my superiors designed, cringing is all I can do.
Example: A metric that measures how much code you produce. Which is in theory nice. Who creates more code has done more work. Right? From a statistician's point of view, yes. But any programmer will tell you that it's trivial to write lots of lines or few, and they will do the same work. Most programming languages support that just fine. Does the statistician know? Probably not, unless he is a programmer too.
Example: A metric that measures the amount of code you alter. Which is in theory nice. You check out, change and check in code, and who checks out and checks in more (and does alteration in between) does more work than others. Right? No. For reference, see the Wikipedia game.
The reason why programmers scoff at metrics is that we've all seen our share of really, really crappy metrics that led to less instead of more productivity because everyone started gaming the system. Had to do that, because if you actually did sensible work, you fell behind in the metric against those that gamed (i.e. those that didn't produce in the first place).
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
This is a classic Zed Shaw post:
- Makes one really very good point (programmers doing testing should incorporate basic statistical techniques into their tests)
- Tells everyone how smart he is, albeit emphasizing his own humility ("I've read tons of books on this subject, but I still don't know shit")
- Angrily berates stupid fucking programmers for making fucking stupid mistakes, and for not listening to him when he tries to put them fucking straight
- Claims that bad practices afflict the entire community (except him)
- Betrays secret hurt feelings ("Screw you guys, I'm going to get a burrito")
- Makes creepy and patronizing comments about women
- Informs us how tall he is (6'2")
- Descends into Daily WTF-style enumerations of fucking stupid things his former boss did
Unfortunately it is missing some elements that would make it a truly great Zed Shaw post: personal insults, bewildered complaints that he is not rich, and stories about his random good deeds.
His main point is excellent though: programmers doing testing should understand statistics, and their tests should be statistically valid, just like any other empirical test. A great point and one I have not heard discussed very much in the context of software engineering.
Best. Troll. Ever.
You know nothing about statistics, yet want to tell us how it is a phony science?
You couldn't have taken a few minutes on wolfram, or even wikipedia to even TRY to know a little of what you are talking about?
Yes, I do think you are a lunatic.
....Zed wants everyone to be just like him.
Before you design for reuse, make sure to design it for use.
Unless they're actually programming statistical applications, most programmers probably don't need to know statistics. As long as somebody on the testing team does, all the programmer needs to understand is that function X sometimes fails to meet its timing spec (perhaps "often fails..." or "occasionally fails..." might add some value) or whatever. Then they know they need to do some optimisation. There's a natural human tendency to think that everybody should be doing what we're doing. In reality, they don't have to, because we're doing that; they need to be doing something else.
Quidnam Latine loqui modo coepi?
http://slashdot.org/comments.pl?sid=1499856&cid=30673056
"I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation." Both are superior to statistics.
Lies, damned lies and statistics. Us programmers are too busy dealing with the first two to ever reach the third..
Bridges that fail, fail predictably. It is usually just a question of collecting some data.
Good luck demonstrating that an aircraft instrument landing system is fit for purpose, then. Semiconductors might fail predictably when they're being observed under an electron microscope, but it's a bit harder in a hut by the side of an airfield.
Only 37% of programmers need to learn statistics. Remaining 95% either know it already or don't need it at all.
I think it's a matter of what you do as a programmer... Not trying to brag here, but I get paid a lot if you compare my salary to that of others, my skills are also well-sought after where I work and project managers are always trying to drag me onto their projects. Yet, I don't know ANY statistics.
Why is this? Because I don't need it for what I do. I think generalizing things and saying all programmers must know statistics is down-right stupid. It comes down to what you do... what I do doesn't require it, so what's my incentive for even knowing about it?
I think knowing basic logic (getting your ands and ors correct the first time) and generally knowing about good design and having solid debugging skills is much more important for the average programmer. Anything more than that will push you beyond average, but knowing statistics is not necessarily the correct answer. It all comes down to what you do... That is my 2 cents anyway, take it how you want.
" I have taken a bunch of math classes, studied statistics in grad school, learned the R language, and read tons of books on the subject. Despite all of this I'm not at all confident in my understanding of such a vast topic." I'm presented with 1 of 2 scenarios. Either he is smart and I should not bother studying statistics because it is vast and complicated and should only do research on a as needed basis. Or He is stupid. And I should just ignore the guy completely.
As a scientist working with computer models, I can only say that statistics is quite often king (and the goal of the calculation).
Histograms, histograms, histograms :)
You probably still think I am a lunatic, but hear me out.
You don't qualify as a lunatic; just as someone who has no idea of what he's talking about. Absolutely no idea. Your post, my friend, is so full of ideas you obviously misunderstood that I won't even attempt to make a list.
And yes, I do statistics for a living.
Really, the only way to deal with this situation is to learn how to use a firearm. Make sure that your manager understands that compassion leads to compassion and that brutality leads to lethal brutality.
Also, before you take the law into your own hands, consult with a lawyer. Some states allow a plea of insanity during the murder trial. Others do not.
Here's the bottom line. Firearms are much more useful than mere statistics unless, of course, you intend to use the firearm to turn your boss into a "statistic".
One thing I've noticed about probability and statistics is that there is an astounding number of dumbed-down, "psst dude... here's all you need to know" type books on the subject(s), written in all different styles. Every year the bookstores get a few more, evidently at the expense of some of the older ones, and this has been going on for decades. I assume that this is matched by a similar number of web site tutorials, though I haven't looked.
I don't think this helps the average professional any more than the profusion of diet plans.
ditto!
Please mod me 1 or troll. It's where the truth is these days, even on Slashdot. Beware the power of moderators everywh
...you and your colleagues are clueless when it comes to what you went to grad school for and you're trying to tell programmers how to be more... efficient and knowledgeable about what they have been doing for the majority of their lives.
You also mention how you NEVER seem to have a problem with females as they are more rational when it comes to programming. I think it might be a little deeper than that. Maybe females are more kind in the sense that they put up with your statistical drivel since you give off that "I love to hear myself talk till I annoy the person next to me" vibe. Or maybe it's because you try talking to them instead of shoving numbers down their throats that don't seem to make any sense even to you yourself.
I'm honestly surprised a female programmer hasn't just smiled and nodded at you for the sake of being polite just so you would shut up and sod off and go bug the person in the next area.
I guess what I was trying to convey was if you really have that much programmer envy that you have to rant to /. about how much you hate programmers, then you clearly have chosen the wrong profession.
Good luck trying to figure out statistics, but I'm guessing if you haven't figured it out by now it will not ever happen.
-AC
PS: It sounds like you need a hug.
So, since so many people don't seem to want to actually read Zed's stuff -- and I honestly don't blame you -- I'll try to summarize:
Eventually, every major science adopted an empiricist view of the world. Except Computer Science of course.
He tends to bitch a lot about computer scientists. I'm just starting a CS degree, and there is a Statistics class in the curriculum. Is he working with people with good degrees, people from a technical college with a "programming" degree, people from a diploma mill, or high school students with no degree at all?
Of course, he seems to be implying it's everyone, and doing so in a typically Zed-like way.
"All you need to do is run that test [insert power-of-ten] times and then do an average." Usually the power-of-ten is 1000...
I don't know that I've ever heard that particular statement. But it's a good point:
How do you know that 1000 is the correct number of iterations to improve the power of the experiment?
Generally because it was probably closer to a million, so I'm erring on the side of taking more, rather than fewer, measurements. But without careful consideration, I could be way off.
How are you performing the samplings?
I think this is vastly less important than how you are dealing with the data, but it is also a good point. For example, his complaint is that an average isn't enough; with detailed enough logging, he could easily go back into my data and figure out min, max, standard deviation, histograms...
How do you know that 1000 is enough to get the process into a steady state after the ramp-up period?
Not a huge deal -- the "steady state" will almost certainly be faster than the "ramp-up" period. Worst case, I'm over-optimizing.
What will you do if the 1000 tests takes 10 hours?
Either ctrl+c, or try it 10 times.
How does 1000 sequential requests help you determine the performance under load?
Very good point here. It's still a useful statistic, but you still need to measure things like 1000 simultaneous requests, not just 1000 all in sequence.
On the other hand, if your performance is acceptable with them all in sequence, you could just run it through something like Event Machine, so it's all sequential on production, too.
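To make the points above about ramp-up and raw data concrete, here is a minimal timing harness in Python. It is only a sketch: `time_samples`, `work` and the run counts are invented for illustration, and it only measures sequential calls, so it says nothing about behaviour under simultaneous load. The idea is to throw away the warm-up calls and keep every individual measurement, so that min, max, standard deviation or a histogram can be computed later by whoever wants them.

```python
import time
from statistics import mean, stdev

def time_samples(func, runs=1000, warmup=100):
    """Call func() warmup + runs times and return the per-call durations
    (in seconds) of the last `runs` calls only."""
    for _ in range(warmup):
        func()  # ramp-up calls, deliberately not recorded
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples

def work():
    # Hypothetical stand-in for the request or query being measured.
    sum(i * i for i in range(1000))

samples = time_samples(work, runs=200, warmup=20)
print(f"n={len(samples)} mean={mean(samples):.6f}s sd={stdev(samples):.6f}s "
      f"min={min(samples):.6f}s max={max(samples):.6f}s")
```

Whether 200 or 1000 runs is "enough" is exactly the question being argued; the harness just makes sure the raw numbers survive for someone to check.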
The most troubling problem with these single number “averages” is that there’s two common averages and that without some form of range or variance error they are useless. If you take a look at the previous graphs you can see visually why this is a problem. Two averages can be the same, but hide massive differences in behavior...
So yes, always make sure you can record enough statistics so that someone else can come along and use your data to give you something meaningful.
The moral of the story is that if you give an average without standard deviations then you’re totally missing the entire point of even trying to measure something. A major goal of measurement is to develop a succinct and accurate picture of what’s going on...
It doesn't have to be statistically accurate. It just has to be close enough.
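A small illustration of the "same average, different behaviour" complaint, in Python. The two lists of response times are made-up numbers (milliseconds, hypothetical), chosen so that both average exactly 100; `summarize` is just a helper name for this sketch.

```python
from statistics import mean, median, stdev

def summarize(samples):
    """Return the handful of numbers that a bare average leaves out."""
    return {
        "mean": mean(samples),
        "median": median(samples),
        "stdev": stdev(samples),
        "min": min(samples),
        "max": max(samples),
    }

# Hypothetical response times in milliseconds; both lists average 100.
steady = [100, 101, 99, 100, 100, 100]
spiky = [50, 50, 50, 50, 50, 350]

for name, data in (("steady", steady), ("spiky", spiky)):
    print(name, summarize(data))
```

Reporting only the mean makes these two look identical, while the standard deviation and the max show that one of them occasionally takes several times longer than usual.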
Ah, confounding. The most difficult thing to explain to a programmer, yet the most elementary part of all scientific experimentation. It’s pretty simple: If you want to measure something, then don’t measure other shit.
This is both a very good and a very bad idea. It ties into the peeve he had before -- ramp-up time. For example:
If we want to take one single line of code and test it then we can. If we want to only verify one single query on a database then what’s stopping us?
What's stopping us is that our applications don't actually work like that.
Best. Troll. Ever.
Yes, I do think you are a lunatic.
Thanks, I am honored. /.ers into it too...
I actually have degrees in mathematics, and I have a sister with a Ph.D. in statistics. We have had this discussion most Yules we get together, and it is fun to get some
don't cut it off www.mgmbill.org
"... since they were probably told in college that logic and reason are superior to evidence and observation.'"
Oh, so they were taught Bayesian rather than Frequentist statistics?
"Statisticians need to learn programming or I will kill them all." - by halivar (535827) on Saturday January 09, @06:43PM (#30710618) Homepage
Well put, halivar! Now, I'll add to it, as I have backgrounds in both areas he "bitches here" about.
First of all:
I'm in possession of degrees from both the business world (where I took STAT 1 & STAT 2 & "aced" both w/ A grades no less) & also Comp. Sci. & CIS concentration/minor (where you get exposure to a good deal of "higher mathematics" such as Calculus, & Discrete Math to name only a couple possibles)...
LOL! Man... I "just loved" (not) his "logic & reasoning is inferior to evidence & observation"...
(Especially since I know 1 VERY important thing: That stat teaches you 1 extremely IMPORTANT concept: It's ALL BASED ON SAMPLE SETS...)
As to "sample sets"? Well, those are USUALLY either:
----
1.) EASILY SKEWED (as in "4/5 dentists chew trident", oh "sure, sure", especially when they're on the corporate payroll (or paid off to say so by said corporation so their "evidence & observation looks good")
and
2.) IS THE SAMPLE SET LARGE & COMPREHENSIVE ENOUGH? (most?? Most are not, period)...
----
Simply because you cannot:
A.) Sample EVERYONE
B.) Nor can you judge the veracity & accuracy of who you are sampling!
----
E.G. #1 - Let's say I had a poll question of "Are Democrats better than Republicans?" & I sampled from a PRIMARILY REPUBLICAN AREA - So, that all "said & aside"??
What kind of answers do you think I'd get???
Would THAT be a "good/fair & representative sample set"????
Answer = Hell no!
Math people sometimes make me laugh... especially when they *THINK* they "know it all".
Life's a BALANCE, people, & there are very few "absolutes", because people are not "binary". Human beings have a LOT of "shades of grey" (or, is it "gray"?? Inquiring minds, want to know, lol!)
APK
P.S.=> Personally - I feel that life's REAL answers & REAL problems, in my estimation & opinion, aren't going to even be answered by "hard sciences" alone...
I actually tend to think that the REAL ANSWERS (for the REAL problems) will come from philosophers really!
(E.G. #2 - The serious questions to answer, like "why is man unjust to man" for example).
Yes, THAT coming from me may sound weird, especially coming from someone with fairly extensive classical education in the business sciences & computer sciences here in myself, but I do hold to that (and, all the math that comes with them like STATS, CALC, DISCRETE MATH, etc. et al, from the 'hard sciences'? They're JUST TOOLS that others should definitely use, but not "base all" on them, either, because they too can be misused, as in the examples above I note from stats itself))... apk
Stats before calculus are just memorize and regurgitate.
Take stats as an undergrad but after you finish calculus so you have grounding to understand.
Not just puke formulas back out onto an exam paper.
From his complaints, I can tell knowledge isn't the real issue. Testing performance takes a huge amount of time. You need to simulate other programs running, multiple users and make sure the test matches what real users might do. Generally, this requires writing completely independent test programs and charting the logging from them. People just don't want to go to that kind of effort. It can take weeks just to create proper tests for complex programs like web servers.
"they were probably told in college that logic and reason are superior to evidence and observation" is just the kind of ridiculous statement that a "statistician" would make. It's not even real math. It's math perverted to try to justify someone's bias. Math is math. Statistics is math with subjectivity added. Mostly trash with a veneer of math to try to legitamize it.
Like anyone is going to take career advice from a ranting asshat who admits he doesn't know what he's talking about, right after threatening to kill people. I hope this jerk gets cancer, aids and no cotten candy.
this guy's an idiot. he admits to not knowing the subject matter well but still wants to chastise programmers for not being experts?!! that's his first epic fail. his 2nd is that programmers aren't meant to be experts in every area, only at programming. people that have double degrees and years of experience in a field are the only ones who should be, and they will be in lead roles. his 3rd fail is how he makes his argument; it reminds me of a child throwing itself on its back and kicking its legs till it gets its own way.
If you mod me down, I will become more powerful than you can imagine....
The use of statistics is a means to an end that never ends. It has its uses in specific situations, and programmers trying to reach these ends in those specific situations would be well-off to know statistics? OK, I agree. If you are programming a data-mining application, then knowledge of probability and statistics seems pretty important. If you are programming a plane to land automatically on a runway, or a robot to place a chip on a board, then I want precision, not probability. (Although precision is probabilistic in itself.)
What Zed is describing is a situation where statistics could greatly improve the performance of the whole system, and he looks to be right. And that may be the real problem: He's more committed to being right than to resolving the problem.
I would say this is more a "people problem" than a programming problem. Placing blame, telling people they are ignorant, hostile language and the like are not leadership qualities.
There is another aspect here that interests me; the type of programming methodology. If this type of project were approached as a monolithic project, the scope, means and tools would be apparent before the project got to the argument stage. In an "agile" environment, the lack of pre-defined methodology would show up as part of the tweaking/improvement process. Picking the right method might be very important to alleviating the problem of the project with the "long tail" (i.e., the project that seems almost finished but there are a million little things to finish to make it deliverable).
"The mind works quicker than you think!"
Given your exposé of the facts on Slashdot, and the way you describe your colleagues and your own understanding of stats, I would say there is a 90% chance you are wrong and they are right. Or maybe 95%.
I've been doing J2EE apps for 10 years and now that we are sending a rocket to mars on our next project, I'm so sorry I didn't spend my whole life learning statistics.
" I've been studying it for years and years and still don't think I know anything."
Exactly, dumbass.
The first rule of programmers: whatever is the most expeditious path to the most usable solution is the one a programmer will take. The great skill a programmer has is the ability to assimilate and apply new information in as short a span of time as possible. If it takes years in order to not use and apply something, you can forget about a programmer ever bothering.
I scream. You scream. I assume that means we're both acquainted with the problem. We proceed.
I can vouch for this. You might think AC just spends all his time on /. but the reality is that he's a real big-shot who can afford to make ridiculous claims.
Always back up, never back down. ---- Think you're cool 'cos your uid is prime? Take mine, modulo the one digit integers
This is a vast right wing conspiracy backed by Fox news "Fair and Balanced".
is one half mental.
of course that explains why 90% of all programs written are CRUD.
-with apologies to Yogi Berra, Theodore Sturgeon, and a 20% apology, as a matter of principle, to a guy called Pareto.
Where are we going and why are we in a handbasket?
Despite all of this I'm not at all confident in my understanding of such a vast topic.
Some people are a little slow, but stick to it, you'll get there eventually.
The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
This reads a bit like the thread on the college sysadmins running the shop. Think: along the lines of over-education and not enough experience coloring one's view of the situation. See also: when you've got a hammer, everything looks like a nail.
I'd say odds are that, with someone (anyone) who's highly educated in a specific field, they tend to try to apply that discipline to everything in their lives. The welder who has metal tables and chairs, the woodworker with an oak-everything house, and the mechanic with a V8 lawn mower/snow blower are all good examples of this. Managers who think something is a "morale problem" (and not a management one) or programmers/geeks who see a social problem as one that can be fixed with computing are also examples of this.
This doesn't necessarily mean these specialized-discipline people are necessarily wrong, but it does mean they're contentious and self-righteous assholes. Statistics might help. A wireless computer in your fridge might help. So might a V8 lawn mower (that'd be fucking cool!). But chances are such things are impractical, expensive, and/or coming from an over-extension of assumption.
And sometimes, a gut feeling is as good as (or better than) a well-reasoned and thoroughly informed opinion.
Life's a crap shoot. Sometimes you can't reduce everything to numbers.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
The Zed Effect: Whether you're right or wrong people will disagree with you just to piss you off.
Life's a BALANCE people, & there are very few "absolutes", because people are not "binary". Human beings have a LOT of "shades of grey" (or, is it "gray"?? Inquiring minds, want to know, lol!)
The answer to this important question is grey. I read it in a book so it has to be true
Before computers, stats involved using parametric tests (t-tests, ANOVA, etc.) which made assumptions like "the data comes from an underlying normal distribution". BTW, in stats terms "normal" means "Gaussian".
Now, with cheap and fast computers, we can actually compute the confidence intervals non-parametrically through permutation tests and bootstrapping without assuming anything about underlying distributions. In most cases, this non-parametric test is the "right thing to do". Most of the time, the results are the same as using a parametric test.
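To make the bootstrapping idea above concrete, here is a minimal sketch of a percentile bootstrap confidence interval using only the Python standard library. The function name `bootstrap_ci`, the 2,000 resamples, and the made-up response times are all illustrative assumptions, not anything from the post above.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read the interval straight off the sorted bootstrap estimates.
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical, skewed response times in milliseconds (clearly not Gaussian).
rng = random.Random(42)
times_ms = [rng.expovariate(1 / 20) for _ in range(200)]
low, high = bootstrap_ci(times_ms, stat=statistics.median)
print(f"median = {statistics.median(times_ms):.1f} ms, 95% CI roughly ({low:.1f}, {high:.1f})")
```

No distributional assumption goes into the interval; the only assumption is that the sample is representative of whatever you are measuring.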
However, a HUGE disaster in empirical science has been the problem of multiple comparisons. With computers it is so easy to compute correlations and significance tests between every possible slice of your data set. Many "scientists" don't have good statistical knowledge and pray at the altar of "p < 0.05". They don't know about or understand the problem of multiple comparisons. So they do 20 tests, find one that comes out p < 0.05 and write a paper about it. They don't get that if you do 20 tests you are more likely than not to find one that comes out p < 0.05 (about a 64% chance if the tests are independent and nothing real is going on).
Anyone who has access to excel or matlab can do this little experiment.
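samp = randn(50, 1);             % 50 normally distributed random numbers (mean=0, var=1)
sig = zeros(1, 100);             % preallocate the results vector
for x = 1:100
    test = randn(50, 1);         % a fresh sample from the same distribution
    sig(x) = ttest(samp, test);  % 1 if the (paired) t-test rejects at the 5% level, else 0
end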
Now look at the sig vector. OMG, roughly 5% of the tests came out significant!!!
Now you are writing a paper all about how x is linked to y. But you are essentially throwing dice and then writing a paper about why it came up '3-3'.
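The arithmetic behind the "20 tests" complaint can be sketched in a few lines of Python. This assumes the tests are independent and every null hypothesis is true, in which case each p-value is uniform on [0, 1); the function names and the 20,000 simulated "papers" are hypothetical choices for illustration.

```python
import random

ALPHA = 0.05

def family_wise_error(n_tests, alpha=ALPHA):
    """Chance of at least one false positive among n independent null tests."""
    return 1 - (1 - alpha) ** n_tests

def simulate(n_tests, n_papers=20_000, alpha=ALPHA, seed=1):
    """Fraction of simulated 'papers' with at least one p < alpha.

    Assumes every null hypothesis is true, so each p-value is drawn
    uniformly from [0, 1).
    """
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < alpha for _ in range(n_tests))
        for _ in range(n_papers)
    )
    return hits / n_papers

print(f"analytic:  {family_wise_error(20):.3f}")
print(f"simulated: {simulate(20):.3f}")
```

Both numbers land near 64%, so "I ran 20 tests and one was significant" is closer to a coin flip than to a discovery.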
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
Hm...
Statisticians are like designers... they should stick to designing (or statistics, as it were).
I.e., do what they are good at. At my work we hand off these parts as modules. Designers push back a form design. The statistician pushes back some algorithms written in a high-level language. I really do treat them like library calls.
In post Patriot Act America, the library books scan you.
No, not necessarily. Running "time cp file.dat copy", with a file.dat of 88MB takes 0.083 seconds on my computer.
Would your conclusion then be that my computer has a disk capable of copying files at 1060MB/s? (It should be even faster when it ramps up!)
That would be complete nonsense of course. What happens is that the entire file goes into the write cache, cp returns almost immediately, and the kernel writes the 88MB over several seconds in the background.
If I copied a DVD image instead, it'd take a much longer time, and the "size/time" would be much closer to reality, because the file wouldn't fit in the cache.
So here you have an example where the steady state is much slower than the rampup, and where measuring too little would lead you to believe there's no performance issue at all, even if the disk is dog slow.
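A rough way to see the cache effect for yourself, sketched in Python. The helper `write_throughput` and the 8 MiB size are assumptions for illustration; the only difference between the two runs is whether the timer waits for `os.fsync()`. On tmpfs or a fast SSD the two figures may be close, so treat the output as a demonstration of what is being measured, not as a benchmark.

```python
import os
import tempfile
import time

def write_throughput(n_bytes, sync):
    """Return apparent write throughput in MB/s for a temporary file.

    With sync=False the timer may stop once the data sits in the OS
    write cache; with sync=True it also waits for os.fsync().
    """
    payload = os.urandom(n_bytes)
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        f.write(payload)
        f.flush()                 # push Python's buffer to the OS
        if sync:
            os.fsync(f.fileno())  # ask the OS to push it to the device
        elapsed = time.perf_counter() - start
    return n_bytes / elapsed / 1e6

size = 8 * 1024 * 1024  # 8 MiB, small enough to fit in most caches
print(f"cached : {write_throughput(size, sync=False):8.0f} MB/s")
print(f"fsynced: {write_throughput(size, sync=True):8.0f} MB/s")
```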
In practice, statistics is an attempt to quantify messy, uncertain events into a figure. We can even measure the extent to which this works, roughly speaking. Your hard drive has a rough time-to-failure, based on analyses of the things that tend to go wrong in that system. Sure, any time it fails, it's not statistics that broke it; it's one of the kinds of problems captured in the statistical analysis. And sure, you could break it down further for disks and note that the controller has a different failure rate than some other component, just as a bridge has a number of possible failures. Problem is, for any of those, you could break it down further and get failure rates for subcomponents, regions, etc. So what? It's still useful to have statistical measures - the real world is complex, and statistics helps us capture things we otherwise couldn't.
Programmers (particularly but not only young programmers) might not like to acknowledge any field but their own has any depth ("Everything is simple! Just do it my way", hence Ron Paul/Ayn Rand fanboyism and all sorts of other stupidities) - I don't know if there's a lot we can do but hope they grow out of it (It took me awhile to do it, as did a number of people I knew when I was younger, but I made it out).
Basically, if your worldview doesn't wed empiricism and a reasonably flexible practical philosophy, your worldview is (if you err on the pro-logic end) too inflexible and you're going to miss out on standing on the shoulders of giants. Neither the logician nor the mystic understands the world.
For every problem, there is at least one solution that is simple, neat, and wrong.
them all...
how does that sound, zed ?
can you make a statistic out of the output of Da vinci ? can you statistically value the work output that was required to create mona lisa ? can you statistically measure the effect of mona lisa in the beholders' psyche ?
or, can you compare the efficiency and work output of monet to rembrandt ? or bach to handel ?
let me tell you. you fucking cant. because these are not quantifiable. its ART. it requires muse, talent, luck, inspiration, experience, practice, stars aligned in the right time, EVERY other kind of shit.
what the fsck does this have in relation to programming then ? listen, for its apparent that you dont know about programming enough, even in your current situation :
programming is little different than art. some people type out 10,000 lines of code doing multitudes of things, yet all of these provide nothing. one person puts out 50 lines of code, and that provides nothing either.
but in 3 years time, something happens, and the entire company's ass is saved by that 50 lines of code.
a brutally simple example, and a common one, but this should be enough to make the point.
i would sell any shares in a company that statistically tries to quantify the programmers. for, it means that they are one company that dont know shit about information technology, and sure is a company to miss any potential genius or groundbreaker even if they get their hands on one.
Read radical news here
See subject-line above... & regardless whether you're being sarcastic or not?
(Which I think you are, lol, & I tend to agree w/ your statement though, as in E.G. - Sample MANY viewpoints & then, test yourself, above ALL else (to get the "right answer", but especially the RIGHT ANSWER FOR YOU, PERSONALLY, in things))?
Your reply does indicate that you read my posting completely @ least, which is sometimes a REAL "rarity" here... lol!
APK
P.S.=> "Onwards & upwards!"... apk
t I saw the author's name. Sorry Zed, I'm happy to learn some new math, but not from such a self-important asshole.
The reality is that a programmer who screws up the ODBC acronym probably makes less than the everyday Joe. So the challenge, offered by this 30-something everyday Joe, still stands.
your server app may be missing 1 out of 10 billion orders every month. or risk marking 1 out of 1 million emails mistakenly as spam.
1 out of that percentage may chance upon a big potential client's order/email, and your corporation may miss millions and you may even never know it.
or, 1 out of 10 billion some jerk may be able to hack your secure app and get millions of customer data.
see, it doesnt mean shit. some statistician coming up and saying 'its unlikely' or something doesnt mean shit. there is no statistics in this. it just should not happen.
Read radical news here
Yup. Also, for a guy who claims to know so much about statistics and measurement, it's weird how he judges programmers so sweepingly on the sole basis of his anecdotal experiences.
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson
I'm majoring in MIS at a university where Statistics is a required core course for every major, including the computer programmers. All along, I didn't get why I have to take it. I am now, and hopefully will get through it. I'd like my degree.
I run Ubuntu skinned to look like a Mac on a PC. Go figure.
Seriously?
... electrical and mechanical engineering, physics, FAA regulations, federal law, product configuration control, and lots of other stuff. Face it, for all but the most trivial applications, programmers need domain experts to define the requirements. Or your app will look cute, but won't work worth a damn.
Programmers don't need to learn AI. One of the best natural language recognition apps I've ever seen was written by a couple of mechanical engineers. So we've got that handled. You can get back to writing games. Actually, programming is just a skill set needed to solve those domain specific problems. Increasingly, this skill set is a part of the curriculum of said domain experts. It's easier to teach a chemical engineer (for example) programming than it is to teach a programmer Chem Eng. The idea that only the people in the white lab coats can handle the mainframes in the data center is so 20th century.
Statistics in its purest sense is simply math. Very few people know very much about this.
Statistics in the wild is generally bullshit! You should not be able to give two equally qualified people the same data set and receive two different answers!
As for statistics for performance measurement? If you are doing something important, then analyze worst-case performance. Statistics doesn't come into play in this case.
logic and reason are the enemy of religion. the whole age of enlightenment and the demise of religion and the advent of scientific age has been moving on those two. and they have never stopped moving on their momentum up till now.
Read radical news here
Speaking as someone with a postgraduate degree in pure math, I'll be the first to admit that the subject is very hard to really understand well. Statistics is founded on probability theory, which in turn is based on measure theory, which is based on generalized integral theory and mathematical analysis. It takes 4 - 6 years of continuous hard study to cover this material and really know it all. And only people who devote their professional life to it can do that.
At most one could hope that one develops a sense for high level statistics, but that also takes several years of exposure to concrete examples, since intuition often fails miserably when it comes to even discrete probability theory.
Statistics is really useful as a scientific/theoretic method of reasoning, but convincing business people or even practicing scientists with it is futile in my opinion.
As the island of our knowledge grows, so does the shore of our ignorance.
So I read through his article. Yes, the whole mindless rant. The conclusion that one should REALLY draw from it is: Zed Shaw is a douche with Asperger's who clearly feels like his own personal area of expertise is underappreciated. Hey Zed, get over it.
Down with the career politician! SUPPORT TERM LIMITS
I like how the first part of his Wikipedia article says "Zed A. Shaw is a troll" with four citations.
Well, it has never been successfully tested.
not understanding a topic that even you are unwilling to acknowledge mastery of.
Personally, I think that little acknowledgment increases his credibility quite a bit. It suggests to me that he's actually spent some real time coming to grips not just with the glossy overview you get in a high school or college course but with some of the devilish subtleties of actually using the stuff.
The funny thing about knowledge... the more it grows, the bigger you realize the frontier is. So, how good of a heuristic is apparent confidence?
Tweet, tweet.
And yes, I do statistics for a living.
Do you work with the statistics porn guy?
http://developers.slashdot.org/comments.pl?sid=1504756&cid=30710812
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
"I construct two sets of n=100 random samples from the normal distribution. Now, if I just take the average (mean or median) of these two sets they seem almost the same."
So it's true. The n's justify the means.
Today's vices may be tomorrow's virtues.
please tell me whether you would like to rely on decision theory, game theory or utilitarian techniques to handle life chances of your children or their sensitive private/critical information in a database.
Read radical news here
Well I can tell you that when I tell my boss that the project is 90% complete and I just have to finish the other 90% he, and every other SE I have said this to, knows exactly what I mean. This guy actually thinks that at times the sunset is a brilliant blue. He clearly doesn't get that how he perceives things is not the same as them actually being the way he perceives them, and so he freaks when smarter people than him don't care what he has to say. Luckily I learned from the available data I have that 100% of people named Zed Shaw want to kill me, so at least I have that going for me now ;-)
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Maybe Slashdot should have editors, so crap like this doesn't end up on the front page.
You forget to take into account that I'm way drunk on all the money that I made from writing the... uh... whatever the cunting fuckjizzle you kids are calling database apps these days.
If you were blocking sigs, you wouldn't have to read this.
Would your conclusion then be that my computer has a disk capable of copying files at 1060MB/s?
No, because you're not measuring disk at that point. That's confounding.
But it's a good point -- I suppose "ramp up" is a kind of confounding, anyway. I was just considering it mostly in terms like VM warm-up.
Don't thank God, thank a doctor!
I love pissing people off with this one. I can tell right away if they are even logical.
You have 3 doors, behind one is the prize. You pick one. I open one of the other doors that has nothing. I offer you the opportunity to change to the other door. What do you do?
You should switch, but why? and should you really? Is it 50% or different?
Point is, people will not always get it and that will turn them off. It will go against what they "know." They will have to change, and they just don't want to.
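For anyone who would rather check the three-door puzzle than argue about it, here is a small simulation sketch in Python. It assumes the host knows where the prize is and always opens an empty door other than your pick, as the puzzle describes; the function names and trial count are illustrative.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won under a fixed stay-or-switch strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(f"stay:   {win_rate(switch=False):.3f}")
print(f"switch: {win_rate(switch=True):.3f}")
```

Staying wins about a third of the time and switching about two thirds, so it is not 50/50: your first pick was right with probability 1/3, and switching wins exactly when it was wrong.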
Perhaps your reading skills are not so good. I was agreeing with *you*, and pointing out the lack of validity of the OP's generalisations.
Better luck being trolled next time.
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson
See subject-line above, & realize that the PARENT button (on your post, & see whom you actually replied to).
"Perhaps your reading skills are not so good. I was agreeing with *you*, and pointing out the lack of validity of the OP's generalisations. Better luck being trolled next time. - by LSD-OBS (183415) on Saturday January 09, @09:36PM (#30711942)
Your post WILL POINT TO MY POST as its parent... thus, it gave me the impression you were addressing myself is all.
(So, again, you are incorrect!)
APK
P.S.=> Not even a "nice try" on covering your screwup (but, on that note? We ALL "screw up" now & then, so perhaps you did so unintentionally (though I somehow doubt that, lol, it's still ok))... apk
Or perhaps you aren't the AC I was replying to, but rather Zed? In which case:
You can happily go and suck a fuck for the breathtaking amount of swollen, tumoros ego and self-importance you're throwing about here. You do know that an "appeal to authority" is rather a logical fallacy, no? And you do realise that, even if the above list of positions and titles were valid in this argument, they are still anecdotal evidence, right? Your above diatribe contains nothing more compelling than a reactionary ad hominem attack of no argumental worth.
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson
See subject-line above, & realize that the PARENT button (on your post, & see whom you actually replied to).
"Perhaps your reading skills are not so good. I was agreeing with *you*, and pointing out the lack of validity of the OP's generalisations. Better luck being trolled next time. - by LSD-OBS (183415) on Saturday January 09, @09:36PM (#30711942)
Your post WILL POINT TO MY POST as its parent... thus, it gave me the impression you were addressing myself is all.
(So, again, you are incorrect!)
----
"Or perhaps you aren't the AC I was replying to, but rather Zed? In which case: - by LSD-OBS (183415) on Saturday January 09, @09:45PM (#30712010)
NOW, lol, he "admits his screwup", hilarious... keep reading, it only gets better everyone:
----
"You can happily go and suck a fuck for the breathtaking amount of swollen, tumoros ego and self-importance you're throwing about here. - by LSD-OBS (183415) on Saturday January 09, @09:45PM (#30712010)
LOL, for someone trying to 'get the better of me', especially AFTER your screwup above? Well, again, see subject line (& review your PHILOSOPHY OF LOGIC, because again, per my subject-line above? Adhominem is not valid... period!)
In case you didn't realize it? One of the degree requirements in Comp. Sci. is LOGIC... that's only telling me you don't have a degree in that either. Yes, I have taken that too, so I just turned your puny attempt @ covering your original screwup to DUST... easily!
"too, Too, TOO EASY"
(My man - you don't even TALK a "good game", & you're just (as the saying goes) "not in my league" on any account here, period!)
APK
P.S.=> Not even a "nice try" on covering your screwup (but, on that note? We ALL "screw up" now & then, so perhaps you did so unintentionally (though I somehow doubt that, lol, it's still ok))... apk
Seriously. Seek help. You do know the meaning of the word "Yup", right?
Seriously. SEEK HELP. You have some serious people and communication issues.
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson
...or their favorite skill in CS programming know jack s*** about Pure and Applied Mathematics, across several fields of Physics, Engineering and more. There is a reason most Engineers have taught themselves several programming languages. They have a ton of subject areas to cover that may or may not be better served in one language over another. Yet, with all the necessary Numerical Analysis, Heat Transfer, Fracture Mechanics, Creep, Stress/Strain across non-linear boundaries you still see the bulls*** coming from Biology majors who go on to Grad school, take a few more statistical classes and act as if they are one up on Engineers in Pure and Applied Mathematics. Drag their lame asses down to the labs to explain the Fluid Dynamics going on in an axial flow fan and they quickly realize they don't know jack about jack. CS majors are the same way. They learn a damn programming language and walk around as if it's the answer to the Universe and not just a tool amongst thousands of tools to get work done.
Hell, most grad students in mathematics aren't running around bragging about their God knowledge of Statistics.
Just pick a field you love and live it
...if this made it on slashdot today.
i.e. Chart1.DataManipulator.Statistics.InverseFDistribution(.05, 3, 4)
See, that was easy!
But seriously, I have supported a fair amount of statistical analysis in life sciences. Most programmers deal with processes that run against each one of a series of things. IMHO statistics is more like report queries where you perform groupings based on features to find favorable conditions or data falling outside of expected norms.
Could I use a solid statistician to keep me from making errors? Sure. Do I need an overbearing 'keeper of the keys' telling me I'm wrong without offering any real help? Hell no.
Wherever You Go, There You Are
You know, that particular citation has made me wonder in the past, but not enough to actually research it. So, I went off looking for more information and found it.
The statistic was generated from a July 1976 survey.
The sample group for this statistic was 1,200 dentists. These dentists were hand picked by the research company, probably with good reason.
They were asked what advice they would give gum-chewing patients:
1) sugared gum
2) sugarless gum
3) no gum at all.
Sugarless gum got 85% of the vote. Not terribly surprising. I'd be fairly confident that their time had been paid for, or at very least they were told "This survey is being done for Trident Sugarless Gum." That is only speculation, so hush up.
17/20 doesn't really sound very good. It just doesn't stick in your head. 4/5 is close enough, even though it reduces your answer to 80% (ahhh, a lie). Since these are marketing folks, I'm sure they pushed all kinds of values past focus groups, until "4 in 5" was accepted as most favorable.
As the link cites, they're fairly confident that the "sugared gum" answer got at least one response. There's always someone that'll take the obvious wrong answer. If you don't believe that, look at any Slashdot poll. :)
What they don't say is how many of the 1,200 samples were dropped. I'm sure there were non-responses, and they could have easily added any number of unfavorable answers in as non-responses. Of course, they couldn't have 100% in their favor, so they had to keep some.
Serious? Seriousness is well above my pay grade.
This looked familiar, then I remembered that I read this years ago.
http://haduken.com/board/viewtopic.php?t=934&sid=ccd988ac3fa9146e94124c1228c4ac35
whatever his professional skills, I just hope I never have to work alongside a guy with such a foul mouth and attitude.
I did manage to hang on long enough to see that below the big "I'm a jerk" sign, there was at least some truth to his argument. Not that original or that strong to warrant such a hissy fit though.
the real question is: what's worse
- the swearing ?
- the attitude towards others ?
- the ego ?
- the lack of perspective ?
that guy should tone down the statistics skills and brush up on his social ones. Maybe we could send him a guide:
"fucking idiots devs need to do good like me myself I do, and stop swearing and belittling others like the foul-mouthed idiot fuckers they are they are and learn some modesty, politeness, and perspectve like my godly self, otherwise the world is gonna END !!!!!!"
The Cloud - because you don't care if your apps and data are up in the air.
..that you’re just too dumb.
Know nothing after years and years? So what’s the point then?
Sorry... I can think of several millions of more efficient, more useful and more fun things to do with my life.
I hear you, about people acting like they are experts, but actually knowing shit. Like someone having read a book about HTML, who now thinks he’s a cool programmer. Or someone who clicks together a default database front-end type application, and acts as if he could compete with someone who designs hard math algorithms in Haskell or writes an OS in C/Assembler.
But I think you put way more importance on statistics than is needed for programming. Because it’s your lovechild (nothing wrong with that). We programmers need to be good programmers. There’s only so much time in a day to keep up-to-date with all the crazy stuff going on in CS. There are few non-science jobs where you have to keep up so much. There’s simply no place for also becoming an expert in hardware design, graphics design, usability, physics, all the areas of mathematics, including statistics, etc, etc, etc.
If I need good statistics, I’ll hire you. As soon as you know that you know them. Because there is nothing more valuable, than someone who is in love with his work. Happy? :)
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Nice job man... I always wondered about that myself too! So, per my subject-line above? IF us AC's had mod points? I'd hit you up on some in upwards mod fashion, for +INFORMATIVE!
APK
P.S.=> I've also gotta give you "kudos" for effort & actually FINDING SOMETHING on it!
(Though, there is THIS reply in this very exchange -> http://developers.slashdot.org/comments.pl?sid=1504756&cid=30711306 that has to make YOU also realize, those doing the "checking" may themselves have been funded by Trident's competition too)...
LOL!
"Ain't life grand?"
In my estimation (going a BIT off track here though), this is the "how" of why attorney's have a GREAT job, because 'creating reasonable doubt'? It's TOO easy!... apk
Look, programmers tend towards the egotistical at best most of the time. They like to argue, even about marginally different concepts. I've watched guys argue about things like for loops and while loops and ifs and switches so many times in my career that I can only try and block as much of that inanity out. When you approach developers by TELLING them how to do something using statistical analysis, you've got to first convince their supervisor/manager/etc. of the value of it and why it's better. THEN you approach them and tell them that's how you're doing it. Otherwise, you better believe they'll argue about that...everyone has their own way of doing things, and you can bet they don't care for someone else telling them that the way they've done things in the past is all wrong. The only way to make programmers learn is to do something first, have it become successful, and be able to demonstrate the value in doing things that way first. I've been on very, very few teams with developers who were constantly open to different ways of doing things. Very few colleges even bother to put emphasis on statistics...some will even let you dodge the course entirely and take an equivalent. CS and software engineering professors generally fall in line and focus on logic. Obviously, it's a comfort level thing, and you can't get through to people unless you can demonstrably prove your approach.
Is this sort of flame really appropriate for slashdot? The casual language and superlatives discount any meaning in the note.
My preference is to read simple reporting on news here.
so called statisticians too that have no idea what they are doing... They barely know how to define a proper sigma field so that they can use statistics on their sample set correctly.
Very few people really grasp it... maybe as bad as one per major stats bureau.
So it's not just programmers.
Not saying here that I know all of it, but it sure is simple to poke holes in a lot of stuff.
... Isn't threatening to kill someone a crime in itself?
tihs isg mead fmro rcecydle tpyos
From what I've read, most of the responders here seem to have a poor grasp of what the field of statistics encompasses. Statistics is not just probability (in the form of flip a coin, choose a door, and poker hands), but can also be used to effectively design an experiment and to reduce the variation in a production line, among other things. Personally, I find statistics to be a rewarding field of study that is easily applicable in the real world. Just don't tell that to my classmates who stare at me as if I have sprouted extra appendages when I tell them I am not graduating with them because I'm extending my engineering degree with an option in statistics...
Programmers don't know statistics. Programmers don't know quantum mechanics.... Programmers don't know aerodynamics....
I'm just going to address what I remember from my stat class, and the class I'm TAing.
"All you need to do is run that test [insert power-of-ten] times and then do an average." Usually the power-of-ten is 1000...
I don't know that I've ever heard that particular statement. But it's a good point:
How do you know that 1000 is the correct number of iterations to improve the power of the experiment?
Generally because it was probably closer to a million, so I'm erring on the side of taking more, rather than fewer, measurements. But without careful consideration, I could be way off.
You would be amazed how FEW samples you need with good sampling to get a good estimate, why do you think when you look at polls there are usually only a few hundred samples and a smallish error?
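To put rough numbers on the poll remark, here is a minimal sketch. It assumes a simple random sample, a yes/no question, and the usual normal approximation; the function name and the sample sizes are made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample of size n; p=0.5 is the worst case
    and z=1.96 is the usual normal-approximation multiplier for 95%.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes, just to show how slowly the error shrinks.
for n in (100, 400, 1000, 10000):
    print(f"n={n:6d}  margin of error = +/-{100 * margin_of_error(n):.1f} points")
```

The error shrinks with the square root of n, so going from 400 to 10,000 samples only takes you from roughly 5 points to roughly 1. None of this helps if the sample itself is biased.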
How are you performing the samplings?
I think this is vastly less important than how you are dealing with the data, but it is also a good point. For example, his complaint is that an average isn't enough; with detailed enough logging, he could easily go back into my data and figure out min, max, standard deviation, histograms...
Sampling and results is a classic garbage in garbage out scenario. If you don't sample right your results are at best meaningless at worst they give you a completely wrong impression.
If you wanted to know the average income of a household in the US you wouldn't just sample from people in Silicon valley just before the bust, if you did that it wouldn't matter what kind of tricks you did to your data your results would be bad.
I wish I had mod points to give you...
They're not rejecting statistics as a field, they're rejecting his claimed expertise in it.
He's just as arrogantly claiming that he's right and they're wrong.
No he doesn't.
He claims that programmers need to understand statistics more. The people he is talking about are therefore not wrong - they are ignorant.
But that term is loaded with negative meaning; it's more accurate to say they are like a variable named "statistics" with a value that has never been set. Basically, they don't know what they are missing.
It's like when programmers try to argue about how a language is bad when they've never used it. How would they know? Yet many without understanding of statistics are saying the same thing, they don't need to know any more.
I know enough to know statistics can be a valuable tool. Why would you not want another tool that could help you? The people who refuse to do so are less than they could be (as programmers).
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Machine learning is the logical place to take a combined knowledge of programming and statistics. It's a much rarer skill and commands a much higher salary, plus you're doing the closest thing we currently have to predicting the future for a living - and you generally still get to code plenty.
In other words, statistical knowledge can be a significant career advantage in addition to enhancing development and debugging.
AC: Nah, this guy didn't screw up. He (LSD-OBS) replied to you (AC) because he (LSD-OBS) was agreeing with you (AC). That's why he said 'Yup.' Because he (LSD-OBS) was agreeing with you (AC). When replying to a post, many people (present company included) use the word 'you' to refer to the person they are replying to. Having exhausted the second person, third-person pronouns (such as he, she, or it) are used to refer to third parties. In this case, LSD-OBS' use of the word 'he' indicated that he found the author's (Zed Shaw's) sweeping generalizations strange. Honestly, I am a little concerned that you are getting so worked up over this.
I think you just proved this guy's point! Holy Shit!
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
"AC: Nah, this guy didn't screw up. He (LSD-OBS) replied to you (AC) because he (LSD-OBS) was agreeing with you (AC). That's why he said 'Yup.'" - by Fwipp (1473271) on Saturday January 09, @11:02PM (#30712400)
Per my subject-line above & because I absolutely HATE arguments (especially ad hominem name tossing ones) & I'd rather this didn't turn into a circus too:
I interpreted it differently! Albeit, with reasons, read on (& no, your comment didn't 'fall on deaf ears' my friend).
E.G.-> I saw it as his "Yup" being sarcasm, directed MY way, since he replied to me, instead of Zed as he said, is all...
HOWEVER, in my defense?
His use of "logic" (i.e.=> appeal to authority)? It FAILS in light of his name tossing -> http://slashdot.org/comments.pl?sid=1504756&cid=30712010
DIRECT QUOTE of LSD:
"You can happily go and suck a fuck for the breathtaking amount of swollen, tumoros ego and self-importance you're throwing about here. You do know that an "appeal to authority" is rather a logical fallacy, no? - by LSD-OBS (183415) on Saturday January 09, @09:45PM (#30712010)
(Which is use of ad hominem attacks & that? That IS invalid in logic... period).
AND, as to "appeal to authority"? I think that's a BIT STRONGER (in others of some authority or good repute are better than name tossing) I sure do!
AND
I also know what ad hominem is too, from my coursework in LOGIC (apparently you don't, or you just lost your temper, which is OK by me too, just like mistakes (we ALL make them & we all "lose it" now & then)).
To quote AGENTS from the Matrix? "ONLY HUMAN..."
"Honestly, I am a little concerned that you are getting so worked up over this." - by Fwipp (1473271) on Saturday January 09, @11:02PM (#30712400)
I'm not the one tossing the names & frothing @ the mouth though... & he, as someone who attempted to use logic on me, did (which he should NOT have, face it). Name tossing & such is the province of the ad hominem attacking troll, & we ALL know that much (especially around here, lol).
APK
P.S.=> AND, lastly? Hey - I'm a "big enough man" to say "sorry" sometimes, too, even when I am not COMPLETELY WRONG (& I did so in this posting no less) & I also noted that WE ALL SCREWUP now & then, in this very exchange... perhaps I did but... he did say this:
http://slashdot.org/comments.pl?sid=1504756&cid=30712010
"Or perhaps you aren't the AC I was replying to, but rather Zed?" - by LSD-OBS (183415) on Saturday January 09, @09:45PM (#30712010)
Which led me to believe he was just "covering up for his mistakes" is all.
Ah, again per my subject-line: I'm going to call it a draw & even say I am sorry (even if I am not wrong) to him, here (why not? I make mistakes too, & perhaps my assuming he was busting my balls since he replied to me & I interpreted it as sarcasm, perhaps incorrectly!) (The problem with English WRITTEN language? You cannot 'sense/hear tone' & like comedians say? It's ALL in the delivery...)
In any event - I'm actually GLAD you replied!
See, because IF this is a case of misinterpretation on MY part (E.G.-> I get trolled here quite a lot, & perhaps I have a predisposition to thinking others are being sarcastic is all & busting my chops/trolling me) Then - this, on MY part, I may have misinterpreted is all...
So, again - IF so, then, I apologize to him since I may have misread his meaning is all (sorry LSD)...
However, he DID fail on logic, via his usage of ad hominem attacks on myself in name tossing (especially first) here, for sure! apk
You would be amazed how FEW samples you need with good sampling to get a good estimate,
Well, actually, I'm counting on that when I just use a "power of ten".
Sampling and results is a classic garbage in garbage out scenario. If you don't sample right your results are at best meaningless at worst they give you a completely wrong impression.
That's why it's important to record as much information as possible from each sample -- at the very least, we'd know whether it's garbage. For example:
If you wanted to know the average income of a household in the US you wouldn't just sample from people in Silicon valley just before the bust, if you did that it wouldn't matter what kind of tricks you did to your data your results would be bad.
Well, no, one obvious trick is to say, "Hey, all of this is from people in Silicon Valley just before the bust." The next obvious trick is to then combine those samples with the same people after the bust, and with other people elsewhere -- then you not only correct the error, but you get a sense of the difference between Silicon Valley and elsewhere.
My point here is that it's a hack for a programmer like me, who doesn't understand statistics (much), to make it easier to work with someone who does.
I believe the general principle here is called "data porn".
Don't thank God, thank a doctor!
"... somehow, I expect that my previous post will fall on deaf ears." - by Fwipp (1473271) on Saturday January 09, @11:03PM (#30712408)
Nope, see here -> http://slashdot.org/comments.pl?sid=1504756&threshold=-1&commentsort=0&mode=thread&pid=30712400#30712518
(AND, thanks for replying: I would have "cleared it up" with an "artful compromise" to LSD directly in fact...)
I'd like to think of myself as a logical person (MOST of the time, until I get attacked & THEN? Then, I respond in kind, & in terms the opposition uses, so I "speak to them in a language they understand" ONLY/apparently, is all). I am also "big enough of a person" to say 'sorry' once in a bit too... even IF I am not wrong, or not totally correct either. In order to avoid a fight? Sometimes, it's necessary & GOOD imo to do (and not just to 'placate others' only either).
APK
P.S.=> Let's assume, again (per my URL above you ought to note, so you realize your reply "got me thinking" etc. et al), that for fairness' sake (because I can't "hear his tone" in his (LSD's) reply, & as comedians say? It's ALL IN THE DELIVERY), I misinterpreted his "YUP" as sarcasm (which I did, perhaps unfairly on MY part): However??
IF you are a logician (which is what LSD tried to use on me in fact)...???
You must admit that "ad hominem attacks" are NOT valid in logical debates when attempting to deconstruct the arguments of others too!
Fair enough?
So again - Please see that reply of mine above in the URL I posted (in response to your reply, which as you can see, has not "fallen on deaf ears" here)... & thanks for replying (no sarcasm/totally sincere)
Lastly - I only wish LSD stuck around, because I re-read his reply afterwards (& considered what you noted too)... just to clear this up, & avoid a fracas that is needless... apk
well kid, you're a fucking idiot.
I read this post a couple of years ago. Why is it just now making Slashdot? According to the wayback machine, this essay must have been written in May of 2006.
The word you are looking for is Densan.
I hear you, I do performance engineering of web based systems. The developers, the managers, the testers, the architects all have no clue. You are correct here.
However if you can not present your "theory" of how to do something in a dumbed down enough format then who cares. Because the pretty graph is pointless. It will be mis-interpreted, mis-understood, and mis-used.
All the stats theory on the planet will not get you past the dumb manager or developer. Don't lose sleep over this. There is no point. Simply find metrics in your analysis procedure that do mean something to these people. They may not be the total picture but they are something. Build a reputation for being correct by starting with simple things. You are always going to butt heads with a know-it-all developer / architect / manager. Fine, let them go off and waste money and time. They will be found out as morons in time. You do your thing and simply become the guy to ask about performance and how to do this.
Being understated and consistently showing above average results for your work is how you will rise up. Being an A-hole about it is not going to help anyone. As a matter of fact I would can your butt for being a D#ck.
just because they are geeks, that is reason enough
Logic and reasoning are superior to evidence and observation. That's very basic epistemology.
Are people in statistics actually told differently?
You can find a reason why a programmer needs to learn anything and everything - but that's not practical. I have no qualms about hiring a statistician for special programming work - any one worth their weight is somewhat familiar with tools and languages. As a programmer I'd rather find a reason for: Why Statisticians Need To Learn Programming! The statistician has much less to learn.
I work as a programmer. I have qualifications in both Computer Science and Statistics, and an enthusiasm for statistics. I have looked for opportunities to put Statistics to work and found few.
A lot of statistics is for when experiments are expensive and you must have the right answer - e.g. they take a year and the results need to be of publishable scientific paper quality.
On Friday I ran very informal experiments to find a memory leak. Each test took five minutes and I knew when I had the right answer because, after staring pretty hard at a small section of code and looking back at the small print of the specification of the routine I was calling, I realised what the problem was.
The best strategy in this sort of situation is not one of those described in the excellent "Statistics for Experimenters". I made use of the ideas of designing an experiment to answer a question (putting a loop round segments of code to test them, and running the dodgy code once before the loop so that if correct the loop should lead to no net memory consumption at all) and of looking at relative measurements to increase precision (looking at memory size before and after the loop and subtracting the two). I think these ideas came to me from statistics, but that's about all I used.
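For what it's worth, the loop-and-subtract experiment described above can be sketched in a few lines. This is only an illustration using Python's tracemalloc; the two routines and the leaked list are hypothetical stand-ins, not the code from the original bug hunt.

```python
import tracemalloc

_leaked = []  # hypothetical global that the "dodgy" routine keeps growing

def leaky_routine():
    # Stand-in for the suspect code: holds on to 1 KiB per call.
    _leaked.append(bytearray(1024))

def clean_routine():
    # Stand-in for correct code: the buffer is released on return.
    buf = bytearray(1024)
    return len(buf)

def net_growth(func, iterations=1000):
    """Run func once to warm up, then measure memory before and after a loop."""
    func()  # one-off allocations (caches, lazy init) happen here, outside the measurement
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        func()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before  # relative measurement: only the difference matters

print("clean routine grew by", net_growth(clean_routine), "bytes")
print("leaky routine grew by", net_growth(leaky_routine), "bytes")
```

If the code is correct, the difference should stay near zero however many iterations you run; a leak shows up as growth proportional to the loop count.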
...Zed Shaw is a cranky, irrelevant whiner 96.3% of the time, at least according to the lambda standard deviation of the probability factor. Or so the graph shows, when enough data points are confabulated by the denominator of the sigma variation. And he thinks HE knows statistics.
This is a hacked account, for which the owner can not be held responsible.
I think you are vastly overestimating:
1) the quality of your future coworkers
2) the quality of commonly held CS degrees
3) how much of their education you or anyone else remembers five to ten years after leaving college
Zed is an asshole but he is correct.
Degrees or Degree?
And how does your sister's education reflect on you? She's the stats person, not you.
Am I the only one who found that article hilarious?
A 6'2" "Good Looking" graduate whose extensive research into programmers has discovered that all males are innumerate neanderthals and only women really understand him.
Sigh. He's so sensitive. :-)
If only there was some other profession where people were trained in test coverage and such. We could call them "testers". Maybe I'll patent that idea.
Statistics show that statistics work!
F you, i haven't finished with "THE ART OF PROGRAMING"
Is the fact that most people program software by theories and think that they will get best performance when they apply their pet theories to a development project.
But what he really is saying is that in order to verify that the solution actually works it's also important to measure how well it works and time each stage in a process. That can actually yield some very surprising results and reveal that you lose a kiloton of performance on something that you never expected to be a problem.
I have encountered those kinds of problems several times - network lag, missing database indexes, stupid compiler, horrible third party database libraries, slow disks... All revealed by timing the process.
So it's actually only part of the statistics process - the part where it comes to sampling data and understanding it. There is often no need to do standard deviations and things like that when analyzing a software package. Many performance improvements from tuning are far bigger than 10%; you can get a 10 times improvement on some operation. But of course there are small ones too, and those are usually not worth the effort.
And sampling of data can be done with things as simple as print statements or by using a package like Purify Plus.
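As a rough sketch of the print-statement approach, assuming Python and nothing beyond the standard library (the stage names and the work inside each stage are placeholders):

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> accumulated seconds

@contextmanager
def timed(stage):
    """Accumulate wall-clock time spent inside the with-block under a stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start

# Placeholder stages standing in for e.g. a database query and post-processing.
with timed("load"):
    data = list(range(100_000))
with timed("process"):
    total = sum(x * x for x in data)

# Print the slowest stage first; that is usually where to look.
for stage, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage:10s} {seconds * 1000:8.2f} ms")
```

Wrapping each stage separately is what exposes the surprises: the slow part is often not the one you expected.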
And no - Zed Shaw isn't a total jerk, that's wrong. But he is a pain in the ass for some people. Especially for project managers and programmers.
He is right about the importance of analyzing software, but it's not really necessary to plow into the realm of standard deviation and small differences when doing so. It may still be good knowledge to have when developing a software package, since you may not be able to throw your data into Excel for further processing.
And you should also beware of trying to optimize too much, because one optimization may actually result in worse performance somewhere else. Just check where it will be most efficient from the overall perspective.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
It's a probable fact that in casual conversations, 48.35% of all statistics are made up on the spot. However, if the person talking is a politician, the number rises to 94.7%. If it's a politician trying to get elected, 100%, 19 times out of 20. SO THERE!
I don't see how statistics helped find out that DB2 was the problem.
Wouldn't the same conclusion be found without calculating the standard deviation of anything?
Each of the above 3 professionals has their own area of expertise. And Statistics (such as needed in performance estimation or dimensioning of processing capacity) simply isn't part of the average software engineer's background (let alone that of a code monkey). You wouldn't want a Statistician to code up a decent interpreter, right? I mean: just look at the R interpreter. How about letting a Mathematician design and code your GUI? No takers?
By the same token you wouldn't want a programmer to design a Markov Chain Monte Carlo simulation. That's because programmers know nothing about Markov chains, the length of startup periods, periodicity of a chain, absorbing states, or invariant distributions. Worse yet, they have no way of knowing if their code spouts nonsense or the right answer with a lot of noise. It's not their area of expertise. You also don't want a mere programmer to set up a numerical approximation. I mean: just look at the jackasses that coded up the Patriot timer and made the most elementary mistake in the book of numerical analysis by using a floating-point value as a loop counter and allowing it to accumulate roundoff error. That's a mistake first-year undergraduate engineering and maths students make before they are marked down for it.
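The roundoff mistake described above is easy to reproduce. This is a generic sketch of a clock that adds 0.1 per tick in floating point versus one that counts integer ticks; it is not a reconstruction of any real system's code, and the tick count is arbitrary.

```python
def accumulate(ticks, step=0.1):
    """Add `step` to a float once per tick; every addition can introduce rounding error."""
    t = 0.0
    for _ in range(ticks):
        t += step
    return t

def from_integer_ticks(ticks, ticks_per_second=10):
    """Count ticks as an integer and convert once; only a single rounding occurs."""
    return ticks / ticks_per_second

# Ten tenths already fail to add up to exactly one.
print(accumulate(10) == 1.0)

# Over many ticks the accumulated error becomes visible as drift.
ticks = 1_000_000  # arbitrary illustration value
print("drift in seconds:", accumulate(ticks) - from_integer_ticks(ticks))
```

The decimal value 0.1 has no exact binary representation, so the fix is to keep the counter as an integer and convert to seconds only when a value is needed.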
So what does that mean? Well, one approach would be to shout: "HECK Programmers Don't Know Jack About Statistics And Need To Be Educated In A Hurry". That's the approach the author of the article takes. I don't believe that's a very fruitful approach though.
Another approach (the one I prefer) is to note that some engineering projects are of necessity TEAM efforts. Where you have a project lead who knows where the problem areas are, who is qualified to solve them, and how the team effort must be managed.
And yes, that means that sometimes programmers get to work under the direction (as in "are told what to do") of a specialist like a Mechanical, Electrical, Chemical, or Civil Engineer. Or a Statistician or a Mathematician for that matter.
On the other hand those specialists needn't be heard when it comes to things like database design, semaphores, inter-process communication, communication protocols, pre- and post-conditions, latency, cache filling, access control and the need for encryption and suchlike.
On still other aspects you may expect specialists and programmers to work together and talk to each other.
So, while the problems mentioned in the article are recognizable (and indeed well known), they don't necessarily mean that programmers should get educated. They should be part of a team, and be professional enough to realize that they are members of the team, not in charge of it.
Zed's a total asshole. No wonder the programmers don't like him and won't listen to him. Maybe if he spent some of that stats time working on people skills he'd find office life much more enjoyable.
All those moments will be lost in time, like tears in rain.
Zed is full of crap. At least in my CS undergraduate program, we were required to take a "performance analysis" class that answered basically all of Zed's questions, plus a whole lot more. Effectively, it covered basic statistics as applied to performance analysis, simulations, measurement techniques, and some basic queuing theory.
There are published CS papers that lack statistical validity - that's inexcusable. Anyone publishing a paper that deals with performance should either know enough statistics to publish a valid paper or have their paper reviewed by someone that does.
Expecting all programmers to understand statistics well is not reasonable. "Programmer" can include everything from someone who hacks PHP pages together for a living to someone who does research into new ML techniques or designs complex software systems. For the person hacking PHP pages together, statistical validity isn't a huge issue since the primary goals are getting a system that works and doing so quickly and with minimal cost.
I question their metrics and they try to back it up with lame attempts at statistical reasoning. I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.
I work with a number of statisticians and I have the opposite problem. They look at the data, apply mathematical transforms to it, and come to a conclusion, whether that conclusion makes any sense or not. They make little attempt to reason that the data may be flawed (which experiments often are), or may not really represent what we are trying to measure, or that they are using the wrong statistic to summarize the effect. It is very frustrating.
Basically what he is saying:
1) think about the factors that influence what you are measuring
2) look at the whole picture (i.e. draw a graph and not only from values, but also sequential output in case something changes in time)
And...well, that's mostly everything.
You don't need to know any statistics to understand these 2 points.
So they might realise the whole house of cards that stack ranking and HR’s beloved PRM systems are is flawed and invalid.
What will you do if the 1000 tests take 10 hours?
Either ctrl+c, or try it 10 times.
Why 10 times? Maybe 5 times is enough or at least 20 times is required?
It doesn't have to be statistically accurate. It just has to be close enough.
How do you know that you are close enough?
One can do a benchmark a couple of times to see whether the results are more or less the same. A more sophisticated approach is to measure the standard deviation as well. However there are situations where accuracy is critical. In that case one makes a distribution assumption (e.g. Normal distribution) and then a statistical estimator is used to give a confidence interval for the estimated parameter. I.e. the confidence that the parameter will be within that interval is 95%.
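A minimal sketch of that last step, assuming the run times are roughly normally distributed and independent. The timings are invented, and with this few samples a Student's t quantile would give a somewhat wider (more honest) interval than the normal quantile used here.

```python
import statistics

def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of the samples."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / n ** 0.5  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean - z * sem, mean + z * sem

# Hypothetical benchmark run times in seconds.
timings = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 10.4, 9.7, 10.1]
low, high = mean_confidence_interval(timings)
print(f"mean = {statistics.fmean(timings):.2f} s, 95% CI = [{low:.2f}, {high:.2f}] s")
```

If the interval is narrow enough for the decision you need to make, you are "close enough"; if not, run more iterations and recompute.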
This reminds me greatly of my previous assignment, where I had to work with (yet another) "difficult user". He had a Ph.D in statistics and sounded a bit like Zed. He had also done some work in datamining and data warehouses, so he started our first conversation by declaring himself an expert in my field. Great start :)
Of course, as it turned out he was just very frustrated with his colleagues because he couldn't explain his ideas. No surprise there: he tried to explain very advanced mathematics with formulas, to people who barely managed to get a high school education. After I provided an interface between the parties involved (my CS study came with a course in probability calculus so I could actually understand what he was doing) things went pretty smoothly from there on. My advice to this user when I left was "get a good communications training". He said his manager had been saying the same for about a year, but now that it was coming from me (a techie) he'd actually think about it :)
People who can communicate are paid lots of money. You can have all the skills, but if you can't access them, or combine them, you're not getting much use out of that expertise. Zed's article being a case in point.
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
We did some work involving statistics to correctly report results, see http://www.itkovian.net/base/statistically-rigorous-java-performance-evaluation (OOPSLA 2007) and http://www.itkovian.net/base/java-performance-through-rigorous-replay-compilation (OOPSLA 2008).
I am the Shield Anvil. And I am not yet done.
Statistics are very important when testing a system. You really need to know (especially if the bug was intermittent) what the probability is of NOT seeing the error per test run iteration.
It's not good enough to say, "It happens one in ten times, so if I run it 11 times I will definitely see the bug if it's still there."
The probability of not seeing the bug per test is 9 in 10 i.e. 90% or 0.9. These probabilities multiply, so if you perform the experiment (do a test run) 10 times, the probability of NOT seeing the bug (with the unfixed code) is 0.9^10 i.e. 0.349 or about 35%.
Would you be confident with that?
If you wanted a 1% probability (0.01) of not seeing the bug (in the unfixed code) how many runs would it take? Well, do your logs.
0.01 = 0.9^x
x=43.7
So you would need to run the test 44 times to have a 99% confidence that you'd fixed the bug.
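The arithmetic above can be checked with a short sketch. It assumes each test run is independent and that the bug really does show up in one run out of ten; the function names are just for illustration.

```python
import math

def prob_all_runs_miss(p_bug, runs):
    """Probability that an intermittent bug stays hidden in every one of `runs` runs."""
    return (1 - p_bug) ** runs

def runs_needed(p_bug, max_miss_prob):
    """Smallest number of runs so that the chance of missing the bug is at most max_miss_prob."""
    return math.ceil(math.log(max_miss_prob) / math.log(1 - p_bug))

print(round(prob_all_runs_miss(0.1, 10), 3))  # 0.349
print(runs_needed(0.1, 0.01))                 # 44
```

Strictly, 44 clean runs mean that unfixed code would have shown the bug with 99% probability, which is evidence for the fix rather than proof of it.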
Stick Men
Zed fired off an angry post yesterday after noticing he was slashdotted. It looks like some sort of retaliation swing for the onslaught of pissed off programmers gunning for Zed. http://zedshaw.com/blog/2010-01-09.html
My first thought was: is Zed on some heavy duty medication? He seriously has some sort of anger problem going on and a deep-seated hatred toward his idealized concept of the "programmer". Maybe a programmer made him feel bad so now he's got a vendetta. Programmers surely can be dicks. I know because I work with them, but Zed is coming off like a dick programmer times 1000. (I chose 1000 because it's a power of 10.)
If he wants programmers to listen to him and actually change their ways, why doesn't he go with the educator approach instead of going with the approach of flame the world, stomp my feet, and call everyone stupid until they pay attention to me? The best way to get someone to ignore everything you say is to call them an idiot jackass who can't remember anything after 2 minutes. They will kindly oblige by living up to your expectation.
This Zed character may be good at some things like stats but he's damned awful at communication and demonstrating tact. I wonder if he behaves this way on the job, because I would not want to work with such a caustic person. Maybe at work he keeps the anger under wraps and behaves like a great guy, but if I were his coworker I'd lose all respect for him after reading those 2 posts.
Camping on quad since 1996.
Check out your local weather forecast. "The normal high for today is..." But what's the standard deviation? If they tell you that the normal, or the average, is 15C and today's high is 25C - wow - that's way above normal. Must be global warming. Quick, send money to AlGore. But what if they also told you that the standard deviation for today is 12 degrees? Oh. Hmm. 25C ain't that significant. Cancel the cheque to Al.
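To see how unremarkable 25C is under those numbers, here is a small sketch. It assumes, purely for illustration, that the daily high is normally distributed around the stated average with the stated standard deviation.

```python
from statistics import NormalDist

normal_high = 15.0  # "normal" (average) high in C, from the example above
std_dev = 12.0      # standard deviation in C, from the example above
today = 25.0

z = (today - normal_high) / std_dev
# Probability of a high at least this warm, under the normality assumption.
tail = 1 - NormalDist(mu=normal_high, sigma=std_dev).cdf(today)

print(f"z-score = {z:.2f}")
print(f"P(high >= {today:.0f}C) = {tail:.2f}")
```

A z-score below 1 means a day like this should turn up roughly one time in five, so the average alone says very little.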
Statistics are worse than meaningless if you don't understand how to use them correctly.
linquendum tondere
Is he genuinely acknowledging it, though?
a) He's saying so to deprecate himself so that anyone saying "stop being an arrogant arse" gets punked like you just did
b) does he mean it? Because he's now made himself "even smarter" by the old saw "the first step on the path to wisdom" that everyone knows about.
"Why programmers need to learn how to communicate effectively."
If the only thing you are conveying is how superior you think you are and how angry you are at those you judge to be inferior to you, you are communicating the wrong message.
... I ran into a professor of statistics who said that computers were going to be a passing fad in his field.
To a Lisp hacker, XML is S-expressions in drag.
theory is what is needed, otherwise statistics does not mean much to anyone...
With probability theory one models, while statistics is used to estimate the parameters of a model.
That CRU "problem" is made up, either by you or by some troll on a climate thread.
I'm assuming you're talking about the temperature proxy from tree rings post 1960, rather than the Russian data that the CATO-like institute has not revealed has been dropped.
They threw out outlier data because they KNOW that the proxy had opportunity to reflect more than one variable.
That it had been agreeable with other proxies (INCLUDING other tree rings from different species of tree in different places) AND the temperature record for 110 years or more, it was a good proxy and reduced the opportunity for another proxy to skew data.
If you don't mean that, then what DO you mean?
And I take it you have where someone worked out that the confidence estimates
a) had to be redone
b) were not
because I find it impossible to believe that they changed the process and DIDN'T calculate confidence limits BASED ON THE DATA THEY ***DID*** use. That would have had to have been deliberate.
Just because you are perfectly right ... doesn't mean you aren't a complete and total asshole.
As a reformed asshole myself I can tell you that condescendingly pointing out the failures of your colleagues will not get you what you want. Specifically (and I'm assuming here that your goal is the same as mine) getting your colleagues to stop acting like self-righteous fucktards. Most programmers are convinced they are geniuses. This is crucial to understand if you wish to work with them and wish to get them to do anything at all.
I am ostensibly in a senior role in my day job and I do find many things these other programmers do ... well ... fucktarded. That is they are beyond retarded since a retard would know they are a retard or at least not entertain the delusion of superiority that a fucktard does. No my friends we need to call them fucktards because they are fucking arrogant in their belief of superiority. So I can't tell these geniuses to do anything. Nope. Not at all.
You need to use psychology on these fucktards. What you need to do is something Socrates used to do with his little fucktards that he taught. Ask questions. Since the genius/fucktard seems to know so much, start by asking leading questions that will do one of two things... it will lead the fucktard down a road that will show you both how stupid he is (and you can pretend they figured it out themselves; they love to take credit). Or it will show you where you were wrong... and that you were the fucktard.
Remember we are after end results. So we put aside lesser things (like pride) in the search for a greater goal which should be better software and the ability to make more of it. If you can psychologically manipulate an army of fucktards you will become fucking powerful. Much more fucking powerful than you fucking are on your fucking own. I wish you good fucking luck as I can tell by the response to your post that you are a fucking powerful personality and will definitely lead your own army of fucktards one day.
Hopefully when we meet on the field we can be allies and not enemies.
[signature]
Statistics are important; it is highly unlikely that anyone with an MBA will know how or why, but they want them.
In fact, it is almost a certainty that any given MBA will either lack statistical expertise or will misapply it unthinkingly in a cook-book style. The pseudo-statistics behind Six Sigma comes immediately to mind.
I had repeated theoretical discussions with the four MBA experts who "trained" us (a group of six PhDs in Physics & Engineering doing R&D) in the ways of Six Sigma. There were problems with the statistical theory they presented right from the start - and they were clearly unaccustomed to being contradicted along the lines of "that's not right/applicable in this case, and here's why". For instance, they failed to acknowledge that non-Gaussian distributions could exist, then refused to accept that procedures should be adapted to the data if it was non-Gaussian. Next, they adamantly refused to believe that the 1.5 Z shift hypothesis was supported only by a few studies, all relying on a single dataset from the 1950s for die-based manufacture, and totally irrelevant to most other processes. The Six Sigma books all say "many studies" over decades support the Z shift hypothesis, but fail to cite them, and our MBA experts could not cite any such studies either. Thirdly, they refused to accept that an additional mode of variability (not in the Six Sigma beliefs) existed in processes with feedback (such as recycle lines or controllers). In many cases, this mode guarantees non-Gaussian variability in the process output.
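For anyone wondering where the 1.5 Z shift actually enters the numbers, here is a rough sketch of the usual Six Sigma arithmetic. It assumes a perfectly Gaussian process (the very assumption being disputed above), and the names `upper_tail`, `SPEC_LIMIT` and `ASSUMED_SHIFT` are made up for illustration.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Illustrative constants: a spec limit 6 sigma from the mean,
# and the conventional 1.5 sigma long-term drift.
SPEC_LIMIT = 6.0
ASSUMED_SHIFT = 1.5

# With the mean drifted 1.5 sigma toward one limit, the near limit is only
# 4.5 sigma away; the far limit (10.5 sigma) contributes essentially nothing.
dpmo_shifted = upper_tail(SPEC_LIMIT - ASSUMED_SHIFT) * 1e6

# With no drift at all, both tails beyond 6 sigma count.
dpmo_centered = 2 * upper_tail(SPEC_LIMIT) * 1e6

print(f"defects per million with 1.5 sigma shift: {dpmo_shifted:.2f}")
print(f"defects per million with no shift:        {dpmo_centered:.5f}")
```

The commonly quoted "3.4 defects per million" figure only appears once the shift is assumed, and both numbers depend entirely on the output being Gaussian.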
Their advice was that to pass the course, we should ignore our knowledge of statistics (which they acknowledged was far better than theirs) and of process variability, and just "apply the documented methods". We did, and we all passed the course. Then we ignored the Six Sigma bogus statistics bullshit and got on with our jobs using proper statistics to analyze and solve problems in variability with the products we were developing.
MBAs seem to want statistics, but the vast majority appear to lack the training in how to generate proper statistics, or how to use them competently if someone else supplies them. Most MBAs appear to think the world is described adequately using Gaussian distributions, and a few "experts" know the Weibull distribution or the t-distribution. Other distribution types (Poisson, discrete/categorical, etc.) are totally foreign, and methods of inference beyond simple unconditional analyses are also quite alien to them.
I also understand that people who are good at it are rare.
Perhaps not as rare as you might think. But those who have some aptitude in statistics know enough to keep their mouths shut when the data tells them to. MBAs on the other hand, ignorant of their own ignorance, are as verbally promiscuous as politicians...
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
Zed Shaw trolling? What a complete *fucking* surprise.
I agree that it's probable he's a jerk. I know I am and he sounds a bit like me in his argument.
I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.
I wasn't told that when I studied programming. However, just because people come from a computer science background, doesn't necessarily lead them to applying the scientific method to their day-to-day work.
I've been struggling for about a year with this. Am I just being an arrogant jerk thinking I'm smarter than my co-workers, or are they really being ignorant and ignoring the technical and business concerns that I bring forward? When we have problems, I try to find the cause, while they immediately start proposing and implementing solutions. I advise them not to play with the server during production hours, yet they'll change settings over lunch hour (our servers are accessed from coast to coast, so a 'lunch hour' outage affects about 3/5 of our users).
Regardless, my advice to Zed Shaw is: look yourself in the mirror, find a good personal critic (not a friend, unless they are a VERY good and honest friend) and figure out if it's you or them. It's probably a bit of both, but if your coworkers don't listen to you then it's time to leave.
Remind them that the average American has one breast and one testicle (within one significant digit), then tell them a story illustrating the current problem with real world objects. Parables, fables, hypothetical examples, and other tales will help them get their mind around the concept. Or turning it backwards. A 99% success rate sounds great until you realize you have 10,000 events and a 1% failure rate means 100 aircraft will fall from the sky.
And always let them know it is more complicated than it appears on the surface, and anti-intuitive, so it has confused the best minds for quite some time. Find a way to let them feel smart while being wrong.
I'm a physicist, I know plenty of statistics. The kinds of statistics he's talking about are not hard. If you can do algebra, you can do things like calculate the standard deviation and variance of a set of measurements.
Was this rant really necessary? I run into people in physics who don't take care of these details. I find that a simple "can you put a standard deviation on that number?" or "can you repeat the experiment?" generally gets the job done. If you want to be more scientific, just start with those questions, and see where it takes you... you could even add "please" if you wanted to be nice. I find threatening people with death and belittling their intellect while talking about trivial calculations doesn't generate useful data.
To be fair, it sounds like Zed has been working as staff at a university. This has nothing to do with statistics, but it's probably the real reason he's in such a bad mood.
Sorry, Zed I don't need statistics to do my job. Zed jumped the shark years ago - isn't he the Rails guy? That is so 2005. This story is like having deja-vu of a bad hangover.
Just go away.
Hah, I'll school this guy any day - and I don't even have a degree nor have I ever used R. What a douche.
Leaving the author's lack of social skills aside, the powers-that-be in computer science education agree with him, at least for now. The Computer Science Accreditation Board lists a course in probability and statistics among its criteria (sorry, I couldn't find an online link to the latest criteria) and has for at least 20 years. I don't know how influential those criteria are outside the US (though I'd be curious, if any slashdotters can help me out), but here they are pretty important, especially for the vast majority of programs that are not at the top schools, and need the credibility that accreditation can bring them.
Not everyone is happy, though. At the 2005 OOPSLA there was a panel discussion where one thing they all could agree on was that the CS curriculum was way too mathematical. They favored something more like a software apprenticeship where "projects" were replaced with "products". That point of view does not appear to be in the ascendant in computer science yet, but it might catch on in the information science departments that are often found in business colleges.
Personally, I don't think the CS departments are likely to get less mathematical as long as there is strong demand for their graduates. There are certainly a lot of students who don't major in computer science because it is too mathematical for them, and I'm sure some of them wind up as programmers through some other route, and others find some other career. Moreover, I'd say that with one probability and statistics course that follows calculus, the students do get enough to "know what they don't know", which was what the author wanted.
1) the quality of your future coworkers
I base this on the quality of my past coworkers. I was probably lucky, though.
2) the quality of commonly held CS degrees
I'm at Iowa State University right now. It seems to be an exceptionally-good CS program. Depending on the kinds of friends I make here, I'll probably end up in a job with some of my classmates.
3) how much of their education you or anyone else remembers five to ten years after leaving college
The parts you use.
It's also much easier to re-learn something than to learn it from scratch -- thus, Zed could've said "brush up on your statistics", not "learn statistics".
Don't thank God, thank a doctor!
ummm, where are you coming from here?
Everything is complex. That's the basis of every libertarian ideology. Life is too complex for a group of politicians or 'experts' to manage.
As a result of this complexity, the reasonable thing to do is to allow people to try different approaches to solve their problems... hence looking down on things like central planning.
If you think you have a solution, you are free to prove to the world that it is correct. That is freedom... the freedom to do things to solve the problem.
The alternative is the belief that some group of experts and politicians can capture all the information in the world and formulate working policies to dictate how society should behave.
Their track record? Dismal... communism, fascism, corporatism, theocracy... They all seem to fail empirically. For one, it is rare for such experts to actually know everything. Secondly, you have to count on the experts actually having 'good will' towards the populace and not becoming corrupt or obsessed with their own power and money. Again, not a trivial task.
We agree that life is complex and problems are deep. A free society demands those with solutions implement them and prove they are the best... and people will gravitate to the best (or at least good enough) solutions. Think you have a better way to run a school? Open up the school and bring in students and show people that your way is better. That is freedom.
The alternative which is what we have now? Have a bunch of experts think they can devise the best education policy, implement it within the public school system where people are taxed even if they don't attend it.
Empirically it is shown to work. School choice for example is available in many countries and places. Society does not collapse (Sweden, Chile, Alberta, British Columbia...). Yet the 'experts' who actually tend to deny empirical evidence tend to go against it in favor of theoretical arguments that society will divide if our kids don't learn together...
I used to be a socialist. Until I looked at the empirical evidence. Now I favor freedom.
The people he is talking about are therefore not wrong - they are ignorant.
I'm sure this goes against everything you've been taught, but right and wrong do exist. Just because you don't know what the right answer is - maybe there's even no way you could know what the right answer is - doesn't make your answer right or even okay. It's much simpler than that. It's just plain wrong.
Dr. Gregory House
You misunderstand the alternative. Societies have been a mix of planning and autonomy for all of human civilisation, and *that* is what has worked well. It is not perfect, but it by-and-large works. Societies that overstress planning or autonomy have never been workable. No system in the world is laissez-faire, nor is any system entirely planned, and all systems have their failures. It is not hard to find these for the systems that are closer to laissez-faire, and you'd do this if you were really interested in a fair comparison.
The invisible hand, even to the extent that it supports the public good, is not always optimal. Often it doesn't even try to and is off optimising something else.
Experimentation is good, and certain amounts of competition can be worked into state structures to allow that. If there are better ways to run schools, we should find them and implement them in the public schools. We are, however, going to insist that the schools be public, that everyone pays for them, and that everyone goes to them. It's otherwise too easy for one person who earns privilege (whether the degree of that privilege is just is another question) to turn it into a privilege passed, unearned, through many generations. Universal, public, mandatory, integrated schools help prevent that. They also help prevent racism by forcing people to rub shoulders, and they help prevent idiocy by preventing religious nuts from being the only people to educate their kids.
Formal freedoms are not the only ones worth considering - if you "allow" something in a system, but that same system effectively prevents you from enjoying it, then that allowance is very shallow. Having justice but having finances result in some people being unable to hire (any or a good) lawyer results in very shallow justice. Similarly with any other social good.
If you believe in the tangled libertarian notion of liberty as the only good, your philosophy might work. If you believe in any other goods, to cling tightly to libertarian traditions and hope to pick up reasonable amounts of these other goods will prove most unsatisfactory.
For every problem, there is at least one solution that is simple, neat, and wrong.
In a world where many programmers are lucky to even finish the project with working code (software projects have very high failure rates in the real world), performance tuning of the type where statistics would be useful is often an unaffordable luxury. Most programmers make a genuine effort to avoid the more obvious performance sinks with some knowledge of Big O notation and known antipatterns, but in a world populated by demanding managers and slashed budgets that is really the best that most of us can do. If Zed wants programmers at his company to become experts on statistics and do detailed performance benchmarking then he can pay them himself for the privilege (hint: programmer cycles are vastly more expensive than processor cycles); otherwise he can, with all due respect, shove it.
He claims that programmers need to understand statistics more. The people he is talking about are therefore not wrong - they are ignorant.
And this applies to all programmers?
He's the one making generalisations based on anecdotal experiences, which is itself a poor practice in terms of statistics.
It's a perfectly fair point to say that many people need to understand statistics better (and it can be done without sounding like a snob), but there is no reason for him to target his rant at programmers. My degree was in mathematics, and I now work as a programmer in which I use mathematics - where do I fit into his box?
A programmer could just as easily write a pompous rant about "How statisticians need to understand computers better", based on a handful of anecdotes and generalisations.
I don't know why we're giving time to someone whose level of argument is "they dont know shit", and who resorts to childish ad hominems like "their confidence in their lacking knowledge is only surpassed by their lack of confidence in their personal appearance".
Statisticians need to learn about logical fallacies or I will kill them!
That doesn't sound anything like a car. You must be new here.
From TFA:
Almost all of the queries performed great, except one query that had sub-second response on average, but a 60 second standard deviation!
Pause and reflect on this for a moment. The average is poor and occasionally it stuffs up so severely that the stddev is pulled out by sixty seconds.
I managed to reproduce this (mean of 1.07s, stddev of 58.4). 3000 results of 1e-30s, one of 3200s (almost 1 hour).
If you need statistics to interpret the above results then you have bigger problems.
If you ACTUALLY get the above results you don't complain about the outlier and get them to rework it. Thank $DEITY, time out at a nanosecond and re-request.
Epic win.
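The reproduction described above (3000 results of 1e-30s plus one of 3200s) is easy to check. This is only a sketch with Python's standard `statistics` module; the variable names are mine, and it assumes the quoted stddev is the sample (n - 1) standard deviation.

```python
import statistics

# 3000 effectively instantaneous responses plus a single 3200 second outlier,
# the figures given in the comment above.
timings = [1e-30] * 3000 + [3200.0]

mean = statistics.mean(timings)
stdev = statistics.stdev(timings)    # sample standard deviation (divides by n - 1)
median = statistics.median(timings)  # barely notices the outlier
worst = max(timings)

print(f"mean   = {mean:.2f} s")
print(f"stdev  = {stdev:.1f} s")
print(f"median = {median:.0e} s")
print(f"max    = {worst:.0f} s")
```

The mean and stddev match the quoted 1.07 and 58.4, while the median and the maximum show directly that one request in 3001 is responsible for all of it.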
As someone who holds two B.S. degrees {computer engineering, computer science}, I take issue with the GP's statement. The typical CS student does not learn about transistor fanout, CMOS logic, VLSI, etc.
CS is derived from St. Turing and his universal machine. CE covers how to make (and use) one of those.
The problem with statistics is that the people who teach it usually say "that is the way it is" and make you feel miserable. The basis of statistics is "ratio" analysis, which is subject to error. So any activity has risk, and using ratio analysis one can start asking: where is the risk? How do I detect the risk? How do I measure the risk? How do I control the risk? If you think about your daily activities, they are controlled by statistics. Why use deodorant after a shower? When you are the best driver, why do you need auto insurance? When you are healthy, why should you get an H1N1 shot? You are God-fearing(?), yet why do you put money on a plate in church/temple/mosque? Why should companies have multiple products? Why do you need a spell checker and grammar checker when you are a native speaker of English?...
Every one of these events has some associated risk, and that is what statistics tells us. It is very sad that we learn just to pass the exam, but do not think about and apply statistics in life. Almost all unemployed people forgot that unless they assess their risk against their qualifications and skill sets and upgrade them, they will be unemployed. It is not "if", but when. Programmers and everyone else need to think carefully about statistics. Zed Shaw may sound arrogant, but statistics is not some bogus area. Deming is the guru of six sigma, and companies that follow it are kings now. Look at Toyota, which used statistics, yet forgot to use it continuously and lost a lot. GM, Ford etc. use statistics to make sure that their components fail exactly after 36 months (end of warranty period), and look at them now.
It takes effort and interest to learn a second or third subject, but unless one has multiple backgrounds, the future for them is bleak. Good luck to those who underestimate the power of statistics.
so you've been carrying on like a turkey trying to learn the stuff for years and you're still useless. You admit it yourself. Despite this, you still think that people in a different field altogether should waste the same amount of time on learning statistics instead of learning how to actually design and write software.
you are a fool.
fucking statisticians should learn some common fucking sense or "I will kill them all".
Because I am a genius and lazy and don't need to study much in order to get an A.
Until the third year when I almost failed a math course ;)
Excellence is an attitude.
Personally, I love spouting statistics, but those are the ones based on logic, such as "this won't matter in 99% of all circumstances". I admit that mathematical statistics just doesn't interest me, as the stats I use don't need to be all that accurate. I do use them for real-time protocol development, but for those, a cookbook is good enough. No point learning the math on them. Takes too long and I don't gain enough from the effort to justify it. My math learning brain capacity is better spent focusing on differential equations and linear algebra. I don't have the brain cells left for a 3rd discipline :)
:)
I love programming, but I despise trying to implement algorithms written by math geeks. They're typically sloppy and depend heavily on background information that I just don't care about. Write some pseudocode instead of using 30 pages describing the variables in an equation. When I had to start working with wavelet transforms, I had to learn some weird French notation for math I'd never seen before that looked like Polish, not Greek. (And I mean Polish the language, not making Polish jokes.)
I'm a strong believer that programmers should have at least better than generalized math skills, but I also believe that stats geeks and math geeks should be at least able to write in Mathcad or R or something. Then at least a programmer can do something with it.
If a stat geek and a code geek are expected to work with one another, they should at least have some way of speaking with one another and I genuinely believe that the stat geek can learn to program enough to make an example a lot easier than a code geek can learn to read their math.
I work in a company made up entirely of developers who have learned that instead of saying "Hmm... nope, that's not my thing, cya!" they instead say "It's not my thing, let's see if we can sort it out though." We help each other out and we solve problems. If you happen to be a math or a stats geek, we'll work with you to try and understand the garble that you're attempting to communicate, but it'll take far more than just "here's the math, cya" because then we'll just interpret it however it seems to make sense to us. And I promise you, it'll be wrong.
Teamwork solves these problems.
Eventually, every major science adopted an empiricist view of the world. Except Computer Science of course.
Well there's his problem. He doesn't even know what Computer Science is. It's either math (pure computer science) or engineering (software engineering). Computer Science is poorly named in that it really is not a science.
I find TFA pretty clueless when it comes to a real understanding of what is needed for performance testing and tweaking. A statistical analysis is nice, especially with Monte Carlo type analysis, like Bungie running Halo 3 on numerous Xboxes simulating load and player interactions. However, I find that what is lacking with programmers is a basic understanding of the high-level methods of process analysis, such as network analysis, CPM, and PERT. Knowing a process has high levels of variance is nice, but not useful for understanding the why. Where is Zed's example of multivariate linear regression or ordered probit? Discussion of hypothesis testing? Anyone, anyone?
As a side note, Statistics in a Nutshell is the only book programmers really need on stats.
In God we trust, all others require data.
Statistics is an excuse for PMs and BAs to feel as if they actually know something that a programmer or developer doesn't. I have yet to see anyone mention the methodologies that statistics uses and is used for, including dashboards, reporting and things like Six Sigma. Statistics is based on math and trends and it isn't complicated to understand or learn. Most programmers/developers lack the focus on trends and new methodologies and the due diligence that is required to capture, mine, scrub and present the data. That is where the BAs/PMs and whoever else come into play. But that is why we have Cognos, Crystal Reports and the ETL teams.
If a programmer/developer wanted to develop those skills, then you sir, are putting your own job at risk. //I wasn't an english major, excuse the grammar and spelling. :)
From TFA:
I never have this problem with female programmers. Maybe it’s because I’m tall (6’2”), or nicer to them, but they always speak rationally and are really keen to learn. If they disagree, they do so rationally and back up what they say. I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.
I'm also good looking and know a lot of statistics. Ladies, I really respect you and I think highly of you. If you would like some private statistics lessons call me at (123) 456-7890.
Smooth move, Zed Shaw, smooth move.
"I see undead people" Warcraft III - Necromancer
>>> "I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation"
Evidence and observation of the above sentence shows that the author is comparing two things that are not really exclusive. Logic and reason tell me that my time is better spent not reading the rest of the article.
Statistics is a pseudo-science which is completely unable to actually PREDICT ANYTHING. It is basically comparable to Astrology except that Astrology has many more years of use and refinement associated with it. Statistics and its bastard child Economics are what brought this country to its knees. STOP LISTENING TO STATISTICIANS AND ECONOMISTS - THEY ARE MORONS!!!!
I suggest programmers to learn management also http://www.netmba.com/
I'd like to buy homeland for our 10 million people. http://twitter.com/mahadiga
I generally think in programming it's the exceptions that cause the problems. I usually only look at averages and maximums; however, it must be said that many performance problems are caused by an exponential increase in execution time with a linear increase in load/dataset size. I don't really know stats but it's pretty easy to see when this is the case. There are many things that stats will never predict, e.g. when you are going to hit a wall, without an underlying knowledge of where the walls are, how close you are to them and what moves you towards them. It's all pipes and data in the end. You should know what's going to break it (exceptions to your assumptions) and where your bottlenecks are, and what path is going to get followed in what situations. That can get tricky in database queries, say Oracle, with stats determining your execution plan. How often does the full table scan in a loop seem to cause a query to never return? Google oracle stats execution plan. I guess it keeps DBAs in a job.
Actually, dentists do suggest sugarless Trident. The extra saliva it causes is good for the mouth and doesn't damage fillings or dental work.
Actually, dentists recommend daily care (e.g. brushing/flossing) first and foremost.
You are wrong even though you think you are right. The question is, what are the odds that "if 1 is heads, the other one is also" - not what are the odds that both will be heads.
The latter corresponds to the analysis given. The former (the question asked) does not. In the actual question, the first two possibilities are either ruled out, because you've already stated that the first coin is 1, or, you are allowing for either coin to be one, then what are the odds of the other to be one as follows: 0 - 0 = Ruled out by the given 0 - 1 and 1 - 0 - ....
Woops! Never mind. I was reading the question incorrectly. I read "if one is heads" as "if the first is heads". Need to work on my readin comprehension, not my odds skilzzz.
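For what it's worth, the two readings really do give different answers, and brute-force enumeration shows it. This sketch assumes the question is about two fair, independent coins; `conditional` and the other names are just illustrative helpers.

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of two fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def conditional(event, given):
    """P(event | given), by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    hits = [o for o in kept if event(o)]
    return Fraction(len(hits), len(kept))

def both_heads(o):
    return o == ("H", "H")

# Reading 1: "one is heads" means at least one of the two coins is heads.
p_at_least_one = conditional(both_heads, lambda o: "H" in o)

# Reading 2: "one is heads" misread as "the first coin is heads".
p_first = conditional(both_heads, lambda o: o[0] == "H")

print("P(both heads | at least one heads) =", p_at_least_one)
print("P(both heads | first is heads)     =", p_first)
```

Under the "at least one" reading only TT is ruled out, leaving HH, HT and TH, so the answer is 1/3. Under the "first coin" reading only HH and HT remain, giving 1/2.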
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.