Slashdot Mirror


Social Science Journal 'Bans' Use of p-values

sandbagger writes: Editors of Basic and Applied Social Psychology announced in a February editorial that researchers who submit studies for publication would not be allowed to use common statistical methods, including p-values. While p-values are routinely misused in scientific literature, many researchers who understand their proper role are upset about the ban. Biostatistician Steven Goodman said, "This might be a case in which the cure is worse than the disease. The goal should be the intelligent use of statistics. If the journal is going to take away a tool, however misused, they need to substitute it with something more meaningful."

5 of 208 comments

  1. Even more obligatory by Moses48 · · Score: 4, Interesting
  2. My Paper by Anonymous Coward · · Score: 5, Interesting

    OK, let me enlighten the readers a bit. The reviewers tend to be typical researchers within the field. The typical social researcher does not have a very strong math background. A lot of them are into qualitative research, and quantitative work tends to stop at ANOVA. I have multiple master's degrees in business and social science and worked on a Ph.D. in social science (being vague here for a reason). However, I have a dual bachelor's in comp sci and math. I know statistical analysis very well. My master's thesis for my MBA was an in-depth analysis of survey responses. 30 pages of body and really good graphs. My research professor, an econometrics professor, and I submitted it to a second-tier journal associated with the field I specialized in...

    ... 6 pages got published. 6?!? They took out the vast majority of the math. Why? "Our readers are really bad at math," said the editor. If you knew the field... you would be scared shitless. The reviewers suggested we take out the math because it confused them. This is why they want the p-value out... it is misunderstood and abused. The reviewers have NO idea if it is being used correctly.

  3. Re:Past APA president Kimble turns over in his gra by Rob+Riggs · · Score: 1, Interesting

    used to be required in university statistics intro classes: http://books.google.com/books/about/How_to_use_and_misuse_statistics.html

    I suspect that book is still foundational in most university advertising/marketing programs.

    --
    the growth in cynicism and rebellion has not been without cause
  4. Three puzzles by Okian+Warrior · · Score: 4, Interesting

    It is the job of the reviewer to check that the statistic was used in the proper context: not to check the result, but the methodology. It sounds like social journals simply either have bad reviewers or suck at methodology.

    That's a good sentiment, but it won't work in practice. Here's an example:

    Suppose a researcher is running rats in a maze. He measures many things, including the direction that first-run rats turn in their first choice.

    He rummages around in the data and finds that more rats (by a lot) turn left on their first attempt. It's highly unlikely that this number of rats would turn left on their first choice based on chance (an easy calculation), so this seems like an interesting effect.

    He writes his paper and submits for publication: "Rats prefer to turn left", P<0.05, the effect is real, and all is good.

    There's no realistic way that a reviewer can spot the flaw in this paper.

    Actually, let's pose this as a puzzle to the readers. Can *you* spot the flaw in the methodology? And if so, can you describe it in a way that makes it obvious to other readers?

    (Note that this is a flaw in the statistical reasoning, not in the experimental setup. It's not because of latent scent trails in the maze or anything else about the apparatus.)
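
    One common reading of this puzzle (an assumption here, not necessarily the answer the poster had in mind) is that the hypothesis was picked after rummaging through many measurements, so the P<0.05 threshold no longer means what it claims. A minimal simulation sketch, assuming 20 rats, 20 independent yes/no measurements that are all pure coin flips, and hypothetical function names:

```python
import random
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k left turns out of n under a 50/50 null."""
    distance = abs(k - n / 2)
    # Count every outcome at least as far from n/2 as the observed one.
    extreme = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= distance)
    return extreme / 2 ** n

def chance_of_some_finding(n_rats=20, n_measures=20, trials=2000,
                           alpha=0.05, seed=1):
    """Fraction of simulated studies where at least one pure-noise
    measurement comes out 'significant' at the given alpha."""
    rng = random.Random(seed)
    # Precompute the p-value for every possible count of left turns.
    p_table = [two_sided_binomial_p(k, n_rats) for k in range(n_rats + 1)]
    hits = 0
    for _ in range(trials):
        for _ in range(n_measures):
            k = sum(rng.random() < 0.5 for _ in range(n_rats))
            if p_table[k] < alpha:
                hits += 1
                break  # one 'publishable' effect is enough
    return hits / trials

if __name__ == "__main__":
    print("one measurement: ", chance_of_some_finding(n_measures=1))
    print("twenty measurements:", chance_of_some_finding(n_measures=20))
```

    Under these made-up settings a single measurement crosses P<0.05 about 4% of the time, while the chance that at least one of 20 does is roughly 1 - (1 - 0.041)^20, a bit over one half. A reviewer who only sees the one reported test cannot tell which situation produced it.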

    ====

    Add to this the number of misunderstandings that people have about the statistical process, and it becomes clear that... what?

    Where does the 0.05 number come from? It comes from Fisher himself, of course - any textbook will tell you that. If P<0.05, then the results are significant and worthy of publication.

    Except that Fisher didn't *say* that - he said something vaguely similar and it was misinterpreted by many people. Can you describe the difference between what he said and what the textbooks claim he said?

    ====

    You have a null hypothesis and some data with a very low probability. Let's say it's P<0.01. This is such a good P-value that we can reject the null hypothesis and accept the alternative explanation.

    P<0.01 is the probability of the data, given the (null) hypothesis. Thus we assume that the probability of the hypothesis is low, given the data.

    Can you point out the flaw in this reasoning? Can you do it in a way that other readers will immediately see the problem?

    There is a further calculation/formula that will fix the flawed reasoning and allow you to make a correct inference. It's very well-known, the formula has a name, and probably everyone reading this has at least heard of the name. Can you describe how to fix the inference in a way that will make it obvious to the reader?
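
    The well-known formula hinted at here is presumably Bayes' theorem (an assumption; the poster never names it). A minimal sketch of the corrected inference follows. The function name, the priors, and the likelihoods are all illustrative, and for simplicity the 0.01 is treated as the probability of the data under the null rather than as a tail probability:

```python
def posterior_null(prior_null, p_data_given_null, p_data_given_alt):
    """P(null | data) via Bayes' theorem for two competing hypotheses."""
    prior_alt = 1.0 - prior_null
    # Total probability of the data under either hypothesis.
    evidence = prior_null * p_data_given_null + prior_alt * p_data_given_alt
    return prior_null * p_data_given_null / evidence

if __name__ == "__main__":
    # The data are unlikely under the null (0.01), but the alternative
    # explains them only slightly better (0.02) and starts out very
    # implausible, so the null remains by far the better bet.
    print(posterior_null(0.999, 0.01, 0.02))
    # With a plausible alternative that predicts the data well,
    # the same 0.01 does make the null unlikely.
    print(posterior_null(0.9, 0.01, 0.5))
```

    The point of the sketch is that P(data | null) alone says nothing about P(null | data); the prior and the probability of the data under the alternative both enter the calculation.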

  5. p-value research is misleading almost always by SteveWoz · · Score: 5, Interesting

    I studied and tutored experimental design and this use of inferential statistics. I even came up with a formula that takes 1/5 the calculator keystrokes when learning to calculate the p-value manually. Take the standard deviation and mean for each group, then calculate the standard deviation of these means (how different the groups are) divided by the mean of these standard deviations (how wide the groups of data are) and multiply by the square root of n (sample size for each group). But that's beside the point.

    We had 5 papers in our class for psychology majors (I almost graduated in that instead of engineering) that discussed why controlled experiments (using the p-value) should not be published. In each case my knee-jerk reaction was that the authors didn't like math or didn't understand math and just wanted to 'suppose' answers. But each article attacked the abuse of math by proficient academics at universities who did this sort of research. I came around too.

    The math is established for random environments, but the scientists control every bit of the environment, not to get better results but to detect things so tiny that they really don't matter. The math lets them misuse the word 'significant' as though there is a strong connection between cause and effect. Yet every environmental restriction (same living arrangements, same diets, same genetic strain of rats, etc.) invalidates the result. It's called internal validity (finding it in the experiment) vs. external validity (applying in real life). You can also find weaker effects (weaker by the square root of n) by using larger groups. A study can be set up so as to likely find 'something' tiny and get the research prestige, but another study can be set up with different controls that turn out an opposite result. And none of them apply to real life the way results from an entire population living normal lives would.

    You have to study and think quite a while, as I did (even walking the streets around Berkeley to find books on the subject from up to 40 years earlier), to see that the words "99 percent significance level" mean not a strong effect but more likely one that is so tiny, maybe a part in a million, that you'd never see it in real life.
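
    The keystroke-saving shortcut described at the top of this comment can be sketched as follows. This is a literal reading of the description, assuming equal-sized groups and sample (n-1) standard deviations; the function names are hypothetical, and the comparison with a pooled two-sample t statistic is added here for illustration rather than taken from the comment:

```python
from math import sqrt
from statistics import mean, stdev

def shortcut_statistic(groups):
    """sqrt(n) * SD(group means) / mean(group SDs), for equal-sized groups."""
    n = len(groups[0])                          # sample size per group
    group_means = [mean(g) for g in groups]
    group_sds = [stdev(g) for g in groups]
    return sqrt(n) * stdev(group_means) / mean(group_sds)

def pooled_t(a, b):
    """Classic two-sample t statistic for two equal-sized groups."""
    n = len(a)
    pooled_sd = sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return (mean(a) - mean(b)) / (pooled_sd * sqrt(2 / n))

if __name__ == "__main__":
    control = [1, 2, 3, 4, 5]
    treated = [3, 4, 5, 6, 7]
    print(shortcut_statistic([control, treated]))  # shortcut value
    print(abs(pooled_t(control, treated)))         # textbook |t|
```

    For two groups with the same spread, the shortcut agrees with the magnitude of the textbook t statistic. The sqrt(n) factor is also where the comment's warning comes from: for the same means and spreads, a larger group size inflates the statistic, so ever smaller differences become 'significant'.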

    --
    OK a new size TV