Slashdot Mirror


Social Science Journal 'Bans' Use of p-values

sandbagger writes: Editors of Basic and Applied Social Psychology announced in a February editorial that researchers who submit studies for publication would not be allowed to use common statistical methods, including p-values. While p-values are routinely misused in scientific literature, many researchers who understand its proper role are upset about the ban. Biostatistician Steven Goodman said, "This might be a case in which the cure is worse than the disease. The goal should be the intelligent use of statistics. If the journal is going to take away a tool, however misused, they need to substitute it with something more meaningful."

208 comments

  1. Obligatory by Anonymous Coward · · Score: 1

    http://xkcd.com/1478/

  2. Mis-use=reviewer don't do their job by aepervius · · Score: 4, Insightful

    It is the job of the reviewer to check that the statistic was used in the proper context: not to check the result, but the methodology. It sounds like social science journals simply either have bad reviewers or suck at methodology.

    --
    C. Sagan : A demon haunted world:
    http://www.amazon.com/gp/product/0345409469/
    visit randi.org
    1. Re:Mis-use=reviewer don't do their job by bangular · · Score: 2

      I agree in principle, but the reality is a huge number of reviewers don't really understand the research paper they're reviewing. They are more concerned with things like "no previous research has been done" vs "little previous research has been done" and independence assumptions.

    2. Re:Mis-use=reviewer don't do their job by captnjohnny1618 · · Score: 1

      You mean a sociologist might not be great at reviewing (or even remotely qualified to review) the statistical merits of a paper?! My goodness! ;-)

      That's not to say that there are no sociologists who can understand the statistics, but I can't say that I'm surprised. Heck, I'm in physics and I couldn't tell you the appropriate time to use a p-value.

    3. Re:Mis-use=reviewer don't do their job by umafuckit · · Score: 2

      On average, reviewers have the same skill set as authors who will get accepted (since that is the pool they are taken from). If authors are getting it wrong then so will reviewers.

    4. Re:Mis-use=reviewer don't do their job by thermopile · · Score: 1

      They're not crazy. This fantastic article from Nature in February 2014 shows how seemingly statistically certain events (e.g., p less than 0.01) can be thrown off by low probability events.

      Frankly, I've always been a bit confused by the p value. It just seems more straightforward to provide your 95% confidence interval limits.

      --

      "Diplomacy is something you do until you find a rock." --Richard Pound

    5. Re:Mis-use=reviewer don't do their job by ceoyoyo · · Score: 1

      Your 95% confidence interval (roughly*) indicates an interval containing 95% of the probability. The p-value indicates how much probability lies within a cutoff region. What most people do with a 95% CI is look to see if it overlaps the null value (zero, or the mean of the other group, for example). The p-value gives the same information, except quantitatively.

      * yes, Bayesians, technically the 95% credible interval, from a Bayesian analysis, contains the area of 95% probability. The confidence interval, technically, isn't quite the same thing. Practically, in the vast majority of cases, the two are either mathematically equivalent or equal to within a large number of decimal places.
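
The link between a 95% interval and a p-value described above can be sketched for the simplest case, a two-sided z-test on a single estimate. This is a minimal illustration, not anything from the comment itself: the function names, the normal approximation, and the example numbers are all assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def z_test_summary(estimate, std_error, null_value=0.0, level=0.95):
    """Return (confidence interval, two-sided p-value) under a normal approximation."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z_crit * std_error
    ci = (estimate - half_width, estimate + half_width)
    z = (estimate - null_value) / std_error
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return ci, p


def ci_excludes(ci, value):
    """True when the value falls outside the interval."""
    low, high = ci
    return value < low or value > high


# Hypothetical estimates, both with a standard error of 1.0.
for est in (2.0, 1.9):
    ci, p = z_test_summary(est, 1.0)
    print(est, ci, round(p, 4), "excludes 0:", ci_excludes(ci, 0.0))
```

In this setting the interval excludes the null value exactly when p falls below 0.05, which is the "same information" point; the p-value just says how far past the cutoff the estimate lands.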

    6. Re:Mis-use=reviewer don't do their job by phantomfive · · Score: 1

      That's what happens when your reviewers are the unpaid submitters of other articles to your paper.

      --
      "First they came for the slanderers and i said nothing."
    7. Re:Mis-use=reviewer don't do their job by Beck_Neard · · Score: 1

      p-values are inherently bad statistics. You can't fix them with 'good methodology.' Can they be used properly in some situations? Maybe, if the author knows enough statistics to know when or when not to use them. But the people who use p-values are likely not to have that level of knowledge.

      p-values are like the PHP of statistics.

      > "This might be a case in which the cure is worse than the disease. The goal should be the intelligent use of statistics. If the journal is going to take away a tool, however misused, they need to substitute it with something more meaningful."

      There are plenty of more meaningful tools, you cunt. Just because you are too ignorant to know basic statistics doesn't mean we're forced to deal with your bullshit statistical methods.

      --
      A fool and his hard drive are soon parted.
    8. Re:Mis-use=reviewer don't do their job by Anonymous Coward · · Score: 0

      "There are plenty of more meaningful tools, you cunt. Just because you are too ignorant to know basic statistics doesn't mean we're forced to deal with your bullshit statistical methods."

      Thank you.

    9. Re:Mis-use=reviewer don't do their job by delt0r · · Score: 1

      I review in a fairly wide range of journals (ones I have published in), from biology to computer science to math and statistics. Often the stats are sloppy and misinterpreted. So reviewers, being the same group of people, have the same flawed ideas about that sort of thing.

      People put far too much faith in "science" and mostly scientists. We are just people, like everyone else. We have the same failings and just because we never left University, it does not make us special.

      --
      If information wants to be free, why does my internet connection cost so much?
    10. Re:Mis-use=reviewer don't do their job by gzuckier · · Score: 1

      That's what happens when your reviewers are the unpaid submitters of other articles to your paper.

      your reviewers are in fact the competitors of the guy whose paper they are reviewing, for scarce grant money which is awarded very largely on number of publications, so the bias if any is for reviewers to trashcan the paper they review. And on another topic, submitters of articles are not only unpaid, in fact you have to pay for having your paper published.

      --
      Star Trek transporters are just 3d printers.
    11. Re:Mis-use=reviewer don't do their job by gzuckier · · Score: 1

      By and large, the stats are irrelevant. If the effect is pretty solid, the stats will be so obviously significant that you don't need to ask the p-value. On the other hand, when the stats are so close to the line that you have to calculate the p-value and find that it's .0475; nobody else is going to believe the effect is real. Especially, nobody who knows their stuff is going to look at a paper with p=.0475 and one with p=.0525 and say with any sincerity that the first one is true and the second one is not.
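
To put numbers on how close p=.0475 and p=.0525 really are, the two values can be converted back into test statistics. This is a small sketch assuming a two-sided z-test; the helper name is made up for illustration.

```python
from statistics import NormalDist


def z_for_two_sided_p(p):
    """Absolute z statistic that yields the given two-sided p-value."""
    return NormalDist().inv_cdf(1 - p / 2)


z_low = z_for_two_sided_p(0.0475)
z_high = z_for_two_sided_p(0.0525)
print(round(z_low, 3), round(z_high, 3), round(z_low - z_high, 3))
```

Under that assumption the two results sit only a few hundredths of a standard error apart, on either side of the usual 1.96 cutoff.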

      --
      Star Trek transporters are just 3d printers.
    12. Re:Mis-use=reviewer don't do their job by gzuckier · · Score: 1

      p-values are inherently bad statistics. You can't fix them with 'good methodology.' Can they be used properly in some situations? Maybe, if the author knows enough statistics to know when or when not to use them. But the people who use p-values are likely not to have that level of knowledge.

      p-values are like the PHP of statistics.

      > "This might be a case in which the cure is worse than the disease. The goal should be the intelligent use of statistics. If the journal is going to take away a tool, however misused, they need to substitute it with something more meaningful."

      There are plenty of more meaningful tools, you cunt. Just because you are too ignorant to know basic statistics doesn't mean we're forced to deal with your bullshit statistical methods.

      What you want to know when analyzing your data is, that if you see this result XXXX when you do the experiment, what percentage of the time would there really not be anything happening and you are just seeing a fluke. However, what the p-value .05 says is that if there really isn't anything happening, then we'd only see this fluky result less than 5% of the time; which when you think about it of course is pretty much useless information, because by definition you don't know whether there is anything happening or not, so less than 5% of an unknown number is useless. That's what these editors are concerned about. The two numbers, what you want to know and what the p-value tells you, aren't simply related in any fashion; other than both are increased by high error rates and/or noise, which is also not useless information, plus it's something you probably figured out intuitively.

      --
      Star Trek transporters are just 3d printers.
    13. Re:Mis-use=reviewer don't do their job by gzuckier · · Score: 1

      p-values are inherently bad statistics. You can't fix them with 'good methodology.' Can they be used properly in some situations? Maybe, if the author knows enough statistics to know when or when not to use them. But the people who use p-values are likely not to have that level of knowledge.

      p-values are like the PHP of statistics.

      > "This might be a case in which the cure is worse than the disease. The goal should be the intelligent use of statistics. If the journal is going to take away a tool, however misused, they need to substitute it with something more meaningful."

      There are plenty of more meaningful tools, you cunt. Just because you are too ignorant to know basic statistics doesn't mean we're forced to deal with your bullshit statistical methods.

      What you want to know when analyzing your data is, that if you see this result XXXX when you do the experiment, what percentage of the time would there really not be anything happening and you are just seeing a fluke. However, what the p-value .05 says is that if there really isn't anything happening, then we'd only see this fluky result less than 5% of the time; which when you think about it of course is pretty much useless information, because by definition you don't know whether there is anything happening or not, so less than 5% of an unknown number is useless. That's what these editors are concerned about. The two numbers, what you want to know and what the p-value tells you, aren't simply related in any fashion; other than both are increased by high error rates and/or noise, which is also not useless information, plus it's something you probably figured out intuitively.

      I mean, both are increased by high error rates and/or noise, which IS also useless information, plus it's something you probably figured out intuitively.
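
The gap described above, between "how often is a significant result a fluke" and what p < .05 actually says, can be sketched with Bayes' rule. The function name and every input below (the share of tested hypotheses that are real, and the power of the study) are assumptions picked for illustration; in practice they are exactly the unknowns the comment is complaining about.

```python
def share_of_flukes(prior_real, power, alpha=0.05):
    """P(no real effect | result was significant), by Bayes' rule.

    prior_real: assumed fraction of tested hypotheses where an effect exists
    power:      assumed chance of a significant result when the effect exists
    alpha:      significance cutoff, i.e. P(significant | no effect)
    """
    false_positives = alpha * (1 - prior_real)
    true_positives = power * prior_real
    return false_positives / (false_positives + true_positives)


# Same cutoff and power, different assumed base rates of real effects.
for prior in (0.5, 0.1, 0.01):
    print(prior, round(share_of_flukes(prior, power=0.8), 3))
```

With the cutoff held at .05, the share of significant results that are flukes moves from a few percent to the large majority as the assumed base rate drops, so the p-value alone does not pin it down.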

      --
      Star Trek transporters are just 3d printers.
    14. Re:Mis-use=reviewer don't do their job by delt0r · · Score: 1

      You couldn't be more wrong. See the very strong effect of cholesterol in a large study from ages ago..... Proper statistical techniques and understanding what your data really is, is important. Oh and many fields use 4 or even 6 sigma for "confidence".
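
For a sense of scale on "4 or even 6 sigma", here is a small conversion from sigma levels to p-values. It assumes a normal distribution and a two-sided convention; some fields quote one-sided tails instead, which would halve these numbers. The function name is illustrative.

```python
import math


def two_sided_p(n_sigma):
    """Two-sided tail probability of a standard normal beyond n_sigma."""
    # erfc keeps precision in the far tails, where 1 - cdf(x) would lose digits.
    return math.erfc(n_sigma / math.sqrt(2))


for k in (2, 4, 6):
    print(k, two_sided_p(k))
```

Two sigma lands near the familiar .05, while 4 and 6 sigma are smaller by several orders of magnitude.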

      --
      If information wants to be free, why does my internet connection cost so much?
  3. Is the math not towing the groupthink? by cfalcon · · Score: 0

    My immediate thought would be that hard math in this field doesn't tow the groupthink by revealing too much that they want to be able to argue around, so their solution is to try to eliminate the math.

    I don't know that this is the case or anything: it's just the only real motivation that would lead to this. Like, the studies show stuff that no one wants to talk about or the math prevents people from coming to a conclusion opposite reality.

    1. Re:Is the math not towing the groupthink? by monkeyzoo · · Score: 2

      Actually, it is increasing utilization of improved statistical methods leading to the phase-out of earlier, cruder methods. It's standard advancement of the scientific method and applies to all experimental design analysis regardless of field.

      That this journal is throwing out the baby with the bath water and abdicating its responsibility to review quality of content in favor of blanket rules is another matter.

    2. Re:Is the math not towing the groupthink? by Anonymous Coward · · Score: 1

      Or... maybe because p-values encourage bogus results? Especially when subgroups are involved.

    3. Re:Is the math not towing the groupthink? by retchdog · · Score: 1, Troll

      It's the opposite really. You can publish any fucking thing by mining for a low p-value (through multiple comparisons, outright biased sampling techniques, etc., etc.) and then turning your brain off.

      Of course, just getting rid of the p-value outright won't solve this, but at the very least, the problem isn't what you're saying it is. Blind math fetishism isn't solving anything.

      --
      "They were pure niggers." – Noam Chomsky
    4. Re:Is the math not towing the groupthink? by CrimsonAvenger · · Score: 1

      My immediate thought would be that hard math in this field doesn't tow the groupthink

      Why should the math be towing the groupthink? Can't the groupthink move on its own?

      Or did you mean "toe the groupthink" as in "toe the line". No, that expression isn't about pulling barges, it's about standing in the right place in a formation....

      --

      "I do not agree with what you say, but I will defend to the death your right to say it"
    5. Re:Is the math not towing the groupthink? by retchdog · · Score: 1

      eh, you are right about the origin of the phrase, but really it works either way. in this case, i like the imagery of a small yoked vehicle having to pull a lot of dead weight; it's an apt description of math/stats as it relates to the social sciences right now.

      --
      "They were pure niggers." – Noam Chomsky
    6. Re:Is the math not towing the groupthink? by IgnitusBoyone · · Score: 1

      Well, I need to read TFA, but I am going to assume they are providing alternatives? I would hope nothing without any stat test would pass review.

      --
      Momento Mori
    7. Re:Is the math not towing the groupthink? by monkeyzoo · · Score: 2

      Well, I need to read TFA, but I am going to assume they are providing alternatives? I would hope nothing without any stat test would pass review.

      Facepalm. Oh goodness, are people reading this headline to think they are removing p-values in favor of just accepting speculation with no statistical analysis!?!

      YESSSS, they are forcing submitters to UPGRADE their statistical analysis to employ more robust mathematics.

    8. Re:Is the math not towing the groupthink? by eli+pabst · · Score: 1

      It's the opposite really. You can publish any fucking thing by mining for a low p-value (through multiple comparisons, outright biased sampling techniques, etc., etc.) and then turning your brain off.

      Of course, just getting rid of the p-value outright won't solve this, but at the very least, the problem isn't what you're saying it is. Blind math fetishism isn't solving anything.

      Yeah, but then when nobody can replicate your findings, you become that lab that publishes crap all the time. Reviewers start asking for more confirmatory evidence, grant reviewers already ding you before they've even read your application, etc. Sure you can abuse the system for a while, but eventually it catches up to you.

    9. Re:Is the math not towing the groupthink? by drooling-dog · · Score: 1

      The math works fine; the problem is choosing the appropriate method. My hunch is that the biggest mistake in the use of stats in the social sciences is failing to correct p-values for multiple comparisons. That is, if your hypothesis is limited to predicting an association between two variables, then p-values are just fine. But if you send out a questionnaire with 20 questions on it and compute all 190 pairwise correlations between them, you'll get around 9 or 10 "significant" (p < 0.05) but meaningless associations just by chance. You can't (or shouldn't) cherry-pick these and write them up like they mean anything. Yet many people do just this, often not realizing how the hypotheses were selected (it can sometimes be subtle, or buried in the history of the project).
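
The "9 or 10 out of 190" figure above can be checked with a quick simulation. This is a rough sketch under stated assumptions: answers are independent Gaussian noise, there are 100 respondents, and significance is judged with the Fisher z approximation rather than an exact t-test. All names and parameters are made up for illustration.

```python
import itertools
import math
import random
from statistics import NormalDist


def standardize(xs):
    """Center a column and scale it to unit length, so a dot product gives Pearson r."""
    mean = sum(xs) / len(xs)
    centered = [x - mean for x in xs]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def p_value_for_r(r, n):
    """Approximate two-sided p-value for a correlation via the Fisher z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_spurious(rng, n_questions=20, n_respondents=100, alpha=0.05):
    """Count 'significant' correlations among questions that are pure noise."""
    columns = [
        standardize([rng.gauss(0, 1) for _ in range(n_respondents)])
        for _ in range(n_questions)
    ]
    hits = 0
    for u, v in itertools.combinations(columns, 2):
        r = sum(a * b for a, b in zip(u, v))
        if p_value_for_r(r, n_respondents) < alpha:
            hits += 1
    return hits


n_pairs = math.comb(20, 2)                 # 190 pairwise correlations
expected_false_positives = n_pairs * 0.05  # about 9.5

rng = random.Random(0)
counts = [count_spurious(rng) for _ in range(30)]
average_hits = sum(counts) / len(counts)
print(n_pairs, expected_false_positives, average_hits)
```

Individual questionnaires vary, but averaged over repeated runs the count of meaningless "hits" should hover near the expected 9.5.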

    10. Re:Is the math not towing the groupthink? by lgw · · Score: 1

      Oh goodness, are people reading this headline to think they are removing p-values in favor of just accepting speculation with no statistical analysis!?!

      This is a social science journal. Statistics are obviously a tool of the Patriarchy and should be shunned. (This mockery has become a meme now - you can buy "logic is a tool of the Patriarchy" t-shirts for goodness sake.)

      --
      Socialism: a lie told by totalitarians and believed by fools.
    11. Re:Is the math not towing the groupthink? by amber_of_luxor · · Score: 1

      But their replacement is even more subject to bias than p-values.

      At least with p-values I don't have to delve into a dozen things that are not in the paper to see the error. With their proposal, I have to investigate at least a dozen factors that are not mentioned in the paper to determine where and why the errors are present.

      IOW their proposed replacement makes lying using statistics so much more trivial, that you can now say that lies and statistics are synonyms.

      --
      Wind Beneath Thy Wings
    12. Re:Is the math not towing the groupthink? by Anonymous Coward · · Score: 1

      I doubt you understand what a p-value means when used by psychologists/sociologists/medical researchers. If we accept the arbitrary cutoff point and widespread confusion and misuses it has caused, what we are left with is that the data in column A is probably higher/lower than the data in column B. If you think that you can take any meaning away from that without investigating at least a dozen (probably many more) factors, you are one of the confused.

    13. Re:Is the math not towing the groupthink? by Anonymous Coward · · Score: 0

      Thanks to the input of math/stats we can narrow down the number of possible explanations for two groups to differ from inf! to (inf-1)!. Wow, thanks.

    14. Re:Is the math not towing the groupthink? by gzuckier · · Score: 1

      It's the opposite really. You can publish any fucking thing by mining for a low p-value (through multiple comparisons, outright biased sampling techniques, etc., etc.) and then turning your brain off.

      Of course, just getting rid of the p-value outright won't solve this, but at the very least, the problem isn't what you're saying it is. Blind math fetishism isn't solving anything.

      Yeah, but then when nobody can replicate your findings, you become that lab that publishes crap all the time. Reviewers start asking for more confirmatory evidence, grant reviewers already ding you before they've even read your application, etc. Sure you can abuse the system for a while, but eventually it catches up to you.

      But irreproducible results which catch people's imagination live forever. People still believe you can transfer learned experience by grinding up planaria and feeding them to other planaria.

      --
      Star Trek transporters are just 3d printers.
    15. Re:Is the math not towing the groupthink? by gzuckier · · Score: 1

      My immediate thought would be that hard math in this field doesn't tow the groupthink

      Why should the math be towing the groupthink? Can't the groupthink move on its own?

      Or did you mean "toe the groupthink" as in "toe the line". No, that expression isn't about pulling barges, it's about standing in the right place in a formation....

      Or tow the lion. That's a dangerous and difficult task.

      --
      Star Trek transporters are just 3d printers.
  4. A Bayesian Conspiracy by PvtVoid · · Score: 5, Funny

    It's a war, I tell you, a war on frequentists! I'm 95% certain!

    1. Re:A Bayesian Conspiracy by Anonymous Coward · · Score: 0

      Frequentists are better though. At least in the long run.

    2. Re:A Bayesian Conspiracy by Anonymous Coward · · Score: 0

      What's the p-value on this 95% certainty estimate?

    3. Re:A Bayesian Conspiracy by Anonymous Coward · · Score: 0

      Yes, but do your prognostications follow a normal distribution? If not, then you might be misapplying the statistic.

    4. Re:A Bayesian Conspiracy by Anonymous Coward · · Score: 0

      But..but do you feel it strongly, or just a stingy bit short of strongly?

  5. Even more obligatory by Moses48 · · Score: 4, Interesting

    1. Re:Even more obligatory by IgnitusBoyone · · Score: 1

      I hate to admit it, but I don't think I truly started to understand p-values until reading that comic when it was released. I actually started using it as a discussion point in study groups.

      --
      Momento Mori
    2. Re:Even more obligatory by drooling-dog · · Score: 1

      A useful exercise (if you can use basic statistics software) that illustrates this is to generate a bunch (say, 10 or 20) of series of random numbers and then compute the matrix of correlations (or t-values, if you prefer) between all of them. You'll find that roughly 5% of the correlations are "significant" at the p<.05 level, even though the series are really random and independent. It's a trivial result and just what you'd expect by chance, but it does drive the point home that you can't rely on p-values alone if you're testing multiple hypotheses. In the latter case there are corrected measures available that take this into account.
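
As one illustration of the "corrected measures" mentioned above, here is a minimal sketch of two standard multiple-comparison corrections, Bonferroni and Holm. The comment does not say which correction it has in mind, so these are swapped in as common examples, and the p-values below are invented purely to show the mechanics.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: test p-values from smallest up, stopping at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject


# Made-up p-values from five hypothetical tests.
example_p = [0.001, 0.012, 0.03, 0.04, 0.2]
print([p < 0.05 for p in example_p])   # uncorrected decisions
print(bonferroni_reject(example_p))
print(holm_reject(example_p))
```

Both procedures control the chance of any false positive across the whole family of tests; Holm is never less powerful than Bonferroni, which is why it rejects one more hypothesis here.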

    3. Re:Even more obligatory by Anonymous Coward · · Score: 0

      The issue isn't really that the nil-null NHST p-value fails to detect deviations from chance. It is that just knowing there has been a deviation from chance is not helpful in many use-cases. There could be many mundane reasons for such deviations such as baseline differences, your treatment had some weird irrelevant side effect, your measurement methods have some slight non stationary bias. Literally anything interesting or not interesting could still explain the deviation from chance.

      This is why previous generations of scientists instead considered "deviation from theoretical prediction". Substituting chance in there is a huge mistake.

    4. Re:Even more obligatory by Jane+Q.+Public · · Score: 1

      It's a trivial result and just what you'd expect by chance, but it does drive the point home that you can't rely on p-values alone if you're testing multiple hypotheses.

      On the other hand, TFA is proposing to replace this with Bayesian probabilities, which are likely even less understood, even more abused, and it could open the door to subjectivism.

    5. Re:Even more obligatory by Anonymous Coward · · Score: 0

      The thing that needs replacing is the default null hypothesis. There is no salvaging that idea with any type of math or philosophy. It needs to be replaced with numerical predictions derived from a theory. This whole NHST thing has been a gigantic comedy of errors. I am certain that within 20 years of the folly being recognized by a critical mass of researchers we will have cures for many diseases.

    6. Re:Even more obligatory by ceoyoyo · · Score: 1

      Actually, no. TFA article doesn't like Bayesian techniques either. They want to use purely descriptive statistics.

      So basically, they're replacing something that a lot of people misinterpret with something else that essentially cannot be interpreted properly due to lack of information.

    7. Re:Even more obligatory by ceoyoyo · · Score: 1

      You know that in a lot of statistical testing the null hypothesis is the output of a theory, right? Just because you didn't ever advance beyond the most basic t-test doesn't mean nobody else did.

    8. Re:Even more obligatory by Anonymous Coward · · Score: 0

      I have no problem with that use. Re-read the post.

    9. Re:Even more obligatory by Anonymous Coward · · Score: 0

      Let me put it another way. There is usually just no point to "rejecting chance" as an explanation for the difference between two groups (exceptions would be ESP, etc). The reason this is so is that there is an endless list of trivial non-chance reasons for two groups to differ. That said, even there the p-value can be a useful shorthand for a likelihood function in a very limited set of situations: http://arxiv.org/abs/1311.0081

      Stop telling people that a p-value is some kind of probability. While that may be so, it has nothing to do with why it could be useful for the vast majority of cases where it is applied.

    10. Re:Even more obligatory by Anonymous Coward · · Score: 0

      So basically, they're replacing something that a lot of people misinterpret with something else that essentially cannot be interpreted properly due to lack of information.

      Sounds perfect for a journal on psychology.

    11. Re:Even more obligatory by Jane+Q.+Public · · Score: 1

      "One possible replacement that might fit the bill is a rival approach of data analysis called Bayesianism."

      I did not mean to suggest it was the only alternative offered. But it was one, and I didn't see enough discussion of its shortcomings for my taste.

    12. Re:Even more obligatory by gzuckier · · Score: 1

      Actually, no. TFA article doesn't like Bayesian techniques either. They want to use purely descriptive statistics.

      So basically, they're replacing something that a lot of people misinterpret with something else that essentially cannot be interpreted properly due to lack of information.

      Reminds me of when in my younger and wiser days, I challenged the Old Statistician's use of arithmetic mean for a non-Gaussian variable. "Why do you use the mean?" I raged. "Why do you not use the median?"
      "Because they don't know what the median means" he answered, "and they all know what the mean is, so they're better off with a biased number they understand than a correct one that completely baffles them". Since then, I have come to agree.
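
A tiny example of why the younger self objected: on a skewed variable the mean and median can tell very different stories. The numbers below are invented (think of them as incomes in thousands, with one outlier) and are not from any study.

```python
from statistics import mean, median

# Hypothetical skewed sample: nine modest values and one extreme one.
sample = [20, 22, 25, 27, 30, 32, 35, 40, 45, 500]

sample_mean = mean(sample)
sample_median = median(sample)
print(sample_mean, sample_median)
```

The single large value drags the mean well above every other observation, while the median stays in the middle of the bulk of the data.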

      --
      Star Trek transporters are just 3d printers.
  6. Re:What's the problem? by monkeyzoo · · Score: 5, Funny

    This is social science. Mathematics and statistics aren't even relevant.

    Correlation between low intelligence and uninformed statements of this nature is p<0.01.

  7. Now we'll see overuse of F-score by bangular · · Score: 1

    Don't worry, we'll find another panacea statistic.

  8. Re:What's the problem? by LifesABeach · · Score: 0

    My next recommendation to the Basic and Applied Social Psychology committee is that any conjunctions in a sentence are to be removed. Mainly because of the poor usage of English grammar in a typical submittal.

  9. Interesting questions by PaulMattSutter · · Score: 2

    From a blog by a colleague of mine on the subject: "Questions that p-values can answer" != "Interesting questions about the world".

    1. Re:Interesting questions by blue9steel · · Score: 1

      Of course those questions belong in philosophy not social science.

  10. Basic and Applied Social Psychology by linearZ · · Score: 2

    "Look at this experimental evidence and tell me what you see?"

    --
    Revolution is the opium of the intellectuals.
  11. Re:What's the problem? by TechyImmigrant · · Score: 5, Insightful

    This is social science. Mathematics and statistics aren't even relevant.

    Yes they are. Get quantitative data, use quantitative methods.
    Just because most social 'scientists' are not experts at statistical inference, it doesn't mean it can't be done correctly.

    p-values are just a probability of something. Do your experiment well and the 'something' makes sense.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  12. Considering the way Republicans have... by Anonymous Coward · · Score: 0

    misused them, it is right to ban them. To not ban them is to support racism.

    1. Re:Considering the way Republicans have... by Anonymous Coward · · Score: 0

      They do love to use statistics to justify killing and torture. They certainly use them for profiling by race when making random arrests.

  13. Re:What by Anonymous Coward · · Score: 0

    Did that sound as dumb in your head as it looks on the screen?

  14. Past APA president Kimble turns over in his grave by Anonymous Coward · · Score: 2, Insightful

    At least one president of the American Psychological Association published a statistics book intelligent enough that it used to be required in university statistics intro classes: http://books.google.com/books/about/How_to_use_and_misuse_statistics.html

    Not that he would have disagreed with the comment about social psychologists...

  15. My Paper by Anonymous Coward · · Score: 5, Interesting

    Ok, let me enlighten the readers a bit. The reviewers tend to be the typical researcher within the field. The typical social researcher does not have a very strong math background. There is a lot of them into qualitative research and quantitative tends to stop at ANOVA. I have multiple masters in business and social science and worked on a Ph.D. in social science (Being vague here for a reason). However, I have a dual bachelors in comp sci and math. I know statistical analysis very well. My master's thesis for my MBA was an in-depth analysis of survey responses. 30 pages of body and really good graphs. My research professor, an econometrics professor, and I submitted it to a second tier journal associated with the field I specialized in...

    ... 6 pages got published. 6?!? They took out the vast majority of the math. Why? "Our readers are really bad at math," said the editor. If you knew the field... you would be scared shitless. The reviewers suggested we take out the math because it confused them. This is why they want the p-value out... it is misunderstood and abused. The reviewers have NO idea if it is being used correctly.

    1. Re:My Paper by retchdog · · Score: 1

      I have multiple masters in business and social science and worked on a Ph.D. in social science (Being vague here for a reason).

      And what reason is that? You're not even close to identifiable from this information, you know...

      --
      "They were pure niggers." – Noam Chomsky
    2. Re:My Paper by Anonymous Coward · · Score: 0

      Actually, a lot of us have multiple masters, one being an MBA and the other something else, even with the comp sci / math degree (I know quite a number of us in different fields with the same Bachelor's combo - including a couple of J.D.s). That something else and the Ph.D. are MUCH more identifying if spelled out, especially with the journal. :-p

    3. Re:My Paper by bluFox · · Score: 2

      The unfortunate thing is that what they want could have been easily accomplished by requiring smaller p values, and also effect sizes (or the confidence intervals). Instead, it seems that the consensus is on using bayesian tools, and the standard ways of using the bayesian equivalents of t-tests[1] typically requires a smaller number of samples than frequentist methods depending on their prior. [1] http://www.sumsar.net/blog/201...

      --
      ~561
    4. Re:My Paper by Lehk228 · · Score: 1

      so you are saying the journal is shit and should be disregarded?

      because that is what I got from that. they don't understand the material they are approving or rejecting and so they serve no useful purpose.

      --
      Snowden and Manning are heroes.
    5. Re:My Paper by Brett+Buck · · Score: 1

      I think he is saying the field is shit and should be disregarded.

    6. Re:My Paper by Anonymous Coward · · Score: 0

      Sigh. Yes.

      My own sister, a PhD in Molecular Biology, decided on that field because she wouldn't have to use much math. Turns out she got forced into getting good at it despite this original intention (kicking and screaming mostly - I'm an EE and I helped to tutor her back in the day). But yes, my sister is the exception and what you say is sadly the rule. Honestly this is why "soft science" is barely even science at all!

  16. Blind math fetishism??? by Anonymous Coward · · Score: 0

    Blindfold. Check.
    Math textbook. Check.
    Bedroom. Check.
    Girlfriend to enjoy my fetish with??? Oh wait, this is Slashdot.

    1. Re:Blind math fetishism??? by retchdog · · Score: 3, Informative

      speak for yourself. i've never tried using a Springer book as a nipple weight, though; i'll give it a try sometime. thanks.

      --
      "They were pure niggers." – Noam Chomsky
  17. p-values are routinely misused ... by QuietLagoon · · Score: 4, Funny

    This is why we can't have nice things.

  18. Re:Past APA president Kimble turns over in his gra by Rob+Riggs · · Score: 1, Interesting

    used to be required in university statistics intro classes: http://books.google.com/books/about/How_to_use_and_misuse_statistics.html

    I suspect that book is still foundational in most University advertising/marketing programs.

    --
    the growth in cynicism and rebellion has not been without cause
  19. Re:Social Science != Science by Anonymous Coward · · Score: 0

    Just because you are post positivist doesn't mean you are right. Social science is more difficult and has constantly been used to resolve issues with 'hard' science. How difficult do you think it was to create the Nash equilibrium, which is social science? What about the Nobel prize game theory won in Biology based on social science? Why don't you spend some time learning something before you blast it?

  20. Re:Libtards & SJWs by Anonymous Coward · · Score: 0

    I just tried thinking for myself but because you told me to do it, it didn't feel right. Sheesh, any suggestions??

  21. Plural or singular - pick one by wonkey_monkey · · Score: 1

    While p-values are routinely misused in scientific literature, many researchers who understand its proper role are upset about the ban.

    Do they also know whether "p-values" is plural or singular?

    --
    systemd is Roko's Basilisk.
  22. Re: What's the problem? by Anonymous Coward · · Score: 0

    "Racism", "sexism", "patriarchy" and related topics of study within the social "sciences" inherently can't be quantitatively analyzed in any meaningful way.

  23. Re:What's the problem? by monkeyzoo · · Score: 4, Insightful

    I agree with you. Yet no need for the quotes around social 'scientists.' Psychologists, socialists, etc. employ the same experimental designs and mathematical techniques in experiments as doctors or others performing drug efficacy or medical outcome experiments, for example.

    P-Value: It's intervention versus control group. Standard, basic scientific experimental design and statistical analysis stuff.

    It's an uninformed and naive view to think that people looking at the behavior of humans at the level of social organization are somehow intellectually or scientifically less able than those examining them at the biological level.

  24. Re:Social Science != Science by Anonymous Coward · · Score: 0

    Nash equilibria are pure math. They do have lots of applications, but the definition is about a point in a multidimensional space satisfying a system of inequalities.

  25. Re:Social Science is NOT Science by Anonymous Coward · · Score: 1

    Just because a lot of researchers in a particular field suck at what they do doesn't mean the field is inherently not a science, only that the subset of crappy work is not scientific. There is plenty of work in social sciences that does live up to the science name, including controlled experiments and practical applications.

  26. Re:Social Science != Science by retchdog · · Score: 0

    those two examples are from economics. whatever your opinion of that discipline may be, it is, at least, in a different class of bullshit from sociology or "social psychology".

    --
    "They were pure niggers." – Noam Chomsky
  27. Re: What's the problem? by Anonymous Coward · · Score: 1, Insightful

    I could cite examples of your folly all day, but since only one instance is needed to refute your foolhardy blanket statement, this will suffice:
    http://en.wikipedia.org/wiki/I...

    Wow! Who [apart from a self-aware and non-self-righteous human being] could imagine that the processes of the brain could be measured and analyzed mathematically?!
    Well, I guess you learn something every day, eh?

  28. Graphing the data would help a lot of the time by umafuckit · · Score: 4, Insightful

    I don't think you even need to be pushing people to do Bayesian stats. You just need to force them to graph their data properly. In *a lot* of biological and social science sub-fields it's standard practice to show your raw data only in the form of a table and the results of stats tests only in the form of a table. They aren't used to looking at graphs and raw data. You can hide a lot of terrible stuff that way, like weird outliers. Things would likely improve immediately in these fields if they banned tables and forced researchers to produce box plots (ideally with overlaid jittered raw data), histograms, overlaid 95% confidence intervals corresponding to their stats tests, etc, etc.

    Having seen some of these people work, it's clear that many of them never make these plots in the first place. All they do is look at lists of numbers in summary tables. They have no clue in the first place what their data really look like, and no good knowledge of how to properly analyse data and make graphs. Before they even teach stats to undergrads they should be making them learn to plot data and read graphs. It's obvious most of them can't even do that.

    1. Re:Graphing the data would help a lot of the time by Anonymous Coward · · Score: 0

      Boxplots are pretty much deprecated (or at least should be). Use a plot that includes a density estimate. For example a beanplot:
      http://www.jstatsoft.org/v28/c01/paper

      The R package viopoints is also good. What the "problem cases" you describe will do if they are forced to plot something is to make a dynamite plot which contains the same information as the table except easier to digest. Plots should filter/reduce as little information as possible without getting confusing. They also need to start plotting x vs y if they say x and y are related. It is bizarre how many medical papers fail to do this.

    2. Re:Graphing the data would help a lot of the time by phantomfive · · Score: 1

      They have no clue in the first place what their data really look like, and no good knowledge of how to properly analyse data and make graphs. Before they even teach stats to undergrads they should be making them learn to plot data and read graphs. It's obvious most of them can't even do that.

      That........

      Explains why some people struggle horrifically in statistics, and others can sleep through class and still get an A.

      --
      "First they came for the slanderers and i said nothing."
    3. Re:Graphing the data would help a lot of the time by Tony+Isaac · · Score: 1

      Graphs can lie just as easily as statistics themselves.

    4. Re:Graphing the data would help a lot of the time by umafuckit · · Score: 1

      They both can be deceptive if you read them incorrectly or they're designed to deceive. However, it's harder to deceive someone with a graph than with the results of a stats test. I really think graphs lie less easily than statistics.

    5. Re:Graphing the data would help a lot of the time by umafuckit · · Score: 1

      In our field we call "bean plot" a violin plot. I agree it's better than a box plot, but it's basically just a histogram. Beanplot or boxplot, I think it helps to overlay the jittered raw data. Even a box plot is far better than a bar chart (which is distressingly common and little better than a table).

    6. Re:Graphing the data would help a lot of the time by Anonymous Coward · · Score: 0

      "Beanplot or boxplot, I think it helps to overlay the jittered raw data."

      Always.

    7. Re:Graphing the data would help a lot of the time by gzuckier · · Score: 1

      They have no clue in the first place what their data really look like, and no good knowledge of how to properly analyse data and make graphs. Before they even teach stats to undergrads they should be making them learn to plot data and read graphs. It's obvious most of them can't even do that.

      That........ Explains why some people struggle horrifically in statistics, and others can sleep through class and still get an A.

      Absolutely. A corollary of this is that many hypotheses bolstered with vast quantities of statistical analysis proving their statistical significance, become immediately ridiculous as soon as they are graphed.

      --
      Star Trek transporters are just 3d printers.
  29. Re:What's the problem? by Anonymous Coward · · Score: 0

    I agree with you. Yet no need for the quotes around social 'scientists.' Psychologists, socialists, etc. employ the same experimental designs and mathematical techniques in experiments as doctors

    No, socialists are a political group, and psychology is the unscientific part of psychiatry.

  30. Re: What's the problem? by Anonymous Coward · · Score: 0

    Welp you sure convinced me. Let's lock up all the white male cispigs.

  31. Re: What's the problem? by retchdog · · Score: 4, Informative

    Yes they can, in some cases. There was a very well-controlled study where two sets of anonymous letters of application were sent to various positions at a large number of companies from a large number of applicants. The letters included similar random credentials from random institutions, random cosmetic variations of the same cover letter, and so on, to avoid tipping the hand of the researchers. The only difference between the two groups of letters was that one was given names sampled uniformly from African-Americans, and the other given names sampled uniformly from everyone else. The names were assigned in a blind way, literally a random form insertion, to avoid introducing bias.

    I'm sure you can guess where this is going. The response and offer rate to the blacks was significantly lower, both statistically and practically. It's rather hard to explain that away, though I'm sure someone here will try without having even read the study.

    --
    "They were pure niggers." – Noam Chomsky
  32. Re:What's the problem? by Anonymous Coward · · Score: 0, Insightful

    This is social science. Mathematics and statistics aren't even relevant.

    Undergraduate psychology and sociology students are required to study statistics. Undergraduates in medicine, biology, chemistry, physics, et al. are not. So, perhaps you need to rethink your ignorance about the limits of the scientific method and educate yourself about what these subject areas actually entail in real life instead of in your uninformed world view.

  33. Re: What's the problem? by retchdog · · Score: 1

    Dammit, I meant the letters were written anonymously and then labeled with names later. I guess "pseudonymous" would have been a better word. Oh well.

    --
    "They were pure niggers." – Noam Chomsky
  34. Re:What's the problem? by Anonymous Coward · · Score: 2, Funny

    Not a big fan of college, eh?

  35. Inconvenient Conclusions by jtwiegand · · Score: 1

    Use of the p-value gave us conclusions that weren't politically correct. We have corrected the issue by banning the use of the p-value so that only True Science may be published.

    1. Re:Inconvenient Conclusions by Anonymous Coward · · Score: 0

      More like "Use of the p-value gave us conclusions that weren't correct". If the way you use it is to just see that two groups are different but you actually care about why they are different, what do you expect to happen? It turns into a disproving strawmen ritual.

  36. Re: What's the problem? by Anonymous Coward · · Score: 0

    Welp you sure convinced me. Let's lock up all the white male cispigs.

    You obviously didn't read. IAT applies equally to all races.

  37. Re:What by Anonymous Coward · · Score: 0

    Did that sound as dumb in your head as it looks on the screen?

    +1
    LOL

  38. Re:What's the problem? by Anonymous Coward · · Score: 0

    Just because most social 'scientists' are not experts at statistical inference, it doesn't mean it can't be done correctly.

    p-values are just a probability of something.

    Actually, p-values are about CORRELATION.
    Maybe *you* aren't well-positioned to be denigrating others as not statistical experts.

  39. Re: What's the problem? by twitnutttt · · Score: 1

    "Racism", "sexism", "patriarchy" and related topics of study within the social "sciences" inherently can't be quantitatively analyzed in any meaningful way.

    You sound as silly now as the people who used to think atoms were the *inherent* limit of divisibility and exploration. Then electrons...
    In science, as in politics, innovation tends to come from the death of the old stalwarts rather than their enlightenment.
    Even Einstein became an obstructionist to quantum mechanics in his later years.

  40. Re: What's the problem? by ShanghaiBill · · Score: 5, Informative

    There was a very well-controlled study where two sets of anonymous letters of application ...

    This study was conducted by Steven Levitt, and is described in his book Freakonomics, which is a fantastic book for anyone interested in the application of statistics to social science. Here is the original paper.

  41. Re: What's the problem? by twitnutttt · · Score: 1

    Even Einstein became an obstructionist to quantum mechanics in his later years.

    "God does not play dice with the universe." ;-)

  42. Perfectly understandable move... by rgbatduke · · Score: 4, Informative

    ...and this isn't even the first journal to do this. It's probably happening now because an entire book has just come out walking people through how universally abused p-values are as statistical measures.

    http://www.statisticsdonewrong...

    The book is nice in that it does give one replacements that are more robust and less likely to be meaningless, although nothing can substitute for having a clue about data dredging etc.

    rgb

    --
    Even when the experts all agree, they may well be mistaken. --- Bertrand Russell.
  43. Re: What's the problem? by Anonymous Coward · · Score: 0

    Are you sure that the term "well-controlled study" applies, given how you repeatedly used the term "random" when describing this experiment?

    Randomness is not compatible with experimental control. Additionally, randomness itself cannot be controlled, because doing so would prevent it from being true randomness.

  44. Re:What's the problem? by TechyImmigrant · · Score: 1

    I used 'scientists' in quotes in the same sense I'd put computer 'scientists' in quotes. My degree is computer science, but I dispute that it's a science in the conventional sense.

    I find debugging hardware is closer to science. You can't really see inside the chip, but you can develop hypotheses about what is wrong and come up with tests that will refute (or not) the hypotheses. Iterate until you think you probably know the truth.

    Doing things well in social sciences is hard. The field (human subjects, IRB etc.) doesn't admit normal testing methods readily. You can't set up a control group and not teach them anything when the control group are school children, or not treat them when the control group is cancer patients, or not house them when the control group is people of the prevalent skin color in the area. The statistics to do things correctly are therefore non trivial and are all about making do with what you have and not over-inferring. If your professors don't know this stuff, and you don't know this stuff, and the paper reviewers don't know this stuff, then it's going to be hard to be rigorous.

    I design chips and I do new things that haven't been done before in the analog/digital overlap. So I need data to test. My curves and P-values look great, since I just pull a couple of gig of data when I need it. The control group won't get upset, it's a chip. This is easy compared to statistics in the social sciences. So it's less 'science' and more 'advanced inference'.

    It's reasonable for a journal to declare that it (and its reviewers) don't know that stuff. Presumably there is a journal with statistically skilled reviewers and you should submit there if you need that sort of peer review.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  45. they really should be doing bayesian inference by Anonymous Coward · · Score: 1

    IMO the main problem with p-values is it equates p(a|b) with p(b|a) ... disregarding that different values of p(b) and p(a) could make that equation woefully inaccurate.

    it really needs to be a bayesian estimate. they need to look at p(a) and p(b) in addition to p(a|b) (or p(b|a)).

    regarding "... they need to substitute it with something more meaningful", that's the more meaningful thing they need to substitute it with.
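
    For reference, the relation between the two conditionals is Bayes' theorem, written here with the same a and b as in the post (the notation is the only thing added):

```latex
p(a \mid b) = \frac{p(b \mid a)\, p(a)}{p(b)}
```

    When p(b|a) is nonzero, the two conditionals coincide only if p(a) = p(b), which is why swapping one for the other can be badly off.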

    1. Re:they really should be doing bayesian inference by Anonymous Coward · · Score: 0

      Unfortunately, although Bayesian methods are valid in mathematical isolation, in the real world they are simply a way of inserting subjective intuition into numerical judgements.

      (posting as AC since I've already moderated upthread).

    2. Re:they really should be doing bayesian inference by Anonymous Coward · · Score: 2, Insightful

      , in the real world they are simply a way of inserting subjective intuition into numerical judgements.

      That is actually one of the selling points. You're going to insert subjective intuition into your judgements and methods regardless of what method you use. With proper use of Bayesian methods you can more explicitly state your assumptions, even if you don't do much about them.

  46. Creative thinking by crmarvin42 · · Score: 1

    If this is important enough of an issue to consider such a radical change to policy, then they should also have considered other possible solutions, like requiring a statistician be included in the pool of reviewers. The journal I submit to most frequently uses 2 to 3 ad hoc reviewers plus the associate section editor. It could be possible to require the section editor who chooses the ad hoc reviewers to include a statistician as the 3rd reviewer. They would then review for the soundness of the statistical procedures, and the appropriateness of the conclusions based on the model used, and analysis conducted.

    I have better stats chops than most in my field (Dunning-Kruger delusion on my part, possibly), but I know that I'm no statistician. I think that getting an actual statistician involved in reviewing most papers as a content expert is far more valuable to science as a whole than simply banning a statistical convention that can be, but is not universally, abused. The comments from the statistician would improve the statistical prowess of the corresponding author, thus reducing the tendency for conclusions based on poor stats to be accepted at face value. This move just hides the ignorance behind confidence intervals, which can also be abused if they are not calculated correctly.

    --
    Bureaucracy expands to meet the needs of the expanding bureaucracy.-Oscar Wilde
  47. Re:What's the problem? by swillden · · Score: 4, Insightful

    Actually, p-values are about CORRELATION. Maybe *you* aren't well-positioned to be denigrating others as not statistical experts.

    I may be responding to a troll here, but, no, the GP is correct. P-values are about probability. They're often used in the context of evaluating a correlation, but they needn't be. Specifically, a p-value is the probability of seeing a result at least as extreme as the observed one (which may be a correlation) purely from the luck of random sampling, assuming there is no real effect. Good sampling techniques can't eliminate the possibility that your random sample just happens to be non-representative, and the p value measures how easily that kind of bad luck alone could produce what you saw. A p value of 0.05 means that, if there were no real effect, a result this extreme would still turn up in about 5% of samples.

    The problem with p values is that they only describe one way that the experiment could have gone wrong, but people interpret them to mean overall confidence -- or, even worse -- significance of the result, when they really only describe confidence that the sample wasn't biased due to bad luck in random sampling. It could have been biased because the sampling methodology wasn't good. It could have been meaningless because it finds an effect which is real, but negligibly small. It could be meaningless because the experiment was just badly constructed and didn't measure what it thought it was measuring. There could be lots and lots of other problems.

    There's nothing inherently wrong with p values, but people tend to believe they mean far more than they do.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  48. Q (Filter error: You can type more than that) by Anonymous Coward · · Score: 0

    The Q are intrigued. Pray they don't intervene.

  49. Re:Social Science != Science by Anonymous Coward · · Score: 1

    MMM. Actually economics (especially the behavioral variety, which is the most innovative field, as I'm sure you didn't know) *is* social psychology with another name.
    Back to the books, bro.

  50. Three puzzles by Okian+Warrior · · Score: 4, Interesting

    It is the job of the reviewer to check that the statistic was used ion the proper context. not to check the result, but the methodology. It sounds like social journal simply either have bad reviewer or sucks at methodology.

    That's a good sentiment, but it won't work in practice. Here's an example:

    Suppose a researcher is running rats in a maze. He measures many things, including the direction that first-run rats turn in their first choice.

    He rummages around in the data and finds that more rats (by a lot) turn left on their first attempt. It's highly unlikely that this number of rats would turn left on their first choice based on chance (an easy calculation), so this seems like an interesting effect.

    He writes his paper and submits for publication: "Rats prefer to turn left", P<0.05, the effect is real, and all is good.

    There's no realistic way that a reviewer can spot the flaw in this paper.

    Actually, let's pose this as a puzzle to the readers. Can *you* spot the flaw in the methodology? And if so, can you describe it in a way that makes it obvious to other readers?

    (Note that this is a flaw in statistical reasoning, not methodology. It's not because of latent scent trails in the maze or anything else about the setup.)
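
    The "easy calculation" mentioned above can be sketched as an exact one-sided binomial tail. The counts used here (15 of 20 rats turning left) and the function name are made up for illustration and do not come from the post; the sketch assumes each rat independently turns left with probability 0.5 under the null.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p_null: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 15 of 20 first-run rats turned left.
p_value = binomial_tail(15, 20)
print(f"one-sided p-value: {p_value:.4f}")
```

    This only says how surprising that one count would be if turns were coin flips.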

    ====

    Add to this the number of misunderstandings that people have about the statistical process, and it becomes clear that... what?

    Where does the 0.05 number come from? It comes from Fisher himself, of course - any textbook will tell you that. If P<0.05, then the results are significant and worthy of publication.

    Except that Fisher didn't *say* that - he said something vaguely similar and it was misinterpreted by many people. Can you describe the difference between what he said and what the textbooks claim he said?

    ====

    You have a null hypothesis and some data with a very low probability. Let's say it's P<0.01. This is such a good P-value that we can reject the null hypothesis and accept the alternative explanation.

    P<0.01 is the probability of the data, given the (null) hypothesis. Thus we assume that the probability of the hypothesis is low, given the data.

    Can you point out the flaw in this reasoning? Can you do it in a way that other readers will immediately see the problem?

    There is a further calculation/formula that will fix the flawed reasoning and allow you to make a correct inference. It's very well-known, the formula has a name, and probably everyone reading this has at least heard of the name. Can you describe how to fix the inference in a way that will make it obvious to the reader?

    1. Re:Three puzzles by fropenn · · Score: 1

      Do you really want someone to answer, or are these all rhetorical?
      Here's my take on this issue: Just because something is prone to be misused and misinterpreted doesn't mean it should be banned. In fact, some of the replacement approaches use the very same logic just with a different mathematical calculation process. However, it does illustrate the need for researchers to clearly communicate their results in ways that are less likely to be misused or misinterpreted. This wouldn't exclude the use of p-values but they are only one of many possible tools for researchers to use.

    2. Re:Three puzzles by Carewolf · · Score: 1

      As always there is an xkcd comic that answers your question in a nice and easy to understand fashion.

      I leave it to you to find the relevant link ;p

    3. Re:Three puzzles by lgw · · Score: 1

      He writes his paper and submits for publication: "Rats prefer to turn left", P<0.05, the effect is real, and all is good.

      There's no realistic way that a reviewer can spot the flaw in this paper.

      Actually, let's pose this as a puzzle to the readers. Can *you* spot the flaw in the methodology? And if so, can you describe it in a way that makes it obvious to other readers?

      I guess I don't see it. While P<0.05 isn't all that compelling, it does seem like prima facie evidence that the rats used in the sample prefer to turn left at that intersection for some reason. There's no hypothesis as to why, and thus no way to generalize and no testable prediction of how often rats turn left in different circumstances, but it's still an interesting measurement.

      You have a null hypothesis and some data with a very low probability. Let's say it's P<0.01. This is such a good P-value that we can reject the null hypothesis and accept the alternative explanation. ...

      Can you point out the flaw in this reasoning?

      You have evidence that the null hypothesis is flawed, but none that the alternative hypothesis is the correct explanation?

      The scientific method centers on making testable predictions that differ from the null hypothesis, then finding new data to see if the new hypothesis made correct predictions, or was falsified. Statistical methods can only support the new hypothesis once you have new data to evaluate.

      --
      Socialism: a lie told by totalitarians and believed by fools.
    4. Re:Three puzzles by Chris+Mattern · · Score: 2

      The answer is simple. He's taken dozens, if not hundreds of measurements. The odds are in favor of one of the measurements turning up a correlation by chance. The odds against this particular measurement being by chance are 19 to 1--but he's selected it out of the group. The chances that one of *any* of his measurements would show such a correlation by chance are quite high, and he's just selected out the one that got that correlation.
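
      How fast the odds swing can be sketched with a one-line formula. This assumes the measurements are independent and that every null hypothesis is true, which is a simplification; the function name is made up for illustration.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests
    falls below alpha when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Print the chance of at least one spurious "hit" for a few test counts.
for num_tests in (1, 10, 20, 100):
    print(num_tests, round(chance_of_false_positive(num_tests), 3))
```

      With 20 measurements the chance of at least one spurious "significant" result is already well over half.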

    5. Re:Three puzzles by ceoyoyo · · Score: 1

      I assume you're getting at multiple comparisons because you said "he measures many things."

      You're right, the researcher should correct his p-value for the multiple comparisons. Unfortunately, alternatives to p-values ALSO give misleading results if not corrected and, in general, are more difficult to correct quantitatively.
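
      One common way to do that correction is Bonferroni, sketched below. It is only one (conservative) option and not necessarily the one meant here; the function name and the example p-values are invented for illustration.

```python
def bonferroni(p_values: list[float]) -> list[float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical: the rat result (p = 0.04) was one of 20 measurements;
# the other 19 are filled in with 0.5 just to set the count.
raw = [0.04] + [0.5] * 19
adjusted = bonferroni(raw)
print(adjusted[0] < 0.05)  # does the "left turn" result survive correction?
```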

    6. Re:Three puzzles by Rockoon · · Score: 1

      I do not understand why people do not use the proper term for this, data dredging.

      --
      "His name was James Damore."
    7. Re:Three puzzles by Okian+Warrior · · Score: 1

      He writes his paper and submits for publication: "Rats prefer to turn left", P<0.05, the effect is real, and all is good.

      There's no realistic way that a reviewer can spot the flaw in this paper.

      Actually, let's pose this as a puzzle to the readers. Can *you* spot the flaw in the methodology? And if so, can you describe it in a way that makes it obvious to other readers?

      I guess I don't see it. While P<0.05 isn't all that compelling, it does seem like prima facie evidence that the rats used in the sample prefer to turn left at that intersection for some reason. There's no hypothesis as to why, and thus no way to generalize and no testable prediction of how often rats turn left in different circumstances, but it's still an interesting measurement.

      Another poster got this correct: with dozens of measurements, the chance that at least one of them will be unusual by chance alone is very high.

      A proper study states the hypothesis *before* taking the data specifically to avoid this. If you have an anomaly in the data, you must state the hypothesis and do another study to make certain.

      You have a null hypothesis and some data with a very low probability. Let's say it's P<0.01. This is such a good P-value that we can reject the null hypothesis and accept the alternative explanation. ...

      Can you point out the flaw in this reasoning?

      You have evidence that the null hypothesis is flawed, but none that the alternative hypothesis is the correct explanation?

      The scientific method centers on making testable predictions that differ from the null hypothesis, then finding new data to see if the new hypothesis made correct predictions, or was falsified. Statistical methods can only support the new hypothesis once you have new data to evaluate.

      The flaw is called the "fallacy of the reversed conditional".

      The researcher has "probability of data, given hypothesis" and assumes this implies "probability of hypothesis, given data". These are two very different things which are not always both valid.

      Case 1: Probability that person is woman, given that they're carrying a pocketbook (high), Probability that person is carrying a pocketbook, given that they are a woman (also high).

      Case 2: Probability that John is dead, given that he was executed (high), Probability that John was executed, given that he is dead (low).

      In case 1 it's OK to reverse the conditional, but in case 2 it's not. The difference stems from the relative populations, which are about equal in case 1 (women and pocketbooks), and vastly unequal in case 2 (dead people versus executed people).

      A low P value (P of data, given hypothesis) does not in general indicate that the probability of the null hypothesis is also low (P of hypothesis, given data).
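
      The formula that repairs the inference is Bayes' rule. Here is a minimal numeric sketch with just two competing hypotheses; all the input numbers are hypothetical, and it treats the P value loosely as "probability of the data under the null" (a real P value is a tail probability, so this is a simplification).

```python
def posterior_null(p_data_given_null: float,
                   p_data_given_alt: float,
                   prior_null: float) -> float:
    """P(null | data) via Bayes' rule for two competing hypotheses."""
    prior_alt = 1.0 - prior_null
    evidence = p_data_given_null * prior_null + p_data_given_alt * prior_alt
    return p_data_given_null * prior_null / evidence


# Hypothetical inputs: the data are rare under the null (0.01), but the null
# was very plausible beforehand (0.99) and the alternative predicts the data
# only half the time.
print(round(posterior_null(0.01, 0.5, 0.99), 3))

# Same data, but with the two hypotheses equally plausible beforehand.
print(round(posterior_null(0.01, 0.5, 0.5), 3))
```

      The same "P of data" gives very different "P of hypothesis" depending on the prior, which is the pocketbook versus execution difference in numbers.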

  51. Re:Social Science is NOT Science by Anonymous Coward · · Score: 1

    Just because you put the word science behind your name, doesn't mean you're doing scientific work. Psychology is great example of where the term "science" has been grossly misused and misdirected. Psychology is really about the pursuit to understand why we're all fucked up and explain away behaviour that no one wants to take ownership of. If this publication is going to block anything, block anyone using the term science, because psychology is not science, it's people trying to make excuses about the way they feel and act, which is all boils down to scape goating responsibility for your actions.

    With the exception of chemical imbalance, every single person is directly responsible for there actions, case closed, now lets stop using the term science to describe excuse generation.

    Didn't have time for college, eh? Don't like to do much reading on your own? Things you don't know yet are intimidating. I understand. Anyway, if you ever feel like it, you can educate yourself about the types of research conducted in these fields.

    Here's a hint... Freud is to psychology what Copernicus was to physics, an early thinker but not the state-of-the-art.

  52. Re:Social Science != Science by Anonymous Coward · · Score: 0

    Considering the amount of math that goes into advanced sociology... It isn't bullshit, not to say there isn't bullshit in it or hard science (I am looking at you Physics). The problem is that people see qualitative methods and think, that is so much bullshit. However, it isn't. It is a formalized way of determining phenomena just so that we can use quantitative analysis to figure out if the phenomena are valid. Especially since the Belmont Report, because we can't set experiments that ruin people for life anymore.

  53. p-value research is misleading almost always by SteveWoz · · Score: 5, Interesting

    I studied and tutored experimental design and this use of inferential statistics. I even came up with a formula for 1/5 the calculator keystrokes when learning to calculate the p-value manually. Take the standard deviation and mean for each group, then calculate the standard deviation of these means (how different the groups are) divided by the mean of these standard deviations (how wide the groups of data are) and multiply by the square root of n (sample size for each group). But that's off the point. We had 5 papers in our class for psychology majors (I almost graduated in that instead of engineering) that discussed why controlled experiments (using the p-value) should not be published. In each case my knee-jerk reaction was that they didn't like math or didn't understand math and just wanted to 'suppose' answers. But each article attacked the math abuse, by proficient academics at universities who did this sort of research. I came around too. The math is established for random environments but the scientists control every bit of the environment, not to get better results but to detect things so tiny that they really don't matter. The math lets them misuse the word 'significant' as though there is a strong connection between cause and effect. Yet every environmental restriction (same living arrangements, same diets, same genetic strain of rats, etc) invalidates the result. It's called internal validity (finding it in the experiment) vs. external validity (applying in real life). You can also find things that are weaker (by the square root of n) by using larger groups. A study can be set up in a way so as to likely find 'something' tiny and get the research prestige, but another study can be set up with different controls that turn out an opposite result. And none apply to real life like reading the results of an entire population living normal lives.
    You have to study and think quite a while, as I did (even walking the streets around Berkeley to find books on the subject up to 40 years prior) to see that the words "99 percent significance level" mean not a strong effect but more likely one that is so tiny, maybe a part in a million, that you'd never see it in real life.
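
    A minimal sketch of the shortcut described above, as I read it: spread of the group means, divided by the average within-group spread, times the square root of n. The name `shortcut_statistic` and the toy data are assumptions for illustration. Note that this gives a test statistic, not a p-value; you would still need a table or distribution to turn it into one. When the groups have similar spread, its square looks a lot like the one-way ANOVA F ratio.

```python
import math
import statistics


def shortcut_statistic(groups):
    """sd(group means) / mean(group sds) * sqrt(n), for equal-sized groups."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("all groups must have the same size")
    means = [statistics.mean(g) for g in groups]
    sds = [statistics.stdev(g) for g in groups]
    return statistics.stdev(means) / statistics.mean(sds) * math.sqrt(n)


# Toy data: the second group is shifted by a tiny 0.1 relative to the first.
base = [-1.0, 0.0, 1.0]
shifted = [x + 0.1 for x in base]

small = [base, shifted]              # n = 3 per group
large = [base * 100, shifted * 100]  # n = 300 per group, same tiny shift

print("n = 3:  ", round(shortcut_statistic(small), 3))
print("n = 300:", round(shortcut_statistic(large), 3))
```

    The shift between the groups is identical in both runs, but the statistic grows roughly with the square root of n. That is the point about 'significant' not meaning 'large'.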

    --
    OK a new size TV
    1. Re:p-value research is misleading almost always by myrdos2 · · Score: 1

      I had no idea it was so common to confuse the p-value with the magnitude of the effect being studied. I haven't seen anything like it in HCI.

    2. Re:p-value research is misleading almost always by Anonymous Coward · · Score: 0

      Are you actually Steve Wozniak?

    3. Re:p-value research is misleading almost always by Anonymous Coward · · Score: 0

      You were searching for the thrill of discovering an invariance.

  54. Re:What's the problem? by buchner.johannes · · Score: 1

    p-values are not probabilities. What people would like them to be are probabilities that one hypothesis is correct compared to another. But that is not what they are, and because people ignore that gap and misinterpret them, they have become such a problem; that's why they are being banned. Many experiments with acceptable p-values (p<0.05) are not reproducible.

    Actually the inventor of p-values never intended them as a test, only as a way to flag that something is perhaps worth further investigation.

    p-values tell you, if you collected data under the current model, how frequently you would get data more extreme than the data at hand. p<0.01 means that in only 1% of cases would you get such an "outlier". But it assumes that the model itself is correct. It varies the data!
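
    To make "it varies the data" concrete, here is a rough simulation sketch. The model (a fair coin), the observed count (60 heads in 100 flips) and the function name are all assumptions chosen for illustration. The model stays fixed; only the simulated data changes.

```python
import random


def simulated_p_value(observed_heads, n_flips, n_sims=10000, seed=0):
    """Fraction of fair-coin datasets at least as far from n/2 as the observed one."""
    rng = random.Random(seed)
    expected = n_flips / 2
    observed_dev = abs(observed_heads - expected)
    extreme = 0
    for _ in range(n_sims):
        # Generate one fake dataset under the assumed model (fair coin).
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if abs(heads - expected) >= observed_dev:
            extreme += 1
    return extreme / n_sims


p = simulated_p_value(observed_heads=60, n_flips=100)
print(f"simulated two-sided p-value: {p:.3f}")
```

    The result should land close to the exact binomial value of about 0.057. Nothing in the calculation says how probable the fair-coin model is; it was assumed true throughout.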

    Instead, what should be done is to compare one model versus another one, with the data we have. Bayes factors do that, and should be used and taught.

    The problem came to be because social sciences do not have proper, meaningful models, which can be compared. So they have resorted to techniques that do not require specifying models (or alternatives) rigorously. In the physical sciences, you can precisely write a model for a planetary system with 2 planets and one with 3 planets, and the Bayes factor will be meaningful.
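
    A toy sketch of a Bayes factor, under assumptions that are mine and not from any particular study: H0 says a coin is exactly fair, H1 says its bias is unknown with a uniform prior on [0, 1]. The function name is hypothetical.

```python
from math import comb


def bayes_factor_fair_vs_unknown(heads, flips):
    """Bayes factor of H0 (p = 0.5) over H1 (p uniform on [0, 1])."""
    likelihood_h0 = comb(flips, heads) * 0.5 ** flips
    # Integrating the binomial likelihood over a uniform prior on p
    # gives 1 / (flips + 1) for every possible head count.
    marginal_h1 = 1.0 / (flips + 1)
    return likelihood_h0 / marginal_h1


bf_60 = bayes_factor_fair_vs_unknown(60, 100)
bf_50 = bayes_factor_fair_vs_unknown(50, 100)
print(f"60 heads in 100: BF(H0 over H1) = {bf_60:.2f}")
print(f"50 heads in 100: BF(H0 over H1) = {bf_50:.2f}")
```

    For 60 heads in 100 the factor comes out near 1.1, meaning the data barely distinguish the two models, even though the same data give a two-sided p-value just above 0.05. The answer depends on the alternative you write down, which is exactly why you need a model worth comparing.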

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  55. Re:What's the problem? by Anonymous Coward · · Score: 0

    Undergraduates in medicine, biology, chemistry, physics, et al. are not.

    Depends on the school. The schools I went to for undergrad and grads programs, and the three I've worked at since, all required statistics for science majors. Some of them required only a single semester, others required a full year. My undergrad only required one semester (covering calculus based statistics, e.g. mle evaluation for distributions), and the second optional semester was ~40% math majors and ~40% physics majors.

  56. Re:What's the problem? by Anonymous Coward · · Score: 0

    "p-values are just a probability of something."

    The key is to make sure that "something" isn't totally irrelevant. For example, the probability that two groups of people/animals/cells are from exactly the same hypothetical infinite population is not helpful. If you teach people they should calculate such a probability, they will attribute magical powers to it in an attempt to make sense of why they are doing it. If not you get this:

    "We are quite in danger of sending highly trained and highly intelligent young men out into the world with tables of erroneous numbers under their arms, and with a dense fog in the place where their brains ought to be. In this century, of course, they will be working on guided missiles and advising the medical profession on the control of disease, and there is no limit to the extent to which they could impede every sort of national effort."

    Fisher, R A (1958). "The Nature of Probability". Centennial Review 2: 261–274.

    And yes, I know it is partly Fisher's fault.

  57. Re: What's the problem? by Anonymous Coward · · Score: 0

    You obviously didn't read about the criticism and numerous problems associated with that measurement technique.

  58. Re:What's the problem? by Anonymous Coward · · Score: 1

    Doctors are the opposite of scientists. They use induction (or rather abduction) to make their clinical decisions. Whereas scientists principally use deduction and the scientific method (i.e. experimentation). Legal reasoning in the context of evidence is also based on abductive logic. Basically, doctors and lawyers typically draw conclusions from circumstantial evidence. Whereas scientists formulate theories partly based on circumstantial evidence, but only draw conclusions from experimentation and deduction of the results.

    These differences are important to comprehend if you want to understand how and why doctors behave the way they do. Some fields, like nutrition science, are less scientific than they could be specifically because they apply Doctor Logic (tm) too heavily. Not that there's anything wrong with Doctor Logic (tm), as long as it's applied in the correct context. You obviously cannot experiment on a human being in order to address his particular ailment, and there are very few symptoms or set of symptoms which conclusively identify a specific ailment. And in the context of a single case, induction and abduction are arguably less susceptible to biases because the particularities of the case are more highly correlated with the underlying effect. Whereas in the context of huge samples induction can lead you to a multitude of different conclusions because of the cornucopia of circumstantial evidence.

    Anyhow, people who perform medical research are rarely practicing medical doctors. They usually have a Ph.D in a hard science, like chemistry or virology.

  59. Re:What's the problem? by pseudorand · · Score: 1

    Your false assumption is that doctors, chemists and physicists get things right with any greater frequency. It's not that social scientists are misusing statistics but that a large number of scientists in most disciplines simply do a poor job of quantifying things. It's a little more obvious when it happens in social science, but accurate measurement is hard or often impossible, so bad proxy measures are a pervasive feature of most scientific disciplines. That's one of many reasons why most "experts" usually get it wrong.

  60. Re: What's the problem? by retchdog · · Score: 3

    yes, i am.

    true randomization allows you to control for everything (intuitively: since it's randomized, there is no way for you to introduce bias), at the cost of increased variance. however, you can make up for increased variance by increasing the sample size, which is what they did here. i forget the exact numbers, but they sent out hundreds of letters.

    far from what you assert, randomization is fundamental to experimental control, and randomness is quite easily generated in a controlled manner. here's a general hint for you and everyone else: don't say things like "randomness cannot be controlled because then it wouldn't be 'true' randomness". it just makes you seem like an idiot.

    --
    "They were pure niggers." – Noam Chomsky
  61. Re:Social Science != Science by retchdog · · Score: 1

    yes, you are correct: social psychology done rigorously becomes economics. as for the rest, however...

    --
    "They were pure niggers." – Noam Chomsky
  62. Re:What's the problem? by robiso22 · · Score: 0

    +1

  63. Re:What's the problem? by Anonymous Coward · · Score: 0

    The useful interpretation of the "group comparison" p-value was not figured out until 2013:
    http://arxiv.org/abs/1311.0081

  64. Re:Social Science != Science by retchdog · · Score: 1

    i am a statistician and i've worked closely with a sociologist (one of the few who uses math correctly, if a bit pedantically). you are correct, it is not intrinsically impossible to do sociology correctly. however, the mathematical literacy standards for the field are woefully lacking even in the ivy league.

    this song by Tom Lehrer holds true today, just replace "sigma and chi-square" by "social network analysis".

    --
    "They were pure niggers." – Noam Chomsky
  65. A metastudy of randomly selected metastudies by Anonymous Coward · · Score: 0

    indicates that authors incorrectly measure p-values to their study results 86.5% of the time (P < 0.001).

  66. Re:What's the problem? by Anonymous Coward · · Score: 0

    Maybe you and the mods just didn't major in the sciences, or maybe you just didn't go to that great of a school. In my experience statistics is a required course for the sciences, and statistics is needed for the USMLE exam for those going into medicine... so maybe you need to take your own advice and rethink things instead of using an uninformed world view.

  67. Who needs real evidence anyway? by Anonymous Coward · · Score: 0

    Modern social psychology is notorious for rejecting objective evidence, since it can often uncover facts about human nature which society (and many social psychologists) don't like. Stan Milgram's experiments come to mind.

    On the other hand, get rid of p-values and other forms of objective verification and you can make up anything you want to. You can come up with any amount of airy-fairy so-called 'evidence' to support your pet theory. Get rid of that annoying inconvenience of 'logic'. These days, it's all inference and innuendo, especially since the "critical" / "discursive" crowd have gotten a hold.

  68. Re:What's the problem? by Anonymous Coward · · Score: 0

    P-value certainly is a probability - it's the probability that you'd see data at least as extreme as the data you saw, if the null hypothesis is correct.

  69. Re:What's the problem? by plopez · · Score: 1

    You misspelled "English".

    --
    putting the 'B' in LGBTQ+
  70. Re: What's the problem? by plopez · · Score: 1

    Citations please.

    --
    putting the 'B' in LGBTQ+
  71. Re: What's the problem? by plopez · · Score: 1

    "Randomness is not compatible with experimental control."

    You have no clue about how to set up an experiment.

    --
    putting the 'B' in LGBTQ+
  72. So does this prove that Tom Cruise, John Travolta by blang · · Score: 1

    and the other crazies were right all along, that psychiatry is not a real science?
    Or does it just prove that the general understanding of math and statistics (except among mathematicians) are fields that are in free fall, and that a few years from now, college graduates won't even be able to recite the multiplication table up to 10?

    --
    -- Another senseless waste of fine bytes.
  73. Re:What's the problem? by plopez · · Score: 1

    Psychiatry is a medical profession where the practitioners go to medical school and then train in the profession as per other medical professions. Psychology is not medicine. Psychologists study human emotions, thought, mental illnesses and disorders (overlapping Psychiatrists) but cannot prescribe unless they also train as a doctor or a Psych nurse. Psychologists do more counseling and group dynamics. Psychiatrists are more focused on drug treatments, but often work in tandem with Psychologists.

    Psychological training is based on both medical and Psychological research.

    --
    putting the 'B' in LGBTQ+
  74. Re:What's the problem? by ceoyoyo · · Score: 2

    Also "grammar."

  75. Re:What's the problem? by ceoyoyo · · Score: 1

    There really aren't any good ways to measure those other effects. If you knew how your experiment was biased, you'd try and fix it.

    Criticisms of p-values usually fall into two groups. Some people believe that p-values are bad because some people interpret them as the false positive rate. Personally, I think that's a problem with some people, and not p-values. The other criticism, which is particularly prevalent in social sciences, epidemiology and some of the squishier medical-type areas, is that if you get a non-significant p-value you discard potentially useful results. The usual proposal (which is probably the situation in this case) is to use confidence intervals. That way you can see all the area where your confidence interval is not overlapping zero! I have two objections to that. First, CIs are simply calculated from p-values and vice versa - they're really the same thing presented differently. Second, the reason you discard your result (or save it for a meta-analysis) if you get an insignificant p-value is because your data has been ruled insufficient evidence. Looking at CIs and marvelling at all the potentially meaningful area between them is just softening the p < 0.05 rule of thumb. Incidentally, the false positive rate people suggest doing the opposite - using p < 0.01 or 0.001 as the threshold for significance.

  76. Re:What's the problem? by narcc · · Score: 1

    Whereas scientists principally use deduction

    To all autodidacts: Imagine if YOU were to make a statement this absurd, without even a hint of self doubt. Worse, what if this is the kind of thing you actually believe as a result of your online "learning" adventures?

    This is why a formal education is important. On your own, you could very well end up like the AC above -- so deeply misinformed that there's little hope for recovery.

  77. Re:Past APA president Kimble turns over in his gra by Jane+Q.+Public · · Score: 1

    I suspect that book is still foundational in most University advertising/marketing programs.

    I think historically, a more influential book has been Darrell Huff's "How To Lie With Statistics", the second book in this list.

    It was originally written in 1954. And while less rigorous, it is an entertaining read and probably gets its point across to a much wider audience.

    I know for a fact that Huff's book is still used as a text in college statistics courses... but probably only the lower-level classes.

  78. Re:Social Science is NOT Science by Murdoch5 · · Score: 1

    Well I happen to have several friends studying psychology right now and I can tell you that compared to real science, what they do is mostly hogwash. Your initial analogy was correct, that I'll give you credit for, but that's about all I can give you credit for.

    My cousin is studying hoarding disorder and how to overcome it. She's been given money, A LOT of money to study this. None of the people in her study have any kind of chemical imbalance, which as I already stated would completely change the landscape. So far her research at a PhD level has determined that people with hoarding disorder attach false emotional states to objects, which means in other words, they can't rationalize emotion. That's not science, that's a lack of emotional control, they need to mature and stop playing cry baby.

    My Sister studied depression, again in people with NO chemical imbalance. People felt sad because they didn't want to grow up and face the world for what it is! Again, not science. More proof people are cry babies.

    I could keep going but it's all the same story.

  79. Re: What's the problem? by QRDeNameland · · Score: 1
    --
    Momentarily, the need for the construction of new light will no longer exist.
  80. Re:What's the problem? by TechyImmigrant · · Score: 1

    P-values certainly are probabilities. You just argued they aren't probabilities, but they are probabilities of this other thing. You contradicted yourself. I was specifically vague when I called it 'something' because it changes with the type of test and there are many to choose from and I didn't want to write a whole book. That book has already been written by smarter people than I.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  81. Re:What's the problem? by monkeyzoo · · Score: 1

    Hey there. Again, we're generally on the same page here, and I agreed with your comment, and my counterpoint was directed not so much at you as at the general idea of the folks here with a dismissive view of what social science means. BTW, Interesting your comment about computer science. ;-)

    I realize now we may not even have been using the same definitions. I was thinking more like psychology (let's say a stress coping training study, for example) versus biology (let's say a cancer treatment study). So, it's funny to me now to realize you are perhaps calling the latter social science as well. Anyhow, in both studies you absolutely can set up valid control groups. In the cancer case, the control might not be "no treatment," but you compare your new treatment to the efficacy of existing conventional ones. In a non-health related area completely, cognitive psychology is filled with countless examples of measuring the effect of priming the brain with images or words associated with different categories of concepts upon reaction times, opinion formation, behavior, etc. That's just one example off the top of my head.

    Also, many studies are able to be performed on existing data sets without requiring an interventional experiment. Steven Levitt has received much acclaim over his career performing these types of analyses. For anyone who doubts that social science is a rigorous and fascinating field, they should read (or listen to) some of his work.

  82. Re:What's the problem? by TechyImmigrant · · Score: 1

    There really aren't any good ways to measure those other effects. If you knew how your experiment was biased, you'd try and fix it.

    Randomized sampling goes a long way, but only if you have a large enough population. This is one of the problems of social sciences. A randomized 10% subsample from 100 subjects ain't gonna cut it. A randomized subsample from 10,000,000 people isn't going to get funded.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  83. Re:What's the problem? by TechyImmigrant · · Score: 1

    I lived through my wife's PhD in education. I helped with the statistics. It was mind curdling stuff. But her thesis had rigor. S-Plus, Excel and everything else doesn't have MANOVAs. R does. We used R.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  84. Re:What's the problem? by Anonymous Coward · · Score: 0

    I have a doctorate, and learned these particular distinctions under a professor emeritus who teaches in the statistics and engineering department of an esteemed tier 1 American university, and who is well regarded in the fields of risk analysis and systems engineering. He's spent the latter part of his career analyzing the systems and methods of knowledge acquisition in science, medicine, law, and other fields. In fact, he was one of the early critics of over-reliance on P-values in scientific literature, and has provided concrete experimental data proving problems with how modern scientific research is performed. When you read papers studying reproducibility of research, or comparing and contrasting the worth of different statistical methods--i.e. the frequentist versus bayesian debate--he may very well be the author.

    But thank you very much. I do consider myself an autodidact. It's not like self-directed learning is somehow mutually exclusive relative to working through formal academic programs.

    I don't know what your qualifications or experience are, but you should kind of be ashamed of yourself. The notion that science principally uses deduction to draw _conclusions_ is kind of the _definition_ of the scientific method. Generate a hypothesis: if A, then B. Test the hypothesis. Publish your results. If the results affirm "if A, then B", and especially if the results are reproduced, it's subsequently applied as a premise in further work. That's deduction. Without deduction, every experiment would necessarily need to reproduce every previous experiment. Furthermore, invalidating any premise in the chain can devastate confidence in subsequent work.

    When diagnosing an ailment as a doctor, diagnosing a problem in a mechanical or software system, or proving guilt in a trial, deduction is not the _principal_ methodology. Your evidence is limited. You often can't test a hypothesis by running a test--either because it would be too expensive, or in the case of a criminal trial it's simply not possible at all. So your principal tool is induction--inferring a conclusion from incomplete and inconclusive data.

  85. Re:What's the problem? by monkeyzoo · · Score: 1

    Your false assumption is that doctors, chemists and physicists get things right with any greater frequency.

    Did you mean to reply to me? It's a bit surreal to see you seemingly support what I wrote but tell me about a false assumption I made. In case you were speaking to me, I would like to point out that I made no such assumption. I argued that social scientists were as rigorous as any others but made no claims about either group's infallibility in absolute terms.

  86. Re:What's the problem? by pjt33 · · Score: 1

    I agree with you. Yet no need for the quotes around social 'scientists.' Psychologists, socialists, etc. employ the same experimental designs and mathematical techniques in experiments as doctors or others performing drug efficacy or medical outcome experiments, for example.

    That sounds like an excellent reason to use scare quotes around "scientists". When only 25% of published biomedical results can be reproduced, that field needs to do work to justify the claim to be science as well.

  87. Re: What's the problem? by TechyImmigrant · · Score: 1

    Are you sure that the term "well-controlled study" applies, given how you repeatedly used the term "random" when describing this experiment?

    Randomness is not compatible with experimental control. Additionally, randomness itself cannot be controlled, because doing so would prevent it from being true randomness.

    Quick! Someone is wrong on the internet.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  88. Re: What's the problem? by epyT-R · · Score: 1

    SJWs do not apply it equally however..

  89. Re: What's the problem? by epyT-R · · Score: 1

    Except that those terms are subjective and, really, based on emotions.. Atoms are not.

  90. Re:What's the problem? by Anonymous Coward · · Score: 0

    socialists have built a 5th column in psychiatry so that they can label those who disagree with their political goals as mentally ill.

  91. Re:What's the problem? by epyT-R · · Score: 2

    Unfortunately academia was taken over by quacks a long time ago..

  92. Re: What's the problem? by Anonymous Coward · · Score: 0

    Not really a probability. If you use an exact approach, one figures out the possible outcomes for your statistic under your initial hypothesis and then determines the rank of the observed value of the statistic. If it's too low, then it's time for some abduction. The rest is just convenient approximations using p.d.f.'s. See Fisher's reanalysis of Darwin's pea data.
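
    One way to read the "exact approach" is as a randomization (permutation) test: enumerate every way the pooled values could have been split into two groups and see where the observed difference ranks. The sketch below assumes that reading; the function name and the six data points are made up.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Share of all regroupings whose mean difference is at least as large as observed."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        # Small tolerance so floating-point ties still count as "as extreme".
        if diff >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


p_exact = exact_permutation_p([1, 2, 3], [4, 5, 6])
print(f"{p_exact:.2f}")
```

    With three values per group there are only 20 possible splits, and just two of them are as lopsided as the observed one, so the rank-based answer is 2/20 with no distributional approximation involved. Full enumeration only works for small samples; the split count explodes quickly.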

  93. Re:What's the problem? by swillden · · Score: 1

    There really aren't any good ways to measure those other effects. If you knew how your experiment was biased, you'd try and fix it.

    Randomized sampling goes a long way, but only if you have a large enough population. This is one of the problems of social sciences. A randomized 10% subsample from 100 subjects ain't gonna cut it. A randomized subsample from 10,000,000 people isn't going to get funded.

    Why wouldn't a randomized subsample from 10M people get funded? The required sample size doesn't grow as the population does.
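
    A back-of-envelope sketch of that claim, using the textbook margin of error for a proportion with a finite population correction. The sample size of 1,000, the worst-case p = 0.5 and the function name are assumptions for illustration, and it presumes a genuinely random sample.

```python
import math


def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: matters only when the sample is a
    # sizeable fraction of the population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc


for population in (10_000, 10_000_000, 1_000_000_000):
    moe = margin_of_error(1_000, population)
    print(f"population {population:>13,}: margin of error {moe:.4f}")
```

    The margin barely moves once the population is much larger than the sample. What this does not capture is the cost of building a sampling frame for the whole population in the first place.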

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  94. Re:What's the problem? by TechyImmigrant · · Score: 1

    Because identifying the 10 million and sampling the 1 million will be expensive. Worse, that many people in the class may not exist. If your class is 'residents of Boring, Oregon', there may simply be too few of them to randomize away the confounders and drive the p-value down.

    Top tip. If you want to find something in the data, it helps if it sticks out above the noise floor like a sore thumb. If you're having to push the noise floor down with sample size to make something visible, the odds you got something else wrong go up in proportion.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  95. Re:So does this prove that Tom Cruise, John Travol by Anonymous Coward · · Score: 0

    I used to have one of those old fashioned uniplication tables but ordered a new multiplication table because it collects spilled drinks in its crevices rather than letting it drip on the floor. Much more sanitary. Why would I need to recite? I already cited amazon for one, are you saying they are usually defective?

  96. surprised nobody mentioned this by Anonymous Coward · · Score: 0

    The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives
    Stephen T. Ziliak and Deirdre N. McCloskey
    http://www.press.umich.edu/titleDetailDesc.do?id=186351
    The University of Michigan Press

    https://www.press.umich.edu/pdf/9780472070077-fm.pdf

    1. Re:surprised nobody mentioned this by Anonymous Coward · · Score: 0

      Meehl, Paul E. (1967). "Theory-Testing in Psychology and Physics: A Methodological Paradox". Philosophy of Science 34 (2): 103–115. doi:10.1086/288135.

  97. Re:So does this prove that Tom Cruise, John Travol by Anonymous Coward · · Score: 0

    Citing one table will get them to grade you as a D at most. You have to recite at least ten times to get the benefits of being in the A group. Even then it is up to chance since it is graded on a curve.

  98. Re:What's the problem? by Anonymous Coward · · Score: 0

    The notion that science principally uses deduction to draw _conclusions_ is kind of the _definition_ of the scientific method.

    I'm sorry, but the very definition of the scientific method is inductive reasoning. This can include deductive steps, but inevitably there is never a 100% proof about the universe because you are unable to observe all spaces at all times, especially the past. Your evidence is always limited, and any general conclusion you come to is subject to being wrong with future observations.

    And I am also someone who's gotten a PhD in the sciences... but thought this was something thoroughly covered in a basic philosophy of science course that many schools require for science majors.

  99. Re:Social Science is NOT Science by Anonymous Coward · · Score: 0

    That's not science, that's a lack of emotional control, they need to mature and stop playing cry baby.

    And exactly how many engineering and science endeavors have been in the name of laziness? If people just used protection, we wouldn't have to study STDs, etc. Whether or not something is science isn't why people find the results useful, but the process that is used to find results.

  100. Published articles 1/1000 wrong -- PLOS article by Jameson+Burt · · Score: 1

    Classical statistics mentions the significance level, alpha=0.05. It mentions beta -- (1-beta) is the power of the test to reject the null hypothesis when it is false. Classical statistics never mentions R, the background ratio of true to false relationships in a field. While R lies in the interval [0,infinity), you could think instead about the background probability of true relationships. PLOS had an article several years ago that showed the probability that a relationship touted as true in a published article really is true, a probability they called the Positive Predictive Value,
          PPV = 1 / [1 + alpha / ((1 - beta) * R)]
    The person designing an experiment seeks a large power, 1 - beta, so it is bounded away from 0 and at most 1, and this factor becomes irrelevant (remember, the article gets published). When R is much less than alpha; e.g., R=0.001 is less than 0.05, then PPV is about
            R / alpha
    or often
          R / 0.05
    The background proportion of true relationships R dominates over alpha and over beta in the probability the relationship is true PPV.

    You do a statistical test in a "field" of relationships where most of the relationships are wrong, otherwise any relationship stated has a good chance to be correct and the "field" is easy if not boring. Consider the search for some 30 genes that might cause a genetic disease out of 30,000 genes in a genome. Then R is 1 / 1000 and (about)
          PPV ≈ 1/(1 + 0.05/(1/1000)) = 1/51 ≈ 0.02
    That is, such published genetics articles tout relationships that are very unlikely (0.02) to be correct.
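
    The arithmetic above can be sketched in a few lines. The function name `ppv` and the extra values of R in the loop are illustrative assumptions; the formula is the one quoted in this comment, rearranged.

```python
def ppv(alpha, power, r):
    """PPV = 1 / (1 + alpha / (power * R)), written as power*R / (power*R + alpha)."""
    return power * r / (power * r + alpha)


# The gene-hunting example from the text: R = 1/1000, alpha = 0.05, power taken as 1.
gene_example = ppv(alpha=0.05, power=1.0, r=1 / 1000)
print(f"R = 1/1000: PPV = {gene_example:.4f}")

# How PPV moves as the background ratio R changes (alpha and power held fixed).
for r in (0.001, 0.01, 0.1, 1.0):
    print(f"R = {r:<5}: PPV = {ppv(0.05, 0.8, r):.3f}")
```

    Holding alpha and power fixed, PPV is driven almost entirely by R when R is small, which is the R / alpha approximation above.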

    The German pharmaceutical company Bayer called a large sample of published article authors, duplicated their procedures, yet found that 70 percent of the publications' touted results could not be confirmed (probably wrong). Many statistical tools will give fame -- hypothesis tests or even more so data mining tools -- these are often charlatans' tools.

  101. How much sympathy! by Anonymous Coward · · Score: 0

    So on a scale of 1 to freedom, how much sympathy do we have for the authors who got banned for using p-values?

  102. Re:What's the problem? by monkeyzoo · · Score: 1

    I lived through my wife's PhD in education. I helped with the statistics. It was mind curdling stuff. But her thesis had rigor. S-Plus, Excel and everything else doesn't have MANOVAs. R does. We used R.

    Right. So you know first hand how ignorant it is to say math and statistics have nothing to do with social science. :-)

  103. Re:Social Science is NOT Science by Murdoch5 · · Score: 1

    Just simplify it: most of psychology is immature people who don't want to take responsibility, complaining that they might have to take responsibility for their actions. For the real cases of people who have a chemical imbalance, it's about understanding how the brain forms attachment and reason.

  104. Re:Social Science is NOT Science by Anonymous Coward · · Score: 0

    Speaking of immaturity and projecting emotions to justify one's own feelings...

  105. Re: What's the problem? by twitnutttt · · Score: 1

    Except that those terms [racism, sexism, ...] are subjective and, really, based on emotions. Atoms are not.

    That the mind uses stereotypes to classify and categorize information is neither "subjective" nor an "emotion." This is what researchers actually study. Subsequently, inferences can perhaps be generalized that relate to the functioning of the subjective terms the previous poster used.

  106. Re:What's the problem? by TechyImmigrant · · Score: 1

    I lived through my wife's PhD in education. I helped with the statistics. It was mind curdling stuff. But her thesis had rigor. S-Plus, Excel and everything else doesn't have MANOVAs. R does. We used R.

    Right. So you know first hand how ignorant it is to say math and statistics have nothing to do with social science. :-)

    It wasn't me who said that.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  107. Re:What's the problem? by monkeyzoo · · Score: 1

    Right. So you know first hand how ignorant it is to say math and statistics have nothing to do with social science. :-)

    It wasn't me who said that.

    I know. I wasn't trying to imply you did. =)

  108. Re:What's the problem? by DickMardy · · Score: 1

    And it's a "submission".

  109. Re:What's the problem? by TechyImmigrant · · Score: 1

    Sorry. I had a bit of a whooshy moment.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  110. Re:What's the problem? by gzuckier · · Score: 1

    english grammer. Kelsey's cousin, who thinks he's e. e. cummings.

    --
    Star Trek transporters are just 3d printers.
  111. Re:Past APA president Kimble turns over in his gra by Anonymous Coward · · Score: 0

    I suspect that book is still foundational in most University advertising/marketing programs.

    I think historically, a more influential book has been Darrell Huff's "How To Lie With Statistics", the second book in this list. It was originally written in 1954. And while less rigorous, it is an entertaining read and probably gets its point across to a much wider audience. I know for a fact that Huff's book is still used as a text in college statistics courses... but probably only the lower-level classes.

    Not to be confused with the much more exciting book, "How to lie with Statisticians"

  112. Re: What's the problem? by gzuckier · · Score: 1

    "Racism", "sexism", "patriarchy" and related topics of study within the social "sciences" inherently can't be quantitatively analyzed in any meaningful way.

    Yeah, based on your multiyear immersion in the field, right? And those so-called climate scientists, I bet they didn't even include solar effects. And don't get me started on medical science, they're all a bunch of quacks, one year coffee is good for you one year it's bad for you.

    --
    Star Trek transporters are just 3d printers.
  113. Re: What's the problem? by gzuckier · · Score: 1

    You randomize your two populations, then you test to ensure that there are no significant differences between the two populations in what you are trying to control for. If there are, like one group is all males and the other is all females, then "the randomization failed". Which of course is guaranteed to happen 5% of the time for each factor, so if you have 20 factors.....
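
    To finish the thought about 20 factors, here is a rough sketch. It assumes the factors are independent and each is tested at the .05 level; the function name is made up for illustration.

```python
def prob_any_imbalance(num_factors: int, alpha: float = 0.05) -> float:
    """Chance that at least one of several independent balance checks
    comes out 'significant' purely by luck of the randomization."""
    return 1.0 - (1.0 - alpha) ** num_factors


for k in (1, 5, 20):
    print(f"{k:2d} factors: {prob_any_imbalance(k):.3f}")
```

    Under those assumptions, with 20 factors the chance of at least one "failed" randomization is roughly 64%, and the expected number of flagged factors is 20 * 0.05 = 1.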

    --
    Star Trek transporters are just 3d printers.
  114. Re: What's the problem? by gzuckier · · Score: 1

    How is it subjective that, given random applications or whatever, as in the previously described test, subject A reliably responds favorably to names like George Whittington Huxley III and unfavorably to names like D'shawn Mohammed Washington, whereas the majority of subjects respond equally to both? Statistically verifiable, and all that?

    --
    Star Trek transporters are just 3d printers.
  115. Re:What's the problem? by gzuckier · · Score: 1

    The other side of the problem is that a random sample of 10,000,000 people is going to find everything significantly different. That's from the inverse dependence of the standard error on the root of N. Given any nonzero difference between two samples, there will always be some value of N high enough that the standard error is therefore low enough that that difference will have a p value below .05, or as low as you want it to be.
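
    A small sketch of that effect, assuming a plain two-sample z-test with a known standard deviation and equal group sizes (the difference of 0.01 and sd of 1.0 are made-up numbers):

```python
import math


def two_sided_p(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test with known sd."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # P(|Z| > z) for a standard normal Z
    return math.erfc(z / math.sqrt(2.0))


# Same tiny difference, ever larger samples.
for n in (100, 10_000, 1_000_000, 10_000_000):
    print(f"n = {n:>10,}: p = {two_sided_p(0.01, 1.0, n):.3g}")
```

    The difference between groups never changes; only the sample size does, and the p value falls with it.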

    --
    Star Trek transporters are just 3d printers.
  116. Re:What's the problem? by gzuckier · · Score: 1

    Because identifying the 10 million and sampling the 1 million will be expensive. Worse, that many people in the class may not exist. If your class is 'residents of Boring, Oregon', there may simply be too few of them to randomize away the confounders and drive the p-value down.

    Top tip. If you want to find something in the data, it helps if it sticks out above the noise floor like a sore thumb. If you're having to push the noise floor down with sample size to make something visible, the odds you got something else wrong go up in proportion.

    Oh you really mean a sample from a population of 10,000,000? I thought you meant a sample of 10,000,000 but were a bit imprecise in wording. You don't need a sample of 1,000,000 for a population of 10,000,000, a sample of 100 will do just fine if you are sure it's representative and randomly sampled. And if it's not representative and randomly sampled, a sample of 1,000,000 won't give you a valid answer either. That's why we can do clinical trials on a few hundred people, at most, and decide that a drug is in all reasonable probability efficacious and safe enough to be marketed to a population of 600,000,000.
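
    One way to see why population size barely matters is the margin of error for an estimated proportion with the finite population correction. This is only a sketch: it assumes simple random sampling, a worst-case proportion of 0.5 and a 95% normal approximation, and whether a given margin is "just fine" depends on the size of the effect you are after.

```python
import math


def margin_of_error(n: int, population: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion,
    including the finite population correction."""
    standard_error = math.sqrt(p * (1.0 - p) / n)
    correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * correction


for population in (10_000_000, 600_000_000):
    for n in (100, 1_000, 1_000_000):
        moe = margin_of_error(n, population)
        print(f"population {population:>11,}  sample {n:>9,}  +/- {moe:.4f}")
```

    The margin is driven almost entirely by the sample size, not by how big the population is.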

    --
    Star Trek transporters are just 3d printers.
  117. Re:What's the problem? by gzuckier · · Score: 1

    Because identifying the 10 million and sampling the 1 million will be expensive. Worse, that many people in the class may not exist. If your class is 'residents of Boring, Oregon', there may simply be too few of them to randomize away the confounders and drive the p-value down.

    Top tip. If you want to find something in the data, it helps if it sticks out above the noise floor like a sore thumb. If you're having to push the noise floor down with sample size to make something visible, the odds you got something else wrong go up in proportion.

    But you are right though. If the effect is invisible until teased out statistically, it's probably not real, or at best not big enough to be interesting, and at the very best nobody will believe it anyway. Especially when the raw effect goes one way, but after statistically clearing out the debris, it suddenly changes polarity. Statistics is best used as a minor tool to get a more precise estimate of an effect which is clear before you start the statistical work.
    But people publish that tortured out stuff anyway.
    To be fair, even if there's substantial doubt about a result, if it's important enough it's worth publishing just to see if people can either repeat it, refute it, or explain what the heck happened. Cold fusion being a perfect example.

    --
    Star Trek transporters are just 3d printers.
  118. Re:What's the problem? by gzuckier · · Score: 1

    Actually, p-values are about CORRELATION. Maybe *you* aren't well-positioned to be denigrating others as not statistical experts.

    I may be responding to a troll here, but, no, the GP is correct. P-values are about probability. They're often used in the context of evaluating a correlation, but they needn't be. Specifically, p-values specify the probability that the observed statistical result (which may be a correlation) could be a result of random selection of a particularly bad sample. Good sampling techniques can't eliminate the possibility that your random sample just happens to be non-representative, and the p value measures the probability that this has happened. A p value of 0.05 means that there's a 5% chance that your results are bogus in this particular way.

    The problem with p values is that they only describe one way that the experiment could have gone wrong, but people interpret them to mean overall confidence -- or, even worse -- significance of the result, when they really only describe confidence that the sample wasn't biased due to bad luck in random sampling. It could have been biased because the sampling methodology wasn't good. It could have been meaningless because it finds an effect which is real, but negligibly small. It could be meaningless because the experiment was just badly constructed and didn't measure what it thought it was measuring. There could be lots and lots of other problems.

    There's nothing inherently wrong with p values, but people tend to believe they mean far more than they do.

    Yeah. p-values are much more sensitive to having a small standard deviation than they are to having a large difference between the two samples tested. So you can have a test where the differences between the two samples ranged from 3-4 be significant, while an identical test where the differences ranged from 10-30 is not significant. Thus the dependence on big sample size I discussed elsewhere.
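
    A sketch of the same point with made-up numbers, using Welch's t statistic (difference in means divided by its standard error). The data sets and names are hypothetical; for samples this small the two-sided .05 cutoff is somewhere above 2.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic: difference in means over its standard error."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / standard_error


# Small difference (0.5), very little scatter within each group.
tight_a = [10.0, 10.1, 9.9, 10.0, 10.1]
tight_b = [10.5, 10.6, 10.4, 10.5, 10.6]

# Large difference (22), lots of scatter within each group.
noisy_a = [10, 40, 5, 60, 25]
noisy_b = [30, 70, 20, 90, 40]

print(f"tight groups: t = {welch_t(tight_a, tight_b):.2f}")
print(f"noisy groups: t = {welch_t(noisy_a, noisy_b):.2f}")
```

    The tight groups give a far larger t than the noisy ones even though their means differ by much less.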

    --
    Star Trek transporters are just 3d printers.
  119. Re:What's the problem? by gzuckier · · Score: 1

    Well, the whole debate circles around the fact that there is a missing piece of information, no matter how you try to shove the wrinkle in the carpet around, it has to show up somewhere. In this case, they're saying that the p-value is reflecting the probability that the null hypothesis is correct when the results obtained say it is incorrect, and what is missing is the probability that the test hypothesis is correct.
    The basic forest comprising all these trees is Type I errors and Type II errors, i.e. incorrectly rejecting the null hypothesis (false positive), vs incorrectly not rejecting the null hypothesis (false negative). For any given experiment and result, whatever the noise and error are, the Type I and Type II errors interact; if you want to avoid false positives when analyzing that dataset, you specify a small p-value, but you increase the chances of false negatives, and vice versa. Mostly, we've decided to minimize false positives, so we go with p=.05. If you were able to precisely measure the rate of one of these types of errors you could precisely calculate the other; but you can't, so you have to just push your uncertainty over to where you figure it will do the least damage.
    You can put a floor on the Type II error rate estimate by specifying the power of the experiment; i.e., the more tests you make, or the higher the number in your sample, the smaller the effect you can find. For instance, if you're trying to prove a drug is not harmful, intuitively a test of 5 people isn't going to be anywhere near as conclusive as a test of 1,000 people; statistically/mathematically/calculatedly, that's because in that case, the null hypothesis is that the drug is not harmful, and the p-value is testing for Type I errors; but what you're really looking for is Type II errors, i.e. you want to ensure that there isn't a nonzero rate of harm that is too low to show up in your little sample size. You can't get that the way you get the p-value, though; the best you can do is calculate the statistical power of the experiment, i.e. that with this sample size, if the rate of harmful complications was greater than 1 in 100 or whatever, we would see it with 95% probability. So, when you include this calculation in with your stats, you get the best set of numbers you can; the p-value threshold of .05 says that if the drug is really not harmful, the chance of getting a result that says it is harmful is less than 5%, which of course is pretty much useless information here; but the power result also tells you that if the drug is harmful at a rate of XXXX%, we would have a 95% chance of seeing it, which is more useful, but not as precise numerically.
    Amazingly, that wasn't even considered by the FDA for years when evaluating drugs for safety, so manufacturers were free to test drugs on tiny populations and say with all honesty that they didn't see any problems. It wasn't until later that it occurred to somebody that they really needed a properly powered trial to be safe.
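
    A back-of-the-envelope sketch of the 5 versus 1,000 people example. It uses a deliberately simplified notion of power (an assumption, not how a real trial is powered): harm counts as "seen" if at least one subject has the complication, and subjects are independent.

```python
import math


def prob_see_harm(n_subjects: int, harm_rate: float) -> float:
    """Chance that at least one of n independent subjects shows the harm."""
    return 1.0 - (1.0 - harm_rate) ** n_subjects


def subjects_needed(harm_rate: float, power: float = 0.95) -> int:
    """Smallest n for which prob_see_harm(n, harm_rate) reaches the power."""
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - harm_rate))


rate = 1 / 100  # the "1 in 100" complication rate from the comment
for n in (5, 100, 1000):
    print(f"n = {n:4d}: chance of seeing any harm = {prob_see_harm(n, rate):.4f}")
print("subjects needed for 95%:", subjects_needed(rate))
```

    Under these assumptions a 5-person test would almost always come back clean even for a drug that harms 1 in 100 people.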

    --
    Star Trek transporters are just 3d printers.
  120. Re:What's the problem? by gzuckier · · Score: 1

    P-values certainly are probabilities. You just argued they aren't probabilities, but they are probabilities of this other thing. You contradicted yourself. I was specifically vague when I called it 'something' because it changes with the type of test and there are many to choose from and I didn't want to write a whole book. That book has already been written by smarter people than I.

    Right,
    Basically, but vaguely, the experimenter compares two sets of numbers, and calculates the average difference between numbers in the same group (hopefully, getting the same results in each group) and compares that to the difference in the average between the two groups, and wants to know; given this difference between the two groups and this difference between those in the same group, what is the probability that in reality, if there were no errors or noise, there would be a real actual difference between groups (and what is the reasonable range that the actual difference might be). So far, pretty clear, right?
    But what the p-value tells you is the opposite way around; i.e., if there really is no difference between the two groups except for errors and noise, then the probability that I'd see the difference between groups and the difference between members of the same group that I'm seeing in my experiment, is .05 (or whatever your p-value is). And, hard as it is to believe given so many of the words being the same, you can't get the answer you want from the answer the p-value gives you.
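
    One concrete way to see what the p-value answers is an exact permutation test, sketched below with made-up measurements. It assumes "no real difference" means the group labels are interchangeable, and it counts how many relabelings give a gap between group means at least as big as the observed one.

```python
from fractions import Fraction
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Answers: if the labels were arbitrary (no real difference), what
    fraction of relabelings shows a gap at least as large as observed?
    Expects integer data so the arithmetic stays exact.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def abs_mean_diff(indices_a):
        chosen = set(indices_a)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        return abs(Fraction(sum(a), len(a)) - Fraction(sum(b), len(b)))

    observed = abs_mean_diff(range(n_a))
    splits = list(combinations(range(len(pooled)), n_a))
    extreme = sum(1 for split in splits if abs_mean_diff(split) >= observed)
    return Fraction(extreme, len(splits))


# Hypothetical measurements for two groups of four subjects.
p = permutation_p_value([12, 15, 14, 16], [9, 10, 11, 8])
print(f"p = {p} (about {float(p):.4f})")
```

    The number that comes out is a statement about the data given no difference, not about the chance that a difference is real.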

    --
    Star Trek transporters are just 3d printers.
  121. Re:What's the problem? by TechyImmigrant · · Score: 1

    Comparing means is one kind of test. There are many others.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  122. Re: What's the problem? by retchdog · · Score: 1

    maybe i'm misunderstanding you, but why would you test to "ensure that"? the randomization guarantees it (assuming that it is done correctly, of course); poking around after-the-fact can only undo the blind, which is why good experiments take some measures to make it difficult.

    and why is it "guaranteed to happen 5% of the time"? is that independent of sample size and distribution of the factor? quite remarkable indeed!

    you sound quite confused about certain things.

    --
    "They were pure niggers." – Noam Chomsky
  123. Re: What's the problem? by gzuckier · · Score: 1

    maybe i'm misunderstanding you, but why would you test to "ensure that"? the randomization guarantees it (assuming that it is done correctly, of course); poking around after-the-fact can only undo the blind, which is why good experiments take some measures to make it difficult.

    and why is it "guaranteed to happen 5% of the time"? is that independent of sample size and distribution of the factor? quite remarkable indeed!

    you sound quite confused about certain things.

    The whole point of the concern of the editors of the journal, as described in the article, is what the p-value actually represents: which is, the chances that the test in question, using two randomized samples from the same population, will demonstrate a difference at least as large as the one in question. I.e., 5% of the time two randomized samples from the same population will show a difference that is significant at p=.05 in any given test. You do the randomization, you test all the independent variables you are controlling for/adjusting for/interested in/worried about; if any are significantly different at .05 or whatever you preferentially redo the randomization; if not you have to rely on your statistical adjustment to take care of it, but you're safer if you can redo the randomization.
    If you're not doing this, you should tell people in your publications, because it's something they should know. Apparently, you believe that if you flip a coin twice, it's guaranteed to produce one head and one tail. This is not a good assumption to go into statistical analysis with.
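
    The 5% figure can be checked by simulation. This is a rough sketch under assumed conditions: one baseline covariate drawn from a standard normal, 50 subjects per arm, and a z-test with known variance as the balance check. All names and numbers are made up for illustration.

```python
import math
import random


def imbalance_rate(num_trials: int = 2000, n_per_group: int = 50, seed: int = 0) -> float:
    """Fraction of correctly randomized trials in which a baseline covariate
    still differs 'significantly' (|z| > 1.96) between the two arms."""
    rng = random.Random(seed)
    standard_error = math.sqrt(2.0 / n_per_group)  # covariate sd is 1 by construction
    flagged = 0
    for _ in range(num_trials):
        arm_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        arm_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (sum(arm_a) / n_per_group - sum(arm_b) / n_per_group) / standard_error
        if abs(z) > 1.96:
            flagged += 1
    return flagged / num_trials


print(f"share of randomizations flagged as unbalanced: {imbalance_rate():.3f}")
```

    Both arms come from the same population every time, so every flag here is pure luck of the draw.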

    --
    Star Trek transporters are just 3d printers.
  124. Re: What's the problem? by retchdog · · Score: 1

    ah, you're just confusing randomization as a means of controlling nuisance factors, with the formal significance level of the result about the factor of interest. you are confused; these are different concepts. to wit, randomization certainly does not involve testing "all the independent variables". trying to randomize this way is a waste of time at best, and would probably fuck up your experiment.

    it is worth recalling, at times like this, that the last person to speak to me with such a combination of ignorance and certitude was found dead three days later from profuse rectal bleeding.

    --
    "They were pure niggers." – Noam Chomsky
  125. Re: What's the problem? by gzuckier · · Score: 1

    ah, you're just confusing randomization as a means of controlling nuisance factors, with the formal significance level of the result about the factor of interest. you are confused; these are different concepts. to wit, randomization certainly does not involve testing "all the independent variables". trying to randomize this way is a waste of time at best, and would probably fuck up your experiment.

    it is worth recalling, at times like this, that the last person to speak to me with such a combination of ignorance and certitude was found dead three days later from profuse rectal bleeding.

    Not sure what you're saying, but me and my rectum are outa here.

    --
    Star Trek transporters are just 3d printers.