Slashdot Mirror


The Fallacy of Hard Tests

Al Feldzamen writes in with a blog post on the fallacious math behind many specialist examinations. "'The test was very hard,' the medical specialist said. 'Only 35 percent passed.' 'How did they grade it?' I asked. 'Multiple choice,' he said. 'They count the number right.' As a former mathematician, I immediately knew the test results were meaningless. It was typical of the very hard test, like bar exams or medical license exams, where very often the well-qualified and knowledgeable fail the exam. But that's because the exam itself is a fraud."

19 of 404 comments

  1. Worthless by kmac06 · · Score: 4, Insightful

    What a worthless post. He gave one situation where guessing is more important than knowledge, but didn't at all address the specifics of the tests he was talking about. A typical vapid blog that for some reason gets posted to /.

    1. Re:Worthless by Tatarize · · Score: 4, Insightful

      No. Guessing is simply the 25% bonus if you're one in four. The chance of passing the test is nearly null. You need to be 100 times smarter than that idiot who can only answer one question. Also, 2X as smart == 2X right answers? What the hell? My IQ is 140, find me somebody with an IQ of 70 and give us a test on anything. Sure as hell I'll get more than just twice as many right.

      1 for right answer.
      -1/4 for wrong answer.
      0 for no answer.

      Done.

      --

      It is no longer uncommon to be uncommon.
    2. Re:Worthless by Mr2001 · · Score: 4, Insightful

      1 for right answer.
      -1/4 for wrong answer.
      0 for no answer.

      ITYM -1/3 for each wrong answer. That way, the expected value of guessing is zero: on average, out of four guesses, you'll gain a point for one of them and lose it for the other three.
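
      A quick way to check the -1/3 versus -1/4 arithmetic, as a minimal sketch. The function name is made up for illustration, and it assumes a blind guess among four equally likely choices:

```python
from fractions import Fraction

def expected_guess_value(num_choices, wrong_penalty):
    """Expected points from one blind guess among num_choices options.

    A right answer scores +1; a wrong answer costs wrong_penalty points.
    """
    p_right = Fraction(1, num_choices)
    p_wrong = 1 - p_right
    return p_right * 1 - p_wrong * wrong_penalty

print(expected_guess_value(4, Fraction(1, 4)))  # 1/16, guessing still pays a little
print(expected_guess_value(4, Fraction(1, 3)))  # 0, guessing is neutral
```

      With a -1/4 penalty a blind guess is still worth 1/16 of a point on average; only -1/3 brings it to zero.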
      --
      Visual IRC: Fast. Powerful. Free.
    3. Re:Worthless by Derekloffin · · Score: 5, Insightful

      Yeah, this is a pretty bloody poor analysis. If I know 2X as much (even assuming we could quantify it that easily), that doesn't automatically mean I get 2X the score on a test, and it certainly doesn't mean my guesses are just as bad as those of the guy with 1/2 my knowledge. It depends heavily on what my knowledge is and what is covered by the test. The potential is even there for the guy with 1/2 my knowledge to beat me just simply by getting lucky on what the test covers.

      Just for an example, say we were doing a geography test on the states of the United States and their associated capitals. I know 1/2 of them, and another guy knows 1/4 of them. Now, each question is a simple 4-part multiple-choice question: State X, what is its capital? A, B, C, or D. The thing is, even for those I don't know, 1/2 the potential answers (on average) I can eliminate as I know them, while the other guy, on average, can only eliminate 1/4 of them. So, I would get 50% on knowing the answers, and about 1/2 of the remaining on guesses. The other guy would get 1/4 on knowing them, and only 1/3 of the rest on guesses. And that's just the basic mathematical flaw in his reasoning.
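
      Plugging those numbers in, as a rough sketch. The guess rates of 1/2 and 1/3 are the estimates from the paragraph above, taken as given rather than derived, and the function name is only illustrative:

```python
from fractions import Fraction

def expected_score(known, guess_rate):
    """Fraction of questions right: the known ones plus lucky guesses on the rest."""
    return known + (1 - known) * guess_rate

# Knows 1/2 of the capitals, guesses right on about 1/2 of the rest.
me = expected_score(Fraction(1, 2), Fraction(1, 2))
# Knows 1/4 of the capitals, guesses right on about 1/3 of the rest.
other = expected_score(Fraction(1, 4), Fraction(1, 3))

print(me, other)  # 3/4 1/2
```

      So under these estimates the scores come out around 75% versus 50%, with partial knowledge improving the guesses as well as the known answers.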

    4. Re:Worthless by ultranova · · Score: 4, Insightful

      For a medical specialist wouldn't:

      +1 for right (patient lives)
      0 for no answer (she knows she doesn't know and maybe consults with a colleague),
      -1e38 for wrong (patient dies)

      be more appropriate weightings?

      No. Everyone makes mistakes sometimes; a doctor who concentrates all his efforts into avoiding them will end up sending all his patients to see one expert or another. Not only does this overload the experts (who are supposed to see only a tiny subset of the patients, after all), but it also means it takes longer to get diagnosed. And in the long run, it means that only risk-takers will become doctors in the first place, which is not good for anyone.

      The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.

      So no, your weightings are not appropriate. You can't assign virtually infinite negative weight to failure and expect anyone to try - at least anyone you want performing medicine.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    5. Re:Worthless by aurispector · · Score: 4, Insightful

      People EXPECT doctors to do something, even when nothing is wrong. I've caught myself doing it and I *am* a doctor. It's human nature.

      When I took my board exams I studied old exams for weeks. The information in the exams wasn't really stuff directly from the curriculum; we covered the material but the focus was slightly different. In any case large portions of the information required to be regurgitated for the exam could be classified as "background" - stuff you need to be aware of but doesn't directly affect you in your daily work.

      The exam WAS multiple choice and I credit test-taking skills as much as my education for passing on the first try. Logic and the process of elimination can increase your odds to about 50/50 in most cases.

      --
      I have mod points. The reign of terror begins now.
    6. Re:Worthless by level_headed_midwest · · Score: 4, Insightful

      People always expect doctors to do something, even if the doctor is very vocal about there being no good treatment available. I've seen lots of people walk into doctors' offices and DEMAND a certain medication or treatment that is not advisable. A very common one used to be mothers demanding antibiotics to give to their kid who is sick with a viral flu. The doctor said in no uncertain terms that antibiotics will do absolutely nothing and that prescribing antibiotics will only cost money and perhaps have side effects. But the mothers had to have some medicine to feed to the kid just to satiate their mothering genes. Most of the docs I know told them to give the kid Tylenol if they had a fever or "prescribed" X ounces of fluids per hour- something to keep the mother mothering the kid.

      People will also want the doctor to do "something" even if nothing is wrong because they don't want to feel dumb for going when nothing was wrong. They want to justify that something was actually wrong so they don't feel foolish. Add to that the fact that most people have to pay something as a co-pay for a doctor's office visit and "want to get their money's worth."

      So sometimes picking "no action" can be very hard to do.

      --
      Just "gittin-r-done," day after day.
    7. Re:Worthless by An+Onerous+Coward · · Score: 3, Insightful

      I love Scrubs, too. But let's not go redesigning our medical qualifications system based on that one episode we saw that one time. :)

      I can only suppose that there are times when doing nothing beats doing something. But you seem to be saying that, because such situations do occur, it would be healthy to severely punish medical errors to the point where most doctors' first instinct is to do nothing, run another test, etc. Even though there may be times when that state of affairs would help certain patients, on balance I think it would make medical care worse.

      --

      You want the truthiness? You can't handle the truthiness!

    8. Re:Worthless by Macgrrl · · Score: 3, Insightful

      Here in Australia we have paid sick leave for permanent employees, but typically companies require that you present a doctor's certificate to prove you were sick. So even when you know that you only have a head cold and should be home in bed staying warm and keeping your fluids up, you have to track down and wait in the doctor's office for them to write on a bit of paper that you really are too sick to go to work and that you should be home in bed...

      On the flip side, my husband was mis-diagnosed by a number of doctors for over 15 years - he had severe sleep apnea to the point where he was having fits and seizures, memory loss and paranoia. It looks like I am finally getting a diagnosis after 20 years of intrusive tests for why I have near constant nausea, indigestion and vomiting.

      If the doctors didn't have to sausage factory process all the people who *know* what's wrong and what they have to do, they would probably have more time to spend with people who actually need help.

      --
      Sara
      Designer, Gamer, Macgrrl in an XP World
  2. When I was a boy... by WFFS · · Score: 4, Insightful

    Stories like this could never get on Slashdot. Seriously, this is like a maths problem I'd give to my Year 9 kids. This is definitely not news, and certainly doesn't matter.

  3. Education in taking the test by MagicDude · · Score: 4, Insightful

    As a medical student, I know how much our education is divided into what we do in real life, and what is the proper answer for exams. Quite often, during our education exercises, we're given scenarios like "A patient presents with symptoms X, Y and Z. What do you do next?". At that point, the resident says "You would diagnose condition A from those symptoms, but for the exam, you'd say you'd get an MRI to rule out B". So many questions are basically about having intuition for where the question is guiding you to, rather than practical medicine. Often, it's extremely difficult to discern what the question wants. There will be some question along the lines of "A patient presents with general fatigue over the past 3 months, which one blood test do you want to order?" and you'll narrow down the answer choices to either thyroid stimulating hormone or a complete blood count. Both studies are equally important in the evaluation of fatigue, but the question wants you to know which one is more important. In real life, you would always get both because both conditions are fairly common, and you want to evaluate both at once to save the patient time and effort. However, the question will nail you if you don't know some obscure study which states that there is like a 1% difference in the incidence of hypothyroidism vs anemia in fatigue. Moreover, if you were on the hospital floor and you were to say "I'm getting only a CBC, because it's more likely," the resident would chide you for not considering hypothyroidism and getting the thyroid stimulating hormone as well, making you look bad. So yeah, learning for the test doesn't really ever end.

  4. Re:warning moronic blog post linked by suv4x4 · · Score: 4, Insightful

    if anything testing has become FAR FAR too easy, people pass CS courses and come out the other side only to have a vague notion of how a computer works.

    I won't claim his post is correct or not, but he claims the technology behind such tests is wrong and lets less educated people pass through with guessing, while more educated people try to pass without guessing and fail.

    People see the tests produce poor selection, and make the tests harder and harder in an attempt to remedy this (but that won't work, since it's the technology of a test that's wrong).

    Then you come here and support his opinion 1:1 by claiming tests are too easy (i.e. should be harder) and idiots pass through.

    Ironic, isn't it.

  5. Re: Yuck by reason · · Score: 3, Insightful

    You're missing the point. Counting only correct answers on a multi-choice test doesn't measure what you know, or whether you have the necessary minimum knowledge.

    With 4 choices for each question on a 100 question test, the average student (student A) who knows 50% of the answers will get about 62 correct on average if they guess entirely at random when they don't know the answer (50 plus 50/4 correct guesses). The average student who knows only 25% of the material (student B) will get about 44 correct using the same approach (25 plus 75/4). Although A knows twice as much as B, A's score is only about 40% better (not 100%).

    Of course, it's even worse than this. First, because there is a large degree of scatter: a student choosing at random might do much better or much worse than this. Second, because multi-choice questions are often structured so that half of the possible answers are obviously incorrect, which changes the odds.

    With only two plausible answers to choose between, A might get 75 correct and B might get 63: in this case A, who knows twice as much as B, gets a score only 19% better than B.
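
    The numbers above can be reproduced with a few lines, as a sketch only. The function name is made up, and it assumes every unknown question is guessed uniformly among the plausible choices:

```python
def raw_score(known, plausible_choices, questions=100):
    """Expected number right when every unknown question is a uniform guess."""
    return known + (questions - known) / plausible_choices

for k in (4, 2):
    a = raw_score(50, k)  # student A knows 50 of 100
    b = raw_score(25, k)  # student B knows 25 of 100
    advantage = round(100 * (a / b - 1))
    print(f"{k} plausible choices: A={a}, B={b}, A is {advantage}% better")
```

    Without rounding the scores first, A's edge works out to about 43% with four choices and 20% with two, in line with the rounded figures above.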

    If points are subtracted for incorrect answers (say -1/4 pt to -1/2 for each one wrong), the effect of guesses can be taken out of the equation so that differences in scores actually reflect differences in knowledge. Or if the questions are easier, a smaller proportion of both students' answers will be guesses, so the effect should be smaller.
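
    To see the correction at work, here is a minimal sketch. It assumes the penalty per wrong answer is 1/(choices - 1), which is the -1/3 figure suggested elsewhere in this thread for four choices, not the -1/4 to -1/2 range above; the function name is illustrative:

```python
from fractions import Fraction

def corrected_score(known, choices, questions=100):
    """Expected score when each wrong answer costs 1/(choices - 1) of a point."""
    guessed = questions - known
    right = Fraction(guessed, choices)   # expected lucky guesses
    wrong = guessed - right              # expected wrong guesses
    return known + right - wrong * Fraction(1, choices - 1)

print(corrected_score(50, 4), corrected_score(25, 4))  # 50 25
```

    With that penalty the expected scores fall back to 50 and 25, so A's score is again twice B's. The scatter from lucky or unlucky guessing is still there; only the average is corrected.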

  6. Not Worthless by deskin · · Score: 3, Insightful

    Though some of his logic was overblown (see the comments made directly on his blog), I think his larger point has some merit. In fields which require lots of studying before beginning as a professional, such as medicine and law, you always hear that you have to be absolutely brilliant to 'get in'. The fact of the matter is that this is not the case: you should be darn smart, but you needn't be the best student in the world to be successful as a doctor. Many of the students who go to law or medical school (I'd guess most) are completely qualified for positions in their respective fields, but by the same token, are not necessarily any more qualified than their peers: they've all studied the same material, had the same experience in the lab, and know the whole picture within a reasonable approximation of each other.

    Yet to maintain the level of exclusivity that these careers have, there must be some way to select a subset of the candidates to proceed, and at this point, there are few distinguishing features among them. Some will be far and away brilliant, and will easily get a career regardless; but the majority can't be differentiated from one another. So, how should it be decided who is a doctor and who isn't? By making a test that's so hard it amounts to a randomising function, and then selecting a subset of top scorers to pass. Passing doesn't mean one is inherently more qualified; it just means one guessed better on that day. This also explains why people can pass on their second or third try: they are no better than their competitors the next time around, but eventually one will guess luckily, and get in. It'd be interesting to do some statistical analysis on how many tries it takes people to 'pass' a particular exam, and see if the results fit probabilistic models: If the results of such analysis fit too well, the test is too hard, whereas if they deviate greatly from probabilistic expectations, then the test is more likely to be an actual test of one's knowledge.
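
    As a rough sketch of what the "pure luck" baseline for such an analysis might look like: if every attempt were an independent draw with the same pass probability, the number of tries needed would follow a geometric distribution. The 0.35 below is just the pass rate quoted in the story summary, used as an assumed illustration, and the function name is made up:

```python
def pass_by_attempt(p, attempts):
    """Chance of having passed within `attempts` tries, if each try is an
    independent draw that succeeds with probability p."""
    return 1 - (1 - p) ** attempts

# Cumulative pass chance over the first four attempts under the luck-only model.
for n in range(1, 5):
    print(n, round(pass_by_attempt(0.35, n), 3))
```

    Real retake data that tracked this curve closely would point toward luck; data that strayed far from it would point toward the exam measuring something.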

    To be sure, there will be some individuals who can pass based entirely on their knowledge, just as there will be some individuals who simply aren't cut out for life as a lawyer that will fail the exam. But ultimately, it allows the higher-ups to select candidates for job positions based on the single indisputable criterion of the candidate having passed an exam, thus avoiding any messy issues when someone complains about them choosing a particular candidate in lieu of one better qualified.

    Time for a terrible analogy, since it's 0300 here: Really hard exams are the bouncers at the door to the club of medical careers.

  7. Disturbing by bryan1945 · · Score: 4, Insightful

    I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer almost all of my tests were long answer. Sure, some multiple-choice questions, but mostly show all your work or explain the whole process. And I just design systems and networks! Now someone can just luckily guess enough multiple choice questions and start slicing me up?

    Like I said, disturbing.

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
  8. Re:There may be unanswered questions by UnxMully · · Score: 4, Insightful

    Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.

    Fate, it seems, is not without a sense of irony.

  9. Re:I find Mr. Feldzamen's post hard to believe. by nagora · · Score: 4, Insightful

    If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeed.

    It doesn't actually suggest anything other than 50% of people that apply pass. I can design an exam which is very easy; I then say that only 50% will pass. It could be that the "cut" is anyone who scored 9+ out of ten will pass and everyone else fails. Or I could flip a coin. The pass rate is no guide to how hard an exam is nor how good a test of the candidates' abilities. It might be both hard and rigorous, but you can't infer that just from the pass rate.

    TWW

    --
    "Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
  10. Re:The problem there by that+this+is+not+und · · Score: 4, Insightful

    Just to pull out a snippet and maybe contribute a bit to topic drift:

    if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)

    If you ask that sort of question to a prospective programmer, you'll find out more about the person's technical depth, which may be of value. The guy who 'learned Java' because he read it somewhere or an 'advisor' told him it was a way to 'get ahead' is gonna be mister lightweight who is looking for a 'career,' not somebody who is a practitioner who takes a broad approach.

    Further, it will help sort the candidates out. The ones who contrive 'fake' knowledge of COBOL can be rooted out and eliminated. Those who are willing to say 'I am not sure I know, but that's an interesting question' get points, those who automatically start thinking about where to find the answer get even more points.

    And, of course, the question will help to sift out anybody with actual COBOL knowledge, because anybody with skill in COBOL who is applying for a Java position is obviously an unstable nut.

  11. Re:I had a teacher... by dcollins · · Score: 3, Insightful

    That guy's a fucking asshole. As a college teacher of math & CS (including assembly -- admittedly at a community college), guys like this just completely burn me up. Some people should completely not be teachers, they suck so fucking bad.

    I practically meditate before a final exam on how to make the environment as comfortable as possible, clearly explain in advance what the procedures will be like, and keep everything in the same rhythm as all my prior tests. Just freaking out students in a final exam because you're a sadist is utterly unacceptable. Jesus.

    --
    We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes