Slashdot Mirror


The Fallacy of Hard Tests

Al Feldzamen writes in with a blog post on the fallacious math behind many specialist examinations. "'The test was very hard,' the medical specialist said. 'Only 35 percent passed.' 'How did they grade it?' I asked. 'Multiple choice,' he said. 'They count the number right.' As a former mathematician, I immediately knew the test results were meaningless. It was typical of the very hard test, like bar exams or medical license exams, where very often the well-qualified and knowledgeable fail the exam. But that's because the exam itself is a fraud."

14 of 404 comments

  1. Worthless by kmac06 · · Score: 4, Insightful

    What a worthless post. He gave one situation where guessing is more important than knowledge, but didn't at all address the specifics of the tests he was talking about. A typical vapid blog that for some reason gets posted to /.

    1. Re:Worthless by Tatarize · · Score: 4, Insightful

      No. Guessing is simply the 25% bonus if you're one in four. The chance of passing the test is nearly null. You need to be 100 times smarter than that idiot who can only answer one question. Also, 2X as smart == 2X right answers? What the hell? My IQ is 140, find me somebody with an IQ of 70 and give us a test on anything. Sure as hell I'll get more than just twice as many right.

      1 for right answer.
      -1/4 for wrong answer.
      0 for no answer.

      Done.

      --

      It is no longer uncommon to be uncommon.
    2. Re:Worthless by Mr2001 · · Score: 4, Insightful

      1 for right answer.
      -1/4 for wrong answer.
      0 for no answer.

      ITYM -1/3 for each wrong answer. That way, the expected value of guessing is zero: on average, out of four guesses, you'll gain a point for one of them and lose it for the other three.
      --
      Visual IRC: Fast. Powerful. Free.
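
The arithmetic behind the two penalties quoted above can be checked with a minimal sketch. It assumes four answer choices per question and a purely blind guess; the function names are hypothetical and do not come from any real exam's scoring rules.

```python
from fractions import Fraction


def expected_guess_score(num_choices, wrong_penalty):
    """Expected score of one blind guess among num_choices options.

    A right answer scores +1; a wrong answer scores wrong_penalty.
    """
    p_right = Fraction(1, num_choices)
    return p_right * 1 + (1 - p_right) * wrong_penalty


def neutral_penalty(num_choices):
    """Penalty that makes a blind guess worth exactly zero on average."""
    return Fraction(-1, num_choices - 1)


# Assumption: four choices per question, as in the thread.
print(expected_guess_score(4, Fraction(-1, 4)))  # 1/16
print(expected_guess_score(4, Fraction(-1, 3)))  # 0
print(neutral_penalty(4))                        # -1/3
```

With a -1/4 penalty a blind guess on a four-choice question still gains 1/16 of a point on average. That penalty is only neutral when there are five choices.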
    3. Re:Worthless by Derekloffin · · Score: 5, Insightful
      Yeah, this is a pretty bloody poor analysis. If I know 2X as much (even assuming we could quantify it that easily), that doesn't automatically mean I get 2X the score on a test, and it certainly doesn't mean my guesses are just as bad as those of the guy with 1/2 my knowledge. It depends heavily on what my knowledge is and what is covered by the test. The potential is even there for the guy with 1/2 my knowledge to beat me simply by getting lucky on what the test covers.

      Just for an example, say we were doing a geography test on the states of the United States and their capitals. I know 1/2 of them, and another guy knows 1/4 of them. Each question is a simple four-option multiple-choice question: State X, what is its capital? A, B, C, or D. The thing is, even for those I don't know, I can eliminate 1/2 of the potential answers (on average) because I know which states they belong to, while the other guy, on average, can only eliminate 1/4 of them. So I would get 50% from knowing the answers, and about 1/2 of the remainder from guesses. The other guy would get 1/4 from knowing them, and only 1/3 of the rest from guesses. And that's just the basic mathematical flaw in his reasoning.
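
The capitals example works out as follows. This is a rough sketch that takes the comment's own figures at face value (guess success of 1/2 and 1/3 after elimination) rather than modelling the elimination exactly; `expected_score` is a hypothetical helper name.

```python
from fractions import Fraction


def expected_score(known, guess_success):
    """Expected fraction correct: known answers plus lucky guesses on the rest."""
    return known + (1 - known) * guess_success


# Figures taken from the comment: knows 1/2 and guesses right 1/2 of the time,
# versus knows 1/4 and guesses right 1/3 of the time.
me = expected_score(Fraction(1, 2), Fraction(1, 2))
other = expected_score(Fraction(1, 4), Fraction(1, 3))

print(me, other, me / other)  # 3/4 1/2 3/2
```

Under these assumptions, knowing twice as much gives 1.5 times the expected score, not 2 times.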

    4. Re:Worthless by ultranova · · Score: 4, Insightful

      For a medical specialist wouldn't:

      +1 for right (patient lives)
      0 for no answer (she knows she doesn't know and maybe consults with a colleague),
      -1e38 for wrong (patient dies)

      be more appropriate weightings?

      No. Everyone makes mistakes sometimes; a doctor who concentrates all his efforts into avoiding them will end up sending all his patients to see one expert or another. Not only does this overload the experts (who are supposed to see only a tiny subset of the patients, after all), but it also means it takes longer to get diagnosed. And in the long run, it means that only risk-takers will become doctors in the first place, which is not good for anyone.

      The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.

      So no, your weightings are not appropriate. You can't assign virtually infinite negative weight to failure and expect anyone to try - at least anyone you want performing medicine.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    5. Re:Worthless by aurispector · · Score: 4, Insightful

      People EXPECT doctors to do something, even when nothing is wrong. I've caught myself doing it and I *am* a doctor. It's human nature.

      When I took my board exams I studied old exams for weeks. The information in the exams wasn't really stuff directly from the curriculum; we covered the material but the focus was slightly different. In any case large portions of the information required to be regurgitated for the exam could be classified as "background" - stuff you need to be aware of but doesn't directly affect you in your daily work.

      The exam WAS multiple choice and I credit test-taking skills as much as my education for passing on the first try. Logic and the process of elimination can increase your odds to about 50/50 in most cases.

      --
      I have mod points. The reign of terror begins now.
    6. Re:Worthless by level_headed_midwest · · Score: 4, Insightful

      People always expect doctors to do something, even if the doctor is very vocal about there being no good treatment available. I've seen lots of people walk into doctors' offices and DEMAND a certain medication or treatment that is not advisable. A very common one used to be mothers demanding antibiotics to give to their kid who is sick with a viral flu. The doctor said in no uncertain terms that antibiotics will do absolutely nothing and that prescribing antibiotics will only cost money and perhaps have side effects. But the mothers had to have some medicine to feed to the kid just to satiate their mothering genes. Most of the docs I know told them to give the kid Tylenol if they had a fever or "prescribed" X ounces of fluids per hour- something to keep the mother mothering the kid.

      People will also want the doctor to do "something" even if nothing is wrong because they don't want to feel dumb for going when nothing was wrong. They want to justify that something was actually wrong so they don't feel foolish. Add to that the fact that most people have to pay a co-pay for a doctor's office visit and "want to get their money's worth."

      So sometimes picking "no action" can be very hard to do.

      --
      Just "gittin-r-done," day after day.
  2. When I was a boy... by WFFS · · Score: 4, Insightful

    Stories like this could never get on Slashdot. Seriously, this is like a maths problem I'd give to my Year 9 kids. This is definitely not news, and certainly doesn't matter.

  3. Education in taking the test by MagicDude · · Score: 4, Insightful

    As a medical student, I know how much our education is divided into what we do in real life and what is the proper answer for exams. Quite often, during our education exercises, we're given scenarios like "A patient presents with symptoms X, Y and Z. What do you do next?" That's when the resident says, "You would diagnose condition A from those symptoms, but for the exam, you'd say you'd get an MRI to rule out B." So many questions come down to having an intuition for where the question is guiding you, rather than practical medicine. Often, it's extremely difficult to discern what the question wants. There will be some question along the lines of "A patient presents with general fatigue over the past 3 months; which one blood test do you want to order?" and you'll narrow down the answer choices to either thyroid stimulating hormone or a complete blood count. Both studies are equally important in the evaluation of fatigue, but the question wants you to know which one is more important. In real life, you would always get both, because both conditions are fairly common and you want to evaluate both at once to save the patient time and effort. However, the question will nail you if you don't know some obscure study which states that there is something like a 1% difference in the incidence of hypothyroidism vs. anemia in fatigue. Moreover, if you were on the hospital floor and you were to say "I'm getting only a CBC, because it's more likely," the resident would chide you for not considering hypothyroidism and getting the thyroid stimulating hormone as well, making you look bad. So yeah, learning for the test doesn't really ever end.

  4. Re:warning moronic blog post linked by suv4x4 · · Score: 4, Insightful

    if anything testing has become FAR FAR too easy, people pass CS courses and come out the otherside only to have a vague notion of how a computer works.

    I won't claim his post is correct or not, but he claims the technology behind such tests is wrong and lets less educated people pass through by guessing, while more educated people try to pass without guessing and fail.

    People see the tests produce poor selection, and make the tests harder and harder in an attempt to remedy this (but that won't work, since it's the technology of the test that's wrong).

    Then you come here and support his opinion 1:1 by claiming tests are too easy (i.e. should be harder) and idiots pass through.

    Ironic, isn't it.

  5. Disturbing by bryan1945 · · Score: 4, Insightful

    I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer, almost all of my tests were long answer. Sure, there were some multiple-choice questions, but mostly it was show all your work or explain the whole process. And I just design systems and networks! Now someone can just luckily guess enough multiple-choice questions and start slicing me up?

    Like I said, disturbing.

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
  6. Re:There may be unanswered questions by UnxMully · · Score: 4, Insightful

    Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.

    Fate, it seems, is not without a sense of irony.

  7. Re:I find Mr. Feldzamen's post hard to believe. by nagora · · Score: 4, Insightful
    If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeed.

    It doesn't actually suggest anything other than 50% of people that apply pass. I can design an exam which is very easy; I then say that only 50% will pass. It could be that the "cut" is anyone who scored 9+ out of ten will pass and everyone else fails. Or I could flip a coin. The pass rate is no guide to how hard an exam is nor how good a test of the candidates' abilities. It might be both hard and rigorous, but you can't infer that just from the pass rate.
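
A small sketch of that point: if the cut is chosen so that a fixed quota passes, the pass rate comes out the same whether the exam is easy or hard. The scores, the 50% quota, and the function names below are made-up illustrations, not data from any real exam.

```python
def pass_rate(scores, cutoff):
    """Fraction of candidates scoring at or above the cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)


def cutoff_for_quota(scores, quota):
    """Lowest passing score when only the top `quota` fraction may pass.

    Ties at the cutoff would let slightly more than the quota through;
    the sample data below avoids ties at the boundary.
    """
    ranked = sorted(scores, reverse=True)
    n_pass = max(1, int(len(ranked) * quota))
    return ranked[n_pass - 1]


# Hypothetical scores out of ten for ten candidates on each exam.
easy_exam = [10, 10, 9, 9, 9, 8, 8, 8, 7, 7]
hard_exam = [6, 5, 5, 4, 4, 3, 3, 2, 2, 1]

for name, scores in (("easy", easy_exam), ("hard", hard_exam)):
    cut = cutoff_for_quota(scores, 0.5)
    print(name, "cut:", cut, "pass rate:", pass_rate(scores, cut))
```

Both exams report a 50% pass rate here, with a cut of 9 on the easy exam and 4 on the hard one.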

    TWW

    --
    "Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
  8. Re:The problem there by that+this+is+not+und · · Score: 4, Insightful

    Just to pull out a snippet and maybe contribute a bit to topic drift:

    if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)

    If you ask that sort of question to a prospective programmer, you'll find out more about the person's technical depth, which may be of value. The guy who 'learned Java' because he read it somewhere or an 'advisor' told him it was a way to 'get ahead' is gonna be mister lightweight who is looking for a 'career,' not somebody who is a practitioner who takes a broad approach.

    Further, it will help sort the candidates out. The ones who contrive 'fake' knowledge of COBOL can be rooted out and eliminated. Those who are willing to say 'I am not sure I know, but that's an interesting question' get points; those who automatically start thinking about where to find the answer get even more points.

    And, of course, the question will help to sift out anybody with actual COBOL knowledge, because anybody with skill in COBOL who is applying for a Java position is obviously an unstable nut.