The Fallacy of Hard Tests

Worthless by kmac06 · 2007-06-16 18:50 · Score: 4, Insightful

What a worthless post. He gave one situation where guessing is more important than knowledge, but didn't at all address the specifics of the tests he was talking about. A typical vapid blog that for some reason gets posted to /.

Re:Worthless by Tatarize · 2007-06-16 18:56 · Score: 4, Insightful

No. Guessing is simply the 25% bonus if you're one in four. The chance of passing the test is nearly null. You need to be 100 times smarter than that idiot who can only answer one question. Also, 2X as smart == 2X right answers? What the hell? My IQ is 140, find me somebody with an IQ of 70 and give us a test on anything. Sure as hell I'll get more than just twice as many right.

1 for right answer.
-1/4 for wrong answer.
0 for no answer.

Done.

--

It is no longer uncommon to be uncommon.
Re:Worthless by WFFS · 2007-06-16 19:01 · Score: 5, Funny

Ok... the test will be on... girls. Huh? What do you mean that isn't fair?
Re:Worthless by Mr2001 · 2007-06-16 19:02 · Score: 4, Insightful

1 for right answer.
-1/4 for wrong answer.
0 for no answer.

ITYM -1/3 for each wrong answer. That way, the expected value of guessing is zero: on average, out of four guesses, you'll gain a point for one of them and lose it for the other three.
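To make the arithmetic concrete, here is a minimal sketch of the expected value of one blind guess on a four-option question under several wrong-answer penalties. The function name `guess_expected_value` is purely illustrative, and the sketch assumes a blind guess is right with probability 1/4.

```python
from fractions import Fraction

def guess_expected_value(options, penalty):
    """Expected points from one blind guess: +1 if right, -penalty if wrong."""
    p_right = Fraction(1, options)
    return p_right * 1 - (1 - p_right) * penalty

# Penalties mentioned in the thread: none, 1/4, 1/3, and a full point.
for penalty in (Fraction(0), Fraction(1, 4), Fraction(1, 3), Fraction(1)):
    value = guess_expected_value(4, penalty)
    print(f"penalty {penalty}: expected value of a guess = {value}")
```

Under these assumptions only the 1/3 penalty makes a blind guess worth exactly zero; a 1/4 penalty still leaves a small positive expectation, and a full-point penalty makes guessing a losing bet.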

--
Visual IRC: Fast. Powerful. Free.
Re:Worthless by KDR_11k · 2007-06-16 19:03 · Score: 1

Usually it's +1/-1. Also, I don't get how he can deduce from the passing rate that most people answered randomly: you need more than 50% of the score to pass any test, and random guesses, even without a penalty for wrong answers, would land you at 25%.

--
Justice is the sheep getting arrested while an impartial judge declares the vote void.
Re:Worthless by Derekloffin · 2007-06-16 19:09 · Score: 5, Insightful

Yeah, this is a pretty bloody poor analysis. If I know 2X as much (even assuming we could quantify it that easily), that doesn't automatically mean I get 2X the score on a test, and it certainly doesn't mean my guesses are equally as bad as the guy with 1/2 my knowledge. It depends heavily on what my knowledge is and what is covered by the test. The potential is even there for the guy with 1/2 my knowledge to beat me just simply by getting lucky on what the test covers.
Just for an example, say we were doing a geography test on the states of the United States and their associated capitals. I know 1/2 of them, and another guy knows 1/4 of them. Now, each question is a simple 4-option multiple-choice question: State X, which is its capital? A, B, C, or D. The thing is, even for those I don't know, 1/2 the potential answers (on average) I can eliminate as I know them, while the other guy, on average, can only eliminate 1/4 of them. So, I would get 50% on knowing the answers, and about 1/2 of the remaining on guesses. The other guy would get 1/4 on knowing them, and only 1/3 of the rest on guesses. And that's just the basic mathematic flaw in his reasoning.
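The capitals example can be worked through in a few lines. This is only a sketch of the simplified model described above: the guess rates (1/2 and 1/3) are the rough averages given in the comment, and `expected_score` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_score(known, guess_rate):
    """Fraction of questions right: known outright, plus lucky guesses on the rest."""
    return known + (1 - known) * guess_rate

# Knows half the capitals, eliminates two of four options when guessing.
me = expected_score(Fraction(1, 2), Fraction(1, 2))
# Knows a quarter of the capitals, eliminates one of four options when guessing.
other = expected_score(Fraction(1, 4), Fraction(1, 3))

print(f"knows 1/2: expected score {me}")
print(f"knows 1/4: expected score {other}")
print(f"score ratio: {me / other}")
```

With these numbers, twice the knowledge yields 3/4 against 1/2, a ratio of 3/2 rather than 2, which is the gap between "knows twice as much" and "scores twice as much".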
Re:Worthless by Score+Whore · 2007-06-16 19:09 · Score: 2, Insightful

He also assumes that you either know the right answer or know nothing. Here's a pretty hard test for him where a person with some knowledge but without the actual answer will do better than a person with no knowledge:

1. What number am I thinking of?

a) cheese
b) galaxy
c) 3
d) 1

A person who knows (literally) nothing has a 1 in 4 chance of getting it right. A person who knows what a number is has a 1 in 2 chance. You stick one hundred questions on a test and someone who is versed in the material will score better by eliminating answers they know are wrong than someone who guesses at all the answers.

This guy probably failed both the MCAT and LSAT and topped it off by bombing the GRE. What a putz.
Re:Worthless by phunctor · 2007-06-16 19:12 · Score: 2, Insightful

For a medical specialist wouldn't:

+1 for right (patient lives)
0 for no answer (she knows she doesn't know and maybe consults with a colleague),
-1e38 for wrong (patient dies)

be more appropriate weightings?

Many medical specialists could use a tuneup on the difference between confidence and arrogance...

--
phunctor
Re:Worthless by IP_Troll · 2007-06-16 19:14 · Score: 2, Insightful

Agreed this post is worthless.

Has the author of this blog got any scientific results to back up his claims? The NY State Bar has a statistical analysis of who passed its bar exam. http://www.nybarexam.org/NCBEREP.htm

like bar exams or medical license exams, where very often the well-qualified and knowledgeable fail the exam.
IMHO there are only two reasons why the well-qualified and knowledgeable fail such exams.* They didn't study or they studied the wrong materials. We have all had that one exam we did REALLY poorly on and we would like to blame someone other than ourselves for our bad grade. This post merely plays to those emotions with anecdotal evidence. Mod me as troll if you like, but you know it's true.

* How can someone be considered qualified and knowledgeable about a subject if they cannot pass the test which determines whether they are? I assume the blog writer means generally intelligent people.
Re:Worthless by Anonymous Coward · 2007-06-16 19:17 · Score: 1, Insightful

I would love to know in what capacity this guy was a "former mathematician". His knowledge of statistics and probability seems limited to 5th grade fractions.

Teaching a remedial math class does not make you a mathematician.
Re:Worthless by KDR_11k · 2007-06-16 19:24 · Score: 1

Yes, but it still looks like he read the % passing number as the average score on the test. He doesn't know the passing score any more than we do, yet he seems to base his claims on the % passing.

--
Justice is the sheep getting arrested while an impartial judge declares the vote void.
Re:Worthless by nephyo · 2007-06-16 19:52 · Score: 5, Informative

His argument is that the harder the test, the less relevant knowledge of the actual answers to the questions posed on the test is to determining your relative score. As a result, on a very hard test, two test takers with vastly different levels of knowledge of the correct answers to the test questions do not on average end up with scores that reflect that difference.
The "educated guess" does not contradict that argument. Again, the harder the test, the smaller the difference between the number of potentially correct answers you can eliminate and the number that he can eliminate. With a sufficiently hard test, "educated guessing" makes no difference whatsoever.
So basically, with a multiple-choice test that counts only the correct answers, increasing the difficulty is not an effective means of making the test more likely to accurately filter out candidates with lesser knowledge of the subject matter covered by the test. Increasing the difficulty only increases the degree to which randomness has an impact on the results.
This is true, well known, and not very controversial. However, you would of course need to examine the specific tests in question to determine whether they are effective. They may have other features to help mitigate this effect. Also, his analysis is purely mathematical. It doesn't take into account the likelihood of a challenging test to create social pressure that influences people to self-filter. It could be argued that most of these tests are not testing the taker's knowledge of the material so much as they are testing the taker's ability to study and react to the pressure that the tests provide.
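One way to see the "harder test, more randomness" effect is to compare the expected score gap between two test takers with the spread that blind guessing adds. The sketch below assumes a 100-question, four-option test scored on right answers only, one taker who knows a fraction `known_frac` of the answers and another who knows twice that, and pure blind guessing on everything else. The helper names are illustrative, and fractional question counts are treated as a continuous approximation.

```python
import math

def score_mean_sd(n, known_frac, options=4):
    """Mean and standard deviation of the raw score when unknown items are guessed blindly."""
    p = 1 / options
    guessed = n * (1 - known_frac)
    mean = n * known_frac + guessed * p
    sd = math.sqrt(guessed * p * (1 - p))
    return mean, sd

def gap_in_sds(n, known_frac):
    """Expected gap between someone knowing 2x as much, in units of combined guessing noise."""
    mean_a, sd_a = score_mean_sd(n, known_frac)
    mean_b, sd_b = score_mean_sd(n, 2 * known_frac)
    return (mean_b - mean_a) / math.sqrt(sd_a ** 2 + sd_b ** 2)

# A harder test means both takers know a smaller fraction of the answers.
for k in (0.40, 0.20, 0.10, 0.05):
    print(f"knows {k:.2f} vs {2 * k:.2f}: gap = {gap_in_sds(100, k):.2f} SDs")
```

As the known fraction shrinks, the gap measured in standard deviations shrinks with it, so luck accounts for a growing share of who comes out ahead.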

--
I grant all that I write to the public domain.
Re:Worthless by Anonymous Coward · 2007-06-16 20:08 · Score: 4, Funny

a) cheese
I have the same combination on my luggage!
Re:Worthless by Znork · 2007-06-16 20:55 · Score: 2, Insightful

"It doesn't take into account the likelihood of a challenging test to create social pressure that influences people to self-filter."

Mmm. I'm not sure that would be a desirable feature; that'd bias the test situation in favour of arrogant idiots. For some professions confidence may be more desirable than knowledge (marketing?), but for a doctor I think one would prefer someone being reluctantly right than someone being confidently wrong.
Re:Worthless by loganrapp · 2007-06-16 21:14 · Score: 5, Funny

A test on girls isn't fair because no matter what answer you give, it'll be wrong.
Re:Worthless by EsbenMoseHansen · 2007-06-16 21:22 · Score: 1

A test on girls isn't fair because no matter what answer you give, it'll be wrong.
That just means that the maximum score is 0 :p

--
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Re:Worthless by Derekloffin · 2007-06-16 21:27 · Score: 5, Interesting

The "educated guess" does not contradict that argument. Again, the harder the test, the smaller the difference between the number of potentially correct answers you can eliminate and the number that he can eliminate. With a sufficiently hard test, "educated guessing" makes no difference whatsoever.
Actually, the problem here is his example is a total worst case scenario and doesn't tell us what the 'Pass' level is. The tests mentioned are not relative knowledge tests, they are pass/fail tests. In other words, I don't care how much Joe knows compared to Bill; all I care about is whether Joe demonstrates the necessary level of knowledge to pass. In that case, assuming the test maker has the slightest clue, in the example the pass mark would likely be at 75%+ (you need about 1/2 right legit, and 1/2 of the remaining right on guesses or better), meaning that its difficulty is fine as it has correctly blocked both people as they didn't show the necessary level of knowledge.
He might have a point IF he qualified this to scaled result tests (i.e., the top X people will pass regardless of their scores, only relative position counts), but he didn't. But, even in that case he'd have to analyze the distribution of all testees, not just 2. Once again, his math does work and doesn't support the argument.
Re:Worthless by DocDJ · 2007-06-16 21:50 · Score: 2, Informative

Well, your IQ may be 140, but you don't understand IQ tests or probability distributions if you think 2*IQ == twice as smart.
Re:Worthless by Mike1024 · 2007-06-16 22:12 · Score: 1, Insightful

And that's just the basic mathematic flaw in his reasoning.

I'm not sure I agree with the assumptions you make, but I agree this chap has an analysis that doesn't make much sense.

Consider a 4-option multiple choice test, where you get one point for a correct answer and zero points for an incorrect answer, and there are 100 questions.

0 known + 100 * 1/4 = 25 right
20 known + 80 * 1/4 = 40 right
40 known + 60 * 1/4 = 55 right
60 known + 40 * 1/4 = 70 right
80 known + 20 * 1/4 = 85 right
100 known + 0 * 1/4 = 100 right

Number right = 0.75 * number known + 25

Now, clearly such a test would be BS if the passing grade was 25 right, as everyone would pass. And if the passing grade was close to 25 right (e.g. 27 right) you would get a lot of people passing by luck.

However, if the passing grade is 75% right, you would have to know (75-25)/0.75 = 66.67 answers in order to get a passing grade. And assuming the test designers knew this when choosing the pass mark for the test, they would simply have increased the pass mark to take the relationship into account.
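The linear relationship and its inverse are easy to check. This is a small sketch under the same assumptions as the table above (100 questions, four options, no penalty, blind guessing on every unknown question); the function names are made up for illustration.

```python
def expected_right(known, total=100, options=4):
    """Expected number right: known answers plus 1/options of the blind guesses."""
    return known + (total - known) / options

def known_needed(pass_mark, total=100, options=4):
    """Invert expected_right: answers you must actually know to expect pass_mark right."""
    return (pass_mark - total / options) / (1 - 1 / options)

for known in (0, 20, 40, 60, 80, 100):
    print(f"{known:3d} known -> {expected_right(known):.0f} right on average")

print(f"known needed for a 75% pass mark: {known_needed(75):.2f}")
```

A test designer who wants "knows at least X answers" as the real threshold can run `expected_right(X)` to pick the published pass mark.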

Perhaps the blogger has "35% of people pass" confused with "35% right is a passing grade".

--
"Goodness me, how unlike the FBI to abuse the trust of the American public." -- The Onion
Re:Worthless by gnasher719 · 2007-06-16 22:17 · Score: 1, Informative

'' Also, 2X as smart == 2X right answers? What the hell? My IQ is 140, find me somebody with an IQ of 70 and give us a test on anything. ''

Anyone who really has an IQ of 140 would know that IQ = 140 doesn't mean "twice as smart" as someone with an IQ of 70.

IQ is an adjusted measurement; adjusted in such a way that it is normally distributed with an average of 100 and a standard deviation of 15.
Re:Worthless by pionzypher · 2007-06-16 22:25 · Score: 5, Funny

No. I think what he was trying to say was that no matter what, you'd never score with a girl. ;)

--
I'll believe in corporations having personhood when Texas executes one... - advocate_one
Re:Worthless by TheRaven64 · 2007-06-16 22:30 · Score: 2, Insightful

They should also know that IQ relates to a particular subset of reasoning skill, not to knowledge, and definitely not to knowledge in all fields. If you gave me and someone with half my IQ a test on, say, baseball then they would almost certainly do better than me; all of my answers would be guesses and so any knowledge that they had would give them an edge, no matter how stupid they were. This, of course, raises a problem that is present in a lot of exams - even a few on my degree course - that they test knowledge, rather than understanding. These days, specific knowledge is almost worthless, since it's so easy to acquire it when you need it, but being able to do something useful with the knowledge is definitely valuable.

--
I am TheRaven on Soylent News
Re:Worthless by ultranova · 2007-06-16 22:33 · Score: 4, Insightful

For a medical specialist wouldn't:

+1 for right (patient lives)
0 for no answer (she knows she doesn't know and maybe consults with a colleague),
-1e38 for wrong (patient dies)

be more appropriate weightings?

No. Everyone makes mistakes sometimes; a doctor who concentrates all his efforts into avoiding them will end up sending all his patients to see one expert or another. Not only does this overload the experts (who are supposed to see only a tiny subset of the patients, after all), but it also means it takes longer to get diagnosed. And in the long run, it means that only risk-takers will become doctors in the first place, which is not good for anyone.

The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.

So no, your weightings are not appropriate. You can't assign virtually infinite negative weight to failure and expect anyone to try - at least anyone you want performing medicine.
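The effect of a huge penalty can be put as a break-even confidence: with +1 for a right answer, 0 for no answer, and -c for a wrong one, answering only pays when your probability of being right exceeds c / (1 + c). The sketch below just evaluates that formula; `break_even_confidence` is a hypothetical name, and exact fractions are used because floating point would round the 1e38 case to exactly 1.

```python
from fractions import Fraction

def break_even_confidence(penalty):
    """Smallest probability of being right at which answering beats abstaining.

    Answering is worth p * 1 - (1 - p) * penalty, which is positive
    only when p > penalty / (1 + penalty).
    """
    penalty = Fraction(penalty)
    return penalty / (1 + penalty)

print(break_even_confidence(Fraction(1, 3)))       # exam-style penalty
print(float(break_even_confidence(1)))             # +1/-1 scoring
print(1 - break_even_confidence(10 ** 38))         # allowed doubt at -1e38
```

With a penalty of 1e38 the allowed doubt is about one part in 1e38, so under that weighting nobody would ever commit to an answer.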

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Worthless by clifyt · 2007-06-16 23:26 · Score: 5, Interesting

In a well designed exam, the 'educated guess' is just as much a part of the design as anything else. You *HAVE* to have questions that have answers somewhat similar, or you make it way too easy to guess the answer by way of elimination. At the same time, we want enough questions where one can eliminate one or more answers immediately.

For instance, I'm a testing person, but not a content person (i.e., I design towards what the stats tell me, as well as the actual wording and structure of the exam...I always work with someone who understands the content areas from a very advanced level and can deal with that end). One of the last MC exams I was helping validate, I knew NOTHING about the content -- it was a medical exam. First thing I did was go through the entire exam, read all the questions quickly, and see if logic could remove any of the answers. Statistically, I would have gotten a 20% by random means, but in this case, I received somewhere around 43% (if I remember correctly). The educated guess is a BIG part of these things...you aren't just measuring content knowledge, but application and that means if someone can raise the bar, they might actually do well in the real world. If I had a doctor who had never seen a case like mine, and it defied traditional practice, I think I'd be more impressed with the man that got 40% on purely logic, than the guy that got the 40% based around actually knowing something about the problem (and actually, I had a team of doctors several years ago like this...I sat around trying to figure out how I was going to die for a couple of months while one doctor who had seen problems like mine couldn't figure out what the cause was, while the one that wasn't an expert in the field methodologically ruled out what wasn't the cause, and ended up finding me a specialist that the first doctor SHOULD have been able to do because his field encompassed a hell of a lot more of the specialty than my general physician's 'specialty').

And it kinda depends on the type of test and what you are measuring. When designing these things, you ask a lot of questions based around the type of assessment one is looking for. And you design accordingly. By correlating my exams with others that have some sense of validity, I can see the levels of the testees before they take the new one. This in itself will show you quite a few things about the design of the new exam. For instance, we can tell certain questions might have 50% of the folks answering correctly, but which 50%? On the original test, you have two groups take the exam, novices and experts (and heavily simplifying this for /.). If the experts get the question wrong, while the novices get it right -- the question is struck. Someone with little experience in test design may look at the question and wonder what's wrong -- the answer is correct and all of your colleagues agree -- but in some way it is wrongly worded. So again, it is either struck, or restructured to be inserted for calibration and validation at a later point (on a large exam like the Bar the author had derided, a good chunk of the questions are probably not scored and are only there to see how well they work and if they can be put into the next exam).
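The expert-versus-novice check described here resembles what test designers usually call an item discrimination index: the share of the stronger group answering an item correctly minus the share of the weaker group. The following is a rough sketch with invented response data, not the procedure any particular exam board uses; `discrimination`, `flag_items`, and the item names are all hypothetical.

```python
def discrimination(expert_correct, novice_correct):
    """Share of experts answering correctly minus share of novices answering correctly."""
    p_expert = sum(expert_correct) / len(expert_correct)
    p_novice = sum(novice_correct) / len(novice_correct)
    return p_expert - p_novice

def flag_items(items, threshold=0.0):
    """Return the names of items where experts do no better than novices."""
    return [name for name, (experts, novices) in items.items()
            if discrimination(experts, novices) <= threshold]

# Invented data: 1 = answered correctly, 0 = answered incorrectly.
items = {
    "Q1": ([1, 1, 1, 0], [0, 1, 0, 0]),  # experts do better: keep
    "Q2": ([0, 0, 1, 0], [1, 1, 1, 0]),  # novices do better: strike or reword
}

print(flag_items(items))
```

Both hypothetical questions could show the same overall percentage correct, so it is the split between groups, not the raw difficulty, that exposes the badly worded one.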

Beyond that, you have panels of experts who go over questions. Have them all vote on things like the difficulty of the item, the appropriateness for the exam. Things like that. Folks like me will take these and sort the items into usable or unusable stacks, rewrite them (again with experts), and then sort X amount of the lower difficulty, Y of the medium, Z of the hard (the easy questions are there to give motivation...it's amazing how much better someone will do if they get positive reinforcement in that they KNOW this question...it will prime the neural pathways to hopefully give more routes to specific knowledge in order to get the reward...I can feel the endorphin rush when I'm doing poorly but then get a win every now and then and it helps). And finally, one analyzes everything to see h
Re:Worthless by hkBst · 2007-06-16 23:28 · Score: 1

Wow, your IQ may be 140, but you're lacking knowledge and... nah, no way your IQ is even close to 140. The author is no genius, but clearly he defines being twice as smart as knowing the answers to twice as many questions. This is very reasonable, unlike your own suggestion that IQ is a linear scale. You might want to look up standard deviation and how it relates to IQ. Finally your weights for 4-way multiple choice questions are wrong, unless you want to punish guessing; -1/5 would be the correct value for a wrong answer to a 4-way question. My guess is that the IQ test you possibly took is much like the tests derided in the article.
Re:Worthless by EsbenMoseHansen · 2007-06-16 23:28 · Score: 4, Funny

No. I think what he was trying to say was that no matter what, you'd never score with a girl. ;)
Well, as a (happily) married man and considering the 6 point odd billion global population, I'd say it isn't quite impossible to score with a girl. You just have to learn that there is no correct answer, especially on multiple choice. ;)

--
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Re:Worthless by Jaidan · 2007-06-16 23:32 · Score: 5, Interesting
No, just no. The point of the tests is to determine who is over a particular threshold of knowledge and who isn't. The method being called a fraud fails to accurately do that. Since randomness has a proven substantial impact on those tests, that threshold becomes blurred. To make matters worse, the harder the test, the MORE randomness affects the score. As a result the test results are meaningless at any scale. His examples were simplified to illustrate the essential math behind them; he does not need more than 2 people to compare since the math is equally applicable no matter how many are tested. He also does not need to set a scale because the math is equally applicable to any bar you might set.

The point of the article was to illustrate that these hard tests are meant to establish a minimum required level of knowledge, however due to the nature of counting only correct answers, randomness incurs a great penalty to the accuracy of the attempted measurement of knowledge. He is suggesting, and rightly so, that a test in which guessing has an effective net effect of 0 would much more accurately measure the knowledge of the participants by reducing the effects of guessing to nearly 0.
What this really comes down to is accuracy and precision. We assume that a test score can be equated to a measurement of knowledge, and for your benefit (it's completely irrelevant) we'll assume that a passing test is 60%.
- We give 1 person 5 different tests. We allow for random guessing with no penalty, and the test is very hard. He takes them all and scores wildly different, but averages 65% across all of the tests. If I was to know for a fact that the person in question does indeed deserve to score a 65% then we can say the test was very accurate, but low in precision. On any given test the subject may have passed or failed depending on his luck with guessing.
- We now give the same person 5 new tests. This time we remove randomness for the most part by penalizing wrong answers by an amount that results in an effective gain of 0 for random guessing. This time, when he takes the tests, all his scores are within a few points of each other, and in fact he averages 65% again. In this case the test is highly accurate and is also high precision. On any given test the subject most likely would pass.
The article's math indeed illustrates this point very clearly. The unspoken point is that in tests such as these, designed to set standards to be met, it is a fraud to use a test with low accuracy at measuring actual knowledge. The precision gained by penalizing guessing allows the test to be much more fair in its administration.
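The precision side of this can be checked per question. The sketch below computes the mean and variance of the points from a single blind guess on a four-option item, with and without a 1/3 penalty. It assumes the test taker really does guess blindly, and `guess_mean_var` is an illustrative name.

```python
from fractions import Fraction

def guess_mean_var(options, penalty):
    """Mean and variance of points from one blind guess (+1 right, -penalty wrong)."""
    p = Fraction(1, options)
    mean = p - (1 - p) * penalty
    second_moment = p * 1 + (1 - p) * penalty ** 2
    return mean, second_moment - mean ** 2

for label, penalty in (("no penalty", Fraction(0)), ("1/3 penalty", Fraction(1, 3))):
    mean, var = guess_mean_var(4, penalty)
    print(f"{label}: mean {mean}, variance {var}")
```

Under these assumptions the penalty removes the average gain from guessing, but a guess that is still made stays just as noisy per question. The tighter scores described above therefore depend on test takers actually leaving unknown questions blank, since a blank contributes exactly 0 with no spread.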
http://en.wikipedia.org/wiki/Accuracy
--
Mobius Custom Computers
Re:Worthless by mobby_6kl · 2007-06-16 23:34 · Score: 1

My IQ is 280, let's go take a test on anything together!
Re:Worthless by mobby_6kl · 2007-06-16 23:55 · Score: 3, Funny

>That just means that the maximum score is 0 :p

So it's rather like darts? You start with a constant number of points, say 300, and then with each answer you give points are subtracted from your total score. The game ends when you inevitably reach 0.
Re:Worthless by Firethorn · 2007-06-17 00:16 · Score: 2, Informative

We give 1 person 5 different tests. We allow for random guessing with no penalty, and the test is very hard. He takes them all and scores wildly different, but averages 65% across all of the tests.

Statistics show that this would be very unlikely for 5 tests with questions pulled from a common pool.

The odds of WAGing a multiple choice question are 25%. When distributed over a hundred questions, it's fairly unlikely (roughly a one-in-five chance) that random guessing will score above 30% or below 20%, and that's for guessing the entire test.
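How tightly pure guessing clusters around 25% can be computed exactly from the binomial distribution. This is a minimal sketch assuming 100 independent four-option questions guessed blindly; the helper names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_between(n, p, lo, hi):
    """Probability that the number of successes falls in [lo, hi] inclusive."""
    return sum(binom_pmf(n, k, p) for k in range(lo, hi + 1))

inside = prob_between(100, 0.25, 20, 30)
print(f"P(20 <= score <= 30) = {inside:.3f}")
print(f"P(score < 20 or score > 30) = {1 - inside:.3f}")
print(f"P(score >= 50) = {prob_between(100, 0.25, 50, 100):.2e}")
```

The last line is the more relevant one for a pass/fail exam: landing a few points outside the 20 to 30 band is not rare, but guessing all the way up to a typical pass mark is vanishingly improbable.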

--
I don't read AC A human right
Re:Worthless by The+One+and+Only · 2007-06-17 00:26 · Score: 4, Funny

You just described every relationship I've ever been in.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:Worthless by vertinox · 2007-06-17 00:31 · Score: 2, Interesting

The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.

Have you ever seen that episode of Scrubs where they take that wealthy hospital donor to every department to try to figure out what is wrong with them, but no one knows?

It turned out the best solution was to do nothing at all (which it turned out the protagonist did because he simply did not know what was wrong) and the problem went away.

Had they actually done something, it might have caused more of a problem than the correct answer of "do nothing".

--
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
Re:Worthless by Goaway · 2007-06-17 00:46 · Score: 1

You know, looking at the fact that his blog has a single post, this one, and that he has another blog on blogspot which also has a single post, and both look a whole lot like they are designed to touch on subjects which are likely to get them onto news sites like Slashdot or Digg, and then generate large amounts of discussion because they are obviously silly, I'm assuming the guy is doing some good old-fashioned trolling for one reason or another.
Re:Worthless by aurispector · 2007-06-17 00:57 · Score: 4, Insightful

People EXPECT doctors to do something, even when nothing is wrong. I've caught myself doing it and I *am* a doctor. It's human nature.

When I took my board exams I studied old exams for weeks. The information in the exams wasn't really stuff directly from the curriculum; we covered the material but the focus was slightly different. In any case large portions of the information required to be regurgitated for the exam could be classified as "background" - stuff you need to be aware of but doesn't directly affect you in your daily work.

The exam WAS multiple choice and I credit test-taking skills as much as my education for passing on the first try. Logic and the process of elimination can increase your odds to about 50/50 in most cases.

--
I have mod points. The reign of terror begins now.
Re:Worthless by level_headed_midwest · 2007-06-17 01:12 · Score: 4, Insightful

People always expect doctors to do something, even if the doctor is very vocal about there being no good treatment available. I've seen lots of people walk into doctors' offices and DEMAND a certain medication or treatment that is not advisable. A very common one used to be mothers demanding antibiotics to give to their kid who is sick with a viral flu. The doctor said in no uncertain terms that antibiotics will do absolutely nothing and that prescribing antibiotics will only cost money and perhaps have side effects. But the mothers had to have some medicine to feed to the kid just to satiate their mothering genes. Most of the docs I know told them to give the kid Tylenol if they had a fever or "prescribed" X ounces of fluids per hour- something to keep the mother mothering the kid.

People will also want the doctor to do "something" even if nothing is wrong because they don't want to feel dumb for going when nothing was wrong. They want to justify that something was actually wrong so they don't feel foolish. Add to that the fact that most people have to pay some as a co-pay for a doctor's office visit and "want to get their money's worth."

So sometimes picking "no action" can be very hard to do.

--
Just "gittin-r-done," day after day.
Re:Worthless by An+Onerous+Coward · 2007-06-17 01:27 · Score: 1

Interesting that you brought up baseball. I read a book by Arthur Jensen a decade ago (I didn't realize at the time how controversial he was). I think he was trying to argue against one of the standard objections to testing. Specifically, that tests with domain-specific content were biased against minorities who were less likely to have been exposed to that content.

Anyhow, part of his argument was that intelligent people were more likely to pick up and retain content. To illustrate, he pitted one of his colleagues -- a very high-IQ person who claimed to know nothing of baseball and who took great pride in his lack of said knowledge -- against a high-functioning mentally retarded man who was obsessed with baseball. In this instance, at least, his colleague won by a landslide.

I only bring this up to support my hunch that you probably know more about baseball than you suspect.

--
You want the truthiness? You can't handle the truthiness!
Re:Worthless by nomadic · 2007-06-17 01:33 · Score: 1

IMHO there are only two reasons why the well-qualified and knowledgeable fail such exams.* They didn't study or they studied the wrong materials. We have all had that one exam we did REALLY poorly on and we would like to blame someone other than ourselves for our bad grade. This post merely plays to those emotions with anecdotal evidence. Mod me as troll if you like, but you know it's true.

I don't think that's true at all; some people are naturally poor test takers. I know people I went to law school with who were naturally smart people, studied greatly for the bar exam (and they studied what they were supposed to), yet still have failed more than once.

Meanwhile I slacked off in law school, and started seriously studying for the bar exam only 2 weeks before, and I got an astronomically high score--I just am very, very good at standardized tests. I didn't deserve my grade, and I didn't have an especially strong grasp of the material, I just sort of instinctively know what answer they look for. It's not especially fair.
Re:Worthless by TheRaven64 · 2007-06-17 01:41 · Score: 1

I'd be interested to take that test now. I think I would have a significant advantage when it came to doing badly, however, since I live in the UK, a country in which baseball is neither played, nor shown on TV, and has a very small following. I would expect the average American's knowledge of baseball to be about the same as my knowledge of football (soccer), since it's part of the ambient culture that you can't really avoid.

--
I am TheRaven on Soylent News
Re:Worthless by lawpoop · 2007-06-17 01:46 · Score: 1

This is not to excuse someone who does poorly on tests because they didn't know the material, but some people have test anxiety, where the situation of a high pressure test, such as the SAT causes them to choke. Otherwise they are normal or good students and do well, even on exams. It's not that they can't do exams, but the high-pressure scenario really freaks them out. It's like people who have a fear of doctors and hospitals have a higher recorded blood-pressure than they do in everyday life, because they are stressed out when the doctor measures it!

I've never had a problem with tests, and I think it's because I understand them on a social-engineering level better than the average person. I use strategies such as elimination, or figuring out what common pitfall the instructor is trying to set for us, rather than looking at them as just a knowledge dump. Although I do well on tests, I wouldn't mind if we got rid of the high pressure, keep silent, face on your own paper, 'final' tests. My problem with that is we never really encounter a situation like that where we are actually using and applying our knowledge. Sure, we face other high-pressure scenarios, such as the server being down, or meetings with high-level executives, but the high-stakes testing doesn't 'test' that. A high-stakes test is more or less an arbitrary ordeal that exists only in education. We don't deal with the trickery and psych-outs of multiple choice in any other arena of our lives. You get no points for creativity or thinking outside the box in multiple choice. It also encourages procrastination, cramming, and waiting till the last minute to clear a hurdle, rather than taking the proper time and energy to complete a task or course of study.

We would do much better to give students the final grade based on their portfolio of work, final essays that they had time to complete, or final projects. But, since multiple choice is standardized and easy to grade by machine, that's why we use it. Everyone is a cog in the machine.

--
Computers are useless. They can only give you answers.
-- Pablo Picasso
Re:Worthless by ryanov · 2007-06-17 01:47 · Score: 5, Interesting

I have a lot of experience with this lately, having come down with an odd virus that had no treatment but was/is excruciatingly painful. There may be no treatment available, but I wager the vast majority of these folks who go to a doctor but have nothing wrong with them DO have some symptom or another... for me, getting the symptom treated is almost equally as important as having the cause treated, as I probably wouldn't have gotten out of my chair without it. One doctor recently seemed much more concerned with the cause and the symptom was nearly an afterthought -- as a result, I was in a lot of pain for 24 hours with no way to fix it. He saw the antibiotic as more important (though it ultimately turned out not to be bacterial), but I saw something for pain to be something that should have happened immediately.

Another thing -- most people want to feel like the doctor at least LOOKED for something. One doctor I went to recently made me wait 40 mins to see him and then looked at me for like 30 seconds and prescribed something. Yes, that makes sense if you know what it is straight off and know what to do about it, but you might just wanna look for other things that I /didn't/ mention, in case I have more than one thing or in case there are different diagnoses that have similar symptoms except for a couple.
Re:Worthless by hey! · 2007-06-17 02:30 · Score: 1

You have apparently missed the point. All we need to know for his purposes about the tests is that (a) they are multiple choice and (b) they only count right answers to arrive at the score.

You need to deduct wrong answers from the score, not just add up the right answers.

Suppose knowledge capable of answering 60 out of the 100 questions on the exam is what you need to become a brain surgeon. But you only have enough knowledge to answer 50. If each question has five choices, you can guess on the 50 you are stumped on and arrive at a score (on average) of 60.
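
For anyone who wants to check that arithmetic, here is a minimal Python sketch. The function name is made up for illustration, and it assumes every unknown question is a completely blind guess with no wrong-answer penalty.

```python
def expected_raw_score(known, total, choices):
    """Expected number of right answers when every unknown
    question is guessed blindly among `choices` options."""
    unknown = total - known
    return known + unknown / choices


# 50 known answers, 50 blind guesses at 1-in-5 odds
print(expected_raw_score(50, 100, 5))  # 60.0
```

The same function reproduces the four-choice case: 50 known out of 100 gives an expected 62.5.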

His point carries: that tests scored this way are imprecise.

It is possible, I suppose, to increase the number of choices to some large number, say twenty choices per question. Then the value of a guess is nil. Except this is never done, because the forms used don't support this many choices, or because it's too hard to come up with that many plausible answers.

A strategy that might meet with his approval is to deduct 1/(n-1) for each wrong answer, where n is the number of choices. Then the statistical value of guessing is nil, unless you have some knowledge. You'd also have to scale the "pass" level to take account of the value of partial knowledge.
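
The break-even deduction can be checked directly. With n choices a blind guess is right with probability 1/n and wrong with probability (n-1)/n, so a deduction of 1/(n-1) per wrong answer makes the expected value of a guess exactly zero. A rough sketch follows; the function name is hypothetical and it uses exact fractions so there is no rounding noise.

```python
from fractions import Fraction


def guess_value(choices, penalty):
    """Expected points from one blind guess among `choices` options:
    +1 if right, -penalty if wrong."""
    p_right = Fraction(1, choices)
    return p_right - (1 - p_right) * penalty


# Blind guessing is worth nothing at the break-even penalty 1/(n-1).
for n in (2, 4, 5):
    print(n, guess_value(n, Fraction(1, n - 1)))

# Partial knowledge still pays: on a 5-choice question scored with a
# 1/4 penalty, ruling out two options leaves a 1-in-3 guess.
print(guess_value(3, Fraction(1, 4)))  # 1/6
```

That last line is the "unless you have some knowledge" part: eliminating options turns a zero-value guess into a positive one, which is why the pass level would need rescaling.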

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Re:Worthless by kmac06 · 2007-06-17 02:49 · Score: 1

Any multiple-choice test score is always imprecise at some level. Even if you subtract so guessing averages out to no difference, there can still be wide variations in there that average out to zero. It doesn't mean the test is meaningless.
Re:Worthless by An+Onerous+Coward · 2007-06-17 03:36 · Score: 3, Insightful

I love Scrubs, too. But let's not go redesigning our medical qualifications system based on that one episode we saw that one time. :)

I can only suppose that there are times when doing nothing beats doing something. But you seem to be saying that, because such situations do occur, then it would be healthy to severely punish medical errors to the point where most doctors' first instinct is to do nothing, run another test, etc. Even though there may be times when that state of affairs would help certain patients, on the balance I think it would make medical care worse.

--
You want the truthiness? You can't handle the truthiness!
Re:Worthless by Canthros · 2007-06-17 03:44 · Score: 1

Perhaps the blogger has "35% of people pass" confused with "35% right is a passing grade".
I think it's more probable that he's confused his having a background in mathematics and experience with the bar exam with having a clue what he's talking about. This response, above, from clifyt is also pretty enlightening on the subject.

--
Canthros
Re:Worthless by Nf1nk · 2007-06-17 03:51 · Score: 1

But the guy in the example isn't guessing on the entire test. His true score (questions he solidly knows) is around 55% in this example; if he gets another third of his guesses right he pulls off a 70%. If he gets half he gets a 78; if his luck is bad he gets a 60. The problem comes from the fact that the test does not account for his guesswork, and on my hypothetical three tests I end up with close to a twenty-point spread for someone who only "knows" a little more than half the answers.

--
I used to have a cool sig, back when I cared
Re:Worthless by DavidTC · 2007-06-17 03:52 · Score: 1

I agree, because I'm the same way: I think I know nothing about baseball, but, on a multiple choice quiz, I bet I could come up with all the rules of baseball, and match most teams up with their cities. I bet I could even calculate ERAs and stuff, just from knowing how they logically work.
But all that stuff is from my childhood. Give me any question that requires me to have been paying attention recently, like players, and I can't come up with anything unless it's very big. Like Sammy Sosa big.
But if we took the entire baseball domain of knowledge, I bet I could tie on any test with a casual fan. Or at least come within 80% of him.

--
If corporations are people, aren't stockholders guilty of slavery?
Re:Worthless by MidoriKid · 2007-06-17 03:55 · Score: 3, Funny

Do not try to answer the question... that's impossible. Instead only try to realize the truth... There is no answer.
Re:Worthless by Anonymous Coward · 2007-06-17 04:11 · Score: 2, Funny

The relationships last longer if you stop throwing darts. :)
Re:Worthless by cellocgw · 2007-06-17 04:30 · Score: 1

I love Scrubs, too. But let's not go redesigning our medical qualifications system based on that one episode we saw that one time. :)

I can only suppose that there are times when doing nothing beats doing something.
Remember one of the primary Rules of The House of God : THE DELIVERY OF MEDICAL CARE IS TO DO AS MUCH NOTHING AS POSSIBLE.

--
https://app.box.com/WitthoftResume Code: https://github.com/cellocgw
Re:Worthless by Vellmont · 2007-06-17 04:41 · Score: 1

What a worthless post. He gave one situation where guessing is more important than knowledge, but didn't at all address the specifics of the tests he was talking about.

The specifics of the test don't matter, since his criticism is a statistical one, not one of the specific questions. The one situation he gave was used to try to illustrate a mathematical point, not offered as a proof. While it's certainly not something that's publishable and isn't very rigorous, I think it's a very interesting post. If you have a criticism of his conclusions, offer it up. But there's nothing wrong with his methodology as far as I can tell.

--
AccountKiller
Re:Worthless by lionheart1327 · 2007-06-17 05:05 · Score: 1

For having a 140 IQ you don't seem to get the fact that IQ is graded on a bell curve and so that would make 140 many times smarter than a 70, not just 2X.
Re:Worthless by Puff+of+Logic · 2007-06-17 05:13 · Score: 3, Informative

But you seem to be saying that, because such situations do occur, then it would be healthy to severely punish medical errors to the point where most doctors' first instinct is to do nothing, run another test, etc. Even though there may be times when that state of affairs would help certain patients, on the balance I think it would make medical care worse.
Indeed it would. My understanding is that the cost of defensive medicine (defensive in terms of liability) is not just measured in dollars; invasive, harmful, or otherwise painful tests are often done in a full-court-press just to say that every possibility was checked, regardless of whether such tests are indicated. That we, as a society, demand a level of perfection from our doctors that is simply unreasonable to expect from any human merely exacerbates matters. A doctor cannot openly say "guys, I screwed this one up, so learn from my mistakes" because the family will be howling for compensation and the lawyers will be trying to hush it all up. A failure to act (doing nothing, as the GPP suggests) is just as damning as doing the wrong thing, so what other choice does a physician have than to fire the medical artillery, even if he thinks only a BB gun is indicated?

I should immediately point out that IANAD but I hope to play one in front of an admissions committee soon, so I may be talking out of my rear. However, the above seems to be the sentiment of most doctors I've spoken to. I just got done with the MCAT recently, so this topic is a bit close to my heart! An interesting site with a good take on the situation is here.

--
P.P.S. I'm doing Science and I'm still alive.
Re:Worthless by coredog64 · 2007-06-17 05:25 · Score: 1

What I took away from the post was this: If you have two test takers (A and B) such that B knows 2X what A knows, the final score doesn't necessarily reflect that point:

A: 1 known + 99 * 1/4 = 25.75
B: 2 known + 98 * 1/4 = 26.5

or, if those knowns are too ridiculous

A: 50 known + 50 * 1/4 = 62.5
B: 100 known + 0 * 1/4 = 100
Re:Worthless by maxwell+demon · 2007-06-17 06:39 · Score: 1

Of course the number you're thinking of is 42, which would be option b, because the number is found in the Hitch Hiker's Guide to the Galaxy.

--
The Tao of math: The numbers you can count are not the real numbers.
Re:Worthless by try_anything · 2007-06-17 06:40 · Score: 3, Funny

I guess it depends on where you work, but my friend's experience was that things changed immediately when he got his first job. Everyone is keenly aware of the potential of a malpractice lawsuit, but the doctors talk pretty freely with each other behind the patients' backs, laughing at the nut cases and making fun of the pill tourists. One guy kept a known addict who came in with "back pain" in an exam room for six hours, coming in between his other patients, bringing exotic-looking implements into the examining room and holding them against the patient's body, furrowing his brow, making serious noises, and then disappearing for half an hour. At the end of the day he told her to take three Advil a day and "come back as often as you feel is necessary."

I don't know how freely the doctors admit mistakes, but my friend tells me about his colleagues' mistakes every once in a while, so they aren't exactly secrets.
Re:Worthless by EvanED · 2007-06-17 06:44 · Score: 1

It's not that they can't do exams, but the high-pressure scenario really freaks them out.

OTOH, if you can't deal with high-pressure scenarios you shouldn't be trying to become most kinds of doctors or some kinds of lawyers.
Re:Worthless by try_anything · 2007-06-17 06:56 · Score: 2, Interesting

It seems that these "extremely hard tests" are exhausting all-day or multi-day affairs with several hundred questions. With that many random events in the sample, the variance will be pretty low.

A normal person may score "wildly differently" on a 300-question exam from one attempt to the next, but the variance will be based more on differences in preparation, physical and mental comfort, stress, and how much sleep he got the night before.
The article's math indeed illustrates this point very clearly.

The article's math is actually pretty pathetic. For one, he assumes that a person who knows half as much will guess just as accurately. For another, his entire point seems to be based on implying that the less knowledgeable person has a good chance of scoring as well as the more knowledgeable one, but he only calculates that probability for a trivial, extreme case. Why doesn't he tell us the probability for one of the more reasonable cases he describes? Either he never bothered calculating those probabilities, or he decided they weakened his case. I don't particularly care which one; his credibility is approximately zero either way.
Re:Worthless by maxwell+demon · 2007-06-17 07:04 · Score: 1

His argument is that the harder the test the less relevant knowledge of the actual answers to the questions posed on the test are to determining your relative score.

However he measures the hardness of the test just by the number of people who pass or fail it, but implicitly assumes that a harder test implies harder questions. That's in itself a fallacy. I can make a test harder by simply raising the number of questions you have to answer right. For example, let's take an extreme case: Say I raise the bar so high that you'll only pass if you answer all questions correctly. Now if you know everything, you'll of course pass, but if you have to guess at just one question, you'll only have a 50% chance to pass. And your chances will go down exponentially with the number of questions you have to guess. Unless the questions themselves are extremely easy, that hypothetical test would be a very hard one (that is, very few would pass it), but at the same time very efficient to let only those with actual knowledge pass: If you just guess, you've got nearly no chance to pass.
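
A quick sketch of how fast those chances fall off. The function name is hypothetical, and the default of two choices per question is an assumption taken from the 50% figure above.

```python
def pass_chance_all_correct(guessed, choices=2):
    """Chance of a perfect score when `guessed` questions are pure
    guesses and everything else is known for certain."""
    return (1 / choices) ** guessed


for k in (0, 1, 5, 10):
    print(k, pass_chance_all_correct(k))
```

Ten guessed questions already puts the chance of passing under one in a thousand, and with four choices per question it shrinks faster still.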

In short, the number of people who actually pass/fail a test in isolation simply can't be used to decide if a test was meaningful.

--
The Tao of math: The numbers you can count are not the real numbers.
Re:Worthless by HardCase · 2007-06-17 07:27 · Score: 1

I think that everyone is just thinking way too hard about this. A multiple choice test is simply practical for tests of this sort. They lend themselves to rapid processing and eliminate the "that's what I meant" issue that comes from essay questions, among other things.
Re:Worthless by Nephrite · 2007-06-17 08:17 · Score: 1

My IQ is 140, find me somebody with an IQ of 70 and give us a test
Strangely enough, with such a high IQ you confused intellect and knowledge.
Re:Worthless by ConfusedVorlon · 2007-06-17 08:19 · Score: 1

did you mean

1 for right answer.
-1/3 for wrong answer.
0 for no answer.

--
VLC Remote for iPhone and Android
Re:Worthless by shmlco · 2007-06-17 08:31 · Score: 1

Guess that assumes that every wrong answer is fatal. It isn't. For example, I may not know the name of the bone, but the darn thing is still broken and needs setting.

--
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
Re:Worthless by suspected · 2007-06-17 08:48 · Score: 1

IQ is such a meaningless number on the internet. Everyone pulls it out of thin air, sometimes attributing it to a website that scored them and then tried to sell them some certificates. Now, to debunk the rest of your post. First of all, the author of the blog already mentioned that no deduction was given for wrong answers; he never contended that tests cannot be fair, so we don't really need you to remind us how the SAT and many other tests are graded. Second, the topic isn't about intelligence, but simple knowledge on a given subject. On a very difficult test, for example, knowing more could even hurt you, because some wrong answers are made to "seem right," in an effort to increase the test's difficulty. I actually do agree with the blog that a test that is too difficult isn't always the best way to test individual knowledge; however, he does overestimate the probability of dumb luck.
Re:Worthless by Derekloffin · 2007-06-17 08:57 · Score: 2, Insightful

Since randomness has a proven substantial impact on those tests that threshold becomes blurred.
True, but the article's math does nothing to support that case as the difficulty of the test does NOTHING to hurt or help this. The test format in that case is the problem, and his example again doesn't help because this test wouldn't be used for those guys who get 55% on the test, it would be looking for those in the 75%+ range (pass could even be set at 90% maybe even 100%, he never sets it) and this on a trivial test that no organization like a legal bar would use. The actual odds of you passing by luck are quite low even on his worst case example (they're actually probably as good as passing an essay test and just getting lucky on what questions they ask).
His examples where simplified to illustrate the essential math behind them, he does not need more than 2 people to compare since the math is equally applicable no matter how many are tested.
Here, let me explain it to you. This type of test is like that 'you can only go on this ride if you're taller than this' sign. His example, instead of using it that way, attempts to use it as a way of saying whether Joe is taller than Bill when both are the size of ants. It was neither designed nor intended for that, so the math fails: he has to address people who can actually pass it, not the guys who can't, and the problem in that case has nothing to do with the difficulty; it is to do with the test format. The fact that he didn't even set a fictional pass bar demonstrates just how out of place his thinking is. Again, he might have a point if this was a relative test, but it isn't as described. Even in the absolutely absurd case he presents, the math does not hurt the test, as the pass bar would logically be set quite high on that test, blocking both people. You have to compare at least a guy who can reasonably pass it against a guy who can't, and in that case the math falls apart very fast, as even that pathetic test has a very good chance of showing that difference correctly.
Re:Worthless by fropenn · 2007-06-17 10:21 · Score: 1

100% worthless. The author of this little article has NO idea what he is talking about. Being a psychometrician, I would first like to say that there is a big difference between a mathematician and a psychometrician - so being a "former" mathematician does not necessarily qualify you as an expert on testing. Nearly all licensure exams now use IRT (item response theory) as the primary scoring approach. These statistical models can be used to account for "guessing" behaviors by recognizing an individual's pattern of responses (for example, you get a bunch of easy questions wrong, but guess correctly on a hard one) - in this case the model will recognize this as "guessing" and will take this into account when giving the individual a score. Licensure tests are used for one specific purpose - to separate examinees into "pass" and "fail" groups. Licensure testing companies use independent bodies (e.g., they might use actual practicing lawyers for the board exam) to determine the quality of performance needed for a passing score. The testing company then selects test questions that will best determine whether or not the examinee has achieved this level - if the independent board selects a high level of performance, then the questions will be 'hard'. So if you are upset about the difficulty level, either study more or complain to those on the independent board about their standards. The author's post sounds like a case of sour grapes to me, and shows no real knowledge of testing theory.
Re:Worthless by RyuuzakiTetsuya · 2007-06-17 11:51 · Score: 2, Insightful

I know for many common illnesses, even if we don't know the cause, we do know that if you just sit on your ass for a few days and take care of yourself, you're going to get better.

I don't expect my doctor to actually *do* anything curatively speaking, I just expect him to be on my side when I have to tell my job I'm out for a few days getting over a cold.

--
Non impediti ratione cogitationus.
Re:Worthless by jonaskoelker · 2007-06-17 12:21 · Score: 1

According to my 'fessor, here's The One True Way for grading multiple-choice tests (with proofs and all):

http://www.brics.dk/~mis/multiple.pdf

(yours is not the One True Way)

Ugly as it is, you may want to use the google html as the pdf seems to have permission issues: http://www.google.com/search?q=cache:0BahY4i6HLkJ:www.brics.dk/~mis/multiple.pdf+multiple.pdf+site:brics.dk&hl=en&ct=clnk&cd=1&gl=dk&client=firefox-a
Re:Worthless by Macgrrl · 2007-06-17 13:01 · Score: 3, Insightful

Here in Australia we have paid sick leave for permanent employees, but typically companies require that you present a doctor's certificate to prove you were sick. So even when you know that you only have a head cold and should be home in bed staying warm and keeping your fluids up, you have to track down and wait in the doctor's office for them to write on a bit of paper that you really are too sick to go to work and that you should be home in bed...

On the flip side, my husband was mis-diagnosed by a number of doctors for over 15 years - he had severe sleep apnea to the point where he was having fits and seizures, memory loss and paranoia. It looks like I am finally getting a diagnosis after 20 years of intrusive tests for why I have near constant nausea, indigestion and vomiting.

If the doctors didn't have to sausage factory process all the people who *know* what's wrong and what they have to do, they would probably have more time to spend with people who actually need help.

--
Sara
Designer, Gamer, Macgrrl in an XP World
Re:Worthless by lawpoop · 2007-06-17 13:09 · Score: 1

OTOH, if you can't deal with high-pressure scenarios you shouldn't be trying to become most kinds of doctors or some kinds of lawyers.

That's a good point. If a person falls apart at any sign of stress, they shouldn't be doing high-stress work. Maybe they can be massage therapists ;)

But in the example of my sisters' friend, she *can* deal with high-stress, high stakes situations. It's just the high-stakes testing, virtual isolation-chamber environment that stresses her out. She's a very social person, and she's good with communication and in groups. What gets her about the test is that she can hear other people writing, but has no clue about what they're writing. Not that she wants to cheat, but psychologically, she relies on the social cues of the group to get along and maintain her energy. She draws energy from other people, and is energized by social interaction. She's not a loner type. Meanwhile I, as a geek, am drained by social interaction (unless they're close friends), and I get energized when I'm alone.

So she probably wouldn't be good as a surgeon, or as an admin trying to troubleshoot a server in the middle of the night, but if she were a lawyer arguing with the other attorney and the judge in the judge's chamber, she would be energized by that encounter, and would probably come out on top.

So all in all, even though she chokes in high-stakes testing, she will probably do well in life and her career because she can handle high-stakes social situations so well. She can win people over, or at least come out with the other person liking her after the encounter. Better than a geek who can take tests and do individual, technical tasks well, but comes apart in high-stakes social situations.

--
Computers are useless. They can only give you answers.
-- Pablo Picasso
Re:Worthless by kitsunewarlock · 2007-06-17 13:47 · Score: 1

The correct answer is ALWAYS...

"I love you..."

--
Ginga no Rekshiya Mata Each page.
Re:Worthless by phunctor · 2007-06-17 15:05 · Score: 1

Thanks, that's an interesting take, although it responded more to what I actually wrote than to what I wanted you to read. I guess I'll eschew the use of hyperbole henceforth.

However, I'll maintain my position that "don't know" is a vitally important category. Conscious ignorance is the prerequisite for education.

--
phunctor
Re:Worthless by More_Cowbell · 2007-06-17 15:59 · Score: 1

You have an IQ of 140? Then you surely know that 140 != twice as smart as 70.

--
Experience teaches only the teachable. -AH
Re:Worthless by timeOday · 2007-06-17 16:10 · Score: 1

Here's another basic fallacy of his analysis: it's true that on an "impossible" test, everybody would just be guessing and therefore equal. But it's equally true that a test which is too easy fails to discriminate between people because everybody gets near 100%. So obviously there is a sweet spot. The questions should all be in the range of what some (and only some) students will know, spanning the range from what most will know to what very few will know. A "hard" test (on which the average person only gets, say, 60%) can certainly be in that range.
Re:Worthless by icedcool · 2007-06-17 16:48 · Score: 1

Mod parent up double!

--
Most people aren't thought about after they're gone. "I wonder where Rob got the plutonium" is better than most get.
Re:Worthless by mechapants · 2007-06-17 17:21 · Score: 1

I agree with you. Personally I have been to 4 specialists and had a barrage of tests done. 2 years later I finally find out I have rheumatoid arthritis. All because one damn blood test didn't show "the RA factor", which happens 20% of the time, and my dr and arthritis specialist didn't want to "go there just yet". Now all I have to do is wait "until I'm old enough" for a partial joint replacement.
Re:Worthless by Askmum · 2007-06-17 18:41 · Score: 1

Hate to burst your bubble, but over here (the Netherlands) it was always the rule that on a 4-answer multiple-choice, everything lower than 25% correct means 0 correct. Just because of the guessing.
Tests are graded 0 (bad) to 10 (good), so on a 100 question 4-answer multiple choice test, getting 25 questions right you receive a 0. For getting 63 questions right, you get a 5.
Or better: for a 100 question 2-answer multiple choice test, getting 50 questions right still gives you a 0 and you need 75 questions to get a 5.
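
A sketch of that rule in Python. The straight-line scaling between the chance level and a perfect paper is an assumption that happens to match the figures quoted above; it is not taken from any official grading scheme, and the function name is made up.

```python
def chance_corrected_grade(correct, total, choices):
    """Grade on a 0-10 scale where the chance level (total / choices)
    or anything below it scores 0 and a perfect paper scores 10."""
    chance = total / choices
    if correct <= chance:
        return 0.0
    return 10 * (correct - chance) / (total - chance)


print(chance_corrected_grade(25, 100, 4))  # 0.0
print(chance_corrected_grade(63, 100, 4))  # a little over 5
print(chance_corrected_grade(50, 100, 2))  # 0.0
print(chance_corrected_grade(75, 100, 2))  # 5.0
```

Under this assumption the exact halfway point on the four-answer test is 62.5 questions, which is presumably why 63 is the number needed for a 5.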

Please tell me we are not so advanced that that practice isn't used in the US?
Re:Worthless by toadlife · 2007-06-17 19:10 · Score: 1

Not really. If you just say that, they will accuse you of not listening to and patronizing them and then you'll be in even deeper shit.

--
I don't always use unix-like operating systems; but when I do, I prefer FreeBSD.
Re:Worthless by Ed+Avis · 2007-06-17 20:23 · Score: 1

Surely you know that IQ is not a linear scale, and so an IQ of 140 is not 'twice as smart' as one of 50.

--
-- Ed Avis ed@membled.com
Re:Worthless by NewWorldDan · 2007-06-18 01:40 · Score: 1

As a man who's been married nearly 8 years (after dating several others), let me say that women are like global thermo-nuclear war: the only winning move is not to play. Truly a strange game.
Re:Worthless by default+luser · 2007-06-18 03:58 · Score: 2, Insightful

The curve exists as an admission by the tester / instructor that they cannot create a perfect test, and that they cannot fully understand their students prior to testing.

If you fail people for being less than perfect, they won't LEARN anything. This is how you teach people HOW to learn.

--
Man is the animal that laughs.
And occasionally whores for Karma.
Re:Worthless by c0d3h4x0r · 2007-06-18 06:29 · Score: 1

That sounds like my dad's old tipping philosophy:

1. Start your dining experience by laying a stack of one-dollar bills on the edge of the table.

2. Every time the waitperson fouls up, remove a dollar from the stack.

3. Whatever's left at the end of the meal is their tip.

--
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.
Re:Worthless by ShieldW0lf · 2007-06-18 07:35 · Score: 1

No, it exists as an admission by the instructor that they cannot quantify what they are supposed to be teaching, that they can't quantify what the students should know, and they won't be permitted to continue to operate if they don't pass students.

University doesn't teach people how to learn. People teach themselves that, or they don't. It's not even intended to teach them that. It's intended to teach them what to learn, so they don't need to go mucking about in the dark.

I can teach myself anything under the sun, without any university or school at all. And I have, for all of my professional career. I took a quick course to get my foot in the door, but it was a waste of time and money, just like all the others.

I might like to be able to go take a course that would teach me the sort of things I might need to know to solve particular types of problems, so I don't need to spend the extra time finding things out the hard way.

But the schools aren't to be trusted. They can't even quantify what I ought to know at the end enough to be able to test me and see if I've learned it or not. So why pay them? At this point, the strongest motivation I can see to get any further education is the social opportunities it might afford.

But feel free to give them all your money if it makes you happy.

--
-1 Uncomfortable Truth
Re:Worthless by eugene+ts+wong · 2007-06-18 13:35 · Score: 1

He's throwing as fast as he can. Cut him some slack. ;^)

--
testing out my trending skills
Re:Worthless by swillden · 2007-06-18 15:09 · Score: 1

You're vastly overestimating the effect of "luck" on the outcome of a sufficiently lengthy hard test. Assuming there are 300 questions, he correctly answered 55% of them, that means that he's got to guess on 135 questions. 95% of the time, random guessing on those 135 questions will get him between 23 and 43 correct answers*, so 95% of the time he'll get between 63% and 69%. Getting a full third of his guesses right will be unlikely (happen about 4.5% of the time), but within the realm of reality.
Your two other examples, however -- having "bad luck" and getting = 68 of 135 -- are very, very unlikely. I mean winning-the-lottery kind of unlikely. Getting more than 50 (72% score) or less than 18 (61% score) is a one in a thousand shot.
As the number of questions increases, the effect of luck diminishes pretty rapidly. Granted that using a system that penalizes wrong answers in an attempt to neutralize guessing is better, for sufficiently lengthy tests it doesn't matter as much as you might think. Assuming 75% is passing, it'll be a vanishingly rare event that someone who knows less than 60% of the answers passes or that someone who knows more than 70% of the answers fails.
[* To come up with these numbers I'm assuming the guessing follows a binomial distribution (135, 0.25), and approximating the number of successes as a normal distribution, which is reasonable according to the usual rule of thumb (np and np(1-p) both >= 10) and then looking at standard deviations (~5 questions) from the mean (135 * 0.25 = 33.75 questions). Oh, and this analysis also assumes that the test-taker either knows the answer or has no idea at all and must guess completely, but that assumption was already implicit in the previous discussion. ]
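
For anyone who would rather skip the normal approximation, the exact binomial probabilities can be summed with nothing but the Python standard library. This is only a sketch under the same assumptions as the footnote (135 pure guesses at 1-in-4 odds); the helper names are made up, and the tail figures it prints may differ somewhat from the approximate ones quoted above.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_range(lo, hi, n, p):
    """P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


n, p = 135, 0.25
mean = n * p                # 33.75 correct guesses on average
sd = sqrt(n * p * (1 - p))  # roughly 5 questions
print(mean, sd)

print(binom_range(23, 43, n, p))  # the "95% of the time" band
print(binom_range(45, n, n, p))   # at least a third of guesses right
print(binom_range(68, n, n, p))   # at least half of guesses right
```

Swapping in a different number of guessed questions for `n` shows how the band around the mean narrows, as a share of the test, when the test gets longer.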

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Worthless by swillden · 2007-06-18 15:12 · Score: 1

having "bad luck" and getting = 68 of 135
Doh! Forgot to use the right encoding of the symbols (and forgot to preview, obviously). That should have read:
having "bad luck" and getting <= 6 of 135 or having "good luck" and getting >= 68 of 135

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Worthless by phorm · 2007-06-18 17:21 · Score: 1

We get that where I work, but only if you've been sick off work for more than X days in a row, or more than Y in a year. It somewhat makes sense to me, since if you're off for 3 days you might want to get a professional diagnosis anyhow.

When I was a boy... by WFFS · 2007-06-16 18:59 · Score: 4, Insightful

Stories like this could never get on Slashdot. Seriously, this is like a maths problem I'd give to my Year 9 kids. This is definitely not news, and certainly doesn't matter.

Re:When I was a boy... by PDAllen · 2007-06-16 21:48 · Score: 1

It's also not a good analysis. So, when you design one of these multiple choice tests you should have in mind what you're trying to test. You probably want a pass mark to be around 70/100. You can do scoring in three ways: one, just count the right answers, two, right answers minus wrong answers, three you can have a weighting (so there may be a best answer for +3, an OK answer for +1 and a bad answer for -5 or something).

You don't really care about ranking people, you just want a pass or fail (or maybe one or two more groupings, but not too many). So trying to see whether the guy who got 1/100 is really a lot worse than the guy who got 2 is not interesting, you don't want either guy.

If you want to test knowledge and a bit of (basic) problem solving, then you can just add up the right answers. If you know a lot, you'll know more answers and you'll know more wrong answers so you'll make better guesses for the questions you don't know. This isn't always appropriate, though. It's probably right for a law test, where you know that no lawyer will know all the law, but they might sometime be asked to give an answer now on some point they haven't learned - and they'll need to say something that at least doesn't make them look like an idiot in court. But if you are testing doctors, say, you probably do not want them to be making guesses about the right drug to put into someone, because the consequences of getting it wrong are bad. So it's probably a good idea to penalise for wrong answers here.

Of course, if you start weighting answers you can make the test do whatever you want, but then it's a real pain to design.

Yuck by venicebeach · 2007-06-16 19:00 · Score: 1

It's hard to believe this guy is really a mathematician. I read this with interest as I teach college classes and have to give tests. However, there's not much content in the article.

His point about only counting the correct answers is rather silly. In a test where each question is either right or wrong, counting the wrong answers into the score does not add any information (you can tell how many are wrong if you know how many are right). The only thing it does is change the scaling of the resulting scores. This only makes a difference if you have an issue interpreting the scores. He seems to want the scores to be proportional to the amount of knowledge someone has, so that if I have twice as much knowledge as you my score is twice as high. But in the example case of a professional qualifying examination, all that matters is whether or not you achieve some minimum. Whether that is represented as % correct or % correct - %incorrect/2 really makes no difference.

Designing better tests generally involves moving beyond multiple choice, not manipulating the scoring process.

Re: Yuck by reason · 2007-06-16 19:16 · Score: 3, Insightful

You're missing the point. Counting only correct answers on a multi-choice test doesn't measure what you know, or whether you have the necessary minimum knowledge.

With 4 choices for each question on a 100 question test, the average student (student A) who knows 50% of the answers will get about 62 correct if they guess entirely at random when they don't know the answer (50 plus 50/4 correct guesses). The average student who knows only 25% of the material (student B) will get about 44 correct using the same approach (25 plus 75/4). Although A knows twice as much as B, A's score is only about 40% better (not 100%).

Of course, it's even worse than this. First, because there is a large degree of scatter: a student choosing at random might do much better or much worse than this. Second, because multi-choice questions are often structured so that half of the possible answers are obviously incorrect, which changes the odds.

With only two plausible answers to choose between, A might get 75 correct and B might get 63: in this case A, who knows twice as much as B, gets a score only 19% better than B.

If points are subtracted for incorrect answers (say -1/4 pt to -1/2 for each one wrong), the effect of guesses can be taken out of the equation so that differences in scores actually reflect differences in knowledge. Or if the questions are easier, a smaller proportion of both students' answers will be guesses, so the effect should be smaller.
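
The averages above can be reproduced with a short Python sketch. It assumes the simple model used in this comment (each question is either known or guessed uniformly at random); the function name and parameters are made up for illustration. The 1/(choices - 1) penalty is the value that makes a blind guess worth zero on average, which is an assumption of this sketch rather than a figure from the comment.

```python
def expected_score(known, total, choices, penalty=0.0):
    """Expected score when every unknown question is a uniform random guess.

    known   -- questions answered from knowledge (always right)
    choices -- plausible options per guessed question
    penalty -- points deducted per wrong answer
    """
    guessed = total - known
    right = guessed / choices      # average number of lucky guesses
    wrong = guessed - right
    return known + right - penalty * wrong

# Four plausible choices, no penalty: students A and B from the comment.
a4, b4 = expected_score(50, 100, 4), expected_score(25, 100, 4)
print(a4, b4, a4 / b4)

# Only two plausible choices per question.
a2, b2 = expected_score(50, 100, 2), expected_score(25, 100, 2)
print(a2, b2, a2 / b2)

# Penalty of 1/(choices - 1): the expected score equals the number known.
print(expected_score(50, 100, 4, penalty=1/3), expected_score(25, 100, 4, penalty=1/3))
```

With that penalty the averages come back to 50 and 25, so A's expected score is twice B's; the scatter around those averages is still there.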
Re:Yuck by KDR_11k · 2007-06-16 19:20 · Score: 2, Insightful

Subtracting points for wrong answers is supposed to encourage students to skip a question if they don't know what to say rather than give a wrong answer. If someone gets 48% right from his knowledge he can't spray and pray for the remaining 2%.

--
Justice is the sheep getting arrested while an impartial judge declares the vote void.
Re:Yuck by suv4x4 · 2007-06-16 19:22 · Score: 1

His point about only counting the correct answers is rather silly. In a test where each question is either right or wrong, counting the wrong answers into the score does not add any information (you can tell how many are wrong if you know how many are right).

You're wrong. There are three ways you can handle a question: answer correctly, answer wrongly, not answer.

The fact that tests count only correct answers means the subtle difference between not answering, and answering incorrectly is lost.

Take for an example two extreme situations (just for illustration). Bob and Jack have 50% correct answers. Bob has the rest 50% in wrong answers, and Jack has the rest of the 50% unanswered.

In essence Bob got half of his questions wrong, which in a true/false test is the statistical expectancy if you're just guessing randomly. He could be a monkey clicking a button for all we know.

Jack answered questions only correctly, and left those he didn't know unanswered. Thus, he has some knowledge on the subject, as he didn't guess, or guessed very little.

I'd hire Jack.
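
A rough Python sketch of the Bob and Jack case, assuming a 100-question true/false test and a hypothetical penalty of 1 point per wrong answer (neither number is from the post; they are picked only to make the contrast visible):

```python
def score(right, wrong, penalty):
    """Right answers minus a deduction per wrong answer; blanks score 0."""
    return right - penalty * wrong

# Hypothetical 100-question true/false test.
bob = {"right": 50, "wrong": 50}    # answered everything
jack = {"right": 50, "wrong": 0}    # left 50 questions blank

for penalty in (0, 1):
    print(penalty, score(**bob, penalty=penalty), score(**jack, penalty=penalty))
```

Counting only right answers, the two are indistinguishable; with the penalty, Bob drops to what a coin flipper would average while Jack keeps his 50.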
Re:Yuck by Anonymous Coward · 2007-06-16 19:30 · Score: 1, Funny

he can't spray and pray

Yeah, but if he fails the test he will always have a job as the wino on the street corner who shouts about Jesus while urinating in public!
Re:Yuck by Bastard+of+Subhumani · 2007-06-16 19:46 · Score: 1

There are three ways you can handle a question: answer correctly, answer wrongly, not answer.
Aren't there some tests where you can't skip an answer - I thought the computer based GMAT was like that?

But most tests are like you say - and Jack can take advantage of his ability to know that he doesn't know (which proves he knows something!).

--
Only three things are certain; death, taxes, and apocryphal quotations - Ben Franklin.
Re:Yuck by kinabrew · 2007-06-16 21:48 · Score: 1
In that case, you have two choices:
1. Don't guess, and guarantee that you will fail, or
2. Guess, and take the chance (however slim it may be) that you will pass.
I don't know about you, but if I were put in that situation, I'd still "spray and pray".
Re: Yuck by The+One+and+Only · 2007-06-17 01:08 · Score: 1

On the other hand, an educated guess has a higher EV than an uneducated guess, so a more knowledgeable test-taker does better for that reason too. Suppose the question is as follows:

Which of the following federal acts was part of the last attempt to regulate the issue of slavery to the satisfaction of both the North and the South?

(a) The Wilmot Proviso
(b) The Missouri Compromise
(c) The Fugitive Slave Act
(d) The Hawley-Smoot Tariff

I can cut down the odds to 1/3 if I remember that the Hawley-Smoot Tariff was a Depression-era effort that had nothing to do with slavery. I can cut the odds down further if I remember that the Fugitive Slave Act was passed 30 years after the Missouri Compromise happened, or that the Wilmot Proviso was a unilateral push.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re: Yuck by An+Onerous+Coward · 2007-06-17 02:39 · Score: 1

You seem to be falling into the trap that so many people (including the original author) do: assuming that the test must be meaningless unless the "amount of knowledge" is proportional to the test score. Why does it matter that "Knowledge Candidate" only gets a 19% lower score than "Double Knowledge Candidate", if the only way for the first guy to have a shot at matching 2KC's score *is to double his own knowledge*?

Just use some creativity in interpreting the scores, and it mostly works out fine.

--
You want the truthiness? You can't handle the truthiness!
Re: Yuck by fredklein · 2007-06-17 03:11 · Score: 1

Or you can just know the definition of 'compromise'.
Re: Yuck by EvanED · 2007-06-17 06:57 · Score: 1

If points are subtracted for incorrect answers (say -1/4 pt to -1/2 for each one wrong), the effect of guesses can be taken out of the equation so that differences in scores actually reflect differences in knowledge.

How come you say that you can take out the effect of guessing if you subtract penalty points, but not if you don't:

First, because there is a large degree of scatter: a student choosing at random might do much better or much worse than this.

You can still do a lot better or worse than 0% if you're guessing.

I've never tried to work this out, but my intuition on this subject (and I have thought about it before) is that the two options are probably isomorphic; I suspect scores would follow the same distribution just scaled and shifted so that 25% in the "only count right answers" lands on 0% in the "penalize wrong answers" test. This may be right, or it may not, but it's definitely not clear (even after reading the "article") that this is the case.

There is one important caveat, and it illustrates what I see as the *real* benefit of this: it means that if you are only 2/3 of the way through the test and time is up, you don't have to run through and quickly fill in guesses for the remaining 1/3 of the test. My isomorphism probably only works (if it does at all) if you assume that any unanswered questions in the "penalize wrong answers" test are randomly filled in.
Re: Yuck by The+One+and+Only · 2007-06-17 08:56 · Score: 1

Yeah, but that would lead you to the wrong answer, since the right answer (the Fugitive Slave Act) doesn't have the word "compromise" in it (although it is part of the 1850 Compromise).

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re: Yuck by reason · 2007-06-17 11:43 · Score: 1

Why does it matter

It matters because, although "knowledge candidate" may be unlikely to match the score of "double knowledge candidate", he has a good chance of beating the score of "110% knowledge candidate" purely through random chance. The test may still distinguish between extremes, but it doesn't do so well at sorting closer candidates.
Re: Yuck by An+Onerous+Coward · 2007-06-18 03:41 · Score: 1

That reasoning only works if you buy into the author's bogus thinking. You have to believe that the questions are all equally hard. If you've got a range of difficulties, and the test is designed for maximum sensitivity at about KC's level of knowledge, then it can distinguish very well indeed.

You also have to assume that the purpose of a given test is to strictly rank every single individual according to their relative knowledge. It isn't. It doesn't matter that the average six year old and the average fifteen year old would perform equally poorly on the MCAT, even though the fifteen year old knows vastly more than the six year old. The test performs its intended function: to identify the fact that neither is ready for medical school. Nor does it matter if all the "easily qualified" testers end up with the same score, even if there is a wide range in their actual understanding.

No, most well-designed tests are intended to discriminate between a fairly narrow range of performance, that range encompassing whatever the gatekeepers consider the cutoff for "hacking it" (whatever "it" may be).

So if KC and 1.1KC cannot be consistently distinguished by a multiple choice test, it's likely that both candidates are well outside the range that the designers consider important.

Anyhow, the difference between KC and 1.1KC isn't that big to begin with. If two people's knowledge bases are close enough that one has a good chance of beating the other just due to the noise introduced by guessing, then it's probably not all that important to distinguish between the two. Other factors like work ethic, creativity, and people skills are going to have a greater effect on the applicant's actual job performance.

The original author is railing against multiple-choice strawmen. Why you're getting caught up in his crusade is a mystery to me.

--
You want the truthiness? You can't handle the truthiness!

Math Wrong? by AntiNazi · 2007-06-16 19:02 · Score: 1, Interesting

2 + 49 - (49 ÷ 2) = 75.5?

Seems like he added rather than subtracted the (49/2). Pretty much ruins the whole argument.

Re:Math Wrong? by AntiNazi · 2007-06-16 19:32 · Score: 1

Except that he did subtract (as the formula he gives would have him do) in the other portion of the comparison.
2 + 49 - (49 ÷ 2), or 75.5
In the first one he adds the penalty for wrong answers. 2+49-24.5=26.5
1 + (97 ÷ 2) - ((97 ÷ 2) ÷ 2)), which comes out to be 25.25
In the second he subtracts the penalty. 1+48.5-24.25=25.25

Doesn't seem like a typo to me, or I could be the idiot.
Re:Math Wrong? by Gromius · 2007-06-16 19:43 · Score: 1

Yes, he completely stuffed the maths on this one. Really, really badly; it's kinda weird. He makes a good point, but his mathematical example is laughable and incorrect and shows some serious misunderstanding of the effect he is trying to describe, although maybe it was just a passing mental block.

Anyway what he wanted to do was weight the subtraction factor so that if you have no knowledge and just guess everything you should get on average a score of zero. For true or false this subtraction factor is 1.

Therefore in his example:
Person A with twice knowledge of person B knows 4, guesses 96, gets 48 right, 48 wrong so 4+48-48=4
Person B knows 2, guesses 98 so score is 2+49-49=2

and we have the test showing person A has twice the knowledge (gets tricky with his example of person B knowing only 1 question). The easier the test is the more accurate it will become to tell who has the most knowledge until the point where person B starts to know more than half the questions at which point the test saturates as person A knows more than the test is probing. So there is an optimum test difficulty.
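
A minimal Python sketch of the comparison, assuming a 100-question true/false test where unknown questions are coin flips. The 0.5 penalty is taken to be the article's "half a point off per wrong answer" as quoted upthread; the function name is made up for illustration.

```python
def expected_tf_score(known, total, penalty):
    """Average true/false score: known answers plus coin-flip guesses."""
    guessed = total - known
    right = wrong = guessed / 2     # a coin flip is right half the time
    return known + right - penalty * wrong

for penalty in (0.5, 1.0):
    a = expected_tf_score(4, 100, penalty)   # person A knows 4
    b = expected_tf_score(2, 100, penalty)   # person B knows 2
    print(f"penalty {penalty}: A = {a}, B = {b}, ratio = {a / b:.2f}")
```

With a penalty of 0.5 the two averages are nearly identical; with a penalty of 1 the guesses cancel out and A's expected score is exactly twice B's.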
Re:Math Wrong? by DerekLyons · 2007-06-16 19:46 · Score: 1

Not only that - I stopped reading when I realized the logic of the example depended on the ludicrous proposition that "knowing twice as much" was somehow a quantifiable and testable quality.

Especially as one of the key elements of a properly written test is the wording of the questions - which is generally specifically written to confuse the "guesser". Very few people will actually (as the article presupposes) simply randomly choose an answer, most will attempt to read the question as a guide for their "guesses".
Re:Math Wrong? by fatphil · 2007-06-16 20:41 · Score: 1

I particularly like the way that he stressed /how much/ this imaginary 3:1 ratio demonstrated his point.

So, yet another blog post is liberally splattered with illogical and innumerate bullshit. Why am I not surprised?

--
Also FatPhil on SoylentNews, id 863
Re:Math Wrong? by Nazlfrag · 2007-06-16 22:24 · Score: 1

He also stuffed the second one, it should be
1 + (99/2) - ((99/2)/2) = 25.75
Re:Math Wrong? by tricorn · 2007-06-16 22:46 · Score: 1

Alternatively, score a wrong answer 0, a correct answer 1, and an unanswered question 0.5 (for T/F; score 0.2 for a 5-choice, etc). Expected score for 100 questions for no knowledge is then 50 if they're all T/F. The real problem with the analysis is expecting the score to be linear with the "amount of knowledge". It also is relying on a totally unrealistic test where the top scorers only know the answers to 2 questions. I'll bet that the top scorers on any "really hard test" knew a whole lot more than 2% of the questions.

There's nothing wrong with simply scoring correct answers 1, all others 0, unless you're specifically trying to test for the ability to know when you have no idea what the answer is. How you score it will change the raw scores, but those don't matter. Validating the test is still important, and from that process you'll find out what a passing score should be. Hint, there's no reason why 70% or any other fixed number should be passing.

One advantage of scoring only correct answers is that it removes a psychological pressure to not guess, and scoring guessing is actually a fairly good way of determining knowledge level. If you know nothing about the answer, you'll get it right 50% of the time. If you know just a little bit about a question, you may raise that to 60%, if you're familiar with it, you might get it right 80% of the time, and if you know it solid you'll get it right 95-100% of the time. If someone only answers when they feel very sure of the answer, you lose some of that ability to discriminate between knowledge levels.
Re:Math Wrong? by Scyber · 2007-06-17 00:09 · Score: 1

No, but it makes his point invalid b/c if the math is done correctly, the scores are nearly identical even though one person knew twice as much as the other.
The only real way to eliminate guessing on multiple choice tests is to simply get rid of multiple choice tests.
Re:Math Wrong? by locofungus · 2007-06-17 01:31 · Score: 1

"One advantage of scoring only correct answers is that it removes a psychological pressure to not guess, and scoring guessing is actually a fairly good way of determining knowledge level. If you know nothing about the answer, you'll get it right 50% of the time. If you know just a little bit about a question, you may raise that to 60%, if you're familiar with it, you might get it right 80% of the time, and if you know it solid you'll get it right 95-100% of the time. If someone only answers when they feel very sure of the answer, you lose some of that ability to discriminate between knowledge levels."

I think the underlying problem he's trying to bring up is that on "very hard" tests there will be some questions "everybody" can answer and some questions "nobody" can answer.

The extreme outliers on either end will be correctly filtered by the test but the majority in the middle will be sorted more by random noise than by knowledge.

For tests where the "pass mark" is "top 30% of takers" with a large proportion of the takers answering basically the same questions and guessing on the rest the excellent people will correctly pass, the useless people will correctly fail but the rest will pass/fail based as much on luck as on knowledge.

Tim.

--
God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
Re:Math Wrong? by tricorn · 2007-06-17 05:30 · Score: 1

A "really hard test" will have no questions that "everybody can answer", will have some questions that almost nobody can answer, and will have a high enough pass level that it tends to filter out people who are marginal rather than being generous. There will be enough questions on the test that the probability of someone passing simply through lucky guessing is going to be very low. The more questions there are, the lower the margin of uncertainty. "very hard tests" aren't necessarily good tests, though.

Look at it this way: if you score it so that someone randomly answering gets an expected score of 0, or a score of 50, or whatever score, it's the variation FROM that expected score that makes the difference. Whatever the penalty for guessing is, if everyone answers ALL questions regardless of how sure they are of the answer, you'll get the same distribution of scores (the actual values won't be the same, but the distribution will be). How well the test discriminates will be based on the number and quality of the questions, not the scoring method.
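
To see why the distribution keeps its shape, note that when every question is answered the penalized score is just a straight-line rescaling of the number right. A small Python sketch, with a hypothetical 100-question test, a 1/3 penalty, and made-up right-answer counts:

```python
def penalized(right, total, penalty):
    """Score when nothing is left blank: every non-right answer is wrong."""
    wrong = total - right
    return right - penalty * wrong

total, penalty = 100, 1 / 3
raw = [30, 45, 62, 80]              # hypothetical right-answer counts

scaled = [penalized(r, total, penalty) for r in raw]

# Same thing in closed form: (1 + penalty) * right - penalty * total
closed = [(1 + penalty) * r - penalty * total for r in raw]

print(scaled)
print(closed)
```

Both lists agree, and the candidates stay in the same order with the same relative gaps, so any pass mark on one scale corresponds to a pass mark on the other.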

If there's no penalty for guessing, or a penalty that's less than the expected return, then a rational test taker should ALWAYS guess, so you get the above scenario. If, however, the penalty for guessing is more than the expected return for a random answer, then the rational test taker must evaluate the certainty of their answer, and only answer if that is above the expected return rate. In that case, the ONLY additional thing you are measuring is the accuracy of the test taker's self-evaluation of their knowledge level on each question. If that's important, then do it (or do it for a portion of the test), but know that means it is being less discriminatory of the actual knowledge level.

A penalty that is less than the expected return rate (e.g. no penalty) penalizes someone who runs out of time to randomly fill in the blanks, which is probably not what you want to test, so a scoring method that gives exactly the expected return for an unanswered question is appropriate. Thus, score 0 for an incorrect answer, 1 for a correct answer, 0.5 for an unanswered T/F, 0.2 for an unanswered 1-of-5 multiple choice, etc. However, giving a score of 0 for an unanswered question should return close to the exact same result for everyone who does the expected behavior of answering every question; there will be some random noise inserted, but over several hundred questions that should be fairly low.

Any test is going to have problems with people right at the margin. A "hard test" will make sure that ONLY those people who are qualified pass (thus eliminating some candidates who were qualified but unlucky), an "easy test" will make sure that all qualified candidates pass (thus allowing in some people who weren't qualified but were lucky). For a pass-fail test, ALL the questions should be able to be answered by a qualified candidate, and NONE of the questions should be able to be answered by all of the unqualified candidates (i.e. if you take a sufficiently large number of unqualified candidates, NONE of the questions should be able to be answered by all of them, and ideally all of the questions would have about the same level of incorrect responses).

If the purpose of the test is diagnostic, then you'll have a wide range of very easy to practically impossible questions, designed to identify a particular area of deficiency. However, NO question should be able to be answered by everyone, that would be a useless test question that only measures ability to accurately fill in a circle on the test sheet (although even that might be useful to identify people who have problems taking tests). With such a test, you might give a higher score for a more difficult question (e.g. instead of 0, 0.2 and 1 for wrong, no answer and correct on a 1-of-5 multiple choice, you'd give 0, 1 and 5).

Yay for vapid blogs. by Kuroji · 2007-06-16 19:04 · Score: 1

And now we know why this man is a former mathematician. This is just bad math.

The fallacy of penalizing guessing by iamacat · 2007-06-16 19:06 · Score: 1

Suppose the test is really hard and contains many answers which are wrong, but could be thought correct by a person who is moderately knowledgeable about the question. Now if you penalize guessing, I may answer 20 questions correctly and 80 with "reasonable" answers which are not correct, and my score is 0 assuming 4 choices per question. On the other hand, someone who answers 10 questions correctly and puts random guesses for the other 90 will likely get a score close to 10.
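
A quick Python sketch of that comparison. The post does not say which penalty it has in mind, so this tries both 1/4 and 1/3 of a point per wrong answer as assumptions; the function and its parameters are made up for illustration.

```python
def expected(known, confident_wrong, blind_guesses, choices, penalty):
    """Average score: known answers, plausible-but-wrong answers, blind guesses."""
    lucky = blind_guesses / choices
    unlucky = blind_guesses - lucky
    return known + lucky - penalty * (confident_wrong + unlucky)

for penalty in (1 / 4, 1 / 3):
    me = expected(20, 80, 0, 4, penalty)        # 20 right, 80 "reasonable" but wrong
    guesser = expected(10, 0, 90, 4, penalty)   # 10 right, 90 random guesses
    print(f"penalty {penalty:.3f}: me = {me:.2f}, guesser = {guesser:.2f}")
```

Under either penalty the random guesser averages a higher score than the person who knew twice as much but fell for the distractors.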

Basically, multiple choice tests which are so hard that even successful candidates will get most questions wrong are worthless. Consider also the potential of undetectable fraud if, say the janitor cleaning instructors room leaks questions in advance.

Statistical exam using Multiple choice by dybdahl · 2007-06-16 19:06 · Score: 1

I haven't had many exams with multiple choice, but my university statistics course was one of them.

Each question had 5 options, and only one was correct. A correct answer gave 5 points, an incorrect answer gave -1 point.

Now, as the smart reader can guess, 4 x -1 + 5 = 1, so guessing still pays off... especially if one or more of the questions are very unlikely to be correct.

Did the teacher design this test incorrectly, since guessing was rewarded? Well, actually, the only test of real-life application of statistical knowledge was to understand this, so those who started to guess, basically demonstrated their statistical knowledge, and I guess that should be rewarded.

One of the questions was about the outcome of a distribution, where the value should be looked up in a distribution table that was used by the course. Only one of the 5 options was in the table as a result value. That made this one easy :-)

Re:Statistical exam using Multiple choice by crossmr · 2007-06-16 19:17 · Score: 1

And those who didn't understand it but guessed anyway were just as rewarded...
Re:Statistical exam using Multiple choice by gnasher719 · 2007-06-16 22:32 · Score: 1

'' I haven't had many exams with multiple choice, but my university statistics course was one of them.

Each question had 5 options, and only one was correct. A correct answer gave 5 points, an incorrect answer gave -1 point.

Now, as the smart reader can guess, 4 x -1 + 5 = 1, so guessing still pays off... especially if one or more of the questions are very unlikely to be correct. ''

If you were very confident about the questions where you thought you knew the right answer, then guessing would slightly increase the expected value, but largely increase the variance. So the question would be: Do you think you can pass without guessing? If the answer is yes, then don't guess - it increases the variance and therefore your chance of failure. But if you didn't know enough right answers to pass, then you should guess.
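
The expected value and variance of a single guess under that +5/-1 scheme can be worked out in a few lines of Python. This is only a sketch under the scheme quoted above; the function name is made up, and the three-option case assumes two options have been confidently ruled out.

```python
def guess_stats(options, reward=5, penalty=1):
    """Mean and variance of the points from one uniform random guess."""
    p = 1 / options
    mean = p * reward - (1 - p) * penalty
    second_moment = p * reward**2 + (1 - p) * penalty**2
    return mean, second_moment - mean**2

mean5, var5 = guess_stats(5)   # blind guess among all five options
mean3, var3 = guess_stats(3)   # after ruling out two options

print(f"5 options: mean = {mean5:.2f}, variance = {var5:.2f}")
print(f"3 options: mean = {mean3:.2f}, variance = {var3:.2f}")
```

A blind guess gains only a fifth of a point on average but carries a variance of almost 6, which is the trade-off described above: a small expected gain bought with a lot of extra spread.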
Re:Statistical exam using Multiple choice by AngryJim · 2007-06-16 22:38 · Score: 2, Funny

-1 for an incorrect answer? That's pretty weak. As an air traffic control student, our wrong answers get punished with planes full of dead people.
Re:Statistical exam using Multiple choice by An+Onerous+Coward · 2007-06-17 02:47 · Score: 1

Unless the test is *on* statistics, the only fair thing is to describe the optimal guessing strategy to the class in advance (whatever that strategy may be).

Say you and another equal candidate were being tested on your knowledge of Victorian poetry. The teacher says "you're punished for guessing." You recognize that you're still better off guessing, so long as you can eliminate two of the answers. You'll very likely get a higher score than another person who knows just as much as you do, which undermines the validity of the test.

--
You want the truthiness? You can't handle the truthiness!

only multiple choices ? by koxkoxkox · 2007-06-16 19:09 · Score: 1

I am a French student and we very rarely, if ever, have multiple-choice questions (QCM in French) in our exams. When there are some QCM, like in the maths test of the baccalauréat, they count for only a small part of the final grade, and this is very recent. The only QCM-only test I have taken was the TOEFL.

Is it that common in the US? Is it common even outside scientific studies?

Re:only multiple choices ? by laffer1 · 2007-06-16 19:41 · Score: 1

Yes, it's very common both in k-12 and higher education. Even in college, at least half of the questions asked on exams are multiple choice. I've seen them in History classes, English classes, Computer Science, Religion courses, Math and Economics. Traditional sciences as well.

English courses are the strangest. I took a literature course and our final was a multiple choice test on several books we had to read. I felt like I was in 4th grade all over again. However, my personal experience has been that English and Computer Science classes are the most likely to use short essay questions. It's not uncommon to write code or define terms on a CS test.

--
MidnightBSD: The BSD for Everyone
Re:only multiple choices ? by Kadin2048 · 2007-06-16 21:07 · Score: 1

Yes, they're big in the U.S. Particularly at large universities, I'd wager that they are the dominant form of testing.

I went to a very small college for undergrad, so my experience is different, but I've had friends who went to huge state unis, and describe many classes where there was virtually no interaction with the professor besides multiple-choice tests. They are heavily used because they can be easily graded via automated systems. (Fill-in-the-bubble, aka "Scantron" forms, usually.) All quizzes, tests, and the final would be on Scantrons, all 4- or 5-question multiple choice.

For the professors, they're great. All they have to do is make up one exam, with the answers marked on it, and pass it off to a teaching assistant to make up the "master" sheet that has the correct responses. Then they have the students do their exams on ScanTrons, and have the TA feed them into the machine. The machine does the grading, marks the incorrect responses (some will even print the correct response next to it), and can produce a score report so the prof can re-jigger/curve grades as necessary. Most systems can even cope with multiple versions of a test (to deter shoulder-surfing in the exam room). The professor never even needs to look at a student's work, and it doesn't require any infrastructure like computer-based testing does.

--
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
Re:only multiple choices ? by fbjon · 2007-06-16 21:19 · Score: 1

I've never even seen a multiple choice test in university here (Finland). They got left behind when I graduated high school. I think the only way they should be used is for very large amounts of test-takers (like national high-school graduation exams), or poll choices, it just doesn't match up to a Real test. That said, in a large test, having one or two questions to be simple multiple-choice should be ok.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:only multiple choices ? by Hognoxious · 2007-06-16 22:38 · Score: 1

The great thing about open ended, essay type questions is that the scoring is much more subjective. This makes it much easier to ensure that the son of the local mayor gets the score he deserves.

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Re:only multiple choices ? by pla · 2007-06-16 23:14 · Score: 1

Is it that common in the US?

Not in most actual coursework, and when used, almost never the "very hard" form that the FP decries as mathematically unfair. Usually in-class multiple choice tests do nothing more than weed out those who never cracked the book - If you read even the chapter summaries of the relevant material, you should get very nearly 100%

We do, however, use multiple choice in virtually all of our standardized testing, from MATs (grade school evaluations, though they don't directly impact the students' grades) to SATs (a nationwide standardized college qualifying exam, basically), to professional licensure/certification exams.

Although most responses so far have damned the FP author for his point, I think it valid under his stated condition of a "very hard" test. If you would expect the average well-prepared test-taker to get 75% correct (in the US, we call that a "C" and consider it the average grade), multiple choice works pretty well as a means of avoiding bias in scoring (and while the choice of questions and answers may allow for poor wording, that same flaw applies equally well in any form of test short of a practical - which in most cases counts as very impractical). If, however, you expect most knowledgeable respondents to score only slightly higher than a random set of answers, then the FP has it right.

I would also mention that for the SATs and most professional exams, we don't use the "dumb" form of multiple choice. The SATs count wrong answers against the test-taker; many computerized professional exams use a curious format where you have a set number of questions but each question counts for a number of points roughly proportional to its hardness, and as you answer questions correctly, the hardness increases so you can get more points quicker (or decreases so you can get fewer points slower... so you could pass with perhaps 20 correct "hard" answers, or 80 "easy" ones).
Re:only multiple choices ? by jo7hs2 · 2007-06-17 01:48 · Score: 1

Yes, they are very common because they are so much less intensive to grade. Just run the sheet through the machine, and bam, you've got grades.

In legal circles, we have something called a multi-state bar exam format question. It is essentially an essay question that they've decided to have you answer in multiple choice format. Hellishly unpleasant and time-consuming if you aren't just guessing.

There may be unanswered questions by dybdahl · 2007-06-16 19:10 · Score: 3, Interesting

If you have 100 questions, and 20 right ones and 20 wrong ones, it leaves 60 unanswered questions.

That's why the article talks about only counting right ones. In order to avoid guessing, there should be a difference between picking a wrong answer and not picking an answer at all.

Re:There may be unanswered questions by Bigos · 2007-06-16 20:38 · Score: 2, Funny

Somebody has done it before. I applied for a job as an English language teacher, and a lady interviewing me said that it is company's policy to test every applicant no matter what certificates and diplomas they have. So i was given the test quickly done the 2/3 of it and then discovered that in the most difficult rest all the answers were wrong. I noticed that some of the answers were SLIGHTLY INCORRECT, so after correcting them i marked them accordingly I have passed pack the test paper. Later the lady told me she was impressed with my test results, as few people saw the trap in the test.
Re:There may be unanswered questions by OverlordQ · 2007-06-16 20:47 · Score: 1

Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.

--
Your hair look like poop, Bob! - Wanker.
Re:There may be unanswered questions by DoctorFrog · 2007-06-16 21:17 · Score: 1

In my military training school I was doing very well and decided to test a rumor I had heard, so I deliberately answered every question incorrectly. Lo and behold, I was awarded a score of 100 for my efforts! :)
Re:There may be unanswered questions by UnxMully · 2007-06-16 21:20 · Score: 4, Insightful

Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.

Fate, it seems, is not without a sense of irony.
Re:There may be unanswered questions by digitig · 2007-06-16 21:45 · Score: 1

Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said. I think that shows your need of a good English teacher. I found the posting clear and well-expressed. Or, to put it in your terms, "Up yours dumbass. It was good."

--
Quidnam Latine loqui modo coepi?
Re:There may be unanswered questions by Sancho · 2007-06-16 22:11 · Score: 1

I think that shows your need of a good English teacher. It never fails....
Re:There may be unanswered questions by Karganeth · 2007-06-16 22:25 · Score: 1

In a maths test I took at my school they used this. In the first 15 questions, if you got the answer wrong you lost a mark. If you got it right you gained 5 marks. There were 5 possible options to tick. in the final 10 questions, the harder questions, you lost 2 marks for getting it wrong and gained 6 for getting it right. It seemed to work, because I scored the highest in my school on it and I believe I'm quite good at maths.
Re:There may be unanswered questions by Smauler · 2007-06-16 23:01 · Score: 1

Seems like the people who set the maths test should look at their maths. With 5 possible answers for the first 15 questions, guessing randomly is better than leaving no answer. If you know nothing about maths, so know none of the answers, and guess at them all, you'll get on average one in five, ie. 3, right, and 12 wrong, leaving you with 3 points. I guess if they want that to be part of the test (ie. know which questions it's beneficial to guess at), then fair enough.
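For anyone who wants to check the arithmetic, here is a minimal sketch of the expected marks from blind guessing under the scheme described above (5 options, +5/-1 on the first 15 questions, +6/-2 on the last 10). The function name and parameters are made up for illustration, and it assumes every question is guessed uniformly at random.

```python
from fractions import Fraction

def expected_guess_score(questions, options, right, wrong):
    """Expected total marks from guessing uniformly at random on every question.

    right is the marks gained per correct answer,
    wrong is the marks lost per incorrect answer.
    """
    p = Fraction(1, options)  # chance that a blind guess is correct
    return questions * (p * right - (1 - p) * wrong)

easy = expected_guess_score(15, 5, right=5, wrong=1)  # first 15 questions
hard = expected_guess_score(10, 5, right=6, wrong=2)  # last 10 questions
print(easy, hard)
```

Under those assumptions guessing gains about 3 marks on the first section and loses about 4 on the second, so blind guessing only pays on the easy part. With 5 options and +5 for a right answer, the penalty would have to reach 5/4 of a mark before guessing stopped paying.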
Re:There may be unanswered questions by Smauler · 2007-06-16 23:07 · Score: 1

When pointing out someone else's poor grammar, you should probably check whether it is poor grammar first (hint : it's not). However, the GP must have read a different post to the one I did - I skim read it and had to re-read it in at least 3 places because of typos (I'm hoping) or poor grammar.
Re:There may be unanswered questions by Threni · 2007-06-16 23:20 · Score: 1

Not really. Even with errors, his comment is easier to parse and understand than the never ending bloated verbosity he was replying to.
Re:There may be unanswered questions by UnxMully · 2007-06-16 23:27 · Score: 1

So you don't find it ironic that a grammar-based flame includes a grammatical error?
Re:There may be unanswered questions by digitig · 2007-06-17 00:16 · Score: 1

"Pack" is almost certainly a typo for "back", even though the keys are not close on a qwerty keyboard. "In the most difficult rest" is perfectly conventional grammar, if a little archaic. For the rest, /. probably isn't the place for a couple of thousand words on the Russian formalists' concept of foregrounding, Bakhtin's theory of centrifugal and centripetal influences on language and Halliday's work on functional linguistics, though I could write them if necessary. Suffice to say that bad writing breaks the rules, good writing follows the rules, and the best writing breaks the rules. Do you object to the fact that of the first fifteen "sentences" in Dickens' "Bleak House" not a single one has a principal verb?

--
Quidnam Latine loqui modo coepi?
Re:There may be unanswered questions by someone300 · 2007-06-17 00:20 · Score: 1

Interestingly, the UK Maths Challenge does this. The first 13 questions are relatively easy and you get 5 marks for getting them correct, 0 marks for getting them wrong. The next 10 questions are harder and worth 6 points, but you get -1 or -2 marks for getting them wrong. It really does, in my personal experience, ensure that the people who are great at maths get great scores. There is definitely a correlation between how my teachers and I rate the skill of a particular student and the score they get in the UKMT Maths Challenges. It is a far better correlation than the one between A level scores and skill; A level scores tend to be closer to what the person would score in a memory test.
Re:There may be unanswered questions by Firethorn · 2007-06-17 00:26 · Score: 1

You have to remember the desired results here...

They want to reward the smart kids, but god forbid if the football players don't pass...

I've seen a number of tests like this. Getting 60-70%? Extremely easy. Getting 80-90%, tough. 100%? Almost impossible.

Simply mix in the appropriate proportion of easy and difficult questions.

--
I don't read AC A human right
Re:There may be unanswered questions by The+One+and+Only · 2007-06-17 00:48 · Score: 1

I applied for a job as an English language teacher

Um...

... and a lady interviewing me said that it is company's policy to test every applicant...

Should be either "company policy" or "the company's policy", but never "company's policy"--that construction only works for proper nouns, not common nouns.

So i was given the test quickly done the 2/3 of it and then discovered that in the most difficult rest all the answers were wrong.

Should be: "So I was given the test, quickly did 2/3 of it, and then discovered that in the remainder of the test, which was more difficult, all of the answers were wrong." You missed a few commas. Also, "in the most difficult rest" is awkward and confusing.

I noticed that some of the answers were SLIGHTLY INCORRECT, so after correcting them i marked them accordingly I have passed pack the test paper.

You missed another capitalized "I", and I have no idea what "I have passed pack the test paper" even means. If you're teaching the English language, I fear tomorrow.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:There may be unanswered questions by The+One+and+Only · 2007-06-17 00:55 · Score: 1

"In the most difficult rest" is perfectly conventional grammar, if a little archaic.

It's also very awkward, rarely used with the word "rest" in that sense, and, because we are unaccustomed to seeing the word "rest" abused like that, our brains are prone to substitute a couple missing pixels and read it as "the most difficult test", which is a whole other continent of confusion. As for "the best writing breaks the rules"--someone's writing a post on Slashdot relating a personal experience, not creating some innovative work of literature. You clearly didn't try to be Dickens when you wrote that comment, and wisely so.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:There may be unanswered questions by Threni · 2007-06-17 01:10 · Score: 2, Insightful

No - it would only have been ironic if his mistake had rendered his comment incomprehensible.
Re:There may be unanswered questions by UnxMully · 2007-06-17 01:21 · Score: 1

Mmmkay, I'll take your word for it.
Re:There may be unanswered questions by Bigos · 2007-06-17 02:12 · Score: 1

don't worry, I am unemployed now, i was doing teaching in 1998/1999 and was sacked before the end of the term. i got the job because in the country that I lived the they didn't have enough people who knew enough english to teach at beginners level. I have lost the job for a different reason that you might think.
Re:There may be unanswered questions by Sancho · 2007-06-17 04:01 · Score: 1

Yeah. I never did learn not to post while taking cough syrup. Apologies to the original poster.
Re:There may be unanswered questions by pairo · 2007-06-17 06:48 · Score: 1

It's not a grammar-based flame. It's an "incoherent babble-based" flame.
Re:There may be unanswered questions by pairo · 2007-06-17 06:56 · Score: 1

Actually, "in the most difficult rest" is plain wrong, if he meant "in the rest of the test, which was more difficult". In the most difficult rest implies there were more sections left and he's talking about one of them that was particularly difficult. That is, if it's not some silly idiom. :-)
Re:There may be unanswered questions by The+One+and+Only · 2007-06-17 08:53 · Score: 1

Yeah, the fact that we have a difficult time interpreting it at all is a good reason to rephrase it.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:There may be unanswered questions by Bigos · 2007-06-17 21:28 · Score: 1

it wasn't that ;-), probably you will never guess what it was
Re:There may be unanswered questions by Bigos · 2007-06-17 21:39 · Score: 1

I live in Salford, England now, so I don't need english teacher, I can learn from the locals, init?

Education in taking the test by MagicDude · 2007-06-16 19:10 · Score: 4, Insightful

As a medical student, I know how much our education is divided into what we do in real life, and what is the proper answer for exams. Quite often, during our education exercises, we're given scenarios like "A patient presents with symptoms X, Y and Z. What do you do next?". At that point, that's when the resident says "You would diagnose condition A from those symptoms, but for the exam, you'd say you'd get an MRI to rule out B". So many questions are basically about having intuition for where the question is guiding you to, rather than practical medicine. Often, it's extremely difficult to discern what the question wants. There will be some question along the lines of "A patient presents with general fatigue over the past 3 months, which one blood test do you want to order?" and you'll narrow down the answer choices to either thyroid stimulating hormone or a complete blood count. Both studies are equally important in the evaluation of fatigue, but the question wants you to know which one is more important. In real life, you would always get both because both conditions are fairly common, and you want to evaluate both at once to save the patient time and effort. However, the question will nail you if you don't know some obscure study which states that there is like a 1% difference in the incidence of hypothyroidism vs anemia in fatigue. More so, if you were on the hospital floor and you were to say "I'm getting only a CBC, because it's more likely," the resident would chide you for not considering hypothyroidism and getting the thyroid stimulating hormone as well, making you look bad. So yeah, learning for the test doesn't really ever end.

Re:Education in taking the test by Anonymous Coward · 2007-06-16 19:44 · Score: 1, Interesting

At that point, that's when the resident says "You would diagnose condition A from those symptoms, but for the exam, you'd say you'd get an MRI to rule out B".

Ah, so the process of eliminating the dangerous in favour of the obvious is actually built into the tutoring system, while the exams appear to teach the Right Thing to make the problem even harder to fix.

Got a friend who used to work in healthcare and this goes some way to explaining a few incidents where the Unlikely Explanation was instantly ruled out and the patient died (or, by over-concerned parents, was immediately taken elsewhere for tests and a life-threatening condition found).

This isn't like sloppy programming where the greatest danger of a buffer overflow is a pwned machine. If there is a discrepancy like that between "what you will actually do" and "what you claim you would do", and someone's life might depend on it, don't play along - rock the boat. If it'd be unhelpful to do so now, please make an issue of it once you've graduated.
Re:Education in taking the test by Alioth · 2007-06-16 20:41 · Score: 1

The best ones are the FAA tests for aviation - you often get a question, and then three right answers to pick from. It's just one answer is a little more right than the others!

My "favorite" multi-choice exams at school (or 'multi-guess' as we called them) were the ones where getting the first question in a series of several wrong, would doom you to getting all of the questions in the series wrong because the answers were all dependent on calculations from the answer of the first question! Of course, being multi-guess rather than long answer, they mark them automatically, so you can't get any credit for the correct calculations in the subsequent questions, like you would do in a long answer style exam. It's for this reason I vastly prefer 'long answer' exams.

--
Oolite: Elite-like game. For Mac, Linux and Windows
Re:Education in taking the test by ari_j · 2007-06-16 23:55 · Score: 1

The MPRE (Multistate Professional Responsibility Exam) is a multiple-choice test that most states use to qualify potential lawyers for the practice of law. It's separate from the bar exam and tests exclusively on knowledge and application of the lawyer ethics rules. (No jokes, please.) Most practicing lawyers would fail the test miserably, because to do well on it you have to know exactly where the line is between ethical and unethical and be able to precisely walk that line on the exam, whereas the majority of practicing lawyers are too paranoid of professional responsibility hearings and malpractice suits to walk the line that closely (and the small minority never pay heed to the line and also end up not knowing where it is). But the exam does effectively test knowledge and application of the law. (I won't get into the actual bar exam right now. Don't talk to me about the bar exam until at least the end of July. :P That said, many people think that the bar exam is antiquarian at best and not useful for aspiring lawyers - these people fit within what I'm about to say.)

The truth is that, by and large, people who think that difficult tests are fallacious tend not to understand why those tests are given. The specific things you write on the exam may or may not have real-world application, but that's not the real point of the exam. If it was, you wouldn't have internships and residencies - you'd take a test and then start saving lives when you got your passing score.
Re:Education in taking the test by LightPhoenix7 · 2007-06-17 05:24 · Score: 1

Especially since, in medicine at least, there's a right answer, but there's no straight line of reasoning between presentation and diagnosis. A question may ask about fatigue, and may ask what one test you'd order, but in real life there's absolutely no reason not to order both a TSH and a CBC, or for that matter, run a "Chem 7" as well to check electrolytes and glucose. In a modern medical laboratory, that can be done off of very little blood, for very little additional cost.
It's a pretty big disconnect between the test and practice, and that is why most of the "hard" tests are a sham - not because of any numerical analysis.
Re:Education in taking the test by lachlan76 · 2007-06-17 22:29 · Score: 1

While that is true, it can be avoided somewhat in some topics by asking questions that are phrased along the lines of "show that the wavelength is 400nm" rather than "find the wavelength". I still prefer long answer, though.

Re:warning moronic blog post linked by suv4x4 · 2007-06-16 19:11 · Score: 4, Insightful

if anything testing has become FAR FAR too easy, people pass CS courses and come out the otherside only to have a vague notion of how a computer works.

I won't claim his post is correct or not, but he claims the technology behind such tests is wrong and lets less educated people pass through with guessing, while more educated people try to pass without guessing and fail.

People see the tests produce poor selection, and make the tests harder and harder in attempt to remedy this (but they won't since it's the technology of a test that's wrong).

Then you come here and support his opinion 1:1 by claiming tests are too easy (i.e. should be harder) and idiots pass through.

Ironic, isn't it.

Re:warning moronic blog post linked by KDR_11k · 2007-06-16 19:16 · Score: 2

I think what he should have said is that multiple choice tests are a stupid idea (it's okay if one or two questions are a block of multiple choice lines but not the whole test). Let the student explain things with his own words.

--
Justice is the sheep getting arrested while an impartial judge declares the vote void.

Check the statistics not the mathematics! by sequence_man · 2007-06-16 19:17 · Score: 1

This is really a question of statistics not of mathematics. Having done experiments on MBA students, we found that a well written multiple choice question is more accurate than 4 well written essays. The fact that we can easily have 50 multiple choice questions and a maximum of 8 essays makes it a no brainer that multiple choice is much more accurate.

So it isn't a matter of how you reward guessing (psychologists will say that rewarding guessing actually gets better accuracy). It is a question of how well written the questions are. Further, the pass rate has absolutely nothing to do with the fraction needed to pass. Even high school students understand this one. So he seems totally confused.

Re:Check the statistics not the mathematics! by tepples · 2007-06-16 19:28 · Score: 2, Insightful

This is really a question of statistics not of mathematics. Statistics is a branch of mathematics.
Re:Check the statistics not the mathematics! by starwed · 2007-06-16 19:44 · Score: 1

But statistics also means the raw data; I think the OP was saying that this is better answered through experiment than theory.
Re:Check the statistics not the mathematics! by antifoidulus · 2007-06-16 20:09 · Score: 4, Funny

Having done experiments on MBA students

See, I KNEW they were good for something. Let me guess, the reason you opted for MBAs over mice is that there are far fewer protests when you do cruel medical experiments on MBA students than on mice, correct?

--
Monstar L
Re:Check the statistics not the mathematics! by IkeTo · 2007-06-16 23:00 · Score: 2, Interesting

> This is really a question of statistics not of mathematics. Having done experiments on MBA
> students, we found that a well written multiple choice question is more accurate than 4 well
> written essays. The fact that we can easily have 50 multiple choice questions and a maximum
> of 8 essays makes it a no brainer that multiple choice is much more accurate.

I don't know how you judge whether a question is well written or not. In my experience, multiple choice questions are very easy to write wrongly. A wrongly worded multiple choice question can easily have exactly the opposite effect to the one you want: you reward the ones who know the subject less (it seldom just gives you random noise). Worse, you won't know it happened until you're told. I've read many exam MC questions during exam paper review meetings, and my feeling from reading such questions for 4 years is that one in 4-6 MC questions is poor enough to do this. In contrast, a wrongly worded essay question will present students with some real-life trouble (the questions that they will face will be full of inaccuracies!), and when marking them you know the question is written wrongly, but at the same time you know whether the students are good anyway.

But the real problem with multiple choice questions is that they don't present the student with any real world test. In the real world, nobody would tell you "You are in this situation, you can do A, B, C or D. Please choose one". Instead, what they see is "Somebody is in this situation. Please advise." Being good at multiple choice questions usually has doubtful utility in the real world. And education systems will have to align with the judgement system, so in the end the teachers train their students in the wrong technique as well.

Of course, there are benefits of MC questions: they can be marked mechanically, which means that (1) they lessen the workload of markers, (2) they are marked with perfect consistency, and (3) their markings are free from language or hand-writing proficiency. I don't think "accuracy" is one of those, though, since MC questions are just testing the wrong ability.

This is insane by gowen · 2007-06-16 19:19 · Score: 1

His basic assumptions are so retarded as to invalidate his own thesis. Yes, depending on the difficulty of the exam, the range between the best and worst candidates will narrow. But the effect of guessing only becomes important in the extreme cases he looks at (impossibly hard test vs. impossibly easy).

And who the hell sets multiple choice questions with only two options? Rerun the numbers with five options and report back. You'll find the guesser is far more severely punished.
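Rerunning the numbers could look something like this rough sketch. The 20-question test, the 50% pass mark, and the lack of any penalty for wrong answers are assumptions picked for illustration, not figures from the blog post.

```python
from math import comb

def pass_probability(questions, options, pass_mark):
    """Chance that pure random guessing gets at least pass_mark answers right.

    Assumes independent questions, one correct option each, no penalty.
    """
    p = 1 / options  # chance that a single blind guess is correct
    return sum(
        comb(questions, k) * p**k * (1 - p) ** (questions - k)
        for k in range(pass_mark, questions + 1)
    )

# Compare the two-option case with a five-option test.
for options in (2, 5):
    print(options, "options:", round(pass_probability(20, options, 10), 4))
```

With two options a pure guesser clears the hypothetical pass mark more often than not, while with five options the chance falls well below 1%.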

Besides pass rates as indication of the difficulty of exams is a myth. Set any exam with even the slightest differentiation, and you can have whatever pass rate you like. You just pick your passing grade appropriately.

--
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.

Well-intentioned, but... by ThePromenader · 2007-06-16 19:19 · Score: 1

...the reasoning is... incomplete because it is based on an undefined variable ("knows twice as much as" (that itself is no easy task to measure)), and excludes the reasoning that, if a test is on the single subject the testee's 'level of knowledge' is "calculated" on, he with more knowledge/experience in that subject and its workings as a whole would have a greater chance of "guesstimating" correctly on the questions he was unable to answer with 100% certainty. Even more so if the test isn't a fixed set of true/false questions.

I'm sure it is possible to reduce such questions to mathematical formulae, but the algorithm would be m~u~c~h more complicated, and even then I think we could only be hitting at "closest averages".

--

No, no sig. Really.

ThePromenader

Re:warning moronic blog post linked by Anonymous Coward · 2007-06-16 19:21 · Score: 1, Insightful

> hard tests are meaningless? what's his solution, easy tests where even an idiot can score 100%?

No, you completely missed the point: hard _multiple choice_ tests are meaningless, esp. when counting only right answers without penalty for wrong ones, because the result depends more on how lucky you are (at guessing) than on actual knowledge. Maybe this is an overstatement, but there is no denying that multiple-choice can be problematic.

Not Worthless by deskin · 2007-06-16 19:22 · Score: 3, Insightful

Though some of his logic was overblown (see the comments made directly on his blog), I think his larger point has some merit. In fields which require lots of studying before beginning as a professional, such as medicine and law, you always hear that you have to be absolutely brilliant to 'get in'. The fact of the matter is that this is not the case: you should be darn smart, but you needn't be the best student in the world to be successful as a doctor. Many of the students who go to law or medical school (I'd guess most) are completely qualified for positions in their respective fields, but by the same token, are not necessarily any more qualified than their peers: they've all studied the same material, had the same experience in the lab, and know the whole picture within a reasonable approximation of each other.

Yet to maintain the level of exclusivity that these careers have, there must be some way to select a subset of the candidates to proceed, and at this point, there are few distinguishing features among them. Some will be far and away brilliant, and will easily get a career regardless; but the majority can't be differentiated from one another. So, how should it be decided who is a doctor and who isn't? By making a test that's so hard it amounts to a randomising function, and then selecting a subset of top scorers to pass. Passing doesn't mean one is inherently more qualified; it just means one guessed better on that day. This also explains why people can pass on their second or third try: they are no better than their competitors the next time around, but eventually one will guess luckily, and get in. It'd be interesting to do some statistical analysis on how many tries it takes people to 'pass' a particular exam, and see if the results fit probabilistic models: If the results of such analysis fit too well, the test is too hard, whereas if they deviate greatly from probabilistic expectations, then the test is more likely to be an actual test of one's knowledge.
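As a minimal sketch of what the "pure luck" baseline for such an analysis might look like: if every sitting were an independent draw with the same pass chance, the attempt on which someone first passes would follow a geometric distribution. The 30% per-attempt pass chance below is a made-up number, and real data would have to be compared against this by whoever holds the exam records.

```python
def attempts_distribution(p_pass, max_attempts):
    """P(first pass happens on attempt k) for k = 1..max_attempts.

    Assumes each attempt is an independent draw with the same pass
    chance p_pass, i.e. the exam behaves like a lottery.
    """
    return [(1 - p_pass) ** (k - 1) * p_pass for k in range(1, max_attempts + 1)]

dist = attempts_distribution(0.3, 5)  # 0.3 is a hypothetical pass chance
for k, prob in enumerate(dist, start=1):
    print("attempt", k, round(prob, 3))
```

If observed first-pass counts tracked this curve closely, that would point towards the lottery reading; a large deviation would suggest the exam is measuring something that changes between attempts.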

To be sure, there will be some individuals who can pass based entirely on their knowledge, just as there will be some individuals who simply aren't cut out for life as a lawyer that will fail the exam. But ultimately, it allows the higher-ups to select candidates for job positions based on the single indisputable criterion of the candidate having passed an exam, thus avoiding any messy issues when someone complains about them choosing a particular candidate in lieu of one better qualified.

Time for a terrible analogy, since it's 0300 here: Really hard exams are the bouncers at the door to the club of medical careers.

Re:Not Worthless by turing_m · 2007-06-17 02:11 · Score: 1

"Some will be far and away brilliant, and will easily get a career regardless; but the majority can't be differentiated from one another. So, how should it be decided who is a doctor and who isn't? By making a test that's so hard it amounts to a randomising function, and then selecting a subset of top scorers to pass. Passing doesn't mean one is inherently more qualified; it just means one guessed better on that day."

I think you are overestimating the degree to which the majority can't be differentiated from each other.

These sorts of tests are deliberately designed so that someone who is just passing the test will get maybe 50% of the questions right. Well, if it's a 4-choice multiple choice test, it's going to be a bit higher, because a monkey taking the test will score 25%. So say, 62.5% will be the point where half should fail.
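The 62.5% figure comes from knowing half the answers and guessing blindly on the rest. A quick sketch, assuming one correct option per question and no penalty for wrong answers (the function name is just for illustration):

```python
def expected_score(known_fraction, options):
    """Expected fraction correct when every unknown question is guessed at random."""
    return known_fraction + (1 - known_fraction) / options

print(expected_score(0.0, 4))  # the monkey: guesses everything
print(expected_score(0.5, 4))  # knows half, guesses the other half
```

The monkey lands on 0.25 and the borderline candidate on 0.625, matching the numbers above.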

If the difficulty of the questions is variable enough, the test will actually judge who is smart enough + hard working enough to cram for the exam correctly.

Of course, of the people who fail you will always get your few smart-but-lazy people who need a week cramming for the test instead of a day, the rest being the try-hards for whom time cramming could approach infinity and they'd still never pass. The former tend to play down the importance of testing while being sure to name drop all the tests they had passed - they are too cool for that, and it has the potential to make them sound like an asshole if they do. Some even go to the extent of writing a blog entry about it. And the latter play tests down to save face.

And there is a little luck around the edges, but not that much because there are enough questions of difficulty at or near the cutoff to make it a good estimate of where you stand. It's like an olympic weightlifting competition - they give you three tries at each weight because it's not always the first attempt that will get green lighted. Or the second.

--
If I have seen further it is by stealing the Intellectual Property of giants.

Sometimes guessing is a good thing by Artifice_Eternity · 2007-06-16 19:23 · Score: 2, Interesting

In many professional specialties, including law and medicine, there are times when a quick, decisive educated guess may produce better results than an exhaustively researched, definitively confirmed answer.

So tests that force students to do a lot of guessing may still be good tools for evaluating their professional qualifications.

A doctor or lawyer who can guess right may be superior to one who plods to the right answer only after many expensive lab tests or hours of legal research. That's not to say that doctors and lawyers shouldn't do lab tests and research -- of course they should. But there are many situations, especially time-sensitive ones, where quick judgment is more important than absolute knowledge: during surgery or a health crisis, during a trial or deposition, etc.

Re:Sometimes guessing is a good thing by The+One+and+Only · 2007-06-17 00:39 · Score: 1

But that's more of something to select for when you get down to specialties--don't put someone in the OR or ER who can't think on their feet, don't put someone in the courtroom who can't do the same. That's years past the LSAT or MCAT, days or months past the bar exam, and I don't know if doctors have to take a test to graduate medical school. Lots of doctors and lawyers can be as deliberative as they want.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:Sometimes guessing is a good thing by Vellmont · 2007-06-17 04:45 · Score: 1

So tests that force students to do a lot of guessing may still be good tools for evaluating their professional qualifications.

I think when the poster is referring to "guessing", he's defining it as being able to determine the right answer no better than chance. I'm not sure if "educated guessing" changes the outcome of his analysis or not.

--
AccountKiller

I had a teacher... by coldmist · 2007-06-16 19:25 · Score: 4, Funny

in college that gave very hard tests. Intel Assembly class. For a midterm, we had to decipher Object-Oriented Assembly, and decipher self-modifying code. After 3 weeks of introduction to Assembly.

I got an A, with an average of 58% in the class.

For the 2-hour final, he got up at the 1-hour point, and yelled: "The test is over. All pencils down." We just sat there dumbfounded for about 10 seconds, and then he said, "Just kidding. I always wanted to do that."

Ya, a real great pal there!

Worst teacher I had in college. He didn't last long.

--
Don't steal. The government hates competition.

Re:I had a teacher... by cerberusss · 2007-06-16 22:36 · Score: 1

Oh yeah, teacher stories!

I had a teacher that gave C++. He had long, unkempt curly hair and seldom bathed. He gave his lessons like his coding: in subroutines.

He'd jump from one subject to another and you had to take notes just to follow the flow of the class. He was called Joop and the famous command "rm -rf /" was named the "Joop Maneuver" at my college.

One time he appeared a bit late in class and I asked him honestly, hey you're never late, how come?

He admitted he got so drunk in the weekend that on Sunday evening he decided to take a swim in the ditch behind his house. In the morning he had to take an unexpected shower so he was a bit late :-D

Lousy teacher, great amusement :-)

--
8 of 13 people found this answer helpful. Did you?
Re:I had a teacher... by vorpal22 · 2007-06-17 00:22 · Score: 1

If you think that's bad, I took a graduate level graph theory course in the 2006 winter semester. I didn't pass a single assignment and got at most 15% on the final exam, and walked out of the course with a B+.

The teacher told us that he liked to make things incredibly challenging and we shouldn't be discouraged or upset with failing marks. Honestly, because of the assignment content and because I never felt I had any clue what my course mark would be due to this, it was the most unpleasant, workload-unrealistic, and stressful course I ever took. (For example, you'd start working on a proof on an assignment, and eight hours later, getting nowhere, frustrated, you turn to Google to find out that the answer to the question is actually the result of a published 25-page paper. Not exactly the type of thing that should be, IMO, on an assignment, and not doable by almost anyone. Differentiating between which questions were completable in a reasonable time frame and which ones were not was also virtually impossible, so you never knew where your efforts were best expended.)
Re:I had a teacher... by Tim_UWA · 2007-06-17 00:25 · Score: 1

Ignoring the fact that (the way I interpret it) his comment was that the class average was 58%, not his mark; scaling is a way of compensating for the impossibility of creating assessments that are identical in difficulty. Why should I take a class one year and get 58%, but if I took it the next year I could get 78%? This is especially true if a course is new, or if the lecturer has changed from previous years. It also means that I should take classes that are known to be easier, instead of relevant/interesting ones, to up my grades. Of course, this is sort of the case anyway (as more difficult classes attract smarter people, so my ranking in the class will be lower), but not as much as it could be.
Re:I had a teacher... by kiyoshilionz · 2007-06-17 00:28 · Score: 2, Interesting

How did you get an A with a 58%? Grading curves are crap. If you only know 58%, you should only be rewarded with a 58, not a 90. If you're the top person in the class, and only know 58%, something is definitely wrong, but you should not pass.

SRSLY? Have you ever taken a class at a university?

I go to UC Berkeley, but the MO of universities is to assign a professor to a class that is either within his field of research or is a fundamental part of what he/she does at the school (i.e. physics profs. in math). Professors are there because they are intellectuals and researchers, usually not because they love to teach. Because they have such vast knowledge and probably aren't very good at discerning how much their students are absorbing, they write impossibly hard exams, simply because they can't understand anything more basic. As a result, the top grade might be 58%. But knowing 58% on a difficult test might require the same level of knowledge as it takes to get a 90% on a easy exam.

In everything up till high school it was pretty easy to go back to that 90-80-70 standard because all the material was so elementary teachers could easily make tests to place students in those categories. At universities, the material is so much more complex that it's pointless/impossible to write exams to those arbitrarily defined 90-80-70 categories. Think of an exam in which half of the points are based on knowledge of the fundamentals, and the rest is on complicated, hazy, trivial bits of knowledge. In that case it's totally fine for a 58% to be an A since he understood (probably) most of the fundamentals and a few of the smaller facts.

BTW: Even if you are a reasonably talented individual, I doubt you'll get grades in the 70%-100% range your whole life....
Re:I had a teacher... by dcollins · 2007-06-17 02:22 · Score: 3, Insightful

That guy's a fucking asshole. As a college teacher of math & CS (including assembly -- admittedly at a community college), guys like this just completely burn me up. Some people should completely not be teachers, they suck so fucking bad.

I practically meditate before a final exam on how to make the environment as comfortable as possible, clearly explain in advance what the procedures will be like, and keep everything in the same rhythm as all my prior tests. Just freaking out students in a final exam because you're a sadist is utterly unacceptable. Jesus.

--
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Re:I had a teacher... by the_ed_dawg · 2007-06-17 03:01 · Score: 1

in college that gave very hard tests. Intel Assembly class. For a midterm, we had to decipher Object-Oriented Assembly, and decipher self-modifying code. After 3 weeks of introduction to Assembly.

From my teaching experience, exams like this are awful because you generally know who really knows the material regardless of what test you give. You need some Mickey Mouse problems to see who actually knows the fundamentals and some simpler application problems to see who can apply the fundamentals. Hence, you need to discern who sort of knows the material from who should be taking your course again.

Honestly, I have no problem assigning a lot of A's and B's as long as the people who don't know the fundamentals have to take the course again. After all, they're the ones who will wind up hurting people or costing their employers millions (or billions) of dollars.

For the 2-hour final, he got up at the 1-hour point, and yelled: "The test is over. All pencils down." We just sat there dumbfounded for about 10 seconds, and then he said, "Just kidding. I always wanted to do that."

Professors who play games with their students should be severely disciplined by the department. Playing games is disrespectful to students, abusive of the professor's authority, and childish. Unfortunately, most large degree programs simply don't care.

--
There are two types of people: those prepared for the zombie apocalypse and those who will be eaten.
Re:I had a teacher... by An+Onerous+Coward · 2007-06-17 03:02 · Score: 1

So, what you're basically saying is, take a well-designed course that unarguably gives 'A' grades for 'A' work and comprehension. Now add to that an impossible and meaningless assignment -- "draw a square circle. You have forty minutes." -- that is worth 50% of your final grade, on which everyone scores a zero. Suddenly, all the A students have become F students, with no changes to the effort put into the course or the knowledge derived from it.

You're basically asserting that, in such a situation, it is unfair, immoral, liberal, or whatever, to simply double everyone's course score and assign grades accordingly. That's not remotely reasonable.

It's hard enough to design course curricula and tests without forcing on teachers some bogus requirement about how you'll totally murder everyone's GPA if you write a harder final than you intended. Scaling allows for more flexibility and judgment by the teacher.

--
You want the truthiness? You can't handle the truthiness!
Re:I had a teacher... by coredog64 · 2007-06-17 05:41 · Score: 1

As we're on the subject, I'll share my Garry Harrison story:

Prof. Harrison taught the upper levels of aerodynamics at my school. On an exam halfway through the term he offers the following deal:

"I'll give you a guaranteed 'D' on the test if you'll walk out the door right now." Nobody took him up on the offer -- at this point
we've already shed all the losers and wannabes and even knowing how hard these tests are we're sure we can do better than a 'D'.

It didn't work out that way -- one guy even got a 0! (Prof. Harrison was not a believer in partial credit).
Re:I had a teacher... by xenocide2 · 2007-06-17 08:44 · Score: 1

For an undergraduate course, your complaints are sound. But at the graduate level, we're faced with a serious problem: the state of the art is so incredibly vast that preparing you to build on the body of existing work takes some time. At some point a grad student needs to be exposed to the papers and difficult problems that people face. Unfortunately, most math classes focus on the same problem set oriented homework instruction, even at the graduate level. It's an important experience to try to solve difficult problems, but as you noted, you had to go outside of class material to discover the solution. What's worse is that the instructor probably forbade you from looking at such materials, as cheating. What should happen is a standard format of "here's a hard problem, try to prove it". And then present the paper as part of the course. We essentially teach students to ignore the journals they will eventually publish in.

--
I Browse at +4 Flamebait
Open Source Sysadmin
Re:I had a teacher... by Captain+Segfault · 2007-06-17 15:02 · Score: 1

Just because a problem was solved in a 25 page paper does not mean it isn't a suitable homework problem, even if you didn't manage to find the solution yourself.

It's not like you'd be expected to write up in the detail they gave. More likely than not the *real* major fruit of the paper (the general approach) was covered in class, and you're just proving a corollary.

Half the trick to courses like that, IME, is to relax and just take it as a series of fun problems. Courses like that are never graded on a 70/80/90 scale, so stressing about the grade you get on any particular assignment, or even on a series of assignments, is just counterproductive.
Re:I had a teacher... by elrous0 · 2007-06-18 03:11 · Score: 1

I had a similar teacher in a Sociology of Religion class. He would spend most of the time in class talking about stuff like his drug experiences (on the excuse that he had tried peyote once, and that tied it in with American Indian religion), his experiences as a triathlon runner (to the point of having other triathletes come in as guest speakers), and assorted other matters completely unrelated to the sociology of religion. He would then give a test about once a month consisting mostly of questions about subjects which we did not discuss in class and which weren't in the texts. I would usually score the highest on these tests, with a 50%--which he then curved up to a 100%. No one ever got more than about half the questions right (even lower when you factored in the guess factor).
Nice old guy. Absolutely terrible teacher. I learned more about drugs and running in that class than religion. They should have just changed the name of the class to "Sociology of Half-Crazy Old Hippies."

--
SJW: Someone who has run out of real oppression, and has to fake it.

My experience by Tim_UWA · 2007-06-16 19:26 · Score: 5, Interesting

I once had a test that had a check box for how confident you were your answer was correct, that affected your score the following way:

If you ticked "confident" and you were wrong, -2
If you ticked "confident" and you were right, +2
If you ticked "unsure" and you were wrong, -0
If you ticked "unsure" and you were right, +1

I guess the point is that it's advantageous to guess, but only if you choose the lesser-scoring option.
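
For what it's worth, the break-even point under that scheme can be worked out. A minimal sketch, assuming only the four scoring rules listed above; the function name `expected_points` and the probability `p` of being right are made up for illustration:

```python
from fractions import Fraction

def expected_points(p, confident):
    """Expected points for one answer that is right with probability p."""
    if confident:
        return 2 * p - 2 * (1 - p)  # +2 if right, -2 if wrong
    return p                         # +1 if right, 0 if wrong

# Compare both tick boxes at a few illustrative confidence levels.
for p in (Fraction(1, 4), Fraction(1, 2), Fraction(2, 3), Fraction(9, 10)):
    print(p, expected_points(p, True), expected_points(p, False))
```

Under these assumptions, ticking "confident" only pays once you are right more than two times in three; below that, "unsure" has the higher expected score.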

Re:My experience by Anonymous Coward · 2007-06-16 19:46 · Score: 2, Funny

At last! A scoring method that will naturally penalise me for my lack of self-confidence!
Re:My experience by antifoidulus · 2007-06-16 20:05 · Score: 1

You aren't talking about the Academic Game "Propaganda"(under the new rules anyway, I'm such an old timer I can remember when your score was based on consensus and the answer, but anyway) are you?

--
Monstar L
Re:My experience by NerveGas · 2007-06-16 20:07 · Score: 1

It wasn't quite that bad, but I had a teacher who would give you 1 point for a right answer, no points for no answer, and take away two points for a wrong answer.

--
Oh, you're not stuck, you're just unable to let go of the onion rings.
Re:My experience by Loke+the+Dog · 2007-06-16 20:44 · Score: 1

Interesting, however, tests that take the taker's attention away from the actual subject are not good. By giving takers these kinds of options, they will be filling their heads with statistics or whatever, when they should be thinking of bacteria or whatever.
Re:My experience by The+One+and+Only · 2007-06-16 20:46 · Score: 2, Interesting

That reminds me of my EE final--at the end, we had the option to guess our final score. There was a mathematical formula applied to the absolute value of the difference of the estimated final score and the actual final score. If you were close enough, you'd gain points. If you weren't close enough, you'd lose points. Of course, you could always elect not to do it.

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:My experience by Libertarian001 · 2007-06-16 21:04 · Score: 1

With that confidence marker, Prisoner's Dilemma and game theory immediately spring to mind.
Re:My experience by gnasher719 · 2007-06-16 22:45 · Score: 1

I remember one test that millions and millions of people had to pass. Here's the rules:

25 questions. Each question has three or four answers. There can be no right answer, one right answer or multiple right answers. You need to check exactly all the right answers. Any right answer not checked and any wrong answer checked gives you penalty points. So if there is a right answer and you check a wrong one, that is two penalty points. If there are two right answers and you pick only one, that is one penalty point. Each question has a multiplier of 2, 3 or 4; many have four. Up to seven penalty points you pass. So getting the wrong answer on a question with a multiplier of four means failure.
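
One way to read those rules, sketched in code. This is an interpretation rather than the official marking scheme: it assumes every missed right answer and every checked wrong answer counts as one mistake, and that the mistakes on a question are multiplied by that question's multiplier. The names `penalty_points` and `passes` are hypothetical.

```python
def penalty_points(correct, checked, multiplier):
    """Penalty for one question under the assumed reading of the rules.

    correct and checked are sets of answer labels; each label that is in
    exactly one of the two sets (missed or wrongly ticked) is one mistake.
    """
    mistakes = len(set(correct) ^ set(checked))
    return mistakes * multiplier

def passes(total_penalty, limit=7):
    """Up to seven penalty points is still a pass."""
    return total_penalty <= limit

# One right answer, wrong one ticked, multiplier 4: two mistakes times four.
print(penalty_points({"B"}, {"A"}, 4), passes(penalty_points({"B"}, {"A"}, 4)))
```

Read this way, a single wrong pick on a multiplier-four question already exceeds the seven-point limit, which matches the "means failure" remark.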

On the positive side, there is a total of 1200 possible questions, and the 25 you get are picked at random out of these 1200, and you can buy books with all 1200 questions and all the correct answers.
Re:My experience by TheTapani · 2007-06-16 23:43 · Score: 1

On the positive side, there is a total of 1200 possible questions, and the 25 you get are picked at random out of these 1200, and you can buy books with all 1200 questions and all the correct answers.

And being able to buy the right answers ahead of the exam is something positive??
Apparently your school system (or attitude towards it) is very, very, different to where I live.
//T
Re:My experience by Tim_UWA · 2007-06-16 23:50 · Score: 1

I've personally not heard of it, but that's not to say that my lecturer hadn't.
Re:My experience by teslar · 2007-06-17 00:08 · Score: 1

At last! A scoring method that will naturally penalise me for my lack of self-confidence!
Surely this method will, in fact, naturally reward you for justified lack of self-confidence? ;) Imagine the insecure nerd with big glasses in high school, he comes to the test only to realise that he's revised for the wrong topic. He knows sweet FA, guesses 50 True/False questions but admits to being unsure. He is likely to end up with a score of 25ish.

Next, here comes Johnny Bling, the nerd's arch-nemesis. Johnny looks cool, has had more girlfriends than the nerd had level-ups and he sweats testosterone and self confidence. He simply has not revised for the exam because he is just that good. So he too guesses all the answers, obviously confident. Result? 0ish.

So there you go, lack of self confidence wins. Doesn't that make you feel confident about your lack of self confidence? ;)

P.S.
Yes, I am aware that if the nerd actually knew all the answers, confidence would have been better. But then this is a nice test for checking confidence levels. Consider only the answers for which a student was unsure. If he is really guessing, you'd expect only 100/(number of answers)% of his answers to be correct. If more are correct than chance would predict, consistently and repeatedly across tests, then you may want to have a chat with the guy. Conversely, if someone consistently has a significantly lower accuracy than 100% on his confident answers, you may want to tell him to either share or stop smoking whatever he's smoking and have a reality check :)
Re:My experience by Lord+Ender · 2007-06-17 05:11 · Score: 1

In computer science, we call that "infinite recursion."

--
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
Re:My experience by EricWright · 2007-06-18 02:00 · Score: 1

Ouch. I remember once in 3rd year physics, I had a Thermal Physics course that made virtually no sense to me. I went in for the mid-term, stewed, steamed, and fretted over that exam. I gnawed (mechanical) pencils. I balled up one sheet of paper after another. At the end of the hour, I unceremoniously dumped my sheaf of unwadded paper on the professor's desk, muttered something not-quite incoherent about the difficulty of the test and stalked off, certain I had just failed the exam.

Imagine my surprise a week later when I found out I had turned in the only perfect paper. I would have HATED an option like that on this particular test, as I would have surely guessed a 50-60%, been horribly wrong, and ruined a perfect score.
Re:My experience by The+One+and+Only · 2007-06-18 03:08 · Score: 1

Only one level of recursion in this case--your final adjusted grade, if you took the opportunity to guess, was the function of your final unadjusted grade and your guess. An infinitely recursive version of this would be rather clever for a CS final, though...

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:My experience by swillden · 2007-06-18 15:27 · Score: 1

Interesting, however, tests that take the taker's attention away from the actual subject are not good. By giving takers these kinds of options, they will be filling their heads with statistics or whatever, when they should be thinking of bacteria or whatever.
Maybe it was a statistics course?

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.

Genius from under a rock... by TyFighter · 2007-06-16 19:34 · Score: 1

Cue the underground brilliance of every slashdot troll claiming that he is no less than a genius and nothing truly mentally stimulating can be classified as difficult.

--
-tyfighter

Re:Genius from under a rock... by rts008 · 2007-06-16 20:32 · Score: 1

I'm a moron, you insensitive clod!

And, and...ohhhh!...Shiny!

--
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti

Well, that explains it. by Rumagent · 2007-06-16 19:43 · Score: 4, Funny

TFA makes sense. Observe:

News for nerds?: yes[ ] no[x]
Stuff that matters?: yes[ ] no[x]

Clearly the editorial process is fraudulent - as this is a multiple choice, it is obvious that guessing tends to count much more than knowledge.

From this we can conclude one of two things:

1) Zonk is bad at guessing
2) The author is speaking out of his ass

Tempting as it is, I am going to stick with 2... But I could, of course, be guessing.

Whoa! by Gerocrack · 2007-06-16 19:43 · Score: 1

You mean it ain't me noggin, it's me teachers?

According to the blog by johncadengo · 2007-06-16 19:47 · Score: 1

As many have posted, this blogpost is mostly pretentious at best. However, in the post he states:

Now suppose the test is very hard. As hard as it could be actually. Suppose the test is so hard that I, with lesser knowledge, can only answer one question based on actual knowledge. I answer that question, and guess at the other 99. You, who know twice as much as I, can answer two questions based on knowledge. So you guess at 98 answers. As you can readily imagine, the odds of you getting a higher grade than I are very slight. In fact, over 45 percent of the time, in repeated trials, I would outscore you, even though my knowledge is half that of yours.

I'd like to point out the simple fact: in reality we don't worry about those who are two or three times as smart as the rest, their knowledges are mostly indistinguishable (as pointed out by the blogpost, albeit shakily), but we are looking for the many magnitudes of times smarter than the rest (so smart in fact that they surpass the flaws that he has pointed out). And that's where those taking 'hard tests' succeed and others do not. All these flaws are arguably non-existent but even in supposing their existence, it would do nothing to correct these flaws in our ability to be able to separate those capable of Med School or Law school and those who are not. Those capable, those in the very upper-ring, are just so capable that they surpass the very flaws of the test itself.

--
My page.

Re:According to the blog by johncadengo · 2007-06-17 06:00 · Score: 1

There you actually have to think rather than regurgitate what has simply been given to you.

That's the most widespread myth about school there is. Everyone has pride in their own field (say mathematics) and they claim that in their own field they must actually think while all other fields (I've heard said about history, law, medicine, etc) are simply regurgitation. That's a lie. There isn't a skilled field in ANY industry (name anything) where you mustn't think. Even a plumber is capable and obligated to think. There is problem solving in almost anything. But there are so many people who raise their own discipline far above all others and think that almost anything else is regurgitation or learned through repetition or is simply second nature only because they took one or two classes in high school and that's all their first-hand experience has taught them. My best friend is in Med school and trust me, they think.

--
My page.

That's not the only mistake by DingerX · 2007-06-16 19:50 · Score: 1

All the math in that paragraph is off. Not only should it read 26.5 in the first case, the other case should be 1+(99/2) - ((99/2)/2) (also known as 1+99/4) = 25.75, so that the "penalize fractionally for wrong answers" scheme gives a result where the test results are even more obscured by noise. (see snippet below)

So it's not just a "Typo" that distracts, it supports a completely faulty conclusion.

One is left wondering what kind of mathematics background the author had. Also, noting the dittography earlier ("question question"), whether proofreading or "checking your work" formed part of the author's training.

In any case, the post also assumes that test-makers don't spend an awful lot of time validating their tests; so instead of taking the rules from any given test, a couple of straw-men examinations are supplied.

Consider, for example, the case where a multiple-choice test featuring 4 possible answers penalizes wrong answers by one third of a point. In that case, a blind guess is worth nothing on average, and guessing only becomes advantageous once the examinee can eliminate at least one answer: hence "partial knowledge" can count for something.
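
A rough sketch of that four-answer case, assuming one point for a right answer, minus one third for a wrong one, and a uniform guess among whatever answers have not been ruled out. The helper `guess_value` is hypothetical.

```python
from fractions import Fraction

def guess_value(choices=4, eliminated=0, penalty=Fraction(1, 3)):
    """Expected points from guessing uniformly among the remaining answers."""
    remaining = choices - eliminated
    p_right = Fraction(1, remaining)
    return p_right - (1 - p_right) * penalty

# Expected value of a guess after ruling out 0, 1 or 2 of the 4 answers.
for eliminated in range(3):
    print(eliminated, guess_value(eliminated=eliminated))
```

With nothing eliminated the expected value is exactly zero, and it turns positive as soon as any answer can be ruled out.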

Oh yeah, and who the heck said that the test was "hard" because most of the answers were unknown? Heck, if you look at your big standardized tests (such as the SAT), and just the multiple-choice parts, you'll find that, for those who take the test more than once, there's not much noise in their scores. So why should a medical exam be different?

For True-False exams for example, the number subtracted would most likely be (Number Wrong ÷ 2). Let's see how that would work out, for the sample case above. You, answering two questions correctly and guessing at 98 would be likely, on the average, to get 49 wrong, and so have a final score of 2 + 49 - (49 ÷ 2), or 75.5, while I, again on the average. answering only 1 correctly and guessing at 97, would get a final score of 1 + (97 ÷ 2) - ((97 ÷ 2) ÷ 2)), which comes out to be 25.25. Here there is a substantial difference between our scores, closer to the two-fold difference in our actual knowledge.
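
Redoing the quoted true/false example as a sanity check. This is a sketch that assumes 100 questions, that exactly half of all guesses come out right on average, and that half a point is subtracted per wrong answer; `expected_score` is a made-up name.

```python
from fractions import Fraction

def expected_score(known, total=100, choices=2, penalty=Fraction(1, 2)):
    """Average score: known answers plus guesses, minus the wrong-answer penalty."""
    guessed = total - known
    right = Fraction(guessed, choices)   # guesses that land on average
    wrong = guessed - right
    return known + right - penalty * wrong

print(float(expected_score(2)))  # 26.5
print(float(expected_score(1)))  # 25.75
```

So the penalized averages differ by 0.75 of a point, against 0.5 with no penalty.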

Multiple choice is bad to test knowledge anyway by aepervius · 2007-06-16 19:51 · Score: 2, Insightful

I loved the exams we had: a question was posed or a problem stated which required the knowledge we had learnt to solve it. Sometimes more than one question was asked to offer a lead. But no answer was given. Those are real tests. Applied knowledge. Usually, for multiple choice, with a very basic knowledge of the subject you can sort out, for many questions, the most probable response. This is how I breezed through my English multiple-choice exams at the university, and hell, look at how bad (or how good ;)) my English is. Face it, multiple choice might be an easy way out for professors to correct exams, but it is the poorest choice to test the knowledge and reasoning ability of the student.

--
C. Sagan : A demon haunted world:
http://www.amazon.com/gp/product/0345409469/
visit randi.org

What? by Bastard+of+Subhumani · 2007-06-16 19:57 · Score: 1

Suppose the test is so hard that I, with lesser knowledge, can only answer one question based on actual knowledge. I answer that question, and guess at the other 99. You, who know twice as much as I, can answer two questions based on knowledge. So you guess at 98 answers.

As you can readily imagine, the odds of you getting a higher grade than I are very slight. In fact, over 45 percent of the time, in repeated trials, I would outscore you, even though my knowledge is half that of yours.

I'm confused (or he is). Assume he's talking directly to me, i.e. I'm the guy who knows twice as much as him.

In the long run he will score 1 + ( 99 * 0.5 ) = 50.5.
My expectancy is 2 + ( 98 * 0.5 ) = 51.

Seems I score more.

--
Only three things are certain; death, taxes, and apocryphal quotations - Ben Franklin.

Re:What? by antifoidulus · 2007-06-16 20:02 · Score: 1

In the long run yes, but that is pretty meaningless if the test is offered only once a year....

--
Monstar L
Re:What? by karmatic · 2007-06-16 21:26 · Score: 1

Ultimately who "wins" a test is dependent on who guesses more correctly - you or him.

His assertion is that over 45% of the time, he will guess enough correct to exceed your score. A little under 55% of the time, you will win.
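
The head-to-head odds can be computed exactly instead of taken on faith. A sketch, assuming 100 true/false questions (the 0.5 used in the parent's arithmetic), one known answer for him and two for you, and independent coin-flip guesses on everything else; `outcome_probs` is a hypothetical helper.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k successes in n independent guesses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def outcome_probs(known_a, known_b, total=100, p=0.5):
    """Return (P(A > B), P(tie), P(A < B)) when both guess the rest."""
    na, nb = total - known_a, total - known_b
    pa = [binom_pmf(na, k, p) for k in range(na + 1)]
    pb = [binom_pmf(nb, k, p) for k in range(nb + 1)]
    win = tie = lose = 0.0
    for i, x in enumerate(pa):
        for j, y in enumerate(pb):
            a, b = known_a + i, known_b + j
            if a > b:
                win += x * y
            elif a == b:
                tie += x * y
            else:
                lose += x * y
    return win, tie, lose

win, tie, lose = outcome_probs(1, 2)
print(f"he wins {win:.1%}, tie {tie:.1%}, you win {lose:.1%}")
```

Ties are counted separately here, so "he outscores you" and "you outscore him" need not add up to 100%.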
Re:What? by SEE · 2007-06-16 22:34 · Score: 1

Consider the case where both of you know none of the answers. In that case, both of you will have an expected value of 50, right? So over time, you have identical scores.

However, on any single given test, you will both get a random number of them right, ranging from 0-100. You'll almost never score the same on a single test; you'll almost always have different scores on any individual test, even though over many tests your average scores are identical.

Now, to make the point the original article writer didn't make well:

Let's consider a test 100 questions long, 4 answers per question, only right answers counted, with 35% as a passing grade.

Now, let's take eight guys of differing levels of ignorance. Arnold knows zero answers; Bob knows 5; Chuck knows 10; Dave knows 15; Ed knows 20; Frank knows 25; George knows 30; and Harry knows 35. On all questions they don't know, they make a completely random guess, since there's no penalty for guessing.

Arnold passes 1.6% of the time.
Bob passes 8.9% of the time.
Chuck passes 30.8% of the time.
Dave? He passes the test 66% of the time.
Ed gets by with a pass 92.6% of the time.
Frank passes 99.6% of the time.
George passes just shy of 100% of the time.
Harry, of course, passes 100% of the time.

Now assume each participant is allowed to take the test three times in an effort to pass. In that case:
Arnold passes 4.7% of the time.
Bob passes 24.4% of the time.
Chuck passes 66.9% of the time.
Dave passes 96.1% of the time.
Ed, Frank, and George all pass just short of 100% of the time.
Harry still passes all of the time.
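
For anyone who wants to check figures like these, a minimal sketch of the calculation. It assumes 100 questions, four answers each, a pass mark of 35, pure random guessing on every unknown question, and fully independent retakes; `pass_probability` and `pass_within` are made-up names.

```python
from math import comb

def pass_probability(known, total=100, choices=4, pass_mark=35):
    """Chance of reaching the pass mark when every unknown answer is guessed."""
    guesses = total - known
    needed = max(pass_mark - known, 0)   # lucky guesses still required
    p = 1 / choices
    return sum(comb(guesses, k) * p**k * (1 - p)**(guesses - k)
               for k in range(needed, guesses + 1))

def pass_within(attempts, single):
    """Chance of at least one pass in several independent attempts."""
    return 1 - (1 - single)**attempts

people = [("Arnold", 0), ("Bob", 5), ("Chuck", 10), ("Dave", 15),
          ("Ed", 20), ("Frank", 25), ("George", 30), ("Harry", 35)]
for name, known in people:
    single = pass_probability(known)
    print(f"{name}: {single:.1%} once, {pass_within(3, single):.1%} in three tries")
```

The single-attempt figure is just a binomial tail; the three-attempt figure follows from it, which is why a small edge per attempt compounds so quickly.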

Our implied standard here seems to be that knowing ten of the answers is "good enough", since most people of that standard get admitted. And it seems to be that knowing five of the answers is not "good enough", since most people of that standard get failed. But we still are letting in almost a quarter of the people who know only five answers and keeping out almost a third of the people who know ten -- and we're doing it totally at random. And we're letting one in twenty of test-takers who have no knowledge of the subject pass, while flunking one in twenty-five who can answer fifteen questions reliably.

That's a pretty lousy filter, no?

There are lots of known ways of fixing it, so that it works better. First, you could make the test "easier" and raise the number right needed higher, so knowledge would translate more strongly into passes. If Bob starts with 30 of 100 and Chuck with 60 of 100, chance on a hundred questions will cover the deficit less often. You won't distinguish Ed from Harry very well, but both of them were passing (nearly) all the time anyway. Second, you could add more possible answers, so guessing doesn't work as well -- say, six instead of four. Third, you can add penalty points for wrong answers -- with six, count every wrong answer as -0.2 of a point, so on average completely random guesses work out to zero points (the same as leaving them blank). All will help create a sharp distinction where Bob has little chance and Chuck is almost certain to pass.
Re:What? by Hognoxious · 2007-06-16 22:48 · Score: 1

But you won't always get half of the unknowns right and nor will he. It only takes pure dumb luck on one question to tip the balance. That's the blogger's point. I think.

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."

The real problem with multiple choice tests by ctwxman · 2007-06-16 20:01 · Score: 1

After a thirty plus year break, I took 53 additional distance learning college credits to complete a certification. All of our quizzes and tests were multiple choice. The biggest failure of the tests were the answer choices! Because writing skills are so poor, the majority of the tests had at least a few questions and answers which didn't say what the professor thought they said. As a student, I often had to choose between what I knew he meant and what he actually said.
A close second was counting negatives. To make easy concepts more difficult on tests, professors would often throw in layers of negative concepts (which of these isn't...). As I took the test, I'd count on my fingers while saying negative, positive, negative, positive ad nauseam. Once, I counted five negatives in one question and correct answer.

This wasn't testing my knowledge of the subject being taught. It was just seeing how well I could parse.

Of course essay tests are much more difficult to administer, though they are better indicators of your grasp of the subject.

Re:The real problem with multiple choice tests by mh1997 · 2007-06-17 00:54 · Score: 1

The biggest failure of the tests were the answer choices! Because writing skills are so poor, the majority of the tests had at least a few questions and answers which didn't say what the professor thought they said. As a student, I often had to choose between what I knew he meant and what he actually said.
In a previous life, I wrote thousands of questions for standardized tests. What I have found to be the biggest problem in four-choice multiple choice tests is that the test-writer often develops three distractors that are so poor that nearly any student can eliminate one or two without any knowledge of the question.
Many times I look at my daughter's tests that she brings home and without reading the question, can pick the correct answer.
If a teacher develops his/her own multiple choice tests, at least make them 6-choice tests. Add an "E - All of the above" and a "F - None of the above" to every question. That also aids in making a more random answer key - before writing a single question, roll a 6-sided die once for each question, record the results, write the questions, distractors, and answers, and put the answer where indicated by the roll of the die.
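
The die-rolling step is easy to do in software if no die is handy. A small sketch, assuming six answer slots labelled A to F; the function name `answer_key` and the optional `seed` parameter are illustrative only.

```python
import random

def answer_key(num_questions, sides=6, seed=None):
    """Roll one die per question and map each roll to an answer slot."""
    rng = random.Random(seed)            # seed only to make a key reproducible
    letters = "ABCDEF"[:sides]
    return [letters[rng.randint(1, sides) - 1] for _ in range(num_questions)]

# Generate the key first, then write each question to fit its slot.
print(answer_key(10))
```

Rolls that land on E or F mean the question has to be written so that "All of the above" or "None of the above" is genuinely the right answer.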

I'll tell you about hard tests... by NerveGas · 2007-06-16 20:04 · Score: 1

I had a physics professor for two entire physics series. This man was... a machine. He was VERY intelligent, and was a VERY good teacher. He was, however, quite anal. He would not expect you to know things he hadn't taught, but he expected you to know what he had taught with *perfect* mastery.

He provided copies of all former tests, along with answers and how to solve them, to the local copy store for students to buy (amazingly, this prof DIDN'T try to take you on them, the only cost was of the copies). The tests changed VERY little over the years. Two nights before the exam, you were welcome to go to a study session, where he would take problems VERY similar to what would be on the test, and walk you through solving them. And he would let you take a 4x5 card into the test with you, with anything you wanted.

His tests consisted of three questions. Just three. At the end of the allotted hour for the exams, the majority of the class would NOT have finished. Those tests were *tough*. I also had a calculus professor who would give exams that consisted of just two problems, and few people finished them completely. That wasn't so much that he gave hard problems, just problems that took a lot of work to solve. That almost seems backwards, since the point of calculus is to make difficult problems easier... or at least possible, anyway. But he was VERY generous with partial credit.

--
Oh, you're not stuck, you're just unable to let go of the onion rings.

Re:I'll tell you about hard tests... by OverlordQ · 2007-06-16 20:51 · Score: 1

Unless you're dealing with one of the odd branches of Calculus, having taken Cal I and II, and looking through Cal III, I haven't seen anything that should take you an hour to solve, nor anywhere close.

--
Your hair look like poop, Bob! - Wanker.
Re:I'll tell you about hard tests... by Glowing+Fish · 2007-06-16 21:26 · Score: 1

You can make calculus harder, I imagine, by throwing in lots of complicated arithmetic. Not that that makes it conceptually harder, but it is more to keep track of, and will trip people up.

--
Hopefully I didn't put any [] around my words.

Doctors jokes by Shorts+Eater · 2007-06-16 20:11 · Score: 1

What do you call a person who graduated at the bottom of their class in medical school?

A doctor...

It's a joke. Before you say anything, a doctor isn't a doctor until they pass the medical board tests.

--
Don't allow yourself to dream away time. Be productive. -- Some fortune cookie

all multi choice questions suck , bad design by cheekyboy · 2007-06-16 20:14 · Score: 1

Look, it's all bad 19th century design.

If the question said, pick the 'odd ones out', each worth n%, it's better.

There is no wrong or right unless the answer says so. But did the person designing the questions have a degree in writing/psychology/reading as well?

It's easy to know who is a rope learner, vs a true genius; even Hawking flunked a lot at school.

--
Liberty freedom are no1, not dicks in suits.

Re:all multi choice questions suck , bad design by janrinok · 2007-06-16 21:09 · Score: 1

Did you mean 'rote learner', or does the phrase 'rope learner' mean something where you live?

--
Have a look at soylentnews.org for a different view

Other fallacies by Cafe+Alpha · 2007-06-16 20:18 · Score: 1

He's right as far as it goes that a multiple choice test where the recipients know almost none of the answers is not very accurate at measuring their marginal knowledge.

However in my experience, hard multiple choice tests have a different problem..

"hard" can mean that you compare against a curve that's known for that particular test and that the curve has a long enough upper tail to seem to measure something at the upper end. Ie, the last couple of questions as you approach 100% are worth more than the questions before them.

The problem with that is that it seems to me that a common way to make a test have that longer upper tail is to make some of the questions ambiguous bad questions. If there are 10 questions on a test that are poorly designed where a knowledgeable person is likely to pick a "wrong" answer, then you can count on it that VERY few people will get all of the "right" answers. Instant "hard" test!

I find Mr. Feldzamen's post hard to believe. by mbstone · 2007-06-16 20:57 · Score: 5, Interesting

Mr. Feldzamen claims to have passed the Virginia bar exam, but I can't find any evidence he was ever admitted to the Virginia bar, or to any state bar (he's not in Martindale-Hubbell). He cites the Virginia bar exam -- which I also passed (IAAL, licensed to practice in CA and VA) -- as one of his examples of a "complete fraud." In fact, when I took the Virginia bar exam it had over a dozen one-hour essay components, testing each and every possible subject. By contrast, the California bar exam had essay tests covering six randomly chosen subjects out of a possible 15 or so, and it had other non-multiple-choice components. The multiple-choice section of every state's bar exam, the Multistate Bar Exam, is no walk in the park. So I don't understand how he includes bar exams in his claim that the tests are invalid. If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeed.

Re:I find Mr. Feldzamen's post hard to believe. by nagora · 2007-06-16 22:19 · Score: 4, Insightful

If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeed.
It doesn't actually suggest anything other than 50% of people that apply pass. I can design an exam which is very easy; I then say that only 50% will pass. It could be that the "cut" is anyone who scored 9+ out of ten will pass and everyone else fails. Or I could flip a coin. The pass rate is no guide to how hard an exam is nor how good a test of the candidates' abilities. It might be both hard and rigorous, but you can't infer that just from the pass rate.
TWW

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Re:I find Mr. Feldzamen's post hard to believe. by foobsr · 2007-06-17 06:15 · Score: 1

The following members of the Virginia State Bar have been suspended for failure ... Alvin Norman FELDZAMEN, Ithaca, NY. ... (Google is your friend)

Finding the whole thing a little strange, to say the least, I did a little research and suspect a personal tragedy.

CC.

--
TaijiQuan (Huang, 5 loosenings)
Re:I find Mr. Feldzamen's post hard to believe. by dkleinsc · 2007-06-17 08:38 · Score: 1

If you actually follow links from Google, you will discover that he was suspended for failure to pay his dues to the bar. Also, his NY address suggests that he might have decided to no longer practice law in the state of Virginia.

--
I am officially gone from /. Long live http://www.soylentnews.com/
Re:I find Mr. Feldzamen's post hard to believe. by Old+Wolf · 2007-06-17 12:42 · Score: 4, Informative

Did you actually read the article? His whole point was that the multi-choice test is invalid because it is too hard.

Disturbing by bryan1945 · 2007-06-16 21:02 · Score: 4, Insightful

I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer almost all of my tests were long answer. Sure, some multi questions, but mostly show all your work or explain the whole process. And I just design systems and networks! Now someone can just luckily guess enough multiple choice questions and start slicing me up?

Like I said, disturbing.

--
Vote monkeys into Congress. They are cheaper and more trustworthy.

Re:Disturbing by nomadic · 2007-06-17 02:03 · Score: 2, Insightful

I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer almost all of my tests were long answer.

It's done the exact same way for engineers as doctors and lawyers; what they're talking about here is the professional licensing exam, not the exams given in school. The exams in law school (and I believe medical school) tend not to be multiple choice.

Law school exams, for example, tend to revolve around very long, very hard, very convoluted essays. They also are generally 3-4 hours long, and you're writing that entire time (and you inevitably run out of time, your goal is to get as much down on the paper as you can before time runs out)

From what I understand the professional engineering licensure exam is multiple choice as well.
Re:Disturbing by Brother+Seamus · 2007-06-17 02:04 · Score: 3, Informative

Almost all of the Professional Engineering certification exams in the United States are multiple choice, with no penalty for guessing incorrectly.
Re:Disturbing by CmdrPorno · 2007-06-17 03:36 · Score: 2, Informative

I am currently studying for the bar exam (at the end of July). There is a one-day-long multiple-choice component in most states (including mine), a standardized national test. Every state that I know of also has at least a one-day-long essay component that is graded by an actual human, and many states also have a one-day-long performance test. So admission to the bar is not governed by just a multiple choice test.

--
Sent from my iPhone
Re:Disturbing by PPH · 2007-06-17 10:42 · Score: 1

A human body, like the Internet, is just a bunch of tubes.

--
Have gnu, will travel.
Re:Disturbing by bryan1945 · 2007-06-17 17:22 · Score: 1

I apologize for my lack of info- I have not taken the PE test. It is good to know that general tests still include essay questions.

Thank you all.

--
Vote monkeys into Congress. They are cheaper and more trustworthy.
Re:Disturbing by bryan1945 · 2007-06-18 06:33 · Score: 1

No, I am not a PE. Nor am I a doctor or a lawyer. Which was my question. I assume from your attitude you are a PE, a doctor, and a lawyer with many years of experience in each. Oh wait, you are an AC which means you probably are some idiot who does nothing.

As for professional, I mean that I make my living in doing engineering things. I have no need to get a PE because I do not now feel the need to start my own company, nor do I feel like studying the 3 or 4 other types of engineering (besides electrical) that I would need to learn to pass the PE exam.

Just wondering, why are you so cranky?

--
Vote monkeys into Congress. They are cheaper and more trustworthy.

Choices by bryan1945 · 2007-06-16 21:09 · Score: 2, Funny

A person has heartburn, do you:

A) Perform a colonoscopy
B) Perform open heart surgery
C) Tickle him
D) Fart
E) Refer him to CowboyNeal

I'm going to Mexico for my next check up. At least you'll get tequila first....

--
Vote monkeys into Congress. They are cheaper and more trustworthy.

Re:warning moronic blog post linked by Looshi · 2007-06-16 21:11 · Score: 3, Informative

I just skimmed TFA, but it seemed to me like he was advocating a guessing penalty.

Re:give us a test on anything by zmollusc · 2007-06-16 21:30 · Score: 1

Hah! Ok, how about a test on soap operas or celebrity trivia. Or sports?

--
They whose government reduces their essential liberties for temporary security, receive neither liberty nor security.

Some get it right by AceJohnny · 2007-06-16 21:35 · Score: 1

In our first year of engineering school (in France. Call it college, for the USA), our math teacher only did multiple choice exams. I was always floored by how accurate the results of those exams were. Of course, all answers counted, and guesses were punished.

The rumor was that he had done his thesis on the subject of multiple choice exams. Sadly, he is retired now, and newer students no longer benefit from his type of quick and accurate exams.

--
Misleading titles? Inflammatory blurbs? Keep in mind that Slashdot is a tabloid.

Chess rankings & IQ tests by Richard+Kirk · 2007-06-16 21:38 · Score: 1

The chess ranking is typically a 3-digit figure. Given two chess players, you can work out approximate odds of one winning from the difference in these figures. The figures are compiled from the games people have won, and the ranks of people they have played against. As in multi-choice tests, each individual question or game has a right (win) and a wrong (lose) solution, and a stalemate (not filling in anything) option. From this we can estimate ranks of people we have not met; we can estimate ranks of people in history; we can even estimate corrections for ranks between cultures. For instance in the 19th century, how might a woman chess player in London (where the culture did not encourage chess) rank against a man from Prague (where cafes typically had chess boards in the tabletop, and most people played with friends and strangers in their lunch hours) had their backgrounds been equal, and assuming a native talent for chess is spread equally? This last point is not obvious - the differences between London and Prague and between women and men may not be wholly cultural, but the others can discuss that ad et ultra nauseam.

No-one designed chess with a perfect solution, and yet we can rank people. The IQ tests started from a similar point. People did not understand what intelligence was, exactly, but if they made tests that seemed to be testing the right sort of thing, and got the best people to design the next round of tests, then it was hoped that an incisive test for intelligence would evolve, even though no-one had defined what intelligence was. Unfortunately, in the early days, what was being tested as 'intelligence' was probably better named "how like minded are you to the white male that designed the test". The test can be as exacting as a chess ranking if you do enough tests, but the figure is less useful because it is not a measurement of something abstract and useful (unless you were IBM in the sixties, looking for white males with short hair that would sing the company song, in which case it was perfect).

There is a further downside to IQ tests. If you sit and stare at them, you can often reason a second or third possible answer using different reasoning. I also have a problem with forms that means I think long and hard about the answer, and then tick the wrong box. The trick seems to be to work really quickly, and let your instincts drive your answer. I have only ever done about three IQ tests, and all of these were done ages ago for job applications to computer companies. The last one must have been twenty five years ago. We had two hours and 300 questions. I deliberately hammered through the questions, and handed the paper in after 40 minutes to avoid the temptation of fiddling with the answers. Incredibly, this seemed to cure my usual error rate with forms, and I got a perfect score. I wasn't any smarter that day - I just happened to be in the zone, I guess.

Didn't get the job, though. They thought I was too scary.

Thinking about the original post, though. The guy claimed that a multi-choice was 'fraudulent'. Isn't 'fraud' where someone is trying to deceive someone else? Multi-choice questions are an attempt to separate the test for the presence or absence of knowledge from the talents of presentation (good handwriting, confident presentation style, etc), but often flawed by laziness in trying to pass off examination skills to a computer. A good multi-choice questionnaire would have to be much longer than such tests usually are to reliably separate the thing you are trying to measure from the noise (think how much measurement goes into a chess ranking, for example). But 'fraudulent'? And this was supposed to be a legal exam? I have my doubts about the original posting. Interesting subject, though.

Re:Chess rankings & IQ tests by An+Onerous+Coward · 2007-06-17 03:11 · Score: 1

I remember once that the "series" questions on the math portion of the ACT (you know, the questions that went "1/2, 1/4, 1/8, 1/16... what is the next number in the series?") were coming under fire because -- theoretically, anyhow -- there should be an infinite number of correct answers.

--
You want the truthiness? You can't handle the truthiness!

Easy to solve by WaZiX · 2007-06-16 21:40 · Score: 1

Just make it 10 possible answers... with 5 or 6 quite obviously wrong answers... for someone who studied...

Therefore the more "educated" the guess, the higher the probability of a reward...

Rubbish (unfortunately) by Hauberg · 2007-06-16 21:40 · Score: 1

The scenarios sketched in the post are not exactly close to the real world. Multiple choice tests, no matter how hard, can easily be constructed so the probability of passing the exam simply by guessing is insignificant. For example:

There are 20 questions and you have to be correct on at least 16 of them. If there are just two options on each question, your chances of passing by guessing are about 1:170; if there are four options, your chances are about 1:2600000.

If you are interested see http://en.wikipedia.org/wiki/Binomial_distribution
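
Those odds can be checked directly from the binomial distribution. The sketch below is a minimal illustration, assuming every answer is a pure, independent guess with equal chance for each option; the function name prob_pass_by_guessing is made up for the example.

```python
from math import comb

def prob_pass_by_guessing(n_questions, n_needed, n_options):
    """Probability of at least n_needed correct answers from pure guessing.

    Assumes each guess is independent and hits with probability 1/n_options.
    """
    p = 1 / n_options
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(n_needed, n_questions + 1)
    )

# 20 questions, pass mark of 16, with two or four options per question.
for options in (2, 4):
    prob = prob_pass_by_guessing(20, 16, options)
    print(f"{options} options: about 1 in {1 / prob:,.0f}")
```

Under those assumptions the results come out at roughly 1 in 170 for two options and roughly 1 in 2.6 million for four, in line with the figures quoted in the post.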

blind rage mode by acidrain · 2007-06-16 21:42 · Score: 1

Sadly it was kdawson who posted this turd. (Or did I miss the memo about trolling slashdot with misinformation that seems to be circulating.)

Summary: 2 + 49 - (49 / 2) = 26.5, not 75.5 as per the article.

I want the minutes of my life spent reading this back. The author rattles off a bunch of crap about his credentials, including math credentials, and then rolls out some bs which pretty much amounts to admitting he is trolling the slashdot submission queue.

Anyone want to refer me to less dumb versions of this site? At this point I'm just waiting for Jon Katz to start posting again.

--
-- http://thegirlorthecar.com funny dating game for guys

Re:blind rage mode by Aladrin · 2007-06-16 22:01 · Score: 1

He was -guessing-. :D -whoosh-

--
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM

Evil profs rock. by Cordath · 2007-06-16 21:46 · Score: 2, Interesting

Funnily enough, one of the most hardassed profs I ever had also taught the introductory assembler class. (except for us it was PDP-11 and 68K) His tests were legendary for their difficulty, and the average was somewhere in the 20-30% range. However, it was curved after the fact and was a perfectly valid exam since there was absolutely no opportunity to guess. He gave us self-modifying assembler code too, without telling us such a thing was possible in advance! He also had a unique way of assigning readings. He would say, "Have you read chapter X yet? If you haven't, you're screwed!" Still, despite his apparent sadism we did learn a lot in his class.

In a later course I had a prof who would run our class through proofs that would span 3 or 4 lectures. If you fell asleep once in that period of time you'd be utterly lost. At the end of his proofs he would often say, "Does this make sense? Does everybody get this? If not, you had better think about dropping the course!" (Somehow it was hilarious in his thick Indian accent. He really rolled the 'r' in dropping too.)

Wrong in all respects by ebcdic · 2007-06-16 22:19 · Score: 1

This article is nonsense from beginning to end. First, as others have pointed out, the arithmetic is wrong. Second, the point of penalising wrong answers is misrepresented: it's nothing to do with improving the accuracy of scoring for different abilities, but is to minimise the difference between those with equal abilities who choose to guess answers they don't know and those who don't. Finally, the model of abilities is completely wrong. A better student will not only know the answer to more questions, but will be more likely to be right on questions he guesses. So far from swamping the true difference, guessed answers add to the accuracy of the test.

Re:Wrong in all respects by jbengt · 2007-06-17 02:14 · Score: 1

Guessed answers increase the variance in test results.
Penalties for wrong guesses improve the test by reducing the randomness of results.
Pure guesses do not add to the accuracy of test results, at least they don't if there is no penalty for wrong guesses.
Well designed tests will allow for better and worse wrong guesses, reducing the number of possible answers in a guess by a knowledgeable test-taker, but that doesn't refute the main point being made in TFA.
TFA does get some of the arithmetic very wrong, but the main point is still valid:
There will be a large probability of inaccurate ranking of test-takers unless the test is very well designed, and allowing guesses without penalty is poor design in a test where the typical passing grade has a low number of correct answers.

Math is all wrong? by mark99 · 2007-06-16 22:25 · Score: 1

Maybe I didn't read it carefully, but I think his math in that last example is all wrong. It seemed wrong to me. I didn't see how that kind of adjustment could amplify the difference the way it did.

I get the following:

The guy who answers 1 correctly and guesses at 99 ends up with an expected score of 1 + 49.5 - 24.75 = 25.75

The guy who answers 2 correctly and guesses at 98 ends up with an expected score of 2 + 49.0 - 24.5 = 26.5

This is obvious, right?
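
The arithmetic above can be reproduced with a short script. The setup is inferred from the numbers in the post rather than stated in it: 100 questions, a 50% chance on each guess, and half a point deducted per wrong answer. The function name and parameters are hypothetical.

```python
def expected_score(known, total=100, p_guess=0.5, penalty=0.5):
    """Expected score when `known` answers are certain and the rest are guessed.

    Assumed model: each guess is right with probability p_guess, and each
    wrong answer costs `penalty` points.
    """
    guessed = total - known
    expected_right = guessed * p_guess
    expected_wrong = guessed * (1 - p_guess)
    return known + expected_right - penalty * expected_wrong

print(expected_score(1))  # 25.75
print(expected_score(2))  # 26.5
```

Under this model, knowing one more answer moves the expected score by only 0.75 points.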

Flawed math in TFA by viking80 · 2007-06-16 22:26 · Score: 1

I wonder where he got his math education. It is fairly simple to show that there exists a mapping between the results on a multiple choice test and "actual knowledge" K=T+|e|, where |e| is the statistical error, accounting for guessing statistically. Subtracting for wrong answers etc. is just "psychology". The statistical uncertainty "e" can easily be reduced below any significant value with more choices and more questions.

The example the author shows maximized the statistical uncertainty of guessing, and is not relevant. To illustrate the point: take the 100 question true/false test.
A) If you give 1 point for correct and no point for wrong, the student will score from S0=50 (randomness) to S0=100 (perfect). Now calculate a new score
S = 2*S0 - 100, and you have results from 0 to 100 (round anything less than 0 to 0).
B) Announce you will subtract 1 point for each wrong. Now you will get scores from T0=0 to T0=100, and your map is just T=T0.
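
A small sketch of the two schemes on the 100-question true/false test, working with expected values only. It assumes the student knows `known` answers for certain and guesses the rest at 50%; the function names are made up for the illustration.

```python
def rescaled_no_penalty(known, total=100):
    """Scheme A: 1 point per correct answer, then map S = 2*S0 - total."""
    raw = known + (total - known) * 0.5   # expected raw score S0
    return max(0.0, 2 * raw - total)      # round anything below 0 up to 0

def expected_with_penalty(known, total=100):
    """Scheme B: +1 per correct answer, -1 per wrong answer, no rescaling."""
    guessed = total - known
    return known + guessed * 0.5 - guessed * 0.5

# In expectation, both schemes recover the number of answers actually known.
for known in (0, 25, 50, 100):
    print(known, rescaled_no_penalty(known), expected_with_penalty(known))
```

This only shows that the two mappings agree on average. It says nothing about the spread of individual scores around that average, which is the statistical error the post refers to.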

--
don't cut it off www.mgmbill.org

The problem there by Moraelin · 2007-06-16 22:41 · Score: 4, Interesting

The problem there is that averages are one thing, but in practice there still is a non-zero chance that he'll actually score higher than you do.

Let's say it's 20 questions, 4 possible answers each. He'll know 5 of those, has to guess 15. There's even a 1 in billion chance that he'll get all 20 right. (4^15 = 2^30 = approx 1 billion.) If you gave that test in China, by now you'd have at least one guy who pulled exactly that stunt.

There's also the issue of how well those questions fit your and his domain of knowledge. Let's say you can't possibly test _all_ the questions, because that's usually the case. You can do it for state capitals, but you can't possibly cover a whole domain like medicine or law.

There are 50 states, you know 25, the other guy knows, say 12 (rounded down), so it's not impossible that the 20 questions are all from the 25 you don't know, but include all 12 that guy knows. In fact, assuming a very very very large domain (much larger than 50, anyway), there's about 1 in a million chance that all 20 questions will be from the 50% you don't know.
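
The two figures above are easy to verify. The sketch below also adds a finite-pool variant for comparison, which is not in the original post: it assumes the 20 questions are drawn without replacement from exactly 50 states, 25 of which you don't know.

```python
from math import comb

# Guessing 15 four-option questions correctly: one chance in 4^15.
guess_all_odds = 4 ** 15
print(f"15 lucky guesses: 1 in {guess_all_odds:,}")

# Very large domain: each of 20 questions independently lands in the
# half you don't know with probability 1/2.
large_domain = 0.5 ** 20
print(f"large domain: about 1 in {1 / large_domain:,.0f}")

# Assumed finite variant: 20 distinct questions drawn from 50 states,
# all of them falling among the 25 you don't know.
finite_pool = comb(25, 20) / comb(50, 20)
print(f"50-state pool: about 1 in {1 / finite_pool:,.0f}")
```

With only 50 states in the pool the odds are far longer than one in a million, which is why the post's estimate needs the "very large domain" assumption.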

Now when testing states that doesn't have a higher moral, because (at least theoretically) all states are equally important. In other domains, like medicine, law, even CS, that's not the case: stuff ranges from vital basics to pure trivia that no one gives a damn about. (Or not for the scope of the problem at hand: e.g., if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)

And a lot of "hard tests" are "hard" just by including inordinate amounts of stuff that's unimportant trivia. E.g., if I'm giving a test for a unix admin job, I can make it arbitrarily "hard" by including such trivia as "in which directory is Mozilla installed under SuSE Linux?" It's stuff that won't actually affect your ability to admin a unix box in any form or shape. The fact that SuSE does install some programs in different directories is just trivia.

(And if that sounds like a convoluted imaginary example, let's say that some "hard" certification exams ask just that: where is program X installed in distribution Y? And at least one version of Sun's Java certification asked such idiotically stupid trivia as in which package is class X, or whether class Y is final. Who cares about that trivia? It's less than half a second to get any IDE to fill in the package for you. E.g., in Eclipse it just takes a CTRL+SPACE.)

And in view of that previous point, including trivia in an exam just to make it "hard" is outright counter-productive. There is a non-null chance that you'll pass someone who memorized all the trivia, but doesn't know the basics.

Not all knowledge is created equal, and that's one point that many "hard" exams and certifications miss. If a lawyer doesn't know the intricacies of Melchett vs The Vatican, who cares? In the unlikely situation that they need it, they can google it. If they don't understand Habeas Corpus, on the other hand, they're just unfit to be a lawyer at all. Cramming trivia into an exam can get you just that kind of screwed up situation: you passed someone who happened to know that Melchett vs The Vatican is actually a gag question, and that case name appears in Stephen Fry's "The Letter", yet flunked someone with a solid grasp of the basics and who knows how to extrapolate from there and where to get more information when he needs it.

Rewarding random guesswork is worse. Probably the most important thing one should know is what he _doesn't_ know, so he can research it instead of taking a dumb uninformed guess. Most RL problems aren't neatly organized into 4 possible answers, so it can be a monumental waste of time to just take wild guesses and see if it works. I've seen entirely too many people wasting time trying wrong guess after wrong guess, instead of just doing some research. E.g., I've actually witnessed a guy trying every single bloody combination between *, & and nothing in front of every single variable in a C function, because he never understood how poin

--
A polar bear is a cartesian bear after a coordinate transform.

Re:The problem there by that+this+is+not+und · 2007-06-17 00:11 · Score: 4, Insightful

Just to pull out a snippet and maybe contribute a bit to topic drift:

if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)

If you ask that sort of question to a prospective programmer, you'll find out more about the person's technical depth, which may be of value. The guy who 'learned Java' because he read it somewhere or an 'advisor' told him it was a way to 'get ahead' is gonna be mister lightweight who is looking for a 'career,' not somebody who is a practitioner who takes a broad approach.

Further, it will help sort the candidates out. The ones who contrive 'fake' knowledge of COBOL can be rooted out and eliminated. Those who are willing to say 'I am not sure I know, but that's an interesting question' get points, those who automatically start thinking about where to find the answer get even more points.

And, of course, the question will help to sift out anybody with actual COBOL knowledge, because anybody with skill in COBOL who is applying for a Java position is obviously an unstable nut.
Re:The problem there by kklein · 2007-06-17 00:21 · Score: 1

Nice post. I agree, but please see my post at: http://science.slashdot.org/comments.pl?sid=238713&cid=19539965 The thing is that tests--important ones, anyway--don't work the way everyone seems to think they work. Individual items are virtually meaningless.
Re:The problem there by vertinox · 2007-06-17 00:46 · Score: 1

And a lot of "hard tests" are "hard" just by including inordinate amounts of stuff that's unimportant trivia.

I would have to agree.

I'm one of those guys that test well, because I am one of those sorts that remember random useless trivia. (Did you know that the croissant originated from the Battle of Tours in the 700's?)

And I would say I aced many of my important tests by using particular test taking methods by crossing off the choices I know aren't true at the moment and come back to the answer later when I'm running low on time and just randomly pick an answer that I know I have a 50/50 chance on.

However, don't ask me to write an English paper or a thesis.

That said, I really can't see any real world application of test taking skills other than game shows and computer troubleshooting (if these conditions exist then only these possible actions could be taken) while most real world issues are highly complex and open ended.

I would argue that the root core of the problems is that the high school and college degree is used for hiring people rather than technical or hands on training.

Sure there needs to be some basic level, but I think college has lost itself in being used for a requirement for a job whereas in reality it has nothing to do with the actual job itself (except for research and science fields which usually involved Universities directly)

I'm not sure if I am making myself clear, but if we made higher education more of a hands on training or something non-ranked rather than a quantitative judgment on whether someone knows their field or not then people would simply learn for the sake of learning and not trying to game tests.

I mean I would love to go back to school just to learn things that I may or may not even use in my regular job (say like astrophysics or Japanese), but the cost of this is too high for me to even really consider.

--
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
Re:The problem there by maxume · 2007-06-17 00:53 · Score: 1

What happens if your boy has to guess seventeen or eighteen times?

Most of the 'serious' multiple choice tests I have taken have had upwards of 200 questions, so worrying about the guy who mysteriously manages to guess his way to perfect isn't necessary.

--
Nerd rage is the funniest rage.
Re:The problem there by Dun+Malg · 2007-06-17 04:09 · Score: 1

will help sort the candidates out. The ones who contrive 'fake' knowledge of COBOL can be rooted out and eliminated. Those who are willing to say 'I am not sure I know, but that's an interesting question' get points, those who automatically start thinking about where to find the answer get even more points.

We're talking about multiple choice tests.

--
If a job's not worth doing, it's not worth doing right.
Re:The problem there by Jaime2 · 2007-06-17 05:48 · Score: 1

But.... It is still trivia if asked on a Java test. A single technology specific exam should not strive to test the breadth of related knowledge. The collection of a person's certifications and experiences will give that bit of information at a quick glance.

If you put COBOL questions on a Java test, you could create a situation where someone cannot prove that they are qualified in a single technology for an entry level job because they can't get the certification in one specific technology simply because they don't know a related technology.

Now, COBOL questions may be relevant on a "Java Integration" exam. But each exam should not try to test everything on earth.
Re:The problem there by DavidHumus · 2007-06-17 05:58 · Score: 1

Maybe even more topic drift, but...

I've noticed that there's a tendency for programmers who put together a technical test to show off, i.e. they ask questions about arcane trivialities about which they happen to know the answer.

When I was in charge of creating a technical exam, I asked three of our technical experts to each come up with three easy questions. The idea was that an exam where the average score is 20 or 30 percent doesn't do a good job of discriminating between candidates (nor does one with an average score of 90 or 100 percent). Since we were testing three different areas - they were something like C/C++, general Unix, and X-Windows - we would go easy on someone who would draw a blank in one of them; also, we should reward informed guesses (this was a verbal, expository test).

I also contributed a single question (which was: What is the first thing to do when optimizing code for performance (=speed of execution)?) which was, in retrospect, perhaps too vague or difficult as fewer than half the candidates came up with a good answer. However, we left it in as it is a good, general question and can be answered well on the basis of general programming smarts and experience.
Re:The problem there by Moraelin · 2007-06-17 10:23 · Score: 1

And if I answered dirname `find / -name run-mozilla.sh` would I get any credit?

That's just the problem: you wouldn't, because it's not one of the pre-defined options.

You illustrate the problem perhaps better than I could. RL skill often means just that: knowing what command to run or what key to press or what book to check to get any obscure information you wish. But that's not what these "hard" multiple choice tests actually test. You either know the piece of useless trivia, or you don't.

In practice, for most distros you wouldn't even need to know how to use the command line to find that out. Most have a search tool right on the bloody desktop or toolbar, so, you know, even mom can find her files. But that's lost to the snake oil... err... certification vendors and such. They just have to have their trivia questions so they can say they have a "hard" test.

--
A polar bear is a cartesian bear after a coordinate transform.
Re:The problem there by huckda · 2007-06-17 17:08 · Score: 1

There's even a 1 in billion chance that he'll get all 20 right. (4^15 = 2^30 = approx 1 billion.) If you gave that test in China, by now you'd have at least one guy who pulled exactly that stunt.
Umm...wrong..because each single PERSON has the 1 in a billion chance at achieving that...
just because 999,999,999 people did NOT get it...does not mean that because you are the 1 billionth person you WILL.

yay for mathematical reasoning.

--
"Just Smile and Nod." --Huck
Re:The problem there by kbielefe · 2007-06-18 05:07 · Score: 1

Yes, but according to the latest population estimates there's a 70.8% chance he is right. I think that probability is a bit high to call him flat out wrong. For employing a useful rule of thumb, his argument is hereby upgraded to "not necessarily correct."
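For anyone who wants to check the 70.8% figure, here is a minimal sketch. It assumes every test-taker guesses independently with the 1 in 2^30 chance from the quoted comment, and it assumes a population of about 1.32 billion. That population number is my own rough 2007 estimate for China and is not stated anywhere in the thread.

```python
import math

def prob_at_least_one(p_single: float, n_people: float) -> float:
    """Probability that at least one of n_people independent tries succeeds."""
    # Equivalent to 1 - (1 - p)^n; log1p/expm1 keep precision when p is tiny.
    return -math.expm1(n_people * math.log1p(-p_single))

P_PERFECT_GUESS = 1 / 2**30    # from the quoted comment: 4^15 = 2^30
ASSUMED_POPULATION = 1.32e9    # assumption: rough 2007 population estimate

chance = prob_at_least_one(P_PERFECT_GUESS, ASSUMED_POPULATION)
print(f"{chance:.1%}")
```

Under those assumptions the result lands close to the quoted 70.8%. Both posters have a point: each individual still has only a 1 in 2^30 chance, while the chance that somebody in the whole population pulls it off is well above one half.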

--
This space intentionally left blank.
Re:The problem there by swillden · 2007-06-18 15:21 · Score: 1

I also contributed a single question (which was: What is the first thing to do when optimizing code for performance (=speed of execution)?) which was, in retrospect, perhaps too vague or difficult as fewer than half the candidates came up with a good answer.
Just out of curiosity, what answer were you looking for?
IMNSHO, the answer I would want to hear is "profile it on an appropriate data set". Any answer regarding examination of the code, or the data structures, or the memory allocation strategy, or network traffic, etc., etc., etc. would be wrong, in my book. I'd give bonus points for someone who began by questioning the question, and first looked to see if the performance problem could be acceptably solved by throwing a reasonable amount of additional hardware at it. Of course, they'd get no credit if they didn't know where to go next once I told them that the problem couldn't be reasonably solved with more hardware.
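To make "profile it" concrete, here is a minimal sketch using Python's standard cProfile and pstats modules. The workload function `slow_sum_of_squares` and the helper `profile_call` are hypothetical names made up for illustration; in practice you would profile the real code on a representative data set.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n: int) -> int:
    """Hypothetical workload standing in for the code being optimized."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()

result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The point of the exercise is that the report, not a hunch about the code, tells you where the time goes.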

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.

Cartel Tests by mdsolar · 2007-06-16 23:41 · Score: 2, Interesting

Actually it has to be a % passing. If the supply of licensed doctors and attorneys were not limited, the costs for their services would reduce, so these exams have to be a part of the system to control the supply. A test may be written to ensure a spread (so it tests knowledge) and also to ensure that the passing score is largely unattainable. So, I think the analysis is incorrect. The tests are not too hard to be useful as tests, it is just that there is a conflict of interest as regards their use. As medical care begins to take on the characteristics of a human right as representation in court is a political right, perhaps we'll begin to see a breaking down of the cartel system so that medical and law educations are not restricted and final competency tests can be tests of competency rather than also being a link in a chain of controlling supply to increase price.
--
Electricity without fuel costs: http://mdsolar.blogspot.com/2007/01/slashdot-users-selling-solar.html

His hard tests are *too* hard by SirBruce · 2007-06-17 00:02 · Score: 1

What I don't like about his reasoning is his assumption that "hard" tests will test substantial knowledge that even the most educated test-taker will not get correct. I would submit that such tests are poorly designed, at least for a final/qualification test.

If your goal is to teach X amount of material and you want to give tests to see how far along a student is at learning X, then such a test is okay, as there will be naturally parts of X that you haven't even taught yet that the student is likely to get wrong. However, if you're now giving a final/qualification test where a good student is expected to know all of X, then the test should test for all of X, and no more. Many of the students should be scoring very close to 100%. In this way, guessing doesn't become a large statistical factor in overall score.

If you administer such a test and even the best students are missing half the questions, either you're testing for more than X, or X was not taught very well. Now, in college classes, we want X from different classes in different years, at least recent ones, to be equivalent; that is, we want the kid who got 100% on the test this year to know as much as the one who got 100% last year. So one needs to be careful about lowering the standards for what qualifies as X knowledge. However, statistically speaking, it's very unlikely for a college or professional class to have a "poor" year where everyone in the class is a poor learner so they can't even reach 100% of X even if taught properly. So if the high score on a final is only 50%, I don't think it's a bad idea to either make the final easier or change one's teaching methods. The chances of it actually "dumbing down" the qualifications relative to previous years are small.

poor analysis, kernel of truth by drfireman · 2007-06-17 00:04 · Score: 1

It's a shame this guy went to the effort of creating this blog post without making any effort to involve useful metrics of how informative a test really is. The words sensitivity, specificity, and variance don't come up at all. There is a kernel of truth here, which is that you can have both noisy items (not informative about the taker's knowledge) and informative items on tests, and hard tests tend to have more noisy items. The author seems to miss the point that two-choice items at which students guess maximize the error variance. In other words, he chooses the best possible case to support his argument, even though it's unrealistic. Five-choice guessing items contribute less, although it depends how the items are structured (if three of the choices are easily eliminated by even the worst students, then it's much closer to a two-choice item). As a thought experiment, if there were a million choices per "hard" item, they would contribute almost no variance to test scores. The article seems to make no reference to the true score variance among the test takers, which is obviously critical.

I would have liked to see an analysis of the relationship between the number of plausible choices per question and the probability of mis-ordering two test-takers (giving the less knowledgeable a higher score). That would have been a lot more informative than simply saying, essentially, "two is bad, you do the math for more -- but trust me, it's a mathematical certainty."
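A rough version of that analysis is easy to sketch. The model below is my own simplification, not anything from the article: each test-taker knows a fixed number of items outright and guesses uniformly at random on the rest, with no penalty for wrong answers. The function names and the example numbers (20 items, 4 known versus 2 known) are hypothetical.

```python
from math import comb

def binom_pmf(n: int, p: float) -> list:
    """Binomial(n, p) probabilities for 0..n successes."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def misorder_probability(n_items: int, known_strong: int,
                         known_weak: int, n_choices: int) -> float:
    """P(weaker test-taker strictly outscores the stronger one).

    Assumes each person answers known items correctly and guesses
    uniformly among n_choices options on every other item.
    """
    p = 1.0 / n_choices
    strong = binom_pmf(n_items - known_strong, p)  # lucky guesses, strong
    weak = binom_pmf(n_items - known_weak, p)      # lucky guesses, weak
    total = 0.0
    for guessed_s, prob_s in enumerate(strong):
        for guessed_w, prob_w in enumerate(weak):
            if known_weak + guessed_w > known_strong + guessed_s:
                total += prob_s * prob_w
    return total

for choices in (2, 3, 4, 5, 10):
    prob = misorder_probability(20, 4, 2, choices)
    print(f"{choices:>2} choices: P(mis-order) = {prob:.3f}")
```

In this toy model the mis-ordering probability falls as the number of plausible choices grows, which matches the million-choice thought experiment above. It says nothing about real tests, where partial knowledge lets people eliminate options.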

There is a kernel of truth here, that multiple-choice tests are often not that sensitive, and that when everyone is guessing on an item, it contributes only noise to the measure. At issue really is how much variance in the test score is explainable by knowledge. In other words, how much information is contained in the test score. An article that uses phrases like "mathematical certainty" and "complete fraud" is obligated to provide some legitimate analysis, or at least references to the literature, not just anecdotes.

He is totally and completely wrong. by kklein · 2007-06-17 00:12 · Score: 5, Interesting

Ugh. I just wrote a pretty polite reply at his page after skimming his idiotic article. Now that I've read it, I'm actually angry.

This guy knows NOTHING about testing. Nothing. He isn't even to the level of Classical Testing Theory (CTT), which is really not much more than means and Pearson correlations, and is nowhere near how high-stakes (and even medium- and low-stakes, increasingly) multiple choice (MC) tests work now, and how they have worked for many many years.

IAAP (I am a psychometrician). A big part of what I do for a living is design a particular MC test, pilot the items, and interpret the results. But I don't just count up the correct items and give you the percentage. Why? Because that would be insane. You can guess on those.

Oh, but he says this:

But suppose the grading attempts to adjust for guessing. There is no way of knowing what is in the mind of the test-taker, so the customary is to subtract, from the number correct, some fraction of the number wrong.

--Which is just fine until I tell you I have NEVER heard of dealing with guessing that way on a professional-level test.

As a general rule, we don't do any easy mathematics. At all.

Here is part of the output for a test I'm working on right now:

Seq  Item   Type  Location  SE     FitResid  DF      ChiSq   DF  Prob
35   I0035  Poly   0.685    0.089   2.239    525.69  15.636  8   0.05
36   I0036  Poly  -1.946    0.165  -0.587    525.69   6.754  8   0.56
37   I0037  Poly   0.02     0.093   2.603    525.69  12.704  8   0.12

This is generated by RUMM2020, a tool for Rasch analysis. The Rasch model was developed in the 60s as an ideal model of item response. These are the stats on 3 items of this test. The two most important columns are Location and Probability.

The location is the item difficulty. Given the sample's performance on this item, and given their ability, how hard is this item? Item 35 is quite difficult; item 36, quite easy.

The probability is the p value for the chi square. Basically, if it's 0.05 or below, that item is operating significantly (statistically significantly, that is) outside of the model. It displays poor "fit." We generally toss these items before going on to the next step (ideally, these are weeded out during pilot testing, before the test goes live--in this case, it is an experimental test of a construct I'm not even sure exists anymore, but I digress). If an item has poor fit with the model, it is too much of a loose cannon, and its results cannot be trusted. This is what the benighted blogger (is there any other kind?) was whining about. That item is hard not because it is good, but because it is evidently stupid. The responses are all over the place, which means people were probably just guessing. Out it goes before it ruins any examinees' lives.

The next step is to get person locations. In the case of people, these numbers indicate the person's ability. This is calculated by looking at their performance on the items, given their difficulty (Which is calculated based on people's performance on them! Incestuous! But given a large enough sample, it all works out to a fine enough grain to be useful). Here is the output for the people:

ID  Total  Max  Miss  Locn   SE    Residual  DegFree  DataPts
1   67     125  125   0.254  0.21  -0.272    123.60   125
2   77     125  125   0.700  0.21  -0.178    123.60   125
3   86     125  125   1.120  0.22  -1.030    123.60   125

So, the first person didn't do so hot; the last did pretty well (these usually top out at 3ish). As you can see in "DataPts," there were 125 items on this test. I started with 160. Do you hear that, Mr. Unexpected "Truths?" We have your back! We're not just handing you a naked score based on our crap items. WE PULL THE CRAP ITEMS.
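For readers who have not met the model, here is a minimal sketch of how a person location and an item location combine, using the simplest (right/wrong) form of the Rasch model. This is an illustration only: the items above are listed as "Poly", so the real analysis is polytomous and more involved. The function name is made up; the locations are copied from the two outputs above.

```python
import math

def rasch_probability(person_location: float, item_location: float) -> float:
    """Dichotomous Rasch model: P(correct) depends only on ability minus difficulty."""
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))

# Locations copied from the outputs above (logit scale).
people = {1: 0.254, 2: 0.700, 3: 1.120}
items = {"I0035": 0.685, "I0036": -1.946, "I0037": 0.02}

for person_id, ability in people.items():
    row = ", ".join(
        f"{name}: {rasch_probability(ability, difficulty):.2f}"
        for name, difficulty in items.items()
    )
    print(f"person {person_id}: {row}")
```

When ability equals difficulty the probability is exactly one half, which is why people and items can sit on the same scale.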

That location score will usually be rescaled to something prettier, since no one would really like to see something like

Re:He is totally and completely wrong. by Ruie · 2007-06-17 00:43 · Score: 3, Interesting

I have not heard about the Rasch model, thank you for the explanation.
I just want to point out that I never saw anyone use it (or anything else similarly complex) in universities, even in math departments. Typically grading the test is just the matter of summing up right answers (perhaps with partial credit) and then chopping the distribution into three parts As, Bs and Cs. A good reason for that is people perceive the grading being fair when they can predict how much answering a particular question will benefit them.
Re:He is totally and completely wrong. by kklein · 2007-06-17 00:50 · Score: 1

Yeah, using it on a classroom-based-test is overdoing it a little, I'd say (Although a colleague of mine does! He's addicted, I think.). But for the kind of high-stakes tests the blogger was talking about, I can't imagine that they aren't doing something along these lines.
Re:He is totally and completely wrong. by colfer · 2007-06-17 02:07 · Score: 1

I've always suspected that statistical test design leads to poorly written questions, or at least can in the hands of the people who use it in practice. But thanks for bringing up a much more important topic than the silly article, whose point is known to anybody who has read the instructions to the SAT (test for 16-year-olds in much of the USA)! Stupid article.

Anyway, say we take the stat method you describe and put it in the hands of people who aren't too smart... or are just greedy. Instead of employing a bunch of good question writers in Princeton, they collect crap questions from a bunch of freelancers working in their spare time wherever, and then just run the sample tests and run the stats and use that to pick the questions. I know I have taken some tests written that way. You end up with confusing questions that do statistically differentiate the takers who score generally well on the middle of the test from those that don't, but that are unfair, in that the answers don't match the question too well, etc. But still the stats work.

Now you work on this full-time, and I'm just voicing my suspicions based on a few data points. I haven't done the math. But I'm very suspicious of replacing intelligence with dumb stats, given the profit motive. Cheaper to hire the less qualified, or it seems that way to management. It might not even be true! You pay for expensive stats, after all.
Re:He is totally and completely wrong. by kklein · 2007-06-17 02:30 · Score: 3, Informative

Well, a poorly-written item will always be out-fitting. If the answer doesn't match the question, then everyone will have to guess. If everyone has to guess, the information curve (a great graph I'd love to show you, but can't here, and need to go to bed) will be about flat. There should be a big hump that shows that it gives us a lot of information about people a certain number of standard deviations above or below the mean. Questions like you describe won't have that.
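Since the graph can't be shown here, a small numeric sketch of an information curve may help. It uses the standard two-parameter logistic form, where information is discrimination squared times P times (1 - P); with discrimination 1 that is the Rasch case. Treating an "everyone guesses" item as one with very low discrimination is my own simplification for illustration, and the function names are made up.

```python
import math

def prob_correct(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Logistic item response function."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def item_information(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Fisher information of a two-parameter logistic item at ability theta."""
    p = prob_correct(theta, difficulty, discrimination)
    return discrimination**2 * p * (1.0 - p)

abilities = [-3, -2, -1, 0, 1, 2, 3]
# A well-behaved item: information peaks where ability matches difficulty.
good_item = [item_information(t, 0.0, 1.0) for t in abilities]
# Assumed stand-in for a guessed-at item: responses barely depend on ability.
noisy_item = [item_information(t, 0.0, 0.1) for t in abilities]

for t, good, noisy in zip(abilities, good_item, noisy_item):
    print(f"theta={t:+d}  good={good:.4f}  noisy={noisy:.4f}")
```

The good item shows the hump around its difficulty; the noisy one stays near zero across the whole range, which is the "about flat" curve described above.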
Also, I wrote about this in another comment, but a lot of the items you get on high-stakes tests don't really go toward your score. They are actually pilot items that the company is trying out a few thousand times to see if they work correctly before they start contributing to anyone's score.
I don't know any cheap item-writers, though. Everyone I know in this field has at least a master's degree, and most have PhDs. We don't come cheap.
As for being a good or bad item writer, there are just a handful of rules to follow to avoid the big blunders. After that, it's all about taking them for a spin and seeing how they handle. As I've said a few times now, there's no telling what will happen when you release these things into the wild. I've written items that I doted on and cared for and nurtured and cuddled and put my all into, fully expecting that they would grow up to be model items, ones that the other items would look up to and aspire to becoming, only to be totally and utterly betrayed by them in real-world piloting, my time and devotion wasted, finally having to drag them out back and shoot them in the back of the head. On the other hand, there are sometimes items you add to a section last-minute, just trying to get the number up for piloting or whatever, and find that you have written some ridiculously wonderful item purely by accident.
It gets easier with practice, though. To be fair, I'm not a very good item-writer. But that is why I, especially, need the stats.
Re:He is totally and completely wrong. by colfer · 2007-06-17 03:41 · Score: 1

Interesting. But couldn't a bad question pilot well? Sorta like neural nets, the system has an intelligence to it that you haven't specifically designed. OK. So the question pilots well for some unknown reason. But a knowledgeable person reads it and hates it because it relies on confusion or quick guessing or whatever. But it works because the right* people tend to answer it right anyway, even though it annoys them and they know it is a bad question. Just sayin...

* For those that don't know anything about this methodology "right" people are those who do well on known good parts of the test, leaving out the easiest and hardest questions, IIRC.
Re:He is totally and completely wrong. by Jeff+DeMaagd · 2007-06-17 04:21 · Score: 1

Typically grading the test is just the matter of summing up right answers (perhaps with partial credit) and then chopping the distribution into three parts As, Bs and Cs.

Is this an example of the places too sissy to hand out a failing grade for crap?
Re:He is totally and completely wrong. by Ruie · 2007-06-17 05:03 · Score: 1

Typically grading the test is just the matter of summing up right answers (perhaps with partial credit) and then chopping the distribution into three parts As, Bs and Cs.
Is this an example of the places too sissy to hand out a failing grade for crap?

It is common in US universities; I never heard of anyone using a substantially different scheme. You can still get an F if, for example, you register for the class, miss all the exams and ignore e-mail from the professor.
It is not so much the inflation of grades at low end that is the problem (I think, by now, most people equate C with an F), but rather that having an A does not imply you have understood the material.
Re:He is totally and completely wrong. by FFFish · 2007-06-17 05:05 · Score: 1

Thank you.

I don't have nearly the education you have wrt statistics, but I do distinctly remember my stats courses enough to know that it is entirely possible to write mc tests that are highly reliable and valid.

--
Don't like it? Respond with words, not karma.
Re:He is totally and completely wrong. by Jeff+DeMaagd · 2007-06-17 08:15 · Score: 1

The school I went to had A through D and then F. Get a D as a course grade, and you generally had to retake the course and get a C or higher to have it count towards a degree. Just showing up to class and taking all the tests isn't enough to pass the class.
Re:He is totally and completely wrong. by kklein · 2007-06-17 10:07 · Score: 1

MAYBE. But I've never seen it. You're right; it's possible; but confusing questions are the very ones that usually turn out to be big ol' steaming turds.
That being said, if the "right" people are getting it right, then it's doing its job, so what's the harm? All a test is really designed to do is categorize people. A real test of knowledge is always going to be unprompted. This is why most graduate programs--and a fair number of bachelor programs as well--require a thesis/dissertation. These tell you exactly what somebody knows. A test cannot do that. It can, however, allow objective comparisons among individuals, which things like papers cannot do.
There are ways, however, to pin test scores and even individual items to real-world competencies, but they are extremely time-consuming.
Re:He is totally and completely wrong. by identity0 · 2007-06-17 17:27 · Score: 1

Care to name the tests that go to such lengths? FWIW, I've not heard of any major tests doing this. I know I was advised to guess on SAT and ACT, and I doubt TOEFL, ASVAB, GRE, GMAT, are much different.

Basically, it sounds like you're talking about a very few tests in niche areas.
Re:He is totally and completely wrong. by kklein · 2007-06-18 01:34 · Score: 1

You got me; I made the whole thing up.
(and that's just ETS)

The worst multiple choice, ever by freeweed · 2007-06-17 00:28 · Score: 2, Interesting

Had an algorithms prof (of all things) give us a test where every question had the following possible answers:

Yes, No, Sometimes, Maybe, Unknown

Then, he had questions like 1. Some scientists believe that P=NP?

To which, of course, you could argue ANY answer is correct...

That being said, this blog post comes across as the usual whining we've all done or had to put up with through the years. No testing methodology is perfect, and everyone tests differently on different kinds of tests. Fact is, though, they're pretty damn good. It's a common belief that millions of people who are otherwise idiots are graduating with great grades, while millions of geniuses can't test well - but that's horseshit. The majority of people manage to test at their level of understanding. The fact that people actually notice the odd idiot who guesses well is the exception that proves the rule.

--
Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.

Re:The worst multiple choice, ever by gr8dude · 2007-06-19 08:16 · Score: 1

Did your prof also take part in the development of Slashdot's tagging system?

--
The saddest poem

Duff math, but... by itsdapead · 2007-06-17 00:38 · Score: 1

He should have stuck with the more useful observation that almost* any test with a very low pass rate will be unreliable.

All tests have a margin of error, although it's a rather taboo subject - when did you ever get a test result that stated the 95% confidence interval? If only a small proportion pass, there is a danger that these errors will dominate.

There are:

- simple "cockup" factors like mistakes in marking, or miscounting of marks;
- systematic factors like the artificial stress of an exam affecting different people;
- sampling errors - you can't test the whole subject, so a proportion of candidates will be "lucky" and get questions that they've drilled for. A low pass rate suggests that the domain you are sampling is sparsely populated and, therefore, that sampling is invalid. Someone who "passes" such a test has only shown competence on a few specific topics.
- deeper flaws in the whole concept of an exam - such as the potty notion that you can represent ability in a complex subject by a single number, and other sweeping assumptions about independence and normal distributions made by the statistical methods used to "calibrate" tests.
- The distorting effect of the exam itself - if it's important, people will learn to do the exam, not learn about the subject.
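As a rough idea of what a stated 95% interval might look like, here is a minimal sketch using the plain normal-approximation interval for a proportion. It assumes the items are independent and equally hard, so it only captures the item-sampling noise from the list above and none of the other error sources. The 50-out-of-100 score is a made-up example.

```python
import math

def score_interval(correct: int, n_items: int, z: float = 1.96):
    """Normal-approximation 95% interval for the true proportion correct."""
    p = correct / n_items
    half_width = z * math.sqrt(p * (1.0 - p) / n_items)
    return max(0.0, p - half_width), min(1.0, p + half_width)

low, high = score_interval(50, 100)  # hypothetical score
print(f"50/100 -> {low:.3f} to {high:.3f}")  # prints: 50/100 -> 0.402 to 0.598
```

Even under these generous assumptions, a 100-item test leaves roughly ten points of slack either way around a mid-range score.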

Now, an issue with multi-choice is the "guessing" problem, but there are (as TFA points out) work-arounds. TFA misses out the most important way of reducing guessing - which is designing the questions carefully so that each alternative is seductive and/or represents a common error. The real problem with multi-choice is the last two bullets above - it really is the most artificial and superficial form of test possible. Done well, it's a good way of quickly romping through a large domain to offset the "sampling" problem, but it should never be the totality of a test. The depressing problem is that it's so easy to mark and administer - and is cheap to deliver on computer (c.f. more ambitious computer-based testing, which is expensive to develop).

*I'm sure it's possible to contrive a counter-example.

--
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.

Re:Duff math, but... by kklein · 2007-06-17 01:30 · Score: 1

Computer-adaptive testing! Another sexy subject!
Basically, all tests really need to be good is a lot of data. When you sit for the GRE, for example, did you know that a fair number of those items actually aren't used for calculating your score (especially true when dealing with computer-adaptive tests, where the computer might be done figuring you out after a remarkably small number of questions)? This is how pro tests get to be so good and so reliable. Every item that figures in your score has been put through its paces with thousands of people, unscored. When it goes from being an experimental item (as in "we don't know what it's going to do in the wild") to a known-variable, then it is used for calculating the score. No surprises.
I think the problem is that most of the tests people have experience with are just off-the-cuff classroom instruments. Since they kinda look the same as the high-stakes tests that really determine one's future, it's easy to assume they are the same. But they are not. There is an army of psychometricians trying to make sure that you get the score you deserve. We don't set out to make hard tests. We set out to make valid and reliable tests.
Re:Duff math, but... by itsdapead · 2007-06-17 03:56 · Score: 1

This is how pro tests get to be so good and so reliable.

As long as you realise that the relationship between "reliable" and "good" is one of correlation, not causation.
Ensuring that a test is measuring some factor in a stable and reliable fashion has, indeed, become a pretty exact science. The danger is that all the impressive statistics this produces will give a false impression of validity and detract from the massive inference that this "some factor" is the thing you are actually purporting to measure - and, of course, if the test is vital, it will Heisenberg the educational process into one that teaches "some factor".
TFA mentioned bar exams. Now, IANAL but I'm pretty sure that representing someone in court does not consist of answering a battery of multi choice questions covering the entire spectrum of jurisprudence. How reliable the tests are - year on year or against other (probably similar) tests does little to support the assertion that the test measures what it claims to measure.
I've encountered math questions where, when you actually observe and question schoolchildren working on them, the actual thought processes involved have nothing to do with what the question purported to test - however, the bright kids cope better and the psychometric analyses see no problem. Best example was a computer-based question for young kids about telling the time where, on observation, success or failure clearly depended more upon fathoming out the question's user interface than telling the time. Looking at the psychometrics gave the question a clean bill of health. Other questions on high stakes assessment test math but in an inappropriate way - e.g. probability questions that require you to multiply two probabilities that, in context, are clearly interdependent; statistics questions that require you to say that statistically insignificant data prove a hypothesis; plus loads and loads of little isolated techniques tested separately with no requirement to "synthesize" them into a credible application.
Psychometric analysis is a valuable tool - when combined with other measures, acknowledgement of the underlying axioms and a healthy dollop of judgement and experience - but what it offers is so seductive in today's pseudo-accountability, management-by-algorithm environment that it can obscure the truth: testing is an inexact science for reasons that go far deeper than the statistics and we should not place so much cultural emphasis on the difference between 49% and 51%.
Computer-adaptive testing! Another sexy subject! Basically, all tests really need to be good is a lot of data.

...QED :-)

--
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
Re:Duff math, but... by kklein · 2007-06-17 10:57 · Score: 1

Yes.
We do our best to maintain construct validity (i.e. testing what we claim to be testing), but sometimes it is as you say. We often catch those later. In my experience, within a particular section, however, if we find that one item just doesn't match up with the rest of the sub-test, we take a look and very often that's what we see. A question that discriminated well, but was drawing on some kind of other, more general, knowledge.
A lot of people's ire towards tests and testing has nothing to do with test design. That's just the variable they think is most important. Actually, what upsets people the most is test usage. Idiotic things like No Child Left Behind are the product of businesspeople or politicians running with a test without understanding what they are good at and what they aren't. Basically, they are good at giving you a broad, objective categorization of individuals. That should always be paired with real-world observations. It rarely is.
Even one of the best tests out there, the TOEFL (Test Of English as a Foreign Language--used to screen candidates for study at US universities, and developed by ETS, the same people who make the GRE), admonishes institutions on the importance of pairing those scores with an interview, or some more specific institutional exam, or SOMETHING. I know of absolutely no schools that do this. There is little we, the developers, can do about this, aside from trying to be as fair and valid as we can be and telling institutions not to do that.
There is, perhaps, light at the end of the tunnel. It is getting easier and easier to incorporate writing tasks with these things. The GRE has replaced its silly brain teaser section with a critiquing and writing arguments section, which has a much better overlap with what you will be expected to do in grad school than the old section. By doing these things computer-based, that text can quickly be sent off to two or more raters for human rating. Again with IRT (my institution uses the program Facets for this), we can look at the relative severity of raters and from that calculate a score for a piece of writing. For some things, this really is the way to go. But it's more expensive as it requires human raters, so only the really big tests can do it.
Of course correlation is not causation, but say you have 3 tests of vocabulary and three tests of listening. You run them all on the same group. Of course there is going to be some overlap, which we can understand to just be general language proficiency, but we would expect to see a much higher correlation within the tests than between them. If we have too much between, then we need to take a closer look at them, because they aren't unidimensional (this is a problem with the TOEIC, but that doesn't stop most of Asia using it to deny people jobs--we have a TOEIC teacher here who has a perfect score, but the English-speaking staff has no idea what she's talking about most of the time, her English is so bad). However, if we see 3 different tasks, all ostensibly measuring the construct of vocabulary, and they are highly correlated, we pretty much have to assume that they are all testing vocabulary, right? Individual items may be floating into some other construct, sure, but taken as a whole, "walks like a duck, talks like a duck."
The problem of construct validity is a tough one, but we are getting better all the time, thanks to advances in cognitive science. As we learn more about how the brain functions, and what kind of tasks light up where in the brain, it's likely that some day not too far in the future, we'll be able to pilot these things on people and see if we're hitting what we are hoping to hit. Granted, that probably will only be true for really big tests, but those are the most important anyway.
Finally, yeah, tests can make mistakes, and people can certainly game tests (I knew a guy from Saudi who I think was actually kind of retarded who took the TOEFL 30 times and finally passed it, because he remembered all
Re:Duff math, but... by itsdapead · 2007-06-18 23:08 · Score: 1

Basically, they are good at giving you a broad, objective categorization of individuals.

Unfortunately - in many fields (particularly school/college) the test is also the de-facto definition of the curriculum. It would be nice if 'twere not so, but it's pragmatically inescapable. Writing a psychometrically valid test is one thing - writing a psychometrically valid test that also exemplifies how the subject should be taught is a bit of a toughie - and might mean sacrificing some of the psychometric rigour.
The real problem occurs when you have the people who understand the subject in silo (A), the people who understand teaching in silo (B), the psychometricians in silo (C), and - for computer based testing - the programmers in silo (D) all passing messages via the Administration. I really, really, really wish that I'd never seen that happen.

--
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.

Re:warning moronic blog post linked by jbengt · 2007-06-17 00:58 · Score: 1

No, his implied definition of a hard test is that most test-takers know few correct answers, but the passing score is lowered to allow a sufficient number of test-takers to pass. Then, if you do the math (which he got somewhat wrong) allowing people to guess without penalty creates a significant probability that someone who knows less than you will pass even though you failed.

faulty logic by mephistophyles · 2007-06-17 01:04 · Score: 1

I read tfa, and aside from the bad math in the last paragraph I also have a problem with the logic of his example. He assumes that someone who knows twice as much as me should score twice as high as me. But that doesn't take into account that twice VERY LITTLE knowledge is still not a lot. I can understand that multiple choice will not be a perfect way of judging someone's knowledge, but think about how many people need to take the exam, and how few people there are to grade them. Not to mention it is impossible to ask long answer questions on every aspect of a course. Multiple choice questions are still not too bad a way of quizzing a large group of people on subject matter that has a large range of subtopics. Assuming of course the questions aren't uber-hard. Damn, this wasn't exactly something I wanted to read 2 days before my exams (which happen to be largely mc-questions)

Why do people have such a hard time with... by mario_grgic · 2007-06-17 01:09 · Score: 1

assumptions. The premise is "IF we had person A that knows 2 times as much as person B, a well devised test ought to score person A twice as high as person B". No one is saying that they found person A and person B, i.e. two people where they can show beyond any doubt that A knows twice as much as B. Scoring A twice as high as B is exactly what an ideal test would do. The problem here is the construction of such an ideal test.

And surely it is possible to construct a test that approaches the ideal without having people A and B. All one has to know about A and B is that A will answer two times as many questions correctly, if they don't answer questions they don't know answers to. So, one can then see how to score tests to reflect that. And the answer is to penalize guessing to some degree, which will depend on the structure of the test.

--
As the island of our knowledge grows, so does the shore of our ignorance.

Agreed... by mario_grgic · 2007-06-17 01:18 · Score: 1

but these tests do not test knowledge (all the people attempting BAR or medical license exam already have degrees). They are devised to cull and decide who "gets in", rather than test knowledge.

It is a naive assumption to think that more knowledgeable should get in it seems :). World doesn't work that way.

--
As the island of our knowledge grows, so does the shore of our ignorance.

Guesswork by noobishness · 2007-06-17 01:22 · Score: 1

I know this isn't a comforting thought, but isn't some of the domain of doctors and lawyers in effect specialized, logical guesswork? For example, many diseases could share a common set of symptoms. Certainly, it takes knowledge, but it also takes a wee bit of luck.

Trolling for stories? by camperdave · 2007-06-17 01:22 · Score: 1

...they are designed to touch on subjects which are likely to get them onto news sites like Slashdot...

So, is that the new sport? First there was First Post, then (before they switched) getting 50 Karma Points. Now we have to get Slashdot to Feature our Third Party Blog?

--
When our name is on the back of your car, we're behind you all the way!

Medical Specialist Exams have an Oral Component by neoshmengi · 2007-06-17 01:30 · Score: 3, Informative

The hardest part of most medical specialist exams is the orals. Nobody ever complains about the written component. You get to sit in a room with one or more examiners for a few hours of intense grilling. There is no way to hide any lack of knowledge and your deficiencies are exposed for all to see.

Also the US has a strange system of certifying specialists. After completing residency (usually based on putting in your hours) you can practice medicine under the appellation 'board-eligible.' Once you've passed your exams, then you can be called 'board certified.'

In Canada, you can't practice at all unless you pass your board (Royal College) exams. The exams are reputedly harder in Canada as well (from those I know who have written both).

I want to know what my students know by DynaSoar · 2007-06-17 01:31 · Score: 3, Interesting

I don't care what they don't know.

I give multiple choice exams with between 100 and 200 questions, and 4 possible answers.
Each correct answer is worth 2 points; they need to answer 50 correctly to get 100.
They don't HAVE to answer any question, or any number of questions. If they can answer 30 questions, they can get a D. Any question answered incorrectly is -1 point. This serves two purposes.
It prevents guessing, and it forces the student to consider whether they actually know the answer, or just think they do.
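
A minimal sketch of the scoring scheme described above, assuming 4 choices per question, +2 for a right answer, -1 for a wrong one, and 0 for a skipped question. The function names are hypothetical and only illustrate why a blind guess loses points on average under this scheme.

```python
from fractions import Fraction

def exam_score(correct, wrong, points_right=2, points_wrong=-1):
    """Total score; skipped questions contribute nothing."""
    return correct * points_right + wrong * points_wrong

def blind_guess_value(choices=4, points_right=2, points_wrong=-1):
    """Expected points from one uniformly random guess among `choices` options."""
    p_right = Fraction(1, choices)
    return p_right * points_right + (1 - p_right) * points_wrong

print(exam_score(50, 0))    # 50 known answers, nothing else attempted
print(exam_score(30, 0))    # 30 known answers, nothing else attempted
print(blind_guess_value())  # negative, so a blind guess costs points on average
```

Because the expected value of a blind guess is negative here, a student does better on average by skipping anything they cannot narrow down.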

I typically give 4 of these per semester. After the first one I usually get several complaints because they're not used to testing in this way. After the second I usually get one or two stating they can't break the habit of answering every question. After the final, I get many compliments and high marks on my evaluations, and the students tell me they are much more confident in what they've learned than from any other class. I've had occasion to run across previous students from years past, and they claim they still remember more from my class than from others.

I've had administrators forbid me to do it this way. I did it anyway. When they saw the results, they relented, and many suggested the process to others.

--
"I may be synthetic, but I'm not stupid." -- Bishop 341-B

Re:I want to know what my students know by canadian_right · 2007-06-17 05:04 · Score: 1

When I went to UBC (University of BC) in the early 80's it was common that on multiple choice exams you lost a point for a wrong answer. I'm not sure that it was official policy, but all the multiple-choice tests I took used this method. This was to prevent guessing. I just assumed this was a widely used technique outside of the high school level.

--
Anarchists never rule
Re:I want to know what my students know by n6kuy · 2007-06-17 06:38 · Score: 1

Why do you care whether they guess?

Why not just grade on a statistical curve?
Those who know the material will be at the high end of the curve.
The guessers will be weeded out.

No matter how you grade though, someone will claim that it's unfair.

--
If you disagree with me on social issues, then it's pretty clear that you are a narrow-minded bigot.

Error in the Math by ThematicDevice · 2007-06-17 01:31 · Score: 2, Insightful

"For True-False exams for example, the number subtracted would most likely be (Number Wrong ÷ 2). Let's see how that would work out, for the sample case above. You, answering two questions correctly and guessing at 98 would be likely, on the average, to get 49 wrong, and so have a final score of 2 + 49 - (49 ÷ 2), or 75.5, while I, again on the average. answering only 1 correctly and guessing at 97, would get a final score of 1 + (97 ÷ 2) - ((97 ÷ 2) ÷ 2)), which comes out to be 25.25. Here there is a substantial difference between our scores, closer to the two-fold difference in our actual knowledge."

Let's think about this: 51-24.5=26.5, not 75.5. Further, knowing one would mean guessing at 99, not 97: 1+(99/2)-(99/4)=25.75. This means the avg. difference when adjusting for guessing moves from .5 (average score of 50.5 vs 51) to .75, hardly a substantial difference. Of course the numbers will separate out at greater levels of knowledge as he showed earlier: if one person can answer 50 and the other 25, the average scores will be 62.5 and 43.75. Now he probably simply didn't check his math, but twice in the same paragraph?
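
The corrected arithmetic can be checked with a short sketch. It assumes a 100-question True-False test where every unknown question is guessed at 50/50 and half a point is subtracted per wrong answer, as the quoted article proposes; the function name is made up for illustration.

```python
def expected_true_false(known, total=100, penalty=0.5):
    """Return (expected number right, expected penalized score).

    Assumes every question not known is guessed with a 50% hit rate.
    """
    guessed = total - known
    right = known + guessed / 2   # known answers plus lucky guesses
    wrong = guessed / 2           # unlucky guesses
    return right, right - penalty * wrong

for known in (2, 1, 50, 25):
    right, penalized = expected_true_false(known)
    print(known, right, penalized)
```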

Re:Error in the Math by sasdrtx · 2007-06-17 13:55 · Score: 1

Finally, somebody else noticed this guy's a fraud himself. His "mathematics background" evidently ended when he flunked out and started a crusade against tests that were too hard for him.

I was afraid I was going to have to write all that up myself. Thanks!

--
Most people don't even think inside the box.

please Mod parent up by jbengt · 2007-06-17 01:38 · Score: 1

Parent post gets the point, and states it better than TFA.

Re:warning moronic blog post linked by An+Onerous+Coward · 2007-06-17 01:48 · Score: 1

I think the original article/clueless-blog forgets to factor in a very important fact: in well-designed multiple choice tests, the test has questions across a wide range of difficulties. Say you have a test with 80 questions, divvied up into four levels of difficulty (easy, moderate, difficult, impossible).

If you've mastered half the "moderate" material, you have an automatic ten question advantage over those who know only the easy material. Even if you're only equally adept at guessing the answers to the other questions (fifty for you, sixty for your opponent), it's very unlikely that your opponent will guess well enough to match your score.

Also notice that he insists on talking about tests with only twenty questions, even though most tests are significantly longer. Short tests certainly offer the best chance for guessing your way to a good score. But the only tests I've ever seen that were that short were the Novell Netware certification tests I took in 1999, and those were adaptive (read: the computer administering the test was selecting questions based on how well you were doing. The longer the test, the more questions you were getting wrong.)

--

You want the truthiness? You can't handle the truthiness!

Multiple choice? by JelloJoe · 2007-06-17 02:08 · Score: 1

A well designed test wouldn't be all multiple choice!

Re:warning moronic blog post linked by An+Onerous+Coward · 2007-06-17 02:25 · Score: 1

There are problems with his analysis. One problem is, the "examples" he cites don't actually exist. I'm guessing that there is no professional test out there where an unqualified candidate would know only one fewer question than a qualified candidate.

The author seems to suggest that a "hard" test is one where every question is brutally hard, and only a true zen master would answer a significant number of questions based on knowledge rather than guessing. In fact, well designed tests try to stagger the difficulty of questions to provide for maximum discrimination between candidates of varying levels of knowledge. There will be a handful of very easy and very difficult questions, but most will be at about "moderate" difficulty (think of it as a bell curve). So in reality, the difficulty of the test is governed primarily by the cutoff score. The higher the score, the less likely the easy questions and judicious guessing are going to save you.

The more knowledgeable candidate should be better at guessing questions he doesn't know. But even ignoring this, even as little as a five question advantage to the more knowledgeable candidate is huge. Say that X knows 25/80 questions, and Y knows only 20/80 questions. Each question has four answers, and each candidate guesses randomly on questions he/she doesn't know. We expect X to wind up with a score of around 38.75, and Y to end up with a score of 35. Even with most of the questions being guesses, there is a 75% chance of X winding up with a better score than Y. If he knows 30 questions, the odds rise to 95%. If he knows 35, the odds surpass the 99% mark. Because the difficulty of the questions is staggered, any two candidates with different amounts of "tested knowledge" are going to have a noticeable difference in the number of questions they can confidently answer.
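
For anyone who wants to check those odds, here is a rough sketch. It assumes both candidates guess uniformly at random among four choices on every question they don't know, that guesses are independent, and that "better" means a strictly higher score (ties don't count). The helper names are invented for this example.

```python
from math import comb

def binom_pmf(n, p):
    """Probability of k successes in n independent tries, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def expected_score(known, total=80, choices=4):
    """Known answers plus the average number of lucky guesses."""
    return known + (total - known) / choices

def prob_x_beats_y(known_x, known_y, total=80, choices=4):
    """Chance that X's final score is strictly higher than Y's."""
    p = 1 / choices
    pmf_x = binom_pmf(total - known_x, p)
    pmf_y = binom_pmf(total - known_y, p)
    win = 0.0
    for lucky_x, px in enumerate(pmf_x):
        for lucky_y, py in enumerate(pmf_y):
            if known_x + lucky_x > known_y + lucky_y:
                win += px * py
    return win

print(expected_score(25), expected_score(20))
for known_x in (25, 30, 35):
    print(known_x, round(prob_x_beats_y(known_x, 20), 3))
```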

It's one thing to claim that most tests would benefit by punishing guessing. But the author goes further, dismissing any test that doesn't punish guessing as utterly meaningless. I don't think that anyone who understands the probabilities involved could honestly describe them that way, "former background in mathematics" or no.

--

You want the truthiness? You can't handle the truthiness!

Assumption that multiple choice is a valid test by bigbigbison · 2007-06-17 02:33 · Score: 1

Ask anyone with a teaching license and they will tell you that there is a lot of debate over what multiple choice tests actually test: knowledge of the material or ability to take tests. There are a lot of educators who argue that essays or portfolios are a more accurate measure of how much someone knows than multiple choice or true or false tests.

--
http://www.popularculturegaming.com -- my blog about the culture of videogame players

How to win at multiple-choice exams by hexatron · 2007-06-17 02:49 · Score: 1

Example with n choices per question,
each correct answer worth c
and each incorrect answer worth -i:

If you have no idea which answer is correct, and (n-1)*i < c, then guess.
Likewise, if you can eliminate some of the answers, so you are only choosing from m possible correct answers, and m < n, then guess if (m-1)*i < c
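
The rule above falls out of the expected value of a guess among m remaining choices, which is c/m - (m-1)*i/m; that is positive exactly when (m-1)*i < c. A small sketch, with hypothetical function names, using exact fractions:

```python
from fractions import Fraction

def guess_expected_value(m, c, i):
    """Expected points from guessing uniformly among m remaining choices.

    c is the reward for a right answer, i the (positive) penalty for a wrong one.
    """
    return Fraction(c) / m - Fraction(i) * (m - 1) / m

def should_guess(m, c, i):
    """The rule from the comment: guess when (m - 1) * i < c."""
    return (m - 1) * i < c

# Four choices with a 1/3 penalty: guessing is exactly break-even, so no gain.
print(should_guess(4, 1, Fraction(1, 3)), guess_expected_value(4, 1, Fraction(1, 3)))
# Eliminate one choice and the same penalty makes guessing worthwhile.
print(should_guess(3, 1, Fraction(1, 3)), guess_expected_value(3, 1, Fraction(1, 3)))
```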

It got me a full-tuition scholarship to an ivy-league school (where I learned to hyphenate adjectival phrases). Your results may vary.

In any case, please be sure to join the slashdotters posting 'you are an idiot' messages on the article's comments. It's important to keep slashdot's reputation as the premier internet home of arrogant assholes.

Two problems with your post by snowwrestler · 2007-06-17 02:50 · Score: 1

Not all knowledge is created equal, and that's one point that many "hard" exams and certifications miss.

That might be less of a problem than you think. See these comments.

If a lawyer doesn't know the intricacies of Melchett vs The Vatican, who cares? In the unlikely situation that they need it, they can google it. If they don't understand Habeas Corpus, on the other hand, they're just unfit to be a lawyer at all.

This is a common misperception about law. It's actually more important to know the laws and cases than abstract concepts, because the concepts are defined solely by specific laws and cases. In applying a concept you must always provide a citation. The best lawyers are those with a giant capacity for remembering specific laws and cases, and applying them to current situations. A general grasp of concepts is useful in writing about law for the general public, but actually not that useful for practicing law.

--
Build a man a fire, he's warm for one night. Set him on fire, and he's warm for the rest of his life.

M$ tests by Joe+The+Dragon · 2007-06-17 02:58 · Score: 1

can be hard, and they don't mean that much: they are easy for people to cram for and pass without having a clue about how to do the work, and they cover things that you do not see / use in the real world, or they do things in a way that is not the best way to do it.

hard tests vs. hard questions by lukesl · 2007-06-17 03:09 · Score: 1

"'The test was very hard,' the medical specialist said. 'Only 35 percent passed.' 'How did they grade it?' I asked. 'Multiple choice,' he said. 'They count the number right.' As a former mathematician, I immediately knew the test results were meaningless."

He/she must not have been a very good mathematician. They're assuming that the reason that only 35% passed is because each individual question was very hard, in which case their argument is correct. However, it's more likely that each question is relatively easy, but to pass you have to get almost all of them right. Since it's impossible to know which of these situations was the case based on what the doctor said, the former mathematician couldn't have known the test results were meaningless. In my experience taking the MCAT, med school tests, and USMLE licensing exams, they're usually composed of many easy questions, and you have to get almost all of them right. That sort of mimics what being a doctor is like, which is that most of the time the diagnosis and treatment is fairly straightforward, but the tolerance for mistakes is very, very low.

Exactly right! by wingome · 2007-06-17 03:36 · Score: 1

As a Math major, 37 years ago, I took an intro to Psych course for some reason. I made a very poor grade on a strange multiple choice exam. Each question had five or six items that "applied" or didn't. You were to circle the ones that "applied". If you missed any part, you were marked wrong for all of them for that question, so one factual error, caused 5 or 6 deductions from 100.

I demonstrated to the very young "professor" that by changing placement alone of various factoids my grade could have been anywhere from a high B to an F. What made matters worse, he declared that the bell shape curve of the outcomes validated his scheme. I couldn't quite get across the bell shape curve that, for instance, loaded dice create.

My adviser agreed with me but stayed out of the dispute. I am still angry!

Re:The crutch there by DavidTC · 2007-06-17 03:39 · Score: 1

Yeah, and what if you don't have a Java compiler?! Tests should require you to demonstrate you can compile into bytecode! And what if you don't have a computer?! You should have to be able to execute said bytecode. What if you're forced to do it in zero-G? Can you do it?

In the real world, 'skill' is 'How well can you use tools to desired effect?', not 'How well can you operate without tools?'. Why? Because when people do things, they use tools. All the tools they can, to make their job easier.

--
If corporations are people, aren't stockholders guilty of slavery?

Suggestion by udippel · 2007-06-17 04:04 · Score: 1

If nobody has suggested this until here:
What about a plurality of answers being potentially correct ?
Let's say 4 alternatives; and 0,1,2,3,4 may be correct.
Now we could consider
- the answer correct in case of all ticks being correct (resp. correctly unticked)
- to allocate partial marks: '+' for correct ticks and '-' for incorrect ones
At least, in both cases guessing will deliver close to nothing.
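
A quick sketch of why blind guessing earns close to nothing under either scheme. It assumes 4 alternatives, a guesser who ticks each box with a coin flip, and, for the partial-marks variant, +1 for each box left in the right state and -1 for each box in the wrong state. Those point values are an assumption, since the suggestion above does not fix them.

```python
from fractions import Fraction
from itertools import product

def blind_guess_outcomes(key):
    """Enumerate every tick pattern and compare it with the answer key.

    Returns (chance of matching the key exactly, average partial-marks score).
    """
    key = tuple(key)
    patterns = list(product([False, True], repeat=len(key)))
    exact_matches = sum(1 for pattern in patterns if pattern == key)
    partial_total = sum(
        sum(1 if ticked == wanted else -1 for ticked, wanted in zip(pattern, key))
        for pattern in patterns
    )
    return Fraction(exact_matches, len(patterns)), Fraction(partial_total, len(patterns))

# Hypothetical key: the first and third alternatives are correct.
print(blind_guess_outcomes((True, False, True, False)))
```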

Re:Suggestion by n6kuy · 2007-06-17 06:27 · Score: 1

Maybe just my experience, but I've never seen a multiple choice test where multiple answers were expected.

I have seen many tests that included something like:

E) answers A,C and D above are all correct.

but answer E) is the only truly correct answer.

--
If you disagree with me on social issues, then it's pretty clear that you are a narrow-minded bigot.

The Blogger should be shot, the replies rewarded. by maurert · 2007-06-17 04:13 · Score: 1

The blogger touts his math/stat skills and then argues that multi choice scores are a fraud. Like many self proclaimed experts, this one falls short. He posts a formula without variables and shows the wrong answer.

Worse as a supposed stats expert, he also quotes the formula for guessing incorrectly.

He doesn't mention that standardized tests that use the "guessing formula" do not require one to guess. If you know only two answers and answer ONLY those questions, there is no penalty for unanswered questions.

Also his extreme examples aren't the ones to support his hypothesis. His primary two examples were both 100 True/False questions. In one "extreme" example, one person knows one answer and the other knows two. In that case, regardless of the math, we know who on average is the more knowledgeable person. His second example on this test compared two people, one knowing all the answers and one knowing half. Again, applying the guessing formula widens the delta, but we still know who's more knowledgeable.

The example that exposes the fraud is a 100 question T/F test where one person knows 50 and marks guesses for the other 50, while the second person knows 64 answers and doesn't guess, leaving 36 questions unmarked. Person 1 is going to average a score of 75, frequently a passing grade, while the more knowledgeable person scores a 64, often a failing grade.

However the blame here lies with the test preparation. If there is no "guessing" penalty for wrong answers, then all test takers should guess on all unknown questions. If all do, then the person that knew 64 answers will on average score an 82, beating the person who knew only 50. If there is a "guessing" penalty for wrong answers, then whether or not the test taker makes blind guesses is irrelevant. As another reply to the blogger points out, knowledgeable people rarely are blind guessers and thus should guess, as they are likely to beat the odds of the guessing penalty.

If there is a fraud, it is if the standard for passing is so low that a person making random guesses can pass the exam one out of three or four times.

Is the math entirely wrong? by kudos200 · 2007-06-17 04:18 · Score: 1

Is the math in this article entirely wrong, or am I just crazy??

For True-False exams for example, the number subtracted would most likely be (Number Wrong ÷ 2). Let's see how that would work out, for the sample case above. You, answering two questions correctly and guessing at 98 would be likely, on the average, to get 49 wrong, and so have a final score of 2 + 49 - (49 ÷ 2), or 75.5, while I, again on the average. answering only 1 correctly and guessing at 97, would get a final score of 1 + (97 ÷ 2) - ((97 ÷ 2) ÷ 2)), which comes out to be 25.25. Here there is a substantial difference between our scores, closer to the two-fold difference in our actual knowledge.

2 + 49 - (49 / 2) is equal to 26.5, NOT 75.5 . . . he added it in one case and not in the other. So the actual scores that should be compared are 26.5 to 25.25. The disparity he saw was entirely from a lack of capability at arithmetic. And in all his examples, the numbers are so close not because of a lack of quality in the testing methods, but because his hypotheticals are so extreme (someone who only knows the answer to 2 questions out of 100 is ridiculous). That said, subtracting for wrong answers is still the most accurate way to grade, but not based on the crap that this guy was talking about.

Re:Is the math entirely wrong? by Vegeta99 · 2007-06-17 05:42 · Score: 1

I'm not sure what he counts as guessing.

On the LSATs, they'll give you a passage and ask you to analyze some analogies and determine which one comes closest to representing the passage.

You cannot be TAUGHT which one is correct - it's all up to your head figuring it out. So, on a test like that, guessing sure helps you out. But does that mean it should be scored differently? Well, no, I don't think so. Why should I be penalized more for picking an answer that is right but not 100% (it's multiple choice - one is completely off base, and 3 are all so damn close, with one doing just a little better job with its language) than for getting it completely wrong?

165 on the Practice LSAT, btw. =)

Please speak up with authority! by Anonymous Coward · 2007-06-17 04:22 · Score: 2, Funny

Well, as a (happily) married man...

Well, since you are much more of an authority in this subject area than most of us on Slashdot, perhaps you could give me some insight on this little conundrum?

If a man is walking in a forest, and he's talking to himself, and there are no women around, is he still wrong?

Re:Please speak up with authority! by Planesdragon · 2007-06-17 04:37 · Score: 2, Insightful

If a man is walking in a forest, and he's talking to himself, and there are no women around, is he still wrong? If he has to ask the question, then yes. If he knows that the rightness or wrongness of his answer is the same regardless of his gender, then no.

Women will date, dream of, and marry men. They do none of those to boys.
Re:Please speak up with authority! by Kelbear · 2007-06-18 01:48 · Score: 1

"Women will date, dream of, and marry men. They do none of those to boys."

Yeah, those are the ones they get to party with and #%&$ instead.

In a perfect world... by islisis · 2007-06-17 04:44 · Score: 1

I just wanted to say, I think that 95 percent of all exams are cop-outs, whether issued so deliberately or just because they were lulled-up that way. This is not including 'take-home' exams. In a perfect world, rather than spend all the resources we have on lawyers, advertising, physical distribution of virtual goods, cash registers, their operators, and who knows what else, we could have more people compensated to learn how to teach and have them teach and spend time assessing students individually or in smaller groups over longer periods of examining, paying attention to who they actually are and what they have to say. And even in this perfect world, there would still be more room for people to become teachers through the returns of what that education gives back. To those who say that machines, computers, paperless offices and trust-based systems take jobs away from people, who might also say that exams are the natural result of logistics, I say - please consider the nature of what education provides a society and how far a human mind can actually go. And please do not give up.

Guess if you know something by ghoul · 2007-06-17 05:01 · Score: 1

The typical college entrance exams I took had 4 choices each and 4 points for a correct answer and -1 for a wrong answer. The thing was in a lot of the cases you may not know the correct answer but if you know something about the subject 1 or 2 of the choices are obviously wrong. Now if you eliminate those and guess amongst the rest your chances are much higher than simply not answering a question you don't know the answer to. Mathematically with 2 choices eliminated you have a .5 chance of guessing right, so an expected value of 4*.5 - 1*.5 = 2 - .5 = 1.5 as opposed to 0 for not attempting. I think this kind of guessing is fair as you are getting rewarded for your partial knowledge which lets you eliminate at least the nonsense solution.

--
**Life is too short to be serious**

Gaming the test by sgunhouse · 2007-06-17 05:01 · Score: 1

I once took a test that I knew absolutely nothing about, and got the best score of anyone taking the test!

I was in high school, so this is almost 30 years ago. Also, I was the brightest kid my school had ever seen, literally. Okay, not all that hard when a typical graduating class only has 50 people in it (this was a rural school system), but still ...

In my junior year (that is, grade 11), they decided to have me represent the school in all sorts of competitions. Math and science - math was really my specialty, I had taken every math class they had to offer by this time, but I also did well in science. So anyway, this contest comes up in which you're allowed to take 2 subjects. Of course I took math and did okay. Not spectacular, my school didn't offer Calculus or other advanced math classes, but respectable. And since no one else in the school was willing to take the test in Physics, I took that as well.

You have to understand, my school didn't even offer Physics until grade 12. I had taken general science 2 years earlier (and got such a good score that it made everyone else look silly), then Biology and was at the time taking Chemistry, but Physics wasn't until the next year. I figured "How hard could it be?" Well, I got the test, and maybe had an idea how to work out one question.

I wasn't going to leave the test blank. Like the college entrance tests, they actually assign a negative score to wrong answers so that there's not supposed to be any advantage to guessing, but I wasn't going to sit there for an hour and do nothing. So I started looking through the answers ...

Now that I've been a teacher also, I know about (theoretical) test design. You're supposed to include a couple of reasonable-sounding but wrong answers (referred to as "distractors") to catch the people who have some idea what they're doing but are trying to be lazy, and a couple of completely wrong answers - and of course the correct answer. I was able to eliminate the completely wrong answers, then look just at the others and determine which one had to be the correct answer on over 90% of the questions.

In one sense it didn't work. I still got done with 20 minutes left and had to sit there with nothing to do for the rest of the time. :( But I got the best score out of the 100 or so people taking that test at that site.

This was actually a national contest, but I won't give the name. Fortunately, there were people who beat me at other locations; it would have been really embarrassing if I'd won the national competition just by guessing.

And needless to say, I never gave my students multiple choice tests (in Math, that was my subject after all). I know from experience that multiple choice tests are worthless.

Med school is all multiple choice by sjbe · 2007-06-17 05:17 · Score: 1

The exams in law school (and I believe medical school) tend not to be multiple choice.

Can't speak for law school but virtually every exam my wife took in med school as well as for her licensing exams was multiple choice. She informs me that most if not all med schools in the United States give the vast majority of tests in multiple choice format. The main exceptions are practical examinations where multiple choice is not an option.

Why professional tests are really bogus... by AtlanticCarbon · 2007-06-17 06:15 · Score: 1

Their real purpose is to hinder competitors from entering the market.

If you are worried about quality of care, then transparency is the key. Transparency allows the customer to weed out incompetent or inexperienced practitioners.

Measurement and item response theory by Savantissimo · 2007-06-17 06:21 · Score: 3, Interesting

I agree with everything you said except this part:

"A multiple choice question might only have one right answer and its point value is the exact same as that of something much easier (especially, when on the harder one, the wrong choice might even be 'righter' than the correct choice on the easy question) -- but that's why there is an entire field of psychometrics out there to ensure that these sorts of exams are doing what they say they are."

Seems to me like that is more an example of psychometricians being forced to accept a less than valid form of test scoring. The proper way to do things has to incorporate Rasch's principle that the likelihood that a given test-taker will give the correct answer (on a question that is valid for the quantity it is being used to measure) depends on the product of the easiness of the question and the ability of the test-taker. For that matter, lumped scores (pass-fail, ranking, or absolute) on professional proficiency exams - which by their nature must test disparate quantities with various non-linear contributions to professional qualification - cannot properly be interpreted as measurements of anything without a well-thought-out unified criterion that describes the contributions and dependencies of the various quantities measured by the questions to the overall measurement of professional competence.
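
As a rough illustration of that principle, the standard Rasch model can be written two equivalent ways: on a logit scale, where the probability depends on ability minus difficulty, or in multiplicative form, where the odds of a correct answer are the product of the test-taker's ability and the question's easiness. The function names below are made up for this sketch.

```python
import math

def rasch_probability(ability, difficulty):
    """Logit form: P(correct) = 1 / (1 + exp(-(ability - difficulty)))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def rasch_probability_from_product(ability_odds, easiness):
    """Multiplicative form: odds of a correct answer = ability_odds * easiness."""
    odds = ability_odds * easiness
    return odds / (1.0 + odds)

# The two forms agree when ability_odds = exp(ability) and easiness = exp(-difficulty).
print(rasch_probability(1.0, 0.5))
print(rasch_probability_from_product(math.exp(1.0), math.exp(-0.5)))
```

A test-taker whose ability equals the question's difficulty has an even chance of answering correctly; harder questions or weaker test-takers push the probability down.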

--
"Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry

Whine, whine, whine. by n6kuy · 2007-06-17 06:45 · Score: 1

"Tests don't prove you know anything; they only prove you know how to take tests!"

--
If you disagree with me on social issues, then it's pretty clear that you are a narrow-minded bigot.

Re:Whine, whine, whine. by maxwell demon · 2007-06-17 07:41 · Score: 1

"Tests don't prove you know anything; they only prove you know how to take tests!"

Of course this validates them as conditions for school entry: Since successfully finishing school is based mainly on passing tests, your test passing ability is exactly the right thing to test for. :-)

--
The Tao of math: The numbers you can count are not the real numbers.

Blogger is ignorant by sesquipedalian_one · 2007-06-17 06:52 · Score: 1

IAAPIT (I am a psychometrician in training). He clearly knows nothing about psychometrics, and is pretty much a fool for assuming that the people who put together the tests have never bothered to think about such elementary problems. There is well-developed statistical methodology behind the scoring of standardized tests. Most licensing tests these days are put together with Item Response Theory, which gives the test developer a very precise idea of how much of a role guessing plays in each question. (You might be surprised to find that the floor guessing parameter is not just based on the number of choices; it varies depending on the details of each question.) IRT also yields a test information function that lets you see how much information the test is giving you along the range of ability levels. The argument he makes about deducting fractions for incorrect answers (known as "formula scoring") is BS, because no standardized test ever reports just the raw score. Different forms of the test differ in difficulty, and so must be equated to one another. In the process, raw scores are converted to scaled scores, and the conversion is typically not a linear one. Formula scoring results in lower raw scores than if you don't apply the penalty (dichotomously scored), but all that means is that the range between the lowest and the highest raw score is smaller with the dichotomously scored test. If that range is too small, you can always add more questions. Suppose you took two versions of the same test, one dichotomously scored and one with formula scoring. (Assume for the purposes of simplicity that there's no measurement error.) Yes, you would get a higher raw score on the dichotomously scored test, but so would the whole test-taking population. Your percentile rank would not change, and the scaled score would still work out the same.
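A rough sketch of the percentile-rank point, under one loud assumption: every test-taker answers every item, so wrong = total - right. In that case the formula score is an increasing linear function of the number-right score and the ordering of test-takers cannot change. The counts below are hypothetical, and with omitted items the two scorings can diverge.

```python
def formula_score(right: int, total: int, choices: int) -> float:
    """Number right minus wrong/(choices - 1); assumes no omitted items."""
    wrong = total - right
    return right - wrong / (choices - 1)


def ranks(scores):
    """Rank of each score = how many scores are strictly below it."""
    return [sum(other < s for other in scores) for s in scores]


TOTAL, CHOICES = 100, 4               # hypothetical test layout
rights = [22, 35, 35, 61, 80, 97]     # hypothetical number-right counts

raw = [float(r) for r in rights]
penalized = [formula_score(r, TOTAL, CHOICES) for r in rights]

print(penalized)
print(ranks(raw) == ranks(penalized))
```

The penalized scores are lower across the board, but the rank list is identical, which is why the scaled score would come out the same after equating.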

Re:warning moronic blog post linked by Anonymous Coward · 2007-06-17 06:58 · Score: 1, Interesting

There's a very easy solution. Require essay tests. Make sure at least 10% of your class doesn't complete the test (but A students get done 20-30m early on a 2 hour test).

It works like a charm, and weeds out exactly who needs weeded out.

Here's a hint. After the first exam of a semester in a weed out class, the instructors can predict with a 98% accuracy your final grade. Your test taking skills mean shit. If you're not bright when we talk to you, if you're slow on the uptake, you won't do well. The tests are designed to avoid giving you passing grades. Of course some people break the mold, but these people are always super-intelligent and have to put an absurd amount of effort in.

Then, the rest of the semester is really about teaching material, the scores don't really matter that much.

Lemme tell you a story by Moraelin · 2007-06-17 07:02 · Score: 1

Let me tell you a story. When my parents bought a ZX-81 with 1K RAM back in the day, that thing didn't even have enough memory for an assembler. I learned assembly by translating it all in hex by hand. I had a big notebook with all combinations of opcodes and registers, and their hex codes. Forget writing "for" loops or even "goto", you had to actually count bytes by hand to do a jump.

Or did I tell you about the time when a PHB gave me a computer with a compiler, but literally no editor? (Not even EDLIN.) Yeah, we had to make do with a disk editor until that was sorted out, because the alternative was to sit and twiddle thumbs. Even if with a damn good excuse.

So I _can_ do, and did do, without even the "crutch" of a compiler or assembler or even a text editor. Can _you_?

That said, I genuinely don't miss those days. They're not some "good old days", they're days when I wasted time on stuff that a tool would have done better. That was wasted time. There's a reason there are better tools nowadays, and that is that they genuinely make you more productive. They let you focus on the things that actually _matter_, like algorithm and design, not on the mechanical bullshit that a compiler or assembler does better or faster anyway.

_That_ is what makes a good programmer: algorithms, data structures, patterns, and knowing how to use a tool or library for the rest. Doing stuff by hand that the IDE or compiler does better, that's not a reason for pride, it's a waste of time and (employer's) money.

It's like hiring, say, a gardener and discovering that his grand reason for professional pride is that he can mow the lawn with some small scissors, instead of relying on the "crutch" of a lawn mower. Well, who cares? He's still doing a crap job and wasting more time than someone else. If the tools do that faster, freakin' use them. In fact, if a gardener actually did that, you might even suspect him of fraud: that he's deliberately wasting time so he gets paid for more hours.

--
A polar bear is a cartesian bear after a coordinate transform.

Regarding the comment that GRE/TOEFL use IRT by ahfoo · 2007-06-17 07:09 · Score: 1

First off, this is cheating. I'm responding to a comment made at Blogspot here on Slashdot, but there's no way I'm creating an account over there just for this and besides they don't even have threaded conversations. Instead I'll quote the guy from Blogspot's comment here in full and then basically say he's full of shit.

Aaron said...

I am sorry, but as a psychometrician (i.e. someone who writes multiple choice tests and interprets the results), I have to simply chime in with this:

We know. That's why we don't just count correct answers.

Any major test (GRE, LSAT, TOEFL, TOEIC, etc.) uses some kind of item response theory (IRT) to determine the score. This means that the final score is actually the person's ability, given their performance on the items, which are weighted differently (to put it VERY simply) according to people's performance on them. It doesn't matter what easy-to-read numbers the test gives you as your score; your REAL score is a number between 0 and 1. Sometimes that number is rescaled to the actual number of items that were on the instrument to give people the illusion of a classical MC test.

Another point is this: Remember when you took your SAT (I think it was)? They told you not to guess if you weren't sure about that answer, right? The reason for that is that with a really well-worn and robust test, the developers have been able to figure out who picks which distractors, and can therefore derive further meaning from whatever option you choose. So instead of a simple binary item (right or wrong), they can create a partial-credit item. Say "A" is the right answer, but people who are pretty smart seem to pick "B" a lot. So maybe the stats will assign a value of 0.5 for that one. Maybe "C" is just a throwaway distractor and doesn't mean anything other than you missed the question. But what if "D" turns out to really distract total morons? The stats might end up assigning a NEGATIVE value if you pick that. So read the test specifications before you take a big test. If they say not to guess, that's why. What you don't know can actually hurt your score more than just skipping it.

Look into the Rasch model and multi-parameter IRT. It's late and I actually need to develop some questions tonight (no kidding!), so I leave it to you and Wikipedia.

So to sum up: Basically, you are right about the problems with MC tests, but wrong about how much this affects people's lives.
June 17, 2007 4:06 AM

So, as I was saying --bullshit.
I'm also a writer of GRE/TOEFL practice tests and I am quite sure this is not true. This was true for the TOEFL, but only for a few years. With the advent of the computer based TOEFL in 2000 there were weighted responses, and the successful implementation of this feature was one of the primary differentiators between the software practice test products published at that time, such as my own, which you are welcome to buy on Amazon, though I'm sure you won't if you're already reading this in English.
However, that computer test was dropped in favor of a radically redesigned test in 2005 --another reason you probably won't buy it at Amazon-- in which ETS specifically documented that they were dropping weighted scoring entirely. This was specifically stated in documentation from ETS and it was distressing to me because I was offering one of the few projects that had a reasonably accurate weighted scoring system, so I am absolutely sure of this. It cost me money big time.
As for GRE, well this is location dependent. In some locations the GRE computer based test still uses weighted scoring, but in most of Asia that test is no longer offered and a non-weighted test is currently the only choice. The reason the

IANAM by pongo000 · 2007-06-17 07:56 · Score: 1

...but my limited math skills are all going red-flag on me at the moment:

For True-False exams for example, the number subtracted would most likely be (Number Wrong ÷ 2). Let's see how that would work out, for the sample case above. You, answering two questions correctly and guessing at 98 would be likely, on the average, to get 49 wrong, and so have a final score of 2 + 49 - (49 ÷ 2), or 75.5, while I, again on the average, answering only 1 correctly and guessing at 97, would get a final score of 1 + (97 ÷ 2) - ((97 ÷ 2) ÷ 2), which comes out to be 25.25. Here there is a substantial difference between our scores, closer to the two-fold difference in our actual knowledge.

OK, forgive me for RTFA, but how is 2 + 49 - (49/2) equal to 75.5? My trusty calculator tells me this is 26.5, only 1.25 points higher than the second example -- about what I would expect.

The entire argument is fallacious...I know twice as much as you, so much that I get 100 questions right, you get 50 right and guess at the other 50...50 + 25 - (25/2) = 62.5. Not quite a 2:1 ratio there.

While I agree with the author's premise that guessing should be penalized, he does a terrible job proving his point.
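For anyone who wants to check the arithmetic without a trusty calculator, here is a small sketch. The function name is invented for illustration, it assumes exactly half of the guesses come out right, and the 97 guesses in the second case is the article's own figure, kept as quoted.

```python
def true_false_score(known: int, guessed: int) -> float:
    """Expected score on a true/false test with a (wrong / 2) penalty.

    Assumes half of the guessed items are right and half are wrong.
    """
    right_guesses = guessed / 2
    wrong_guesses = guessed / 2
    return known + right_guesses - wrong_guesses / 2


print(true_false_score(2, 98))    # 26.5, not the article's 75.5
print(true_false_score(1, 97))    # 25.25
print(true_false_score(50, 50))   # 62.5
```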

The real fraud by sjames · 2007-06-17 07:58 · Score: 1

The real fraud of that sort of test is that the number of passing grades is set first, then the pass/fail cutoff is moved to meet that figure. If few take the bar exam, a drooling moron may pass. If many take it, being well qualified isn't good enough.

It has to be long and hard so moving the cutoff can provide fine-grained control on the number who are admitted into the profession.

tests generally fail to assess job performance by wikinerd · 2007-06-17 09:04 · Score: 1

The way we assess future professionals may be wrong: We give them a piece of paper or sit them in front of a computer screen full of questions, and ask them to either choose from multiple answers or write down their own answer. However, few of these professionals will ever need to do exactly that in their actual jobs. In essence, we benchmark candidates by asking them to do something they will rarely do in real life. The results are easy to predict: Some will learn how to pass tests without exhibiting real-life performance, while others will be able to do the job but fail on the test. In the end, tests seem to mainly assess the candidates' patience and conformity to social hierarchies.

Ouch. by Lemmy Caution · 2007-06-17 11:26 · Score: 1

One gets a feeling one's in the wrong crowd after seeing what happened to his comment thread after Slashdot reported this: a genteel and thoughtful chat becomes filled with increasingly crude, uninformed and insulting remarks.

Maybe I don't want to be here....

Re:Ouch. by dreamer-of-rules · 2007-06-17 18:17 · Score: 1

Well, the article is ****ing lame.

But I agree that people should be moving on, and not kicking the dirt around in his own theater. They should be posting here on Slashdot, and telling the ****ing editor to read the ****ing articles before posting.

"Proving" that multiple choice exams are poor testers of knowledge by generalizing from randomly designed 50/50% questions, and not bothering to give any statistics for the more typical 4 or 5 answer "multiple choice" exams? It changes the numbers quite a bit, don't it? :) Nor does it address the common method of penalizing incorrect answers, or whether degrees of knowledge could be used to winnow down the possibilities, and reward knowledgeable guessing.

--
Everyone is entitled to his own opinions, but not his own facts.

High Failure Rate by TimeZ0ne · 2007-06-17 12:32 · Score: 1

Doesn't it strike anyone as odd as to how inefficient the education system must be to produce such a high failure rate? The screening process that admits candidates to these elite professional programs must be broken too, as it obviously allows too many candidates that just can't cut it into the programs. On the other hand, maybe it is just the testing process is broken...

Re:High Failure Rate by bratwiz · 2007-06-19 03:29 · Score: 1

On the contrary my good fellow, they have it down to an exact science. Nobody can be that goofed-up by accident.

Just remember: To err is human, but to really fuck it up requires government intervention.

Do you realize what you wrote? by TheZax · 2007-06-17 14:01 · Score: 1

For instance, I'm a testing person, but not a content person (i.e., I design towards what the stats tell me, as well as the actual wording and structure of the exam...I always work with someone who understands the content areas from a very advanced level and can deal with that end). One of the last MC exams I was helping validate, I knew NOTHING about the content -- it was a medical exam. First thing I did was go through the entire exam, read all the questions quickly, and see if logic could remove any of the answers. Statistically, I would have gotten a 20% by random means, but in this case, I received somewhere around 43% (if I remember correctly). The educated guess is a BIG part of these things...you aren't just measuring content knowledge, but application and that means if someone can raise the bar, they might actually do well in the real world.

If you knew NOTHING (your words) and you could get 43% through logic, in what SHOULD have been 20%, then I think you prove the author's point even more. How good is a 5-choice multiple choice test if someone with ZERO knowledge can score 43% by applying logic/common sense? It sounds like what you are describing is the exact opposite of an educated guess.

--

JWall: GUI client for IPTables

Re:Do you realize what you wrote? by clifyt · 2007-06-17 15:19 · Score: 1

Actually, randomly I should have gotten a 20%.

And again, this is why *I* was called in to validate and calibrate the exam. It had never touched a single test taker. How good was the test? It was a hell of a lot better after my team got through with it :-)

Beyond that, quite a few things in testing are looking for application of logic. If you can discern even the slightest bit of knowledge from the wording (i.e., those Latin classes? Hell, just mythology for the nerds...like me)...you can get a lot. I still failed though :-) But yes, this is exactly what an educated guess is...learning to look at content, weed out the inappropriate through whatever means you have available, and pick something from what remains.

Re:Worse, he is incompetently wrong. by AllParadox · 2007-06-17 15:44 · Score: 1

I have had all kinds of experience. Some of it a little strange.

A couple of things I did around the end of law school bear mention here.

I probably should not discuss it, but I helped calibrate the Multi-State Bar exam, during my third year of law school. Most lawyers will scream bloody murder that I was allowed anywhere near the data.

It is not like it sounds. I was working with a real psychometrician. He knew the statistics and methodology, and I knew the practical parts of computer systems. We both knew SAS, very well.

(Statistical Analysis System - it is its own little language. In many ways, the language is an improvement over languages like Fortran, and I *like* Fortran.)

The data was double-blinded. Neither my friend nor I saw the questions or any of the answers. Someone else handled that part. All we knew was, for each one of the thousands of examinees and for each question, whether or not the examinee got the correct answer. The order of the questions was also scrambled, so we did not even know the order of the questions as they were taken by the examinees.

FWIW, for the Multi-State Bar in my State, and many others, only one thing counts: the total correct. Nothing is taken off for wrong answers. A passing score is much higher than 25% of the total. I do not recall now, but 60-80% is the neighborhood of correct answers to pass (actually it was a combined score from the written and Multi-State, but if you got only 25% on the Multi-State, you failed, period.)

Bad statisticians get crappy results because they make wrong assumptions. Whoever the guy is that wrote the article, never let him do your statistics. He makes assumptions that competent psychometricians know are false.

I know the article's assumptions. I made them myself until I worked with my friend.

Strange things happen with really good test questions. This is not all, but most.

First, some guy randomly guessing, say, by going down the questions and always taking the first answer, will fail. Even if he is incredibly lucky and is nearly three standard deviations out (one of the very unlikely possibilities from a uniform distribution), he will still fail the exam.
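A back-of-the-envelope sketch of the "three standard deviations" remark, treating blind guessing as a binomial draw. The 200-item length is an assumption made for illustration; the five answer options and the 60% pass mark are taken from the low end of the figures mentioned in this comment.

```python
import math

N_ITEMS = 200          # assumed test length, for illustration only
P_GUESS = 1 / 5        # blind pick among a, b, c, d, e
PASS_FRACTION = 0.60   # low end of the 60-80% range quoted above

mean = N_ITEMS * P_GUESS
sd = math.sqrt(N_ITEMS * P_GUESS * (1 - P_GUESS))
lucky = mean + 3 * sd          # a guesser three SDs above average
needed = PASS_FRACTION * N_ITEMS

print(f"expected correct: {mean:.0f}")
print(f"three SD above:   {lucky:.1f}")
print(f"needed to pass:   {needed:.0f}")
```

Under these assumptions even the very lucky guesser lands at fewer than 60 correct, less than half of what the pass mark requires.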

Second, if his answers were educated guesses instead of blindly picking from a, b, c, d, or e, then his chances of getting the correct answer went down.

You cannot even take the exam until you graduate from an accredited law school, or you practice in California. *Every* single person that took the exam made at least educated guesses on most of the answers.

(One of the top guys in my law school class decided he was going to give the psychometricians a heart attack, by answering every single question correctly. He bragged about it before the exam. The smartest guy I ever met, bar none. Later he said that at the end of the day, he looked up, realized that he had twenty-five questions to go, and there were five minutes left. He *blindly* answered the last twenty-five questions (all with (b), IIRC) and turned in his exam. He passed, of course.)

My friend and I could sort of tell the order of some of the questions, in spite of the double-blind. The more difficult questions, in the last fifty or sixty, clearly had a higher random component of correct answers. Taking into account the difficulty of the question and the size of the random component, we felt confident that we could identify and order three-fourths of the final forty questions.

Difficulty == number of examinees getting a correct answer, adjusted for their relative ability to correctly answer all the other questions. For small numbers of examinees, this is perilous. Our sample set was more than ten thousand, and verified against results from previous years going back. Those answers, in turn, were sampled, then diligently validated against LSAT scores (a law school entrance IQ test), law school grades, relative difficulty of law school, undergraduate grades, and personal inquiries to indiv

--
All is paradox. Retired lawyer, so this is just one more layman's opinion.

Isaac Asimov's opinion of IQ tests by The Fun Guy · 2007-06-18 01:02 · Score: 1

To paraphrase, he said that decent tests count the number of right answers you get, but really good tests also count how many times your answer is, "It depends."

--
The man who does not read good books has no advantage over the man who cannot read them. - Mark Twain

Oral exams by The Fun Guy · 2007-06-18 01:24 · Score: 1

There is no way to hide any lack of knowledge

Right, because this is an iterative, custom-fit exam. They assume you generally know your stuff, since you passed the writtens; they don't care about where you're strong, they want to know where you're weak, and how weak you are. As soon as the examiners start to smell the whiff of ignorance in an oral exam, they pursue it mercilessly, and work together to explore the depths of your particular areas of ignorance the same way a tag-team of sadistic dentists will use an array of very small and very sharp bits of steel to dig in and thoroughly explore a bad spot on a tooth. God help you if you give them an especially juicy target to work on, or if you give them more than one.

(Shudders when thinking back on doctoral oral exams.)

--
The man who does not read good books has no advantage over the man who cannot read them. - Mark Twain

Multiple Guess = Silly by hanshotfirst · 2007-06-18 04:18 · Score: 1

My classmates proved this in High School, inadvertently. Our Chemistry class was notoriously hard, and graded on a curve. For the practice final one guy answered 'C' for everything. Another answered randomly. They both finished fairly early, to the chagrin of the teacher, but did fairly well on the final curve. If I recall, "C" beat random. This test did not include the wrong-answer penalty like the SAT.

--
Why, oh why, didn't I take the Blue Pill?

fallacy of tests by konberg · 2007-06-18 04:49 · Score: 1

As a psychometrician, I must disagree with his post and examples. When a latent trait (in this case, knowledge of the subject, or ability) is estimated, the difficulty of each question and the probability of guessing are taken into consideration. As a mathematician, you must be familiar with Item Response Theory (IRT) and the Rasch model, and its modifications. Even if IRT is not used, extremely difficult (like those in your example) or easy items are usually not included in tests, since they do not have any informational value, and the guessing parameter is considered when scoring the responses. Konstantin Augemberg (konstantin at augemberg.com)

Well, you're finding out what they don't know. by Slashdot Parent · 2007-06-18 08:03 · Score: 1

It prevents guessing, and it forces the student to consider whether they actually know the answer, or just think they do.

It would seem that it would really only prevent wild-arsed guessing.

Under your system, if I am over 33% confident in my answer, it is still to my advantage to make a guess. Maybe that's the effect that you're going for, but being 34% confident in myself is not enough for me to claim to "know" something.

Imagine the consequences. If I were taking your test, any time I can eliminate just one choice, my expected value (or penalty) for guessing is 0, assuming I don't have clue #1 about the other choices. But if I actually took your class, I would hope that I would at least have clue #1, so any time I could eliminate one choice with confidence, I would take a stab at answering the question.
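A quick sketch of the break-even point, assuming the scoring used in the tally further down this comment (+2 for a right answer, -1 for a wrong one) and four choices per question. The function name is made up for illustration.

```python
from fractions import Fraction


def guess_ev(p_correct, reward=2, penalty=1):
    """Expected points from guessing with probability p_correct of being right."""
    return reward * p_correct - penalty * (1 - p_correct)


# Choices left after eliminating 0, 1, or 2 of the four options.
for remaining in (4, 3, 2):
    p = Fraction(1, remaining)
    print(f"{remaining} choices left: expected value {guess_ev(p)}")
```

The expected value is -1/4 with all four choices in play, exactly 0 with one eliminated, and +1/2 with two eliminated, so anything better than a one-in-three chance pays off.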

Looking at it a different way, let's say I'm a slacker and I only know 50% of the material on your test. What score would I get if I took your test? You are hoping that I'll get 50%, typically a failing grade (if I only learned half of what I was supposed to, I'd say failing is appropriate). But in reality, I would expect to pass your test.

Why? Well, I know 50% of the material, so I'm going to get 50% on your test based on that alone. But the story doesn't end there. If I know 50% of the material, I should be able to eliminate two of the four choices on the questions for which I do not know the answer. That means that I will expect to get half of the remaining questions correct (and half incorrect, of course).

On a 100 question test, I will get:
50*2=100 points for my 50% mastery of the material
25*2=50 points for my "good" guesses
25*(-1)=-25 points for my "bad" guesses

That gives me 125 out of 200 points, or 63%. Nothing to post on the refrigerator, for sure, but I passed, eh?

--
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock

Re:Well, you're finding out what they don't know. by PurplePhase · 2007-06-18 15:29 · Score: 1

Wait, you said you know 50% of the material, not 75%.

50% of questions known 100%
50% of questions known 50%
= 100% known 75%

If you know 50% either you know part of the test exactly (50% known 100%), or you know all of it sort-of (100% known 50%).
So at the one extreme you statistically get: (50 * 2) + (12.5 * 2) - (37.5 * 1) = 100 + 25 - 37.5 = 87.5 of 200 = 43.75%
At the other extreme (ie. eliminate half of the answers on every question): 50 * 2 - 50 * 1 = 100 - 50 = 50 of 200 = 25%

So you are correct, knowing 75% of the material (giving your 63% grade) is much better than knowing 50% of it. Being able to eliminate half the answers does help, but answering questions you know exactly can almost double your grade.
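The three scenarios in this exchange can be put side by side with a short sketch. It assumes the +2/-1 scoring and 100 four-choice questions from the parent comment; the function and labels are invented for illustration.

```python
def expected_score(known, guessed, p_guess, reward=2, penalty=1):
    """Expected points: `known` items right, `guessed` items right with prob p_guess."""
    per_guess = p_guess * reward - (1 - p_guess) * penalty
    return known * reward + guessed * per_guess


MAX_POINTS = 100 * 2

scenarios = {
    "50 known, other 50 narrowed to two choices": expected_score(50, 50, 0.5),
    "50 known, other 50 guessed blindly": expected_score(50, 50, 0.25),
    "0 known, all 100 narrowed to two choices": expected_score(0, 100, 0.5),
}

for label, points in scenarios.items():
    print(f"{label}: {points} of {MAX_POINTS} = {100 * points / MAX_POINTS}%")
```

These come out to 62.5%, 43.75%, and 25% respectively, matching the 63% in the parent comment and the two extremes worked out above.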

8-PP

Lazy by phorm · 2007-06-18 17:25 · Score: 1

I have a friend who has - in my presence - suffered from varying forms of seizures and episodes. A few times she almost fainted in my arms, and once her eyes glazed over and she was making weird noises and slumping over until I carried her over to a chair (she came to after a while, but never remembered the time she was "out"). According to her doctor, it was simply because she was too tall and sometimes not getting enough blood circulation to her head (despite no BP issues), no further tests, no prescriptions.

I'm dreading one day where I'll hear she has had a serious accident due to a seizure. I've had little luck helping her find another doctor either as *none* want to contradict a fellow doctor's diagnosis...

Re:Lazy by try_anything · 2007-06-19 01:22 · Score: 1

A few possibilities come to mind.

1. You haven't gone beyond the first doctor's realm of influence. My impression is that doctors who feel collegial towards each other prefer to present a unified front to patients, but beyond that they have little problem saying other doctors are wrong. That feeling of collegiality is based on personal connections or on a feeling of professional subordination to a highly regarded doctor. It has limits, and doctors' egos work against it. Correcting another doctor's mistake is a way of showing one's superiority.

2. Your friend doesn't have insurance. Doctors are supposed to hand out medical care to whoever needs it, but it always comes out of somebody's pocket. They've worked out a balance of power with the insurance companies that allows them to provide a certain level of care for the insured. For uninsured people, well, my friend is naive and inexperienced and recommends costly tests for uninsured people. He says it's not really a problem as long as the tests are necessary. I'm more cynical; I think it's just a matter of time before someone straightens him out.

3. There's a drug that treats the exact same symptoms your friend has, and people use it to get high. My friend claims a person has to build up quite a track record before being labeled a pill tourist, but I think his colleagues are a bunch of arrogant pricks who pride themselves on making snap judgments about these things.

If the reason is 1), going to a different clinic or hospital where the doctors don't know her first doctor well might help. If the reason is 2), going to a different hospital or clinic under different management might help. If the reason is 3), going to a different hospital or clinic where the doctors haven't all heard that your friend is a pill tourist might help.

Good luck.

I think it's something different. by phorm · 2007-06-19 03:29 · Score: 1

I don't think it's #1, since we never named her doctor to others, and #2 doesn't apply since it's Canada and we have public healthcare. I'm not sure about #3, but I would think that another doctor might be willing to *see* the patient before making that assumption.

These were also different doctors in different areas of town, but the impression I got from mine is that the various medical associations frown upon one doctor overruling another's judgement, even if the first was wrong.

Re:I think it's something different. by try_anything · 2007-06-19 11:17 · Score: 1

I guess the only thing to do is not tell the doctor she's had a previous diagnosis. That's what I tend to do anyway. Doctors have much more respect for each other than for patients. When they hear, "A doctor told me one thing, but I'm not sure he's right," they tend to think you're a nut. A certain percentage of people really are wandering around looking for attention, and doctors like to feel smart by making snap judgments of people. My friend has taken a lot of ribbing from older doctors for running lots of tests on certain people who came in with mysterious complaints. "You'll learn to spot them," they say. Some people get taken seriously, and others get written off as crazy. Whatever method doctors use to make that distinction has not been through rigorous medical trials, I'm sure.
