Randomizing Survey Answers For Accuracy
Saint Aardvark writes: "The New York Times reports that two researchers at IBM have come up with a way to persuade people to give correct answers to survey questions: randomize the results. Strangely enough, they can get accurate information out of the aggregate of enough answers -- but it's completely anonymized. Since conservative estimates say nearly half of all survey answers are bogus, there's an interest in persuading people to be more truthful. As ever, you can use the Random NY Times Registration Generator to falsify your registration details and read the article..."
Basically, people providing false answers will very often pick one based on its position in the list. Many will pick the middle option, with the others picked less often, though the first and last may attract an oddly large number of people compared to the second and second-to-last options.
A /. poll illustrates the phenomenon. As an exercise, imagine what the results would look like if the options were randomly ordered for each person the poll was shown to. My bet is each would be near 12.5%.
When the answers are given in random order, each cycles into the different spots. The liars are actually cancelling out other liars who used the form before them. The differences in the answers are mostly based on how the truth tellers answered (I say mostly because some liars may have a different way of selecting a lie, such as the longest string offered in the answers), and so you can derive more meaningful statistics from them.
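To make the cancelling-out argument concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the eight options, the slot weights used by the careless respondents, and the names `run_poll` and `SLOT_WEIGHTS` are all made up, not taken from any real poll.

```python
import random
from collections import Counter

OPTIONS = list("ABCDEFGH")  # eight answers, hence the 12.5% figure
# Assumed positional habits of careless respondents: middle slots heavy,
# first and last slots fairly popular, second and second-to-last light.
SLOT_WEIGHTS = [3, 1, 2, 5, 5, 2, 1, 3]

def run_poll(n, liar_share, true_answer_weights, shuffle, rng):
    """Simulate n respondents and return a Counter of the options chosen."""
    counts = Counter()
    for _ in range(n):
        order = OPTIONS[:]
        if shuffle:
            rng.shuffle(order)  # each respondent sees a fresh random order
        if rng.random() < liar_share:
            # A liar picks a slot on the page, not an answer.
            slot = rng.choices(range(len(order)), weights=SLOT_WEIGHTS)[0]
            counts[order[slot]] += 1
        else:
            # A truth teller picks an answer regardless of where it appears.
            counts[rng.choices(OPTIONS, weights=true_answer_weights)[0]] += 1
    return counts

rng = random.Random(0)
truth = [1, 1, 1, 4, 1, 1, 1, 1]  # assumed: option D is the honest favourite
for shuffle in (False, True):
    counts = run_poll(20000, 0.5, truth, shuffle, rng)
    shares = {o: round(counts[o] / 20000, 3) for o in OPTIONS}
    print("shuffled" if shuffle else "fixed order", shares)
```

In the fixed-order run the liars' slot habits pile onto whichever options happen to sit in the popular slots. In the shuffled run their votes spread roughly evenly over all eight options, so the differences that remain come from the honest answers.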
See this
-no broken link
Indeed, very old trick. (For my sins, in my earlier days I used to help PhD psych students run statistical analyses on their survey data.)
A variation on this is to give the respondent a die (i.e., half a pair of dice), tell them to pick a number between one and six, and every time they roll that number, intentionally give a false answer on the survey. Thus, looking at any individual survey response, you don't know whether it's true or false, but you can factor the expected 16.7% of false responses into the statistical analysis.
Sure, that can be computerized, but as someone above pointed out, how does the respondent know he can trust it? The old technique above is entirely under the respondent's control.
-- Alastair
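For anyone wondering how the 16.7% gets "factored in", here is a minimal sketch for the simplest case, a yes/no question. The yes/no restriction and the function names are assumptions for illustration; the comment above doesn't say how the analysis was actually done.

```python
import random

LIE_PROB = 1 / 6  # chance that the die shows the respondent's secret number

def randomized_answer(truth, rng, secret=1):
    """Report the opposite of the truth when the roll matches the secret number."""
    roll = rng.randint(1, 6)
    return (not truth) if roll == secret else truth

def estimate_true_yes_rate(observed_yes_rate):
    # observed = true * (1 - q) + (1 - true) * q, with q = LIE_PROB,
    # so true = (observed - q) / (1 - 2q).
    return (observed_yes_rate - LIE_PROB) / (1 - 2 * LIE_PROB)

rng = random.Random(0)
n = 60000
truths = [i < 0.3 * n for i in range(n)]  # assumed: 30% would honestly say yes
reported = [randomized_answer(t, rng) for t in truths]
observed = sum(reported) / n
print("observed yes rate:", round(observed, 3))
print("estimated true yes rate:", round(estimate_true_yes_rate(observed), 3))
```

No single reported answer can be trusted, yet the corrected aggregate lands close to the real rate once there are enough respondents.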
I don't give out real data because I don't feel a need to at all.
I find it takes a lot less time to fill in crap data than real data. What really pisses me off is places that correlate the state you select with the zip code. Places like that seem to be deliberately positioning themselves AGAINST me, so I intentionally fill the form with erroneous data because they have become my adversary in the case of this page.
Filling in webforms doesn't become an issue of trust until I actually need them to have these data; in which case I try to be careful with who I give my credit card number, but don't care all that much about the rest.
I think the only reason people give out real data when presented with pointless web forms (à la NYT) is that they are unsure whether the form will work properly if they enter the wrong information. I assume a goodly percentage of truthful answers come from a demographic that never intentionally fills erroneous answers into web forms: people who aren't very interested in where the limitations lie in these computers that they just happen to use.
1 Age [28] *Will be randomized*
2 Age [56 (Randomized)] *28*
The value 56 gets submitted to the server, not the value 28 - which is my real age ;).
This is auditable because I can inspect the source code which is part of the web-page, and I can even monitor the network packets if I'm really paranoid.
Now I could still lie, or mess with the algorithms in the Javascript, but what would be the point?
80N
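A rough sketch of why the server can still get useful aggregates out of values like that 56. It assumes the randomization is zero-mean noise added to the real age, with the noise range published alongside the form. That is one simple scheme that behaves as described, not necessarily what the IBM researchers do, and `NOISE_RANGE` and the function names are invented for illustration.

```python
import random

NOISE_RANGE = 30  # assumed: published with the form so anyone can audit it

def randomize_age(real_age, rng):
    """Value submitted in place of the real age (may even come out negative)."""
    return real_age + rng.randint(-NOISE_RANGE, NOISE_RANGE)

def estimate_mean_age(submitted):
    # The noise averages out to zero, so the mean of the submitted values
    # approximates the mean of the real ages.
    return sum(submitted) / len(submitted)

rng = random.Random(0)
real_ages = [rng.randint(18, 80) for _ in range(50000)]  # made-up population
submitted = [randomize_age(a, rng) for a in real_ages]
print("one real age and what the server sees:", real_ages[0], submitted[0])
print("true mean:", round(sum(real_ages) / len(real_ages), 2))
print("estimated mean:", round(estimate_mean_age(submitted), 2))
```

Any single submitted value says little about the person behind it, but the average over many respondents stays close to the real one.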
It was an interesting article, and I can see how this technique will work when the surveyors have the goodwill of the respondents, so that any respondent's primary concern is only that of keeping his individual privacy.
But is privacy the core issue in market research, or is it simply a label of convenience that a lot of people use for something else that we don't have easy words for? I will lie on many surveys even when I am fully confident of my personal anonymity, though I prefer to avoid those surveys entirely when I can. OTOH, when a survey is done by a group that I have aligned myself with, I might well enthusiastically bare my soul without any regard to the privacy issue. And I know that I am not at all uncommon in these respects.
I suspect that my reactions stem from the same source as nationalism, patriotism, ethnic pride, and that whole mess of things where I'm not behaving as an individual protecting my privacy, but as a member of a group who feels called upon to defend my group.
Mostly I see marketing as an attempt by outsiders to mess with my group, to get us to buy stuff through conning us rather than letting us apply our own standards of value to the goods offered. I think I lie on surveys to protect my group from these subtle attacks; to misdirect and confound my group's enemy.
So I really don't think privacy has much to do with it. I think all this lying is a natural group reaction to consumerism, and its belief that it is perfectly okay to sell product by conning your customers into thinking that what you are pushing today is something they want.
Not in my group, buster. We don't need no steenkeeng pushers in our neighborhood.
Heck, usually it's LESS work to lie. Much easier to select the first or last option in a list than to hunt for the one that applies to you, or say you live in "dkjhgkjhdgs dshkjgdsh, AL" than to actually type your real address. And if they insist on cross-checking your ZIP and state, then what else is there except CA and 90210? ;) (Guess crappy TV shows can have their uses after all... ;) ) I'd love to see a study done about what % of visitors put CA/90210 for a state/ZIP in those places that do the cross-checking. That would give you a damn good idea about how many people lie like hell on those surveys... ;)
DennyK