Randomizing Survey Answers For Accuracy

Posted by ryuzaki0 on Sunday July 21, 2002 @05:06AM from the would-you-trust-these-numbers dept.

Saint Aardvark writes: "The New York Times reports that two researchers at IBM have come up with a way to persuade people to give correct answers to survey questions: randomize the results. Strangely enough, they can get accurate information out of the aggregate of enough answers -- but it's completely anonymized. Since conservative estimates say nearly half of all survey answers are bogus, there's an interest in persuading people to be more truthful. As ever, you can use the Random NY Times Registration Generator to falsify your registration details and read the article..."

I don't get it. by AKAImBatman · 2002-07-21 05:12 · Score: 4, Insightful

Ok, fine. They've managed to come up with a model that doesn't actually collect any data. And how will this help people to enter REAL data? People don't give data because they don't trust the company. If they don't trust the company, do you really think they'll believe some mumbo-jumbo about "randomizing"?

--
Javascript + Nintendo DSi = DSiCade
Slashdot Poll? by jedwards · 2002-07-21 05:13 · Score: 5, Funny

Did you lie when answering this question?

O Yes
O No
O Cowboy Neal told me the answer
1. Re:Slashdot Poll? by Subcarrier · 2002-07-21 06:22 · Score: 5, Funny
  
  Did you lie when answering this question? Yes
  
  Truth is often the most devious of lies.
  
  --
  "I have opinions of my own, strong opinions, but I don't always agree with them." -- George H. W. Bush
This will not affect user behavior by treat · 2002-07-21 05:14 · Score: 5, Insightful

Do they expect that people will enter real data on the mere promise that it will be stored in some randomized, aggregate, or other form that does not invade their privacy? If the corporation could not be trusted in the first place, no statement they make will make them trustworthy.
1. Re:This will not affect user behavior by Blue+Stone · 2002-07-21 07:20 · Score: 4, Insightful
  
  All they have to do is stop asking for my name and e-mail address, and I could be truthful about pretty much anything else they'd care to ask.
  
  --
  Corporation, n. An ingenious device for obtaining individual profit without individual responsibility. - Ambrose Bierce
optional vs. required by verbatim · 2002-07-21 05:30 · Score: 5, Insightful

I think there is something to be said about companies that ask for information as an option versus companies that ask for information as a requirement.

For example, company XYZ has released a program called Widget. In order to download Widget, users are asked to fill out a survey so that XYZ may gauge the demographics of their target audience.

Some sites will allow you to bypass this step and proceed to download the software. Other sites require this information before revealing the download link. I think that the psychological difference between "required" and "optional" would heavily influence the honesty of the answers.

I know that I never honestly fill out required forms. I'll fill in a bunch of bogus details, get the link, and be on my way. However, if the form is optional, I may download first and, if I like the program, provide some details to the company. The difference? I'm not being forced to give anything up in advance.

Is this true in general? I don't know. But it makes sense to me.

I have an idea for something to replace the survey forms - an AI program to carry out a conversation with the user. Ah ha! We just have to watch out for users that say to the AI - "I am lying" - and hope the AI doesn't need therapy.

--
Price, Quality, Time. Pick none. What, you thought you had a choice?
Does this increase trust? by SeanTobin · 2002-07-21 05:35 · Score: 5, Funny

I hope these companies aren't asking users to 'trust' them with their personal information based on the fact that we are supposed to trust them to randomize it.
Personally, if I don't trust them enough to tell them how much I make, I'm not going to trust them to randomize my results. I don't see how this will increase accuracy -- especially if I keep telling everyone I'm a 108 year old female in Uganda making $100,000+ per year who works in the sales department of an Educational field and plans to make purchases of an SUV, a house, a console gaming system, an optical mouse in the next six months and rates their internet experience as very low. My e-mail address is sjobs@mac.com and I would like to apply for your quarterly, monthly, weekly, daily, and hourly newsletters and I do give permission to pass this information to your affiliates.

--
Karma: SELECT `karma` FROM `users` WHERE `userid`=138474;
That's just stupid by photon317 · 2002-07-21 05:36 · Score: 4, Insightful

Let me summarize:

1) People lie on surveys, most likely because they don't trust the taker - but probably also just because they like putting in other answers (yeah, I'm a millionaire, woohoo!, etc). This only addresses the trust issue, ignoring other potential sources of lying.

2) In order to work around the trust issue, they've developed a method of injecting random noise into the original answers as they are recorded and then extracting useful data in the end.
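
For what it's worth, point 2 can be sketched roughly as follows. This is a simplified illustration, not the IBM researchers' actual algorithm: it assumes each answer is a number (an age, say), that the noise is uniform with zero mean, and that the only thing recovered is the average. The names perturb and spread, and all the numbers, are made up for the example.

```python
import random
import statistics

def perturb(value, spread, rng):
    """Return the value with zero-mean uniform noise in [-spread, spread] added.

    Hypothetical helper: the thread does not say what noise the researchers use.
    """
    return value + rng.uniform(-spread, spread)

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Made-up "true" ages that the surveyor never gets to see.
true_ages = [rng.randint(18, 80) for _ in range(50_000)]

# What actually gets recorded: each age blurred by up to 30 years.
spread = 30.0
reported = [perturb(age, spread, rng) for age in true_ages]

true_mean = statistics.fmean(true_ages)
noisy_mean = statistics.fmean(reported)

# Any single record may be off by up to `spread`, but the zero-mean noise
# largely cancels in the average over many records.
print(f"true mean {true_mean:.2f}, mean of noisy records {noisy_mean:.2f}")
```

Any single record tells you very little, which is the privacy claim; the average over many records stays close to the real one, which is the accuracy claim.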

Notice their technology doesn't do anything to fix the underlying problem. The hope is that users will understand and trust the backend randomizer system, and that based on this trust they will answer more truthfully.

Without bothering with all this mumbo-jumbo, I can build a trustworthy system. I simply record survey statistics, and I promise not to use the individuals' personal data individually.

They can either trust me that I'm telling the truth about this, or they can lie. In the IBM researchers' scenario, the users are again asked to trust that the backend system doesn't compromise them, and again they can choose to trust it or choose to lie.

Given the above, why on earth would you bother with this research and unnecessary complexity? It's not going to make any difference over just promising your users that you don't invade their privacy. You could replace their research results with a banner on top of the survey that says "After you submit your data to us, we use Magical HibiJibi technology to prevent ourselves from invading your privacy, so please trust us and answer truthfully."

What a waste of research.

--
11*43+456^2
User Interface and Implementation by WEFUNK · 2002-07-21 05:48 · Score: 4, Insightful

Interesting approach, but useless unless people actually understand and trust the system. For this to happen will probably require widespread adoption, an easy to understand explanation of the process, and assurances that answers really are randomized. These requirements obviously force a bit of a chicken and the egg scenario.

Explaining the whole randomization process (how it protects privacy, how it provides useful info) will be a little much for most people I think, but a good user interface might alleviate this, perhaps with a 'randomize' button that is used before hitting the 'submit' button. This would take the user input and change it right in front of their eyes. Of course many would be rightfully concerned that the randomize button is just for show (or simply encodes but doesn't anonymize), but I think that enough people might buy into the false sense of security that demonstrated 'randomization' provides to at least partly improve the % of bona fide results. Also, the system could be set up so users who don't mind submitting traceable information could be encouraged ("extra 10% off") to submit without randomization, with a simple flag sorting data into randomized/anonymous and non-randomized/non-anonymous data.

This approach would be even better if the randomization approach becomes a ubiquitous standard backed by a consistent and legally accountable and well-known entity/brand (IBM for instance). I'm not sure how well an open solution would work unless there was a central group assuming responsibility and accountability for the system, enforcing trademarks, and suing spoofers. Also, people feel safer when they feel there's someone to blame for any abuse/mistakes (hence, giving their credit card freely to a waiter but not to a website).

--
My next sig will be ready soon, but friends can beat the rush!
Old trick by guanxi · 2002-07-21 06:17 · Score: 4, Informative

As another poster observes, if you don't trust them with the data, why trust them to randomize it?

My college stats professor 10 years ago explained a simpler trick that puts control in the respondent's hands. It went something like this:

With each question, the respondent flips a coin and looks at the second hand of a clock. Only the respondent can see the coin or the clock.

If the second hand is between 1-30 seconds, they answer per the coin (e.g. heads=yes). If it's between 31-60, they tell the truth.

The surveyor knows very precisely the expected number of 'lies', can extract accurate data, and the respondent has confidence and control over their privacy. All without a transistor.
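
A rough simulation of the coin-and-clock trick, to show where the "accurate data" comes from. It assumes a fair coin, a clock that lands in each half of the minute with equal probability, and a yes/no question; the function names and the 30% figure are invented for the example. Under those assumptions P(yes reported) = 0.25 + 0.5 * P(yes true), so the surveyor inverts that to get the true rate.

```python
import random

def randomized_answer(truth, rng):
    """One respondent's reported answer under the coin-and-clock scheme."""
    coin_is_heads = rng.random() < 0.5    # private coin flip, heads = yes
    clock_says_coin = rng.random() < 0.5  # second hand in 1-30: answer per the coin
    return coin_is_heads if clock_says_coin else truth

def estimate_true_rate(answers):
    """Invert P(yes reported) = 0.25 + 0.5 * p to recover p, the true yes rate."""
    observed = sum(answers) / len(answers)
    return 2.0 * observed - 0.5

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Made-up population in which about 30% would truthfully answer yes.
true_rate = 0.30
truths = [rng.random() < true_rate for _ in range(100_000)]

# The surveyor only ever sees the randomized answers.
reported = [randomized_answer(t, rng) for t in truths]
estimate = estimate_true_rate(reported)

print(f"estimated yes rate: {estimate:.3f} (simulated true rate: {true_rate})")
```

No individual "yes" proves anything, since it may have come from the coin, but the estimate tightens as more people answer. With few respondents it can be noisy, and can even fall outside 0 to 1.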
NYT Random Login Generator by majcher · 2002-07-21 06:33 · Score: 5, Informative

Hey, it's me. The guy who put together and hosts the New York Times random login generator. First off, thanks for all your cards and letters - I originally just created that page to save myself some trouble, but I'm glad to see that everyone likes it so much.

I'd also like to remind everyone that you're welcome to download, copy, and mirror the source of that page on your own servers, or even keep it as an HTML page on your desktop or whatever. It's just javascript, so it's portable, and that way you'll still be able to use it when the NYT lawyers finally get around to noticing it or they start blocking requests from my page or something. (It will also help distribute my load, though I haven't had any real trouble yet...)
1. Re:NYT Random Login Generator by shepd · 2002-07-21 12:55 · Score: 4, Insightful
  
  >If you're one of those paranoid psychos, then don't give them your life story.
  
  Too bad there's no "Skip this crap" option in their registration screen, huh?
  
  So, the only way to not give them your life story is to lie. I know! Let's make it easy and create a random login generator so I don't have to type more random crap on every computer I use!
  
  And, BTW, if you think I'm paranoid, I'll let you know that I was able to make any changes I wanted [but only did what I asked, of course] to my grandmother's phone line by simply asking her age and full name -- ALL of which are sent to NYT on that page. They only asked to hear a lady's voice, which my mother happily provided. Armed with just a birthdate and name I can make all sorts of changes to your services -- anonymously.
  
  Knowing that, do you want to give me your name and address? If you don't, you should know there's no reason why I'm not working at the NYT right now... I will tell you that where I do work I have access to many, many, many records including Full Names and Birthdates. Feeling uneasy yet? Well, if you trust me, I've never abused those privileges.
  
  >When they change their registration process and perhaps charging for their online content, don't start bitching.
  
  My only bitching will be the fact their site goes offline for everyone. You can't compete in a (literally) Free market by charging infinitely more than your competitors. With the amount of newspapers online right now, and the amount of good content that doesn't come from the NYT, I think they'll end up another salon.
  
  --
  If you could be told what you can see or read, then it follows that you could be told what to say or think - BoC
Randomizing for Accuracy by hysterion · 2002-07-21 10:00 · Score: 4, Funny

Rakesh Agrawal and Ramakrishnan Srikant have devised a data-mining program that would cloak individual truthful answers
Don't trust these guys. They are (obviously) piping their names through some obfuscation algorithm.

--
Timeo idiotikOS et dona ferentes