Political Affiliation Can Be Differentiated By Appearance
quaith writes "It's not the way they dress, but the appearance of their face. A study published in PLoS One by Nicholas O. Rule and Nalini Ambady of Tufts University used closely cropped greyscale photos of people's faces, standardized for size. Undergrads were asked to categorize each person as either a Democrat or Republican. In the first study, students were able to differentiate Republican from Democrat senate candidates. In the second, students were able to differentiate the political affiliation of other college students. Accuracy in both studies was about 60% — not perfect, but way better than chance."
Of course, if you RTFA, the photos of other students were all Caucasian.
So if you always said a "black guy" was a Democrat, it wouldn't have any effect on the results at all.
Statistical significance can't be pinned down to a number like .8% in the general case, because it depends hugely on the sample size. However, the parent poster is correct that the article was referring to statistical significance, not necessarily to a strong correlation. Generally speaking, a study like this starts from the assumption that there is no connection between appearance and political affiliation, i.e. that the average accuracy of the guesses should be something like 50%. (That baseline could be higher or lower depending on how the study was run: for example, if there were three possible parties to choose from instead of two, or if it was well known that 90% of the participants all belonged to a given party.) The researchers then run an experiment that provides evidence for or against that hypothesis. Whatever they were expecting (let's say 50% correct answers if the guesses were totally random), they found 60% correct answers. Because of the number of people participating in the study, they determined that the chance of finding 60% correct answers if the guesses really were random (i.e. there was no hint from appearance) would have been astronomically small. In this way, 60% correct can be incredibly convincing evidence that appearance is linked to political affiliation, even if that link is relatively weak (after all, 60% is not that much more than 50%).
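To put rough numbers on that reasoning, here is a minimal Python sketch of the "how likely is 60% by pure chance?" calculation. The trial counts are made up for illustration and are not the study's actual sample sizes, the function name `binomial_tail` is hypothetical, and the 50% baseline is the assumption discussed above.

```python
from math import comb

def binomial_tail(n_trials: int, n_correct: int, p_chance: float = 0.5) -> float:
    """Probability of getting at least n_correct guesses right out of
    n_trials if every guess is pure chance with success rate p_chance."""
    return sum(
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Same 60% accuracy each time; only the (hypothetical) number of guesses changes.
for n in (10, 100, 1000):
    n_correct = n * 6 // 10
    print(f"{n_correct}/{n} correct: chance probability = {binomial_tail(n, n_correct):.3g}")
```

The accuracy never moves from 60%, but the probability of reaching it by luck shrinks rapidly as the number of guesses grows, which is why the same 60% can be unremarkable in a small sample and overwhelming in a large one.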
60% versus 50%? How is that WAY better?
With a large enough sample size a result like this can be highly statistically significant, but still useless as a predictor.
For example, if I have 200 marbles, half white and half black, pull them out at random, and ask you to predict the colour of each one, and you guess correctly 60% of the time (60 white marbles and 60 black marbles correct), you'd be bumping up against three sigma (over 99%) odds of your results NOT being due to chance, but to some incredible marble-colour-guessing gene that evolution or possibly archaebacteria had slipped you. Up the number to 20,000 marbles with 60% accuracy and you'd be a proven phenomenon, even though your utility as a marble-colour picker would be pretty much nil unless it also happened to work on a roulette wheel.
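For anyone who wants to check the sigma arithmetic, here is a rough sketch using the normal approximation to the binomial. The helper name `sigmas_above_chance` and the particular sample sizes are illustrative assumptions, and the 50% chance rate assumes equal numbers of white and black marbles.

```python
from math import sqrt

def sigmas_above_chance(n_guesses: int, n_correct: int, p_chance: float = 0.5) -> float:
    """How many standard deviations n_correct sits above the number of
    correct guesses expected from chance alone (normal approximation)."""
    expected = n_guesses * p_chance
    std_dev = sqrt(n_guesses * p_chance * (1 - p_chance))
    return (n_correct - expected) / std_dev

# 60% accuracy at three illustrative sample sizes.
for n in (200, 2_000, 20_000):
    print(n, round(sigmas_above_chance(n, n * 6 // 10), 1))
# 200 2.8
# 2000 8.9
# 20000 28.3
```

The sigma count grows with the square root of the number of marbles, while the hit rate stays stuck at 60%.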
This can be hard for people outside the machine learning community to understand: an enormously significant result, statistically, can still make for a practically useless classifier.