Are 625 Pixels Enough To Identify Sex?
mikejuk writes "A Spanish research team have patented a video camera and algorithm that can tell the difference between males and females based on just a 25x25 pixel image. This means that there is enough information in such low resolution images to do the job! They also demonstrate that an old AI method, linear discriminant analysis, is as good and sometimes better than more trendy methods such as Support Vector Machines..."
...on what it's an image OF.
Am I the only person imagining genitalia icons?
What'dya mean there's no BLINK tag!?
"The also demonstrates that an old AI method, linear discriminant analysis, and demonstrates that it is as good and sometimes better than more trendy mehods such as Support Vector Machines"
I think the summary accidentally forgot the
Works on a 25x25 pixel image*
(* Pixels need to be a shade of pink)
CSI can do it with only ONE pixel!
Chaos maximizes locally around me.
Determine gender at what precision? TFA wasn't very enlightening... indeed, listing mis-identified faces doesn't really help much here.
This is like the problem of false positives in airport scans, but without the terrorists. :P
"We have to go forth and crush every world view that doesn't believe in tolerance and free speech." - David Brin
http://totallylookslike.icanhascheezburger.com/2010/02/04/justin-bieber-totally-looks-like-ellen-page/
Here you go:
Male: |
Female: O
It's not like using linear discriminant analysis is some crazy or countercultural thing. It's a common simple technique. On some data it works well, and on such data, it's not uncommon to use it. It's particularly common in image-identification type tasks, and is one of the classic approaches to face recognition.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
that an application of a standard machine learning method can be patented. They have a publication in a good journal (PAMI), but there is nothing earth-shattering in the research. As far as the comparison with SVM is concerned, non-linear SVM does beat the linear methods when there is enough data (as they acknowledge in the paper).
The algorithm is also interesting in that it proves that an older and fundamental pattern recognition technique - linear discriminant analysis is just as good as the more trendy Support Vector Machines if used correctly and much more efficient.
A bit of clarity might be useful here. Support vector machines use linear discriminants as the central part of the algorithm. These linear discriminants, which are simply hyperplanes separating two regions, are defined by a subset of the data points (called the support vectors). The other key part of an SVM is that it projects the data into a high-dimensional space, where a hyperplane can appear as a curve or some other shape in the original space. This higher-dimensional space is determined from the data using distances between the points in the data set (it's a kernel space).
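To make the "hyperplanes can appear as curves" point concrete, here is a minimal sketch on made-up 1-D data. It is not an SVM: the feature map phi and the weights w and b are hand-picked assumptions for illustration, whereas a real SVM works through a kernel and chooses the hyperplane by maximizing the margin. It only shows that a flat separator in a lifted space can act as a non-linear rule in the original space.

```python
# Toy data (made up): class +1 when |x| > 1, class -1 otherwise.
# No single threshold on x separates the classes, but after the explicit
# feature map phi(x) = (x, x^2) the flat boundary x^2 = 1 does.

def phi(x):
    """Hypothetical explicit feature map standing in for a kernel."""
    return (x, x * x)

def linear_classify(z, w=(0.0, 1.0), b=-1.0):
    """Sign of w.z + b, i.e. a hyperplane in the lifted 2-D space.

    w and b are hand-picked for this toy case, not learned.
    """
    score = w[0] * z[0] + w[1] * z[1] + b
    return 1 if score > 0 else -1

xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1, 1, -1, -1, -1, 1, 1]

predictions = [linear_classify(phi(x)) for x in xs]
print(predictions == labels)
```

In the original 1-D space the same rule reads "x < -1 or x > 1", which no single linear threshold can express.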
The net result of all this is that SVMs are pretty much guaranteed to always perform better than a simple linear discriminant in terms of misclassification error on the training data, as every possible linear discriminant is considered in building the SVM. But it can be slower, and it can overfit.
So what's going on here? Linear discriminant analysis is an old statistical technique (1930s) that fixes a hyperplane based on distributional assumptions about the two classes. This allows the two classes to be plotted in a simple histogram by projecting them onto the normal of this hyperplane, as shown in the picture in the article. It's used all over in statistics, and it works very well when dealing with two symmetric Gaussian distributions with the same covariance (that's what the theory assumes).
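For anyone who wants to see the projection idea in action, here is a rough sketch of a two-class Fisher discriminant in plain Python. Everything in it is an assumption for illustration: the data are synthetic 2-D Gaussians rather than face images, the helper names are made up, and the midpoint threshold presumes equal class sizes and a shared covariance. It is not the method from the paper.

```python
import random

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, mu):
    """2x2 scatter matrix: sum of outer products of (p - mu)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(a, b):
    """Return w = Sw^-1 (mu_a - mu_b), the normal of the LDA hyperplane."""
    mu_a, mu_b = mean(a), mean(b)
    sa, sb = scatter(a, mu_a), scatter(b, mu_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return w, mu_a, mu_b

def project(p, w):
    """Scalar projection of a point onto the direction w."""
    return p[0] * w[0] + p[1] * w[1]

# Synthetic stand-in data: two Gaussian blobs with equal spread.
rng = random.Random(0)
class_a = [(rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)) for _ in range(200)]
class_b = [(rng.gauss(-2.0, 1.0), rng.gauss(-2.0, 1.0)) for _ in range(200)]

w, mu_a, mu_b = fisher_direction(class_a, class_b)

# Threshold halfway between the projected class means.
threshold = (project(mu_a, w) + project(mu_b, w)) / 2.0

correct = sum(project(p, w) > threshold for p in class_a)
correct += sum(project(p, w) <= threshold for p in class_b)
accuracy = correct / (len(class_a) + len(class_b))
print(accuracy)
```

The values project(p, w) are the one-dimensional numbers you would histogram per class to get the kind of picture shown in the article.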
Thus the reason it works well here is that they've managed to transform their data in such a way that the two classes look like this sort of distribution. That's the insight here, not the choice of classifier. When the simplest model works, more complex techniques will overfit, meaning that you train on noise instead of the underlying structure of the data.
Does having a witty signature really indicate normality?
Here's what 25x25 pixel faces look like, using the example from the article: Picasa Web Albums - Paul Nickerson
That's exactly the kind of sloppy thinking that had us "remediating" software for three years prior to Y2K. Where, in your grand scheme of things, are the values for (as examples): Michael Jackson, Lady Gaga and Richard Simmons? Please, won't somebody think of the mutants?
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
)
Oh, what sad times are these when passing ruffians can open parenthesis at will on Internet forums.
How can I believe you when you tell me what I don't want to hear?
These things can never become truly 100% perfect, as there are lots of people who will show up as statistical anomalies. There are, for example, people who suffer from hormonal imbalances resulting in overly feminine looks in a male, or overly masculine looks in a female. Transsexual people will be just as hard for these things: hormonal medication does not change skeletal features, but it does change the distribution of fat in the body, including the face, and thus for a machine they'll likely fall in the grey area between the two genders. And how about intersexual people who are physically neither gender? I used to have a friend who was IS, and it was really hard to tell from looks alone what gender one should assume. Mentally she identified as female, but that obviously can't be told from a picture.
This also makes me wonder about the future. I hope this "gender guessing machinery" does not become the norm in our society and public areas, because it will lead to lots of issues for the aforementioned groups of people.
...so can anybody from the old BBS days.
No sig today...
can do it with fewer pixels.
Dunno about this. I've seen some people in the full 3d glory of real life that I could not discern the gender of.
I can determine gender with just one or two digits, but I almost invariably get slapped for using this method.
I've abandoned my search for truth; now I'm just looking for some useful delusions.