Cloud-Powered Facial Recognition Is Terrifying
oker sends this quote from The Atlantic:
"With Carnegie Mellon's cloud-centric new mobile app, the process of matching a casual snapshot with a person's online identity takes less than a minute. Tools like PittPatt and other cloud-based facial recognition services rely on finding publicly available pictures of you online, whether it's a profile image for social networks like Facebook and Google Plus or from something more official from a company website or a college athletic portrait. In their most recent round of facial recognition studies, researchers at Carnegie Mellon were able to not only match unidentified profile photos from a dating website (where the vast majority of users operate pseudonymously) with positively identified Facebook photos, but also match pedestrians on a North American college campus with their online identities. ... '[C]onceptually, the goal of Experiment 3 was to show that it is possible to start from an anonymous face in the street, and end up with very sensitive information about that person, in a process of data "accretion." In the context of our experiment, it is this blending of online and offline data — made possible by the convergence of face recognition, social networks, data mining, and cloud computing — that we refer to as augmented reality.'
The first real-world, publicly available use of this will be an app that lets you:
1. Take a picture of someone with your smart phone
2. Find naked pictures of this person online
BRB, heading to the local college campus...
This is why Google shelved their version of this tech. The implications were too big.
Having studied this in college and witnessed many failed implementations of it, I casually ask: Where are the recall rates (see also sensitivity and specificity) of these experiments?
Because when I read the articles, I found this instead of hard numbers:
Q. Are these results scalable?
The capabilities of automated face recognition *today* are still limited - but keep improving. Although our studies were completed in the "wild" (that is, with real social networks profiles data, and webcam shots taken in public, and so forth), they are nevertheless the output of a controlled (set of) experiment(s). The results of a controlled experiment do not necessarily translate to reality with the same level of accuracy. However, considering the technological trends in cloud computing, face recognition accuracy, and online self-disclosures, it is hard not to conclude that what today we presented as a proof-of-concept in our study, tomorrow may become as common as everyday's text-based search engine queries.
Why you think Google passed on continuing down this road is up to you. Frankly, I would surmise that the type I and type II errors become woefully problematic when applied to an entire population. Facial recognition is not there yet, not until I see some hard numbers that convince me the error rate is low enough. Right now I bet that if you were to snap pictures of 10,000 people, you would incorrectly classify at least 100 of them (a 1% error rate), leading to wasted time, violated rights and wasted opportunity (depending on the misclassification).
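For anyone who wants to see what "hard numbers" would look like here, a rough sketch follows. The confusion-matrix counts are hypothetical, picked only so that 10,000 snapshots yield exactly 100 misclassifications as in the bet above; they are not results from the Carnegie Mellon study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Recall / true positive rate: share of real matches that were found."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of non-matches correctly rejected."""
    return tn / (tn + fp)


def precision(tp: int, fp: int) -> float:
    """Share of reported matches that are actually correct."""
    return tp / (tp + fp)


# Hypothetical counts for 10,000 snapshots (assumption, not measured data):
# only 50 people are really in the photo database.
tp, fn = 40, 10      # real matches found / missed (type II errors)
fp, tn = 90, 9860    # wrong matches (type I errors) / correct rejections

total = tp + fn + fp + tn
misclassified = fp + fn

print(f"snapshots:     {total}")
print(f"misclassified: {misclassified} ({misclassified / total:.1%})")
print(f"sensitivity:   {sensitivity(tp, fn):.3f}")
print(f"specificity:   {specificity(tn, fp):.3f}")
print(f"precision:     {precision(tp, fp):.3f}")
```

With these made-up counts the overall error rate is only 1%, yet fewer than a third of the reported matches are correct, which is why a single accuracy figure is not enough.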
My work here is dung.
I am a good looking female. When I was a waitress I had a stalker at my workplace. Because the schedule was posted in view-- not a clear view, but view enough for him to find an opportunity to read it without looking suspicious-- he consistently showed up during work hours and tried to follow me home. I didn't have a car, so I walked home alone in the middle of the night; I worked 3rd shift at a 24-hour diner. This might seem like a poor choice, but I desperately needed a job. With this technology a stranger could find out who I am through a picture of me taken with his cellphone. This is also dangerous for people in the sex industry who are already way more vulnerable to stalking than I was walking home from 3rds at a diner. I'm now doing amateur porn-- difficult to resist when it earns an unskilled laborer a grownup sized income for part time hours-- but my image is everywhere online.
Let's take JFK. From Wikipedia:
In 2010, the airport handled 46,514,154 passengers
2% of that is almost a million people. Every year. Now, let's assume handling each of these false positives is the work of an hour on average. That's about a million hours spent.
Let's assume a workday of 8 hours, and 250 workdays a year. That's about 2000 hours a year for an average worker. So it'll take 500 people to track these false positives at JFK.
I think it's a little unacceptable, but YMMV of course.
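The back-of-the-envelope math above can be checked with a few lines. This is only a sketch using the figures from this comment: the 2% false positive rate, one hour per case, and the 8-hour, 250-day work year are all assumptions, and the function name is made up for illustration.

```python
def staff_needed(passengers: int,
                 false_positive_rate: float,
                 hours_per_case: float = 1.0,
                 hours_per_day: float = 8.0,
                 workdays_per_year: int = 250) -> tuple[float, float, float]:
    """Return (false positives, hours of work, full-time staff) per year."""
    false_positives = passengers * false_positive_rate
    hours = false_positives * hours_per_case
    staff = hours / (hours_per_day * workdays_per_year)
    return false_positives, hours, staff


# JFK passenger count for 2010, as quoted from Wikipedia above.
false_positives, hours, staff = staff_needed(46_514_154, 0.02)

print(f"false positives per year: {false_positives:,.0f}")
print(f"hours of follow-up work:  {hours:,.0f}")
print(f"full-time staff required: {staff:,.0f}")
```

Without the rounding up to a million, the unrounded figures come out to roughly 930,000 false positives and about 465 full-time staff, so the 500 estimate is in the right ballpark.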
Write boring code, not shiny code!