How Face Recognition Can Uncover SSNs
nonprofiteer writes "Building on previous work showing that social security numbers are not random, CMU researchers ran experiments in which they predicted students' social security numbers after taking a photo of them with a cheap webcam. Using off-the-shelf facial recognition technology and data-mining publicly available Facebook photos and profile information, they were able to come up with the social security numbers of several of the students. (More impressive, as they note that 60% of the students were foreign, and had no SSNs, leaving them a pool of less than 50)."
Has nothing to do with nuclear submarines.
Seven puppies were harmed during the making of this post.
90% of Americans don't care if you know anything and everything about them, are invading their privacy, tracking their behavior or identifying their SSNs. They latch onto kitschy phrases like "The government owns Facebook" but they don't really understand what their personal and private freedoms are worth.
When the foot seeks the place of the head, the line is crossed. Know your place. Keep your place. Be a shoe.
The writeup made it sound like you could look at a crappy snapshot of a person and magically discover their SSN. What actually happened is that they trawled the Facebook profiles for hometown and date of birth to work out the SSNs; the webcam was just there to match the person currently sitting at a terminal with their Facebook profile. The story is basically: off-the-shelf facial recognition software seems to work pretty well, even with a crappy webcam.
I read the internet for the articles.
I find this article title to be silly.
What they do is use facial recognition to match people to their Facebook profile, then use the details stored there to obtain the SSN.
Up next:
- How names and surnames can Uncover SSN
- How giving people your email address can Uncover SSN.
- How running a facebook search can Uncover SSN
The algorithm found out people's hometowns and dates of birth, and used them to determine the first 5 digits of the SSN (not the scarier last 4 digits).
The reviewer, unsurprisingly, left off (or didn't emphasize) a quite important part of the study. Still, it's pretty neat. From TFA: "At the head of the research team was Alessandro Acquisti, a CMU professor who pointed out in 2009 that the social security number system has a huge security flaw: social security numbers are predictable if you know a person’s hometown and date of birth [emphasis mine]. This study essentially adds a facial recognition component to that study. Acquisti, Ralph Gross and Fred Stutzman ran three experiments. In the first, they data mined Facebook for photos of people with searchable profiles. They then used that database of faces and identities when applying off-the-shelf facial recognition technology (PittPatt) to “anonymous” singles on a popular dating site. Acquisti told me in an interview last month that they were able to reidentify 15% of the digital Cupids. In the second experiment, they used a $35 webcam to take photos of CMU students. They then asked the 93 participants to take a quick online survey. While they did that, the facial recognition software went to work figuring out who they were. Acquisti told me that 42% of those participants were linked to their Facebook profiles. Finally, the third experiment was the one to link faces to their unique nine digits. For those participants who had date of birth and city publicly available on their account, the researchers could predict a social security number (based on the work from their 2009 study)."
(That would also be "Place of Birth", not hometown, as those two items are often quite different.)
Which makes sense, since you couldn't more than guess at the last 4 no matter how much info you have.
Is it really an issue that people can use a webcam to make up a number which shares 5 digits with my SSN?
http://lkml.org/lkml/2005/8/20/95
Finding SSNs by using facial recognition software is just one use of this. More important is that facial recognition can be used to search for people and find out who they are. Sure, the SSN is part of that data, but it looks like the more important part here is connecting the face to the name and location.
You can't handle the truth.
first thought: "... how could the government know what your face will look like when they give you your ssn?"
The real headline should be: "Access to your Facebook Profile can uncover your SSN"
First line: "Oh btw, you can figure out whose facebook profile to troll by using facial recognition."
Is it really an issue that people can use a webcam to make up a number which shares 5 digits with my SSN?
Possibly, given that the last 4 digits (the ones this technique can't guess) are commonly used to display a "sanitized" short SSN. For instance, my student loan paperwork has xxx-xx-nnnn for an identifier...
Don't tag me bro. Don't identify me bro. Don't track me bro. Don't research me bro.
CMU, fuck you.
Finally, the third experiment was the one to link faces to their unique nine digits
For those participants who had date of birth and city publicly available on their account, the researchers could predict a social security number (based on the work from their 2009 study). The researchers sent a follow-up survey to their student participants asking them whether the first five digits of the social security number their algorithm predicted was correct.
I'm missing a little something here.
Until recently, the first five digits were, by definition, based on state/city and birthdate. Ask a genealogist or anyone interested in "private eye" stuff from the past couple of decades... they probably have a table where you can look up the first five vs. location. The first three were strictly based on state; I was born in WI in the 70s; we all have the same first 3. The next two were issued more or less by city/hospital. So everyone born in the same hospital, pretty much for that year, has the same first five. At most, they had a rather shallow pool of a couple to draw from. Why they needed a study in 2009 to "discover" something that has been in endless publications is a mystery. It's like saying we need a "study" to "discover" how to fill out an IRS 1040 form based on neural network analysis of a statistical sample of tax returns, or we could just RTFM or RTF govt publication explaining in great detail what the answer already is.
You don't even need a statistical sample study. Just pull the SSDI (Social Security Death Index) and chug away. Notice anything interesting about the publicly available SSNs for people born in Milwaukee in the mid 70s who are already dead? You have to wonder about old people: if the only person left alive from my Grandma's birthplace/birthyear is granny, and all SSNs for that year and hospital are in the SSDI except for the one ending in 1234, hmm, I wonder what granny's SSN might be? The point being that the "secret" is by no means 4 digits long = 1 out of 1e4. It's more like 1 out of (1e4 minus the number of dead people per the SSDI). I would imagine some entire swaths of the SSN namespace are dead people in the SSDI, except for the few elderly still living.
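To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes you already hold a list of the last-4 serials that appear in the SSDI for one particular first-five prefix; the function names and the sample serials are made up for illustration, and the 10,000 pool simply mirrors the 1e4 figure above.

```python
def remaining_serials(known_serials):
    """Return the 4-digit serials NOT already accounted for.

    known_serials: iterable of 4-character strings, e.g. serials seen in
    a death index for one first-five prefix (hypothetical input).
    """
    # Pool of 10,000 serials, matching the "1 out of 1e4" figure in the post.
    all_serials = {f"{n:04d}" for n in range(10_000)}
    return sorted(all_serials - set(known_serials))


def guess_odds(known_serials):
    """Chance of guessing the right serial on the first try."""
    remaining = remaining_serials(known_serials)
    if not remaining:
        raise ValueError("every serial is already accounted for")
    return 1 / len(remaining)


if __name__ == "__main__":
    dead = ["0001", "0002", "0003"]  # made-up sample data
    print(len(remaining_serials(dead)), guess_odds(dead))
```

In the granny case, where every serial but one is already in the index, `remaining_serials` returns a single entry and the odds are 1.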
The other mystery is all they verified was the "public" half of the SSN. The "private" 4 digits was not verified. So, they've accomplished ... nothing.
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
The article says they used a $35 webcam. Imagine what they could have done if they had a $100 webcam! That would be almost 3 times the facial recognition and 3 times the SSN cracking! Oh noes! Don't give them any more funding! -www.awkwardengineer.com
There are two ways the government can keep its promise never to let SSNs become national IDs.
1. Create a new national ID system.
2. Use this as an excuse to get rid of the entire social security system.
I'll see your senator, and I'll raise you two judges.
Derp http://en.wikipedia.org/wiki/SSN_%28hull_classification_symbol%29
FTA: "The researchers sent a follow-up survey to their student participants asking them whether the first five digits of the social security number their algorithm predicted was correct. "
SS numbers are 9 digits long. Matching the first 5 digits isn't matching 9 digits. The first 3 are associated with place, the second 2 are fairly predictable based on when the SSN was issued, but the last 4 are just assigned sequentially. Also, there is no requirement to get an SSN shortly after birth, so SSNs aren't even necessarily associated with birth date.
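For anyone who wants the 3-2-4 split spelled out, a small sketch follows. The names `SSNParts`, `split_ssn` and `mask_ssn` are hypothetical, and 123-45-6789 is just a placeholder number, not a real one.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # first 3 digits, associated with place
    group: str   # next 2 digits
    serial: str  # last 4 digits


def split_ssn(ssn: str) -> SSNParts:
    """Split 'AAA-GG-SSSS' (dashes optional) into its three fields."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected nine digits")
    return SSNParts(digits[:3], digits[3:5], digits[5:])


def mask_ssn(ssn: str) -> str:
    """Show only the last 4, the way 'sanitized' paperwork often does."""
    return f"xxx-xx-{split_ssn(ssn).serial}"


if __name__ == "__main__":
    print(split_ssn("123-45-6789"))
    print(mask_ssn("123-45-6789"))
```

The masked form exposes exactly the four digits that the first-five prediction does not cover.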
"National Security is the chief cause of national insecurity." - Celine's First Law
Given your face they can track back to a name, and frequently a birthdate and home town.
If you're younger than 40, that's almost always enough to get the first 5 digits of your SSN.
For added stupidity, a con artist using LinkedIn could then ping you with a job and ask for the last 4 of your SS#, thereby getting your entire SS# and possibly a signature.
I thought the last four were assigned incrementally and could be guessed reliably based on birthdate
"The researchers sent a follow-up survey to their student participants asking them whether the first five digits of the social security number their algorithm predicted was correct."
No word on how well they did, either.
From the Schneier Study: "Information about an individual's place and date of birth can be exploited to predict his or her Social Security number (SSN). Using only publicly available information, we observed a correlation between individuals' SSNs and their birth data and found that for younger cohorts the correlation allows statistical inference of private SSNs. The inferences are made possible by the public availability of the Social Security Administration's Death Master File and the widespread accessibility of personal information from multiple sources, such as data brokers or profiles on social networking sites."
What that means is that since SSN ranges are allocated regionally, and individual SSNs are generated sequentially, people born in the last 30 years around the same time in the same area will have similar SSNs. This isn't all that magical, and relies on consistent SSN allocation practices. It's just another form of social engineering. The SSA can completely stymie this with just a little bit of randomization.
The SSN was never intended as a means of identification initially, but:
1. When a system of identification was needed, the SSN system was already in place;
2. In theory, SSNs have a 1:1 person-to-number correspondence, unlike other forms of identification (name, birthplace, birthdate, etc.);
3. Without such a system, the government would perform much more invasive checks for things like employment, voting, and banking.
So either you accept that the government shouldn't be doing such things (so "illegal" immigrants can work, dead people can vote, and terrorists can open bank accounts, e.g.) or you recognize that SSNs are the lesser of two evils.
That doesn't mean there couldn't be a better system, but such a system would invariably require the government to keep even more information about its citizens.
Foreign or not, you apply for and get an SSN when you enroll there... unless they participated in the experiment during the first 2 weeks of enrollment. Furthermore, most foreign students not only will have SSNs but will have similar ones if they applied the same day. That may explain the high success rate in guessing the first 5 digits... Go figure...
Why do I need the webcam again?
Yes, I'm aware of the link to the first 5 digits. That's how they make up their SSN that matched 5 digits.
It's the last 4 that is the trick and they didn't move the needle on this.
You're far more likely to have your SSN taken in a hacking right now than by this webcam anyway.
http://lkml.org/lkml/2005/8/20/95
Typical racist-inbred-fat-white southerner...
Well if they can guess the first 5, the last 4 are often used by different institutions to identify you over the phone, or at least they try... So I'm sure for a lot of people, the last 4 are documented somewhere.
Like spotting 3 breasted women and cyclops kids. Must be from that part of the state.
As long as the Republicans are in the pockets of these banks and fight the nomination of true consumer rights advocates like Elizabeth Warren, these things will continue to happen.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
that'd mean the government broke its promise when it instituted the Social Security program.
Wait, what? What do you think the two S in SSN mean?
Dilbert RSS feed
Wait, what? What do you think the two S in SSN mean?
"Social" and "security".
That has nothing to do with the promise that the SSN would never be used as a national ID, which is the promise already broken.
The next promise to go will be the "security" part.
It will always be "social". I guess. Kinda like a government-run Facebook or MySpace. More like MySpace, since it will be suckier.
Hate to intrude with an original thought. We have fairly strict libel laws to prevent slathering misinformation about a person hither and yon, whether the SOB deserves it or not.
Linking vast swathes of electronic records together of dubious provenance, accuracy, and agenda is in many ways worse than public slander: it only takes place in closed rooms behind your back with your immediate financial interests at stake, it's hard or impossible to prove this is going on, and recourse under the law heavily favours the windmill.
When it's just one institution putting black marks on your file for lodging an accurate complaint, so be it. In the theory of the market, you can sever your relationship and start fresh with a different service-minimizing, TOS-touting telecom-in-training.
When your insurance company puts a black mark on your file for filing a successful claim, and then they share with every other financial institution on the planet that you're a born complainer, or it gets linked up surreptitiously behind the scenes, this is not right.
Using a government sanctioned number just makes it that much easier to pretend "the number is really you" rather than using some UID of their own devising, which is clearly just an access key into a database of dirt cobbled together by grasping econocrats.
When I was responsible for anonymizing data to provide test cases for external developers, part of the process was changing all birth dates to the 1st of the month. That's good enough for just about any analytical purpose except astrological predictions. Changing the last 2 digits of the zipcode to "99" significantly fuzzed the location. Might not be sufficient to mask the identity of the occasional 103 year-old in a sparsely populated region, but nothing to lose sleep over.
I've never posted my true birthdate on any public site.
What did they use to get past all the duckface and tongue-hanging-out pictures?
First off, hometown don't mean shit.
I didn't get assigned my SSN in my hometown, i was across the country at the time.
In fact, i've had local pigs claim i was giving them a fake SSN back when i would get hassled more (when i was a junkie).
Of course, the average IQ of the local police is like 12 or something.
But whatever.
The other weird part is, most peeps I grew up with, don't live here anymore. So once again, what does hometown have to do with shit?
Be seeing you...