Using Facebook Data, Algorithm Predicts Personality Better Than Friends

Posted by Soulskill on Monday January 12, 2015 @11:51AM from the worst-scifi-plot-ever dept.

sciencehabit writes: A new study of Facebook data shows that machines are now better at sussing out our true personalities than our friends. One of the standard methods for assessing personality is to analyze people's answers to a 100-item questionnaire with a statistical technique called factor analysis. There are five main factors that divide people by personality—openness, conscientiousness, extraversion, agreeableness, and neuroticism—which is why personality researchers call this test the Big Five. People can accurately predict how their friends will answer the Big Five questions. ... Compared with humans predicting their friends' personalities by filling out the Big Five questionnaire, the computer's prediction based on Facebook likes was almost 15% more accurate on average, the team reports online today in PNAS (abstract). Only people's spouses were better than the computer at judging personality.

19 of 80 comments

2015: Still using Facebook by kheldan · 2015-01-12 12:07 · Score: 2, Insightful

Why? Why, with everything that everyone knows about Facebook, all the privacy violations, all the obvious signs that they really don't give a rat's ass about the users, just the money that users' data can earn them, would anyone still be using Facebook? Is it willful ignorance? Or is it deep denial? Now we find out: Facebook can be, and is being, used to profile people. Come on, is this what you all really want?

Disregard Facebook. Take your life back.

--
Are YOU using the TOOL, or is the TOOL using YOU? Think about it!
1. Re:2015: Still using Facebook by PRMan · 2015-01-12 12:19 · Score: 4, Informative
  
  And yet, I got dirty looks in church on Sunday because I didn't know somebody was seriously ill for a month with pneumonia. Apparently, everybody (but me) has been talking about it on Facebook and if I don't know I'm the bad guy.
  
  --
  Peter predicted that you would "deliberately forget" creation 2000 years ago...
2. Re:2015: Still using Facebook by mythosaz · 2015-01-12 12:35 · Score: 3, Insightful
  
  Why are people still using Facebook? Because other people are, and they use it as their medium to schedule events and coordinate activities.
  90% of my Facebook activity is devoted to participation in a handful of secret/private groups, and the other 10% is responding to event invites -- some of which are "go, no-go," others are FCFS based on responses to the invites.
  Also, I mostly DNGAF about Facebook (or Google, or whomever) knowing what flavor potato chip I prefer because I used my club card at the store. Google gave me $15.98 on their Opinion Rewards platform for knowing even MORE about me. Whee!
3. Re:2015: Still using Facebook by Shakrai · 2015-01-12 12:43 · Score: 2
  
  Why? Why, with everything that everyone knows about Facebook, all the privacy violations, all the obvious signs that they really don't give a rat's ass about the users, just the money that users' data can earn them, would anyone still be using Facebook?
  Because social networking is better than what preceded it, and Facebook has a critical mass of users that makes the alternatives (G+) pale in comparison? I have friends on five different continents. Is there an easier way to remain in contact with them? To stay abreast of the developments in their lives and to keep them current on mine? Additionally, I have friends in countries where texting isn't included in their base phone plans, so they all invariably use FB Messenger for communications that Americans would conduct over SMS. My choice is to use Facebook or to wall myself off from these people. My irritation with Facebook's nonsense is not high enough to choose the latter. Besides, FB only knows that which I choose to share; if you choose to share every single trip to the grocery store and every single sexual partner, they're going to build quite the profile on you. If you're a bit more selective, then they won't have as much information. Common sense applies here, people.
  And I'll smack the first person that responds with "just have them e-mail you"; there's a reason why social networking displaced e-mail and anyone who is going to give that glib answer should consider how they would have responded to "just have them write you" when e-mail was the "new thing."
  
  --
  I want peace on earth and goodwill toward man.
  We are the United States Government! We don't do that sort of thing.
4. Re:2015: Still using Facebook by Anonymous Coward · 2015-01-12 13:28 · Score: 2, Interesting
  
  So is death. If you accept the idea of the existence of God, then you probably accept the existence of some kind of afterlife and/or reincarnation. What's a lifetime of suffering, compared to an eternity of bliss? What is death, when it's the beginning of a happier time than life? God is the misunderstood parent getting their child inoculated. A moment of discomfort to avoid a worse fate later.
  
  The idea of the Christian God as evil isn't a new one, though. The Gnostics had the same idea nearly 2000 years ago, and I'm sure that it wasn't a new idea then, either.
5. Re:2015: Still using Facebook by Anonymous Coward · 2015-01-12 13:36 · Score: 2, Interesting
  
  ...or... the notion of disbelieving in God is just a self-rationalization to enable one to live their life without feeling like they actually have any real responsibility for their choices.
  If, when you die, that's it... you are done and over with, and none of the choices you would have made will actually have any bearing on you, then you can do whatever you want, live your life as irresponsibly as you want, in full assurance that death will enable you to escape whatever consequence might otherwise befall you.
  Or maybe... just maybe.... your choices in this life have an actual eternal implication. That's a heckuva lot of responsibility, and I don't blame you for preferring to disbelieve in it, because it's dramatically easier to cope with.
  Doesn't make it true, however. I'm not saying that you're wrong, only that disbelieving in God can be seen as just as cowardly an approach to life as belief in God is sometimes accused of being as a world view.
6. Re:2015: Still using Facebook by solios · 2015-01-12 14:01 · Score: 3, Insightful
  
  It doesn't matter if you use Facebook or not - they can already infer that TV shows and musicians exist via user data and automatically construct pages for them - WKRP In Cincinnati is a good example, or was when I looked at it last summer. If they can infer that media exists, then it stands to reason that they can infer that you exist. Imagine that, if you will - a near future in which you have a fairly accurate social media profile whether you want one or not.
7. Re:2015: Still using Facebook by Actually,+I+do+RTFA · 2015-01-12 17:41 · Score: 2
  
  Imagine that, if you will - a near future in which you have a fairly accurate social media profile whether you want one or not.
  Near future? Ever since they convinced your "friends" to let them mine their phones for numbers, they figured out your social links, and developed fairly accurate profiles of you like 5+ years ago.
  
  --
  Your ad here. Ask me how!
All in the definitions by BarryHaworth · 2015-01-12 12:10 · Score: 5, Insightful

The comment that the algorithm does better at predicting personality than a person's friends will depend very strongly on how you define a friend. I have a very large number of Facebook friends about whom I know almost nothing, so I am not at all surprised that an algorithm will do better.

--
I am a Statistician. One false move and you are a Statistic
1. Re:All in the definitions by Bite+The+Pillow · 2015-01-12 15:55 · Score: 3, Informative
  
  It is not explicit, but it is clear that they did not use Facebook friends, preferring real ones instead.
Reddit ? by denisbergeron · 2015-01-12 12:12 · Score: 2

I hope nobody will ever be able to use my Reddit comments to predict my personality!!!

--
Ceci n'est pas une Signature !
1. Re:Reddit ? by pkinetics · 2015-01-12 12:58 · Score: 2
  
  Don't worry. The NSA already has your file and is sharing it with the FBI and Interpol
What if I have no likes? by Gaygirlie · 2015-01-12 13:30 · Score: 2

I have a Facebook account due to family, but I make maybe one post a year there and I never like anything whatsoever. What does such an algorithm tell about me? I mean, it sounds to me like the algorithm is already biased towards a certain kind of person from the get-go if it only applies to socially outgoing people who enjoy "liking" stuff on Facebook.
1. Re:What if I have no likes? by AHuxley · 2015-01-12 15:47 · Score: 2
  
  Re " .... tell about me?"
  It's a bit like the people who use cryptography or have an interest in privacy services?
  People Lacking Facebook Accounts Viewed As Suspicious (August 8, 2012)
  http://www.dailytech.com/Peopl...
  Beware, Tech Abandoners. People Without Facebook Accounts Are 'Suspicious.' (8/06/2012)
  http://www.forbes.com/sites/ka...
  It really depends on who is doing the tracking and the number of hops to friends and shared likes?
  
  --
  Domestic spying is now "Benign Information Gathering"
Re:Why are these factors? by Anonymous Coward · 2015-01-12 13:34 · Score: 3, Informative

The names of the factors are guesses. Factor analysis looks at the covariance matrix of items, and finds sub-matrices of the total matrix that meaningfully covary. Each one of those sub-matrices is called a factor, or latent variable, which is measured by common covariation between the questions. The number of latent factors found in a questionnaire is typically derived both from theory (we made a questionnaire intended to measure these 6 different things) and from empirical checks (typically Horn's parallel analysis or the Kaiser criterion [which simply means keeping all eigenvalues of the correlation matrix that are greater than one]). The factors are named for whatever commonality best describes the items that load on them, along with external criteria like predicting other theoretically related constructs. The Big 5 are an enormously well studied problem space, and the stability and pervasiveness of these concepts have been well documented and linked to specific gene expressions, developmental trends, etc.
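For anyone who wants to see the Kaiser criterion mentioned above in action, here is a minimal sketch. It assumes NumPy is available, and the six-item correlation matrix is invented for illustration (two groups of three items, 0.6 within a group, 0 between groups); it is not data from the study.

```python
import numpy as np

# Hypothetical correlation matrix for six questionnaire items.
# Items 0-2 correlate 0.6 with each other, items 3-5 likewise,
# and the two groups are uncorrelated. These numbers are made up.
block = np.full((3, 3), 0.6)
np.fill_diagonal(block, 1.0)
R = np.block([[block, np.zeros((3, 3))],
              [np.zeros((3, 3)), block]])

# eigvalsh is meant for symmetric matrices and returns ascending
# eigenvalues, so reverse to get the largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Kaiser criterion: keep as many factors as there are eigenvalues > 1.
n_factors = int(np.sum(eigenvalues > 1.0))

print(np.round(eigenvalues, 2))
print("factors retained:", n_factors)
```

The rule is only a heuristic for choosing how many factors to extract; it says nothing about what the factors should be called.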
Re:Uhm... by radarskiy · 2015-01-12 14:21 · Score: 5, Informative

Haven't you failed to read the article before claiming that it is wrong?
For those playing along at home, Fig.1 from the actual article explicitly refutes the AC's claim.
Glad I'm not on facebook... by Karmashock · 2015-01-12 19:13 · Score: 2

Every day a little gladder.

--
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
Re: Why are these factors? by Anonymous Coward · 2015-01-12 20:53 · Score: 2, Interesting

Except that the Big Five aren't orthogonal, which means they are fairly useless as a personality theory.
Nothing is going to be explicitly orthogonal, and forcing them to be doesn't make the conceptual issue you seem to have any better or worse (n.b., orthogonal connotes a lack of meaningful correlation between the factors. What the parent is complaining about is that each of the latent factors is meaningfully correlated with the other four to different extents). First, we are of course talking about an exploratory (EFA) approach (haven't read the article but the 10-fold CV referenced above makes sense), and partially the distinction between principal components and factor analysis. The Big 5 model itself has been tested using SEM and confirmatory factor analysis, and the structure of five interrelated but non-redundant latent factors has been validated repeatedly. Second, remember that EFA solved using maximum likelihood can be used to assess the null hypothesis that no more factors are necessary to produce acceptable fit within the sample. Thus (although this, from a statistical fishing perspective, would be bad) we can actually sequentially find the minimum number of factors necessary to reproduce a correlation matrix that is not significantly different from the original sample's. Therefore, with multiple independent studies (and k-fold CV like this study did) we can say that five is pretty well empirically demonstrated.
Now, the distinction between PCA and EFA. PCA is a technique explicitly designed to remove redundant covariation between items, and as such, the more dimensions you allow to represent the data, the better your overall fit. If you have nine items, nine principal components will capture 100% of the total nine-item variance. However, it may be that 1 PC captures 65% of the variance, 2 represent 90%, and the remaining 7 PCs make up the remaining 10%. EFA works with correlations, and as such the most variance that can be reproduced is not 100%, but instead something analogous to the signal-to-noise ratio in engineering. It's a technique designed to identify and structure signals within noisy data, and therefore by default it doesn't assume everything being input is actually pure signal. Again, we're not measuring one thing chopped into 5 bits (or two, three, etc.) but 5 different things that have been repeatedly found to best fit data when tested simultaneously, therefore controlling for each other. That means that the structure found represents statistically independent latents.
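To make the PCA half of that concrete, here is a rough sketch, assuming NumPy. The respondents, the two latent traits, and the loadings are all synthetic and hypothetical, so the percentages it prints will not match the 65%/90% figures above, which are only illustrative. The point is that the cumulative share reaches 100% once all nine components are kept.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the data are synthetic

# Simulate 500 respondents answering nine items that are driven by
# two hypothetical latent traits plus item-specific noise.
n_people, n_items = 500, 9
traits = rng.normal(size=(n_people, 2))
loadings = rng.uniform(0.4, 0.9, size=(2, n_items))
items = traits @ loadings + rng.normal(scale=0.5, size=(n_people, n_items))

# PCA on the correlation matrix: each eigenvalue is the variance
# captured by one principal component.
R = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)[::-1]  # largest first

explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)

for k, share in enumerate(cumulative, start=1):
    print(f"{k} component(s): {share:.1%} of total variance")
```

EFA would instead model only the variance the items share, which is why it never accounts for all of the variance the way a full set of principal components does.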
However, that is not to say that the five latent factors do not share commonality that is meaningful (although when you run these procedures, a correlation of .3-.5 is generally pretty high, meaning roughly 9-25% information redundancy between factors, since shared variance is the squared correlation). If interested, and you have a sample and the required number of parameters, you can build hierarchical factor models, in which common latents underlie multiple lower level latents, which then underlie the observed item responses. Alternatively, you could even say that there is just one personality latent, let's call it `everything', and that only one latent underlies (it helps if you think of latents as causes of the observed variables/items) all 100 or whatever personality items, like in this study. There is a specific rotation procedure, the bifactor/Schmid–Leiman factor rotation.
What this will do is examine global model fit: the question of whether the regression slopes from the observed items to the common covariances meaningfully reproduce the sample's covariances; does the data here empirically validate the correlational pattern we would expect if only one informational construct were represented (measured) in the data. Next (actually simultaneously), it will estimate whether, controlling for that one believed general latent factor, there are still meaningful latents estimable from the data. So, it's asking: are there still statistically significant relationships between items, once we've rem
And this surprises anyone? by jeffb+(2.718) · 2015-01-13 02:22 · Score: 2

Actually, I'm surprised that the algorithm doesn't outperform spouses as well.
Do any of your friends tirelessly catalog, index, analyze and correlate every chuckle or offhand comment you make within their earshot? Do you continue to talk freely in front of them, knowing they're doing it? If so, they can probably outperform this algorithm.
The real fun will come from correlating the physiological signals coming in from fitness bands, eye-trackers, and eventually EEG pickups. Your soul will be laid barer than lunar regolith.