Is it possible to determine a person’s psychopathic tendencies just from the way they Tweet? That’s what a novel social experiment being carried out by the Online Privacy Foundation and Kaggle aims to find out.
With unscrupulous employers and data fraudsters now turning to the wealth of information openly available through social media sites, there has been a lot of hype as to just how much can be discerned about someone’s personality from their postings alone. With this experiment, the hope is to put such speculation to rest while simultaneously raising awareness about how social media postings could potentially be misused.
“It would be extremely naïve to assume that people aren’t doing this; this is the direction in which things are going and people need to be aware,” said Chris Sumner, co-founder of the Online Privacy Foundation. “Many people might think that they are not revealing much about themselves online or that they have nothing to hide. But the fact that there are people out there now trying to draw tacit inferences from these seemingly innocuous postings changes the game and could potentially leave people vulnerable without knowing it. We need to establish what links exist between personality and posting and push for policies to protect people from this novel form of intrusion.”
The experiment, called the Twitter Big 5 Experiment, builds on research carried out last year by the Online Privacy Foundation. This found that there are statistically significant links between an individual’s personality type and their Facebook activity, but that these links were still too weak to support claims that Facebook can be used to predict personality with an acceptable degree of accuracy. The current work, however, takes this much further by analysing Tweets, which are openly available online and so potentially more open to abuse. It also builds on recent research by Cornell Professor Jeffrey Hancock, which demonstrates that a link exists between language and Psychopathy.
To see if such a link can be detected within the mere 140 characters that make up a Tweet, the Online Privacy Foundation teamed up with Kaggle, the revolutionary crowd-sourcing data science company, to devise the Twitter Big 5. This takes the form of an online competition where data scientists from around the world were invited to compete to develop the best predictive algorithms, using a data set derived from the 3 million Tweets and the personality profiles of the 3,000 volunteer Tweeters.
This data has been anonymised and includes 337 variables, such as the frequency with which people Tweet, the number of re-Tweets they make, the ratio of their friends to followers, as well as linguistic traits, all of which, according to Hancock’s research, may help reveal personality traits including psychopathy. Also included were the answers to 40 simple questions put to each of the subjects. Ten of these questions gave users a score on each of the five dimensions of the ‘Big 5’ using a checklist devised by University of Texas Psychology Professor Sam Gosling.
The ‘Big 5’ measures Openness, Conscientiousness, Extraversion, Agreeableness and Emotional Stability. The remaining questions were based on a scale developed by University of British Columbia Psychology Professor Del Paulhus, which gave users a score on each of the three dimensions which are often referred to as the “dark triad” of Psychopathy, Narcissism and Machiavellianism.
If at the close of the competition, later this week, Tweets are found to reveal such hidden personality traits, to the extent that even Psychopathy can be detected with a high degree of accuracy, then according to the Online Privacy Foundation this would mark a radical change in the debate about social media vetting and snooping, while raising important new questions about mental health at the same time. And if no such link is found, or none that can be pinned down accurately, then it highlights the dangers of reading too much into the analysis of social media. Either way, users can avoid exposing themselves to such scrutiny and intrusion simply by adjusting their privacy settings when using social media sites.
“If there is a strong link that can be accurately predicted then Kaggle’s band of data scientists will find it,” said Anthony Goldbloom, Kaggle’s founder and CEO. “Kaggle’s approach of crowd-sourcing problems such as this attracts some of the best data scientists in the world. Through the combination of a prize incentive and Kaggle’s novel use of a real-time leaderboard, participants are encouraged to repeatedly leapfrog each other, improving their models until the competition comes to an end. Because of this, Kaggle has a proven track record of producing solutions that consistently break new ground in terms of accuracy and efficiency.”
The 46-day competition began on 15 May and will end on 29 June. More than 100 teams have taken part, submitting nearly 1000 entries. The participant with the most accurate solution will receive a $1000 prize.