Researchers recently used the fact-checked tweets of Donald Trump to develop a linguistic model to detect the former president’s lies. Their new findings, which appear in Psychological Science, provide evidence that Trump’s use of language differed in predictable ways when his tweets contained falsehoods.
Previous research has indicated that lying is associated with a change in language use. For example, people tend to use fewer sensory-perceptual words and fewer first-person pronouns when lying. But much of that research has lacked ecological validity.
“One of the main criticisms of deception research is the low-stakes nature of deception in the lab, which calls for verification of deception detection methods in the real world. However, to conduct deception detection research in the real world, one needs to establish the ground truth (i.e., what really happened),” explained study author Sophie van der Zee, an assistant professor at the Erasmus School of Economics.
“This is often very complicated or even impossible. A possible data source is court transcripts, but even if a person gets convicted, it is often unknown specifically which parts of a suspect statement were deceptive. Another possible data source is fact-checked statements made by politicians. For a long time, no politician whose communications were consistently fact-checked told enough fact-checked lies to create a deception detection model. And then there was Trump.”
The researchers collected tweets sent by @realDonaldTrump between November 2017 and January 2018, and between February 2018 and April 2018. Retweets, tweets containing long quotes, duplicate tweets, and tweets solely containing web links were removed from the datasets. The researchers then cross-referenced the two datasets with fact-check reports from the Washington Post to determine whether Trump’s tweets were true or false.
Of the 469 tweets in the first dataset, 142 tweets (30.28%) were classified as factually incorrect. Of the 484 tweets in the second dataset, 111 (22.93%) were classified as factually incorrect.
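The reported shares of factually incorrect tweets follow directly from the counts, as a quick check confirms:

```python
# Verify the reported percentage of factually incorrect tweets in each dataset.
datasets = {
    "Nov 2017 - Jan 2018": (142, 469),  # (incorrect, total)
    "Feb 2018 - Apr 2018": (111, 484),
}

for label, (incorrect, total) in datasets.items():
    pct = 100 * incorrect / total
    print(f"{label}: {incorrect}/{total} = {pct:.2f}% incorrect")
# Nov 2017 - Jan 2018: 142/469 = 30.28% incorrect
# Feb 2018 - Apr 2018: 111/484 = 22.93% incorrect
```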
Van der Zee and her colleagues used a text analysis program called Linguistic Inquiry and Word Count (LIWC) to compare the language of Trump’s true and false tweets.
In line with previous research on deception, the researchers found that Trump’s false tweets tended to contain fewer emotion words, more tentative words, more negations, more cognitive-processing words, fewer first-person pronouns, and more third-person pronouns. Compared to his truthful tweets, his false tweets also contained fewer six-letter words but had a higher word count overall.
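At its core, a Linguistic Inquiry and Word Count analysis of this kind counts how often words from predefined category lexicons appear in a text. A minimal sketch of the mechanics, using tiny invented lexicons (the real LIWC dictionaries are far larger and proprietary) and a made-up example sentence:

```python
import re

# Tiny illustrative lexicons -- stand-ins that only demonstrate the mechanics,
# not the actual LIWC dictionaries used in the study.
LEXICONS = {
    "first_person": {"i", "me", "my", "mine", "we", "our"},
    "third_person": {"he", "she", "they", "them", "his", "her", "their"},
    "tentative": {"maybe", "perhaps", "possibly", "seems"},
    "negations": {"no", "not", "never", "nothing"},
}

def category_rates(text):
    """Return each category's share of the text's words, as a percentage."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return {cat: 100 * sum(w in lex for w in words) / total
            for cat, lex in LEXICONS.items()}

rates = category_rates("They never said that, and perhaps nothing was true.")
print(rates)
```

Comparing these per-category rates between the sets of true and false tweets is what reveals the differences the researchers report.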
The findings indicate that “the majority of fact-checked incorrect statements by ex-President Trump were probably not told by accident,” Van der Zee told PsyPost.
Using their linguistic data, Van der Zee and her colleagues created a statistical model that could accurately predict whether one of Trump’s tweets was factually correct or incorrect roughly three-quarters of the time. The researchers then tested their new deception model against similar models. “Personalized deception detection outperformed the existing deception detection models in the literature,” Van der Zee said.
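The article does not give the paper’s exact modeling details, but the general recipe — turn word-category rates into features and score them with a binary classifier — can be sketched with a minimal logistic-style scorer. The weights below are invented for illustration (their signs mirror the directions reported above, but the magnitudes are made up); this is not the study’s actual model:

```python
import math

# Hypothetical weights on word-category rates (per 100 words); signs follow the
# patterns reported for false tweets, magnitudes are purely illustrative.
WEIGHTS = {
    "first_person": -0.30,   # fewer first-person pronouns in false tweets
    "third_person": 0.25,    # more third-person pronouns
    "tentative": 0.20,       # more tentative words
    "negations": 0.15,       # more negations
}
BIAS = -1.0

def p_false(features):
    """Logistic score: a toy estimate of the probability a tweet is incorrect."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

# A tweet heavy on third-person pronouns, hedges, and negations scores higher
# than a direct, first-person tweet.
evasive = {"third_person": 10.0, "tentative": 5.0, "negations": 5.0}
direct = {"first_person": 10.0}
print(p_false(evasive) > p_false(direct))  # True
```

In a real pipeline, the weights would be estimated from the labeled (fact-checked) tweets rather than set by hand.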
“Our paper also constitutes a warning for all people sharing information online,” Van der Zee added in a news release. “It was already known that information people post online can be used against them. We now show, using only publicly available data, that the words people use when sharing information online can reveal sensitive information about the sender, including an indication of their trustworthiness.”
But the study, like all research, includes some caveats.
“We analyzed language use in tweets sent by @realDonaldTrump,” Van der Zee explained. “However, it is likely the ex-President did not write all the tweets himself, thereby adding noise to the dataset. Arguably, a cleaner dataset would lead to higher prediction rates than the 74% we achieved on the current dataset.”
The study, “A Personal Model of Trumpery: Linguistic Deception Detection in a Real-World High-Stakes Setting”, was authored by Sophie Van Der Zee, Ronald Poppe, Alice Havrileck, and Aurélien Baillon.