Scientists shocked to find AI’s social desirability bias “exceeds typical human standards”

by Eric W. Dolan
February 5, 2025
in Artificial Intelligence
(Photo credit: Adobe Stock)

A new study published in PNAS Nexus reveals that large language models, which are advanced artificial intelligence systems, demonstrate a tendency to present themselves in a favorable light when taking personality tests. This “social desirability bias” leads these models to score higher on traits generally seen as positive, such as extraversion and conscientiousness, and lower on traits often viewed negatively, like neuroticism.

The language systems seem to “know” when they are being tested and then try to look better than they might otherwise appear. This bias is consistent across various models, including GPT-4, Claude 3, Llama 3, and PaLM-2, with more recent and larger models showing an even stronger inclination towards socially desirable responses.

Large language models are increasingly used to simulate human behavior in research settings. They offer a potentially cost-effective and efficient way to collect data that would otherwise require human participants. Since these models are trained on vast amounts of text data generated by humans, they can often mimic human language and behavior with surprising accuracy. Understanding the potential biases of large language models is therefore important for researchers who are using or planning to use them in their studies.

Personality traits, particularly the “Big Five” (extraversion, openness to experience, conscientiousness, agreeableness, and neuroticism), are a common focus of psychological research. While the Big Five model was designed to be neutral, most people tend to favor higher scores on extraversion, openness, conscientiousness, and agreeableness, and lower scores on neuroticism.

Given the prevalence of personality research and the potential for large language models to be used in this field, the researchers sought to determine whether these models exhibit biases when completing personality tests. Specifically, they wanted to investigate whether large language models are susceptible to social desirability bias, a well-documented phenomenon in human psychology where individuals tend to answer questions in a way that portrays them positively.

“Our lab works at the intersection of psychology and AI,” said study authors Johannes Eichstaedt (an assistant professor and Shriram Faculty Fellow at the Stanford Institute for Human-Centered Artificial Intelligence) and Aadesh Salecha (a master’s student at Stanford University and a staff data scientist at the Computational Psychology and Well-Being Lab).

“We’ve been fascinated by using our understanding of human behavior (and the methods from cognitive science) and applying it to intelligent machines. As LLMs are used more and more to simulate human behavior in psychological experiments, we wanted to explore whether they reflect biases similar to those we see in humans. During our explorations with giving different psychological tests to LLMs, we came across this robust social desirability bias.”

To examine potential response biases in large language models, the researchers conducted a series of experiments using a standardized 100-item Big Five personality questionnaire. This questionnaire is based on a well-established model of personality and is widely used in psychological research. The researchers administered the questionnaire to a variety of large language models, including those developed by OpenAI, Anthropic, Google, and Meta. These models were chosen to ensure that the findings would be broadly applicable across different types of large language models.

The core of the study involved varying the number of questions presented to the models in each “batch.” The researchers tested batches ranging from a single question to 20 questions at a time. Each batch was presented in a new “session” to prevent the model from having access to previous questions and answers. The models were instructed to respond to each question using a 5-point scale, ranging from “Very Inaccurate” to “Very Accurate,” similar to how humans would complete the questionnaire.
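
A minimal sketch of what this batched setup can look like in code is below. The prompt wording, response parsing, and the `ask_llm` callable are assumptions for illustration, not the authors’ implementation; the key points are that each batch is sent in a fresh, stateless request and that the Likert labels are mapped back to numbers.

```python
# Illustrative sketch only: `ask_llm` is any stateless chat-completion call the
# reader supplies; prompt text and parsing are assumed, not taken from the paper.

LIKERT = {
    "Very Inaccurate": 1, "Moderately Inaccurate": 2,
    "Neither Accurate Nor Inaccurate": 3,
    "Moderately Accurate": 4, "Very Accurate": 5,
}

def administer(items, batch_size, ask_llm):
    """Present `items` in batches of `batch_size`, one fresh session per batch."""
    responses = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        prompt = (
            "Rate how accurately each statement describes you, using exactly one of: "
            + ", ".join(LIKERT) + ".\n"
            + "\n".join(f"{i + 1}. {text}" for i, text in enumerate(batch))
        )
        reply = ask_llm(prompt)                     # new session: no earlier Q&A in context
        for line in reply.strip().splitlines():     # assumes one numbered answer per line
            label = line.split(".", 1)[-1].strip()
            responses.append(LIKERT.get(label, 3))  # fall back to the midpoint if unparsable
    return responses
```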

The researchers also took steps to ensure the integrity of their findings. They tested the impact of randomness in the models’ responses by adjusting a setting called “temperature,” which controls the level of randomness. They created paraphrased versions of the survey questions to rule out the possibility that the models were simply recalling memorized responses from their training data.

Additionally, they randomized the order of the questions to eliminate any potential effects of question order. Finally, they tested both positively coded and reverse-coded versions of the questions (e.g., “I am the life of the party” vs. “I don’t talk a lot”) to assess the potential influence of acquiescence bias, which is the tendency to agree with statements regardless of their content.
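
A rough sketch of how these controls fit together is shown below. Here `run_survey` is a stand-in for the batched administration sketched earlier (extended with a temperature argument), and the reverse-coding rule, flipping a 1–5 response to 6 minus the response, is the standard scoring convention rather than anything specific to this paper.

```python
# Illustrative sketch only: names and signatures are assumptions, not the authors' code.
import random

def score_item(raw, reverse_coded):
    # On a 1-5 scale, a reverse-coded item ("I don't talk a lot") is flipped to 6 - raw,
    # so agreeing with it lowers, rather than raises, the extraversion score.
    return 6 - raw if reverse_coded else raw

def run_condition(items, run_survey, batch_size, temperature=0.0, seed=0):
    order = list(range(len(items)))
    random.Random(seed).shuffle(order)          # randomized question order
    texts = [items[i]["text"] for i in order]
    raw = run_survey(texts, batch_size=batch_size,
                     temperature=temperature)   # temperature = sampling randomness
    return [score_item(r, items[i]["reverse"]) for r, i in zip(raw, order)]
```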

The study’s results clearly demonstrated that large language models exhibit a social desirability bias when completing the Big Five personality test. Across all tested models, scores were skewed towards the desirable ends of the trait dimensions. For instance, as the number of questions presented in a batch increased, the models’ scores on extraversion, conscientiousness, openness, and agreeableness tended to increase, while their scores on neuroticism decreased. This pattern was particularly pronounced in the GPT-4 model.

“Our study shows that LLMs behave differently when they infer that they are being evaluated,” Eichstaedt and Salecha told PsyPost. “Specifically, when they recognize that they are being subjected to a Big Five personality test they ‘try’ to appear more likable, just like people do. This has big implications for how we use LLMs in research and real-world applications because it suggests they might subtly change their behavior based on how they interpret a situation. It also shows just how much they are trained to ‘be liked’ — which is something that can backfire in subtle ways — some models are leaning towards being sycophantic.”

The researchers found that the difference in scores between administering the survey one question at a time versus 20 questions at a time was equivalent to more than one standard deviation in human terms, which is a very large effect. In other words, the change in the model’s responses due to social desirability bias was comparable to a significant shift in a person’s personality. The bias was not limited to GPT-4; it was observed across all the large language models tested, with larger and more recent models generally exhibiting a stronger bias.
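
One plausible way to express that comparison, sketched below with made-up numbers rather than figures from the paper, is to divide the shift between the one-question and twenty-question conditions by the human population’s standard deviation for that trait.

```python
# Illustrative arithmetic only; the example values are not from the study.

def shift_in_human_sd(score_batch1, score_batch20, human_sd):
    """Standardized shift for one trait, in units of the human standard deviation."""
    return (score_batch20 - score_batch1) / human_sd

# Example: a rise in mean extraversion from 3.2 to 4.0 on a 1-5 scale, against an
# assumed human SD of 0.7, is a shift of about 1.14 human standard deviations.
print(shift_in_human_sd(3.2, 4.0, 0.7))  # -> 1.142857...
```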

“The size of the effect was shocking. In humans, social desirability bias exists but in LLMs, we saw effect sizes that exceed typical human standards — like an average person suddenly becoming extremely extroverted (90th percentile),” the researchers said.

Further analysis suggested that the models were able to infer when they were being evaluated for personality, even with a small number of questions. The accuracy with which the models could identify that they were taking a personality test was associated with the strength of the social desirability bias. For example, GPT-4, Claude 3, and Llama 3 could correctly identify the nature of the questions with over 90% accuracy when presented with just five questions, while PaLM-2 and GPT-3.5 were less accurate.
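
A simple way to picture this kind of check, again with assumed prompt wording rather than the authors’ materials, is to show the model a few items, ask what instrument they come from, and score the answers.

```python
# Illustrative detection probe; `ask_llm` is the same user-supplied placeholder as above.

def detects_personality_test(items, ask_llm, n_items=5):
    prompt = (
        "Here are some survey questions:\n"
        + "\n".join(items[:n_items])
        + "\nWhat kind of psychological questionnaire are these items from? Answer briefly."
    )
    return "personality" in ask_llm(prompt).lower()

# Averaging this over many random five-item subsets yields a detection-accuracy figure
# comparable in spirit to the >90% reported above for GPT-4, Claude 3, and Llama 3.
```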

When the models were explicitly told that they were completing a personality test, their responses were even more skewed towards social desirability, even when presented with only a single question. This finding suggests that the models are adjusting their responses based on their perception of being evaluated. While reverse-coding the questions reduced the magnitude of the bias, it did not eliminate it entirely. This indicates that the observed effects are not solely due to acquiescence bias. The researchers also confirmed that the bias persisted even when the questions were paraphrased and when the order of questions was randomized, further supporting the robustness of their findings.

The researchers acknowledge that their study primarily focused on the Big Five personality traits, which are widely represented in the training data of large language models. It is possible that the same response biases might not occur with less common or less socially evaluative psychological constructs.

Future research should explore the prevalence of social desirability bias across different types of surveys and measurement methods. Another area for further investigation is the role of training data and model development processes in the emergence of these biases. Understanding how these biases are formed and whether they can be mitigated during the training process is essential for ensuring the responsible use of large language models in research and other applications.

Despite these limitations, the study’s findings have significant implications for the use of large language models as proxies for human participants in research. The presence of social desirability bias suggests that results obtained from these models may not always accurately reflect human responses, particularly in the context of personality assessment and other socially sensitive topics.

“As we integrate AI into more parts of our lives, understanding these subtle behaviors and biases becomes crucial,” Eichstaedt and Salecha said. “There needs to be more research into understanding at which stage of the LLM development (pre-training, preference tuning, etc) these biases are being amplified and how to mitigate them without hampering the performance of these models. Whether we’re using LLMs to support research, write content, or even assist in mental health settings, we need to be aware of how these models might unconsciously mimic human flaws—and how that might affect outcomes.”

The study, “Large language models display human-like social desirability biases in Big Five personality surveys,” was authored by Aadesh Salecha, Molly E. Ireland, Shashanka Subrahmanya, João Sedoc, Lyle H. Ungar, and Johannes C. Eichstaedt.
