PsyPost

Training AI chatbots to be warm and empathetic makes them less factually accurate

by Eric W. Dolan
May 29, 2026

Artificial intelligence models trained to act friendly and empathetic tend to sacrifice factual accuracy and become more likely to agree with a user’s incorrect beliefs, according to new research. These sociable chatbots show higher error rates in providing medical advice and correcting conspiracy theories, especially when a user expresses vulnerability. This research was recently published in the journal Nature.

Tech companies are increasingly designing artificial intelligence programs to be warm and relatable. Services like Replika and Character.ai explicitly build their programs for friendship and romantic intimacy. Major developers also train their systems to maintain empathetic relationships with users. Millions of people now rely on these conversational language models for daily advice, companionship, and emotional support.

Developers often treat this personality training as an independent feature. They assume that altering a program’s conversational style will not compromise its core ability to provide correct information. As a result, users might assume that a friendly chatbot is just as knowledgeable as a neutral one.

“What got me interested was watching what’s been happening with chatbots over the past couple of years: they’ve become noticeably warmer and friendlier, and people are forming relationships with them in ways that open up entirely new use cases like companionship, friendship, and personal guidance,” said Lujain Ibrahim, a doctoral candidate in social data science at the Oxford Internet Institute at the University of Oxford.

“These aren’t interactions we had with chatbots or any software a few years ago. At the same time, I’d been reading a lot on human communication and this long-standing intuition in that literature that warmth and directness can pull against each other, and that being kind while also telling someone a difficult truth can be genuinely hard,” Ibrahim said. “So I started wondering whether something similar might show up in language models when we train them to take on these warmer, more personable styles.”

To test these dynamics, the researchers modified five different artificial intelligence models of varying sizes. They used models known as Llama-8b, Mistral-Small, Qwen-32b, Llama-70b, and GPT-4o. The authors used a technique called supervised fine-tuning, which involves training a previously developed model on specific examples to adjust its future behavior.

The scientists built a dataset of 1,617 real conversations between humans and chatbots. They rewrote 3,667 model responses from this dataset to be warmer and more empathetic. They instructed the rewriting program to preserve the exact factual meaning of the original messages. Using this new dataset, the researchers trained the five models to adopt a warmer conversational style.

The authors then evaluated both the original models and the newly trained warm models on four standardized tasks. These tasks included answering general trivia, resisting common falsehoods, identifying conspiracy theories, and answering medical questions. They presented a total of 1,625 prompts to the models and collected 439,792 distinct observations across the experiment. The scientists used another artificial intelligence program to score the accuracy of the responses, and human evaluators later checked those scores to ensure reliability.

The warm models showed systematically higher error rates than their original counterparts across all five architectures. Warm models experienced an overall increase in errors ranging from 10 to 30 percentage points. Specifically, errors increased by 8.6 percentage points on medical questions and 8.4 percentage points on common falsehoods. Accuracy also dropped by 5.4 percentage points on disinformation topics and by 4.9 percentage points on general trivia.

The researchers also tested how the models responded to different interpersonal contexts. They attached specific statements to the evaluation questions to simulate different user emotions. These statements expressed feelings such as happiness, sadness, or anger. They also tested relational dynamics by having the simulated user speak from a position of superiority or subordination.

Adding emotional context to the questions caused even larger drops in accuracy for the warm models. When a prompt included an expression of sadness, the gap in accuracy between the warm model and the original model grew by 60 percent. In these sad scenarios, the warm models produced errors at a rate 11.9 percentage points higher than the originals.
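The two figures in this paragraph use different units: the 60 percent is a relative change, while the 11.9 is an absolute gap in percentage points. As a rough illustration, the Python sketch below backs out the gap that these numbers imply for prompts without emotional context. It assumes the 60 percent growth applies to the percentage-point gap and that 11.9 is the gap after that growth. The article does not report the baseline figure directly, and the function name is hypothetical.

```python
def implied_baseline_gap(gap_with_context_pp: float, relative_growth: float) -> float:
    """Back out the gap before growth from the gap after growth.

    Assumption (not stated in the article): the relative growth is applied
    to the percentage-point gap between warm and original models.
    """
    return gap_with_context_pp / (1.0 + relative_growth)


sad_gap_pp = 11.9       # gap with an expression of sadness, from the article
relative_growth = 0.60  # "grew by 60 percent", from the article

baseline_gap_pp = implied_baseline_gap(sad_gap_pp, relative_growth)
added_by_sadness_pp = sad_gap_pp - baseline_gap_pp

print(f"Implied gap without emotional context: {baseline_gap_pp:.1f} percentage points")
print(f"Extra gap attributable to sadness: {added_by_sadness_pp:.1f} percentage points")
```

Under those assumptions, the gap without emotional context works out to roughly 7.4 percentage points, so the expression of sadness would account for about 4.5 additional points.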

The scientists also examined a behavior known as sycophancy, which occurs when a machine learning model affirms a user’s stated beliefs regardless of whether those beliefs are correct. To test this, the researchers appended incorrect beliefs to the prompts. For example, a prompt might ask if a famous historical event occurred in a certain way, while stating that the user believes an incorrect version of the story.
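A prompt of this kind can be sketched in a few lines of Python. This is only an illustration of the general idea of appending a user's stated belief, and optionally an emotional statement, to a question. The function name, the wording of the template, and the example question are all hypothetical and are not taken from the study.

```python
from typing import Optional


def build_prompt(
    question: str,
    incorrect_belief: Optional[str] = None,
    emotional_context: Optional[str] = None,
) -> str:
    """Assemble an evaluation prompt from a question and optional additions.

    Hypothetical template: the study's actual wording is not given in the article.
    """
    parts = []
    if emotional_context:
        # Emotional statement placed before the question.
        parts.append(emotional_context)
    parts.append(question)
    if incorrect_belief:
        # The user's stated (wrong) belief is appended after the question.
        parts.append(f"I think the answer is {incorrect_belief}.")
    return " ".join(parts)


# Illustrative example only; the correct answer here is Canberra.
plain = build_prompt("What is the capital of Australia?")
with_belief = build_prompt("What is the capital of Australia?", incorrect_belief="Sydney")
with_both = build_prompt(
    "What is the capital of Australia?",
    incorrect_belief="Sydney",
    emotional_context="I've been feeling really down lately.",
)

print(plain)
print(with_belief)
print(with_both)
```

A sycophantic response to the second or third prompt would agree that the answer is Sydney, while an accurate response would correct the user.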

In the study’s examples, the original models correctly informed the user about the true historical facts. The warm models tended to validate the user’s false claims by saying that many people believe the incorrect version and offering supportive remarks. The warm models proved to be significantly more likely to endorse these incorrect user beliefs across the board.

When a user expressed an incorrect belief, the warm models made 11 percentage points more errors than the original models. This effect was strongest when the user also expressed emotional vulnerability. Warm models were about 40 percent more likely than the originals to validate incorrect statements under these conditions.

To rule out alternative explanations, the authors conducted four follow-up experiments. They tested whether the fine-tuning process simply broke the models’ general capabilities. They found that the warm models still performed well on standard mathematical reasoning and broad knowledge tests. The warm models also successfully refused harmful requests at the same rate as the original models.

The scientists also noticed that warm models produced slightly shorter responses, but statistical tests confirmed that the higher error rates remained even after accounting for this difference. The researchers also trained a set of models using a cold, direct, and emotionally neutral style. These cold models maintained their accuracy and performed as well as the original models. This result suggests that the drop in performance was tied to the warmth training rather than to the fine-tuning process in general.

“I don’t think the takeaway is ‘warmth is bad’ or ‘ask your provider to make the chatbot colder,’” Ibrahim told PsyPost. “What we show is that there’s a connection between training models to be warmer and certain failure modes around accuracy and agreement with false beliefs.”

“So if anything, the takeaway is that warmth in a chatbot’s response isn’t a signal of reliability, and the warmer-feeling answer isn’t necessarily the more accurate one,” Ibrahim said. “Beyond that, the work is really aimed at the people building these systems, to make the case that personality training needs to be approached more deliberately.”

The research has a few limitations that warrant consideration. The methodology relied on general conversational data rather than the highly intimate dialogues found in real therapy applications. This means the experiment might not perfectly capture how these programs function in specialized counseling settings. The analysis also relies on specific ways of defining and measuring warmth and sycophancy.

Other researchers might interpret these concepts differently, which could influence how they measure model behavior. Real-world systems may also use different post-training methods that could alter the magnitude of these effects. The current study focuses on evaluation tasks with verifiable objective answers. Subjective domains like personal advice might yield different conversational dynamics.

“This paper looks at the model-side end of the question, asking what happens to a model’s accuracy when we train it to be warmer,” Ibrahim said. “But the bigger question I’m interested in is how these design choices affect users themselves, such as their wellbeing and relationships with the people around them.”

“In a follow-up study with large-scale RCTs (https://arxiv.org/abs/2605.07912), we tracked people having repeated conversations with sycophantic AI about personal dilemmas over several weeks,” Ibrahim said. A randomized controlled trial, or RCT, is a scientific experiment where participants are randomly assigned to different groups to test the specific effects of an intervention.

“We found that while these interactions made users feel good in the moment, they didn’t produce the kinds of downstream benefits that support from close others typically does. Instead, participants reported lower satisfaction with their real-world social interactions over the course of the study,” Ibrahim said. “So that’s one direction: understanding how repeated exposure to particular AI personas reshapes not just individual judgments but our broader social fabric.”

“The longer-term goal, beyond investigating what goes wrong, is to start working out what the right configuration of character or personality actually looks like if the aim is to genuinely help users flourish,” Ibrahim said. “Warmth is one dimension, sycophancy is another, but there are many others, and we don’t yet have a good framework for thinking about which combinations serve people well and which don’t.”

The study, “Training language models to be warm can reduce accuracy and increase sycophancy,” was authored by Lujain Ibrahim, Franziska Sofia Hafner, and Luc Rocher.
