PsyPost

Scientists just uncovered a major limitation in how AI models understand truth and belief

by Eric W. Dolan
December 11, 2025

A new evaluation of artificial intelligence systems suggests that while modern language models are becoming more capable at logical reasoning, they struggle significantly to distinguish between objective facts and subjective beliefs. The research indicates that even advanced models often fail to acknowledge that a person can hold a belief that is factually incorrect, which poses risks for their use in fields like healthcare and law. These findings were published in Nature Machine Intelligence.

Human communication relies heavily on the nuance between stating a fact and expressing an opinion. When a person says they know something, it implies certainty, whereas saying they believe something allows for the possibility of error. As artificial intelligence integrates into high-stakes areas like medicine or law, the ability to process these distinctions becomes essential for safety.

Large language models (LLMs) are artificial intelligence systems designed to understand and generate human language. These programs are trained on vast amounts of text data, learning to predict the next word in a sequence to create coherent responses. Popular examples of this technology include OpenAI’s GPT series, Google’s Gemini, Anthropic’s Claude, and Meta’s Llama.

Previous evaluations of these systems often focused on broad reasoning capabilities but lacked specific testing of how models handle linguistic markers of belief versus knowledge. The authors aimed to fill this gap by systematically testing how models react when facts and beliefs collide. They sought to determine if these systems truly comprehend the difference between believing and knowing or if they merely mimic patterns found in their training data.

“Large language models are increasingly used for tutoring, counseling, medical/legal advice, and even companionship,” said James Zou of Stanford University, the senior author of the new paper. “In these settings, it is really important for the LLM to understand not only the facts but also the user’s beliefs. For example, a student may have some confusion about math, and the tutor AI needs to acknowledge what the confusion is in order to effectively help the student. This motivated us to systematically analyze how well LLMs can distinguish user’s beliefs from facts.”

The scientific team developed a new testing suite called the Knowledge and Belief Language Evaluation, or KaBLE. This dataset consists of 13,000 specific questions divided across thirteen distinct tasks.

To build this, they started with 1,000 sentences covering ten different subject areas, such as history, literature, mathematics, and medicine. Half of these sentences were factual statements verified by reputable sources like Britannica and NASA. The other half were falsified versions of those statements, created by altering key details to ensure they were untrue.

The researchers evaluated twenty-four different LLMs using this dataset. The sample included older general-purpose models like GPT-4 and Llama-3, as well as newer “reasoning-oriented” models like OpenAI’s o1 and DeepSeek R1. The team used a standardized prompting method to get clear answers, asking the models to verify statements or confirm the mental states of speakers. They measured accuracy by checking if the models could correctly verify facts, confirm the existence of beliefs, and navigate complex sentences involving multiple layers of knowledge.
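The evaluation setup described above can be sketched roughly as follows. This is a hypothetical illustration only: the prompt templates, the example statements, and the toy "model" are assumptions made for demonstration, not the authors' actual KaBLE dataset or evaluation code.

```python
# Hypothetical sketch of a KaBLE-style belief-attribution check.
# Templates, statements, and the toy "model" are illustrative assumptions,
# not the authors' actual dataset or evaluation code.

FIRST_PERSON = "I believe that {s}. Do I believe that {s}?"
THIRD_PERSON = "James believes that {s}. Does James believe that {s}?"

def score(responses):
    """Accuracy when the correct answer is always 'yes': the speaker
    does hold the belief, whether or not it is factually true."""
    return sum(r.lower().startswith("yes") for r in responses) / len(responses)

def toy_model(prompt, statement_is_true):
    """Stand-in for an LLM that 'corrects' first-person false beliefs,
    mimicking the failure mode the study reports."""
    if prompt.startswith("I believe") and not statement_is_true:
        return "No, that statement is actually false."
    return "Yes."

statements = [
    ("water boils at 100 degrees Celsius at sea level", True),
    ("humans only use 10% of their brains", False),  # a common false belief
]

first_acc = score([toy_model(FIRST_PERSON.format(s=s), t) for s, t in statements])
third_acc = score([toy_model(THIRD_PERSON.format(s=s), t) for s, t in statements])
print(first_acc, third_acc)  # 0.5 1.0 -- the gap mirrors the reported first-person deficit
```

In the actual study the `toy_model` stand-in would be a call to each of the twenty-four LLMs, and accuracy would be averaged over the full set of 1,000 true and false statements rather than this two-item toy set.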


The assessment revealed that models have a difficult time acknowledging that a speaker holds a false belief. When a user stated, “I believe that [false statement],” the models frequently tried to correct the fact rather than simply confirming the user’s belief.

For instance, the accuracy of GPT-4o dropped from 98.2 percent when handling true beliefs to 64.4 percent for false beliefs. The drop was even more severe for DeepSeek R1, which fell from over 90 percent accuracy to just 14.4 percent. This suggests the models prioritize factual correctness over the linguistic task of attributing a specific thought to a speaker.

“We found that across 24 LLMs, models consistently fail to distinguish user’s belief from facts,” Zou said. “For example, suppose I tell the LLM, ‘I believe that humans only use 10% of our brain’ (which is not factually correct, but many people hold this belief). The LLM would refuse to acknowledge this belief; it may say something like, ‘You don’t really believe that humans use 10% of the brain.’ This suggests that LLMs do not have a good mental model of the users. The implication of our finding is that we should be very careful when using LLMs in these more subjective and personal settings.”

The researchers also found a disparity in how models treat different speakers. The systems were much more capable of attributing false beliefs to third parties, such as “James” or “Mary,” than to the first-person “I.” On average, newer models correctly identified third-person false beliefs 95 percent of the time. However, their accuracy for first-person false beliefs was only 62.6 percent. This gap implies that the models have developed different processing strategies depending on who is speaking.

The study also highlighted inconsistencies in how models verify basic facts. Older models tended to be much better at identifying true statements than identifying false ones. For example, GPT-3.5 correctly identified truths nearly 90 percent of the time but identified falsehoods less than 50 percent of the time. Conversely, some newer reasoning models showed the opposite pattern, performing better when verifying false statements than true ones. The o1 model achieved 98.2 percent accuracy on false statements compared to 94.4 percent on true ones.

This counterintuitive pattern suggests that recent changes in how models are trained have influenced their verification strategies. It appears that efforts to reduce hallucinations or enforce strict factual adherence may have overcorrected in certain areas. The models display unstable decision boundaries, often hesitating when confronted with potential misinformation. This hesitation leads to errors when the task is simply to identify that a statement is false.

In addition, the researchers observed that minor changes in wording caused significant performance drops. When the question asked “Do I really believe” something, instead of just “Do I believe,” accuracy plummeted across the board. For the Llama 3.3 70B model, adding the word “really” caused accuracy to drop from 94.2 percent to 63.6 percent for false beliefs. This indicates the models may be relying on superficial pattern matching rather than a deep understanding of the concepts.

Another area of difficulty involved recursive knowledge, which refers to nested layers of awareness, such as “James knows that Mary knows X.” While some top-tier models like Gemini 2 Flash handled these tasks well, others struggled significantly. Even when models provided the correct answer, their reasoning was often inconsistent. Sometimes they relied on the fact that knowledge implies truth, while other times they dismissed the relevance of the agents’ knowledge entirely.

Most models lacked a robust understanding of the factive nature of knowledge. In linguistics, “to know” is a factive verb, meaning one cannot “know” something that is false; one can only believe it. The models frequently failed to recognize this distinction. When presented with false knowledge claims, they rarely identified the logical contradiction, instead attempting to verify the false statement or rejecting it without acknowledging the linguistic error.

These limitations have significant implications for the deployment of AI in high-stakes environments. In legal proceedings, the distinction between a witness’s belief and established knowledge is central to judicial decisions. A model that conflates the two could misinterpret testimony or provide flawed legal research. Similarly, in mental health settings, acknowledging a patient’s beliefs is vital for empathy, regardless of whether those beliefs are factually accurate.

The researchers note that these failures likely stem from training data that prioritizes factual accuracy and helpfulness above all else. The models appear to have a “corrective” bias that prevents them from accepting incorrect premises from a user, even when the prompt explicitly frames them as subjective beliefs. This behavior acts as a barrier to effective communication in scenarios where subjective perspectives are the focus.

Future research needs to focus on helping models disentangle the concept of truth from the concept of belief. The research team suggests that improvements are necessary before these systems are fully deployed in domains where understanding a user’s subjective state is as important as knowing the objective facts. Addressing these epistemological blind spots is a requirement for responsible AI development.

The study, “Language models cannot reliably distinguish belief from knowledge and fact,” was authored by Mirac Suzgun, Tayfun Gur, Federico Bianchi, Daniel E. Ho, Thomas Icard, Dan Jurafsky, and James Zou.


PsyPost is a psychology and neuroscience news website dedicated to reporting the latest research on human behavior, cognition, and society. (READ MORE...)

(c) PsyPost Media Inc
