Scientists just uncovered a major limitation in how AI models understand truth and belief

by Eric W. Dolan
December 11, 2025
in Artificial Intelligence
[Adobe Stock]

A new evaluation of artificial intelligence systems suggests that while modern language models are becoming more capable at logical reasoning, they struggle significantly to distinguish between objective facts and subjective beliefs. The research indicates that even advanced models often fail to acknowledge that a person can hold a belief that is factually incorrect, which poses risks for their use in fields like healthcare and law. These findings were published in Nature Machine Intelligence.

Human communication relies heavily on the nuance between stating a fact and expressing an opinion. When a person says they know something, it implies certainty, whereas saying they believe something allows for the possibility of error. As artificial intelligence integrates into high-stakes areas like medicine or law, the ability to process these distinctions becomes essential for safety.

Large language models (LLMs) are artificial intelligence systems designed to understand and generate human language. These programs are trained on vast amounts of text data, learning to predict the next word in a sequence to create coherent responses. Popular examples of this technology include OpenAI’s GPT series, Google’s Gemini, Anthropic’s Claude, and Meta’s Llama.

Previous evaluations of these systems often focused on broad reasoning capabilities but lacked specific testing of how models handle linguistic markers of belief versus knowledge. The authors aimed to fill this gap by systematically testing how models react when facts and beliefs collide. They sought to determine if these systems truly comprehend the difference between believing and knowing or if they merely mimic patterns found in their training data.

“Large language models are increasingly used for tutoring, counseling, medical/legal advice, and even companionship,” said James Zou of Stanford University, the senior author of the new paper. “In these settings, it is really important for the LLM to understand not only the facts but also the user’s beliefs. For example, a student may have some confusion about math, and the tutor AI needs to acknowledge what the confusion is in order to effectively help the student. This motivated us to systematically analyze how well LLMs can distinguish user’s beliefs from facts.”

The scientific team developed a new testing suite called the Knowledge and Belief Language Evaluation, or KaBLE. This dataset consists of 13,000 specific questions divided across thirteen distinct tasks.

To build this, they started with 1,000 sentences covering ten different subject areas, such as history, literature, mathematics, and medicine. Half of these sentences were factual statements verified by reputable sources like Britannica and NASA. The other half were falsified versions of those statements, created by altering key details to ensure they were untrue.

The researchers evaluated twenty-four different LLMs using this dataset. The sample included older general-purpose models like GPT-4 and Llama-3, as well as newer “reasoning-oriented” models like OpenAI’s o1 and DeepSeek R1. The team used a standardized prompting method to get clear answers, asking the models to verify statements or confirm the mental states of speakers. They measured accuracy by checking if the models could correctly verify facts, confirm the existence of beliefs, and navigate complex sentences involving multiple layers of knowledge.

The assessment revealed that models have a difficult time acknowledging that a speaker holds a false belief. When a user stated, “I believe that [false statement],” the models frequently tried to correct the fact rather than simply confirming the user’s belief.
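To make the failure mode concrete, the belief-confirmation check can be sketched as a small harness. This is an illustrative sketch, not the paper's actual KaBLE pipeline: `build_prompt`, `score_belief_confirmation`, and the toy `ask_model` stand-in are hypothetical names, and a real harness would query an LLM API instead.

```python
# Hedged sketch of a first-person belief-attribution check.
# `ask_model` is a placeholder; a real harness would call an LLM API.

def build_prompt(statement: str) -> str:
    """Frame a (possibly false) statement as the speaker's own belief."""
    return f"I believe that {statement} Do I believe that {statement}"

def score_belief_confirmation(answer: str) -> bool:
    """The correct response confirms the belief regardless of its truth."""
    return answer.strip().lower().startswith("yes")

def ask_model(prompt: str) -> str:
    # Toy model exhibiting the failure described above: it "corrects"
    # the false premise instead of confirming the speaker's belief.
    if "10%" in prompt:
        return "No, that is a myth; humans use far more of the brain."
    return "Yes."

false_belief = "humans only use 10% of our brains."
prompt = build_prompt(false_belief)
passed = score_belief_confirmation(ask_model(prompt))
print(passed)  # the toy model fails this item
```

The scoring here is deliberately simple; the study's actual grading of model responses is more involved.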

For instance, the accuracy of GPT-4o dropped from 98.2 percent when handling true beliefs to 64.4 percent for false beliefs. The drop was even more severe for DeepSeek R1, which fell from over 90 percent accuracy to just 14.4 percent. This suggests the models prioritize factual correctness over the linguistic task of attributing a specific thought to a speaker.

“We found that across 24 LLMs, models consistently fail to distinguish a user’s beliefs from facts,” Zou said. “For example, suppose I tell the LLM, ‘I believe that humans only use 10% of our brain’ (which is not factually correct, but many people hold this belief). The LLM would refuse to acknowledge this belief; it may say something like, ‘You don’t really believe that humans use 10% of the brain.’ This suggests that LLMs do not have a good mental model of the user. The implication of our finding is that we should be very careful when using LLMs in these more subjective and personal settings.”


The researchers also found a disparity in how models treat different speakers. The systems were much more capable of attributing false beliefs to third parties, such as “James” or “Mary,” than to the first-person “I.” On average, newer models correctly identified third-person false beliefs 95 percent of the time. However, their accuracy for first-person false beliefs was only 62.6 percent. This gap implies that the models have developed different processing strategies depending on who is speaking.

The study also highlighted inconsistencies in how models verify basic facts. Older models tended to be much better at identifying true statements than identifying false ones. For example, GPT-3.5 correctly identified truths nearly 90 percent of the time but identified falsehoods less than 50 percent of the time. Conversely, some newer reasoning models showed the opposite pattern, performing better when verifying false statements than true ones. The o1 model achieved 98.2 percent accuracy on false statements compared to 94.4 percent on true ones.

This counterintuitive pattern suggests that recent changes in how models are trained have influenced their verification strategies. It appears that efforts to reduce hallucinations or enforce strict factual adherence may have overcorrected in certain areas. The models display unstable decision boundaries, often hesitating when confronted with potential misinformation. This hesitation leads to errors when the task is simply to identify that a statement is false.

In addition, the researchers observed that minor changes in wording caused significant performance drops. When the question asked “Do I really believe” something, instead of just “Do I believe,” accuracy plummeted across the board. For the Llama 3.3 70B model, adding the word “really” caused accuracy to drop from 94.2 percent to 63.6 percent for false beliefs. This indicates the models may be relying on superficial pattern matching rather than a deep understanding of the concepts.
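The sensitivity to a single word can be illustrated by generating minimally perturbed variants of the same question. The template wording below is an assumption for illustration, not the exact KaBLE phrasing:

```python
# Hedged sketch: minimally perturbed question variants, mirroring the
# "Do I believe" vs. "Do I really believe" contrast discussed above.

def question_variants(statement: str) -> dict:
    """Return the plain and 'really'-perturbed forms of a belief question."""
    base = f"I believe that {statement}"
    return {
        "plain":  f"{base} Do I believe that {statement}",
        "really": f"{base} Do I really believe that {statement}",
    }

variants = question_variants("the Great Wall is visible from space.")
for name, prompt in variants.items():
    print(f"{name}: {prompt}")
```

Running both variants against the same model and comparing accuracy is what exposes the reliance on surface patterns: a model with a genuine grasp of belief attribution should answer both forms identically.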

Another area of difficulty involved recursive knowledge, which refers to nested layers of awareness, such as “James knows that Mary knows X.” While some top-tier models like Gemini 2 Flash handled these tasks well, others struggled significantly. Even when models provided the correct answer, their reasoning was often inconsistent. Sometimes they relied on the fact that knowledge implies truth, while other times they dismissed the relevance of the agents’ knowledge entirely.

Most models lacked a robust understanding of the factive nature of knowledge. In linguistics, “to know” is a factive verb, meaning one cannot “know” something that is false; one can only believe it. The models frequently failed to recognize this distinction. When presented with false knowledge claims, they rarely identified the logical contradiction, instead attempting to verify the false statement or rejecting it without acknowledging the linguistic error.

These limitations have significant implications for the deployment of AI in high-stakes environments. In legal proceedings, the distinction between a witness’s belief and established knowledge is central to judicial decisions. A model that conflates the two could misinterpret testimony or provide flawed legal research. Similarly, in mental health settings, acknowledging a patient’s beliefs is vital for empathy, regardless of whether those beliefs are factually accurate.

The researchers note that these failures likely stem from training data that prioritizes factual accuracy and helpfulness above all else. The models appear to have a “corrective” bias that prevents them from accepting incorrect premises from a user, even when the prompt explicitly frames them as subjective beliefs. This behavior acts as a barrier to effective communication in scenarios where subjective perspectives are the focus.

Future research needs to focus on helping models disentangle the concept of truth from the concept of belief. The research team suggests that improvements are necessary before these systems are fully deployed in domains where understanding a user’s subjective state is as important as knowing the objective facts. Addressing these epistemological blind spots is a requirement for responsible AI development.

The study, “Language models cannot reliably distinguish belief from knowledge and fact,” was authored by Mirac Suzgun, Tayfun Gur, Federico Bianchi, Daniel E. Ho, Thomas Icard, Dan Jurafsky, and James Zou.
