Users of generative AI struggle to accurately assess their own competence

by Eric W. Dolan
December 29, 2025
in Artificial Intelligence, Cognitive Science

New research provides evidence that using artificial intelligence to complete tasks can improve a person’s performance while simultaneously distorting their ability to assess that performance accurately. The findings indicate that while users of AI tools like ChatGPT achieve higher scores on logical reasoning tests than people working alone, they consistently overestimate their success by a significant margin.

This pattern suggests that AI assistance may disconnect a user’s perceived competence from their actual results, leading to a state of inflated confidence. The study was published in the scientific journal Computers in Human Behavior.

Scientists and psychologists have increasingly focused on how human cognition changes when augmented by technology. As generative AI systems become common in professional and educational settings, it is essential to understand how these tools influence metacognition. Metacognition refers to the ability of an individual to monitor and regulate their own thinking processes. It allows people to know when they are likely correct and when they might be making an error.

Previous psychological inquiries have established that humans generally struggle with self-assessment. A well-known phenomenon called the Dunning-Kruger effect describes how individuals with lower skills tend to overestimate their competence, while highly skilled individuals often underestimate their abilities. The authors of the current paper sought to determine if this pattern persists when humans collaborate with AI. They aimed to understand if AI acts as an equalizer that fixes these biases or if it introduces new complications to how people evaluate their work.

To investigate these questions, the research team designed two distinct studies centered on logical reasoning tasks. In the first study, they recruited 246 participants from the United States. These individuals were asked to complete 20 logical reasoning problems taken from the Law School Admission Test (LSAT). The researchers provided participants with a specialized web interface. This interface displayed the questions on one side and a ChatGPT interaction window on the other.

Participants were required to interact with the AI at least once for each question. They could ask the AI to solve the problem or explain the logic. After submitting their answers, participants estimated how many of the 20 questions they believed they had answered correctly. They also rated their confidence on a specific scale for each individual decision.

The results of this first study showed a clear improvement in objective performance. On average, participants using ChatGPT scored approximately three points higher than a historical control group of people who took the same test without AI assistance. The AI helped users solve problems that they likely would have missed on their own.

Despite this improvement in scores, the participants engaged in significant overestimation. On average, the group estimated they had answered about 17 out of 20 questions correctly. In reality, their average score was closer to 13. This represents a four-point gap between perception and reality. The data suggests that the seamless assistance provided by the AI created an illusion of competence.
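
To make the gap concrete, here is a minimal sketch of the calibration calculation, assuming hypothetical per-participant numbers (the study reports only the group averages of roughly 17 estimated versus 13 correct out of 20):

```python
# Calibration gap: self-estimated score minus actual score.
# The values below are hypothetical stand-ins for per-participant data;
# the article reports only group averages (about 17 estimated vs. 13 correct).
estimated = [17, 18, 16, 17, 18]  # self-estimates out of 20
actual    = [13, 14, 12, 13, 13]  # scored results out of 20

gaps = [e - a for e, a in zip(estimated, actual)]
mean_gap = sum(gaps) / len(gaps)
print(f"Mean overestimation: {mean_gap:.1f} points")  # ~4 points, matching the reported gap
```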

The study also analyzed the relationship between a participant’s knowledge of AI and their self-assessment. The researchers measured “AI literacy” using a tool called the Scale for the Assessment of Non-Experts’ AI Literacy. One might expect that understanding how AI works would make a user more skeptical or accurate in their judgment. The findings indicated the opposite. Participants with higher technical understanding of AI tended to be more confident in their answers but less accurate in judging their actual performance.

A significant theoretical contribution of this research involves the Dunning-Kruger effect. In typical scenarios without AI, plotting self-assessments against actual scores produces a steep slope: low performers vastly overestimate themselves while high performers do not. When participants used AI, this effect vanished. The “leveling” effect of the technology meant that overestimation became uniform across the board. Low performers and high performers alike inflated their scores by similar amounts.
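
One standard way to quantify this leveling is to regress each participant’s estimation error on their actual score; the sketch below illustrates the general technique on made-up data, and is not necessarily the authors’ exact analysis. A steep negative slope reproduces the classic Dunning-Kruger pattern, while a slope near zero corresponds to the uniform bias described here:

```python
import numpy as np

# Hypothetical AI-group data: scores out of 20, with a roughly uniform
# four-point overestimation that does not depend on skill.
rng = np.random.default_rng(0)
actual_score = rng.integers(8, 20, size=200).astype(float)
estimation_error = 4 + rng.normal(0, 1, size=200)  # bias unrelated to skill

# Fit error = slope * actual + intercept; np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(actual_score, estimation_error, deg=1)
print(f"slope = {slope:+.2f}")  # near 0 => overestimation does not vary with skill
```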

The researchers observed that the combined performance of the human and the AI did not exceed the performance of the AI alone. The AI system, when running the test by itself, achieved a higher average score than the humans using the AI. This suggests a failure of synergy. Humans occasionally accepted incorrect advice from the AI or overrode correct advice, dragging the overall performance down below the machine’s maximum potential.

To ensure these findings were robust, the researchers conducted a second study. This replication involved 452 participants. The researchers split this sample into two distinct groups. One group performed the task with AI assistance, while the other group worked without any technological aid.

In this second experiment, the researchers introduced a monetary incentive to encourage accuracy. Participants were told they would receive a financial bonus if their estimate of their score matched their actual score. The goal was to rule out the possibility that participants were simply not trying hard enough to be self-aware.

The results of the second study mirrored the first. The monetary incentive did not correct the overestimation bias. The group using AI continued to perform better than the unaided group but persisted in overestimating their scores. The unaided group showed the classic Dunning-Kruger pattern, where the least skilled participants showed the most bias. The AI group again showed a uniform bias, confirming that the technology fundamentally shifts how users perceive their competence.

The study also utilized a measurement called the “Area Under the Curve,” or AUC, to judge metacognitive sensitivity. This metric captures whether a person is more confident when they are right than when they are wrong. Ideally, a person should feel unsure when they make a mistake. The data showed that participants had low metacognitive sensitivity: their confidence levels were high regardless of whether a given answer was right or wrong.
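
In this context, the AUC can be read as the probability that a randomly chosen correct answer received a higher confidence rating than a randomly chosen incorrect one, with 0.5 meaning confidence carries no information about accuracy. A minimal sketch of that computation, using made-up ratings rather than the study’s data:

```python
def metacognitive_auc(confidence, correct):
    """Probability that a correct trial got a higher confidence rating than
    an incorrect one (ties count as half). 0.5 = no sensitivity, 1.0 = perfect."""
    pos = [c for c, ok in zip(confidence, correct) if ok]       # correct trials
    neg = [c for c, ok in zip(confidence, correct) if not ok]   # incorrect trials
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-question data: confidence is high whether the answer
# was right or wrong, which is the low-sensitivity pattern described above.
conf = [95, 90, 92, 94, 91, 93]
acc  = [1,   0,  1,  0,  1,  0]
print(f"AUC = {metacognitive_auc(conf, acc):.2f}")  # near 0.5
```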

Qualitative data collected from chat logs offered additional context. The researchers noted that most participants acted as passive recipients of information. They frequently copied and pasted questions into the chat and accepted the AI’s output without significant challenge or verification. Only a small fraction of users treated the AI as a collaborative partner or a tool for double-checking their own logic.

The researchers discussed several potential reasons for these outcomes. One possibility is the “illusion of explanatory depth.” When an AI provides a fluent, articulate, and instant explanation, it can trick the brain into thinking the information has been processed and understood more deeply than it actually has. The ease of obtaining the answer reduces the cognitive struggle usually required to solve logic puzzles, which in turn dulls the internal signals that warn a person they might be wrong.

As with all research, there are caveats to consider. The first study used a historical comparison group rather than a simultaneous control group, though the second study corrected this. Additionally, the task was limited to LSAT logical reasoning questions. It is possible that different types of tasks, such as creative writing or coding, might yield different metacognitive patterns.

The study also relied on a specific version of ChatGPT. As these models evolve and become more accurate, the dynamic between human and machine could shift. The researchers also noted that the participants were required to use the AI, which might differ from a real-world scenario where a user chooses when to consult the tool.

Future research directions were suggested to address these gaps. The researchers recommend investigating design changes that could force users to engage more critically. For example, an interface might require a user to explain the AI’s logic back to the system before accepting an answer. Long-term studies are also needed to see if this overconfidence fades as users become more experienced with the limitations of large language models.

The study, “AI makes you smarter but none the wiser: The disconnect between performance and metacognition,” was authored by Daniela Fernandes, Steeven Villa, Salla Nicholls, Otso Haavisto, Daniel Buschek, Albrecht Schmidt, Thomas Kosch, Chenxinran Shen, and Robin Welsch.
