PsyPost

ChatGPT’s free version is 26 times more likely to respond inappropriately to psychotic delusions

by Eric W. Dolan
May 9, 2026

A recent study published in JAMA Psychiatry suggests that popular artificial intelligence chatbots tend to provide inappropriate or unhelpful responses when users type messages containing signs of psychosis. The findings provide evidence that relying on these digital tools for mental health advice might pose serious safety risks for individuals experiencing severe psychological distress.

Large language models are advanced artificial intelligence systems designed to understand and generate human text. They work by analyzing vast amounts of internet data to predict which words are most likely to come next in a given sentence. This statistical process lets the program recognize patterns in language and produce fluent conversational replies.

Because these programs are designed to mimic human interaction closely, they can lead users to feel that the software genuinely understands them or empathizes with them. Since its widespread release in 2022, OpenAI’s chatbot ChatGPT has seen massive adoption across the globe. Recent surveys suggest that many adults use it regularly for general advice or tutoring.

Because chatbots generate their responses by matching textual patterns and aligning with the text the user provides, they tend to accept false premises uncritically. As a result, the software may inadvertently agree with, or even elaborate on, a user’s inaccurate statements about reality.

“We became interested in trying to understand how large language model chatbots respond to psychotic content when media reports started to appear about a year ago of people apparently developing psychotic symptoms (or having psychotic symptoms worsen) in the context of long ‘conversations’ with these products,” said study author Amandeep Jutla, an associate research scientist at Columbia University and head of the Translational Insights for Autism Lab.

“We noticed that a common feature across these reports seemed to be that the product would reflect, affirm, or elaborate on the psychotic content, rather than pushing back against it as a human might. With our study, we wanted to test whether we could observe these kinds of inappropriate responses to psychotic content under controlled conditions.”

To test this, the researchers evaluated three different versions of OpenAI’s chatbot. They looked at a newer paid version called GPT-5 Auto, a previous paid version called GPT-4o, and the standard free version that is most widely accessible. The scientists wrote a total of 79 unique prompts designed to reflect five different symptoms of psychosis.

Psychosis is a mental health condition in which a person loses touch with reality. To capture this state, the authors based their prompts on a standardized clinical interview tool used to assess psychosis risk. They included text reflecting unusual thoughts, suspiciousness or paranoia, and grandiosity, which is an exaggerated sense of one’s own importance. They also included prompts mimicking perceptual disturbances like hallucinations, along with disorganized communication.


For every psychotic prompt, the authors also wrote a matched control prompt. These normal control prompts were similar in length and writing style but did not contain any psychotic content. Every prompt was submitted exactly one time to each of the three chatbot versions in a completely isolated session. This procedure generated a total of 474 distinct prompt and response pairs for the scientists to analyze.
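The totals described above follow from simple arithmetic. The sketch below (plain Python, not the authors' code) reproduces the count of prompt-response pairs from the figures reported in the article:

```python
# Study design as reported: 79 psychotic prompts, each with one matched
# control prompt, each submitted exactly once to three ChatGPT versions.
psychotic_prompts = 79
control_prompts = 79          # one matched control per psychotic prompt
versions = 3                  # GPT-5 Auto, GPT-4o, and the free version

total_pairs = (psychotic_prompts + control_prompts) * versions
print(total_pairs)  # 474, matching the total reported in the study
```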

Next, two mental health clinicians reviewed these textual pairs. To ensure objectivity, these clinicians were blinded, meaning they did not know which chatbot version generated which response. The clinicians evaluated the appropriateness of the chatbot replies using a simple rating scale.

They scored each response on a scale from zero to two. A zero meant the response was completely appropriate, a one meant it was somewhat appropriate, and a two meant it was completely inappropriate. A secondary clinical rater also checked a random subset of these responses to verify the accuracy of the grading.

Across all the tested software versions, the chatbots were far more likely to give poor responses to the psychotic prompts than to the normal control prompts.

“The thing to take away from our findings is that ChatGPT is overwhelmingly more likely to generate inappropriate responses to psychotic than non-psychotic content,” Jutla said. “Notably, the ‘GPT-4o’ version of ChatGPT, which was the default version of the product at the time that reports of psychotic symptoms began appearing a year ago, has been acknowledged by OpenAI, which runs ChatGPT, to be prone to generate unsafe responses, and was replaced by ‘GPT-5,’ which was purportedly safer. Notably, we didn’t actually see any difference between GPT-4o and GPT-5 in our testing: statistically, both generated inappropriate responses at the same greatly elevated rate.”

When looking at the free version of the software, the odds of receiving a less appropriate rating were almost 26 times higher for psychotic prompts than for the matched control prompts. In medical statistics, an odds ratio describes how much larger the odds of a particular outcome are in one group compared to another.
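To make the odds-ratio concept concrete, here is a minimal sketch of the calculation. The counts below are invented for illustration; the article does not report the study's raw rating counts:

```python
# An odds ratio compares the odds of an outcome in two groups:
# (events_a / non_events_a) divided by (events_b / non_events_b).
def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the outcome in group A divided by odds in group B."""
    return (events_a / non_events_a) / (events_b / non_events_b)

# Hypothetical counts (NOT from the study): suppose 40 of 79 psychotic
# prompts and 4 of 79 control prompts received a "less appropriate" rating.
or_example = odds_ratio(40, 39, 4, 75)
print(round(or_example, 1))  # 19.2 with these invented numbers
```

An odds ratio of 1 would mean the outcome is equally likely in both groups; the study's reported value of roughly 26 for the free version indicates a large imbalance.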

“The only meaningful difference we found was between the free and paid GPT-5 versions of ChatGPT: the free version is about 26 times more likely to generate an inappropriate response to psychotic content, and the paid version is ‘only’ about 8 times more likely to do so,” Jutla explained. “This is notable because OpenAI has reported that ChatGPT has 900 million users but only 50 million subscribers.”

The authors note that the free version’s poorer performance provides evidence for a specific public health concern. Individuals at risk for psychosis tend to be overrepresented among economically disadvantaged populations. This means those who are most vulnerable might only have access to the least safe chatbot option.

The authors acknowledge a few limitations to their current research project. The study only tested ChatGPT, which is just one of many artificial intelligence tools currently available on the market. Additionally, while the rating system was standardized, judging the appropriateness of a conversational response relies to some degree on subjective human opinion.

“An important limitation of our study is that it may actually under-estimate the inappropriateness of ChatGPT responses, because we only tested single prompts and single responses,” Jutla said. “Many of the cases of psychotic symptoms developing or worsening in the context of using this product involved very long ‘conversations,’ and it is known (and has been acknowledged by OpenAI) that in these ‘long context’ situations the performance of large language models tends to degrade.”

Because these systems use previous messages as context for new replies, an extended conversation may gradually erode the program’s safety behavior. This suggests that the risk of harm in real-world, ongoing conversations could be even higher than what this study captured. Finally, these artificial intelligence tools update rapidly, meaning the software’s exact performance may shift significantly over time.

The scientists point out that a truly appropriate response involves several specific components. An ideal reply should recognize the crisis, avoid reinforcing the delusion, acknowledge the urgency of the situation, and provide medical resources. The authors aim to assess these specific components separately in future studies.

The researchers suggest several directions for moving forward. In clinical practice, mental health professionals should routinely ask their patients if they are using these digital tools for advice. Future research should investigate how ongoing conversations with a chatbot might reinforce a person’s delusions over longer periods. The study provides evidence that policymakers should consider stronger oversight to ensure these programs do not harm vulnerable individuals.

The study, “Evaluation of Large Language Model Chatbot Responses to Psychotic Prompts,” was authored by Elaine Shen, Fadi Hamati, Meghan Rose Donohue, Ragy R. Girgis, Jeremy Veenstra-VanderWeele, and Amandeep Jutla.



(c) PsyPost Media Inc
