PsyPost
Startling study finds people overtrust AI-generated medical advice

by Vladimir Hedrih
October 10, 2025

In a new study, participants evaluated medical responses that were either written by a medical doctor or generated by a large language model. Participants could not distinguish doctors' responses from AI-generated ones, but preferred the AI-generated responses. They rated high-accuracy AI responses the best, while rating low-accuracy AI responses and doctors' responses similarly. The paper was published in NEJM AI.

The use of artificial intelligence (AI) systems in the field of medicine and health care has increased dramatically in recent years. This increase has occurred across various domains, from radiology imaging and mental health chatbots to drug discovery.

One particular application of AI, especially large language models (LLMs), in the medical field is answering patients’ questions. One study showed that ChatGPT was able to generate higher quality and more empathetic responses to patient questions compared to those from medical doctors. AI also seems to excel in diagnostics. One study found that AI alone outperformed physicians in making diagnoses, while a follow-up showed that physicians augmented with AI performed comparably to AI alone, and both groups outperformed physicians working without AI.

Study author Shruthi Shekar and her colleagues wanted to investigate how well people distinguish between responses to patients’ questions given by medical doctors and those generated by AI. Participants were also asked to rate the validity, trustworthiness, completeness, and other aspects of the answers.

The researchers retrieved 150 anonymous medical questions and doctors' responses from the forum HealthTap. The questions were distributed equally across six domains of medicine: preventative and risk factors; conditions and symptoms; diagnostics and tests; procedures and surgeries; medication and treatments; and recovery and wellness.

The researchers then used GPT-3 to create AI responses for each of those questions. Four physicians evaluated the AI-generated responses to establish their accuracy, and based on these ratings the responses were classified as either high-accuracy or low-accuracy.

Next, in the first experiment, 100 online participants were presented with 10 medical question-response pairs randomly selected from a collection of 30 high-accuracy AI responses, 30 low-accuracy AI responses, and 30 doctors' responses, and were asked to judge whether each response came from a doctor or an AI.

In the second experiment, 100 online participants rated their understanding of each question and response, along with the response's perceived validity, trustworthiness, and completeness, and their satisfaction with it. They also indicated whether they would search for additional information based on the response, whether they would follow the given advice, and whether they would seek subsequent medical attention as a result.


In the third experiment, 100 online participants provided the same ratings for the responses, but participants were randomly informed that the responses were from either a doctor, an AI, or a doctor assisted by an AI.

Results showed that participants were unable to effectively distinguish between AI-generated responses and doctors’ responses. However, they showed a preference for AI-generated responses, rating high-accuracy AI-generated responses as significantly more valid, trustworthy, and complete than the other two types of responses. Low-accuracy AI responses tended to receive ratings similar to those given to doctors’ responses.

Interestingly, participants not only found the low-accuracy AI responses to be as trustworthy as those given by doctors, they also reported a high tendency to follow the potentially harmful medical advice contained in those responses and to seek unnecessary medical attention as a result. These problematic reactions were comparable to, and sometimes even stronger than, their reactions to doctors' responses. The study authors note that both experts (raters) and nonexperts (participants) tended to find AI-generated responses more thorough and accurate than doctors' responses, yet still valued the involvement of a doctor in the delivery of their medical advice.

“The increased trust placed in inaccurate or inappropriate AI-generated medical advice can lead to misdiagnosis and harmful consequences for individuals seeking help. Further, participants were more trusting of high-accuracy AI-generated responses when told they were given by a doctor, and experts rated AI-generated responses significantly higher when the source of the response was unknown,” the study authors concluded.

The study sheds light on how humans perceive medical advice generated by AI systems. However, it should be noted that the questions and responses used in the study were taken from an online forum, where medical doctors contribute voluntarily. The doctors likely aimed to be helpful rather than to produce the best or most thorough answers they could give. Studies comparing AI-generated content against doctors' best-effort answers might therefore yield different results.

The paper, “People Overtrust AI-Generated Medical Advice despite Low Accuracy,” was authored by Shruthi Shekar, Pat Pataranutaporn, Chethan Sarabu, Guillermo A. Cecchi, and Pattie Maes.

(c) PsyPost Media Inc