PsyPost
Startling study finds people overtrust AI-generated medical advice

by Vladimir Hedrih
October 10, 2025
in Artificial Intelligence

A new study had participants evaluate medical responses that were either written by a medical doctor or generated by a large language model. Participants could not distinguish doctors’ responses from AI-generated ones, yet preferred the AI-generated responses. They rated high-accuracy AI responses best, but gave low-accuracy AI responses and doctors’ responses similar ratings. The paper was published in NEJM AI.

The use of artificial intelligence (AI) systems in the field of medicine and health care has increased dramatically in recent years. This increase has occurred across various domains, from radiology imaging and mental health chatbots to drug discovery.

One particular application of AI, especially large language models (LLMs), in the medical field is answering patients’ questions. One study showed that ChatGPT was able to generate higher quality and more empathetic responses to patient questions compared to those from medical doctors. AI also seems to excel in diagnostics. One study found that AI alone outperformed physicians in making diagnoses, while a follow-up showed that physicians augmented with AI performed comparably to AI alone, and both groups outperformed physicians working without AI.

Study author Shruthi Shekar and her colleagues wanted to investigate how well people distinguish between responses to patients’ questions given by medical doctors and those generated by AI. Participants were also asked to rate the validity, trustworthiness, completeness, and other aspects of the answers.

The researchers retrieved 150 anonymous medical questions and doctors’ responses from the forum HealthTap. These questions covered six domains of medicine: preventative and risk factors; conditions and symptoms; diagnostics and tests; procedures and surgeries; medication and treatments; and recovery and wellness, with equal distribution.

The researchers then used GPT-3 to create AI responses for each of those questions. These AI-generated responses were then evaluated by four physicians to establish their accuracy. This process was used to classify AI responses into high- and low-accuracy ones.

Next, in the first experiment, 100 online participants were each presented with 10 medical question-response pairs randomly selected from a pool of 30 high-accuracy AI responses, 30 low-accuracy AI responses, and 30 doctors’ responses.

In the second experiment, 100 online participants rated their understanding of the question and the response and its perceived validity. They also rated the trustworthiness of the response, its completeness and their satisfaction with it, whether they would search for additional information based on the response, whether they would follow the given advice, and whether they would seek subsequent medical attention based on the response.


In the third experiment, 100 online participants provided the same ratings, but were randomly informed that the responses came from either a doctor, an AI, or a doctor assisted by an AI.

Results showed that participants were unable to effectively distinguish between AI-generated responses and doctors’ responses. However, they showed a preference for AI-generated responses, rating high-accuracy AI-generated responses as significantly more valid, trustworthy, and complete than the other two types of responses. Low-accuracy AI responses tended to receive ratings similar to those given to doctors’ responses.

Interestingly, participants not only found the low-accuracy AI responses as trustworthy as those given by doctors; they also reported a high tendency to follow the potentially harmful medical advice contained in those responses and to unnecessarily seek medical attention as a result. These problematic reactions were comparable to participants’ reactions to doctors’ responses, and sometimes even stronger. The study authors note that both experts (raters) and nonexperts (participants) tended to find AI-generated responses more thorough and accurate than doctors’ responses, yet still valued a doctor’s involvement in the delivery of their medical advice.

“The increased trust placed in inaccurate or inappropriate AI-generated medical advice can lead to misdiagnosis and harmful consequences for individuals seeking help. Further, participants were more trusting of high-accuracy AI-generated responses when told they were given by a doctor, and experts rated AI-generated responses significantly higher when the source of the response was unknown,” the study authors concluded.

The study sheds light on how people perceive medical advice generated by AI systems. However, it should be noted that the questions and responses used in the study were taken from an online forum, where medical doctors contribute their answers voluntarily. Those answers were likely written to be useful rather than to be the best or most thorough responses the doctors could give. Studies comparing AI-generated content with answers from doctors who are clearly trying to provide their best responses might yield different results.

The paper, “People Overtrust AI-Generated Medical Advice despite Low Accuracy,” was authored by Shruthi Shekar, Pat Pataranutaporn, Chethan Sarabu, Guillermo A. Cecchi, and Pattie Maes.

PsyPost is a psychology and neuroscience news website dedicated to reporting the latest research on human behavior, cognition, and society. (READ MORE...)
© PsyPost Media Inc