PsyPost

ChatGPT gives better answers to health-related questions than human physicians, study finds

by Vladimir Hedrih
June 28, 2024
in Artificial Intelligence
[Adobe Stock]

A recent study explored the performance of ChatGPT in generating responses to health-related questions posted on Reddit’s r/AskDocs forum. Licensed healthcare professionals compared these responses to those provided by human physicians. In 79% of cases, they found ChatGPT’s answers to be superior. The study was published in JAMA Internal Medicine.

Recent years have seen lightning-fast advances in artificial intelligence (AI) technology. AI is spreading rapidly through all segments of society and the economy, transforming how things are done. The development and public availability of large language models, AI systems capable of communicating in natural human language, has opened many new areas to AI.

One of the most important developments in this regard is ChatGPT, a large language model launched in 2022 that reached 100 million users in just 64 days. ChatGPT can produce meaningful natural-language responses to questions posed to it, a capability that is useful in many fields of work.

Study author John W. Ayers and his colleagues note that one field where capable AI systems could be particularly helpful is healthcare, where AI could assist in responding to patients' questions. The volume of messages patients send to healthcare workers has increased drastically in recent years, contributing to physician burnout.

With this in mind, and noting that ChatGPT was not designed to provide healthcare advice, they conducted a study exploring the quality of ChatGPT's answers to typical questions patients ask healthcare workers. The authors note that although no published studies had yet examined the quality of ChatGPT's answers, some physicians were already integrating ChatGPT into their systems for responding to patients' questions.

The researchers drew a random sample of 195 exchanges between physicians and patients from Reddit's r/AskDocs forum, a subreddit with approximately 474,000 members where users can post medical questions and verified healthcare professional volunteers submit answers. They took care not to include repeated questions or multiple answers from the same physician to the same patient.

Although different types of healthcare professionals respond in this forum, this study solely focused on answers given by physicians. This was because the study authors expected the responses of physicians to be of better quality than answers given by other types of healthcare professionals. If a physician provided multiple responses, only the first was included in the analysis.

ChatGPT was tasked with generating answers to the same questions. The original question, the physician’s response, and ChatGPT’s response were presented to a team of three evaluators, who were blinded to the source of each response. The evaluators assessed which response was better, rated the quality of the information, and evaluated the empathy or bedside manner displayed.

The two responses were presented to evaluators in a random order (i.e., they could not tell whether the response was AI or human generated based on whether it was the first or the second response shown to them) and the study authors removed any obviously revealing information from the answer (e.g. ChatGPT stating that it is an artificial intelligence in an answer). The evaluators were licensed health care professionals working in pediatrics, geriatrics, internal medicine, oncology, infectious disease, and preventive medicine.

In 79% of cases, evaluators judged ChatGPT's responses to be better. They also rated them as higher in quality than the physicians' answers: the mean rating for ChatGPT's responses was "good," while the average rating for physicians' answers was only "acceptable," one step lower on the scale.

Moreover, evaluators rated 27% of physicians’ answers as being of less than acceptable quality. This was the case with only 3% of answers given by ChatGPT. ChatGPT’s responses were also rated as more empathetic. On average, ChatGPT’s responses were rated as empathetic, while those of physicians were rated as slightly empathetic.

“While this cross-sectional study has demonstrated promising results in the use of AI assistants for patient questions, it is crucial to note that further research is necessary before any definitive conclusions can be made regarding their potential effect in clinical settings. Despite the limitations of this study and the frequent overhyping of new technologies, studying the addition of AI assistants to patient messaging workflows holds promise with the potential to improve both clinician and patient outcomes,” the study authors concluded.

The study provides valuable insight into the quality of health-related advice given by ChatGPT. However, it should be noted that the study compared ChatGPT's responses to responses physicians volunteered on a free online forum. This raises the question of how much effort the physicians invested in their answers. The results might well differ if physicians had put full effort into providing good responses.

The paper “Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum” was authored by John W. Ayers, Adam Poliak, Mark Dredze, Eric C. Leas, Zechariah Zhu, Jessica B. Kelley, Dennis J. Faix, Aaron M. Goodman, Christopher A. Longhurst, Michael Hogarth, and Davey M. Smith.
