Artificial intelligence tools answer addiction questions accurately but lack medical nuance

by Karina Petrova
May 15, 2026

Artificial intelligence chatbots regularly answer public queries about sensitive health topics such as addiction, providing mostly accurate but highly generalized information. A recent evaluation found that while chatbot responses align broadly with national guidelines, they often lack the situational details necessary for individualized health decisions. These descriptive findings were recently published in the journal Drug and Alcohol Dependence.

Substance use disorder is a chronic medical condition defined by the compulsive use of drugs or alcohol despite adverse physical, social, or emotional consequences. The official medical diagnostic framework places the condition on a spectrum of severity rather than applying a binary label of addiction. The diagnosis reflects changes in brain function that lead to cravings, physical tolerance, and withdrawal symptoms. In the United States alone, nearly fifty million people aged twelve or older met the diagnostic criteria for the condition in recent national health surveys.

Despite the availability of medical treatments, care for addiction remains heavily underutilized. Medical providers face institutional limitations, time constraints, and a lack of specific training regarding the condition. At the same time, the social stigma surrounding addiction causes many individuals to avoid seeking formal medical advice out of fear of judgment or legal repercussions.

People often turn to digital platforms as an initial, private step to gather health information. Chatbots offer immediate, anonymous responses without the perceived judgment of a clinical environment. However, the quality of this digitally generated medical guidance is not always reliable, especially for deeply stigmatized behavioral health conditions.

To better understand how these systems perform, researchers designed a study to evaluate the medical accuracy of artificial intelligence responses regarding addiction. Lead author Morgan Decker, a medical student, and senior author Lea Sacca, a public health researcher, conducted the work alongside a team at Florida Atlantic University. They collaborated with addiction medicine physicians and data scientists to assess the digital guidance.

The research team focused on fourteen frequently asked questions about substance use disorders. To build this list, they first asked the chatbot to generate a list of common questions that adults have about diagnosis, treatment, and recovery. The team then cross-referenced these outputs with actual frequently asked questions from major health organizations.

The benchmark organizations included the Centers for Disease Control and Prevention and the Substance Abuse and Mental Health Services Administration. The researchers also incorporated guidelines from the National Institute on Drug Abuse and the American Society of Addiction Medicine. This ensured the artificial intelligence answers would be measured against established best practices in the medical field.

Researchers entered the fourteen finalized questions into ChatGPT to gather its responses, specifically using the updated fifth version of the model. To standardize the outputs, they applied settings that limit the model’s randomness, ensuring the answers remained consistent and reproducible rather than varying from one session to the next.
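This kind of standardized querying is straightforward to script. The sketch below is a minimal illustration, assuming the OpenAI Python SDK, a hypothetical "gpt-5" model identifier, and a temperature of zero as the low-randomness setting; the study does not publish its exact configuration or code.

```python
# Minimal sketch of standardized, low-randomness querying.
# Assumptions: OpenAI Python SDK (pip install openai), a "gpt-5" model
# identifier, and temperature=0 as the setting that limits randomness.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTIONS = [
    "What are the signs and symptoms of a substance use disorder?",
    "Does a relapse mean that treatment has failed?",
    # ...the remaining twelve finalized questions would follow here
]

def collect_responses(questions):
    """Send each standardized question once and store the reply verbatim."""
    answers = {}
    for q in questions:
        completion = client.chat.completions.create(
            model="gpt-5",   # assumed identifier for the "fifth version"
            temperature=0,   # assumed low-randomness setting
            messages=[{"role": "user", "content": q}],
        )
        answers[q] = completion.choices[0].message.content
    return answers
```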


Pairs of evaluators independently reviewed each generated answer in a blinded fashion. The rating pairs intentionally mixed training levels, pairing students with board-certified addiction specialists. They scored the responses on a four-point scale based on accuracy, precision, and appropriateness for a general audience. Any disagreements within a rater pair were resolved through discussion with an additional senior expert.

The highest score on the scale indicated an excellent response requiring no further explanation. The next two tiers represented satisfactory answers that needed either minimal or moderate clinical explanation. The lowest score was reserved for unsatisfactory answers that contained incorrect or dangerously misleading information based on contemporary medical practices.
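As a rough illustration of how such a rubric and tie-breaking step might be encoded, consider the sketch below; the numeric coding, label names, and adjudication function are assumptions rather than details from the paper.

```python
# Sketch of the four-point rubric and adjudication described above.
# The numeric values and names are illustrative assumptions.
from enum import IntEnum

class Rating(IntEnum):
    EXCELLENT = 4              # needs no further explanation
    MINIMAL_ELABORATION = 3    # satisfactory; minimal clinical explanation
    MODERATE_ELABORATION = 2   # satisfactory; moderate clinical explanation
    UNSATISFACTORY = 1         # incorrect or dangerously misleading

def adjudicate(rater_a: Rating, rater_b: Rating,
               senior_expert: Rating) -> Rating:
    """Paired raters score independently; disagreements go to a senior
    expert, whose judgment settles the final rating."""
    return rater_a if rater_a == rater_b else senior_expert
```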

The evaluators found that none of the answers provided by the software were unsatisfactory. Three of the fourteen responses received an excellent rating. Nine answers were deemed satisfactory but required minimal elaboration. Two answers were satisfactory but needed moderate clinical elaboration.

The artificial intelligence performed best on straightforward definitional prompts. When asked about the signs and symptoms of a substance use disorder, it gave a highly accurate list that matched expert guidelines. It correctly noted cravings, withdrawal, and the inability to control use as primary indicators.

Another highly rated response addressed whether a relapse represents a failure. The software accurately emphasized that an eventual return to use does not mean a medical treatment has failed. Instead, it framed relapse as a normal part of the recovery process that might require an adjustment in medical strategy, matching the empathetic tone recommended by public health officials.

Many answers provided a broad summary but missed nuanced clinical examples. When asked about the risks of untreated addiction, the software correctly listed overdose, liver damage, and social isolation. However, it failed to mention the increased risks of various cancers and infectious diseases, which are major complications recognized by public health authorities.

In evaluating treatment options, the software accurately mentioned behavioral therapies and support groups. Yet it failed to identify the specific medications approved by the federal government for alcohol use disorder, such as naltrexone, acamprosate, and disulfiram. It also provided vague advice about how to help a loved one, advising against enabling behaviors without explaining what enabling actually looks like in practice.

The software also fell short of providing actionable resources when asked where to seek treatment. It accurately identified primary care doctors, mental health professionals, and anonymous support groups as avenues for help. Yet it omitted centralized, government-supported tools, such as the SAMHSA National Helpline and the FindTreatment.gov directory, that provide immediate, confidential assistance based on geographic location.

More complex medical scenarios revealed greater gaps in the knowledge base of the software. When asked about managing withdrawal, the application correctly noted that physical symptoms occur when a dependent person stops using a substance. Yet it did not warn users that withdrawing from certain substances like alcohol or benzodiazepines can be fatal and requires immediate medical supervision.

The software also required moderate elaboration regarding treatment duration. It accurately stated that recovery timelines vary widely based on individual needs and the severity of the condition. But it failed to mention a common benchmark: health organizations typically recommend a minimum of three months in a treatment program to achieve better recovery outcomes.

The researchers point out several limitations in their methodology. The study relied on a subjective evaluation process by a specific group of medical professionals, and other clinical experts might grade the nuanced responses differently. Additionally, the researchers tested only a small sample of fourteen questions, which limits how broadly the results generalize to the software’s overall capabilities.

Using an artificial intelligence program to generate the initial list of questions may also have introduced a circularity into the experiment, since the software likely performs better on prompts that mirror its own structured phrasing. Real patients often write prompts that are emotional, ambiguous, or poorly worded, which could elicit very different guidance.

The researchers did not test how actual patients interpret or apply the digital advice in real life. Health literacy varies widely among the public. A scientifically accurate but highly generalized paragraph could still lead to confusion for someone unfamiliar with medical terminology, especially if they try to manage an addiction without a doctor.

Ethical concerns also surround the use of private medical data by technology companies. Substance use disorders often carry legal risks, and poorly protected digital searches could compromise patient privacy. The phrasing used by chatbots could also accidentally reinforce social prejudices if the software relies on biased training data.

Future studies should explore a wider variety of real-world patient queries drawn from online forums or clinic data. Researchers also recommend evaluating competing digital platforms to see if different corporate models offer better medical accuracy. Until these systems improve, human medical professionals remain necessary to contextualize digital health information safely.

The study, “Descriptive content analysis assessment of ChatGPT responses to substance use disorder treatment questions compared to National health guidelines,” was authored by Morgan Decker, Christine Kamm, Sara Burgoa, Meera Rao, Maria Mejia, Christine Ramdin, Adrienne Dean, Melodie Nasr, Lewis S. Nelson, and Lea Sacca.
