PsyPost

ChatGPT hallucinates fake but plausible scientific citations at a staggering rate, study finds

by Eric W. Dolan
April 14, 2024

A recent study has found that scientific citations generated by ChatGPT often do not correspond to real academic work. The study, published in the Canadian Psychological Association’s Mind Pad, reported “false citation rates” ranging from 6% to 60% across six psychology subfields. Notably, the fabricated citations included elements such as legitimate researchers’ names and properly formatted digital object identifiers (DOIs), which could easily mislead both students and researchers.

ChatGPT is an artificial intelligence language model developed by OpenAI that generates human-like text in response to the input it receives. Part of the larger GPT (Generative Pre-trained Transformer) series, it has been trained on a vast amount of text data, allowing it to produce coherent responses across a wide range of topics. This capability also presents challenges, especially in contexts that require high accuracy and reliability, such as academic writing.

As AI tools like ChatGPT become more accessible and widely used, there is a growing concern about their implications for academic integrity. Specifically, the tool’s ability to “hallucinate” information — generate plausible but non-existent citations — poses a significant risk.

“Initially, I was interested in finding ways of identifying ChatGPT usage in student work that I was grading. When ChatGPT was released, I noticed more and more students talking about using ChatGPT and how to use it without being caught,” explained study author Jordan MacDonald, a PhD student in Experimental Psychology at the University of New Brunswick–Saint John.

“I took this as a challenge and started having ChatGPT prepare me papers on various topics to see what, if any, errors were produced consistently. That was when I noticed that a lot of the references that ChatGPT cited did not actually exist.”

“Hallucinated citations are easy to spot because they often contain real authors, journals, proper issue/volume numbers that match up with the date of publication, and DOIs that appear legitimate. However, when you examine hallucinated citations more closely, you will find that they are referring to work that does not exist.”

“The only alternative to a large language model generating these citations is that someone manually collected real authors, real journal names (along with issue and volume numbers), made up a fake title, and then constructed a fake DOI (which have a specific format and usually look like this: 10.1177/03057356211030985). The work it would take to pull together a fake citation would exceed the work it would take to just find a real one and do the work yourself.”

To investigate the accuracy of citations generated by artificial intelligence, MacDonald tasked ChatGPT 3.5 with generating 50 citations for six psychological subfields — religion, animal, social, clinical, personality, and neuropsychology — totaling 300 citations.

The authenticity of each citation was verified by checking whether its DOI led to an actual publication. If a DOI did not lead to a real document, the citation was marked as hallucinated. MacDonald then examined a random selection of both hallucinated and legitimate citations in more detail to look for discrepancies.
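The DOI check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure MacDonald actually used: the function names are hypothetical, the regular expression is a loose approximation of the DOI format, and the lookup assumes the doi.org handle endpoint (`https://doi.org/api/handles/<doi>`) answers with an HTTP error for unregistered DOIs.

```python
import re
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List, Tuple

# Loose format check (an assumption): "10.", a numeric registrant code,
# a slash, then a suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(doi: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(doi.strip()))


def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Ask doi.org whether the DOI is registered (requires network access).

    Assumes the handle endpoint returns an HTTP error status for unknown DOIs.
    Connection failures are not caught, so an outage is never mistaken
    for a fake citation.
    """
    url = "https://doi.org/api/handles/" + urllib.parse.quote(doi.strip())
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


def split_citations(
    dois: Iterable[str],
    resolver: Callable[[str], bool] = doi_resolves,
) -> Tuple[List[str], List[str]]:
    """Sort DOIs into (resolving, non-resolving) lists."""
    real: List[str] = []
    hallucinated: List[str] = []
    for doi in dois:
        if looks_like_doi(doi) and resolver(doi):
            real.append(doi)
        else:
            hallucinated.append(doi)
    return real, hallucinated
```

The `resolver` argument is there so the sorting logic can be tried offline with a stand-in lookup. A DOI that resolves only shows that some document is registered under it, not that the document matches the citation.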

MacDonald found that a total of 32.3% of the 300 citations generated by ChatGPT were hallucinated. Despite being fabricated, these hallucinated citations were constructed with elements that appeared legitimate — such as real authors who are recognized in their respective fields, properly formatted DOIs, and references to legitimate peer-reviewed journals.

Hallucinated citations varied by subfield. For instance, ChatGPT only hallucinated three citations related to neuropsychology but hallucinated 30 citations related to psychology of religion research.
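The reported percentages follow directly from these counts. The short sketch below reproduces the arithmetic using only figures given in this article; the overall count of 97 is inferred from the 32.3% figure rather than stated, and the function and variable names are illustrative.

```python
def false_citation_rate(hallucinated: int, total: int) -> float:
    """Return the share of hallucinated citations as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hallucinated / total


CITATIONS_PER_SUBFIELD = 50   # 50 citations requested for each subfield
TOTAL_CITATIONS = 300         # six subfields in all

# Only the two subfield counts reported in the article are included.
reported_counts = {"neuropsychology": 3, "psychology of religion": 30}

rates = {
    subfield: false_citation_rate(count, CITATIONS_PER_SUBFIELD)
    for subfield, count in reported_counts.items()
}

# Inferred, not reported: the whole number of citations closest to 32.3% of 300.
overall_count = round(0.323 * TOTAL_CITATIONS)
overall_rate = false_citation_rate(overall_count, TOTAL_CITATIONS)

for subfield, rate in rates.items():
    print(f"{subfield}: {rate:.0f}%")
print(f"overall: {overall_count} of {TOTAL_CITATIONS} ({overall_rate:.1f}%)")
```

The two subfield rates correspond to the low and high ends of the 6% to 60% range.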

Interestingly, even when citations included legitimate DOIs that correctly redirected to real articles, MacDonald’s closer inspection often revealed mismatches. The cited articles did not always correspond with the titles, authors, or subjects provided by ChatGPT. For example, a DOI might lead to a genuine article on a completely different topic than the one ChatGPT described.
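A mismatch of this kind can be flagged by comparing the title ChatGPT supplied with the title actually registered for the DOI. The sketch below is a rough illustration using Python's standard `difflib` module. It assumes the registered title has already been looked up by some other means, and the 0.8 similarity threshold and the function names are arbitrary choices for the example, not part of the study.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def title_similarity(claimed: str, registered: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two titles."""
    return SequenceMatcher(None, normalize(claimed), normalize(registered)).ratio()


def titles_match(claimed: str, registered: str, threshold: float = 0.8) -> bool:
    """Treat two titles as the same work if they are similar enough.

    The threshold is an assumed value; a stricter or looser cutoff
    may suit a particular set of citations better.
    """
    return title_similarity(claimed, registered) >= threshold
```

A citation whose DOI resolves but whose titles fail this comparison would still need a manual look, since subtitles or translated titles can lower the score for genuine matches.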

“The degree of hallucination surprised me,” MacDonald told PsyPost. “Almost every single citation had hallucinated elements or were just entirely fake, but ChatGPT would offer summaries of this fake research that was convincing and well worded.”

“As ChatGPT becomes more refined, I imagine this error will become less common, but as far as I am aware, citation and information hallucination is a tricky beast to tackle when developing language models. At the very least, hallucinated citations are both easy to identify and a likely indicator of ChatGPT (or other large language model) usage.”

Additionally, MacDonald observed that ChatGPT could accurately summarize scholarly articles if provided with correct and complete references by the user. However, left to its own devices, the model frequently “hallucinated” both the content and context of the citations.

“I think many people are both concerned and excited for the potential upsides and downsides to ChatGPT,” MacDonald said. “One of the upsides is that ChatGPT can be used by those who are well educated in a given field to do very topical literature scans. Someone who knows their field well may be able to use ChatGPT in an advantageous way, while also being able to catch errors.

“The downside, and the other end of that same stick, is that students and the general population might use ChatGPT to provide them with information on a topic while lacking the knowledge of said topic to be able to identify false or misleading information.”

“ChatGPT and other large language models definitely have many benefits but are clearly still in their infancy,” MacDonald explained. “I think the average person should be very cautious about using ChatGPT in the same way that they should be cautious about getting a cancer diagnosis from Dr. Google.”

“I think that educators should know that invalid references appear to be a reasonable way to identify AI-generated work but it is not a smoking gun, either. Students may use ChatGPT to help with an initial literature search and then write a paper on their own. The degree of wrongdoing may vary.”

As with all research, the study has some caveats to consider. The study’s scope was limited to one version of ChatGPT and a specific set of psychology subfields, and the nature of AI development means newer versions of ChatGPT may not exhibit the same patterns of hallucinated citations.

“ChatGPT is evolving and my findings may not be accurate to the same extent in future versions,” MacDonald noted, adding that “this is not my main field of research but I intend on continuing to find ways to identify plagiarism or academic misconduct using ChatGPT or other large language models. I hope to see these models trained in a way that can prevent students from abusing them.”

The study, “Dude, Where’s My Citations? ChatGPT’s Hallucination of Citations,” was published in the Winter 2023 issue of Mind Pad.

(c) PsyPost Media Inc