PsyPost

ChatGPT hallucinates fake but plausible scientific citations at a staggering rate, study finds

by Eric W. Dolan
April 14, 2024
in Artificial Intelligence
[Adobe Stock]

A recent study has found that scientific citations generated by ChatGPT often do not correspond to real academic work. The study, published in the Canadian Psychological Association’s Mind Pad, found that “false citation rates” across various psychology subfields ranged from 6% to 60%. Surprisingly, these fabricated citations feature elements such as legitimate researchers’ names and properly formatted digital object identifiers (DOIs), which could easily mislead both students and researchers.

ChatGPT is an artificial intelligence language model developed by OpenAI, which is capable of generating human-like text based on the input it receives. As a part of the larger GPT (Generative Pre-trained Transformer) series, ChatGPT has been trained on a vast amount of text data, allowing it to generate coherent responses across various topics. This capability, however, also presents certain challenges, especially in contexts that require high accuracy and reliability, such as academic writing.

As AI tools like ChatGPT become more accessible and widely used, there is a growing concern about their implications for academic integrity. Specifically, the tool’s ability to “hallucinate” information — generate plausible but non-existent citations — poses a significant risk.

“Initially, I was interested in finding ways of identifying ChatGPT usage in student work that I was grading. When ChatGPT was released, I noticed more and more students talking about using ChatGPT and how to use it without being caught,” explained study author Jordan MacDonald, a PhD student in Experimental Psychology at the University of New Brunswick–Saint John.

“I took this as a challenge and started having ChatGPT prepare me papers on various topics to see what, if any, errors were produced consistently. That was when I noticed that a lot of the references that ChatGPT cited did not actually exist.”

“Hallucinated citations are easy to spot because they often contain real authors, journals, proper issue/volume numbers that match up with the date of publication, and DOIs that appear legitimate. However, when you examine hallucinated citations more closely, you will find that they are referring to work that does not exist.”

“The only alternative to a large language model generating these citations is that someone manually collected real authors, real journal names (along with issue and volume numbers), made up a fake title, and then constructed a fake DOI (which have a specific format and usually look like this: 10.1177/03057356211030985). The work it would take to pull together a fake citation would exceed the work it would take to just find a real one and do the work yourself.”

To investigate the accuracy of citations generated by artificial intelligence, MacDonald tasked ChatGPT 3.5 with generating 50 citations for six psychological subfields — religion, animal, social, clinical, personality, and neuropsychology — totaling 300 citations.

The authenticity of these citations was verified by checking their digital object identifiers (DOIs) against actual publications. If a DOI did not lead to a real document, it was marked as a hallucinated citation. MacDonald further scrutinized a random selection of both hallucinated and legitimate citations to investigate discrepancies in detail.
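A minimal sketch of that DOI check in Python (an illustration, not the study's actual script; the function names and the format regex are my own assumptions): a DOI has the shape `10.<registrant>/<suffix>`, and the public doi.org resolver returns an error status for identifiers it does not know.

```python
import re
import urllib.error
import urllib.request

# Standard DOI shape: "10." + a 4-9 digit registrant code + "/" + suffix,
# e.g. "10.1177/03057356211030985".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(doi: str) -> bool:
    """Return True if the string is formatted like a DOI."""
    return bool(DOI_PATTERN.match(doi))


def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Return True if the public doi.org resolver knows this DOI.

    A well-formatted but hallucinated DOI typically comes back 404.
    Note: some publishers reject HEAD requests, so a robust checker
    would fall back to GET and handle rate limits.
    """
    request = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except urllib.error.HTTPError as err:
        return err.code < 400
    except urllib.error.URLError:
        return False
```

A citation whose DOI is missing, malformed, or unresolvable would be flagged as potentially hallucinated; a resolvable DOI still requires a manual comparison of title and authors, since the study found real DOIs attached to the wrong papers.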

MacDonald found that a total of 32.3% of the 300 citations generated by ChatGPT were hallucinated. Despite being fabricated, these hallucinated citations were constructed with elements that appeared legitimate — such as real authors who are recognized in their respective fields, properly formatted DOIs, and references to legitimate peer-reviewed journals.

Hallucination rates varied considerably by subfield. For instance, ChatGPT hallucinated only three of the 50 citations related to neuropsychology, but 30 of the 50 related to the psychology of religion.
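As an arithmetic check, the two extremes above reproduce the 6% to 60% range quoted at the top of the article, since each subfield request was for 50 citations:

```python
# Fake-citation counts for the two extreme subfields, out of 50 each.
counts = {"neuropsychology": 3, "psychology_of_religion": 30}

# Convert to percentage hallucination rates.
rates = {field: fake / 50 * 100 for field, fake in counts.items()}
print(rates)  # {'neuropsychology': 6.0, 'psychology_of_religion': 60.0}
```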

Interestingly, even when citations included legitimate DOIs that correctly redirected to real articles, MacDonald’s closer inspection often revealed mismatches. The cited articles did not always correspond with the titles, authors, or subjects provided by ChatGPT. For example, a DOI might lead to a genuine article on a completely different topic than the one ChatGPT described.

“The degree of hallucination surprised me,” MacDonald told PsyPost. “Almost every single citation had hallucinated elements or was just entirely fake, but ChatGPT would offer summaries of this fake research that were convincing and well worded.”

“As ChatGPT becomes more refined, I imagine this error will become less common, but as far as I am aware, citation and information hallucination is a tricky beast to tackle when developing language models. At the very least, hallucinated citations are both easy to identify and a likely indicator of ChatGPT (or other large language model) usage.”

Additionally, MacDonald observed that ChatGPT could accurately summarize scholarly articles if provided with correct and complete references by the user. However, left to its own devices, the model frequently “hallucinated” both the content and context of the citations.

“I think many people are both concerned and excited for the potential upsides and downsides to ChatGPT,” MacDonald said. “One of the upsides is that ChatGPT can be used by those who are well educated in a given field to do very topical literature scans. Someone who knows their field well may be able to use ChatGPT in an advantageous way, while also being able to catch errors.

“The downside, and the other end of that same stick, is that students and the general population might use ChatGPT to provide them with information on a topic while lacking the knowledge of said topic to be able to identify false or misleading information.”

“ChatGPT and other large language models definitely have many benefits but are clearly still in their infancy,” MacDonald explained. “I think the average person should be very cautious about using ChatGPT in the same way that they should be cautious about getting a cancer diagnosis from Dr. Google.”

“I think that educators should know that invalid references appear to be a reasonable way to identify AI-generated work, but they are not a smoking gun, either. Students may use ChatGPT to help with an initial literature search and then write a paper on their own. The degree of wrongdoing may vary.”

As with all research, the study has some caveats to consider. The study’s scope was limited to one version of ChatGPT and a specific set of psychology subfields, and the nature of AI development means newer versions of ChatGPT may not exhibit the same patterns of hallucinated citations.

“ChatGPT is evolving and my findings may not be accurate to the same extent in future versions,” MacDonald noted, adding that “this is not my main field of research but I intend on continuing to find ways to identify plagiarism or academic misconduct using ChatGPT or other large language models. I hope to see these models trained in a way that can prevent students from abusing them.”

The study, “Dude, Where’s My Citations? ChatGPT’s Hallucination of Citations,” was published in the Winter 2023 issue of Mind Pad.
