PsyPost

ChatGPT hallucinates fake but plausible scientific citations at a staggering rate, study finds

by Eric W. Dolan
April 14, 2024
in Artificial Intelligence
[Adobe Stock]

A recent study has found that scientific citations generated by ChatGPT often do not correspond to real academic work. The study, published in the Canadian Psychological Association’s Mind Pad, found that “false citation rates” across various psychology subfields ranged from 6% to 60%. Strikingly, these fabricated citations often featured elements such as legitimate researchers’ names and properly formatted digital object identifiers (DOIs), which could easily mislead both students and researchers.

ChatGPT is an artificial intelligence language model developed by OpenAI, which is capable of generating human-like text based on the input it receives. As a part of the larger GPT (Generative Pre-trained Transformer) series, ChatGPT has been trained on a vast amount of text data, allowing it to generate coherent responses across various topics. This capability, however, also presents certain challenges, especially in contexts that require high accuracy and reliability, such as academic writing.

As AI tools like ChatGPT become more accessible and widely used, there is a growing concern about their implications for academic integrity. Specifically, the tool’s ability to “hallucinate” information — generate plausible but non-existent citations — poses a significant risk.

“Initially, I was interested in finding ways of identifying ChatGPT usage in student work that I was grading. When ChatGPT was released, I noticed more and more students talking about using ChatGPT and how to use it without being caught,” explained study author Jordan MacDonald, a PhD student in Experimental Psychology at the University of New Brunswick–Saint John.

“I took this as a challenge and started having ChatGPT prepare me papers on various topics to see what, if any, errors were produced consistently. That was when I noticed that a lot of the references that ChatGPT cited did not actually exist.”

“Hallucinated citations are easy to spot because they often contain real authors, journals, proper issue/volume numbers that match up with the date of publication, and DOIs that appear legitimate. However, when you examine hallucinated citations more closely, you will find that they are referring to work that does not exist.”

“The only alternative to a large language model generating these citations is that someone manually collected real authors, real journal names (along with issue and volume numbers), made up a fake title, and then constructed a fake DOI (which have a specific format and usually look like this: 10.1177/03057356211030985). The work it would take to pull together a fake citation would exceed the work it would take to just find a real one and do the work yourself.”

To investigate the accuracy of citations generated by artificial intelligence, MacDonald tasked ChatGPT 3.5 with generating 50 citations for six psychological subfields — religion, animal, social, clinical, personality, and neuropsychology — totaling 300 citations.


The authenticity of these citations was verified by checking their digital object identifiers (DOIs) against actual publications. If a DOI did not lead to a real document, it was marked as a hallucinated citation. MacDonald further scrutinized a random selection of both hallucinated and legitimate citations to investigate discrepancies in detail.
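The verification step described above can be sketched in a few lines of Python. This is an illustrative sketch, not the study's actual tooling: the function names are hypothetical, the regex only checks that a string is shaped like a DOI (using the example format quoted earlier, 10.1177/03057356211030985), and confirming that a document really exists requires asking the doi.org resolver over the network.

```python
import re
import urllib.request

# DOIs follow the pattern "10.<registrant>/<suffix>",
# e.g. 10.1177/03057356211030985 (the example quoted in the article).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def is_valid_doi_format(doi: str) -> bool:
    """Check only that a string is shaped like a DOI.
    A well-formed DOI can still be entirely fabricated."""
    return bool(DOI_PATTERN.match(doi.strip()))


def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Check whether a DOI resolves to a registered document by
    querying the doi.org resolver (requires network access).
    A 404 from the resolver means no such DOI is registered."""
    req = urllib.request.Request(
        f"https://doi.org/{doi}",
        method="HEAD",
        headers={"User-Agent": "doi-checker-sketch"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout):
            return True
    except Exception:
        return False
```

Note that even a resolving DOI is not sufficient: as the study observed, a fabricated citation can carry a real DOI that points to an unrelated article, so the resolved record's title and authors still have to be compared against the citation by hand.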

MacDonald found that a total of 32.3% of the 300 citations generated by ChatGPT were hallucinated. Despite being fabricated, these hallucinated citations were constructed with elements that appeared legitimate — such as real authors who are recognized in their respective fields, properly formatted DOIs, and references to legitimate peer-reviewed journals.

Hallucination rates varied by subfield. For instance, ChatGPT hallucinated only three of the 50 neuropsychology citations, but 30 of the 50 citations on the psychology of religion.
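The reported figures fit together arithmetically: with 50 citations per subfield, the two counts above correspond exactly to the endpoints of the 6%-to-60% range, and the overall 32.3% rate implies roughly 97 fabricated citations out of 300. A quick check:

```python
CITATIONS_PER_SUBFIELD = 50
TOTAL_CITATIONS = 300  # 6 subfields x 50 citations each

# Per-subfield rates for the two counts reported in the article.
neuro_rate = 3 / CITATIONS_PER_SUBFIELD      # 0.06 -> the 6% low end
religion_rate = 30 / CITATIONS_PER_SUBFIELD  # 0.60 -> the 60% high end

# The overall 32.3% rate implies roughly this many fabricated citations.
overall_fabricated = round(0.323 * TOTAL_CITATIONS)

print(f"{neuro_rate:.0%}, {religion_rate:.0%}, ~{overall_fabricated} of 300")
```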

Interestingly, even when citations included legitimate DOIs that correctly redirected to real articles, MacDonald’s closer inspection often revealed mismatches. The cited articles did not always correspond with the titles, authors, or subjects provided by ChatGPT. For example, a DOI might lead to a genuine article on a completely different topic than the one ChatGPT described.

“The degree of hallucination surprised me,” MacDonald told PsyPost. “Almost every single citation had hallucinated elements or were just entirely fake, but ChatGPT would offer summaries of this fake research that was convincing and well worded.”

“As ChatGPT becomes more refined, I imagine this error will become less common, but as far as I am aware, citation and information hallucination is a tricky beast to tackle when developing language models. At the very least, hallucinated citations are both easy to identify and a likely indicator of ChatGPT (or other large language model) usage.”

Additionally, MacDonald observed that ChatGPT could accurately summarize scholarly articles if provided with correct and complete references by the user. However, left to its own devices, the model frequently “hallucinated” both the content and context of the citations.

“I think many people are both concerned and excited for the potential upsides and downsides to ChatGPT,” MacDonald said. “One of the upsides is that ChatGPT can be used by those who are well educated in a given field to do very topical literature scans. Someone who knows their field well may be able to use ChatGPT in an advantageous way, while also being able to catch errors.

“The downside, and the other end of that same stick, is that students and the general population might use ChatGPT to provide them with information on a topic while lacking the knowledge of said topic to be able to identify false or misleading information.”

“ChatGPT and other large language models definitely have many benefits but are clearly still in their infancy,” MacDonald explained. “I think the average person should be very cautious about using ChatGPT in the same way that they should be cautious about getting a cancer diagnosis from Dr. Google.”

“I think that educators should know that invalid references appear to be a reasonable way to identify AI-generated work but it is not a smoking gun, either. Students may use ChatGPT to help with an initial literature search and then write a paper on their own. The degree of wrongdoing may vary.”

As with all research, the study has some caveats to consider. The study’s scope was limited to one version of ChatGPT and a specific set of psychology subfields, and the nature of AI development means newer versions of ChatGPT may not exhibit the same patterns of hallucinated citations.

“ChatGPT is evolving and my findings may not be accurate to the same extent in future versions,” MacDonald noted, adding that “this is not my main field of research but I intend on continuing to find ways to identify plagiarism or academic misconduct using ChatGPT or other large language models. I hope to see these models trained in a way that can prevent students from abusing them.”

The study, “Dude, Where’s My Citations? ChatGPT’s Hallucination of Citations,” was published in the Winter 2023 issue of Mind Pad.


PsyPost is a psychology and neuroscience news website dedicated to reporting the latest research on human behavior, cognition, and society.


© PsyPost Media Inc
