ChatGPT hallucinates fake but plausible scientific citations at a staggering rate, study finds

by Eric W. Dolan
April 14, 2024
in Artificial Intelligence
[Adobe Stock]

A recent study has found that scientific citations generated by ChatGPT often do not correspond to real academic work. The study, published in the Canadian Psychological Association’s Mind Pad, reported “false citation rates” across various psychology subfields ranging from 6% to 60%. Surprisingly, these fabricated citations featured elements such as legitimate researchers’ names and properly formatted digital object identifiers (DOIs), which could easily mislead both students and researchers.

ChatGPT is an artificial intelligence language model developed by OpenAI, which is capable of generating human-like text based on the input it receives. As a part of the larger GPT (Generative Pre-trained Transformer) series, ChatGPT has been trained on a vast amount of text data, allowing it to generate coherent responses across various topics. This capability, however, also presents certain challenges, especially in contexts that require high accuracy and reliability, such as academic writing.

As AI tools like ChatGPT become more accessible and widely used, there is a growing concern about their implications for academic integrity. Specifically, the tool’s ability to “hallucinate” information — generate plausible but non-existent citations — poses a significant risk.

“Initially, I was interested in finding ways of identifying ChatGPT usage in student work that I was grading. When ChatGPT was released, I noticed more and more students talking about using ChatGPT and how to use it without being caught,” explained study author Jordan MacDonald, a PhD student in Experimental Psychology at the University of New Brunswick–Saint John.

“I took this as a challenge and started having ChatGPT prepare me papers on various topics to see what, if any, errors were produced consistently. That was when I noticed that a lot of the references that ChatGPT cited did not actually exist.”

“Hallucinated citations are easy to spot because they often contain real authors, journals, proper issue/volume numbers that match up with the date of publication, and DOIs that appear legitimate. However, when you examine hallucinated citations more closely, you will find that they are referring to work that does not exist.”

“The only alternative to a large language model generating these citations is that someone manually collected real authors, real journal names (along with issue and volume numbers), made up a fake title, and then constructed a fake DOI (which have a specific format and usually look like this: 10.1177/03057356211030985). The work it would take to pull together a fake citation would exceed the work it would take to just find a real one and do the work yourself.”

To investigate the accuracy of citations generated by artificial intelligence, MacDonald tasked ChatGPT 3.5 with generating 50 citations for six psychological subfields — religion, animal, social, clinical, personality, and neuropsychology — totaling 300 citations.

The authenticity of these citations was verified by checking their digital object identifiers (DOIs) against actual publications. If a DOI did not lead to a real document, it was marked as a hallucinated citation. MacDonald further scrutinized a random selection of both hallucinated and legitimate citations to investigate discrepancies in detail.
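The two-step check described above — confirm the string is formatted like a DOI, then confirm the DOI actually resolves to a registered work — can be sketched as follows. This is an illustrative reconstruction, not the study's actual procedure: the regex is the commonly recommended pattern for modern DOIs, and the lookup uses the public Crossref REST API (`api.crossref.org/works/{doi}`), which returns a 404 for unregistered DOIs.

```python
import re
import urllib.error
import urllib.request

# Modern Crossref DOIs look like "10.NNNN/suffix", e.g. 10.1177/03057356211030985
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(s: str) -> bool:
    """Return True if the string has the shape of a DOI.

    A well-formed DOI is necessary but not sufficient: hallucinated
    citations often carry properly formatted DOIs.
    """
    return bool(DOI_PATTERN.match(s))

def doi_resolves(doi: str) -> bool:
    """Ask the Crossref API whether this DOI points to a registered work.

    A 404 response suggests the citation may be hallucinated.
    """
    url = f"https://api.crossref.org/works/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False
```

Note that even a resolving DOI only proves *some* article exists at that identifier; as the study found, the resolved article's title, authors, and topic must still be compared against what ChatGPT claimed.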

MacDonald found that a total of 32.3% of the 300 citations generated by ChatGPT were hallucinated. Despite being fabricated, these hallucinated citations were constructed with elements that appeared legitimate — such as real authors who are recognized in their respective fields, properly formatted DOIs, and references to legitimate peer-reviewed journals.

Hallucination rates varied by subfield. For instance, only three of the 50 neuropsychology citations were hallucinated, compared with 30 of the 50 citations related to the psychology of religion.
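With 50 citations generated per subfield, those two extremes translate directly into the 6%-to-60% range of false citation rates reported at the outset. A quick check (only the two extreme subfield counts are given in the article):

```python
# Hallucinated-citation counts for the two extreme subfields;
# 50 citations were generated per subfield.
counts = {"neuropsychology": 3, "psychology of religion": 30}

# Convert counts to percentage false-citation rates.
rates = {field: n / 50 * 100 for field, n in counts.items()}
print(rates)  # {'neuropsychology': 6.0, 'psychology of religion': 60.0}
```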

Interestingly, even when citations included legitimate DOIs that correctly redirected to real articles, MacDonald’s closer inspection often revealed mismatches. The cited articles did not always correspond with the titles, authors, or subjects provided by ChatGPT. For example, a DOI might lead to a genuine article on a completely different topic than the one ChatGPT described.

“The degree of hallucination surprised me,” MacDonald told PsyPost. “Almost every single citation had hallucinated elements or were just entirely fake, but ChatGPT would offer summaries of this fake research that was convincing and well worded.”

“As ChatGPT becomes more refined, I imagine this error will become less common, but as far as I am aware, citation and information hallucination is a tricky beast to tackle when developing language models. At the very least, hallucinated citations are both easy to identify and a likely indicator of ChatGPT (or other large language model) usage.”

Additionally, MacDonald observed that ChatGPT could accurately summarize scholarly articles if provided with correct and complete references by the user. However, left to its own devices, the model frequently “hallucinated” both the content and context of the citations.

“I think many people are both concerned and excited for the potential upsides and downsides to ChatGPT,” MacDonald said. “One of the upsides is that ChatGPT can be used by those who are well educated in a given field to do very topical literature scans. Someone who knows their field well may be able to use ChatGPT in an advantageous way, while also being able to catch errors.

“The downside, and the other end of that same stick, is that students and the general population might use ChatGPT to provide them with information on a topic while lacking the knowledge of said topic to be able to identify false or misleading information.”

“ChatGPT and other large language models definitely have many benefits but are clearly still in their infancy,” MacDonald explained. “I think the average person should be very cautious about using ChatGPT in the same way that they should be cautious about getting a cancer diagnosis from Dr. Google.”

“I think that educators should know that invalid references appear to be a reasonable way to identify AI-generated work but it is not a smoking gun, either. Students may use ChatGPT to help with an initial literature search and then write a paper on their own. The degree of wrongdoing may vary.”

As with all research, the study has some caveats to consider. The study’s scope was limited to one version of ChatGPT and a specific set of psychology subfields, and the nature of AI development means newer versions of ChatGPT may not exhibit the same patterns of hallucinated citations.

“ChatGPT is evolving and my findings may not be accurate to the same extent in future versions,” MacDonald noted, adding that “this is not my main field of research but I intend on continuing to find ways to identify plagiarism or academic misconduct using ChatGPT or other large language models. I hope to see these models trained in a way that can prevent students from abusing them.”

The study, “Dude, Where’s My Citations? ChatGPT’s Hallucination of Citations,” was published in the Winter 2023 issue of Mind Pad.
