ChatGPT-created letters of recommendation are nearly indistinguishable from human-authored letters, study finds

by Eric W. Dolan
December 10, 2023
in Artificial Intelligence
(Photo credit: Adobe Stock)

In a new study published in the journal AEM Education and Training, researchers found that academic physicians could distinguish recommendation letters written by humans from those generated by artificial intelligence (AI) only slightly better than chance. The study raises critical questions about the future role of AI in academic assessments, the need for ethical considerations in its use, and the potential reevaluation of current practices in recommendation letters.

Letters of recommendation are a staple in the academic world, particularly in medicine. They play a critical role in various decisions, from student admissions to faculty promotions. However, writing these letters is often a burdensome task for busy academics. With the rise of AI technologies like ChatGPT, a tool adept at generating human-like text, the possibility emerged: Could AI assist in this labor-intensive process?

“This topic interested us as we recognized the essential yet time-consuming role of letters of recommendation (LORs) in academic medicine,” explained study author Carl Preiksaitis, a clinical instructor at the Department of Emergency Medicine at Stanford University School of Medicine. “These letters are written for a variety of different scenarios, from application to medical school and residency to faculty promotion. We had heard anecdotal evidence that generative AI models, such as ChatGPT, were being used to aid in authoring LORs and we wanted to explore this possibility in a more rigorous way.”

To conduct the study, the researchers selected four hypothetical candidates for academic promotion. They prepared detailed profiles for these candidates, covering their educational background, employment history, and accolades, but without any gender identification to avoid bias.

Next, the team crafted letters of recommendation. Two experienced team members wrote letters as they usually would, serving as the ‘human’ authors. Meanwhile, two junior team members, with no prior experience writing such letters, used ChatGPT to create the AI-authored letters, working from prompts derived from the candidates’ achievements. To maintain consistency, all letters were formatted similarly, so that comparisons rested solely on content differences.

The researchers then designed a survey, which was administered to 32 participants, primarily full professors in the fields of emergency medicine, internal medicine, and family medicine. These participants were randomly given eight out of 16 letters (half AI-authored, half human-authored) to review. They were asked to guess the authorship of each letter, rate its quality, and assess its persuasiveness regarding the candidate’s promotion.

On average, participants correctly identified the authorship only 59.4% of the time, barely above a random guess. Interestingly, even those with extensive experience in reviewing letters did not fare much better. When it came to the perceived quality and persuasiveness of the letters, there was a bias: reviewers rated letters they believed were human-written higher than those they thought were AI-generated. However, when the actual source of the letters was considered, this difference in perception disappeared.

“One surprising element was the overall difficulty participants had in distinguishing between human- and AI-authored LORs, with accuracy only slightly better than chance,” Preiksaitis said. “Additionally, the study revealed a discrepancy in the perceived quality and persuasiveness of LORs based on the suspected authorship, with human-suspected LORs rated more favorably, despite the actual authorship.”

The study also examined gender bias in the letters. Results showed human-written letters contained more female-associated words, while AI-generated letters tended to have more male-associated words. Additionally, AI detection tools like GPTZero and OpenAI’s Text Classifier showed mixed effectiveness, each correctly identifying the authorship of the letters only half of the time.

The findings are in line with a previous study published in Research Methods in Applied Linguistics. In that study, 72 linguistics experts were tested to see if they could differentiate between research abstracts written by AI and humans. Despite the experts’ efforts to use linguistic and stylistic analyses, their success rate was only 38.9%, indicating a significant challenge in distinguishing AI writing from human writing.

“The average person should understand that AI technologies like ChatGPT have reached a level of sophistication where they can generate text, such as LORs, that is nearly indistinguishable from human-authored content,” Preiksaitis told PsyPost. “This suggests that AI might be a viable tool to reduce the administrative workload in academic settings. However, it also raises questions about the integrity and personalization of such important documents. The study highlights the potential for AI to assist in academic writing while also signaling the need for careful consideration of its implications.”

Despite these intriguing results, the study is not without its limitations. The standardized format of the data used in letter creation might not reflect the more personalized and nuanced letters in real-world scenarios. Also, the recruitment strategy could lead to biased results, with an overrepresentation of male participants and those in emergency medicine. Moreover, the study did not delve deeply into why and how reviewers made their distinctions between human- and AI-authored letters.

Future research could explore these areas further, perhaps focusing on how to enhance AI’s ability to write more personalized and unbiased letters. Additionally, as AI continues to advance, it’s essential to consider the ethical implications and the need for transparency in its usage, especially in critical areas like academic evaluations.

“A key caveat is the standardized approach used to generate the LORs, which might not reflect the personalized and nuanced understanding a human writer has of the candidate,” Preiksaitis noted. “The overrepresentation of certain demographics in the participant pool and the potential bias in their responses also could limit the generalizability of our findings. Future research should explore how AI-generated LORs might be optimized for authenticity and how biases, both human and AI, can be mitigated. Additionally, the ethical implications of AI assistance in such tasks need thorough exploration.”

“Perhaps most provocatively, this research and the increasing ability of generative AI cause us to question the utility of practices from a pre-AI era, like LORs,” the researcher added. “Perhaps we can use this crossroads as an opportunity to develop a different way of recommending candidates that is more equitable and transparent.”

The study, “Brain versus bot: Distinguishing letters of recommendation authored by humans compared with artificial intelligence,” was authored by Carl Preiksaitis, Christopher Nash, Michael Gottlieb, Teresa M. Chan, Al’ai Alvarez, and Adaira Landry.

(c) PsyPost Media Inc
