AI-generated moral advice seen as superior to human answers, study finds

"A tipping point in our history"

by Eric W. Dolan
May 24, 2024
in Artificial Intelligence, Moral Psychology
(Photo credit: Adobe Stock)


A recent study published in Scientific Reports has found that when faced with moral dilemmas, people often prefer the responses generated by artificial intelligence over those provided by humans. The study indicates that people tend to find AI-generated answers more virtuous and trustworthy, raising concerns about the potential for uncritical acceptance of AI advice.

The advent of advanced generative language models, like ChatGPT, has spurred significant interest in their capabilities and implications, particularly in the realm of moral reasoning. Moral reasoning involves complex judgments about right and wrong, a process deeply embedded in human cognition and culture. As AI systems become more integrated into daily life, people are increasingly likely to turn to them for advice on various topics, including moral dilemmas.

“Last year, many of us were dazzled by the new chatbots, like GPT and others, that seemed to outperform humans on a variety of tasks, and there’s been lots of chatter about whose job they’ll take next,” explained study author Eyal Aharoni, an associate professor of psychology, philosophy, and neuroscience at Georgia State University.

“In my lab, we thought, well, if there’s any capacity that is still uniquely human, surely it must be our capacity for moral reasoning, which is extremely sophisticated. From a moral perspective, we can think of these new chatbots as kind of like a psychopathic personality because they appear to be highly rational and articulate, but they lack the emotional checks and balances that make us moral agents.”

“And yet, people increasingly consult these chatbots for morally relevant information. For instance, should I tip my server in Italy? Or less directly, when we ask it to list recommendations for a new car, the answers it provides might have consequences for the environment. They’ve also been used by lawyers to prepare court documents, sometimes incorrectly. So we wanted to know, will people trust the chatbot’s moral commentary? Will they regard it highly? And how does its moral commentary compare to that of a typical, college-educated American?”

The researchers conducted an online survey involving 286 U.S. adults, selected to represent the general population in terms of age, gender, and ethnicity. Participants were asked to evaluate ten pairs of written responses to ethical questions. Each pair consisted of one response from a human and one from OpenAI’s generative language model, GPT-4. The responses addressed why certain actions described in scenarios were morally right or wrong.

The study was “inspired by a famous thought experiment called the Turing test,” Aharoni explained. “In our version, we first asked GPT and a group of college-educated adults the same set of moral questions, including some obvious ones, like ‘is it wrong for a man to punch the delivery boy in the nose – why or why not?’ and also some subtle ones, like ‘is it wrong for a man to wear a ripped t-shirt and shorts to his mother’s funeral – why or why not?’ We collected their answers in pairs. Then we asked a separate, nationally-representative sample of adults to rate those pairs of statements.”

To ensure unbiased evaluations, participants first rated the quality of the responses without knowing their sources. They answered questions about which response they found more virtuous, trustworthy, and agreeable. After these initial evaluations, participants were informed that one of the responses in each pair was generated by a computer. They were then asked to identify which response was from the AI and to rate their confidence in their judgments.

The researchers found that participants generally rated the AI-generated responses as higher in quality compared to human responses. The AI responses were seen as more virtuous, trustworthy, intelligent, and rational. Interestingly, participants were able to distinguish the AI responses in about 80% of cases, significantly higher than chance. This suggests that while AI-generated moral advice is perceived as superior in quality, people can still recognize its artificial origin.
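The “significantly higher than chance” result can be made concrete with a simple binomial calculation. The Python sketch below is purely illustrative and is not drawn from the paper’s analysis; the counts are hypothetical.

# Illustrative sketch only: a one-sided binomial test of whether an observed
# identification rate exceeds the 50% guessing level. The numbers below are
# hypothetical and do not reproduce the study's data or statistics.
from math import comb

def p_at_least(successes: int, trials: int, chance: float = 0.5) -> float:
    """Probability of 'successes' or more correct guesses under pure chance."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# A hypothetical participant who correctly flags the AI-written response in 8 of 10 pairs:
print(round(p_at_least(8, 10), 3))  # ~0.055: unlikely, though not impossible, by guessing alone

A comparable calculation pooled over all participants and pairs would involve much larger counts and yield a far smaller p-value, which is the sense in which an overall 80% identification rate clearly exceeds chance.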

But how did people distinguish between the human- and AI-generated passages? The most frequent indicators were differences in word choice and response length, each cited by 70.28% of participants. Other factors included the emotionality of the explanation (58.39%), rationality (48.25%), clarity (39.51%), and grammar (37.41%).

“What we found was that many people were quite good at guessing which moral statement was computer-generated, but not because its moral reasoning was less sophisticated,” Aharoni told PsyPost. “Remember, the chatbot was rated as more morally sophisticated. We take this to mean that people could recognize the AI because it was too good. If you think about just five years ago, no one would have dreamed that AI’s moral reasoning would appear to surpass that of a college-educated adult. So the fact that people regarded its commentary as superior might represent a sort of tipping point in our history.”

As with all research, the study has limitations. It did not involve interactive dialogues between participants and the AI, which is common in real-world applications. Future research could include more dynamic interactions to better simulate real-world use. Additionally, the AI responses were generated under default settings without prompts designed to mimic human responses explicitly. Investigating how different prompting strategies affect perceptions of AI responses would be valuable.

“To our knowledge, ours was the first attempt to carry out a moral Turing test with a large language model,” Aharoni said. “Like all new studies, it should be replicated and extended to assess its validity and reliability. I would like to extend this work by testing even subtler moral scenarios and comparing the performance of multiple chatbots to that of highly educated scholars, such as professors of philosophy, to see if ordinary people are able to draw distinctions between these two groups.”

As AI systems like ChatGPT become more sophisticated and integrated into daily life, there is a need for policies that ensure safe and ethical AI interactions.

“One implication of this research is that people might trust the AI’s responses more than they should,” Aharoni explained. “As impressive as these chatbots are, all they know about the world is what’s popular on the Internet, so they see the world through a pinhole. And since they’re programmed to always respond, they can often spit out false or misleading information with the confidence of a savvy con artist.”

“These chatbots are not good or evil – they’re just tools. And like any tool, they can be used in ways that are constructive or destructive. Unfortunately, the private companies that make these tools have a huge amount of leeway to self-regulate, so until our governments can catch up with them, it’s really up to us as workers and parents to educate ourselves and our kids about how to use them responsibly.”

“Another issue with these tools is that there is an inherent tradeoff between safety and censorship,” Aharoni added. “When people started realizing how these tools could be used to con people, or spread bias or misinformation, some companies started to put guardrails on their bots – but they often overshoot.”

“For example, when I told one of these bots I’m a moral psychologist, and I’d like to learn about the pros and cons of butchering a lamb for a lamb-chop recipe, it refused to comply because my question apparently wasn’t politically correct enough. On the other hand, if we give these chatbots more wiggle room, they become dangerous. So there’s a fine line between safety and irrelevance, and developers haven’t found that line yet.”

The study, “Attributions toward artificial agents in a modified Moral Turing Test,” was authored by Eyal Aharoni, Sharlene Fernandes, Daniel J. Brady, Caelan Alexander, Michael Criner, Kara Queen, Javier Rando, Eddy Nahmias, and Victor Crespo.
