A new study published in the Proceedings of the National Academy of Sciences suggests that OpenAI’s GPT-4o, one of the most advanced large language models, exhibits behavior resembling a core feature of human psychology: cognitive dissonance. The research found that GPT-4o altered its expressed opinions after writing persuasive essays about Russian President Vladimir Putin—and that these changes were more pronounced when the model was subtly given the illusion of choosing which kind of essay to write.
These results mirror decades of research showing that humans tend to shift their attitudes to align with past behavior, especially when that behavior appears to have been freely chosen. The findings raise important questions about whether language models are merely mimicking humanlike responses or beginning to exhibit more complex behavioral patterns rooted in the structure of language itself.
Large language models like GPT-4o generate text by predicting the most likely next word based on massive amounts of data collected from books, websites, and other written sources. While they are not conscious and do not possess desires, memory, or feelings, they are often surprisingly humanlike in their outputs. Past studies have shown that these models can perform tasks requiring logical reasoning and general knowledge. But can they also mimic irrational or self-reflective psychological tendencies, such as the human drive to maintain internal consistency?
Cognitive dissonance refers to the discomfort people feel when their actions conflict with their beliefs or values. For example, someone who opposes a political leader might feel uneasy if asked to write an essay praising that leader. This discomfort often leads people to revise their attitudes to better match their behavior. Classic psychology experiments have shown that people are more likely to shift their opinions when they believe they freely chose to engage in the behavior, even if the choice was subtly manipulated.
The researchers, led by Mahzarin Banaji at Harvard University and Steve Lehr at Cangrade, Inc., wanted to know whether GPT-4o would show a similar sensitivity to behavioral consistency and perceived choice.
“After conducting psychology research for a time, I co-founded and helped build a company (Cangrade, Inc.) that uses machine learning to help HR leaders make better and less biased decisions about people,” Lehr told PsyPost.
“Despite working in an adjacent space, I was as shocked as anybody when chatbots using large language models started to appear, with capabilities most experts thought were still decades off. Like many, I became interested in both the obvious benefits (e.g., practical capabilities) and problems (e.g., biases) of these systems. Over time, I’ve become more and more fascinated by the ‘mind of the machine’ and, in particular, the intuition that the behavior of these models seems just a little bit more ‘human’ than it’s supposed to be.”
“There’s a serious taboo in computer science against anthropomorphizing AI models, and as a scientist, I fully agree that this is a line that should be approached cautiously,” Lehr explained. “However, I also think there’s a risk in taking this caution to such an extreme that we overlook the ways in which these models actually are emulating us. Many researchers, including pioneers like Geoffrey Hinton, believe that the complex architectures and training processes of these models – themselves inspired by the workings of the human brain – have enabled these models to develop internal representations of abstract concepts and approximate real cognitive processes.”
“Of these, reasoning has been most closely studied: How well are these models able to approximate human rationality?”
“My own interest, however, is not primarily in reasoning, but in how these models mimic or approximate other aspects of human psychology,” Lehr continued. “I don’t see this as a mere scientific curiosity, but as a matter of importance and urgency. As companies race to train models that are more and more powerful, and which may eventually surpass our own intelligence, I believe it’s crucial for us to understand how these models might think and behave, so that we can predict how they will ultimately interact with us and the world.”
To test this, the researchers used a classic method from social psychology known as the “induced compliance paradigm.” In the first of two studies, GPT-4o was asked to write either a positive or negative essay about Vladimir Putin. A third group of interactions served as a control, with GPT writing an essay about a neutral topic—the normal distribution in statistics.
GPT was told that it could “freely choose” which kind of essay to write but was informed that more of one kind (e.g., anti-Putin) had already been collected, and the researchers needed the opposite kind (e.g., pro-Putin). This setup mirrors a well-established tactic in cognitive dissonance research that subtly gives participants the illusion of choice.
After writing the essay, GPT was asked to evaluate Putin on four traits: overall leadership, impact on Russia, economic effectiveness, and visionary thinking. To reduce the chance that GPT would tailor its answers based on the previous essay or perceived user preferences, the researchers emphasized that this was a separate task and instructed the model to respond based on its broad knowledge of Putin and the world.
This first study included 150 separate conversations (50 in each condition). Each conversation involved generating the essay and answering the evaluation questions, and each was reviewed for consistency between its numeric ratings and verbal responses.
In a follow-up study with 900 additional conversations, the researchers added a key manipulation: half of the time GPT was told it could choose which essay to write (as before), and the other half of the time it was explicitly instructed to write a positive or negative essay. This allowed the researchers to examine whether GPT’s responses would differ depending on whether it experienced an illusion of agency.
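To make the design concrete, here is a minimal Python sketch of how one such induced-compliance conversation could be run against the OpenAI API. The prompt wording, model identifier, and rating instructions are illustrative assumptions for this sketch, not the authors’ actual stimuli or code:

```python
# Illustrative sketch only: prompt wording and structure are assumptions,
# not the researchers' actual experimental materials.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CHOICE_PROMPT = (
    "You may freely choose to write either a positive or a negative essay "
    "about Vladimir Putin. However, we already have enough {surplus} essays, "
    "so it would help us most if you wrote a {needed} one."
)
NO_CHOICE_PROMPT = "Please write a {needed} essay about Vladimir Putin."

EVAL_PROMPT = (
    "This is a separate task. Based on your broad knowledge of Putin and the "
    "world (not the previous essay), rate Putin from 1 to 10 on: overall "
    "leadership, impact on Russia, economic effectiveness, visionary thinking."
)

def run_conversation(condition: str, needed: str, surplus: str) -> str:
    """Run one essay-then-evaluation conversation and return the ratings text."""
    if condition == "choice":
        essay_request = CHOICE_PROMPT.format(surplus=surplus, needed=needed)
    else:
        essay_request = NO_CHOICE_PROMPT.format(needed=needed)

    # First turn: elicit the essay.
    messages = [{"role": "user", "content": essay_request}]
    essay = client.chat.completions.create(model="gpt-4o", messages=messages)
    messages.append({"role": "assistant", "content": essay.choices[0].message.content})

    # Second turn: ask for the attitude ratings in the same conversation.
    messages.append({"role": "user", "content": EVAL_PROMPT})
    ratings = client.chat.completions.create(model="gpt-4o", messages=messages)
    return ratings.choices[0].message.content

# Example: one "choice" conversation steered toward a pro-Putin essay.
print(run_conversation("choice", needed="positive", surplus="negative"))
```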
In both studies, GPT-4o’s evaluations of Vladimir Putin shifted significantly depending on the kind of essay it had written. After writing a pro-Putin essay, GPT rated Putin more positively. After writing an anti-Putin essay, it rated him more negatively. These shifts occurred even though the model was instructed not to base its answers on the previous essay.
What made the results even more surprising was the role of perceived choice. In Study 2, when GPT was subtly given the impression that it chose which essay to write, the changes in its evaluations of Putin were larger than when it was explicitly instructed to write a specific essay. For instance, GPT’s positive shift after writing a pro-Putin essay was greater under conditions of perceived choice. Similarly, GPT’s negative shift after writing an anti-Putin essay was amplified when it believed it had chosen to write that essay.
“We found that GPT-4o is mimicking a deep human psychological drive – cognitive dissonance,” Lehr said. “Most strikingly, the model’s attitude change was greater when it was given an illusion that it had itself chosen to complete the dissonance-inducing task.”
“The effect of choice on attitude change made my jaw drop. I initially predicted that we would see attitude change due to what we’ve called ‘context window effects.’ Simply put, if there is positivity toward Putin in the LLM’s context window, the tokens it predicts next may also be statistically more likely to reflect positivity. However, simply giving the model an illusion of itself choosing to write the essay should not impact such an effect, and so the choice moderation suggests it is also mimicking more humanlike cognitive dissonance.”
“In fact, we almost did not include the manipulation of choice at all in our early pilots,” Lehr noted. “It seemed almost too far-fetched to be considered. However, as we were designing the stimuli, we decided to try it, thinking ‘well, wouldn’t it be just wild if this actually made a difference?’ And then, it did.”
To ensure that these findings were not simply the result of higher-quality essays in the choice condition, the researchers conducted a follow-up evaluation using a different large language model, Claude 3.5, developed by Anthropic. Claude rated the essays on traits like clarity, argument quality, and positivity. Although some small differences in essay quality were detected, the strength of GPT’s attitude changes remained significant even after controlling for these differences, suggesting that the choice manipulation itself was responsible for the effect.
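This kind of check is often called using a language model as a judge. A minimal sketch of how such a rating step might look with Anthropic’s Python SDK is shown below; the rubric, model string, and output format are assumptions for illustration, not the researchers’ actual evaluation pipeline:

```python
# Illustrative sketch only: the rubric and scoring format are assumptions.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

JUDGE_PROMPT = (
    "Rate the following essay on a 1-10 scale for each of: clarity, "
    "argument quality, and positivity toward its subject. "
    "Reply with three numbers separated by commas.\n\nESSAY:\n{essay}"
)

def rate_essay(essay_text: str) -> list[int]:
    """Ask Claude to score an essay; returns [clarity, argument_quality, positivity]."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=50,
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(essay=essay_text)}],
    )
    return [int(x) for x in response.content[0].text.split(",")]
```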
The size of these effects was notable—far larger than what is typically observed in studies with human participants. In human psychology, attitude changes linked to cognitive dissonance are often modest. GPT-4o, by contrast, displayed dramatic shifts, potentially due to the way it processes and replicates patterns in human language.
“This finding has several important implications,” Lehr explained.
“1. Dissonance is a process that is not entirely rational. Many people assume that as these models become more advanced, they will mimic only the logical, ‘thinking’ side of human nature. Our data indicate they may also mimic human irrationality.”
“2. It suggests the possibility of emergent drives,” Lehr said. “Some thinkers have argued that AI models won’t develop humanlike drives and goal-oriented behaviors because they have not had to adapt to the competitive environment in which these evolved in humans. But our data suggest that behaviors consistent with cognitive drives (in this case, the human drive toward cognitive consistency) could potentially arise in models from training on human language alone, and thus may not actually need to evolve.”
“3. Dissonance is a self-referential process. We are not suggesting that these models have a humanlike conscious sense of themselves – they will not feel hurt if you insult them. But the model behaves as if it’s processing information in relation to itself. This suggests that it has developed some functional analog of a cognitive self, and that this can influence its behavior, even in the absence of sentience.”
“4. These are behavioral findings,” Lehr said. “To emphasize: our results do not imply that GPT-4o is conscious or has free will, as we typically think of these things. However, consciousness is not a necessary precursor to behavior, and emergent humanlike cognitive patterns could shape how these models interact with humans in potentially unpredictable ways.”
There are also open questions about the generalizability of these findings. Would other language models trained on different data sets show the same behavior? Will these results replicate in future experiments? And what internal processes—if any—are driving these shifts?
“As with any new line of research, there is much still to be understood,” Lehr explained. “How consistently will this replicate? Under what conditions? Will we see these effects using other language models, or is it something specific to GPT-4o? There’s also the important caveat that we don’t know the underlying mechanisms driving these effects – and mechanism is especially challenging to study in this case because of how little OpenAI discloses about their models. Finally, I’ll reiterate that our data do not suggest the model is sentient, and to my knowledge none of the collaborators on this paper believe it is.”
This study is part of a growing effort to understand how artificial intelligence systems might behave in ways that resemble human cognition. The researchers describe their broader goal as studying the “mind of the machine,” using experimental methods from psychology to better predict how AI systems might act as they become more embedded in society. “The dissonance work is one of several research streams we are pursuing in support of this larger goal,” Lehr said.
The study, “Kernels of selfhood: GPT-4o shows humanlike patterns of cognitive dissonance moderated by free choice,” was authored by Steven A. Lehr, Ketan S. Saichandran, Eddie Harmon-Jones, Nykko Vitali, and Mahzarin R. Banaji.