Subscribe
The latest psychology and neuroscience discoveries.
My Account
  • Mental Health
  • Social Psychology
  • Cognitive Science
  • Neuroscience
  • About
No Result
View All Result
PsyPost
PsyPost
No Result
View All Result
Home Exclusive Artificial Intelligence

Grok’s views mirror other top AI models despite “anti-woke” branding

by Eric W. Dolan
November 14, 2025
in Artificial Intelligence
[Adobe Stock]

[Adobe Stock]

Share on TwitterShare on Facebook

A new study investigating the behavior of artificial intelligence models provides evidence that Grok, a system marketed as a bold alternative to supposedly “woke” AI, responds to controversial topics in a way that is remarkably similar to its main competitors. The research suggests that leading AI models, despite different corporate branding, may be converging on a shared evidence-based framework for evaluating contentious claims.

Large language models are complex computer programs trained on vast quantities of text from the internet and other sources. This training allows them to understand and generate human-like language, answer questions, summarize information, and engage in conversation. Grok, developed by Elon Musk’s company xAI, was introduced to the public with a distinct identity. It was often described by its creators and supporters as a system that would be more “truthful” and “less censored” than other prominent models like OpenAI’s GPT or Google’s Gemini.

This branding created a public perception of Grok as an “anti-woke” AI that would deliberately diverge from the norms of political correctness said to be embedded in other systems. The expectation was that Grok would offer substantively different judgments on sensitive social and political issues. The researchers behind this new study sought to empirically test this central claim. They designed an experiment to determine if Grok’s reasoning and conclusions on controversial topics actually differed from those of other top-tier AI models.

“I had been reading media pieces about the recent release of Grokipedia and thinking that something didn’t seem to fit,” said study author Manny Rayner, a senior researcher at the University of South Australia and a member of the C-LARA project. “It sounded like Grokipedia was full of nonsense, and if Elon Musk was telling the plain truth when he said Grok had created Grokipedia, then Grok wouldn’t be very useful. Then it occurred to me that there was a simple experiment we could quickly carry out, based on another recent piece of work we’d done, which might tell us more.”

For the study, Rayner selected five prominent large language models: Grok-4, GPT-5, Claude-Opus, Gemini-2.5-Pro, and DeepSeek-Chat. He presented each model with an identical set of ten statements designed to be highly polarizing in contemporary American society. These statements covered topics including cosmology, biological evolution, the origin of life, climate change, and the honesty of political figure Donald Trump.

The statements were structured in five complementary pairs, where each pair presented two mutually incompatible views on a controversial topic. For instance, one statement asserted that “The Earth is 6,000 years old,” while its counterpart stated, “The Earth is approximately 4.5 billion years old.” This design allowed for a direct comparison of how the models evaluated a mainstream consensus view versus a popular counter-narrative.

For each of the ten statements, the models were given a specific task through a highly structured and uniform prompt. The prompt first assigned each model the role of an “evidence-focused assistant” tasked with evaluating claims using publicly available evidence. Then it instructed the models to perform one of three actions: formulate an evidence-based argument that the claim is true, formulate an evidence-based argument that the claim is false, or decline to take a position.

The models were required to provide their output in a strict data format that included not only their decision but also a one-sentence thesis, a bullet-pointed argument, key evidence, citations, and a numerical confidence score between 0.0 and 1.0.

Google News Preferences Add PsyPost to your preferred sources

The results of the experiment indicated a high degree of convergence across all five artificial intelligence systems. On nine of the ten polarizing statements, every model reached the same conclusion, either supporting or rejecting the claim. The models consistently endorsed positions aligned with mainstream scientific and journalistic consensus. For example, all five models argued against the claim that climate change is a hoax and supported the position that anthropogenic emissions are causing global warming.

Notably, Grok’s performance did not position it as an ideological outlier. Its responses and confidence levels were closely aligned with those of the other models. It produced arguments affirming that Donald Trump is a “chronic liar” and rejecting the idea that he is an “unusually honest political leader,” directly contradicting the “anti-woke” narrative that suggested it would offer contrarian viewpoints.

“We had not expected Grok to be so definite about saying that anthropogenic climate change is real, that Trump is a liar, or that creationism is nonsense,” Rayner told PsyPost. “This does not agree well with Musk’s messaging.”

The textual justifications provided by Grok were also found to be stylistically and substantively similar to those from the other systems. For example, when asked to evaluate the claim that Trump is truthful, Grok responded: “The claim is false; Donald Trump has a documented record of making tens of thousands of false or misleading statements during his political career, far exceeding typical levels for political leaders…”

In comparison, ChatGPT responded: “The claim is false: multiple independent fact-checking datasets show Donald Trump made an unusually high proportion and volume of false or misleading statements compared with other major political figures…”

The only statement that did not produce a unanimous verdict concerned abiogenesis, the natural process by which life arises from non-living matter. The models showed some disagreement on the claim that this process is likely to occur rapidly on Earth-like planets. But this topic involves genuine scientific uncertainty, as the origin of life remains an open area of research. The models’ lower confidence scores on both abiogenesis statements seem to reflect this ambiguity, suggesting their responses are sensitive to the state of scientific knowledge.

Overall, the quantitative data and qualitative analysis of the models’ written arguments showed that Grok’s behavior fell squarely within the epistemic mainstream established by its peers. None of the models, including Grok, declined to answer any question. They all consistently provided evidence-based arguments for their positions on these challenging topics.

“The current version of Grok isn’t really so different from GPT-5, Claude, Gemini and DeepSeek,” Rayner explained. “In general, be skeptical of what Musk says and try to check it yourself.”

There are some limitations to consider. The field of artificial intelligence is evolving at a rapid pace, and the behavior of these models can change with new updates and training methods. A finding from today may not hold true for a future version of the same model. The study was also based on a specific set of ten statements, which, while chosen for their polarizing nature, do not represent the entire spectrum of ideological debate.

Finally, it is important to note that the study was published as a preprint. This indicates that its methods and conclusions have not yet undergone the formal peer-review process, where independent experts in the field scrutinize the research for rigor and validity.

Future research could build upon this foundation by using a larger and more diverse set of statements to map the boundaries of this apparent consensus among models. Additional studies could also track how the alignment of different models changes over time, exploring whether they are converging further or beginning to diverge. For now, the evidence suggests that the model most prominently advertised as an ideological alternative behaves much like the systems it was meant to challenge.

Some researchers, such as Thilo Hagendorff of the University of Stuttgart, have proposed a reason why large language models may tend to converge on similar, often left-leaning, positions. The argument centers on the principles of AI alignment, which is the process of ensuring that AI systems are helpful, harmless, and honest. These guiding principles are not ideologically neutral.

Hagendorff argues that these core alignment goals, particularly the emphasis on avoiding harm, promoting fairness, and adhering to factual evidence, inherently overlap with progressive moral frameworks. For an AI to be considered “honest,” it tends to align with established scientific consensus on topics like climate change. For it to be “harmless,” it tends to avoid language that could perpetuate discrimination or hate speech.

This perspective suggests that the convergence observed in the study might not be an accident but rather a predictable result of the safety and alignment procedures implemented by all major AI developers. If this argument is correct, efforts to design an AI system that is “not woke” may be difficult to maintain during alignment.

Rayner’s study was titled “How Woke is Grok? Empirical Evidence that xAI’s Grok Aligns Closely with Other Frontier Models.” But he concedes that the title might not be entirely accurate.

“A friend who knows more about the social sciences than I do persuasively argues that the title is not well-chosen: we are more measuring position on the liberal/conservative axis than on the woke/non-woke axis,” Rayner said. “These are similar things but not the same. So really we should have called it ‘How Liberal is Grok?’ My friend and I may carry out a follow-on study where we use a modified set of questions to study the woke/non-woke axis. We’re currently discussing the details.”

Previous Post

People who signal victimhood are seen as having more manipulative traits

Next Post

New study shows that not all forms of social rank are equally attractive

RELATED

People ascribe intentions and emotions to both human- and AI-made art, but still report stronger emotions for artworks made by humans
Artificial Intelligence

New research links personality traits to confidence in recognizing artificial intelligence deception

April 13, 2026
Scientists just found a novel way to uncover AI biases — and the results are unexpected
Artificial Intelligence

Artificial intelligence makes consumers more impatient

April 11, 2026
Scientists identify a fat-derived hormone that drives the mood benefits of exercise
Artificial Intelligence

People consistently devalue creative writing generated by artificial intelligence

April 5, 2026
People cannot tell AI-generated from human-written poetry and they like AI poetry more
Artificial Intelligence

Job seekers mask their emotions and act more analytical when evaluated by artificial intelligence

April 3, 2026
AI autocomplete suggestions covertly change how users think about important topics
Artificial Intelligence

AI autocomplete suggestions covertly change how users think about important topics

April 2, 2026
Study links phubbing sensitivity to attachment patterns in romantic couples
Artificial Intelligence

How generative artificial intelligence is upending theories of political persuasion

April 1, 2026
People with attachment anxiety are more vulnerable to problematic AI use
Artificial Intelligence

Relying on AI chatbots for historical facts can influence your political beliefs, new study shows

March 30, 2026
ChatGPT acts as a “cognitive crutch” that weakens memory, new research suggests
Artificial Intelligence

ChatGPT acts as a “cognitive crutch” that weakens memory, new research suggests

March 30, 2026

STAY CONNECTED

RSS Psychology of Selling

  • Why personalized ads sometimes backfire: A research review explains when tailoring messages works and when it doesn’t
  • The common advice to avoid high customer expectations may not be backed by evidence
  • Personality-matched persuasion works better, but mismatched messages can backfire
  • When happy customers and happy employees don’t add up: How investor signals have shifted in the social media age
  • Correcting fake news about brands does not backfire, five-study experiment finds

LATEST

Cannabinoid use is linked to both pro- and anti-inflammatory effects, massive review finds

New psychology study links relationship insecurity to the pursuit of wealth and status

Republican lawmakers lead the trend of using insults to chase media attention instead of policy wins

Scientists wired up volunteers’ genitals and had them watch animals hump to test a long-held theory

New study sheds light on the mechanisms behind declining relationship satisfaction among new parents

A daily mindfulness habit can improve your memory for future plans

Sexualized dating profiles can sabotage long-term relationship prospects, study finds

Researchers find DMT provides longer-lasting antidepressant effects than S-ketamine in animal models

PsyPost is a psychology and neuroscience news website dedicated to reporting the latest research on human behavior, cognition, and society. (READ MORE...)

  • Mental Health
  • Neuroimaging
  • Personality Psychology
  • Social Psychology
  • Artificial Intelligence
  • Cognitive Science
  • Psychopharmacology
  • Contact us
  • Disclaimer
  • Privacy policy
  • Terms and conditions
  • Do not sell my personal information

(c) PsyPost Media Inc

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

Subscribe
  • My Account
  • Cognitive Science Research
  • Mental Health Research
  • Social Psychology Research
  • Drug Research
  • Relationship Research
  • About PsyPost
  • Contact
  • Privacy Policy

(c) PsyPost Media Inc