PsyPost

Scientists tested the creativity of AI models, and the results were surprisingly homogeneous

by Eric W. Dolan
April 18, 2026

A recent study published in PNAS Nexus suggests that while artificial intelligence chatbots can match or exceed human creativity on individual tasks, they produce highly similar responses when compared to one another. The finding provides evidence that widespread reliance on artificial intelligence for creative tasks could lead to a loss of unique ideas.

Scientists Emily Wenger and Yoed N. Kenett designed this study to understand how large language models affect the diversity of human thought. Large language models are the technology behind popular AI chatbots that predict and generate text based on user prompts.

Large language models are complex computer programs designed to process and produce human language. Developers build these systems by training them on billions of sentences from books, articles, and websites. By analyzing this vast amount of text, the models learn the mathematical patterns and relationships between words.

When a user gives a chatbot a prompt, the model works by calculating the most probable next word in a sequence. It builds responses one word at a time based on the rules and associations it learned during its training phase. Wenger suspected this shared training method across different systems might cause a broader issue.

“Most of today’s LLMs are trained on massive datasets of scraped internet data, which functionally means they’re all trained on roughly the same data,” said Wenger, Cue Family Assistant Professor at Duke University. “Traditional machine learning research has shown that training models on the same dataset leads to models with similar properties. I was wondering if this phenomenon would occur in commercial LLMs and what the implications might be.”

To investigate this, the researchers recruited 102 human participants through Prolific, an online platform for survey research. They screened the human participants to remove computer bots and ensure everyone passed basic attention checks. They also selected 22 different language models from various companies, including well-known chatbots produced by Google, Meta, and OpenAI.

Both humans and language models completed three standard verbal creativity tasks. The first was the Alternative Uses Task, which asks participants to list as many creative uses as possible for everyday objects like a fork, a book, or a pair of pants. This assessment tests divergent thinking, which is the ability to generate multiple unique solutions to a single problem.

The second assessment was the Forward Flow task, which measures associative thinking. Participants receive a starting word, like “snow” or “candle,” and must provide a chain of up to 20 subsequent words that naturally follow in their minds. Associative thinking helps individuals search through their memories and combine different concepts into new ideas.

The final assessment was the Divergent Association Task. This exercise required participants to generate 10 nouns that are as unrelated to one another as possible. Generating unrelated words demonstrates a cognitive flexibility that is strongly linked to creative abilities in humans.

The scientists then used computational text-analysis tools to evaluate the responses. These tools embed words into a mathematical space and measure the semantic distance between them, which indicates how different words and concepts are from one another. The researchers measured both the individual originality of a single answer and the overall variability among all answers in a group.
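To make the idea of semantic distance concrete, here is a minimal sketch in Python. It assumes cosine distance as the measure and uses invented three-dimensional vectors; the article does not specify the embedding model or the exact formula the researchers used, and the names `cosine_distance`, `mean_pairwise_distance`, and `toy_embeddings` are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors):
    """Average the distance over every unordered pair of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Assumption: made-up 3-dimensional vectors standing in for real word embeddings.
toy_embeddings = {
    "snow": [0.9, 0.1, 0.0],
    "ice": [0.8, 0.2, 0.1],
    "candle": [0.1, 0.9, 0.2],
    "tax": [0.0, 0.1, 0.9],
}

related = [toy_embeddings[w] for w in ("snow", "ice")]
unrelated = [toy_embeddings[w] for w in ("snow", "candle", "tax")]

related_score = mean_pairwise_distance(related)
unrelated_score = mean_pairwise_distance(unrelated)
print(f"related words:   {related_score:.3f}")
print(f"unrelated words: {unrelated_score:.3f}")
```

In this toy setup the unrelated word list scores a larger average distance than the related one. That is the property a task like the Divergent Association Task rewards.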

The researchers found that individual language models performed at or slightly above the level of the average human on most of the tasks. When the scientists looked at a single response in isolation, the chatbots provided highly original answers. However, a pattern of similarity emerged when the scientists compared all the responses from the different models to one another.

Across all tasks, the models produced answers that were significantly more alike than the answers provided by humans. The chatbots frequently relied on the same overlapping vocabulary, causing their creative outputs to group together in a highly uniform way. This similarity was even more pronounced when the researchers compared models built by the same company.
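Group-level similarity of this kind can be illustrated with a simple vocabulary-overlap score. The sketch below uses Jaccard overlap between sets of words, which is a deliberately simpler stand-in for the embedding-based measures described in the study. The response lists are invented for illustration and are not data from the paper, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of distinct words that two responses have in common."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def mean_pairwise_overlap(responses):
    """Average Jaccard overlap across every pair of respondents."""
    pairs = list(combinations(responses, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Assumption: invented answers to "list uses for a fork", not study data.
model_like = [
    ["comb", "garden", "tool", "hook"],
    ["comb", "garden", "tool", "stand"],
    ["comb", "hook", "tool", "stand"],
]
human_like = [
    ["comb", "antenna", "catapult"],
    ["doorstop", "stylus", "earring"],
    ["trellis", "hook", "tuning"],
]

model_overlap = mean_pairwise_overlap(model_like)
human_overlap = mean_pairwise_overlap(human_like)
print(f"model-like group overlap: {model_overlap:.2f}")
print(f"human-like group overlap: {human_overlap:.2f}")
```

Each answer in the first group could look original on its own, yet the group as a whole shares most of its vocabulary. A higher group-level overlap score reflects that kind of homogeneity.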

“My hypothesis was that there would be some degree of homogeneity among LLM responses relative to humans, but I was surprised by the degree,” Wenger said.

Wenger and Kenett also tested whether they could force the models to be more diverse. They adjusted the “temperature” setting on the models, which is a mechanism that controls the level of randomness in the text generation process. Low temperatures produce highly predictable text, while high temperatures introduce more random word choices.
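The effect of temperature can be sketched with the standard softmax-with-temperature formula, in which a model's raw scores for candidate words are divided by the temperature before being converted to probabilities. The scores below are invented, and the function name is hypothetical; commercial chatbots may apply additional sampling steps that are not shown here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Assumption: made-up scores for three candidate next words.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # sharp, predictable
high = softmax_with_temperature(logits, 5.0)  # flat, more random

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
```

At the low setting nearly all of the probability lands on the top-scoring word. At the high setting the three candidates become close to equally likely, which is why very high temperatures tend to drift toward incoherent text.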

While increasing the randomness did make the responses more varied, it quickly caused the models to produce nonsensical gibberish. These random responses no longer fulfilled the basic requirements of the creative prompts. True creativity requires an idea to be both novel and appropriate for the situation, so generating gibberish does not count as a successful creative output.

The researchers also tried changing the initial instructions given to the models. They explicitly commanded the chatbots to act as creative assistants and provide bold, outside-the-box answers. This slightly improved individual originality but completely failed to fix the broader issue of uniformity, as the models still produced responses similar to one another.

These findings suggest that relying on generative AI for brainstorming or problem-solving could limit the scope of human creativity. If everyone uses these tools to help write drafts or generate ideas, society might see a massive narrowing of concepts.

“If you are using AI chatbots (which are built on LLMs) for creative tasks, know that the results you get from these models will likely look very similar to the results someone else would get from an AI chatbot, even if it’s different from the one you used. If you want your content to be truly unique, probably shy away from using an AI chatbot to generate it,” Wenger said.

The researchers note some potential misinterpretations and limitations of their work. The study only measured performance on specific verbal creativity tasks, which means the results might not apply to all forms of creative behavior. For example, language models might not show the same homogenization when asked to perform generic non-verbal tasks like drawing or composing music.

Additionally, the scientists only tested commercially available models that have been programmed to follow strict safety and conversational guidelines. This safety training is known to affect how models behave in experimental settings. It is possible that raw, unaligned models might display different creative properties, though most everyday users do not have access to these raw versions.

Future research will need to explore other dimensions of creativity, such as fluency and flexibility, rather than just originality. Fluency refers to the sheer number of ideas generated, while flexibility refers to the variety of categories those ideas cover. The scientists also hope to investigate the extent of this homogenization across other types of artificial intelligence and explore potential engineering solutions to mitigate the problem.

The study, “Large language models are homogeneously creative,” was authored by Emily Wenger and Yoed N. Kenett.

(c) PsyPost Media Inc