PsyPost

Large language models outperform experts in predicting neuroscience discoveries

by Mane Kara-Yakoubian
January 16, 2025
(Photo credit: Adobe Stock)


Large language models surpass human experts in predicting neuroscience results, according to a study published in Nature Human Behaviour.

Scientific research is becoming harder to keep up with as the volume of published literature grows. Integrating noisy and voluminous findings to predict outcomes often exceeds human capacity. Artificial intelligence already plays a growing role in tasks such as protein folding and drug discovery, which raised the question of whether large language models (LLMs) could similarly benefit fields like neuroscience.

Xiaoliang Luo and colleagues developed BrainBench, a benchmark designed to test whether LLMs could predict the results of neuroscience studies more accurately than human experts. BrainBench included 200 test cases based on neuroscience research abstracts. Each test case consisted of two versions of the same abstract: one was the original, and the other had a modified result that changed the study’s conclusion but kept the rest of the abstract coherent. Participants—both LLMs and human experts—were tasked with identifying which version was correct.

The study involved 171 human participants, all neuroscience experts with an average of 10 years of experience, including doctoral students, postdoctoral researchers, and academic staff. On the computational side, general-purpose LLMs were tested alongside BrainGPT, a specialized model fine-tuned on over 1.3 billion tokens from neuroscience literature. To ensure a comprehensive assessment, BrainBench covered five major subfields of neuroscience: behavioral/cognitive, cellular/molecular, systems/circuits, neurobiology of disease, and development/plasticity/repair.

To evaluate LLMs, the researchers used a metric called “perplexity,” which measures how surprising a model finds a passage of text. The version of the abstract with the lower perplexity was treated as the model’s choice. Human accuracy was measured as the share of correct answers. The researchers also ensured the test items were not present in the LLMs’ training data, eliminating concerns about memorization.
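As a rough illustration of how a perplexity-based choice could work, the sketch below assumes a model has already assigned a log-probability to each token of both abstract versions, and that the version with the lower perplexity counts as the model's answer. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def pick_version(logprobs_a, logprobs_b):
    """Return 'A' or 'B', whichever version the model finds less surprising."""
    return "A" if perplexity(logprobs_a) <= perplexity(logprobs_b) else "B"

def accuracy(cases):
    """cases: list of (logprobs_a, logprobs_b, correct_label) tuples."""
    hits = sum(pick_version(a, b) == label for a, b, label in cases)
    return hits / len(cases)

# Made-up token log-probabilities, for illustration only.
toy_cases = [
    ([-0.2, -0.4, -0.3], [-0.9, -1.1, -0.8], "A"),  # A is less surprising
    ([-1.5, -1.2], [-0.3, -0.5], "B"),              # B is less surprising
    ([-0.6, -0.6], [-0.4, -0.5], "A"),              # model prefers B, so this is a miss
]
print(accuracy(toy_cases))
```

In this toy run the model picks the labeled version in two of the three cases. A real evaluation would obtain the token log-probabilities from the language model itself.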

LLMs significantly outperformed human experts in predicting neuroscience study outcomes. On average, LLMs achieved 81.4% accuracy, compared to 63.4% for human participants. BrainGPT, the model fine-tuned with neuroscience knowledge, performed even better, improving accuracy by 3% over general-purpose LLMs. This specialized training allowed BrainGPT to excel across all five neuroscience subfields included in the benchmark.

One key advantage of the LLMs was their ability to integrate information from the entire abstract, including the background and methods, rather than relying on isolated details. When tested with only the results section, their accuracy dropped, demonstrating the importance of contextual understanding. Human experts, by contrast, struggled to achieve the same level of integration. Additionally, both humans and LLMs showed higher accuracy when they were confident in their predictions, but LLMs displayed better alignment between confidence and correctness.
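One simple way to check whether confidence tracks correctness is to sort responses by confidence, split them into groups, and compare accuracy across groups. The sketch below assumes each response carries a numeric confidence score. The article does not say how confidence was quantified, so the scores, function name, and data here are purely illustrative.

```python
def accuracy_by_confidence(records, n_bins=2):
    """records: list of (confidence, was_correct) pairs.

    Returns accuracy per bin, ordered from lowest to highest confidence.
    """
    n = len(records)
    if n_bins < 1 or n < n_bins:
        raise ValueError("need at least one record per bin")
    ordered = sorted(records, key=lambda r: r[0])
    result = []
    for i in range(n_bins):
        # Integer boundaries split the sorted records into near-equal bins.
        chunk = ordered[i * n // n_bins:(i + 1) * n // n_bins]
        result.append(sum(correct for _, correct in chunk) / len(chunk))
    return result

# Made-up (confidence, was_correct) pairs, for illustration only.
toy_records = [
    (0.05, False), (0.10, True), (0.20, False),
    (0.60, True), (0.90, True), (1.40, True),
]
print(accuracy_by_confidence(toy_records))
```

If accuracy rises from the low-confidence bin to the high-confidence bin, as it does in this toy data, confidence is at least loosely aligned with correctness.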

Importantly, the study confirmed that LLMs’ success was not due to memorization but rather their ability to recognize patterns in neuroscience research, highlighting their potential to assist in scientific discovery.


The authors acknowledge that BrainBench, while innovative, is labor-intensive to create. Moreover, there is a risk that reliance on LLM predictions could discourage researchers from pursuing studies that contradict AI predictions, potentially stifling innovation.

The study, “Large language models surpass human experts in predicting neuroscience results,” was authored by Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, Kevin K. Nejad, Felipe Yáñez, and colleagues.

(c) PsyPost Media Inc
