PsyPost

Social reasoning in AI traced to an extremely small set of parameters

by Karina Petrova
November 18, 2025
in Artificial Intelligence

A new study reveals that the capacity for social reasoning in large language models, a trait similar to the human “theory of mind,” originates from an exceptionally small and specialized subset of the model’s internal parameters. Researchers found that these few parameters are deeply connected to the mechanisms that allow a model to understand word order and context. The work, published in npj Artificial Intelligence, provides a look into how complex cognitive-like abilities can emerge from the architecture of artificial intelligence.

Theory of mind is the ability to attribute mental states like beliefs, desires, and intentions to oneself and to others. It is what allows a person to understand that someone else might hold a false belief, for example, believing an object is in a box when it has been secretly moved to a drawer. This type of social reasoning is fundamental to human interaction.

In recent years, large language models have demonstrated an apparent ability to solve tasks designed to test this capacity, but the internal processes giving rise to this skill have remained largely opaque. Understanding these mechanics is a key goal for researchers working on making artificial intelligence more transparent and predictable.

This investigation was conducted by a team of researchers from Stanford University, Princeton University, the University of Minnesota, the University of Illinois Urbana-Champaign, and the Stevens Institute of Technology. Their work aimed to move beyond simply testing a model’s performance on social reasoning tasks.

Instead, they sought to identify the specific internal components responsible for this behavior, effectively looking under the hood to see how the machine performs its reasoning. They set out to pinpoint which of the billions of parameters in a model are most sensitive to theory-of-mind tasks and to determine how those parameters influence the model’s computations.

To identify the parameters responsible for theory of mind, the researchers developed a novel method based on a mathematical tool that measures how much the model’s performance changes when a specific parameter is slightly altered. They first calculated this sensitivity for parameters while the model performed theory-of-mind tasks, specifically “false-belief” scenarios.

These tasks test if a model can recognize that an agent’s belief about the world is different from reality. For instance, a model would be presented with a story where a character places an item in one location, and then another character moves it without the first one’s knowledge. The model must correctly predict that the first character will look for the item in its original location.
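
To make the structure of such a scenario concrete, here is a minimal sketch that tracks where an item really is and where each character believes it is. It is an illustration only, not the benchmark or prompts used in the study; the character names and the `run_story` helper are hypothetical.

```python
def run_story(events):
    """Track the true location of an item and each character's belief about it.

    Each event is (actor, new_location, witnesses): the actor moves the item,
    and only the actor and the listed witnesses update their beliefs.
    """
    true_location = None
    beliefs = {}
    for actor, new_location, witnesses in events:
        true_location = new_location
        for person in {actor, *witnesses}:
            beliefs[person] = new_location
    return true_location, beliefs


# Sally puts the ball in the box while Anne watches.
# Anne then moves it to the drawer while Sally is away.
story = [
    ("Sally", "box", ["Anne"]),
    ("Anne", "drawer", []),
]
true_location, beliefs = run_story(story)

print(true_location)     # drawer (where the ball really is)
print(beliefs["Sally"])  # box (where Sally will look)
```

Passing the test means answering with Sally's belief ("box") rather than the true state of the world ("drawer").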

This initial process identified a set of parameters sensitive to these social reasoning puzzles. However, the team recognized that some of these parameters might also be essential for general language processing. To isolate the ones specifically related to theory of mind, they performed a second sensitivity analysis on a general language modeling task and created a map of parameters vital for basic language functions. By subtracting this general language map from the theory-of-mind map, they were left with a very small, specialized set of parameters primarily dedicated to social reasoning.
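
The two-map idea can be sketched on a toy problem. The example below is an assumption-laden illustration: it uses a simple finite-difference sensitivity score and two made-up loss functions (`tom_loss`, `lm_loss`) over eight parameters, which stand in for the study's actual sensitivity measure and tasks.

```python
def sensitivity(loss_fn, params, eps=1e-4):
    """Central finite-difference estimate of |d loss / d param| per parameter."""
    scores = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        scores.append(abs(loss_fn(up) - loss_fn(down)) / (2 * eps))
    return scores


def top_k(scores, k):
    """Indices of the k most sensitive parameters."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


# Hypothetical losses: parameters 0, 1, 2 matter for the "ToM" task,
# parameters 1 and 3 matter for the "general language" task.
def tom_loss(p):
    return 5.0 * p[0] ** 2 + 4.0 * p[1] ** 2 + 3.0 * p[2] ** 2 \
        + 0.01 * sum(x * x for x in p[3:])


def lm_loss(p):
    return 6.0 * p[1] ** 2 + 5.0 * p[3] ** 2 \
        + 0.01 * (p[0] ** 2 + p[2] ** 2 + sum(x * x for x in p[4:]))


params = [1.0] * 8
tom_map = top_k(sensitivity(tom_loss, params), k=3)
lm_map = top_k(sensitivity(lm_loss, params), k=2)

# Subtract the general-language map from the ToM map.
tom_only = tom_map - lm_map
print(sorted(tom_only))  # [0, 2]
```

Parameter 1 is sensitive on both tasks, so the subtraction removes it and leaves only the parameters that matter for the toy "ToM" loss alone.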

With these “ToM-sensitive” parameters identified, the team conducted a perturbation experiment. They altered the values of this tiny group of parameters, which constituted as little as 0.001% of the model’s total parameters. The effect on the model’s performance was significant.

Across several different language models, this small change caused a substantial drop in their ability to correctly answer theory-of-mind questions. As a control, the researchers also perturbed a randomly selected group of parameters of the same size. This random alteration had almost no effect on performance, indicating that the identified ToM-sensitive parameters have a specialized function.
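
The logic of this targeted-versus-random comparison can be sketched with a toy loss. Everything here is hypothetical: the parameter count, the three "sensitive" indices, and the loss function are chosen only so that the sensitive group makes up 0.001% of the parameters, and they do not reproduce the study's models or numbers.

```python
import random

N_PARAMS = 300_000
SENSITIVE = {3, 17, 42}  # hypothetical stand-ins for ToM-sensitive parameters


def task_loss(params):
    """Toy loss: sensitive weights matter a lot, all others barely matter."""
    return sum(
        (1.0 if i in SENSITIVE else 1e-6) * p * p
        for i, p in enumerate(params)
    )


def perturb(params, indices, delta=1.0):
    """Return a copy of params with delta added at the given indices."""
    out = list(params)
    for i in indices:
        out[i] += delta
    return out


baseline = [0.0] * N_PARAMS
rng = random.Random(0)
random_indices = rng.sample(range(N_PARAMS), len(SENSITIVE))

targeted_loss = task_loss(perturb(baseline, SENSITIVE))
control_loss = task_loss(perturb(baseline, random_indices))
fraction_percent = 100 * len(SENSITIVE) / N_PARAMS

print(f"perturbed fraction: {fraction_percent:.3f}%")
print(f"targeted loss: {targeted_loss:.6f}")
print(f"random-control loss: {control_loss:.6f}")
```

In this toy setup the same-sized random group almost never lands on a sensitive index, so the control loss stays near zero while the targeted perturbation raises the loss sharply.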

The researchers discovered that this performance degradation was not limited to social reasoning. The models also became worse at tasks requiring contextual localization, which is the ability to understand where a piece of information is located within a longer text. This suggested a link between the model’s ability to reason about mental states and its more fundamental ability to track the position of words and concepts in a sequence. The findings pointed toward the model’s positional encoding system, the architectural component that gives it a sense of word order.

The investigation then turned to how these sensitive parameters interact with the model’s core architecture. Many modern language models use a technique called Rotary Position Embedding, or RoPE, to understand word order. This method encodes the position of a word by applying a rotation to its numerical representation, with different dimensions of the representation rotating at different frequencies.
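
A minimal sketch of the rotation described above, in plain Python. It follows the standard RoPE formulation, in which consecutive pairs of dimensions are rotated by an angle equal to the position times a per-pair frequency; the example vectors `q` and `k` and the base of 10000 are illustrative assumptions, not values taken from the study.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair of dimensions by position * frequency."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = base ** (-i / d)  # earlier pairs rotate faster, later pairs slower
        angle = position * theta
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 0.8]
k = [0.2, -0.7, 0.9, 0.1]

# Same relative offset (3 positions apart) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))
print(abs(score_near - score_far) < 1e-9)  # True
```

Because every pair is only rotated, the similarity between two rotated vectors depends on how far apart their positions are, not on where they sit in the sequence.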

The analysis showed that the identified ToM-sensitive parameters were not random; they were precisely aligned with what are known as dominant frequency activations. These are the specific frequencies that the model relies on most heavily to process positional information.

When the ToM-sensitive parameters were perturbed, these dominant frequency patterns were disrupted. This effectively damaged the model’s internal map of the text, explaining why its ability for contextual localization diminished. The effect was specific to models that use the RoPE system.

In a model from a different family, which uses an alternative method for positional encoding, the same kind of sparse, sensitive parameter pattern was not found. This architectural contrast confirmed that the social reasoning ability in RoPE-based models is tightly coupled with this particular mechanism for handling word order.

The final piece of the puzzle was to trace how this disruption in positional encoding affects the model’s attention mechanism. The attention mechanism is what allows a model to weigh the importance of different words in a text when making a prediction. Many models exhibit a phenomenon known as an “attention sink,” where a significant amount of attention is consistently directed toward the very first token in a sequence. This first token acts as a stable anchor, helping the model organize its processing of the rest of the text.
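
As a rough illustration of what an attention sink looks like numerically, the sketch below applies a softmax to a set of made-up raw attention scores in which the first token scores much higher than the rest. The scores are hypothetical and are not measurements from any model.

```python
import math


def softmax(scores):
    """Convert raw scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores of the current token against five earlier tokens.
# Index 0 is the first token in the sequence.
scores = [4.0, 0.5, 1.0, 0.2, 0.8]
weights = softmax(scores)

for position, weight in enumerate(weights):
    print(position, round(weight, 3))
```

With these numbers, well over half of the attention mass lands on position 0, which is the pattern the term "attention sink" describes.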

The researchers found that the ToM-sensitive parameters play a role in maintaining the geometric relationship between the vector for the current word being processed and the vector for the first, anchor token. Perturbing these parameters altered the angle between these two vectors, making them more orthogonal, or perpendicular.

This change destabilized the attention sink. As a result, the model’s attention, no longer properly anchored, began to scatter to irrelevant parts of the text, such as punctuation. This breakdown in the model’s focus directly impaired its ability to form a coherent understanding of the language, leading to the observed failures in both social reasoning and general comprehension.
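
The geometric argument can be sketched in two dimensions. In the toy example below, a query vector is rotated away from a hypothetical anchor key toward orthogonality, and the attention weight on the anchor is recomputed. All vectors and angles are invented for illustration and do not come from the study.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def angle_deg(a, b):
    """Angle between two vectors in degrees."""
    cos = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Hypothetical keys: index 0 is the first (anchor) token,
# the others stand in for low-content tokens such as punctuation.
keys = [[3.0, 0.0], [0.2, 0.5], [0.1, 0.6]]


def attention_for(query_angle_deg):
    a = math.radians(query_angle_deg)
    query = [2.0 * math.cos(a), 2.0 * math.sin(a)]
    return query, softmax([dot(query, key) for key in keys])


query_before, weights_before = attention_for(10)  # nearly aligned with anchor
query_after, weights_after = attention_for(85)    # nearly orthogonal to anchor

print(round(angle_deg(query_before, keys[0])), round(weights_before[0], 2))
print(round(angle_deg(query_after, keys[0])), round(weights_after[0], 2))
```

As the angle to the anchor approaches 90 degrees, the anchor's share of attention falls and the remaining weight spreads over the other tokens.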

While this work provides a mechanistic explanation for theory-of-mind-like abilities in some models, the researchers note certain limitations. The analysis was primarily focused on specific types of false-belief tasks, and future work could explore whether similar parameter patterns govern more nuanced social skills like detecting irony or social faux pas. The findings also suggest that what appears to be a sophisticated cognitive skill may emerge from more fundamental mechanisms related to language structure and context.

The identification of such a localized set of parameters opens up new directions for research. It could lead to more efficient ways to align model behavior with human values or ethical norms. At the same time, it highlights potential vulnerabilities; if social reasoning is concentrated in such a small area, it could be a target for adversarial attacks designed to manipulate a model’s behavior. Understanding these structural underpinnings is a step toward developing artificial intelligence systems that are more transparent, reliable, and better aligned with human social cognition.

The study, “How large language models encode theory-of-mind: a study on sparse parameter patterns,” was authored by Yuheng Wu, Wentao Guo, Zirui Liu, Heng Ji, Zhaozhuo Xu & Denghui Zhang.
