New research published in Nature Communications offers a detailed look at how the human brain supports natural conversation. By combining intracranial brain recordings with advanced language models, researchers found that speaking and listening engage widespread brain areas, especially in the frontal and temporal lobes. These brain signals not only corresponded to the words being spoken or heard, but also tracked the shifts between speaking and listening. The findings suggest that everyday conversation involves a tightly coordinated network that handles both the content of speech and the act of turn-taking.
Conversation is a dynamic, real-time activity that requires the brain to continuously switch between understanding and producing language. Most past research has studied these abilities in isolation using artificial tasks, such as reading lists of words or repeating scripted sentences. These simplified experiments offer valuable insights but fall short of capturing the back-and-forth, free-flowing nature of real conversation. To move beyond this limitation, the authors of the new study recorded brain activity from people having spontaneous conversations and then analyzed those signals using powerful natural language processing models.
“It’s fascinating to delve into the neural basis of natural conversation, especially now,” said study author Jing Cai, an instructor in the Neurosurgery Department at Massachusetts General Hospital.
“Studying the neural support for the potentially unlimited ways we produce and comprehend speech in natural conversation has long been a challenge. However, the recent advancements in natural language processing models have made it possible to directly investigate this neural activity. This feels like the right moment to leverage these powerful computational tools to unlock the neural secrets of how we communicate so fluidly.”
The researchers studied 14 people undergoing clinical treatment for epilepsy. As part of their medical care, these individuals had electrodes implanted in their brains to monitor seizures. With consent, researchers took advantage of this rare opportunity to record brain activity during natural conversation. Participants engaged in unscripted dialogues with an experimenter, talking about everyday topics like movies or personal experiences. These conversations lasted up to 90 minutes and included more than 86,000 words across all participants.
To analyze how the brain encoded these conversations, the researchers used GPT-2, a pre-trained natural language processing (NLP) model. NLP is the branch of artificial intelligence focused on enabling computers to understand and process human language. GPT-2 transforms each word into a high-dimensional vector based on its context within a sentence. These word embeddings capture complex features of language structure and meaning without relying on explicit linguistic rules. By comparing the embeddings to the brain activity recorded during speech production and comprehension, the team could assess which areas of the brain were tracking language in real time.
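The paper itself does not include code, but the general logic of this kind of analysis can be sketched in a few lines of Python. The snippet below is a toy illustration, not the authors' pipeline: it pulls contextual embeddings for each token from GPT-2 via the Hugging Face transformers library and fits a simple ridge encoding model to placeholder "neural" data. The example sentence, the random data, and the five-electrode setup are all invented for the demonstration.

```python
# Toy sketch of the embedding-vs-brain approach (not the authors' actual pipeline).
# The "neural" responses here are random placeholders; real analyses align
# embeddings to electrode recordings word by word in time.
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import RidgeCV

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

sentence = "We watched that movie last weekend and loved it"
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    out = model(**enc)                            # last_hidden_state: (1, n_tokens, 768)
embeddings = out.last_hidden_state[0].numpy()     # one contextual vector per token

# Placeholder "neural" data: one value per token at each of five electrode sites.
rng = np.random.default_rng(0)
n_tokens, n_sites = embeddings.shape[0], 5
neural = rng.standard_normal((n_tokens, n_sites))

# Encoding model: predict each site's activity from the word embeddings.
# A real analysis would score the fit on held-out words, not in-sample.
for site in range(n_sites):
    fit = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(embeddings, neural[:, site])
    r = np.corrcoef(fit.predict(embeddings), neural[:, site])[0, 1]
    print(f"site {site}: in-sample r = {r:.2f}")
```

Sites where the embedding-based predictions track the recorded activity better than chance are the ones counted as encoding linguistic information.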
The results showed that both speaking and listening activated a widespread network of brain regions. Activity was especially prominent in the frontal and temporal lobes, including areas classically associated with language processing. The neural signals were not just a general response to speech but aligned closely with the specific sequence and context of the words being used. This was true regardless of whether a person was speaking or listening.
“One particularly striking aspect of our results was the alignment we observed between the patterns of activity in the human brain and the representations learned by the deep learning NLP models,” Cai told PsyPost. “The extent to which these artificial systems captured nuances of language processing that were reflected in neural activity during live conversation was quite surprising. This opens up exciting possibilities for future research to leverage these artificial systems as tools to further decode the brain’s intrinsic dynamics during communication.”
To confirm that the brain signals reflected meaningful language processing—and not just sound or motor activity—the researchers ran two control conditions. In one, participants listened to and repeated scripted sentences. In another, they spoke and heard pseudowords that mimicked English in rhythm and sound but had no real meaning. In both cases, the correspondence between brain activity and language model embeddings dropped sharply. This indicated that the observed neural patterns were specific to real, meaningful communication.
The study also explored how the brain handles the transitions between speaking and listening—an essential part of any conversation. Using precise timing data, the researchers identified when participants switched roles. They found distinct patterns of brain activity during these transitions. Some areas increased in activity before the person started speaking, while others changed when they began listening. Interestingly, many of these same areas also tracked the specific language content of the conversation, suggesting that the brain integrates information about both what is said and who is saying it.
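The study's exact transition-detection procedure is not reproduced here, but the basic idea can be illustrated with a toy, timestamped transcript. Everything in the sketch below, including the Word structure, the speaker labels, and the timings, is hypothetical; it simply shows how speaker changes between consecutive words can be turned into labeled transition times to which neural activity could then be aligned.

```python
# Illustrative only: locating turn transitions in a word-level, timestamped
# transcript. The format and times are invented; the study used precise
# audio-aligned timing from real conversations.
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    onset_s: float     # word onset time in seconds
    speaker: str       # "participant" or "experimenter"

transcript = [
    Word("did", 0.0, "experimenter"), Word("you", 0.3, "experimenter"),
    Word("see", 0.5, "experimenter"), Word("it", 0.8, "experimenter"),
    Word("yes", 1.6, "participant"), Word("last", 1.9, "participant"),
    Word("week", 2.2, "participant"),
    Word("nice", 3.4, "experimenter"),
]

# A transition is any point where the speaker label changes between consecutive
# words; label it by which role the participant is moving into.
transitions = []
for prev, cur in zip(transcript, transcript[1:]):
    if prev.speaker != cur.speaker:
        kind = ("listening-to-speaking" if cur.speaker == "participant"
                else "speaking-to-listening")
        transitions.append((cur.onset_s, kind))

for t, kind in transitions:
    print(f"{t:4.1f} s  {kind}")
# Neural activity around these timestamps can then be compared across the two
# transition types and across frequency bands.
```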
Across all participants, 13% of electrode sites showed significant changes in brain activity during transitions from listening to speaking, and 12% during the opposite shift. These patterns varied across frequency bands and brain regions, and the differences were more pronounced at lower frequencies during the shift into listening. These signals overlapped with those involved in processing word meaning, suggesting that the brain uses shared circuits to manage both content and conversational flow.
The researchers also looked at how different types of brain activity correlated with different layers of the language model. Lower layers of the model represent individual words, while higher layers capture more complex, sentence-level meaning. The researchers found that brain activity during conversation aligned most strongly with higher layers of the model. This suggests that the brain is not simply reacting to individual words, but is also tracking the broader structure and meaning of what’s being said.
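As a rough illustration of what a layer-by-layer comparison involves, the sketch below asks GPT-2 for the hidden states of every layer and fits a separate encoding model per layer against placeholder data. The model choice, example sentence, and random "neural" values are assumptions made for the example; the study's actual analysis used real intracranial recordings and proper held-out evaluation.

```python
# Sketch of a layer-by-layer comparison (assumed workflow, not the authors' code):
# request hidden states from every GPT-2 layer, then fit one encoding model per
# layer. The "neural" data here are random placeholders.
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import RidgeCV

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

enc = tokenizer("That ending completely changed how I saw the whole film",
                return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).hidden_states      # tuple: embedding layer + 12 blocks

rng = np.random.default_rng(1)
n_tokens = hidden[0].shape[1]
neural = rng.standard_normal(n_tokens)       # placeholder: one value per token

# Fit one encoding model per layer; in the study, alignment with real recordings
# was strongest for the higher, more contextual layers.
for layer_idx, layer in enumerate(hidden):
    X = layer[0].numpy()                     # (n_tokens, 768) for this layer
    fit = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X, neural)
    r = np.corrcoef(fit.predict(X), neural)[0, 1]
    print(f"layer {layer_idx:2d}: in-sample r = {r:.2f}")
```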
These findings held up across various models and participants. Whether the researchers used GPT-2, BERT, or other models with different sizes and training methods, they consistently found that brain activity reflected linguistic information. The percentage of neural sites showing correlations also rose with model complexity, strengthening the case that these models capture meaningful features of human language processing.
“Our findings showed the incredible complexity of something we do effortlessly every day: having a conversation,” Cai explained. “It reveals that when we speak and listen, vast and interconnected areas of our brain are actively involved in processing not just the words themselves, but also their specific meaning within the flow of the conversation and the role of who is speaking and who is listening. This research shows that even seemingly simple back-and-forth exchanges engage a dynamic and sophisticated neural orchestration, demonstrating the remarkable power of the human brain in enabling us to connect and communicate through language.”
But the study did have some limitations. The participants were patients with epilepsy, and electrode placement varied based on their clinical needs. This could affect how generalizable the findings are to the broader population. In addition, the models used were based on written text, not spoken language, meaning that prosody and tone were not captured. The researchers argue that this is just the beginning. Future work could explore how acoustic features influence neural responses, or even attempt to decode the meaning of words and concepts directly from brain activity.
“Our work primarily serves as a demonstration of these differences rather than a deep dive into their fundamental mechanisms,” Cai said. “We need future investigations to identify the specific linguistic and cognitive elements. Further, we are relying on text-based NLP models, which means that we haven’t fully captured the richness of spoken language, as acoustic cues were not integrated into our analysis.”
“The next step involves semantic decoding. This means moving beyond simply identifying which brain regions are active during conversation and decoding the meaning of the words and concepts being processed. Ultimately, the combination of studies to reveal neural mechanism and decoding results could provide profound insights into the neural representation of language.”
“This is truly an exciting moment for neuroscience research in language,” Cai added. “The combination of intracranial recording techniques and the rapid advancements in artificial intelligence modeling offers remarkable opportunities to unravel the brain’s mechanisms for communication, and to develop useful tools to restore communicative abilities for those with impaired speech.”
The study, “Natural language processing models reveal neural dynamics of human conversation,” was authored by Jing Cai, Alex E. Hadjinicolaou, Angelique C. Paulk, Daniel J. Soper, Tian Xia, Alexander F. Wang, John D. Rolston, R. Mark Richardson, Ziv M. Williams, and Sydney S. Cash.