A team of scientists was recently able to reconstruct a Pink Floyd song from direct human neural recordings using predictive modeling. This achievement demonstrates the potential of predictive modeling techniques to decode complex brain activity and translate it into meaningful information. Their findings, published in PLOS Biology, also provide valuable insights into how the human brain processes music.
For individuals with conditions like ALS, stroke, or paralysis, communication can be challenging. Current brain-machine interfaces for communication often produce robotic-sounding speech. The researchers wanted to explore whether music-related brain activity could be used to enhance the naturalness and expressiveness of speech generated by brain-computer interfaces, helping those with communication disabilities.
“Music is a core part of human experience, therefore understanding how our brains support music perception is of fundamental interest,” explained study author Ludovic Bellier, a senior computational research scientist at Inscopix. “On the applied side, this data was collected as part of a broad effort to develop speech decoding brain-computer interfaces (BCI), with the idea that by understanding the neural code for music processing we could then include prosodic elements (the melody and rhythm of speech) in the speech output of these BCIs and have them sound less robotic and more natural, with intonation and emotions.”
The study involved 29 patients with pharmacoresistant epilepsy who had intracranial electroencephalography (iEEG) electrodes implanted in their brains for medical purposes. These electrodes were used to directly record the electrical activity of the patients’ brain cells.
These patients listened to Pink Floyd’s song “Another Brick in the Wall, Part 1.” While they listened to the music, the iEEG electrodes recorded the neural activity in their brains in real time. The researchers used predictive models to estimate what the Pink Floyd song would sound like based on the patterns of neural activity recorded by the electrodes. This approach involves building mathematical models that can predict or estimate one thing based on another.
The predictive models were trained to associate specific patterns of neural activity with corresponding parts of the song. Essentially, the models learned the relationship between brain activity and the music. Once the models were trained, they were applied to the recorded neural data to generate predictions of what the music should sound like.
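The training-and-prediction loop described above can be sketched in a few lines. The study used nonlinear decoding models on real iEEG recordings; the sketch below substitutes a simple ridge regression and fully synthetic data (all shapes, names, and the linear model are illustrative assumptions, not the authors' pipeline), but it shows the same idea: fit a mapping from neural activity to the song's spectrogram on part of the recording, then predict the spectrogram for held-out segments.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic stand-ins: 'neural' mimics activity from 61 electrodes,
# 'spectrogram' mimics the song's auditory spectrogram (32 frequency
# bins), both sampled over the ~3-minute song. Shapes are hypothetical.
n_samples, n_electrodes, n_freq_bins = 18_000, 61, 32
true_weights = rng.normal(size=(n_electrodes, n_freq_bins))
neural = rng.normal(size=(n_samples, n_electrodes))
spectrogram = neural @ true_weights + 0.1 * rng.normal(size=(n_samples, n_freq_bins))

# Train on the first ~80% of the song; evaluate on the held-out rest.
split = int(0.8 * n_samples)
model = Ridge(alpha=1.0)
model.fit(neural[:split], spectrogram[:split])
predicted = model.predict(neural[split:])

# Decoding accuracy: correlation between predicted and actual
# spectrogram, averaged across frequency bins.
corrs = [np.corrcoef(predicted[:, f], spectrogram[split:, f])[0, 1]
         for f in range(n_freq_bins)]
mean_corr = float(np.mean(corrs))
print(f"mean reconstruction correlation: {mean_corr:.2f}")
```

In a real pipeline, the predicted spectrogram would then be inverted back into a waveform, which is how an audible (if degraded) version of the song can be recovered from brain activity alone.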
By using these predictive models, the researchers were able to reconstruct the Pink Floyd song based on the neural recordings. In other words, they translated the electrical signals from the patients’ brains back into audible music that closely resembled the original song.
“We could reconstruct a recognizable song from the neural activity elicited by listening to the song,” Bellier told PsyPost. “It shows the feasibility of music decoding with relatively few data (3 minutes, dozens of electrodes), which paves the way towards including prosody into speech decoding devices. Also, we delineated further the neural dynamics of music perception, showing a bilateral processing although with a right hemisphere dominance, and evidencing a new cortical subregion tuned to musical rhythm.”
Music perception was found to rely on both hemispheres of the brain, but there was a preference for the right hemisphere. The right hemisphere had a higher proportion of electrodes with significant effects, higher prediction accuracy, and a greater impact on decoding models when electrodes were ablated. This confirmed prior research suggesting a relative right lateralization in music perception.
The researchers identified the superior temporal gyrus (STG) as a primary brain region involved in music perception. They were able to successfully reconstruct a recognizable song using data from a single patient who had 61 electrodes specifically located in this region.
In addition, the researchers classified the electrodes into different categories based on their functional properties. Sustained electrodes likely record neural signals related to continuous or prolonged aspects of music, while right rhythmic electrodes are associated with rhythmic aspects of the music, possibly capturing timing and beat-related information.
Selectively removing 167 sustained electrodes from the decoding model did not significantly affect the accuracy of song reconstruction. On the other hand, removing 43 right rhythmic electrodes had a notable negative effect on the accuracy of song reconstruction. This indicates that the neural signals associated with rhythmic aspects of the music, as captured by these electrodes, played a crucial role in the accurate reconstruction of the song.
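This kind of ablation analysis, removing a group of electrodes and measuring the resulting drop in decoding accuracy, can be illustrated with a small synthetic example. Everything here (the data, the electrode groupings, the ridge decoder) is an assumption for illustration, not the study's actual analysis; the point is only the logic: if accuracy survives the removal of one large group but collapses when a small informative group is removed, the small group carries the critical signal.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Synthetic setup: 100 'electrodes', of which only a small 'rhythmic'
# subset actually carries information about the decoding target.
n_samples, n_electrodes = 5_000, 100
rhythmic = np.arange(10)        # small, informative group
sustained = np.arange(10, 60)   # large, uninformative group
neural = rng.normal(size=(n_samples, n_electrodes))
target = neural[:, rhythmic].sum(axis=1) + 0.1 * rng.normal(size=n_samples)

def decoding_accuracy(keep):
    """Fit and score a decoder using only the electrodes in `keep`."""
    split = int(0.8 * n_samples)
    model = Ridge().fit(neural[:split][:, keep], target[:split])
    pred = model.predict(neural[split:][:, keep])
    return float(np.corrcoef(pred, target[split:])[0, 1])

all_idx = np.arange(n_electrodes)
baseline = decoding_accuracy(all_idx)
no_sustained = decoding_accuracy(np.setdiff1d(all_idx, sustained))
no_rhythmic = decoding_accuracy(np.setdiff1d(all_idx, rhythmic))
print(f"baseline: {baseline:.2f}")
print(f"without 'sustained' electrodes: {no_sustained:.2f}")
print(f"without 'rhythmic' electrodes: {no_rhythmic:.2f}")
```

Here accuracy barely moves when the 50 uninformative electrodes are dropped, but falls sharply without the 10 informative ones, mirroring the quality-over-quantity finding described above.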
“We were surprised by the fact that removing all 167 ‘sustained’ electrodes from the decoding models did not impact decoding accuracy (while removing ‘just’ 43 right rhythmic electrodes did),” Bellier explained. “It really shows that it’s not how many electrodes per se that drives model performance, but rather the auditory information represented in the recorded neural activity, as well as where these electrodes are located; quality over quantity, if you will.”
The research could contribute to the development of advanced brain-machine interfaces that allow individuals to communicate more effectively by incorporating musical elements into speech output.
“It’s a wonderful result,” said Robert Knight, a UC Berkeley professor of psychology in the Helen Wills Neuroscience Institute and co-author of the study. “One of the things for me about music is it has prosody and emotional content. As this whole field of brain machine interfaces progresses, this gives you a way to add musicality to future brain implants for people who need it, someone who’s got ALS or some other disabling neurological or developmental disorder compromising speech output.”
“It gives you an ability to decode not only the linguistic content, but some of the prosodic content of speech, some of the affect. I think that’s what we’ve really begun to crack the code on.”
Regarding the study’s caveats, Bellier said that “with hindsight, we would have loved to collect information on how familiar patients were with this Pink Floyd song, and more importantly, what was their musical background. With that information, we could have tackled exciting questions such as whether musicians’ neural activity leads to higher decoding accuracy when reconstructing the song.”
“We would like to express gratitude to the patients who volunteered in our study, especially as they were going through a trying neurosurgical procedure,” Bellier added. “Without them this research could not have been done, and it’s not every day that they can hear back about the research they participated in, so this would make for a nice gesture.”
The study, “Music can be reconstructed from human auditory cortex activity using nonlinear decoding models”, was authored by Ludovic Bellier, Anaïs Llorens, Déborah Marciano, Aysegul Gunduz, Gerwin Schalk, Peter Brunner, and Robert T. Knight.