In a new study published in Frontiers in Robotics and AI, researchers have demonstrated that robots able to express emotions in real time during interactions with humans are perceived as more likable, trustworthy, and human-like. Using a large language model to generate the robot’s emotional expressions, the study found that when the robot displayed emotions that matched the context of the interaction, participants rated their experience more positively and performed better in a collaborative task.
The motivation behind this innovative research stems from the growing integration of social robots into everyday human environments. As robots become more prevalent in settings ranging from homes to healthcare facilities, the need for them to understand and express human emotions has become increasingly important. Recognizing facial expressions and responding with appropriate emotional cues is crucial for building rapport, trust, and ease of communication between humans and robots.
Prior studies have shown that robots capable of exhibiting emotions are more likely to be accepted and liked by users. However, developing robots that can accurately model and express emotions in real-time interactions remains a complex challenge, prompting researchers to explore the potential of Large Language Models (LLMs) like GPT-3.5 for emotion generation in human-robot interactions.
“With the recent advances in LLMs, there is a significant focus on building the next generation of general-purpose robots. Many companies have already come forward with their prototypes and envision a large demand for such robots in the society,” explained study author Chinmaya Mishra, a postdoctoral researcher in the Multimodal Language Department at the Max Planck Institute for Psycholinguistics.
“With robots poised to have a greater presence in our society, it becomes increasingly necessary for them to display affective behavior. A robot exhibiting appropriate emotions is not only easier to understand, but it also affects the overall interaction experience by facilitating effective communication and a stronger rapport with humans.”
“Modelling affective behavior on robots is a difficult problem as it entails the robot being able to perceive human behavior, understand the message being conveyed, formulate an appropriate response, and express the emotion associated with it. Additionally, it is challenging to do so in real-time, which is crucial for a seamless human-robot interaction (HRI).”
“My interest for this topic was twofold: 1.) I wanted to leverage the power of LLMs and verify if it is feasible to be used for this type of problem and 2.) move away from platform dependent and computationally heavy models to a cloud-based architecture which can be used on any social robot platform out there,” Mishra said.
The study involved 47 participants who engaged in a unique affective image sorting game with a robot, designed to test the robot’s emotional expressiveness. The robot used for this study was a Furhat robot, known for its human-like head and facial expressions, capable of displaying a wide range of emotions through back-projected facial animations.
In the affective image sorting game, participants were presented with a series of affective images on a touchscreen, which they were tasked with sorting based on the emotions these images evoked, from least to most positive. The images, selected from established psychological datasets and the internet, were designed to elicit a wide range of emotional responses.
The robot, powered by GPT-3.5, interacted with participants, providing feedback and expressing emotions through facial expressions tailored to the ongoing dialogue. Each participant played the game under three conditions: in the congruent condition, the robot’s facial expressions matched the emotions predicted from the ongoing dialogue; in the incongruent condition, the expressions were deliberately opposite to the expected emotions; and in the neutral condition, the robot displayed no emotional expressions.
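To make the setup concrete, the sketch below illustrates how such an architecture could be wired together: the ongoing dialogue is sent to the LLM, which returns one of a small set of basic emotions, and that label is then mapped to a facial expression according to the experimental condition. This is a minimal illustration rather than the study’s actual code; the prompt, emotion set, opposite-emotion mapping, and the send_expression_to_robot() helper are assumptions introduced here for clarity.

# Hypothetical sketch of the emotion-appraisal loop described above.
# Assumes the OpenAI Python client (v1+) and a placeholder robot-side call;
# the prompts, emotion set, and wiring in the actual study may differ.
from openai import OpenAI

EMOTIONS = ["happy", "sad", "angry", "surprised", "fearful", "disgusted", "neutral"]
# Crude valence flip, used only to illustrate the incongruent condition.
OPPOSITE = {"happy": "sad", "surprised": "sad", "neutral": "neutral",
            "sad": "happy", "angry": "happy", "fearful": "happy", "disgusted": "happy"}

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def appraise_emotion(dialogue_history: list[str]) -> str:
    """Ask the LLM which basic emotion fits the ongoing dialogue."""
    prompt = (
        "Given the conversation below, answer with exactly one word naming the "
        f"emotion the robot should express, chosen from: {', '.join(EMOTIONS)}.\n\n"
        + "\n".join(dialogue_history)
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    label = response.choices[0].message.content.strip().lower()
    return label if label in EMOTIONS else "neutral"  # fall back on unexpected output


def select_expression(emotion: str, condition: str) -> str:
    """Map the predicted emotion to an expression per experimental condition."""
    if condition == "congruent":
        return emotion
    if condition == "incongruent":
        return OPPOSITE[emotion]
    return "neutral"  # neutral condition: no emotional expression


def send_expression_to_robot(expression: str) -> None:
    """Placeholder for the robot-side call (e.g. a Furhat gesture request)."""
    print(f"[robot] displaying facial expression: {expression}")


# Example turn: the participant reacts to an image, the robot appraises and responds.
history = ["Participant: This picture of an abandoned puppy makes me really sad."]
emotion = appraise_emotion(history)
send_expression_to_robot(select_expression(emotion, condition="congruent"))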
To assess the effectiveness of the robot’s emotional expressions, participants completed a questionnaire after interacting with the robot in each condition. Additionally, the sorting task scores provided objective data on the participants’ performance.

Mishra and his colleagues found that participants rated their experience with the robot more positively when it exhibited emotions congruent with the ongoing dialogue, as opposed to when the robot’s expressions were incongruent or when it displayed no emotional expressions at all.
Specifically, participants found interactions in the congruent condition to be more positive, more emotionally appropriate, and indicative of a robot whose behavior was more human-like. This suggests that the alignment of a robot’s non-verbal cues with the emotional context of an interaction plays a crucial role in how humans perceive and engage with robots.
Interestingly, the researchers also found that this emotional congruency not only improved participants’ perceptions of the robot but also positively impacted their performance on the task at hand. Participants achieved higher scores in the sorting game when interacting with the robot under the congruent condition, highlighting the practical benefits of emotionally expressive robots in collaborative tasks.
“It is possible to leverage LLMs to reliably appraise the context of a conversation and thereby decide on an appropriate emotion that robots should express during an interaction,” Mishra told PsyPost. “Emotional expressions by robots are perceived as intentional, and appropriate emotions have a positive influence on the experience and outcome of the interactions we have with robots. The real-time generation of these behaviors on robots makes it easier for us to understand and talk to them as they use these emotions to signal their internal state and intentions.”
“However, it is important to keep in mind that the robot’s understanding of a situation and the decision process in expressing the appropriate emotions are dependent on how a developer/researcher builds the architecture. To emulate realistic behaviors on robots, we break down complex human behaviors into simplified bits. These simplified bits (one or a few of them) are then used to model a robot’s behavior. While they may look and feel appropriate, we are still ways away from being actually able to model robots with capabilities that are similar to humans.”
The study also explored the ways in which participants interpreted the robot’s emotional expressions, particularly in the incongruent condition. Some participants attributed complex emotional states to the robot, indicating a tendency to anthropomorphize robotic behavior and read deeper into the robot’s expressions. This finding suggests that humans are adept at seeking emotional coherence in interactions, even attributing human-like emotional complexity to robots based on their expressions.
“It was surprising to see participants attribute complex emotions to the robot’s behavior and relate to it,” Mishra said.
“For example, in one case in which the robot was instructed to display contradictory behavior, the robot smiled when describing a sad situation. The participant informed me that they thought the robot was perhaps feeling so sad that it was masking it by putting on a smile. They said that this is what they would do as well. In another case, the participant interpreted a robot’s smile as sarcasm.”
“This goes on to show how powerful emotion expression on a robot can be,” Mishra told PsyPost. “Even though the people know that they are talking to a robot, they still relate to it as if it were real. Moreover, it also shows us how wired our brains are to interpret emotions during interactions.”
Despite the promising results, the study had several limitations. The researchers noted technical issues such as delays in the robot’s responses caused by API call lag, as well as GPT-3.5’s inability to take a longer conversational history into account when predicting emotions. Furthermore, the study’s design limited the range of emotions to basic categories, potentially overlooking the nuances of human emotional expression.
“A key limitation would be the usage of text-only modality in the current study,” Mishra explained. “Human emotions are multi-modal, involving the display and interpretation of many behaviors such as facial expressions, speech, gestures, posture, and context. I believe that this would be overcome in the coming days with the introduction and advances in Multi-modal LLMs.”
“Another caveat would be the dependency on LLM API providers such as OpenAI. There is a serious lack of publicly accessible LLM APIs that are comparable to what is commercially available. This restricts the usage and research on this topic to only groups/individuals who can afford the price.”
Future research could explore more sophisticated models capable of incorporating a wider range of emotions and multimodal inputs, including facial expressions and body language, to create even more nuanced and effective emotional interactions between humans and robots.
“In the long-term, I want to improve the models of affective behavior for robots by making them more multi-modal,” Mishra said. “This would make them more human-like and appropriate during HRI.”
The study, “Real-time emotion generation in human-robot dialogue using large language models,” was authored by Chinmaya Mishra, Rinus Verdonschot, Peter Hagoort, and Gabriel Skantze.