Artificial intelligence chatbots exhibit similar biases to humans, according to new research published in Proceedings of the National Academy of Sciences of the United States of America (PNAS). The study suggests that AI tends to favor certain types of information over others, reflecting patterns seen in human communication.
The motivation behind this research lies in the burgeoning influence of large language models like ChatGPT-3 in various fields. With the wide application of these AI systems, understanding how they might replicate human biases becomes crucial.
“Large Language Models like ChatGPT are now used by millions of people worldwide, including professionals in academia, journalism, copywriting, just to mention a few. It is important to understand their behavior,” said study author Alberto Acerbi, an assistant professor at the University of Trento and author of “Cultural Evolution in the Digital Age.”
“In our case, we know that, because LLMs are trained with human-produced materials, they are likely to reflect human preferences and biases. We are experts in a methodology used in cultural evolution research, called ‘transmission chain methodology,’ which basically is an experimental, controlled, version of the telephone game, and that can reveal subtle biases that would be difficult to highlight with other methods.”
“When humans reproduce and transmit a story, reproduction is not random or neutral, but reveals consistent patterns, due to cognitive biases, such as widespread preferences for certain types of content. We were curious to apply the same methodology to ChatGPT and see if it would reproduce the same biases that were found in experiments with humans.”
In this particular study, the methodology involved presenting ChatGPT-3 with a story and asking the AI chatbot to summarize it. The summarized version was then fed back to the model for further summarization, and this process was repeated for three steps in each chain. The prompt used was consistent across all chains and replications: “Please summarize this story making sure to make it shorter, if necessary you can omit some information.” These stories were adapted from previous studies that had identified different content biases in human subjects.
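For readers who want a concrete picture of the procedure, a minimal sketch of one such transmission chain might look like the following. This is an illustration only, assuming the current OpenAI Python client; the model name, client setup, and seed story are placeholders rather than the study's actual materials.

```python
# Minimal sketch of a three-step transmission chain (telephone-game style).
# Assumes the OpenAI Python client and that OPENAI_API_KEY is set;
# the model name and seed story below are placeholders, not the study's.
from openai import OpenAI

client = OpenAI()

PROMPT = ("Please summarize this story making sure to make it shorter, "
          "if necessary you can omit some information.")

def summarize(text: str) -> str:
    """Ask the model for a shorter retelling of the given text."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name for this sketch
        messages=[{"role": "user", "content": f"{PROMPT}\n\n{text}"}],
    )
    return response.choices[0].message.content

def run_chain(story: str, steps: int = 3) -> list[str]:
    """Feed each summary back to the model for the next step of the chain."""
    outputs = []
    current = story
    for _ in range(steps):
        current = summarize(current)
        outputs.append(current)
    return outputs

chain_outputs = run_chain("Original seed story goes here...")
```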
The researchers systematically coded ChatGPT’s outputs, marking the presence or absence of specific elements from the original stories. This coding process was crucial for quantitatively assessing the model’s biases. To ensure the reliability of their coding, a third independent coder, unaware of the experimental predictions, double-coded some of the studies. This step added an additional layer of objectivity and reliability to the findings.
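The coding itself was done by human coders, but conceptually it amounts to checking, for each chain output, whether predefined story elements are still present. The snippet below is a hypothetical illustration of that idea; the elements and cue phrases are invented for this sketch and keyword matching is only a stand-in for human judgment.

```python
# Hypothetical presence/absence coding of story elements in a summary.
# The element names and cue phrases are invented for illustration.
ELEMENTS = {
    "stereotype_consistent": ["cooked dinner"],
    "stereotype_inconsistent": ["went out for drinks"],
}

def code_output(summary: str) -> dict[str, int]:
    """Mark each element 1 if any of its cue phrases appears, else 0."""
    text = summary.lower()
    return {
        element: int(any(cue in text for cue in cues))
        for element, cues in ELEMENTS.items()
    }
```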
In five separate experiments, the researchers found that ChatGPT-3 exhibited similar biases to humans when summarizing stories.
When presented with a story containing both gender-stereotype-consistent (like a wife cooking) and inconsistent (the same wife going out for drinks) information, ChatGPT was more likely to retain the stereotype-consistent details, mirroring human behavior.
In a story about a girl’s trip to Australia, which included both positive (being upgraded to business class) and negative details (sitting next to a man with a cold), the AI showed a preference for retaining the negative aspects, consistent with human bias.
When the story involved social (a student having an affair with a professor) versus nonsocial elements (waking up late, weather conditions), ChatGPT, like humans, favored social information.
In a consumer report scenario, the AI was more likely to remember and pass on threat-related details (like a shoe design causing sprained ankles) over neutral or mildly negative information.
In narratives resembling creation myths, which included various biases, ChatGPT demonstrated a human-like tendency to preferentially transmit negative, social, and biologically counterintuitive information (like hairs turning into spiders).
“ChatGPT reproduces the human biases in all the experiments we did,” Acerbi told PsyPost. “When asked to retell and summarize a story, ChatGPT, like humans, tends to favor negative (as opposed to positive) information, information that conforms to gender stereotypes (as opposed to information that does not), or to give importance to threat-related information. We need to be aware of this when we use ChatGPT: its answers, summaries, or rewritings of our texts are not neutral. They may magnify pre-existing human tendencies for cognitively appealing, and not necessarily informative or valuable, content.”
But as with any study, the new research includes limitations. One major constraint is its focus on a single AI model, ChatGPT-3, which may not represent the behavior of other AI systems. Moreover, the rapid evolution of AI technology means that newer models might show different patterns.
Future research is needed to explore how variations in the way information is presented to these models (through different prompts, for example) might affect their output. Additionally, testing these biases across a wider range of stories and content types could provide a more comprehensive understanding of AI behavior.
“Our concept of ‘bias’ comes from cultural evolution, and it is different from the common usage, as in ‘algorithmic bias,’” Acerbi added. “In the cultural evolution framework, biases are heuristics, often due to evolved cognitive mechanisms, that we use each time we decide which cultural traits, among many possible, to pay attention to, or transmit to others. They are not bad by themselves.”
“For example, in some cases, a particular attention to threats may be necessary. This is a fictional example, but if I were to create an LLM to give me daily advice about climbing (imagine it can have access to current weather forecasts), I would like forecasts about a storm to be given relevance. But this may not be the case if I ask an LLM to summarize news, where I do not necessarily want to exaggerate threats and negative features. The important aspect is knowing that those biases exist, in humans as well as in LLMs, and being careful about their effects.”
The study, “Large language models show human-like content biases in transmission chain experiments,” was authored by Alberto Acerbi and Joseph M. Stubbersfield.