PsyPost

Trump’s speeches stump AI: Study reveals ChatGPT’s struggle with metaphors

by Eric W. Dolan
July 15, 2025
[Photo by Gage Skidmore]


President Donald Trump’s political speeches recently served as a testing ground for the capabilities and limitations of large language models. By analyzing the metaphors embedded in four major speeches, researchers not only gained insight into Trump’s rhetorical strategies but also exposed key weaknesses in artificial intelligence systems like ChatGPT when it comes to understanding figurative language in political contexts. Their findings are published in Frontiers in Psychology.

Large language models, or LLMs, are computer programs trained to understand and generate human language. They work by analyzing vast amounts of text—such as books, websites, and conversations—and learning statistical patterns in how words and sentences are used. LLMs like ChatGPT can write essays, summarize documents, answer questions, and even hold conversations that feel natural.

However, they do not truly understand language the way humans do. Instead, they rely on pattern recognition to predict what words are likely to come next in a sentence. This can lead to convincing results in many situations, but it also means the models can misinterpret meaning, especially when language is abstract or emotionally charged.

To test how well a large language model can detect metaphors in political speech, the researchers selected four of Donald Trump’s speeches from mid-2024 to early 2025. These included his Republican nomination acceptance speech after surviving an assassination attempt, his post-election victory remarks, his inaugural address, and his speech to Congress. These texts, totaling over 28,000 words, were chosen because they are filled with emotionally charged and ideologically driven language, often using metaphor to frame political issues in ways that resonate with supporters.

The researchers used a method called critical metaphor analysis to examine the text. This method focuses on how metaphors influence political thinking and shape public attitudes. They then adapted this method for use with ChatGPT-4, prompting the model to go through a step-by-step process: understand the context of the speech, identify potential metaphors, categorize them by theme, and explain their likely emotional or ideological impact.
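The four-step process described above can be sketched as a prompt template. This is only an illustration: the step names follow the article, but the wording of each instruction and the function name `build_metaphor_prompt` are assumptions, not the prompt the researchers actually used, and no model is called.

```python
# Hypothetical sketch: step names come from the article; the instruction
# wording and the function name are assumptions, not the study's prompt.
STEPS = [
    ("Context", "Summarize the context of the speech: speaker, occasion, and audience."),
    ("Identification", "List expressions in the text that may be metaphorical."),
    ("Categorization", "Assign each candidate metaphor to a theme (source domain)."),
    ("Impact", "Explain the likely emotional or ideological impact of each metaphor."),
]


def build_metaphor_prompt(speech_text: str) -> str:
    """Assemble one step-by-step prompt string for a chat model."""
    lines = ["Analyze the speech below by working through these steps in order:"]
    for number, (name, instruction) in enumerate(STEPS, start=1):
        lines.append(f"{number}. {name}: {instruction}")
    lines.append("")
    lines.append("Speech:")
    lines.append(speech_text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_metaphor_prompt("We rise together."))
```

Keeping the steps in a list makes it easy to reword one instruction at a time, which matters because the study found that output depends heavily on prompt wording.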

The large language model was able to detect metaphors with moderate success. Out of 138 sampled sentences, it correctly identified 119 metaphorical expressions, giving it an accuracy rate of around 86 percent. But a closer look revealed several recurring problems in the model’s reasoning. These issues provide insight into the limitations of artificial intelligence when it tries to interpret complex human communication.
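The reported rate follows directly from the two counts given above. A minimal check, using only the figures in the article (the function name `accuracy` is an illustrative choice):

```python
def accuracy(correct: int, total: int) -> float:
    """Share of sampled sentences that were identified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Counts reported in the article: 119 correct out of 138 sampled sentences.
rate = accuracy(119, 138)
print(f"{rate:.1%}")  # 86.2%
```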

One of the most common mistakes was confusing metaphors with other forms of expression, such as similes. For example, the model misinterpreted the phrase “Washington D.C., which is a horrible killing field” as metaphorical when it is more accurately described as a literal, emotionally charged comparison. The model also tended to overanalyze simple expressions.

In one case, it flagged the phrase “a series of bold promises” as metaphorical, interpreting it as a spatial metaphor when no such figurative meaning was intended. The model also struggled to correctly classify names and technical terms. For instance, it treated “Iron Dome,” the name of Israel’s missile defense system, as a metaphor instead of a proper noun.

These missteps show that while LLMs can detect surface-level patterns, they often lack the ability to understand meaning in context. Unlike humans, they do not draw on lived experience, cultural knowledge, or emotional nuance to make sense of language. This becomes especially apparent when analyzing political rhetoric, where metaphor is often used to tap into shared feelings, histories, and identities.

The study also tested the model’s ability to categorize metaphors based on shared themes or “source domains.” These categories include concepts like Force, Movement and Direction, Health and Illness, and the Human Body. For example, Trump frequently used phrases like “We rise together,” “Unlock America’s glorious destiny,” and “Bring law and order back,” which were successfully classified as Movement or Force metaphors. These metaphors help convey ideas of progress, strength, and control—key themes in campaign messaging.

However, the model performed poorly in less common or more abstract categories, such as Cooking and Food or Plants. In the Plants category, it failed to detect any relevant metaphors at all. In Cooking and Food, it produced several false positives, identifying metaphors that human reviewers judged to be literal. These results suggest that LLMs are more reliable when working with familiar, frequently used metaphor types and less reliable in areas that require nuanced understanding or cultural context.
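One way to make this kind of category-level unevenness visible is to score each category separately. The sketch below is an assumption about how such a comparison could be set up, not the study's procedure, and the labels in the example are invented placeholders rather than data from the paper. A category with no detections, like Plants here, shows a recall of 0.0, while a category with only false positives shows a precision of 0.0.

```python
from collections import Counter


def per_category_scores(gold, predicted):
    """Return {category: (precision, recall)}.

    gold and predicted are equal-length lists of category labels,
    with None meaning "not a metaphor". A score is None when it is
    undefined (nothing predicted, or nothing to find).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if p is not None and p == g:
            tp[p] += 1
        else:
            if p is not None:
                fp[p] += 1  # model flagged something the reviewers did not
            if g is not None:
                fn[g] += 1  # model missed or mislabeled a real metaphor
    scores = {}
    for cat in sorted(set(tp) | set(fp) | set(fn)):
        predicted_n = tp[cat] + fp[cat]
        gold_n = tp[cat] + fn[cat]
        precision = tp[cat] / predicted_n if predicted_n else None
        recall = tp[cat] / gold_n if gold_n else None
        scores[cat] = (precision, recall)
    return scores


# Invented placeholder labels for illustration only (not from the study).
gold = ["Force", "Movement and Direction", "Plants", None, "Force"]
pred = ["Force", "Movement and Direction", None, "Cooking and Food", "Force"]
for category, (precision, recall) in per_category_scores(gold, pred).items():
    print(category, precision, recall)
```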

To verify their findings, the researchers compared the AI-generated results with those produced by traditional metaphor analysis tools, such as Wmatrix and MIPVU. The results were strongly correlated overall, but some differences stood out. ChatGPT was faster and easier to use, but its accuracy varied widely across metaphor categories. In contrast, the traditional methods were slower but more consistent in identifying metaphors across all categories.

Another issue the study uncovered is that LLM performance depends heavily on how prompts are written. Even small changes in how a question is asked can affect what the model produces. This lack of stability makes it harder to reproduce results and undermines confidence in the model’s reliability when dealing with sensitive material like political speech.
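Prompt sensitivity of this kind can be quantified by comparing the sets of expressions flagged under two phrasings of the same request. The sketch below uses Jaccard overlap as one possible stability measure; this is an assumption for illustration, not a metric taken from the study, and the two sets are made-up examples.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0  # two empty outputs agree completely
    return len(a & b) / len(a | b)


# Made-up outputs from two hypothetical prompt variants (not study data).
variant_a = {"rise together", "unlock destiny", "killing field"}
variant_b = {"unlock destiny", "killing field", "bold promises"}
print(jaccard(variant_a, variant_b))  # 0.5
```

A score well below 1.0 across reworded prompts would signal the reproducibility problem described above.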

The researchers also noted broader structural problems in how LLMs are trained. These models rely on enormous datasets scraped from the internet, much of which is uncurated and not annotated for meaning. As a result, LLMs may lack exposure to metaphorical language in specific cultural, historical, or political contexts. They may also pick up and reproduce existing biases related to gender, race, or ideology—especially when processing emotionally or politically loaded texts.

The researchers conclude that while large language models show promise in analyzing metaphor, they are far from replacing human expertise. Their tendency to misinterpret, overreach, or miss subtleties makes them best suited for assisting researchers rather than conducting fully automated analysis. In particular, political metaphors—which often rely on shared cultural symbols, deep emotional resonance, and implicit ideological framing—remain difficult for these systems to understand.

The study, “Large language models prompt engineering as a method for embodied cognitive linguistic representation: a case study of political metaphors in Trump’s discourse,” was authored by Haohan Meng, Xiaoyu Li, and Jinhua Sun.
