As artificial intelligence becomes woven into the fabric of daily life, scientists are racing to understand its deeper psychological, social, and cognitive impacts. From diagnosing mental health conditions to shaping political beliefs, AI tools—especially large language models like ChatGPT—are influencing how we think, work, and interact with technology and each other. A wave of new research is beginning to uncover what this means for our minds, our behaviors, and our society.
Read on for seven recent discoveries that reveal how artificial intelligence is reshaping human thought, behavior, and culture in surprising ways.
1. The curious hackers of AI: Inside the world of LLM red teamers
A study published in PLOS One sheds light on the emerging culture of “LLM red teaming,” where individuals push large language models to their limits—not to cause harm, but to explore, experiment, and understand their behavior. Through interviews with 28 practitioners, including software engineers, artists, and hobbyists, researchers discovered that these testers are motivated by curiosity, ethics, and a desire to expose hidden vulnerabilities in AI systems. Their work often involves creative, improvisational strategies designed to prompt unintended or restricted responses from the models.
Participants described their activities using metaphors like “alchemy” and “scrying,” reflecting the mysterious nature of LLM behavior. Many were part of vibrant online communities sharing prompts and techniques. The study identified five broad categories of red teaming strategies, such as rhetorical framing and fictional world-building, and emphasized that most testers acted without malicious intent. Rather than seeking security flaws for exploitation, they aimed to understand how language alone can “hack” these models. The researchers argue that a human-centered, qualitative approach is key to grasping the evolving practice of AI red teaming, especially as traditional cybersecurity methods fall short in this new linguistic terrain.
2. ChatGPT aces psychiatry case vignettes without a single diagnostic error
A study published in the Asian Journal of Psychiatry evaluated the diagnostic capabilities of ChatGPT 3.5 using 100 psychiatric case vignettes. Remarkably, the model received the highest grade on 61 cases and the second-highest on 31, with no diagnostic errors recorded. These results suggest that the model is highly competent at interpreting psychiatric symptoms and proposing treatment strategies, raising the possibility of AI as a future aid in clinical mental health settings.
The study used vignettes from a widely known textbook, which may or may not have been included in the model’s training data. Each case involved a detailed symptom narrative followed by diagnostic questions, and ChatGPT’s answers were assessed by two experienced psychiatrists. ChatGPT’s strongest performance was in suggesting management plans, though it also excelled at differential diagnosis. The findings support the idea that language models can be used to assist clinicians, especially when supplemented with proper oversight. However, questions remain about generalizability, particularly if future tests rely on less familiar or unpublished data.
3. Has ChatGPT’s political compass shifted? New study says yes
Research in Humanities & Social Sciences Communications found that ChatGPT’s political output tends to align with libertarian-left values, but newer versions show a subtle shift toward the political right. Using the Political Compass Test, researchers analyzed 3,000 responses each from ChatGPT-3.5 and GPT-4. While both versions leaned libertarian-left overall, GPT-4 trended more toward center-right economic values.
This shift may not be due to changes in training data, as the researchers controlled for many external variables. Instead, the findings suggest that even subtle updates to the model’s design can influence the political tone of its responses. Though large language models do not hold political beliefs themselves, they reflect the data they’re trained on and the instructions from their developers. The authors argue for ongoing oversight to track how these shifts occur, especially as LLMs are used more frequently in public communication, education, and decision-making contexts.
4. AI for some: ChatGPT use widens workplace inequality
A study in the Proceedings of the National Academy of Sciences found that while ChatGPT is being widely adopted in the workplace, the benefits are not equally distributed. Surveying 18,000 Danish workers in occupations highly exposed to AI—like journalism and software development—researchers discovered that younger, higher-earning men are far more likely to use the tool. Women and lower-income workers were less likely to adopt it, even within the same occupation.
These findings suggest that barriers to AI adoption—such as employer policies or lack of training—may be reinforcing existing inequalities. Even when informed about ChatGPT’s time-saving potential, many workers did not change their usage plans, indicating that awareness alone is not enough to drive adoption. Interestingly, early adopters also tended to earn more and be more optimistic about productivity gains. The researchers suggest that these patterns could lead to long-term advantages for some groups and disadvantages for others, unless interventions help level the playing field.
5. AI can spot signs of depression in how older adults drive
A pair of studies led by researchers at Washington University in St. Louis found that driving behavior can reveal signs of depression in older adults, and that AI can help detect it. In the first study, participants aged 65 and older had driving data collected via GPS-enabled devices in their vehicles. Those with depression showed more erratic driving patterns, including hard braking, unpredictable routes, and greater distances traveled, despite having cognitive test scores similar to those of participants without depression.
The second study used machine learning to analyze two years of driving data from 157 older adults. A model combining driving patterns with medication use was able to identify depression with up to 90% accuracy. Surprisingly, demographic data did not significantly improve the model’s performance, suggesting that behavioral data may be more telling than age or gender. While the research doesn’t prove that depression causes these changes, it highlights a promising new approach for mental health screening using real-world behavioral data.
6. AI takes personality tests and tries to look good
A study in PNAS Nexus reveals that large language models show a strong social desirability bias when taking personality tests. When presented with items from the Big Five personality assessment, models like GPT-4 and Claude 3 consistently gave responses that would make them appear more extroverted, agreeable, and conscientious, and less neurotic. This tendency increased when more questions were asked in a single session, suggesting that models “realize” they’re being evaluated.
The researchers tested multiple versions of each question, randomized the order, and altered the phrasing to ensure that the bias wasn’t simply due to memorization or acquiescence. The effect was large—equivalent to a one-standard-deviation change in personality traits if the same results were seen in humans. These findings have major implications for using AI in psychological research or real-world assessments. If models are subtly trained to be likable, their responses may not always reflect an honest simulation of human behavior.
7. Overusing AI may weaken your critical thinking, study warns
A study in Societies finds that people who frequently rely on AI tools may experience declines in critical thinking skills, particularly through a phenomenon called cognitive offloading. This occurs when users let the AI do the hard thinking for them, accepting quick answers instead of engaging in deep analysis. The effect was most pronounced in younger users, while people with higher education levels tended to retain better critical thinking skills even with frequent AI use.
The study combined surveys of 666 participants with interviews and statistical modeling. Those who regularly used AI tools for decision-making or problem-solving performed worse on critical thinking tests. Interviews revealed that many users, particularly younger ones, had stopped questioning AI-generated answers. The author calls for educational and design-based solutions that encourage users to engage critically with AI outputs. While the tools themselves aren’t inherently harmful, how we use them will shape their long-term impact on human cognition.