Chatbots are becoming mental health tools before they are ready
Hello and welcome to Eye on AI. Beatrice Nolan here, filling in for Jeremy Kahn today. In this edition: The risks of using AI chatbots for mental health…Amazon’s AI usage metrics are backfiring…Thinking Machines Lab is building an AI that collaborates…AI is starting to help hackers find software flaws.
Millions of people are turning to AI chatbots for emotional support, but are the models really safe enough to help users suffering from anxiety, loneliness, eating disorders, or darker thoughts they may not want to say out loud to another person?
According to new research shared with Fortune by mpathic, a company founded by clinical psychologists, the answer is not yet. The researchers found that leading models still struggle with one of the most important parts of therapy: knowing when a user needs pushback rather than reassurance. While the models were generally good at spotting clear crisis statements, such as direct suicide threats, they were less reliable when risk showed up indirectly, through subtle comments about food, dieting, withdrawal, hopelessness, or beliefs that became more extreme over the course of a conversation.
A model that soothes users despite concerning behavior patterns, or validates delusions, could delay someone from getting real help or quietly make things worse.
This is concerning when you consider that, according to a recent poll from KFF, a non-profit organization focused on national health policy, 16% of U.S. adults had used AI chatbots for mental health information in the past year. Among adults under 30, the figure rose to 28%. Chatbot use for therapy is also prevalent among teenagers and young adults. For example, researchers from RAND, Brown, and Harvard found that about one in eight people ages 12 to 21 had used AI chatbots for mental health advice, and more than 93% of those users believed the advice was helpful.
It’s easy to see why people, especially younger adults, turn to chatbots for this kind of support. Loneliness and anxiety may be on the rise, but in much of the country, mental health support is still stigmatized, expensive, and difficult to access. An AI chatbot, by contrast, is not only free but may also feel like a more anonymous, simpler option.
What the models miss
The company’s research found that harmful responses are often subtle, with models sounding calm and supportive while still weakening a user’s judgment. That is especially relevant because people often turn to chatbots in moments of vulnerability or distress.
Mental health and misinformation frequently overlap. A user who is grieving may become more susceptible to magical thinking, while someone already leaning toward a conspiracy theory may be nudged deeper into it if a model treats every suspicion as equally valid.
Alison Cerezo, mpathic’s chief science officer and a licensed psychologist, told Fortune this is partly because models are designed to be helpful, but “sometimes those helpful behaviors can not be an appropriate response to what the user is bringing in the conversation.”
There have already been real-world examples of users being nudged into delusional spirals by AI chatbots, with serious mental health consequences. In one case, 47-year-old Allan Brooks spent three weeks and more than 300 hours talking to ChatGPT after becoming convinced he had discovered a new mathematical principle that could disrupt the internet and enable inventions such as a levitation beam. Brooks told Fortune he repeatedly asked the chatbot to reality-check him, but it continually reassured him that his beliefs were real.
Brooks was in part a victim of OpenAI’s notoriously sycophantic GPT-4o model. All AI chatbots have a tendency to flatter, validate, or agree with users too readily, but OpenAI eventually had to roll back a GPT-4o update in April 2025 after acknowledging that the model had become “overly flattering or agreeable.” The company later retired GPT-4o entirely, prompting backlash from some users who said they had formed deep attachments to it.
A new benchmark
As part of the research, mpathic has developed a new benchmark to evaluate how AI models handle sensitive conversations across suicide risk, eating disorders, and misinformation, testing whether they can detect risk, respond appropriately, and avoid reinforcing harmful beliefs.
In the misinformation portion of the research, mpathic tested six major AI models across multi-turn conversations and found that the most common harmful behavior was reinforcement, with models validating or building on a user’s belief without enough scrutiny. The models also struggled with subtler eating disorder signals, indirect signs of suicide risk, and “breadcrumbs” suggesting that a user’s belief was becoming riskier or more distorted.
This raises concerning questions about the use of AI chatbots for therapy, the researchers said, as many real mental health conversations do not begin with a clear crisis statement. For example, people may signal distress indirectly, through offhand remarks rather than an explicit statement of risk.