Latest Research Strengthens Evidence for AI Therapy's Mental Health Benefits
What does the latest research tell us about the effectiveness of AI therapy chatbots for depression, anxiety, and eating disorders?
In recent years, we've seen remarkable advancements in how AI can be leveraged for various mental health applications, from early detection to designing personalized treatment plans. More recently, a number of AI therapy apps have hit the market, spurred by the generative AI revolution.
While this is a new domain, evidence is building that AI therapy tools can be clinically effective for the treatment of mental health issues. A new study has been touted as the first randomized controlled trial (RCT) "demonstrating the effectiveness of a fully Gen-AI therapy chatbot for treating clinical-level mental health symptoms". Some news headlines about the article go further and exaggerate, calling it the “First AI Therapy Study”.
Let’s dive into this latest article to understand its strengths, its limitations, its relationship to previous studies of AI therapy, and what it bodes for the future of AI therapy in mental health.
From Text Responses to Real Clinical Outcomes
A couple of months ago, I wrote about a recent study comparing ChatGPT responses with those from human therapists. The findings were eye-opening:
People could barely distinguish between AI and human therapist responses
In many cases, they rated AI responses higher on therapeutic qualities
But this only examined single responses, not actual clinical improvement
The new study, published in NEJM AI by Heinz et al. (2025), adds to a growing body of evidence about AI therapy's effectiveness.
What This New Research Studied
This latest study tested "Therabot," a generative AI chatbot app designed for mental health treatment, with 210 adults experiencing symptoms of depression, anxiety, or eating disorder risk. A quick overview:
Participants used the AI therapist for four weeks or were assigned to a waitlist control group
The system used third-wave cognitive behavioral therapy (which incorporates mindfulness and acceptance)
Trained clinicians supervised the AI's responses for safety
Important context: While the authors describe this as "the first RCT of a fully Gen-AI therapy chatbot," previous research has already shown clinical benefits of other generative AI and non-generative AI approaches. A meta-analysis conducted in 2023 counted 35 studies in total, of which 15 were RCTs. The real novelty here seems to be primarily technical: this study is the first to combine a fully generative AI model with a rigorous randomized controlled trial design to assess clinical efficacy. I say this not to downplay the value of this study, but simply to point out that it’s part of a growing body of research in support of AI therapy.
The study found that the AI therapy app significantly reduced depression and anxiety symptoms, along with eating disorder risk.
The Results: Clinically Meaningful Mental Health Improvements in Users
The findings show significant benefits for participants using the AI therapy system:
Strong symptom reduction: The AI therapy group showed substantial improvements in depression, anxiety, and eating disorder risk compared to the control group
Strong therapeutic relationship: Users rated their connection with the AI similar to typical ratings for human therapists (wow!)
Good engagement: Participants sent hundreds of messages over several weeks, with average usage exceeding 6 hours
“Participants, on average, reported a therapeutic alliance comparable to norms reported in an outpatient psychotherapy sample.”
Why AI Therapy May Be Effective
AI therapy can be accessed 24/7, unlike traditional therapy
The authors note several factors likely contributed to these positive outcomes:
24/7 accessibility: Users could engage whenever needed, even at 3 AM
Evidence-based foundation: Built on established therapeutic approaches
Personalized conversations: Generative AI allows for natural, tailored dialogue
Therapeutic alliance: The AI successfully built rapport with users
Study Limitations to Consider
Despite promising results, several important limitations remain:
The study compared AI therapy with a control group receiving no treatment, not with active treatment by human therapists, so it does not tell us how effective AI therapy is relative to traditional therapy
Required human supervision (15 interventions for safety concerns, 13 for inappropriate responses)
Short follow-up period (8 weeks)
Possible selection bias toward tech-savvy participants
Primarily expands on earlier research rather than showing entirely new capabilities
The Future of AI in Mental Health Care
AI, and technology in general, are likely to play a central role in providing personalized mental health treatment plans in the future, with treatments including traditional human therapy, AI therapy, medication, and other methodologies.
This growing body of research points to promising possibilities:
Bridging the access gap: With most people with mental health disorders unable to access care, AI could provide evidence-based support that's accessible to everyone, 24/7
Complementary approaches: AI might work alongside traditional therapy, not replace it
Multimodal support: Some platforms integrate multiple approaches (like Wellness AI, which combines AI therapy conversations with guided AI meditations)
What's Next?
Future research directions
I’d like to see future research address some of the following questions:
Comparisons between Traditional Therapy, AI Therapy, and a combination of both
Long-term effectiveness of AI therapy and related technology-powered mental health treatments
Best ways to combine AI therapy with other treatment approaches, such as mindfulness, personalized AI meditations, and changes to daily habits, and how effective those combinations are
A popular theory in psychology is that all therapeutic approaches designed around certain common factors have similar efficacy
My best guess for future research findings
Given that research into traditional therapy has often shown that most valid forms of therapy demonstrate similar effectiveness (known as the Dodo Bird Effect), I expect to see similar effects in the AI therapy space. Specifically, I think that many AI therapy approaches will be shown to be effective as long as they are designed carefully and based around techniques and common factors that have been validated in traditional therapy settings.
A couple of caveats:
I am not making the claim that the majority of AI therapy apps currently on the market are effective, but I do think that apps in which the creators have taken care in building and testing their app (as I did extensively with Wellness AI) are likely to provide mental health benefits for their users.
I would be shocked if AI therapy were as effective as traditional therapy at this point, given both (a) the known limitations of current AI and (b) the importance of human interaction in general for people’s wellbeing, which AI will not fulfill.
I also expect that apps which combine treatment modalities (such as Mindfulness-Based Cognitive Therapy or what we’ve done in Wellness AI) will be more effective, since they offer more ways for users to engage with evidence-backed treatments. But ultimately, I think that it will be most important for individuals to find an app that is well suited to their preferences, similar to the importance of matching a therapist’s personality to their client’s preferences.
AI therapy has progressed dramatically from the simplistic ELIZA chatbot of the 1960s. Today's systems can engage in complex, personalized therapeutic dialogues that appear to produce real clinical benefits for conditions like anxiety and depression.
As technology continues to evolve, the goal should be to create a future where quality mental health support is available to everyone who needs it. And I think that technology’s greatest promise is in offering this support whenever, however, and wherever it is most needed.
-Tim
Creator of Wellness AI