Related papers: Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy

URL: http://arxiv.org/abs/2603.03862v1
Date: Wed, 04 Mar 2026 09:15:14 GMT
Title: Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy
Authors: Navdeep Singh Bedi, Ana-Maria Bucur, Noriko Kando, Fabio Crestani
Abstract summary: We evaluate Large Language Models' ability to emulate professional therapists practicing Cognitive Behavioral Therapy (CBT). Our results indicate that while LLMs can generate CBT-like dialogues, they are limited in their ability to convey empathy and maintain consistency.
Score: 4.551587749019292
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: As mental health issues continue to rise globally, there is an increasing demand for accessible and scalable therapeutic solutions. Many individuals currently seek support from Large Language Models (LLMs), even though these models have not been validated for use in counseling services. In this paper, we evaluate LLMs' ability to emulate professional therapists practicing Cognitive Behavioral Therapy (CBT). Using anonymized, transcribed role-play sessions between licensed therapists and clients, we compare two approaches: (1) a generation-only method and (2) a Retrieval-Augmented Generation (RAG) approach using CBT guidelines. We evaluate both proprietary and open-source models for linguistic quality, semantic coherence, and therapeutic fidelity using standard natural language generation (NLG) metrics, natural language inference (NLI), and automated scoring for skills assessment. Our results indicate that while LLMs can generate CBT-like dialogues, they are limited in their ability to convey empathy and maintain consistency.

Related papers

Roleplaying with Structure: Synthetic Therapist-Client Conversation Generation from Questionnaires [5.163738939075784]
We present an LLM-driven pipeline that generates synthetic counseling dialogues based on structured client profiles and psychological questionnaires. Our framework, SQPsych, converts structured psychological input into natural language dialogues through therapist-client simulations. Our findings highlight the potential of synthetic data to enable scalable, data-secure, and clinically informed AI for mental health support.
arXiv Detail & Related papers (2025-10-29T10:55:52Z)
Reframe Your Life Story: Interactive Narrative Therapist and Innovative Moment Assessment with Large Language Models [72.36715571932696]
Narrative therapy helps individuals transform problematic life stories into empowering alternatives. Current approaches lack realism in specialized psychotherapy and fail to capture therapeutic progression over time. Int (Interactive Narrative Therapist) simulates expert narrative therapists by planning therapeutic stages, guiding reflection levels, and generating contextually appropriate expert-like responses.
arXiv Detail & Related papers (2025-07-27T11:52:09Z)
Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling [50.83055329849865]
PsyLLM is a large language model designed to integrate diagnostic and therapeutic reasoning for mental health counseling. It processes real-world mental health posts from Reddit and generates multi-turn dialogue structures. Our experiments demonstrate that PsyLLM significantly outperforms state-of-the-art baseline models.
arXiv Detail & Related papers (2025-05-21T16:24:49Z)
Self-Adaptive Cognitive Debiasing for Large Language Models in Decision-Making [71.71796367760112]
Large language models (LLMs) have shown potential in supporting decision-making applications. We propose a cognitive debiasing approach, self-adaptive cognitive debiasing (SACD). We evaluate SACD on finance, healthcare, and legal decision-making tasks using both open-weight and closed-weight LLMs.
arXiv Detail & Related papers (2025-04-05T11:23:05Z)
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy [67.23830698947637]
We propose a new benchmark, CBT-BENCH, for the systematic evaluation of cognitive behavioral therapy (CBT) assistance. We include three levels of tasks in CBT-BENCH: I: Basic CBT knowledge acquisition, with the task of multiple-choice questions; II: Cognitive model understanding, with the tasks of cognitive distortion classification, primary core belief classification, and fine-grained core belief classification; III: Therapeutic response generation, with the task of generating responses to patient speech in CBT therapy sessions. Experimental results indicate that while LLMs perform well in reciting CBT knowledge, they fall short in complex real-world scenarios.
arXiv Detail & Related papers (2024-10-17T04:52:57Z)
Therapy as an NLP Task: Psychologists' Comparison of LLMs and Human Peers in CBT [6.932239020477335]
Large language models (LLMs) are being used as ad hoc therapists. We compare the session-level behaviors of human counselors with those of an LLM prompted by a team of peer counselors to deliver single-session Cognitive Behavioral Therapy.
arXiv Detail & Related papers (2024-09-03T19:19:13Z)
Are Large Language Models Possible to Conduct Cognitive Behavioral Therapy? [13.0263170692984]
Large language models (LLMs) have been validated, providing new possibilities for psychological assistance therapy. Many concerns have been raised by mental health experts regarding the use of LLMs for therapy. Four LLM variants with excellent performance on natural language processing are evaluated.
arXiv Detail & Related papers (2024-07-25T03:01:47Z)
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy [25.908522131646258]
We unveil the Helping and Empowering through Adaptive Language in Mental Enhancement (HealMe) model. This novel cognitive reframing therapy method effectively addresses deep-rooted negative thoughts and fosters rational, balanced perspectives. We adopt the first comprehensive and expertly crafted psychological evaluation metrics, specifically designed to rigorously assess the performance of cognitive reframing.
arXiv Detail & Related papers (2024-02-26T09:10:34Z)
Evaluating the Efficacy of Interactive Language Therapy Based on LLM for High-Functioning Autistic Adolescent Psychological Counseling [1.1780706927049207]
This study investigates the efficacy of Large Language Models (LLMs) in interactive language therapy for high-functioning autistic adolescents. LLMs present a novel opportunity to augment traditional psychological counseling methods.
arXiv Detail & Related papers (2023-11-12T07:55:39Z)
Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing [53.096237570992294]
Strategy training is a rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke. Standardized fidelity assessment is used to measure adherence to treatment principles. We developed a rule-based NLP algorithm, a long-short term memory (LSTM) model, and a bidirectional encoder representation from transformers (BERT) model for this task.
arXiv Detail & Related papers (2022-09-14T15:33:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.