CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models
- URL: http://arxiv.org/abs/2602.20648v1
- Date: Tue, 24 Feb 2026 07:52:56 GMT
- Title: CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models
- Authors: Anqi Li, Chenxiao Wang, Yu Lu, Renjun Xu, Lizhi Ma, Zhenzhong Lan
- Abstract summary: We present CARE, an LLM-based framework to automatically predict multi-dimensional alliance scores and generate interpretable rationales from counseling transcripts. CARE is built on the CounselingWAI dataset and enriched with 9,516 expert-curated rationales. Experiments show that CARE outperforms leading LLMs and substantially reduces the gap between counselor evaluations and client-perceived alliance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Client perceptions of the therapeutic alliance are critical for counseling effectiveness. Accurately capturing these perceptions remains challenging, as traditional post-session questionnaires are burdensome and often delayed, while existing computational approaches produce coarse scores, lack interpretable rationales, and fail to model holistic session context. We present CARE, an LLM-based framework to automatically predict multi-dimensional alliance scores and generate interpretable rationales from counseling transcripts. Built on the CounselingWAI dataset and enriched with 9,516 expert-curated rationales, CARE is fine-tuned using rationale-augmented supervision with the LLaMA-3.1-8B-Instruct backbone. Experiments show that CARE outperforms leading LLMs and substantially reduces the gap between counselor evaluations and client-perceived alliance, achieving over 70% higher Pearson correlation with client ratings. Rationale-augmented supervision further improves predictive accuracy. CARE also produces high-quality, contextually grounded rationales, validated by both automatic and human evaluations. Applied to real-world Chinese online counseling sessions, CARE uncovers common alliance-building challenges, illustrates how interaction patterns shape alliance development, and provides actionable insights, demonstrating its potential as an AI-assisted tool for supporting mental health care.
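The abstract's headline result is a Pearson correlation between model-predicted alliance scores and client self-reported ratings. As a minimal sketch of how such an agreement metric could be computed (the dimension, scores, and function names below are hypothetical illustrations, not taken from the CARE codebase):

```python
# Illustrative sketch only: measuring agreement between predicted
# alliance scores and client post-session ratings via Pearson
# correlation, the metric reported in the abstract.
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-session scores for one alliance dimension,
# model predictions vs. client ratings on a 1-7 scale.
predicted = [4.0, 5.5, 3.0, 6.0, 4.5]
client    = [4.5, 5.0, 2.5, 6.5, 4.0]

print(f"Pearson r = {pearson(predicted, client):.3f}")
```

In practice one would compute such a correlation per alliance dimension across many sessions; a library routine such as `scipy.stats.pearsonr` would also report a p-value alongside the coefficient.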
Related papers
- Multi-dimensional Assessment and Explainable Feedback for Counselor Responses to Client Resistance in Text-based Counseling with LLMs [28.919083157390464]
We present a comprehensive pipeline for the multi-dimensional evaluation of human counselors' interventions targeting client resistance in text-based therapy. We introduce a theory-driven framework that decomposes counselor responses into four distinct communication mechanisms. We show that our approach can effectively distinguish the quality of different communication mechanisms.
arXiv Detail & Related papers (2026-02-25T07:05:05Z)
- Responsible Evaluation of AI for Mental Health [72.85175110624736]
Current approaches to evaluating AI tools in mental health care are fragmented and poorly aligned with clinical practice, social context, and first-hand user experience. This paper argues for a rethinking of responsible evaluation by introducing an interdisciplinary framework that integrates clinical soundness, social context, and equity.
arXiv Detail & Related papers (2026-01-20T12:55:10Z)
- PAIR-SAFE: A Paired-Agent Approach for Runtime Auditing and Refining AI-Mediated Mental Health Support [18.251267901872886]
Large language models (LLMs) are increasingly used for mental health support, yet they can produce responses that are overly directive, inconsistent, or clinically misaligned. We introduce PAIR-SAFE, a paired-agent framework for auditing and refining AI-generated mental health support.
arXiv Detail & Related papers (2026-01-19T06:20:57Z)
- MindChat: A Privacy-preserving Large Language Model for Mental Health Support [10.332226758787277]
We present MindChat, a privacy-preserving large language model for mental health support. We also present MindCorpus, a synthetic multi-turn counseling dataset constructed via a multi-agent role-playing framework.
arXiv Detail & Related papers (2026-01-05T10:54:18Z)
- CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions [49.02422075498554]
Large Language Model (LLM) agents have evolved from basic text generation to autonomously completing complex tasks through interaction with external tools. In this work, we emphasize the importance of learning ability, including both self-improvement and peer-learning, as a core driver for agent evolution toward human-level intelligence. We propose an iterative, competitive peer-learning framework, which allows agents to refine and optimize their strategies through repeated interactions and feedback.
arXiv Detail & Related papers (2025-10-30T15:22:53Z)
- The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z)
- Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback [51.26493826461026]
We propose Psi-Arena, an interactive framework for comprehensive assessment and optimization of large language models (LLMs). Psi-Arena features realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives.
arXiv Detail & Related papers (2025-05-06T08:22:51Z)
- Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases, annotated with reasoning references. We propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking.
arXiv Detail & Related papers (2025-03-06T18:35:39Z)
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
- Understanding the Therapeutic Relationship between Counselors and Clients in Online Text-based Counseling using LLMs [18.605352662843575]
We present an automatic approach using large language models (LLMs) to understand the development of therapeutic alliance in text-based counseling.
We collect a comprehensive counseling dataset and conduct multiple expert evaluations on a subset based on this framework.
Our findings underscore the challenges counselors face in cultivating strong online relationships with clients.
arXiv Detail & Related papers (2024-02-19T09:00:10Z)