MCTSr-Zero: Self-Reflective Psychological Counseling Dialogues Generation via Principles and Adaptive Exploration
- URL: http://arxiv.org/abs/2505.23229v1
- Date: Thu, 29 May 2025 08:30:15 GMT
- Title: MCTSr-Zero: Self-Reflective Psychological Counseling Dialogues Generation via Principles and Adaptive Exploration
- Authors: Hao Lu, Yanchi Gu, Haoyuan Huang, Yulin Zhou, Ningxin Zhu, Chen Li,
- Abstract summary: We introduce MCTSr-Zero, a framework for open-ended, human-centric dialogues. Core innovation is "domain alignment", which shifts the MCTS search objective. We also introduce PsyEval, a benchmark for assessing multi-turn psychological counseling dialogues.
- Score: 6.448134875469534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of Monte Carlo Tree Search (MCTS) with Large Language Models (LLMs) has demonstrated significant success in structured, problem-oriented tasks. However, applying these methods to open-ended dialogues, such as those in psychological counseling, presents unique challenges. Unlike tasks with objective correctness, success in therapeutic conversations depends on subjective factors like empathetic engagement, ethical adherence, and alignment with human preferences, for which strict "correctness" criteria are ill-defined. Existing result-oriented MCTS approaches can therefore produce misaligned responses. To address this, we introduce MCTSr-Zero, an MCTS framework designed for open-ended, human-centric dialogues. Its core innovation is "domain alignment", which shifts the MCTS search objective from predefined end-states towards conversational trajectories that conform to target domain principles (e.g., empathy in counseling). Furthermore, MCTSr-Zero incorporates "Regeneration" and "Meta-Prompt Adaptation" mechanisms to substantially broaden exploration by allowing the MCTS to consider fundamentally different initial dialogue strategies. We evaluate MCTSr-Zero in psychological counseling by generating multi-turn dialogue data, which is used to fine-tune an LLM, PsyLLM. We also introduce PsyEval, a benchmark for assessing multi-turn psychological counseling dialogues. Experiments demonstrate that PsyLLM achieves state-of-the-art performance on PsyEval and other relevant metrics, validating MCTSr-Zero's effectiveness in generating high-quality, principle-aligned conversational data for human-centric domains and addressing the LLM challenge of consistently adhering to complex psychological standards.
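The abstract describes the search objective only in prose. As a rough illustration, and not the authors' implementation, the sketch below shows one way an MCTS step could reward a candidate counselor response by how well it conforms to domain principles instead of by reaching a predefined end-state. The principle names, the `Node` class, and the functions `principle_reward`, `uct_select`, and `backup` are all hypothetical. The per-principle scores are assumed to come from some external judge. The "Regeneration" and "Meta-Prompt Adaptation" mechanisms are not modeled here.

```python
import math
from dataclasses import dataclass, field

# Hypothetical principle names; the abstract only cites empathy as an example.
PRINCIPLES = ("empathy", "ethics", "preference_alignment")


@dataclass
class Node:
    """One candidate dialogue response in the search tree (illustrative only)."""
    response: str
    visits: int = 0
    total_reward: float = 0.0
    children: list["Node"] = field(default_factory=list)

    def mean_reward(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def principle_reward(scores: dict[str, float]) -> float:
    """Average per-principle scores in [0, 1] into a single reward.

    Assumption: a judge (human or LLM) supplies one score per principle.
    """
    return sum(scores[p] for p in PRINCIPLES) / len(PRINCIPLES)


def uct_select(parent: Node, c: float = 1.4) -> Node:
    """Standard UCB1 selection: try unvisited children first, then balance
    mean reward against an exploration bonus."""
    for child in parent.children:
        if child.visits == 0:
            return child
    log_n = math.log(parent.visits)
    return max(
        parent.children,
        key=lambda ch: ch.mean_reward() + c * math.sqrt(log_n / ch.visits),
    )


def backup(path: list[Node], reward: float) -> None:
    """Propagate a principle-based reward along the visited path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


root = Node("client: I feel overwhelmed lately.")
warm = Node("counselor: That sounds exhausting. What has been weighing on you?")
curt = Node("counselor: You should just rest more.")
root.children = [warm, curt]

backup([root, warm], principle_reward(
    {"empathy": 0.9, "ethics": 0.8, "preference_alignment": 0.7}))
backup([root, curt], principle_reward(
    {"empathy": 0.2, "ethics": 0.3, "preference_alignment": 0.1}))
best = uct_select(root)
```

With equal visit counts the exploration bonus is the same for both children, so the selection falls to the response with the higher principle-based mean reward.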
Related papers
- Reframe Your Life Story: Interactive Narrative Therapist and Innovative Moment Assessment with Large Language Models [92.93521294357058]
Narrative therapy helps individuals transform problematic life stories into empowering alternatives. Current approaches lack realism in specialized psychotherapy and fail to capture therapeutic progression over time. INT (Interactive Narrative Therapist) simulates expert narrative therapists by planning therapeutic stages, guiding reflection levels, and generating contextually appropriate expert-like responses.
arXiv Detail & Related papers (2025-07-27T11:52:09Z)
- Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling [17.809187205107232]
PsyLLM is a large language model designed to integrate diagnostic and therapeutic reasoning for mental health counseling. This pipeline processes real-world mental health posts and generates multi-turn dialogue structures. Rigorous multi-dimensional filtering ensures the generation of high-quality, clinically aligned dialogue data.
arXiv Detail & Related papers (2025-05-21T16:24:49Z)
- Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback [51.26493826461026]
We propose Psi-Arena, an interactive framework for comprehensive assessment and optimization of large language models (LLMs). Psi-Arena features realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives.
arXiv Detail & Related papers (2025-05-06T08:22:51Z)
- Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues [75.16593367473259]
Cognitive Restructuring (CR) is a psychotherapeutic process aimed at identifying and restructuring an individual's negative thoughts. Existing efforts implement CR via simple text rewriting, fixed-pattern dialogues, or a one-shot CR workflow. We propose CRDial, a novel framework for CR, which creates multi-turn dialogues with specifically designed identification and restructuring stages.
arXiv Detail & Related papers (2025-04-24T04:22:00Z)
- Unlocking LLMs: Addressing Scarce Data and Bias Challenges in Mental Health [8.703482957316107]
Large language models (LLMs) have shown promising capabilities in healthcare analysis but face several challenges like hallucinations, parroting, and bias manifestation. In this work we introduce IC-AnnoMI, an expert-annotated motivational interviewing (MI) dataset built upon AnnoMI. IC-AnnoMI employs targeted prompts accurately engineered through cues and tailored information, taking into account therapy style (empathy, reflection), contextual relevance, and false semantic change.
arXiv Detail & Related papers (2024-12-17T15:01:07Z)
- PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation [32.40846713004979]
PsycoLLM is trained on a proposed high-quality psychological dataset. We augment this process with real-world psychological case backgrounds extracted from online platforms. We develop a comprehensive psychological benchmark based on authoritative psychological counseling examinations in China.
arXiv Detail & Related papers (2024-07-08T08:25:56Z)
- HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy [25.908522131646258]
We unveil the Helping and Empowering through Adaptive Language in Mental Enhancement (HealMe) model.
This novel cognitive reframing therapy method effectively addresses deep-rooted negative thoughts and fosters rational, balanced perspectives.
We adopt the first comprehensive and expertly crafted psychological evaluation metrics, specifically designed to rigorously assess the performance of cognitive reframing.
arXiv Detail & Related papers (2024-02-26T09:10:34Z)
- Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning [6.400704401007114]
This study investigates the impact of generated responses on subjective evaluations such as mood change, cognitive change, and dialogue quality.
When using GPT-4, the amount of mood change, empathy, and other dialogue qualities improve significantly.
arXiv Detail & Related papers (2024-01-29T08:53:41Z)
- MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation [60.65820977963331]
We introduce a novel evaluation paradigm for Large Language Models (LLMs).
This paradigm shifts the emphasis from result-oriented assessments, which often neglect the reasoning process, to a more comprehensive evaluation.
By applying this paradigm in the GSM8K dataset, we have developed the MR-GSM8K benchmark.
arXiv Detail & Related papers (2023-12-28T15:49:43Z)
- Building Emotional Support Chatbots in the Era of LLMs [64.06811786616471]
We introduce an innovative methodology that synthesizes human insights with the computational prowess of Large Language Models (LLMs).
By utilizing the in-context learning potential of ChatGPT, we generate an ExTensible Emotional Support dialogue dataset, named ExTES.
Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions.
arXiv Detail & Related papers (2023-08-17T10:49:18Z)
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
- DynaEval: Unifying Turn and Dialogue Level Evaluation [60.66883575106898]
We propose DynaEval, a unified automatic evaluation framework.
It is capable of performing turn-level evaluation, but also holistically considers the quality of the entire dialogue.
Experiments show that DynaEval significantly outperforms the state-of-the-art dialogue coherence model.
arXiv Detail & Related papers (2021-06-02T12:23:18Z)