Response Generation for Cognitive Behavioral Therapy with Large Language
Models: Comparative Study with Socratic Questioning
- URL: http://arxiv.org/abs/2401.15966v1
- Date: Mon, 29 Jan 2024 08:53:41 GMT
- Title: Response Generation for Cognitive Behavioral Therapy with Large Language
Models: Comparative Study with Socratic Questioning
- Authors: Kenta Izumi, Hiroki Tanaka, Kazuhiro Shidara, Hiroyoshi Adachi,
Daisuke Kanayama, Takashi Kudo, and Satoshi Nakamura
- Abstract summary: This study investigates the impact of generated responses on subjective evaluations such as mood change, cognitive change, and dialogue quality.
When using GPT-4, the amount of mood change, empathy, and other dialogue qualities improve significantly.
- Score: 6.400704401007114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialogue systems controlled by predefined or rule-based scenarios derived
from counseling techniques, such as cognitive behavioral therapy (CBT), play an
important role in mental health apps. Despite the need for responsible
responses, it is conceivable that using the newly emerging LLMs to generate
contextually relevant utterances will enhance these apps. In this study, we
construct dialogue modules based on a CBT scenario focused on conventional
Socratic questioning using two kinds of LLMs: a Transformer-based dialogue
model further trained with a social media empathetic counseling dataset,
provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the-art LLM
created by OpenAI. By comparing systems that use LLM-generated responses with
those that do not, we investigate the impact of generated responses on
subjective evaluations such as mood change, cognitive change, and dialogue
quality (e.g., empathy). As a result, no notable improvements are observed when
using the OsakaED model. When using GPT-4, the amount of mood change, empathy,
and other dialogue qualities improve significantly. Results suggest that GPT-4
possesses a high counseling ability. However, they also indicate that even when
using a dialogue model trained with a human counseling dataset, it does not
necessarily yield better outcomes compared to scenario-based dialogues. While
presenting LLM-generated responses, including GPT-4, and having them interact
directly with users in real-life mental health care services may raise ethical
issues, it is still possible for human professionals to produce example
responses or response templates using LLMs in advance in systems that use
rules, scenarios, or example responses.
Related papers
- Towards Empathetic Conversational Recommender Systems [77.53167131692]
We propose an empathetic conversational recommender (ECR) framework.
ECR contains two main modules: emotion-aware item recommendation and emotion-aligned response generation.
Our experiments on the ReDial dataset validate the efficacy of our framework in enhancing recommendation accuracy and improving user satisfaction.
arXiv Detail & Related papers (2024-08-30T15:43:07Z)
- Are Large Language Models Possible to Conduct Cognitive Behavioral Therapy? [13.0263170692984]
Large language models (LLMs) have been validated, providing new possibilities for psychological assistance therapy.
Many concerns have been raised by mental health experts regarding the use of LLMs for therapy.
Four LLM variants with excellent performance on natural language processing are evaluated.
arXiv Detail & Related papers (2024-07-25T03:01:47Z)
- Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z)
- LLM Questionnaire Completion for Automatic Psychiatric Assessment [49.1574468325115]
We employ a Large Language Model (LLM) to convert unstructured psychological interviews into structured questionnaires spanning various psychiatric and personality domains.
The obtained answers are coded as features, which are used to predict standardized psychiatric measures of depression (PHQ-8) and PTSD (PCL-C).
arXiv Detail & Related papers (2024-06-09T09:03:11Z)
- Can Large Language Models be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues [0.0]
Mental health care poses an increasingly serious challenge to modern societies.
This study collected counseling dialogue data via role-playing scenarios involving expert counselors.
Third-party counselors evaluated the appropriateness of responses from human counselors and those generated by GPT-4 in identical contexts.
arXiv Detail & Related papers (2024-02-20T06:05:36Z)
- "You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations [1.104960878651584]
We share a dataset containing text-based user interactions related to behaviour change with two GPT-4-based conversational agents.
This dataset includes conversation data, user language analysis, perception measures, and user feedback for LLM-generated turns.
arXiv Detail & Related papers (2024-01-29T13:54:48Z)
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z)
- Harnessing Large Language Models' Empathetic Response Generation Capabilities for Online Mental Health Counselling Support [1.9336815376402723]
Large Language Models (LLMs) have demonstrated remarkable performance across various information-seeking and reasoning tasks.
This study sought to examine LLMs' capability to generate empathetic responses in conversations that emulate those in a mental health counselling setting.
We selected five LLMs: version 3.5 and version 4 of the Generative Pre-trained Transformer (GPT), Vicuna FastChat-T5, Pathways Language Model (PaLM) version 2, and Falcon-7B-Instruct.
arXiv Detail & Related papers (2023-10-12T03:33:06Z)
- Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration [72.04629217161656]
This work focuses on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues.
To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme.
arXiv Detail & Related papers (2023-05-23T02:49:35Z)
- Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs).
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose an interactive Evaluation approach based on LLMs named iEvaLM that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z)
- Response-act Guided Reinforced Dialogue Generation for Mental Health Counseling [25.524804770124145]
We present READER, a dialogue-act guided response generator for mental health counseling conversations.
READER is built on a transformer to jointly predict a potential dialogue-act d(t+1) for the next utterance (aka response-act) and to generate an appropriate response u(t+1).
We evaluate READER on HOPE, a benchmark counseling conversation dataset.
arXiv Detail & Related papers (2023-01-30T08:53:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.