Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback
- URL: http://arxiv.org/abs/2401.05928v3
- Date: Mon, 17 Jun 2024 21:08:20 GMT
- Title: Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback
- Authors: Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
- Abstract summary: We introduce a novel model-agnostic framework named mitigating unhelpfulness with multifaceted AI feedback for emotional support (Muffin).
Muffin employs a multifaceted AI feedback module to assess the helpfulness of responses generated by a specific model with consideration of multiple factors.
Results demonstrate that Muffin effectively mitigates the generation of unhelpful responses while slightly increasing response fluency and relevance.
- Score: 9.57004333812654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An emotional support conversation system aims to alleviate users' emotional distress and assist them in addressing their challenges. To generate supportive responses, it is critical to consider multiple factors such as empathy, support strategies, and response coherence, as established in prior methods. Nonetheless, previous models occasionally generate unhelpful responses, which intend to provide support but display counterproductive effects. According to psychology and communication theories, poor performance in just one contributing factor might cause a response to be unhelpful. From the model training perspective, since these models have not been exposed to unhelpful responses during their training phase, they are unable to distinguish if the tokens they generate might result in unhelpful responses during inference. To address this issue, we introduce a novel model-agnostic framework named mitigating unhelpfulness with multifaceted AI feedback for emotional support (Muffin). Specifically, Muffin employs a multifaceted AI feedback module to assess the helpfulness of responses generated by a specific model with consideration of multiple factors. Using contrastive learning, it then reduces the likelihood of the model generating unhelpful responses compared to the helpful ones. Experimental results demonstrate that Muffin effectively mitigates the generation of unhelpful responses while slightly increasing response fluency and relevance.
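The two ideas in the abstract (a response is unhelpful if it fails on any single factor, and contrastive learning lowers the likelihood of unhelpful responses relative to helpful ones) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the hinge-style margin loss, the margin value, and the token probabilities are all assumptions chosen for illustration. Only the facet names (empathy, support strategy, coherence) come from the abstract.

```python
import math


def is_unhelpful(facet_ok):
    """Hypothetical aggregation rule: failing any one facet marks the response unhelpful."""
    return not all(facet_ok.values())


def sequence_log_prob(token_probs):
    """Log-likelihood of a response, given the model's per-token probabilities."""
    return sum(math.log(p) for p in token_probs)


def contrastive_margin_loss(helpful_logp, unhelpful_logp, margin=1.0):
    """Hinge-style loss (an assumed stand-in for the paper's contrastive objective).

    The loss is zero once the helpful response is at least `margin` more
    likely (in log space) than the unhelpful one, and positive otherwise.
    """
    return max(0.0, margin - (helpful_logp - unhelpful_logp))


# Made-up feedback for one candidate response: it fails only on support strategy.
feedback = {"empathy": True, "strategy": False, "coherence": True}
label_unhelpful = is_unhelpful(feedback)

# Made-up per-token probabilities for a helpful and an unhelpful response.
helpful_logp = sequence_log_prob([0.5, 0.5])
unhelpful_logp = sequence_log_prob([0.5, 0.25])

loss = contrastive_margin_loss(helpful_logp, unhelpful_logp, margin=1.0)
print(label_unhelpful, round(loss, 4))
```

In this toy case the helpful response is already more likely than the unhelpful one, but by less than the margin, so the loss stays positive and training would keep pushing the two apart.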
Related papers
- Distilling Reasoning Ability from Large Language Models with Adaptive Thinking [54.047761094420174]
Chain of thought finetuning (cot-finetuning) aims to endow small language models (SLM) with reasoning ability to improve their performance towards specific tasks.
Most existing cot-finetuning methods adopt a pre-thinking mechanism, allowing the SLM to generate a rationale before providing an answer.
This mechanism enables SLM to analyze and think about complex questions, but it also makes answer correctness highly sensitive to minor errors in rationale.
We propose a robust post-thinking mechanism to generate answers before rationale.
arXiv Detail & Related papers (2024-04-14T07:19:27Z)
- When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour [0.8133739801185272]
We show that Large Language Models (LLMs) exhibit sycophantic tendencies when responding to queries involving subjective opinions and statements.
LLMs at various scales seem not to follow the users' hints by demonstrating confidence in delivering the correct answers.
arXiv Detail & Related papers (2023-11-15T22:18:33Z)
- Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
- Boosting Distress Support Dialogue Responses with Motivational Interviewing Strategy [4.264192013842096]
We show how some response types could be rephrased into a more MI-adherent form.
We build several rephrasers by fine-tuning Blender and GPT3 to rephrase MI non-adherent "Advise without permission" responses into "Advise with permission".
arXiv Detail & Related papers (2023-05-17T13:18:28Z)
- Pneg: Prompt-based Negative Response Generation for Dialogue Response Selection Task [27.513992470527427]
In retrieval-based dialogue systems, a response selection model acts as a ranker to select the most appropriate response among several candidates.
Recent studies have shown that leveraging adversarial responses as negative training samples is useful for improving the discriminating power of the selection model.
This paper proposes a simple but efficient method for generating adversarial negative responses leveraging a large-scale language model.
arXiv Detail & Related papers (2022-10-31T11:49:49Z)
- When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels [34.6235464256814]
Juicer is a framework to make use of both binary and free-form textual human feedback.
We find that augmenting training with model-corrected replies improves the final dialogue model.
arXiv Detail & Related papers (2022-10-28T04:57:21Z)
- MISC: A MIxed Strategy-Aware Model Integrating COMET for Emotional Support Conversation [64.37111498077866]
We propose a novel model for emotional support conversation.
It infers the user's fine-grained emotional status, and then responds skillfully using a mixture of strategies.
Experimental results on the benchmark dataset demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-25T10:32:04Z)
- Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication [88.52901763928045]
We propose an approach that relies on exemplars to cue the generative model on fine stylistic properties that signal empathy to the interlocutor.
We empirically show that these approaches yield significant improvements in empathetic response quality in terms of both automated and human-evaluated metrics.
arXiv Detail & Related papers (2021-06-22T14:02:33Z)
- Generating Dialogue Responses from a Semantic Latent Space [75.18449428414736]
We propose an alternative to the end-to-end classification on vocabulary.
We learn the pair relationship between the prompts and responses as a regression task on a latent space.
Human evaluation showed that learning the task on a continuous space can generate responses that are both relevant and informative.
arXiv Detail & Related papers (2020-10-04T19:06:16Z)
- Counterfactual Off-Policy Training for Neural Response Generation [94.76649147381232]
We propose to explore potential responses by counterfactual reasoning.
Training on the counterfactual responses under the adversarial learning framework helps to explore the high-reward area of the potential response space.
An empirical study on the DailyDialog dataset shows that our approach significantly outperforms the HRED model.
arXiv Detail & Related papers (2020-04-29T22:46:28Z)
- Review-guided Helpful Answer Identification in E-commerce [38.276241153439955]
Product-specific community question answering platforms can greatly help address the concerns of potential customers.
The user-provided answers on such platforms often vary widely in quality.
Helpfulness votes from the community can indicate the overall quality of the answer, but they are often missing.
arXiv Detail & Related papers (2020-03-13T11:34:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.