From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment
- URL: http://arxiv.org/abs/2505.16610v1
- Date: Thu, 22 May 2025 12:45:12 GMT
- Title: From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment
- Authors: Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
- Abstract summary: Large Language Models (LLMs) deliver generic and one-size-fits-all responses that fail to address users' specific needs. We propose a self-evolution framework designed to help LLMs improve their responses to better align with users' implicit preferences. Our method significantly enhances the model's performance in emotional support, reducing unhelpful responses and minimizing discrepancies between user preferences and model outputs.
- Score: 27.301608019492043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective emotional support hinges on understanding users' emotions and needs to provide meaningful comfort during multi-turn interactions. Large Language Models (LLMs) show great potential for expressing empathy; however, they often deliver generic and one-size-fits-all responses that fail to address users' specific needs. To tackle this issue, we propose a self-evolution framework designed to help LLMs improve their responses to better align with users' implicit preferences concerning user profiles (personalities), emotional states, and specific situations. Our framework consists of two distinct phases: (1) Emotional Support Experience Acquisition, where LLMs are fine-tuned on limited emotional support conversation data to provide basic support, and (2) Self-Improvement for Personalized Emotional Support, where LLMs leverage self-reflection and self-refinement to generate personalized responses. Through iterative direct preference optimization between the pre- and post-refined responses, our model generates responses that reflect a better understanding of the user's implicit preferences. Extensive experiments and evaluations demonstrate that our method significantly enhances the model's performance in emotional support, reducing unhelpful responses and minimizing discrepancies between user preferences and model outputs.
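The second phase described above can be pictured with a short sketch. The Python snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the model methods `generate_response`, `self_reflect`, and `self_refine` are hypothetical placeholders for the Phase-1 model's generation and self-improvement steps, and the loss shown is the standard DPO objective applied to pairs where the post-refined response is treated as preferred over the pre-refined one.

```python
# Minimal sketch of the self-improvement phase: build preference pairs from
# pre- vs. post-refined responses and score them with the standard DPO loss.
# All model methods below are hypothetical placeholders, not the authors' code.
import torch
import torch.nn.functional as F

def build_preference_pairs(dialogue_contexts, model):
    """For each dialogue context, pair the initial (pre-refined) response with
    its self-refined version; the refined response is treated as preferred."""
    pairs = []
    for context in dialogue_contexts:
        draft = model.generate_response(context)               # basic support (Phase-1 model)
        critique = model.self_reflect(context, draft)          # check fit to implicit preferences
        refined = model.self_refine(context, draft, critique)  # personalized rewrite
        pairs.append({"prompt": context, "chosen": refined, "rejected": draft})
    return pairs

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: push the policy to prefer the post-refined
    response over the pre-refined one, relative to a frozen reference model."""
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (pi_logratios - ref_logratios)).mean()
```

In the paper's iterative setup, this pairing and optimization step would be repeated, with the updated model producing new drafts and refinements in each round.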
Related papers
- IntentionESC: An Intention-Centered Framework for Enhancing Emotional Support in Dialogue Systems [74.0855067343594]
In emotional support conversations, unclear intentions can lead supporters to employ inappropriate strategies. We propose the Intention-centered Emotional Support Conversation framework. It defines the possible intentions of supporters, identifies key emotional state aspects for inferring these intentions, and maps them to appropriate support strategies.
arXiv Detail & Related papers (2025-06-06T10:14:49Z) - FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization, which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
arXiv Detail & Related papers (2025-02-26T17:08:46Z) - UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering [39.79275025010785]
UQABench is a benchmark designed to evaluate the effectiveness of user embeddings in prompting large language models for personalization. We conduct extensive experiments on various state-of-the-art methods for modeling user embeddings.
arXiv Detail & Related papers (2025-02-26T14:34:00Z) - Aligning LLMs with Individual Preferences via Interaction [51.72200436159636]
We train large language models (LLMs) that can "interact to align". We develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. For evaluation, we establish the ALOE benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure the customized alignment performance during conversations.
arXiv Detail & Related papers (2024-10-04T17:48:29Z) - Towards Empathetic Conversational Recommender Systems [77.53167131692]
We propose an empathetic conversational recommender (ECR) framework.
ECR contains two main modules: emotion-aware item recommendation and emotion-aligned response generation.
Our experiments on the ReDial dataset validate the efficacy of our framework in enhancing recommendation accuracy and improving user satisfaction.
arXiv Detail & Related papers (2024-08-30T15:43:07Z) - WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback [36.06000681394939]
We introduce WildFeedback, a novel framework that leverages in-situ user feedback during conversations with large language models (LLMs) to create preference datasets automatically. Our experiments demonstrate that LLMs fine-tuned on the WildFeedback dataset exhibit significantly improved alignment with user preferences.
arXiv Detail & Related papers (2024-08-28T05:53:46Z) - EmPO: Emotion Grounding for Empathetic Response Generation through Preference Optimization [9.934277461349696]
Empathetic response generation is a desirable aspect of conversational agents.
We propose a novel approach where we construct theory-driven preference datasets based on emotion grounding.
We show that LLMs can be aligned for empathetic response generation by preference optimization while retaining their general performance.
arXiv Detail & Related papers (2024-06-27T10:41:22Z) - Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts.
RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z) - MISC: A MIxed Strategy-Aware Model Integrating COMET for Emotional Support Conversation [64.37111498077866]
We propose a novel model for emotional support conversation.
It infers the user's fine-grained emotional status, and then responds skillfully using a mixture of strategies.
Experimental results on the benchmark dataset demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-25T10:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.