EmoHarbor: Evaluating Personalized Emotional Support by Simulating the User's Internal World
- URL: http://arxiv.org/abs/2601.01530v1
- Date: Sun, 04 Jan 2026 13:46:51 GMT
- Title: EmoHarbor: Evaluating Personalized Emotional Support by Simulating the User's Internal World
- Authors: Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
- Abstract summary: EmoHarbor is an automated evaluation framework that simulates the user's inner world. It employs a Chain-of-Agent architecture that decomposes users' internal processes into three specialized roles. EmoHarbor provides a reproducible and scalable framework to guide the development and evaluation of more nuanced and user-aware emotional support systems.
- Score: 43.2336028953103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current evaluation paradigms for emotional support conversations tend to reward generic empathetic responses, yet they fail to assess whether the support is genuinely personalized to users' unique psychological profiles and contextual needs. We introduce EmoHarbor, an automated evaluation framework that adopts a User-as-a-Judge paradigm by simulating the user's inner world. EmoHarbor employs a Chain-of-Agent architecture that decomposes users' internal processes into three specialized roles, enabling agents to interact with supporters and complete assessments in a manner similar to human users. We instantiate this benchmark using 100 real-world user profiles that cover a diverse range of personality traits and situations, and define 10 evaluation dimensions of personalized support quality. Comprehensive evaluation of 20 advanced LLMs on EmoHarbor reveals a critical insight: while these models excel at generating empathetic responses, they consistently fail to tailor support to individual user contexts. This finding reframes the central challenge, shifting research focus from merely enhancing generic empathy to developing truly user-aware emotional support. EmoHarbor provides a reproducible and scalable framework to guide the development and evaluation of more nuanced and user-aware emotional support systems.
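The evaluation flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the three roles or the ten dimensions, so `first_role`, `second_role`, `third_role`, `dim_1`, and `dim_2` are hypothetical placeholders, and the stub functions stand in for what would presumably be LLM calls. It shows only the general shape: a supporter reply passes through a chain of role stages that share a simulated user state, and per-dimension ratings are then averaged across user profiles.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class UserState:
    """Simulated user: a profile plus notes accumulated by each role stage."""
    profile: str
    notes: List[str] = field(default_factory=list)


# A stage reads the shared state and the supporter's reply, and returns a note.
Stage = Callable[[UserState, str], str]


def run_chain(stages: List[Stage], profile: str, supporter_reply: str) -> UserState:
    """Run the stages in order; each stage can see the notes of earlier ones."""
    state = UserState(profile)
    for stage in stages:
        state.notes.append(stage(state, supporter_reply))
    return state


def average_by_dimension(ratings: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each dimension's rating across all simulated users."""
    dimensions = ratings[0].keys()
    return {d: mean(r[d] for r in ratings) for d in dimensions}


# Hypothetical stubs: the real roles are not specified in the abstract.
def first_role(state: UserState, reply: str) -> str:
    return f"role 1 saw a reply of {len(reply)} characters"


def second_role(state: UserState, reply: str) -> str:
    return f"role 2 builds on {len(state.notes)} earlier note(s)"


def third_role(state: UserState, reply: str) -> str:
    return f"role 3 builds on {len(state.notes)} earlier note(s)"
```

In a fuller version, the last stage would emit a rating dictionary per user profile, and `average_by_dimension` would aggregate those dictionaries over all profiles for each evaluated model.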
Related papers
- Detecting Emotional Dynamic Trajectories: An Evaluation Framework for Emotional Support in Language Models [6.810484095299127]
Emotional support is a core capability in human-AI interaction, with applications including psychological counseling, role play, and companionship. Existing evaluations of large language models (LLMs) often rely on short, static dialogues and fail to capture the dynamic and long-term nature of emotional support. Our framework constructs a large-scale benchmark consisting of 328 emotional contexts and 1,152 disturbance events, simulating realistic emotional shifts under evolving dialogue scenarios.
arXiv Detail & Related papers (2025-11-12T05:47:28Z)
- UserBench: An Interactive Gym Environment for User-Centric Agents [110.77212949007958]
Large Language Models (LLMs)-based agents have made impressive progress in reasoning and tool use, but their ability to proactively collaborate with users remains underexplored. We introduce UserBench, a user-centric benchmark designed to evaluate agents in multi-turn, preference-driven interactions.
arXiv Detail & Related papers (2025-07-29T17:34:12Z)
- IntentionESC: An Intention-Centered Framework for Enhancing Emotional Support in Dialogue Systems [74.0855067343594]
In emotional support conversations, unclear intentions can lead supporters to employ inappropriate strategies. We propose the Intention-centered Emotional Support Conversation framework. It defines the possible intentions of supporters, identifies key emotional state aspects for inferring these intentions, and maps them to appropriate support strategies.
arXiv Detail & Related papers (2025-06-06T10:14:49Z)
- From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment [27.301608019492043]
Large Language Models (LLMs) deliver generic and one-size-fits-all responses that fail to address users' specific needs. We propose a self-evolution framework designed to help LLMs improve their responses to better align with users' implicit preferences. Our method significantly enhances the model's performance in emotional support, reducing unhelpful responses and minimizing discrepancies between user preferences and model outputs.
arXiv Detail & Related papers (2025-05-22T12:45:12Z)
- Towards Empathetic Conversational Recommender Systems [77.53167131692]
We propose an empathetic conversational recommender (ECR) framework.
ECR contains two main modules: emotion-aware item recommendation and emotion-aligned response generation.
Our experiments on the ReDial dataset validate the efficacy of our framework in enhancing recommendation accuracy and improving user satisfaction.
arXiv Detail & Related papers (2024-08-30T15:43:07Z)
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- MISC: A MIxed Strategy-Aware Model Integrating COMET for Emotional Support Conversation [64.37111498077866]
We propose a novel model for emotional support conversation.
It infers the user's fine-grained emotional status, and then responds skillfully using a mixture of strategy.
Experimental results on the benchmark dataset demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-25T10:32:04Z)
- CEM: Commonsense-aware Empathetic Response Generation [31.956147246779423]
We propose a novel approach for empathetic response generation, which leverages commonsense to draw more information about the user's situation.
We evaluate our approach on EmpatheticDialogues, which is a widely-used benchmark dataset for empathetic response generation.
arXiv Detail & Related papers (2021-09-13T06:55:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.