Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue
- URL: http://arxiv.org/abs/2601.18281v1
- Date: Mon, 26 Jan 2026 09:04:50 GMT
- Title: Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue
- Authors: Yuhang Jia, Pei Liu, Haoqin Sun, Jiaming Zhou, Xuxin Cheng, Cao Liu, Ke Zeng, Xunliang Cai, Yong Qin
- Abstract summary: We introduce EmpathyEval, a descriptive natural-language-based evaluation model for assessing empathetic quality in spoken dialogues. We propose ReEmpathy, an end-to-end Spoken Language Model that enhances empathetic dialogue through a novel Empathetic Self-Reflective Alternating Inference mechanism.
- Score: 53.95386201009769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end Spoken Language Models (SLMs) hold great potential for paralinguistic perception, and numerous studies have aimed to enhance their capabilities, particularly for empathetic dialogue. However, current approaches largely depend on rigid supervised signals, such as ground-truth responses in supervised fine-tuning or preference scores in reinforcement learning. Such reliance is fundamentally limited for modeling complex empathy, as there is no single "correct" response and a simple numerical score cannot fully capture the nuances of emotional expression or the appropriateness of empathetic behavior. To address these limitations, we first introduce EmpathyEval, a descriptive natural-language-based evaluation model for assessing empathetic quality in spoken dialogues. Building upon EmpathyEval, we propose ReEmpathy, an end-to-end SLM that enhances empathetic dialogue through a novel Empathetic Self-Reflective Alternating Inference mechanism, which interleaves spoken response generation with free-form, empathy-related reflective reasoning. Extensive experiments demonstrate that ReEmpathy substantially improves empathy-sensitive spoken dialogue by enabling reflective reasoning, offering a promising approach toward more emotionally intelligent and empathy-aware human-computer interactions.
Related papers
- A Unified Spoken Language Model with Injected Emotional-Attribution Thinking for Human-like Interaction [50.05919688888947]
This paper presents a unified spoken language model for emotional intelligence, enhanced by a novel data construction strategy termed Injected Emotional-Attribution Thinking (IEAT). IEAT incorporates user emotional states and their underlying causes into the model's internal reasoning process, enabling emotion-aware reasoning to be internalized rather than treated as explicit supervision. Experiments on the Human-like Spoken Dialogue Systems Challenge (HumDial) Emotional Intelligence benchmark demonstrate that the proposed approach achieves top-ranked performance across emotional trajectory modeling, emotional reasoning, and empathetic response generation.
arXiv Detail & Related papers (2026-01-08T14:07:30Z)
- COMPEER: Controllable Empathetic Reinforcement Reasoning for Emotional Support Conversation [47.0476311232988]
We propose controllable empathetic reasoning, which combines natural language reasoning with structured psychological steps. We employ reinforcement learning with a unified process-outcome reward model that delivers precise feedback. Our approach significantly improves the model's emotional support ability, advancing the development of empathetic, human-like support systems.
arXiv Detail & Related papers (2025-08-13T06:09:32Z)
- APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation [71.26755736617478]
Empathetic response generation is designed to comprehend the emotions of others.
We develop a framework that combines retrieval augmentation and emotional support strategy integration.
Our framework can enhance the empathy ability of LLMs from both cognitive and affective empathy perspectives.
arXiv Detail & Related papers (2024-07-23T02:23:37Z)
- Use of a Taxonomy of Empathetic Response Intents to Control and Interpret Empathy in Neural Chatbots [4.264192013842096]
A recent trend in the domain of open-domain conversational agents is enabling them to converse empathetically to emotional prompts.
Current approaches either follow an end-to-end approach or condition the responses on similar emotion labels to generate empathetic responses.
We propose several rule-based and neural approaches to predict the next response's emotion/intent and generate responses conditioned on these predicted emotions/intents.
arXiv Detail & Related papers (2023-05-17T10:03:03Z)
- CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation [59.8935454665427]
Empathetic dialogue models usually consider only the affective aspect or treat cognition and affection in isolation.
We propose the CASE model for empathetic dialogue generation.
arXiv Detail & Related papers (2022-08-18T14:28:38Z)
- Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes [50.569762345799354]
We argue that two issues must be tackled at the same time: (i) identifying which word is the cause for the other's emotion from his or her utterance and (ii) reflecting those specific words in the response generation.
Taking inspiration from social cognition, we leverage a generative estimator to infer emotion cause words from utterances with no word-level label.
arXiv Detail & Related papers (2021-09-18T04:22:49Z)
- CEM: Commonsense-aware Empathetic Response Generation [31.956147246779423]
We propose a novel approach for empathetic response generation, which leverages commonsense to draw more information about the user's situation.
We evaluate our approach on EmpatheticDialogues, a widely used benchmark dataset for empathetic response generation.
arXiv Detail & Related papers (2021-09-13T06:55:14Z)
- Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication [88.52901763928045]
We propose an approach that relies on exemplars to cue the generative model on fine stylistic properties that signal empathy to the interlocutor.
We empirically show that these approaches yield significant improvements in empathetic response quality in terms of both automated and human-evaluated metrics.
arXiv Detail & Related papers (2021-06-22T14:02:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.