Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning
- URL: http://arxiv.org/abs/2512.01282v2
- Date: Tue, 02 Dec 2025 08:43:56 GMT
- Title: Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning
- Authors: Jiahao Yuan, Zhiqing Cui, Hanqing Wang, Yuansheng Gao, Yucheng Zhou, Usman Naseem,
- Abstract summary: KardiaBench is a large-scale user-grounded benchmark comprising 178,080 QA pairs across 22,080 conversations anchored to 671 real-world profiles.<n>Kardia-R1 is a framework that trains models for interpretable, stepwise empathetic cognition.<n>Our dataset and model will be released at https://github.com/JhCircle/Kardia-R1.
- Score: 20.717092979679553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As web platforms evolve towards greater personalization and emotional complexity, conversational agents must transcend superficial empathy to demonstrate identity-aware emotional reasoning. However, existing systems face two limitations: (1) reliance on situation-centric datasets lacking persistent user identity, which hampers the capture of personalized affective nuances; and (2) dependence on opaque, coarse reward signals that hinder development of verifiable empathetic reasoning. To address these gaps, we introduce KardiaBench, a large-scale user-grounded benchmark comprising 178,080 QA pairs across 22,080 multi-turn conversations anchored to 671 real-world profiles. The dataset is constructed via a model-in-the-loop pipeline with iterative rubric-guided refinement to ensure psychological plausibility and persona consistency. This progressive empathy pipeline that integrates user comprehension, contextual reasoning, and emotion perception into conversations, followed by iterative critique and rubric-based refinement to ensure psychological plausibility, emotional fidelity, and persona consistency. Building on this, we propose Kardia-R1, a framework that trains models for interpretable, stepwise empathetic cognition. Kardia-R1 leverages Rubric-as-Judge Empathetic Reinforcement Learning (Rubric-ERL), a GRPO-based method that uses explainable, human-aligned rubric rewards to tightly couple user understanding, emotional inference, and supportive response generation. Extensive experiments across four LLM backbones demonstrate that Kardia-R1 consistently outperforms othet methods in emotion accuracy, empathy, relevance, persona consistency, and safety. Our dataset and model will be released at https://github.com/JhCircle/Kardia-R1.
Related papers
- PERM: Psychology-grounded Empathetic Reward Modeling for Large Language Models [45.377102925731826]
Large Language Models (LLMs) are increasingly deployed in human-centric applications, yet they often fail to provide substantive emotional support.<n>We propose Psychology-grounded Empathetic Reward Modeling (PERM) to address this limitation.
arXiv Detail & Related papers (2026-01-15T15:56:55Z) - Empathy-R1: A Chain-of-Empathy and Reinforcement Learning Framework for Long-Form Mental Health Support [17.95060134327437]
We introduce Empathy-R1, a novel framework that integrates a Chain-of-Empathy (CoE) reasoning process with Reinforcement Learning (RL)<n>Inspired by cognitive-behavioral therapy, our CoE paradigm guides the model to sequentially reason about a help-seeker's emotions, causes, and intentions.<n>Our framework is empowered by a new large-scale Chinese dataset, Empathy-QA, and a two-stage training process.
arXiv Detail & Related papers (2025-09-18T11:16:09Z) - MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems [17.381122321801556]
We introduce MetaMind, a multi-agent framework inspired by psychological theories of metacognition.<n>Our framework achieves state-of-the-art performance across three challenging benchmarks, with 35.7% improvement in real-world social scenarios.<n>This work advances AI systems toward human-like social intelligence, with applications in empathetic dialogue and culturally sensitive interactions.
arXiv Detail & Related papers (2025-05-25T02:32:57Z) - Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models [75.85319609088354]
Sentient Agent as a Judge (SAGE) is an evaluation framework for large language models.<n>SAGE instantiates a Sentient Agent that simulates human-like emotional changes and inner thoughts during interaction.<n>SAGE provides a principled, scalable and interpretable tool for tracking progress toward genuinely empathetic and socially adept language agents.
arXiv Detail & Related papers (2025-05-01T19:06:10Z) - APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation [71.26755736617478]
Empathetic response generation is designed to comprehend the emotions of others.
We develop a framework that combines retrieval augmentation and emotional support strategy integration.
Our framework can enhance the empathy ability of LLMs from both cognitive and affective empathy perspectives.
arXiv Detail & Related papers (2024-07-23T02:23:37Z) - CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic
Response Generation [59.8935454665427]
Empathetic dialogue models usually consider only the affective aspect or treat cognition and affection in isolation.
We propose the CASE model for empathetic dialogue generation.
arXiv Detail & Related papers (2022-08-18T14:28:38Z) - MISC: A MIxed Strategy-Aware Model Integrating COMET for Emotional
Support Conversation [64.37111498077866]
We propose a novel model for emotional support conversation.
It infers the user's fine-grained emotional status, and then responds skillfully using a mixture of strategy.
Experimental results on the benchmark dataset demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-25T10:32:04Z) - CEM: Commonsense-aware Empathetic Response Generation [31.956147246779423]
We propose a novel approach for empathetic response generation, which leverages commonsense to draw more information about the user's situation.
We evaluate our approach on EmpatheticDialogues, which is a widely-used benchmark dataset for empathetic response generation.
arXiv Detail & Related papers (2021-09-13T06:55:14Z) - Exemplars-guided Empathetic Response Generation Controlled by the
Elements of Human Communication [88.52901763928045]
We propose an approach that relies on exemplars to cue the generative model on fine stylistic properties that signal empathy to the interlocutor.
We empirically show that these approaches yield significant improvements in empathetic response quality in terms of both automated and human-evaluated metrics.
arXiv Detail & Related papers (2021-06-22T14:02:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.