Rank-O-ToM: Unlocking Emotional Nuance Ranking to Enhance Affective Theory-of-Mind
- URL: http://arxiv.org/abs/2503.16461v1
- Date: Mon, 24 Feb 2025 05:04:40 GMT
- Title: Rank-O-ToM: Unlocking Emotional Nuance Ranking to Enhance Affective Theory-of-Mind
- Authors: JiHyun Kim, JuneHyoung Kwon, MiHyeon Kim, Eunju Lee, YoungBin Kim
- Abstract summary: Ranking the Emotional Nuance for Theory of Mind (Rank-O-ToM) is a framework that leverages ordinal ranking to align confidence levels with the emotional spectrum. Rank-O-ToM enhances the nuanced understanding of emotions, advancing AI's ability to reason about affective states.
- Score: 7.098576908286076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial Expression Recognition (FER) plays a foundational role in enabling AI systems to interpret emotional nuances, a critical aspect of affective Theory of Mind (ToM). However, existing models often struggle with poor calibration and a limited capacity to capture emotional intensity and complexity. To address this, we propose Ranking the Emotional Nuance for Theory of Mind (Rank-O-ToM), a framework that leverages ordinal ranking to align confidence levels with the emotional spectrum. By incorporating synthetic samples reflecting diverse affective complexities, Rank-O-ToM enhances the nuanced understanding of emotions, advancing AI's ability to reason about affective states.
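The core mechanism, ordinal ranking to align confidence with emotional intensity, can be illustrated with a minimal sketch. This is an assumed pairwise hinge formulation for intuition only, not the authors' implementation; the function names and margin value are hypothetical.

```python
# Hedged sketch (not the paper's code): a pairwise ordinal ranking loss.
# Given confidence scores for two samples whose emotional intensity is
# known to be ordered (weak < strong), penalize cases where the weaker
# sample's confidence is not below the stronger one's by a margin.

def ordinal_ranking_loss(conf_strong, conf_weak, margin=0.1):
    """Hinge-style loss: zero when conf_strong >= conf_weak + margin."""
    return max(0.0, margin - (conf_strong - conf_weak))

def batch_loss(pairs, margin=0.1):
    # pairs: list of (conf_strong, conf_weak) tuples from an
    # intensity-ranked batch (e.g. including synthetic mixtures).
    return sum(ordinal_ranking_loss(s, w, margin) for s, w in pairs) / len(pairs)
```

Trained with such a ranking term, a classifier's confidence ordering is pushed to mirror the ordering of emotional intensity, which is one way to improve calibration across the emotional spectrum.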
Related papers
- EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models [62.3977734456669]
We propose Reflective Reinforcement Learning for Emotional Reasoning (EMO-R3), a framework designed to enhance the emotional reasoning ability of Multimodal Large Language Models (MLLMs). We introduce Structured Emotional Thinking to guide the model to perform step-by-step emotional reasoning in a structured and interpretable manner, and design a Reflective Emotional Reward that enables the model to re-evaluate its reasoning based on visual-text consistency and emotional coherence. EMO-R3 significantly improves both the interpretability and emotional intelligence of MLLMs, achieving superior performance across multiple visual emotional understanding benchmarks.
arXiv Detail & Related papers (2026-02-27T08:42:52Z) - Technosocial risks of ideal emotion recognition technologies: A defense of the (social) value of emotional expressions [51.56484100374058]
I argue that the appeal of such systems rests on a misunderstanding of the social functions of emotional expression. ERTs threaten this expressive space by collapsing epistemic friction, displacing meaning with technology-mediated affective profiles, and narrowing the space for aspirational and role-sensitive expressions. I argue that, although it is intuitive to think that increasing accuracy would legitimise such systems, in the case of ERTs accuracy does not straightforwardly justify their deployment, and may, in some contexts, provide a reason for regulatory restraint.
arXiv Detail & Related papers (2026-02-09T14:20:42Z) - Unveiling the Cognitive Compass: Theory-of-Mind-Guided Multimodal Emotion Reasoning [31.790359663851305]
Genuine affective intelligence requires explicit modeling of Theory of Mind (ToM), the cognitive substrate from which emotions arise. First, we introduce HitEmotion, a ToM-grounded hierarchical benchmark that diagnoses capability breakpoints across increasing levels of cognitive depth. Second, we propose a ToM-guided reasoning chain that tracks mental states and calibrates cross-modal evidence to achieve faithful emotional reasoning.
arXiv Detail & Related papers (2026-02-01T02:26:12Z) - Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier [53.55996102181836]
We propose the Emotional Rationale Verifier (ERV) and an Explanation Reward. Our method guides the model to produce reasoning that is explicitly consistent with the target emotion. We show that our approach not only enhances alignment between explanation and prediction but also empowers MLLMs to deliver emotionally coherent, trustworthy interactions.
arXiv Detail & Related papers (2025-10-27T16:40:17Z) - MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models [108.61337743051483]
We present MME-Emotion, a systematic benchmark that assesses both the emotional understanding and reasoning capabilities of MLLMs. MME-Emotion contains over 6,000 curated video clips with task-specific question-answering (QA) pairs, spanning broad scenarios to formulate eight emotional tasks. It incorporates a holistic evaluation suite with hybrid metrics for emotion recognition and reasoning, analyzed through a multi-agent system framework.
arXiv Detail & Related papers (2025-08-11T03:14:55Z) - Emergence of Hierarchical Emotion Organization in Large Language Models [25.806354070542678]
We find that large language models (LLMs) naturally form hierarchical emotion trees that align with human psychological models. We also uncover systematic biases in emotion recognition across socioeconomic personas, with compounding misclassifications for intersectional, underrepresented groups. Our results hint at the potential of using cognitively-grounded theories for developing better model evaluations.
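Hierarchical emotion trees of the kind described above can be approximated by agglomerative clustering over emotion embeddings. The sketch below is illustrative only; the embeddings, the distance function, and the average-linkage choice are assumptions, not the paper's method.

```python
# Hedged sketch: build a binary emotion tree by repeatedly merging the
# two clusters whose embeddings are closest under average linkage.
import math

def agglomerate(items):
    """items: {emotion_name: embedding}. Returns a nested-tuple tree."""
    # Each cluster tracks (subtree, member embeddings).
    clusters = {name: (name, [vec]) for name, vec in items.items()}
    while len(clusters) > 1:
        names = list(clusters)
        # Closest pair under average linkage (mean pairwise distance).
        a, b = min(
            ((x, y) for i, x in enumerate(names) for y in names[i + 1:]),
            key=lambda p: sum(math.dist(u, v)
                              for u in clusters[p[0]][1]
                              for v in clusters[p[1]][1])
                          / (len(clusters[p[0]][1]) * len(clusters[p[1]][1])),
        )
        tree_a, pts_a = clusters.pop(a)
        tree_b, pts_b = clusters.pop(b)
        clusters[a + "+" + b] = ((tree_a, tree_b), pts_a + pts_b)
    return next(iter(clusters.values()))[0]
```

Running this on embeddings of, say, "joy", "amusement", and "sadness" merges the two positive emotions before attaching "sadness", reproducing the kind of psychologically plausible grouping the paper reports.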
arXiv Detail & Related papers (2025-07-12T15:12:46Z) - Beyond Context to Cognitive Appraisal: Emotion Reasoning as a Theory of Mind Benchmark for Large Language Models [11.255011967393838]
This study advances beyond surface-level perceptual features to investigate how large language models (LLMs) reason about others' emotional states using contextual information. Grounded in Cognitive Appraisal Theory, we curate a specialized ToM evaluation dataset to assess both forward reasoning (from context to emotion) and backward reasoning (from emotion to inferred context).
arXiv Detail & Related papers (2025-05-31T01:18:04Z) - Narrative-Centered Emotional Reflection: Scaffolding Autonomous Emotional Literacy with AI [0.0]
Reflexion is an AI-powered platform designed to enable structured emotional self-reflection at scale.
The system scaffolds a progressive journey from surface-level emotional recognition toward value-aligned action planning.
arXiv Detail & Related papers (2025-04-29T01:24:46Z) - Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework for disentangling identity from emotion and cooperating emotions with similar characteristics.
First, we develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention.
Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks.
Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
arXiv Detail & Related papers (2025-04-25T05:28:21Z) - GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations [35.63053777817013]
GatedxLSTM is a novel multimodal Emotion Recognition in Conversation (ERC) model. It considers the voice and transcripts of both the speaker and their conversational partner to identify the most influential sentences driving emotional shifts. It achieves state-of-the-art (SOTA) performance among open-source methods in four-class emotion classification.
arXiv Detail & Related papers (2025-03-26T18:46:18Z) - From Rational Answers to Emotional Resonance: The Role of Controllable Emotion Generation in Language Models [16.350658746140788]
Large language models (LLMs) struggle to express emotions in a consistent, controllable, and contextually appropriate manner. We propose a controllable emotion generation framework based on Emotion Vectors (EVs). Our method enables fine-grained, continuous modulation of emotional tone without any additional training or architectural modification.
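One common way such steering vectors are realized, sketched here under assumed mechanics rather than the paper's actual code, is to take the difference between mean hidden activations on emotional versus neutral prompts and add it back at inference time with a tunable strength:

```python
# Hedged sketch: an "emotion vector" as a mean-difference direction in
# activation space, injected at inference time with a scalar strength.
# All names and the toy list-based vectors are illustrative.

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def emotion_vector(emotional_acts, neutral_acts):
    # Direction from "neutral" toward "emotional" activations.
    mu_e, mu_n = mean(emotional_acts), mean(neutral_acts)
    return [e - n for e, n in zip(mu_e, mu_n)]

def steer(hidden, ev, strength=1.0):
    # Added to a layer's hidden state at generation time; no training
    # or architectural change is required.
    return [h + strength * v for h, v in zip(hidden, ev)]
```

Varying `strength` continuously is what gives the fine-grained modulation of emotional tone the abstract describes.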
arXiv Detail & Related papers (2025-02-06T13:38:57Z) - ReflectDiffu:Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework [5.135349405469574]
We introduce ReflectDiffu, a lightweight framework for empathetic response generation.
It incorporates emotion contagion to augment emotional expressiveness and employs an emotion-reasoning mask to pinpoint critical emotional elements.
It adeptly translates emotional decision-making into precise intent actions, thereby addressing empathetic response misalignments stemming from emotional misrecognition.
arXiv Detail & Related papers (2024-09-16T13:56:17Z) - CTSM: Combining Trait and State Emotions for Empathetic Response Model [2.865464162057812]
Empathetic response generation endeavors to empower dialogue systems to perceive speakers' emotions and generate empathetic responses accordingly.
We propose Combining Trait and State emotions for Empathetic Response Model (CTSM).
To sufficiently perceive emotions in dialogue, we first construct and encode trait and state emotion embeddings.
We further enhance emotional perception capability through an emotion guidance module that guides emotion representation.
arXiv Detail & Related papers (2024-03-22T10:45:13Z) - Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z) - Rational Sensibility: LLM Enhanced Empathetic Response Generation Guided by Self-presentation Theory [8.439724621886779]
The development of Large Language Models (LLMs) provides human-centered Artificial General Intelligence (AGI) with a glimmer of hope.
Empathy serves as a key emotional attribute of humanity, playing an irreplaceable role in human-centered AGI.
In this paper, we design an innovative encoder module inspired by self-presentation theory in sociology, which specifically processes sensibility and rationality sentences in dialogues.
arXiv Detail & Related papers (2023-12-14T07:38:12Z) - Emotion Intensity and its Control for Emotional Voice Conversion [77.05097999561298]
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while preserving the linguistic content and speaker identity.
In this paper, we aim to explicitly characterize and control the intensity of emotion.
We propose to disentangle the speaker style from linguistic content and encode the speaker style into a style embedding in a continuous space that forms the prototype of emotion embedding.
arXiv Detail & Related papers (2022-01-10T02:11:25Z) - Emotions as abstract evaluation criteria in biological and artificial intelligences [0.0]
We propose a framework which mimics emotions on a functional level.
Based on time allocation via emotional stationarity (TAES), emotions are implemented as abstract criteria.
The long-term goal of the agent, to align experience with character, is achieved by optimizing the frequency for selecting individual tasks.
arXiv Detail & Related papers (2021-11-30T10:49:04Z) - Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER)
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z) - A Circular-Structured Representation for Visual Emotion Distribution Learning [82.89776298753661]
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning.
To be specific, we first construct an Emotion Circle to unify any emotional state within it.
On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
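As a rough sketch of a circular emotion representation: assign each emotion category an angle on a unit circle and summarize a distribution over categories as a single resultant vector. The angle assignments and the two-attribute (angle, length) summary below are illustrative assumptions, not the paper's exact three-attribute definition.

```python
# Hedged sketch: a distribution over emotion categories collapsed to a
# vector on an emotion circle. Direction indicates the dominant tone;
# length indicates how concentrated the distribution is.
import math

ANGLES = {  # degrees; hypothetical placement around the circle
    "joy": 0, "surprise": 60, "fear": 120,
    "anger": 180, "disgust": 240, "sadness": 300,
}

def to_circle_vector(distribution):
    """distribution: {emotion: probability}; returns (angle_deg, length)."""
    x = sum(p * math.cos(math.radians(ANGLES[e])) for e, p in distribution.items())
    y = sum(p * math.sin(math.radians(ANGLES[e])) for e, p in distribution.items())
    return math.degrees(math.atan2(y, x)) % 360, math.hypot(x, y)
```

A peaked distribution yields a long vector at its category's angle, while a distribution split between opposing emotions (e.g. joy and anger) largely cancels, giving a short vector.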
arXiv Detail & Related papers (2021-06-23T14:53:27Z) - Emotion-aware Chat Machine: Automatic Emotional Response Generation for Human-like Emotional Interaction [55.47134146639492]
This article proposes a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
Experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art methods in terms of both content coherence and emotion appropriateness.
arXiv Detail & Related papers (2021-06-06T06:26:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.