Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects
- URL: http://arxiv.org/abs/2505.20511v1
- Date: Mon, 26 May 2025 20:23:24 GMT
- Title: Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects
- Authors: Chengyan Wu, Yiqiang Cai, Yang Liu, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, Bolei Ma
- Abstract summary: Multimodal Emotion Recognition in Conversations is a direction for enhancing the naturalness and emotional understanding of human-computer interaction. Its goal is to accurately recognize emotions by integrating information from various modalities such as text, speech, and visual signals.
- Score: 7.505690224453812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While text-based emotion recognition methods have achieved notable success, real-world dialogue systems often demand a more nuanced emotional understanding than any single modality can offer. Multimodal Emotion Recognition in Conversations (MERC) has thus emerged as a crucial direction for enhancing the naturalness and emotional understanding of human-computer interaction. Its goal is to accurately recognize emotions by integrating information from various modalities such as text, speech, and visual signals. This survey offers a systematic overview of MERC, including its motivations, core tasks, representative methods, and evaluation strategies. We further examine recent trends, highlight key challenges, and outline future directions. As interest in emotionally intelligent systems grows, this survey provides timely guidance for advancing MERC research.
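To make the integration idea concrete, below is a minimal, illustrative sketch (not from the survey) of feature-level fusion for per-utterance emotion classification. The module names, feature dimensions, and seven-class label set are assumptions chosen for the example.

```python
# Minimal, illustrative sketch of feature-level (early) fusion for MERC.
# All module names and dimensions are hypothetical, not from the survey.
import torch
import torch.nn as nn

class EarlyFusionMERC(nn.Module):
    def __init__(self, d_text=768, d_audio=128, d_visual=512, n_emotions=7):
        super().__init__()
        # Project each modality into a shared space, then classify the concatenation.
        self.proj_text = nn.Linear(d_text, 256)
        self.proj_audio = nn.Linear(d_audio, 256)
        self.proj_visual = nn.Linear(d_visual, 256)
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(3 * 256, n_emotions),
        )

    def forward(self, text_feat, audio_feat, visual_feat):
        fused = torch.cat([
            self.proj_text(text_feat),
            self.proj_audio(audio_feat),
            self.proj_visual(visual_feat),
        ], dim=-1)
        return self.classifier(fused)  # per-utterance emotion logits

# Usage: one utterance with precomputed per-modality features.
model = EarlyFusionMERC()
logits = model(torch.randn(1, 768), torch.randn(1, 128), torch.randn(1, 512))
```

Fusion can also happen at the decision level instead, as in the late-fusion entry further below.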
Related papers
- Bridging Cognition and Emotion: Empathy-Driven Multimodal Misinformation Detection [56.644686934050576]
Social media has become a major conduit for information dissemination, yet it also facilitates the rapid spread of misinformation. Traditional misinformation detection methods primarily focus on surface-level features, overlooking the crucial role of human empathy in the propagation process. We propose the Dual-Aspect Empathy Framework (DAE), which integrates cognitive and emotional empathy to analyze misinformation from both the creator and reader perspectives.
arXiv Detail & Related papers (2025-04-24T07:48:26Z)
- In-Depth Analysis of Emotion Recognition through Knowledge-Based Large Language Models [3.8153944233011385]
This paper contributes to the emerging field of context-based emotion recognition.
We propose an approach that combines emotion recognition methods with Bayesian Cue Integration.
We test this approach in the context of interpreting facial expressions during a social task, the prisoner's dilemma; a sketch of the cue-integration step follows this entry.
arXiv Detail & Related papers (2024-07-17T06:39:51Z)
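The entry above combines emotion recognition outputs with Bayesian Cue Integration. One standard formulation treats each cue as an independent Gaussian estimate and combines them by precision weighting; the sketch below assumes that formulation, and the cue names and values are invented for illustration.

```python
# Hypothetical sketch of Bayesian cue integration over Gaussian cues:
# each cue gives an estimate of an affect dimension (e.g., valence)
# with its own uncertainty; cues are combined by precision weighting.
import numpy as np

def integrate_cues(means, variances):
    """Combine independent Gaussian cues into one posterior estimate."""
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    posterior_var = 1.0 / precisions.sum()
    posterior_mean = posterior_var * (precisions * means).sum()
    return posterior_mean, posterior_var

# Example: a noisy facial-expression cue and a sharper game-context cue.
mean, var = integrate_cues(means=[0.7, 0.2], variances=[0.30, 0.05])
print(f"integrated valence ~ {mean:.2f} (variance {var:.3f})")
```

The more certain cue dominates: with variances 0.30 and 0.05, the context cue receives roughly six times the weight of the facial cue.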
- Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks; a prompt-style sketch follows this entry.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
- From Multilingual Complexity to Emotional Clarity: Leveraging Commonsense to Unveil Emotions in Code-Mixed Dialogues [38.87497808740538]
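ECoT is a prompting strategy; the paper's exact prompt steps are not reproduced here, so the sketch below only illustrates the general chain-of-thought-for-emotion idea, with step wording invented for the example.

```python
# Illustrative sketch of a chain-of-thought style prompt for emotional
# generation, loosely inspired by the ECoT idea; the step wording is
# invented here, not taken from the paper.
ECOT_STYLE_PROMPT = """\
You will reply to the user with an emotionally appropriate response.
Reason step by step before answering:
1. Identify the user's expressed emotion and its likely cause.
2. Decide which emotion your reply should convey.
3. Draft a reply that is consistent with that emotion.
Finally, output only the reply.

User message: {message}
"""

def build_prompt(message: str) -> str:
    return ECOT_STYLE_PROMPT.format(message=message)

print(build_prompt("I failed my driving test again today."))
```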
Understanding emotions during conversation is a fundamental aspect of human communication, driving NLP research for Emotion Recognition in Conversation (ERC).
We propose an innovative approach that integrates commonsense information with dialogue context to facilitate a deeper understanding of emotions.
Our comprehensive experimentation showcases the substantial performance improvement obtained through the systematic incorporation of commonsense in ERC.
arXiv Detail & Related papers (2023-10-19T18:17:00Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities; a late-fusion sketch follows this entry.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
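A minimal sketch of the decision-level (late) fusion described above: each modality's model produces class scores, and the per-class probabilities are combined. The class count, weights, and random logits are placeholders; the actual models (a speaker-recognition-pretrained speech encoder and a BERT-based text encoder) are not reproduced here.

```python
# Minimal sketch of decision-level (late) fusion: each modality is scored
# by its own fine-tuned model and the per-class probabilities are averaged.
# Weights and the 4-class setup are invented for illustration.
import torch
import torch.nn.functional as F

def late_fusion(speech_logits, text_logits, w_speech=0.5, w_text=0.5):
    """Weighted average of per-modality class probabilities."""
    p_speech = F.softmax(speech_logits, dim=-1)
    p_text = F.softmax(text_logits, dim=-1)
    return w_speech * p_speech + w_text * p_text

# Example with random logits standing in for two modality-specific models.
fused = late_fusion(torch.randn(1, 4), torch.randn(1, 4))
print(fused, fused.argmax(dim=-1))
```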
- Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER).
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z)
- Target Guided Emotion Aware Chat Machine [58.8346820846765]
The consistency of a response to a given post at both the semantic and emotional levels is essential for a dialogue system to deliver human-like interactions.
This article proposes a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
arXiv Detail & Related papers (2020-11-15T01:55:37Z)
- Knowledge Bridging for Empathetic Dialogue Generation [52.39868458154947]
Lack of external knowledge makes it difficult for empathetic dialogue systems to perceive implicit emotions and to learn emotional interactions from limited dialogue history.
We propose to leverage external knowledge, including commonsense knowledge and emotional lexical knowledge, to explicitly understand and express emotions in empathetic dialogue generation.
arXiv Detail & Related papers (2020-09-21T09:21:52Z)
- Temporal aggregation of audio-visual modalities for emotion recognition [0.5352699766206808]
We propose a multimodal fusion technique for emotion recognition based on combining audio-visual modalities from a temporal window with different temporal offsets for each modality.
Our proposed method outperforms other methods from the literature as well as human accuracy ratings; a sketch of the windowing step follows this entry.
arXiv Detail & Related papers (2020-07-08T18:44:15Z)
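A rough sketch of the temporal-window idea above: frame-level features from each modality are pooled over a window, with the audio window shifted by a per-modality offset before the pooled vectors are concatenated. Window sizes, offsets, and feature dimensions are invented for illustration.

```python
# Illustrative sketch of temporal aggregation with per-modality offsets:
# frame-level features are mean-pooled over a window, with the audio window
# shifted relative to the visual one. All sizes/offsets are invented.
import numpy as np

def aggregate(features, start, window):
    """Mean-pool frame features over [start, start + window)."""
    return features[start:start + window].mean(axis=0)

T, d_audio, d_visual = 100, 40, 64
audio = np.random.randn(T, d_audio)    # frame-level audio features
visual = np.random.randn(T, d_visual)  # frame-level visual features

t, window = 30, 16
audio_offset, visual_offset = 4, 0     # per-modality temporal offsets
fused = np.concatenate([
    aggregate(audio, t + audio_offset, window),
    aggregate(visual, t + visual_offset, window),
])
print(fused.shape)  # (104,) joint representation for this window
```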