UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause
- URL: http://arxiv.org/abs/2404.00403v1
- Date: Sat, 30 Mar 2024 15:59:17 GMT
- Title: UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause
- Authors: Guimin Hu, Zhihong Zhu, Daniel Hershcovich, Hasti Seifi, Jiayuan Xie,
- Abstract summary: Multimodal emotion recognition in conversation (MERC) and multimodal emotion-cause pair extraction (MECPE) have recently garnered significant attention.
We propose a Unified Multimodal Emotion recognition and Emotion-Cause analysis framework (UniMEEC) to explore the causality and complementarity between emotion and emotion cause.
UniMEEC reformulates the MERC and MECPE tasks as two mask prediction problems, enhancing the interaction between emotion and cause.
- Score: 18.84743213557238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal emotion recognition in conversation (MERC) and multimodal emotion-cause pair extraction (MECPE) have recently garnered significant attention. Emotions are the expression of affect or feelings; responses to specific events, thoughts, or situations are known as emotion causes. The two are like two sides of a coin, collectively describing human behaviors and intents. However, most existing works treat MERC and MECPE as separate tasks, which may pose challenges for integrating emotion and cause in real-world applications. In this paper, we propose a Unified Multimodal Emotion recognition and Emotion-Cause analysis framework (UniMEEC) to explore the causality and complementarity between emotion and emotion cause. Concretely, UniMEEC reformulates the MERC and MECPE tasks as two mask prediction problems, enhancing the interaction between emotion and cause. Meanwhile, UniMEEC shares prompt learning among modalities to probe modality-specific knowledge from the pre-trained model. Furthermore, we propose a task-specific hierarchical context aggregation to control the flow of information to each task. Experimental results on four public benchmark datasets verify the model's performance on the MERC and MECPE tasks, showing consistent improvements over state-of-the-art methods.
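The mask-prediction reformulation described in the abstract can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual implementation: the prompt templates, label set, and function names below are hypothetical, meant only to show how an emotion-classification task and a cause-pairing task can both be cast as filling a [MASK] slot.

```python
# Hypothetical sketch of casting MERC and MECPE as mask prediction.
# Templates and labels are illustrative assumptions, not taken from UniMEEC.

EMOTION_LABELS = ["happy", "sad", "angry", "surprised", "neutral"]


def emotion_prompt(utterance: str) -> str:
    """MERC as mask prediction: a model fills [MASK] with an emotion label."""
    return f"{utterance} The speaker feels [MASK]."


def cause_prompt(target_utterance: str, candidate_cause: str) -> str:
    """MECPE as mask prediction: a model fills [MASK] with yes/no,
    deciding whether the candidate utterance causes the target emotion."""
    return (
        f"{target_utterance} Is the utterance '{candidate_cause}' "
        f"the cause of this emotion? [MASK]."
    )


# Both tasks now share a single prediction interface: fill the [MASK] slot.
print(emotion_prompt("I finally got the job!"))
print(cause_prompt("I finally got the job!", "The interview went well"))
```

Framing both tasks around the same masked slot is what lets a single pre-trained masked language model serve both, which is the kind of interaction between emotion and cause that the unified formulation aims to enable.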
Related papers
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models [61.179731667080326]
Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks.
But their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored.
EmoLLM is a novel model for multimodal emotional understanding, incorporating two core techniques.
arXiv Detail & Related papers (2024-06-24T08:33:02Z) - Think out Loud: Emotion Deducing Explanation in Dialogues [57.90554323226896]
We propose a new task "Emotion Deducing Explanation in Dialogues" (EDEN)
EDEN recognizes emotion and causes in an explicitly thinking way.
It can help Large Language Models (LLMs) achieve better recognition of emotions and causes.
arXiv Detail & Related papers (2024-06-07T08:58:29Z) - SemEval-2024 Task 3: Multimodal Emotion Cause Analysis in Conversations [53.60993109543582]
SemEval-2024 Task 3, named Multimodal Emotion Cause Analysis in Conversations, aims at extracting all pairs of emotions and their corresponding causes from conversations.
Under different modality settings, it consists of two subtasks: Textual Emotion-Cause Pair Extraction in Conversations (TECPE) and Multimodal Emotion-Cause Pair Extraction in Conversations (MECPE).
In this paper, we introduce the task, dataset and evaluation settings, summarize the systems of the top teams, and discuss the findings of the participants.
arXiv Detail & Related papers (2024-05-19T09:59:00Z) - ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
Causal Emotion Entailment (CEE) aims to identify the causal utterances in a conversation that stimulate the emotions expressed in a target utterance.
Current works in CEE mainly focus on modeling semantic and emotional interactions in conversations.
We introduce a step-by-step reasoning method, Emotion-Cause Reasoning Chain (ECR-Chain), to infer the stimulus from the target emotional expressions in conversations.
arXiv Detail & Related papers (2024-05-17T15:45:08Z) - Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [53.1230874584344]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z) - UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition [32.34485263348587]
Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors.
We propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models.
We perform modality fusion at the syntactic and semantic levels and introduce contrastive learning between modalities and samples to better capture the difference and consistency between sentiments and emotions.
arXiv Detail & Related papers (2022-11-21T08:46:01Z) - A Multi-turn Machine Reading Comprehension Framework with Rethink Mechanism for Emotion-Cause Pair Extraction [6.6564045064972825]
Emotion-cause pair extraction (ECPE) is an emerging task in emotion cause analysis.
We propose a Multi-turn MRC framework with Rethink mechanism (MM-R) to tackle the ECPE task.
Our framework can model complicated relations between emotions and causes while avoiding generating the pairing matrix.
arXiv Detail & Related papers (2022-09-16T14:38:58Z) - Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts [2.443125107575822]
Emotion Recognition in Conversations (ERC) is an important and active research problem.
Recent work has shown the benefits of using multiple modalities for the ERC task.
We propose a multimodal ERC model and augment it with an emotion-shift component.
arXiv Detail & Related papers (2021-12-03T14:39:04Z) - Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER).
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information (including all listed content) and is not responsible for any consequences of its use.