EmoLLM: Multimodal Emotional Understanding Meets Large Language Models
- URL: http://arxiv.org/abs/2406.16442v2
- Date: Sat, 29 Jun 2024 14:32:46 GMT
- Title: EmoLLM: Multimodal Emotional Understanding Meets Large Language Models
- Authors: Qu Yang, Mang Ye, Bo Du
- Abstract summary: Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks.
But their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored.
EmoLLM is a novel model for multimodal emotional understanding that incorporates two core techniques.
- Score: 61.179731667080326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks, but their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored. This gap impedes their ability to effectively understand and react to the intricate emotions that humans express through multimodal media. To bridge it, we introduce EmoBench, the first comprehensive benchmark designed specifically to evaluate the emotional capabilities of MLLMs across five popular emotional tasks, using a diverse dataset of 287k images and videos paired with corresponding textual instructions. We also propose EmoLLM, a novel model for multimodal emotional understanding that incorporates two core techniques: 1) Multi-perspective Visual Projection, which captures diverse emotional cues in visual data from multiple perspectives, and 2) EmoPrompt, which guides MLLMs to reason about emotions in the correct direction. Experimental results demonstrate that EmoLLM significantly elevates multimodal emotional understanding performance, with an average improvement of 12.1% across multiple foundation models on EmoBench. Our work contributes to the advancement of MLLMs by facilitating a deeper and more nuanced comprehension of intricate human emotions, paving the way for artificial emotional intelligence capabilities with wide-ranging applications in areas such as human-computer interaction, mental health support, and empathetic AI systems. Code, data, and the model will be released.
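The abstract describes the two techniques only at a high level. As a purely illustrative aid, the Python sketch below shows one plausible way a multi-perspective visual projection and an emotion-guided prompt could be wired together. Every name here (MultiPerspectiveProjector, build_emoprompt, the dimensions, the number of perspectives, and the prompt wording) is an assumption for illustration, not the authors' implementation.

    # Illustrative sketch only: EmoLLM's actual architecture is not given in this
    # abstract. All module names, dimensions, and prompt wording are assumptions.
    import torch
    import torch.nn as nn

    class MultiPerspectiveProjector(nn.Module):
        """Projects visual features into the LLM embedding space through several
        learned 'perspectives', one linear head per perspective (assumed design)."""

        def __init__(self, vis_dim: int = 1024, llm_dim: int = 4096,
                     num_perspectives: int = 4):
            super().__init__()
            self.heads = nn.ModuleList(
                nn.Linear(vis_dim, llm_dim) for _ in range(num_perspectives)
            )

        def forward(self, vis_feats: torch.Tensor) -> torch.Tensor:
            # vis_feats: (batch, num_patches, vis_dim). Each head produces one
            # "view" of the same features; views are concatenated along the
            # token axis so the LLM can attend over all of them.
            views = [head(vis_feats) for head in self.heads]
            return torch.cat(views, dim=1)  # (batch, num_perspectives * num_patches, llm_dim)

    def build_emoprompt(task: str) -> str:
        """A hypothetical EmoPrompt-style instruction that steers the model to
        reason about emotional cues step by step before answering."""
        return (
            f"Task: {task}\n"
            "First describe the facial expressions, body language, and scene "
            "context in the image, then infer the most likely emotion and "
            "explain why."
        )

    if __name__ == "__main__":
        projector = MultiPerspectiveProjector()
        dummy_feats = torch.randn(1, 256, 1024)  # stand-in for a vision encoder's output
        print(projector(dummy_feats).shape)      # torch.Size([1, 1024, 4096])
        print(build_emoprompt("Classify the emotion expressed in this video frame."))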
Related papers
- Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning [55.127202990679976]
We introduce the MERR dataset, containing 28,618 coarse-grained and 4,487 fine-grained annotated samples across diverse emotional categories.
This dataset enables models to learn from varied scenarios and generalize to real-world applications.
We propose Emotion-LLaMA, a model that seamlessly integrates audio, visual, and textual inputs through emotion-specific encoders.
arXiv Detail & Related papers (2024-06-17T03:01:22Z)
- EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model [22.292581935835678]
We construct a dataset for Emotion Analysis in Long-sequential and De-identity videos called EALD.
We also provide the Non-Facial Body Language (NFBL) annotations for each player.
NFBL is an inner-driven emotional expression and can serve as an identity-free clue to understanding the emotional state.
arXiv Detail & Related papers (2024-05-01T15:25:54Z)
- UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause [18.84743213557238]
Multimodal emotion recognition in conversation (MERC) and multimodal emotion-cause pair extraction (MECPE) have recently garnered significant attention.
We propose a Unified Multimodal Emotion recognition and Emotion-Cause analysis framework (UniMEEC) to explore the causality and complementarity between emotion and emotion cause.
UniMEEC reformulates the MERC and MECPE tasks as two mask prediction problems, enhancing the interaction between emotion and cause.
arXiv Detail & Related papers (2024-03-30T15:59:17Z)
- Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [53.1230874584344]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
- The Good, The Bad, and Why: Unveiling Emotions in Generative AI [73.94035652867618]
We show that EmotionPrompt can boost the performance of AI models while EmotionAttack can hinder it.
EmotionDecode reveals that AI models can comprehend emotional stimuli akin to the mechanism of dopamine in the human brain.
arXiv Detail & Related papers (2023-12-18T11:19:45Z)
- Large Language Models Understand and Can be Enhanced by Emotional Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z)
- Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER).
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z)
- HEU Emotion: A Large-scale Database for Multi-modal Emotion Recognition in the Wild [0.0]
We release a new natural-state video database called HEU Emotion.
HEU Emotion contains a total of 19,004 video clips, divided into two parts according to the data source.
The recognition accuracies for the two parts increased by 2.19% and 4.01% respectively over those of single-modal facial expression recognition.
arXiv Detail & Related papers (2020-07-24T13:36:52Z)