M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database
- URL: http://arxiv.org/abs/2205.10237v1
- Date: Mon, 9 May 2022 06:52:51 GMT
- Title: M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database
- Authors: Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao
Wang, Haizhou Li
- Abstract summary: We propose a Multi-modal Multi-scene Multi-label Emotional Dialogue dataset, M3ED.
M3ED contains 990 dyadic emotional dialogues from 56 different TV series, a total of 9,082 turns and 24,449 utterances.
To the best of our knowledge, M3ED is the first multimodal emotional dialogue dataset in Chinese.
- Score: 139.08528216461502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emotional state of a speaker can be influenced by many different factors
in dialogues, such as dialogue scene, dialogue topic, and interlocutor
stimulus. The currently available data resources to support such multimodal
affective analysis in dialogues are however limited in scale and diversity. In
this work, we propose a Multi-modal Multi-scene Multi-label Emotional Dialogue
dataset, M3ED, which contains 990 dyadic emotional dialogues from 56 different
TV series, a total of 9,082 turns and 24,449 utterances. M3ED is annotated
with 7 emotion categories (happy, surprise, sad, disgust, anger, fear, and
neutral) at utterance level, and encompasses acoustic, visual, and textual
modalities. To the best of our knowledge, M3ED is the first multimodal
emotional dialogue dataset in Chinese. It is valuable for cross-culture emotion
analysis and recognition. We apply several state-of-the-art methods on the M3ED
dataset to verify the validity and quality of the dataset. We also propose a
general Multimodal Dialogue-aware Interaction framework, MDI, to model the
dialogue context for emotion recognition, which achieves comparable performance
to the state-of-the-art methods on M3ED. The full dataset and code are
available.
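To make the dataset's structure concrete, the sketch below shows one way an utterance-level record in an M3ED-style corpus could be represented and loaded. The JSON layout, field names, and loader are illustrative assumptions, not the released format; consult the M3ED repository for the actual schema.

```python
# Illustrative sketch of an utterance-level record in an M3ED-style corpus.
# The JSON layout and field names are assumptions for illustration only.
import json
from dataclasses import dataclass, field
from typing import List

EMOTIONS = ["happy", "surprise", "sad", "disgust", "anger", "fear", "neutral"]

@dataclass
class Utterance:
    speaker: str        # "A" or "B" in a dyadic dialogue
    text: str           # transcript (Chinese), textual modality
    audio_path: str     # acoustic modality
    video_path: str     # visual modality
    emotions: List[str] # multi-label: one or more of EMOTIONS

@dataclass
class Dialogue:
    dialogue_id: str
    tv_series: str
    utterances: List[Utterance] = field(default_factory=list)

def load_dialogues(path: str) -> List[Dialogue]:
    """Parse a hypothetical JSON export into Dialogue objects."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    return [
        Dialogue(d["dialogue_id"], d["tv_series"],
                 [Utterance(**u) for u in d["utterances"]])
        for d in raw
    ]
```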
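The MDI architecture itself is not detailed in this summary. As a rough illustration of dialogue-aware multimodal emotion recognition in general, the sketch below fuses per-utterance features from the three modalities and runs a bidirectional GRU over the utterance sequence so each prediction conditions on dialogue context; this is a generic baseline under assumed feature dimensions, not the paper's MDI model.

```python
# Generic dialogue-context emotion classifier, sketched in PyTorch.
# Illustrates dialogue-aware modeling only; not the MDI architecture.
import torch
import torch.nn as nn

class DialogueEmotionClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, video_dim=512,
                 hidden_dim=256, num_emotions=7):
        super().__init__()
        # Early fusion: concatenate per-utterance features from all modalities.
        self.proj = nn.Linear(text_dim + audio_dim + video_dim, hidden_dim)
        # Bidirectional GRU over the utterance sequence injects dialogue context.
        self.context = nn.GRU(hidden_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Sigmoid outputs support multi-label emotion annotation.
        self.head = nn.Linear(2 * hidden_dim, num_emotions)

    def forward(self, text_feats, audio_feats, video_feats):
        # Each input: (batch, num_utterances, dim)
        x = torch.cat([text_feats, audio_feats, video_feats], dim=-1)
        x = torch.relu(self.proj(x))
        ctx, _ = self.context(x)
        return torch.sigmoid(self.head(ctx))  # per-utterance emotion probabilities
```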
Related papers
- Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset [74.74686464187474]
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history.
MC-EIU is an enabling technology for many human-computer interfaces.
We propose an MC-EIU dataset featuring 7 emotion categories, 9 intent categories, 3 modalities (textual, acoustic, and visual), and 2 languages (English and Mandarin).
arXiv Detail & Related papers (2024-07-03T01:56:00Z)
- Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue [50.279206765971125]
We explore three methods to tackle the problem of interpreting multimodal inputs from conversational and situational contexts.
Our best method, scene-dialogue alignment, improves performance by 20% in F1-score over the SIMMC 2.1 baselines.
arXiv Detail & Related papers (2023-02-28T15:45:20Z)
- InterMulti: Multi-view Multimodal Interactions with Text-dominated Hierarchical High-order Fusion for Emotion Analysis [10.048903012988882]
We propose a multimodal emotion analysis framework, InterMulti, to capture complex multimodal interactions from different views.
Our proposed framework decomposes signals of different modalities into three kinds of multimodal interaction representations.
The Text-dominated Hierarchical High-order Fusion (THHF) module integrates these three kinds of representations into a comprehensive multimodal interaction representation.
arXiv Detail & Related papers (2022-12-20T07:02:32Z)
- CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI [48.67259855309959]
Most existing datasets for conversational AI ignore human personalities and emotions.
We propose CPED, a large-scale Chinese personalized and emotional dialogue dataset.
CPED contains more than 12K dialogues of 392 speakers from 40 TV shows.
arXiv Detail & Related papers (2022-05-29T17:45:12Z)
- EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues [44.79509115642278]
We create a large conversational dataset in Hindi named EmoInHindi for multi-label emotion and intensity recognition in conversations.
We prepare our dataset in a Wizard-of-Oz manner for mental health and legal counselling of crime victims.
arXiv Detail & Related papers (2022-05-27T11:23:50Z)
- MSCTD: A Multimodal Sentiment Chat Translation Dataset [66.81525961469494]
We introduce a new task named Multimodal Chat Translation (MCT).
MCT aims to generate more accurate translations with the help of the associated dialogue history and visual context.
Our work can facilitate research on both multimodal chat translation and multimodal dialogue sentiment analysis.
arXiv Detail & Related papers (2022-02-28T09:40:46Z)
- EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion in Task-Oriented Dialogue Systems [3.3010169113961325]
EmoWOZ is a large-scale manually emotion-annotated corpus of task-oriented dialogues.
It contains more than 11K dialogues with more than 83K emotion annotations of user utterances.
We propose a novel emotion labelling scheme, which is tailored to task-oriented dialogues.
arXiv Detail & Related papers (2021-09-10T15:00:01Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually, reasoning over dialogue turns with the help of back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)