Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
- URL: http://arxiv.org/abs/2308.16349v3
- Date: Tue, 27 Aug 2024 07:22:50 GMT
- Title: Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
- Authors: Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, Mohamed Elhoseiny
- Abstract summary: We introduce Affective Visual Dialog as a testbed for research on understanding the formation of emotions in visually grounded conversations.
The task involves three skills: Dialog-based Question Answering, Dialog-based Emotion Prediction, and Affective Emotion Explanation Generation.
Our key contribution is the collection of a large-scale dataset, dubbed AffectVisDial, consisting of 50K 10-turn visually grounded dialogs.
- Score: 31.707018753687098
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations. The task involves three skills: (1) Dialog-based Question Answering, (2) Dialog-based Emotion Prediction, and (3) Affective Emotion Explanation Generation based on the dialog. Our key contribution is the collection of a large-scale dataset, dubbed AffectVisDial, consisting of 50K 10-turn visually grounded dialogs as well as concluding emotion attributions and dialog-informed textual emotion explanations, resulting in a total of 27,180 working hours. We explain our design decisions in collecting the dataset and introduce the questioner and answerer tasks that are associated with the participants in the conversation. We train and demonstrate solid Affective Visual Dialog baselines adapted from state-of-the-art models. Remarkably, the responses generated by our models show promising emotional reasoning abilities in response to visually grounded conversations. Our project page is available at https://affective-visual-dialog.github.io.
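To make the task and data structure concrete, here is a minimal Python sketch of what a single AffectVisDial-style record could look like. All field names are hypothetical illustrations, not the dataset's actual schema; consult the project page for the real format.

```python
# Hypothetical sketch of one AffectVisDial-style record.
# Field names are illustrative, not the dataset's real schema.
from dataclasses import dataclass

@dataclass
class AffectiveDialogRecord:
    image_url: str                     # the grounding image
    dialog: list[tuple[str, str]]      # 10 (question, answer) turns
    emotion: str                       # concluding emotion attribution, e.g. "awe"
    explanation: str                   # dialog-informed textual emotion explanation

example = AffectiveDialogRecord(
    image_url="https://example.com/artwork.jpg",
    dialog=[("What is in the scene?", "A stormy sea crashing against a cliff.")],  # + 9 more turns
    emotion="awe",
    explanation="The described storm and towering cliff evoke a sense of awe.",
)
```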
Related papers
- SemEval-2024 Task 3: Multimodal Emotion Cause Analysis in Conversations [53.60993109543582]
SemEval-2024 Task 3, named Multimodal Emotion Cause Analysis in Conversations, aims at extracting all pairs of emotions and their corresponding causes from conversations.
Under different modality settings, it consists of two subtasks: Textual Emotion-Cause Pair Extraction in Conversations (TECPE) and Multimodal Emotion-Cause Pair Extraction in Conversations (MECPE).
In this paper, we introduce the task, dataset and evaluation settings, summarize the systems of the top teams, and discuss the findings of the participants.
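As a toy illustration of what such a system is asked to produce, the sketch below pairs an emotion-bearing utterance with its cause utterance. The conversation, indices, and labels are invented, and the pair format is an assumption rather than the official SemEval annotation schema.

```python
# Toy sketch of emotion-cause pair extraction output (TECPE-style).
# Conversation, labels, and pair format are invented for illustration.
conversation = [
    "I finally got the job offer!",        # utterance 0
    "That's wonderful, congratulations!",  # utterance 1
]

# Each pair links an emotion-bearing utterance (with its emotion label)
# to the utterance that triggered that emotion.
emotion_cause_pairs = [
    {"emotion_utt": 1, "emotion": "joy", "cause_utt": 0},
]
```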
arXiv Detail & Related papers (2024-05-19T09:59:00Z)
- Personality-affected Emotion Generation in Dialog Systems [67.40609683389947]
We propose a new task, Personality-affected Emotion Generation, to generate emotion based on the personality given to the dialog system.
We analyze the challenges in this task, i.e., (1) heterogeneously integrating personality and emotional factors and (2) extracting multi-granularity emotional information in the dialog context.
Results suggest that by adopting our method, the emotion generation performance is improved by 13% in macro-F1 and 5% in weighted-F1 from the BERT-base model.
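Macro-F1 and weighted-F1 differ only in how per-class F1 scores are averaged, which matters when emotion classes are imbalanced. The toy scikit-learn snippet below (invented labels, not the paper's data) shows the distinction.

```python
# Toy illustration of macro-F1 vs. weighted-F1 (not the paper's data).
from sklearn.metrics import f1_score

y_true = ["joy", "joy", "joy", "anger", "anger", "sadness"]
y_pred = ["joy", "joy", "joy", "anger", "sadness", "joy"]

# Macro-F1: unweighted mean of per-class F1, so rare classes count
# as much as frequent ones.
print(f1_score(y_true, y_pred, average="macro"))

# Weighted-F1: per-class F1 weighted by class support (frequency).
print(f1_score(y_true, y_pred, average="weighted"))
```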
arXiv Detail & Related papers (2024-04-03T08:48:50Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z)
- Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations [9.432208348863336]
We propose a conversational speech emotion recognition method that captures attentive contextual dependencies and speaker-sensitive interactions.
First, we use a pretrained VGGish model to extract segment-based audio representations from individual utterances.
Second, an attentive bi-directional gated recurrent unit (GRU) models context-sensitive information and explores intra- and inter-speaker dependencies jointly.
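A minimal PyTorch sketch of this kind of pipeline is given below, assuming segment-level embeddings (e.g., 128-dimensional VGGish features) are precomputed. The layer sizes, attention form, and number of emotion classes are illustrative assumptions, and the intra-/inter-speaker modeling is omitted for brevity.

```python
# Sketch: attentive bidirectional GRU over precomputed segment embeddings
# (e.g., 128-d VGGish features). Sizes and attention form are assumptions;
# speaker-sensitive interaction modeling is omitted.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, n_emotions=4):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # scores each time step
        self.clf = nn.Linear(2 * hidden, n_emotions)

    def forward(self, segments):                # (batch, time, feat_dim)
        h, _ = self.gru(segments)               # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        ctx = (w * h).sum(dim=1)                # attention-pooled context vector
        return self.clf(ctx)                    # emotion logits

model = AttentiveBiGRU()
logits = model(torch.randn(2, 10, 128))         # 2 utterances, 10 segments each
```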
arXiv Detail & Related papers (2023-02-05T16:15:46Z)
- Think Twice: A Human-like Two-stage Conversational Agent for Emotional Response Generation [16.659457455269127]
We propose a two-stage conversational agent for emotional dialogue generation.
First, a dialogue model trained without an emotion-annotated dialogue corpus generates a prototype response that fits the contextual semantics.
Second, the first-stage prototype is modified by a controllable emotion refiner guided by the empathy hypothesis.
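The two-stage idea can be pictured as a generate-then-refine pipeline. In the schematic below, both functions are hypothetical placeholders standing in for the paper's trained models.

```python
# Schematic of the two-stage generate-then-refine idea. Both functions are
# hypothetical placeholders, not the paper's actual models.
def generate_prototype(context: str) -> str:
    """Stage 1: a context-only dialogue model drafts a semantically fitting reply."""
    return f"I hear what you are saying about: {context}"  # stand-in generator

def refine_with_emotion(prototype: str, emotion: str) -> str:
    """Stage 2: a controllable refiner rewrites the draft to carry the target emotion."""
    return f"[{emotion}] {prototype}"                      # stand-in refiner

reply = refine_with_emotion(generate_prototype("I failed my exam today."), "empathetic")
print(reply)
```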
arXiv Detail & Related papers (2023-01-12T10:03:56Z)
- Chat-Capsule: A Hierarchical Capsule for Dialog-level Emotion Analysis [70.98130990040228]
We propose a Context-based Hierarchical Attention Capsule (Chat-Capsule) model, which models both utterance-level and dialog-level emotions and their interrelations.
On a dialog dataset collected from customer support of an e-commerce platform, our model is also able to predict user satisfaction and emotion curve category.
arXiv Detail & Related papers (2022-03-23T08:04:30Z)
- Simulated Annealing for Emotional Dialogue Systems [22.96717845092991]
We consider the task of expressing a specific emotion for dialogue generation.
Our proposed method shows a 12% improvement in emotion accuracy over the previous state-of-the-art method.
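Simulated annealing here can be pictured as repeatedly editing a candidate response, keeping improvements and occasionally accepting worse edits while a temperature parameter cools. The sketch below is a generic annealing loop; `score` and `propose_edit` are hypothetical stand-ins for an emotion-aware objective and word-level edit operators.

```python
# Generic simulated-annealing sketch for emotional response search.
# `score` and `propose_edit` are hypothetical stand-ins, not the paper's code.
import math
import random

def anneal(response, score, propose_edit, t0=1.0, cooling=0.95, steps=200):
    current, current_score = response, score(response)
    t = t0
    for _ in range(steps):
        candidate = propose_edit(current)       # e.g., insert/replace/delete a word
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept regressions with prob. exp(delta / t).
        if delta > 0 or random.random() < math.exp(delta / t):
            current, current_score = candidate, cand_score
        t *= cooling                            # cool the temperature
    return current
```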
arXiv Detail & Related papers (2021-09-22T13:17:17Z)
- Generating Empathetic Responses with a Large Scale Dialog Dataset [0.76146285961466]
Existing models either directly incorporate pre-defined emotion information to guide the response generation, or use deterministic rules to decide the response emotion.
We show how to build a multi-turn empathetic dialog model that performs well compared to its baselines over 6,000 human-evaluated instances.
arXiv Detail & Related papers (2021-05-14T13:45:40Z)
- Knowledge Bridging for Empathetic Dialogue Generation [52.39868458154947]
The lack of external knowledge makes it difficult for empathetic dialogue systems to perceive implicit emotions and to learn emotional interactions from limited dialogue history.
We propose to leverage external knowledge, including commonsense knowledge and emotional lexical knowledge, to explicitly understand and express emotions in empathetic dialogue generation.
arXiv Detail & Related papers (2020-09-21T09:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.