SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network
- URL: http://arxiv.org/abs/2110.12334v1
- Date: Sun, 24 Oct 2021 02:41:41 GMT
- Title: SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network
- Authors: Jingyuan Yang, Xinbo Gao, Leida Li, Xiumei Wang, and Jinshan Ding
- Abstract summary: We propose a novel Scene-Object interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from images.
To mine the emotional relationships between distinct objects, we first build up an Emotion Graph based on semantic concepts and visual features.
We also design a Scene-Object Fusion Module to integrate scenes and objects, which exploits scene features to guide the fusion process of object features with the proposed scene-based attention mechanism.
- Score: 83.27291945217424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Emotion Analysis (VEA) aims to determine how people respond
emotionally to different visual stimuli, a topic that has attracted growing
attention with the prevalence of image sharing on social networks. Since human emotion
involves a highly complex and abstract cognitive process, it is difficult to
infer visual emotions directly from holistic or regional features in affective
images. It has been demonstrated in psychology that visual emotions are evoked
by the interactions between objects as well as the interactions between objects
and scenes within an image. Inspired by this, we propose a novel Scene-Object
interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from
images. To mine the emotional relationships between distinct objects, we first
build up an Emotion Graph based on semantic concepts and visual features. Then,
we conduct reasoning on the Emotion Graph using Graph Convolutional Network
(GCN), yielding emotion-enhanced object features. We also design a Scene-Object
Fusion Module to integrate scenes and objects, which exploits scene features to
guide the fusion process of object features with the proposed scene-based
attention mechanism. Extensive experiments and comparisons are conducted on
eight public visual emotion datasets, and the results demonstrate that the
proposed SOLVER consistently outperforms the state-of-the-art methods by a
large margin. Ablation studies verify the effectiveness of our method, and
visualizations demonstrate its interpretability, offering new insights into VEA.
Notably, we further discuss SOLVER on three additional datasets through extended
experiments, validating the robustness of our method and noting some of its
limitations.
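To make the described pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the three components the abstract mentions: graph reasoning over object features, scene-guided attention fusion, and emotion classification. The cosine-similarity graph construction, module names, and layer sizes are illustrative assumptions, not the authors' exact design.

```python
# Hypothetical sketch of a SOLVER-like pipeline; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EmotionGraphGCN(nn.Module):
    """Reason over object features with a graph built from pairwise similarity."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (num_objects, dim) features of detected objects.
        # Edge weights from cosine similarity (assumed stand-in for the Emotion Graph).
        normed = F.normalize(obj_feats, dim=-1)
        adj = F.softmax(normed @ normed.t(), dim=-1)            # (N, N)
        # One GCN-style propagation step yields emotion-enhanced object features.
        return F.relu(self.proj(adj @ obj_feats)) + obj_feats


class SceneObjectFusion(nn.Module):
    """Scene-based attention: the scene feature weights each object feature."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)

    def forward(self, scene_feat: torch.Tensor, obj_feats: torch.Tensor) -> torch.Tensor:
        # scene_feat: (dim,) holistic scene feature; obj_feats: (N, dim).
        attn = F.softmax(self.key(obj_feats) @ self.query(scene_feat), dim=0)  # (N,)
        fused_objects = (attn.unsqueeze(-1) * obj_feats).sum(dim=0)            # (dim,)
        return torch.cat([scene_feat, fused_objects], dim=-1)                  # (2*dim,)


class SolverLikeClassifier(nn.Module):
    def __init__(self, dim: int = 512, num_emotions: int = 8):
        super().__init__()
        self.gcn = EmotionGraphGCN(dim)
        self.fusion = SceneObjectFusion(dim)
        self.head = nn.Linear(2 * dim, num_emotions)

    def forward(self, scene_feat, obj_feats):
        enhanced = self.gcn(obj_feats)
        fused = self.fusion(scene_feat, enhanced)
        return self.head(fused)


# Usage with random features standing in for backbone outputs (5 detected objects).
model = SolverLikeClassifier()
logits = model(torch.randn(512), torch.randn(5, 512))
```

In the paper, the Emotion Graph is built from both semantic concepts and visual features and reasoning is performed with a GCN; the sketch above stands in for those components with a single similarity-based propagation step.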
Related papers
- UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception [8.54013419046987]
We introduce UniEmoX, a cross-modal semantic-guided large-scale pretraining framework for visual emotion analysis.
By exploiting the similarity between paired and unpaired image-text samples, UniEmoX distills rich semantic knowledge from the CLIP model to enhance emotional embedding representations.
We develop a visual emotional dataset titled Emo8, covering nearly all common emotional scenes.
arXiv Detail & Related papers (2024-09-27T16:12:51Z) - StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z) - EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes [53.95428298229396]
We introduce EmoSet, the first large-scale visual emotion dataset annotated with rich attributes.
EmoSet comprises 3.3 million images in total, with 118,102 of these images carefully labeled by human annotators.
Motivated by psychological studies, in addition to emotion category, each image is also annotated with a set of describable emotion attributes.
arXiv Detail & Related papers (2023-07-16T06:42:46Z) - Multi-Cue Adaptive Emotion Recognition Network [4.570705738465714]
We propose a new deep learning approach for emotion recognition based on adaptive multi-cues.
We compare the proposed approach with state-of-the-art approaches on the CAER-S dataset.
arXiv Detail & Related papers (2021-11-03T15:08:55Z) - Stimuli-Aware Visual Emotion Analysis [75.68305830514007]
We propose a stimuli-aware visual emotion analysis (VEA) method consisting of three stages, namely stimuli selection, feature extraction and emotion prediction.
To the best of our knowledge, this is the first work to introduce a stimuli selection process into VEA in an end-to-end network.
Experiments demonstrate that the proposed method consistently outperforms the state-of-the-art approaches on four public visual emotion datasets.
arXiv Detail & Related papers (2021-09-04T08:14:52Z) - A Circular-Structured Representation for Visual Emotion Distribution Learning [82.89776298753661]
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning.
To be specific, we first construct an Emotion Circle to unify any emotional state within it.
On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
arXiv Detail & Related papers (2021-06-23T14:53:27Z) - Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions.
Our framework integrates a contextualized embedding encoder with a multi-head probing model.
Our model is evaluated on the Empathetic Dialogue dataset and shows the state-of-the-art result for classifying 32 emotions.
arXiv Detail & Related papers (2021-04-20T16:55:15Z) - ArtEmis: Affective Language for Visual Art [46.643106054408285]
We focus on the affective experience triggered by visual artworks.
We ask the annotators to indicate the dominant emotion they feel for a given image.
This leads to a rich set of signals for both the objective content and the affective impact of an image.
arXiv Detail & Related papers (2021-01-19T01:03:40Z)