ArtEmis: Affective Language for Visual Art
- URL: http://arxiv.org/abs/2101.07396v1
- Date: Tue, 19 Jan 2021 01:03:40 GMT
- Title: ArtEmis: Affective Language for Visual Art
- Authors: Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed
Elhoseiny, Leonidas Guibas
- Abstract summary: We focus on the affective experience triggered by visual artworks.
We ask the annotators to indicate the dominant emotion they feel for a given image.
This leads to a rich set of signals for both the objective content and the affective impact of an image.
- Score: 46.643106054408285
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a novel large-scale dataset and accompanying machine learning
models aimed at providing a detailed understanding of the interplay between
visual content, its emotional effect, and explanations for the latter in
language. In contrast to most existing annotation datasets in computer vision,
we focus on the affective experience triggered by visual artworks and ask the
annotators to indicate the dominant emotion they feel for a given image and,
crucially, to also provide a grounded verbal explanation for their emotion
choice. As we demonstrate below, this leads to a rich set of signals for both
the objective content and the affective impact of an image, creating
associations with abstract concepts (e.g., "freedom" or "love"), or references
that go beyond what is directly visible, including visual similes and
metaphors, or subjective references to personal experiences. We focus on visual
art (e.g., paintings, artistic photographs) as it is a prime example of imagery
created to elicit emotional responses from its viewers. Our dataset, termed
ArtEmis, contains 439K emotion attributions and explanations from humans, on
81K artworks from WikiArt. Building on this data, we train and demonstrate a
series of captioning systems capable of expressing and explaining emotions from
visual stimuli. Remarkably, the captions produced by these systems often
succeed in reflecting the semantic and abstract content of the image, going
well beyond systems trained on existing datasets. The collected dataset and
developed methods are available at https://artemisdataset.org.
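As a rough illustration of how the released annotations might be consumed, the sketch below tallies dominant-emotion labels from a CSV export of ArtEmis annotations. The file name and column names used here ("emotion", one row per artwork-annotator pair) are assumptions about the release format rather than details stated in the abstract; consult https://artemisdataset.org for the actual schema.

```python
# Minimal sketch: summarizing ArtEmis-style annotations.
# Assumption: the release is a CSV with one row per (artwork, annotator) pair,
# containing at least an "emotion" label and a free-form explanation utterance.
import csv
from collections import Counter

def summarize_emotions(path: str = "artemis_annotations.csv") -> Counter:
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["emotion"]] += 1  # dominant emotion chosen by the annotator
    return counts

if __name__ == "__main__":
    for emotion, n in summarize_emotions().most_common():
        print(f"{emotion}\t{n}")
```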
Related papers
- Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z)
- StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z)
- Contextually-rich human affect perception using multimodal scene information [36.042369831043686]
We leverage pretrained vision-language (VLN) models to extract descriptions of foreground context from images.
We propose a multimodal context fusion (MCF) module to combine foreground cues with the visual scene and person-based contextual information for emotion prediction.
We show the effectiveness of our proposed modular design on two datasets associated with natural scenes and TV shows.
arXiv Detail & Related papers (2023-03-13T07:46:41Z)
- ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer [59.05857591535986]
We propose a model called ViNTER to generate image narratives that focus on time series representing varying emotions as "emotion arcs".
We present experimental results of both manual and automatic evaluations.
arXiv Detail & Related papers (2022-02-15T10:53:08Z)
- SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network [83.27291945217424]
We propose a novel Scene-Object interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from images.
To mine the emotional relationships between distinct objects, we first build up an Emotion Graph based on semantic concepts and visual features.
We also design a Scene-Object Fusion Module to integrate scenes and objects, which exploits scene features to guide the fusion process of object features with the proposed scene-based attention mechanism.
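A minimal sketch of such a scene-guided attention step is given below, assuming a simple dot-product scoring between a projected scene vector and projected object features; the module name, dimensions, and scoring function are illustrative assumptions, not SOLVER's actual design.

```python
# Illustrative sketch (PyTorch): scene features guide attention over object features.
# All names, dimensions, and the dot-product scoring are assumptions for exposition.
import torch
import torch.nn as nn

class SceneGuidedFusion(nn.Module):
    def __init__(self, obj_dim: int, scene_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.obj_proj = nn.Linear(obj_dim, hidden_dim)
        self.scene_proj = nn.Linear(scene_dim, hidden_dim)

    def forward(self, object_feats: torch.Tensor, scene_feat: torch.Tensor) -> torch.Tensor:
        # object_feats: (batch, num_objects, obj_dim); scene_feat: (batch, scene_dim)
        query = self.scene_proj(scene_feat).unsqueeze(1)       # (batch, 1, hidden)
        keys = self.obj_proj(object_feats)                     # (batch, num_objects, hidden)
        scores = (keys * query).sum(dim=-1)                    # scene-conditioned relevance
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)  # (batch, num_objects, 1)
        return (weights * object_feats).sum(dim=1)             # fused (batch, obj_dim)
```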
arXiv Detail & Related papers (2021-10-24T02:41:41Z)
- AffectGAN: Affect-Based Generative Art Driven by Semantics [2.323282558557423]
This paper introduces a novel method for generating artistic images that express particular affective states.
Our AffectGAN model is able to generate images based on specific or broad semantic prompts and intended affective outcomes.
A small dataset of 32 images generated by AffectGAN is annotated by 50 participants in terms of the particular emotion they elicit, as well as their quality and novelty.
arXiv Detail & Related papers (2021-09-30T04:53:25Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives [132.889649256384]
We will comprehensively review the development of affective image content analysis (AICA) over the past two decades.
We will focus on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.