Toward Quantifying Ambiguities in Artistic Images
- URL: http://arxiv.org/abs/2008.09688v1
- Date: Fri, 21 Aug 2020 21:40:16 GMT
- Title: Toward Quantifying Ambiguities in Artistic Images
- Authors: Xi Wang, Zoya Bylinskii, Aaron Hertzmann, Robert Pepperell
- Abstract summary: This paper presents an approach to measuring the perceptual ambiguity of a collection of images.
Crowdworkers are asked to describe image content after different viewing durations.
Experiments are performed using images created with Generative Adversarial Networks.
- Score: 21.152039726639426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has long been hypothesized that perceptual ambiguities play an important
role in aesthetic experience: a work with some ambiguity engages a viewer more
than one that does not. However, current frameworks for testing this theory are
limited by the availability of stimuli and data collection methods. This paper
presents an approach to measuring the perceptual ambiguity of a collection of
images. Crowdworkers are asked to describe image content after different
viewing durations. Experiments are performed on images created with
Generative Adversarial Networks via the Artbreeder website. We show that
text processing of viewer responses can provide a fine-grained way to measure
and describe image ambiguities.
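The abstract does not specify which text-processing measure is used, so the sketch below is a loose illustration only: the metric, function name, and sample responses are assumptions, not the paper's method. It scores an image's ambiguity as the disagreement among crowdworker descriptions collected at a single viewing duration.

```python
# Hypothetical sketch: ambiguity as mean pairwise dissimilarity of viewer descriptions.
# Not the paper's metric; TF-IDF and the example responses are placeholder assumptions.
from itertools import combinations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def description_ambiguity(descriptions: list[str]) -> float:
    """Return a score in [0, 1]; higher means viewers described the image less consistently."""
    vectors = TfidfVectorizer(stop_words="english").fit_transform(descriptions)
    sims = cosine_similarity(vectors)
    pairs = list(combinations(range(len(descriptions)), 2))
    mean_similarity = sum(sims[i, j] for i, j in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Hypothetical responses to one GAN-generated image after a short exposure.
print(description_ambiguity([
    "a dog sitting in a grassy field",
    "some kind of animal, maybe a sheep",
    "an abstract blur of fur and grass",
]))
```

Comparing such scores across viewing durations would indicate whether viewer descriptions converge as exposure time increases, in the spirit of the experiments described above.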
Related papers
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation [60.943159830780154]
We introduce Bounded Attention, a training-free method for bounding the information flow in the sampling process.
We demonstrate that our method empowers the generation of multiple subjects that better align with given prompts and layouts.
arXiv Detail & Related papers (2024-03-25T17:52:07Z)
- Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by open-vocabulary to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z)
- StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z)
- Context-driven Visual Object Recognition based on Knowledge Graphs [0.8701566919381223]
We propose an approach that enhances deep learning methods by using external contextual knowledge encoded in a knowledge graph.
We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset.
arXiv Detail & Related papers (2022-10-20T13:09:00Z)
- NewsStories: Illustrating articles with visual summaries [49.924916589209374]
We introduce a large-scale multimodal dataset containing over 31M articles, 22M images and 1M videos.
We show that state-of-the-art image-text alignment methods are not robust to longer narratives with multiple images.
We introduce an intuitive baseline that outperforms these methods on zero-shot image-set retrieval by 10% on the GoodNews dataset.
arXiv Detail & Related papers (2022-07-26T17:34:11Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
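For the CLIP entry above, here is a minimal sketch of zero-shot perceptual assessment in the same spirit; the checkpoint, prompt pair, and scoring rule are assumptions for illustration, not necessarily that paper's configuration.

```python
# Hypothetical sketch: score image "look" (quality) zero-shot with a CLIP checkpoint
# by comparing the image against an antonym prompt pair and softmaxing the similarities.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")          # placeholder path
prompts = ["Good photo.", "Bad photo."]  # assumed antonym pair for the quality axis

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape (1, 2): image-text similarities
quality_score = logits.softmax(dim=-1)[0, 0].item()  # mass on the positive prompt
print(quality_score)
```

A "feel" axis could be probed the same way by swapping in prompt pairs for abstract attributes (e.g. "Happy photo." / "Sad photo.").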
- Hierarchical Semantic Segmentation using Psychometric Learning [17.417302703539367]
We develop a novel approach to collect segmentation annotations from experts based on psychometric testing.
Our method consists of the psychometric testing procedure, active query selection, query enhancement, and a deep metric learning model.
We show the merits of our method through evaluations on synthetically generated, aerial, and histology images.
arXiv Detail & Related papers (2021-07-07T13:38:33Z)
- A Decade Survey of Content Based Image Retrieval using Deep Learning [13.778851745408133]
This paper presents a comprehensive survey of deep learning based developments in the past decade for content based image retrieval.
The similarity between the representative features of the query image and dataset images is used to rank the images for retrieval.
Deep learning has emerged over the past decade as a dominant alternative to hand-designed feature engineering.
arXiv Detail & Related papers (2020-11-23T02:12:30Z)
- Visual Relationship Detection using Scene Graphs: A Survey [1.3505077405741583]
A scene graph is a representation that captures a scene and the various relationships present in it.
We present a detailed survey of techniques for scene graph generation, their efficacy in representing visual relationships, and how scene graphs have been used to solve various downstream tasks.
arXiv Detail & Related papers (2020-05-16T17:06:06Z)