Related papers: Structuring Quantitative Image Analysis with Object Prominence

URL: http://arxiv.org/abs/2409.00216v1
Date: Fri, 30 Aug 2024 19:05:28 GMT
Title: Structuring Quantitative Image Analysis with Object Prominence
Authors: Christian Arnold, Andreas Küpfer
Abstract summary: We suggest carefully considering objects' prominence as an essential step in analyzing images as data. Our approach combines qualitative analyses with the scalability of quantitative approaches.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: When photographers and other editors of image material produce an image, they make a statement about what matters by situating some objects in the foreground and others in the background. While this prominence of objects is a key analytical category to qualitative scholars, recent quantitative approaches to automated image analysis have not yet made this important distinction but treat all areas of an image similarly. We suggest carefully considering objects' prominence as an essential step in analyzing images as data. Its modeling requires defining an object and operationalizing and measuring how much attention a human eye would pay. Our approach combines qualitative analyses with the scalability of quantitative approaches. Exemplifying object prominence with different implementations -- object size and centeredness, the pixels' image depth, and salient image regions -- we showcase the usefulness of our approach with two applications. First, we scale the ideology of eight U.S. newspapers based on images. Second, we analyze the prominence of women in the campaign videos of the U.S. presidential races in 2016 and 2020. We hope that our article helps everyone keen to study image data in a conceptually meaningful way at scale.
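To make the "object size and centeredness" operationalization named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes an object is given as an axis-aligned bounding box (x0, y0, x1, y1) in pixel coordinates. The function names, the normalization, and the equal-weight combination in `prominence` are hypothetical choices made for illustration only.

```python
import math


def size_prominence(box, image_w, image_h):
    """Fraction of the image area covered by the box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    box_area = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    return box_area / (image_w * image_h)


def centeredness(box, image_w, image_h):
    """1.0 if the box centre is the image centre, 0.0 if it is at a corner."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    dist = math.hypot(cx - image_w / 2, cy - image_h / 2)
    max_dist = math.hypot(image_w / 2, image_h / 2)
    return 1.0 - dist / max_dist


def prominence(box, image_w, image_h, weight=0.5):
    """Hypothetical combined score: weighted mean of size and centeredness."""
    size = size_prominence(box, image_w, image_h)
    centre = centeredness(box, image_w, image_h)
    return weight * size + (1.0 - weight) * centre


# A centred box covering a quarter of a 100x100 image.
score = prominence((25, 25, 75, 75), 100, 100)
```

Under these assumptions, a large object near the image centre scores close to 1 and a small object near a corner scores close to 0. The depth-based and saliency-based variants mentioned in the abstract would need a depth or saliency model and are not sketched here.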

Related papers

For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives [3.418398936676879]
This work presents FRESCO, a framework designed to explore the socio-cultural implications of images on social media platforms at scale. FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques. The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer.
arXiv Detail & Related papers (2024-07-03T16:57:38Z)
Are These the Same Apple? Comparing Images Based on Object Intrinsics [27.43687450076182]
The goal is to measure image similarity purely based on intrinsic object properties that define object identity. This problem has been studied in the computer vision literature as re-identification. We propose to extend it to general object categories, exploring an image similarity metric based on object intrinsics.
arXiv Detail & Related papers (2023-11-01T18:00:03Z)
Spotlight Attention: Robust Object-Centric Learning With a Spatial Locality Prior [88.9319150230121]
Object-centric vision aims to construct an explicit representation of the objects in a scene. We incorporate a spatial-locality prior into state-of-the-art object-centric vision models. We obtain significant improvements in segmenting objects in both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-31T04:35:50Z)
ImageSubject: A Large-scale Dataset for Subject Detection [9.430492045581534]
Main subjects usually exist in images and videos, as they are the objects that the photographer wants to highlight. Detecting the main subjects is an important technique to help machines understand the content of images and videos. We present a new dataset with the goal of training models to understand the layout of the objects and then to find the main subjects among them.
arXiv Detail & Related papers (2022-01-09T22:49:59Z)
From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence. Research in image captioning has not reached a conclusive answer yet. This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z)
Automatic Main Character Recognition for Photographic Studies [78.88882860340797]
Main characters in images are the most important humans that catch the viewer's attention upon first look. Identifying the main character in images plays an important role in traditional photographic studies and media analysis. We propose a method for identifying the main characters using machine learning based human pose estimation.
arXiv Detail & Related papers (2021-06-16T18:14:45Z)
Common Limitations of Image Processing Metrics: A Picture Story [58.83274952067888]
This document focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
arXiv Detail & Related papers (2021-04-12T17:03:42Z)
A Survey of Hand Crafted and Deep Learning Methods for Image Aesthetic Assessment [2.9005223064604078]
This paper presents a literature review of the recent techniques of automatic image aesthetics assessment. A large number of traditional hand crafted and deep learning based approaches are reviewed.
arXiv Detail & Related papers (2021-03-22T07:00:56Z)
A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images. We present a simple yet surprisingly effective framework to do so. Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by a relative 50% (and 33%).
arXiv Detail & Related papers (2021-02-17T17:27:21Z)
Intentonomy: a Dataset and Study towards Human Intent Understanding [65.49299806821791]
We study the intent behind social media images with an aim to analyze how visual information can help the recognition of human intent. We introduce an intent dataset, Intentonomy, comprising 14K images covering a wide range of everyday scenes. We then systematically study whether, and to what extent, commonly used visual information, i.e., object and context, contribute to human motive understanding.
arXiv Detail & Related papers (2020-11-11T05:39:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.