Automatic Main Character Recognition for Photographic Studies
- URL: http://arxiv.org/abs/2106.09064v1
- Date: Wed, 16 Jun 2021 18:14:45 GMT
- Title: Automatic Main Character Recognition for Photographic Studies
- Authors: Mert Seker, Anssi Männistö, Alexandros Iosifidis and Jenni Raitoharju
- Abstract summary: Main characters in images are the most important humans that catch the viewer's attention upon first look.
Identifying the main character in images plays an important role in traditional photographic studies and media analysis.
We propose a method for identifying the main characters using machine learning based human pose estimation.
- Score: 78.88882860340797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Main characters in images are the most important humans that catch the
viewer's attention upon first look, and they are emphasized by properties such
as size, position, color saturation, and sharpness of focus. Identifying the
main character in images plays an important role in traditional photographic
studies and media analysis, but the task is performed manually and can be slow
and laborious. Furthermore, the selection of main characters can sometimes be
subjective. In this paper, we analyze the feasibility of solving the main
character recognition needed for photographic studies automatically and propose
a method for identifying the main characters. The proposed method uses machine
learning based human pose estimation along with traditional computer vision
approaches for this task. We approach the task as a binary classification
problem where each detected human is classified either as a main character or
not. To evaluate both the subjectivity of the task and the performance of our
method, we collected a dataset of 300 varying images from multiple sources and
asked five people, a photographic researcher and four other persons, to
annotate the main characters. Our analysis showed a relatively high agreement
between different annotators. The proposed method achieved a promising F1 score
of 0.83 on the full image set and 0.96 on a subset evaluated as most clear and
important cases by the photographic researcher.
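As a rough illustration of the pipeline the abstract describes (detect each person, then weigh cues such as size, position, color saturation, and sharpness of focus, and classify each detected human as main character or not), the sketch below scores every detected person and thresholds the score into a binary decision. This is a minimal sketch, not the authors' implementation: the person bounding boxes are assumed to come from an external pose estimator or person detector, and the cue weights and threshold are illustrative assumptions.

```python
# Hedged sketch: per-person main-character cues (size, position,
# color saturation, sharpness of focus) scored and thresholded.
# Bounding boxes are assumed to come from any person/pose detector;
# weights and threshold are illustrative, not from the paper.
import cv2
import numpy as np


def person_cues(image_bgr, box):
    """Compute simple cues for one detected person. box = (x, y, w, h) in pixels."""
    h_img, w_img = image_bgr.shape[:2]
    x, y, w, h = box
    crop = image_bgr[y:y + h, x:x + w]

    # Size: fraction of the frame occupied by the person.
    rel_size = (w * h) / float(w_img * h_img)

    # Position: closeness of the box centre to the image centre (1.0 = dead centre).
    cx, cy = x + w / 2.0, y + h / 2.0
    dist = np.hypot(cx - w_img / 2.0, cy - h_img / 2.0)
    centrality = 1.0 - dist / np.hypot(w_img / 2.0, h_img / 2.0)

    # Color saturation: mean S channel in HSV, normalised to [0, 1].
    hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    saturation = hsv[:, :, 1].mean() / 255.0

    # Sharpness of focus: variance of the Laplacian (higher = sharper).
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()

    return rel_size, centrality, saturation, sharpness


def classify_main_characters(image_bgr, boxes,
                             weights=(0.4, 0.3, 0.15, 0.15), threshold=0.5):
    """Return one boolean per detected person: main character or not."""
    cues = np.array([person_cues(image_bgr, b) for b in boxes], dtype=float)
    # Normalise each cue by its maximum over all people so cues are comparable.
    cues /= cues.max(axis=0) + 1e-8
    scores = cues @ np.asarray(weights)
    return [float(s) >= threshold for s in scores]
```

Since the paper frames the task as a binary classification problem, the hand-set weights and threshold above could equally be replaced by any classifier trained on annotated main-character labels over the same per-person cues.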
Related papers
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- Structuring Quantitative Image Analysis with Object Prominence [0.0]
We suggest carefully considering objects' prominence as an essential step in analyzing images as data.
Our approach combines qualitative analyses with the scalability of quantitative approaches.
arXiv Detail & Related papers (2024-08-30T19:05:28Z)
- Understanding Subjectivity through the Lens of Motivational Context in Model-Generated Image Satisfaction [21.00784031928471]
Image generation models are poised to become ubiquitous in a range of applications.
These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard.
To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use cases.
arXiv Detail & Related papers (2024-02-27T01:16:55Z)
- Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods [52.806258774051216]
We focus on text-to-image systems that input a single image of an individual and ground the generation process along with text describing the desired visual context.
We introduce a standardized dataset (Stellar) of personalized prompts coupled with images of individuals; it is an order of magnitude larger than existing relevant datasets and comes with rich semantic ground-truth annotations.
We derive a simple yet efficient personalized text-to-image baseline that does not require test-time fine-tuning for each subject and that sets a new SoTA both quantitatively and in human trials.
arXiv Detail & Related papers (2023-12-11T04:47:39Z)
- Learning Transferable Pedestrian Representation from Multimodal Information Supervision [174.5150760804929]
VAL-PAT is a novel framework that learns transferable representations to enhance various pedestrian analysis tasks with multimodal information.
We first perform pre-training on the LUPerson-TA dataset, where each image contains text and attribute annotations.
We then transfer the learned representations to various downstream tasks, including person reID, person attribute recognition and text-based person search.
arXiv Detail & Related papers (2023-04-12T01:20:58Z)
- Pre-training strategies and datasets for facial representation learning [58.8289362536262]
We show how to find a universal face representation that can be adapted to several facial analysis tasks and datasets.
We systematically investigate two ways of large-scale representation learning applied to faces: supervised and unsupervised pre-training.
Among our main findings: unsupervised pre-training on completely in-the-wild, uncurated data provides consistent and, in some cases, significant accuracy improvements.
arXiv Detail & Related papers (2021-03-30T17:57:25Z)
- A Survey of Hand Crafted and Deep Learning Methods for Image Aesthetic Assessment [2.9005223064604078]
This paper presents a literature review of recent techniques for automatic image aesthetics assessment.
A large number of traditional hand-crafted and deep learning-based approaches are reviewed.
arXiv Detail & Related papers (2021-03-22T07:00:56Z)
- Learning to Detect Important People in Unlabelled Images for Semi-supervised Important People Detection [85.91577271918783]
We propose learning important people detection on partially annotated images.
Our approach iteratively learns to assign pseudo-labels to individuals in un-annotated images.
We have collected two large-scale datasets for evaluation.
arXiv Detail & Related papers (2020-04-16T10:09:37Z)
- An Empirical Study of Person Re-Identification with Attributes [15.473033192858543]
In this paper, an attribute-based approach is proposed where the person of interest is described by a set of visual attributes.
We compare multiple algorithms and analyze how the quality of attributes impacts the performance.
A key conclusion is that the performance achieved with non-expert attributes, rather than expert-annotated ones, is a more faithful indicator of the status quo of attribute-based approaches to person re-identification.
arXiv Detail & Related papers (2020-01-25T22:18:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.