On the Role of Individual Differences in Current Approaches to Computational Image Aesthetics
- URL: http://arxiv.org/abs/2502.20518v2
- Date: Thu, 18 Sep 2025 11:55:54 GMT
- Title: On the Role of Individual Differences in Current Approaches to Computational Image Aesthetics
- Authors: Li-Wei Chen, Ombretta Strafforello, Anne-Sofie Maerten, Tinne Tuytelaars, Johan Wagemans
- Abstract summary: Image aesthetic assessment (IAA) evaluates image aesthetics, a task complicated by image diversity and user subjectivity. Current approaches address this in two stages: Generic IAA (GIAA) models estimate mean aesthetic scores, while Personal IAA (PIAA) models adapt GIAA using transfer learning to incorporate user subjectivity. This work establishes a theoretical foundation for IAA, proposing a unified model that encodes individual characteristics in a distributional format for both individual and group assessments.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image aesthetic assessment (IAA) evaluates image aesthetics, a task complicated by image diversity and user subjectivity. Current approaches address this in two stages: Generic IAA (GIAA) models estimate mean aesthetic scores, while Personal IAA (PIAA) models adapt GIAA using transfer learning to incorporate user subjectivity. However, a theoretical understanding of transfer learning between GIAA and PIAA, particularly concerning the impact of group composition, group size, aesthetic differences between groups and individuals, and demographic correlations, is lacking. This work establishes a theoretical foundation for IAA, proposing a unified model that encodes individual characteristics in a distributional format for both individual and group assessments. We show that transferring from GIAA to PIAA involves extrapolation, while the reverse involves interpolation, which is generally more effective for machine learning. Extensive experiments with varying group compositions, including sub-sampling by group size and disjoint demographics, reveal substantial performance variation even for GIAA, challenging the assumption that averaging scores eliminates individual subjectivity. Score-distribution analysis using Earth Mover's Distance (EMD) and the Gini index identifies education, photography experience, and art experience as key factors in aesthetic differences, with greater subjectivity in artworks than in photographs. Code is available at https://github.com/lwchen6309/aesthetics_transfer_learning.
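The abstract names Earth Mover's Distance (EMD) and the Gini index as the tools for its score-distribution analysis but gives no formulas. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes ratings are histograms over ordered, equally spaced score bins (so the 1-D EMD reduces to the summed absolute difference of the cumulative distributions), and it assumes "Gini index" means the Gini coefficient of a set of non-negative values. The function names `emd_1d` and `gini_coefficient` and the example histograms are hypothetical.

```python
import numpy as np


def emd_1d(p, q):
    """EMD between two histograms over the same ordered, unit-spaced bins.

    Assumption: bins are equally spaced, so the distance is the sum of
    absolute differences between the two cumulative distributions.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("histograms must have the same number of bins")
    # Normalise counts to probability distributions.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def gini_coefficient(values):
    """Gini coefficient of non-negative values (0 = all equal).

    Assumption: this is the inequality measure; the abstract does not say
    which definition of the Gini index the paper uses.
    """
    x = np.asarray(values, dtype=float)
    if x.size == 0 or np.any(x < 0):
        raise ValueError("values must be non-empty and non-negative")
    total = x.sum()
    if total == 0:
        return 0.0
    # Mean absolute difference over all pairs, scaled by twice the mean.
    pairwise = np.abs(x[:, None] - x[None, :]).sum()
    return float(pairwise / (2 * x.size * total))


# Hypothetical rating histograms (scores 1..5) for two demographic groups.
group_a = [2, 5, 10, 6, 1]
group_b = [1, 2, 6, 10, 5]
distance = emd_1d(group_a, group_b)

# Hypothetical per-group distances; the Gini coefficient summarises how
# unevenly the differences are spread across groups.
spread = gini_coefficient([distance, 0.1, 0.4, 0.2])
```

A larger `emd_1d` value would indicate that two groups rate the same images more differently; how the paper combines the two measures is not stated in the abstract.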
Related papers
- Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks [26.53088863857899]
Image aesthetic assessment (IAA) has extensive applications in content creation, album management, and recommendation systems. State-of-the-art IAA models are typically designed for coarse-grained evaluation. We propose FGAesQ, a novel IAA framework that learns discriminative aesthetic scores from relative ranks.
arXiv Detail & Related papers (2026-03-04T10:13:27Z) - A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z) - Exploring Social Media Image Categorization Using Large Models with Different Adaptation Methods: A Case Study on Cultural Nature's Contributions to People [1.7736307382785161]
Social media images provide valuable insights for modeling, mapping, and understanding human interactions with natural and cultural heritage. Categorizing these images into semantically meaningful groups remains highly complex due to the vast diversity and heterogeneity of their visual content. We introduce FLIPS, a dataset of Flickr images that capture the interaction between humans and nature. We evaluate various solutions based on different types and combinations of large models using various adaptation methods.
arXiv Detail & Related papers (2024-09-30T23:04:55Z) - Deep Learning Activation Functions: Fixed-Shape, Parametric, Adaptive, Stochastic, Miscellaneous, Non-Standard, Ensemble [0.0]
Activation functions (AFs) play a pivotal role in the architecture of deep learning models.
This paper presents a comprehensive review of various types of AFs, including fixed-shape, adaptive, non-standard, and ensemble/combining types.
The paper concludes with a comparative evaluation of 12 state-of-the-art AFs.
arXiv Detail & Related papers (2024-07-14T17:53:49Z) - Towards Geographic Inclusion in the Evaluation of Text-to-Image Models [25.780536950323683]
We study how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images.
For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative.
We recommend steps for improved automatic and human evaluations.
arXiv Detail & Related papers (2024-05-07T16:23:06Z) - Adaptive Contextual Perception: How to Generalize to New Backgrounds and
Ambiguous Objects [75.15563723169234]
We investigate how vision models adaptively use context for out-of-distribution generalization.
We show that models that excel in one setting tend to struggle in the other.
To replicate the generalization abilities of biological vision, computer vision models must have factorized object vs. background representations.
arXiv Detail & Related papers (2023-06-09T15:29:54Z) - Fairness meets Cross-Domain Learning: a new perspective on Models and
Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z) - VILA: Learning Image Aesthetics from User Comments with Vision-Language
Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments, and exploring vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
arXiv Detail & Related papers (2023-03-24T23:57:28Z) - Multi-modal Facial Affective Analysis based on Masked Autoencoder [7.17338843593134]
We introduce our submission to the CVPR 2023: ABAW5 competition: Affective Behavior Analysis in-the-wild.
Our approach involves several key components. First, we utilize the visual information from a Masked Autoencoder (MAE) model that has been pre-trained on a large-scale face image dataset in a self-supervised manner.
Our approach achieves impressive results in the ABAW5 competition, with an average F1 score of 55.49% and 41.21% in the AU and EXPR tracks, respectively.
arXiv Detail & Related papers (2023-03-20T03:58:03Z) - Distilling Knowledge from Object Classification to Aesthetics Assessment [68.317720070755]
The major dilemma of image aesthetics assessment (IAA) comes from the abstract nature of aesthetic labels.
We propose to distill knowledge on semantic patterns for a vast variety of image contents to an IAA model.
By supervising an end-to-end single-backbone IAA model with the distilled knowledge, the performance of the IAA model is significantly improved.
arXiv Detail & Related papers (2022-06-02T00:39:01Z) - Mitigating Bias in Facial Analysis Systems by Incorporating Label Diversity [4.089080285684415]
We introduce a novel learning method that combines subjective human-based labels and objective annotations based on mathematical definitions of facial traits.
Our method successfully mitigates unintended biases, while maintaining significant accuracy on the downstream task.
arXiv Detail & Related papers (2022-04-13T13:17:27Z) - Personalized Image Aesthetics Assessment with Rich Attributes [35.61053167813472]
We conduct the most comprehensive subjective study of personalized image aesthetics and introduce a new Personalized image Aesthetics database with Rich Attributes (PARA).
PARA features rich annotations, including 9 image-oriented objective attributes and 4 human-oriented subjective attributes.
We also propose a conditional PIAA model by utilizing subject information as a conditional prior.
arXiv Detail & Related papers (2022-03-31T02:23:46Z) - Fair SA: Sensitivity Analysis for Fairness in Face Recognition [1.7149364927872013]
We propose a new fairness evaluation based on robustness in the form of a generic framework.
We analyze the performance of common face recognition models and empirically show that certain subgroups are at a disadvantage when images are perturbed.
arXiv Detail & Related papers (2022-02-08T01:16:09Z) - A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z) - Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models [86.79402670904338]
We evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions.
We have observed that image distortions have a relationship with the performance gap of the model across different subgroups.
arXiv Detail & Related papers (2021-08-14T16:49:05Z) - Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
arXiv Detail & Related papers (2021-07-30T19:24:07Z) - Understanding Adversarial Examples from the Mutual Influence of Images and Perturbations [83.60161052867534]
We analyze adversarial examples by disentangling the clean images and adversarial perturbations, and analyze their influence on each other.
Our results suggest a new perspective towards the relationship between images and universal perturbations.
We are the first to achieve the challenging task of a targeted universal attack without utilizing original training data.
arXiv Detail & Related papers (2020-07-13T05:00:09Z)