The Algorithmic Gaze: An Audit and Ethnography of the LAION-Aesthetics Predictor Model
- URL: http://arxiv.org/abs/2601.09896v1
- Date: Wed, 14 Jan 2026 21:59:44 GMT
- Title: The Algorithmic Gaze: An Audit and Ethnography of the LAION-Aesthetics Predictor Model
- Authors: Jordan Taylor, William Agnew, Maarten Sap, Sarah E. Fox, Haiyi Zhu
- Abstract summary: We study an aesthetic evaluation model, the LAION Aesthetic Predictor (LAP), that is widely used to curate datasets for training visual generative image models. LAP disproportionately filters in images with captions mentioning women, while filtering out images with captions mentioning men or LGBTQ+ people. In doing so, the algorithmic gaze of this aesthetic evaluation model reinforces the imperial and male gazes found within western art history.
- Score: 38.47280177852031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual generative AI models are trained using a one-size-fits-all measure of aesthetic appeal. However, what is deemed "aesthetic" is inextricably linked to personal taste and cultural values, raising the question of whose taste is represented in visual generative AI models. In this work, we study an aesthetic evaluation model--LAION Aesthetic Predictor (LAP)--that is widely used to curate datasets to train visual generative image models, like Stable Diffusion, and to evaluate the quality of AI-generated images. To understand what LAP measures, we audited the model across three datasets. First, we examined the impact of aesthetic filtering on the LAION-Aesthetics Dataset (approximately 1.2B images), which was curated from LAION-5B using LAP. We find that LAP disproportionately filters in images with captions mentioning women, while filtering out images with captions mentioning men or LGBTQ+ people. Then, we used LAP to score approximately 330k images across two art datasets, finding that the model rates realistic images of landscapes, cityscapes, and portraits from western and Japanese artists most highly. In doing so, the algorithmic gaze of this aesthetic evaluation model reinforces the imperial and male gazes found within western art history. To understand where these biases may have originated, we performed a digital ethnography of public materials related to the creation of LAP. We find that the development of LAP reflects the biases we found in our audits; for example, the aesthetic scores used to train LAP come primarily from English-speaking photographers and western AI enthusiasts. In response, we discuss how aesthetic evaluation can perpetuate representational harms and call on AI developers to shift away from prescriptive measures of "aesthetics" toward more pluralistic evaluation.
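The curation step the abstract describes, keeping or discarding images based on a predicted aesthetic score, can be sketched as simple threshold filtering. The scores, threshold value, and captions below are illustrative assumptions for exposition only, not the actual LAION pipeline or LAP outputs.

```python
# Minimal sketch of score-based aesthetic filtering (hypothetical
# scores and threshold; not the actual LAION-Aesthetics pipeline).

def filter_by_aesthetic_score(items, threshold=6.0):
    """Keep only (caption, score) pairs whose score meets the threshold."""
    return [(caption, score) for caption, score in items if score >= threshold]

# Hypothetical predicted aesthetic scores for captioned images.
scored = [
    ("portrait of a woman", 6.8),
    ("city street at night", 5.2),
    ("mountain landscape", 7.1),
]

kept = filter_by_aesthetic_score(scored)
print(kept)  # only the items scoring >= 6.0 survive the filter
```

Because the threshold is applied uniformly, any systematic difference in predicted scores across caption demographics translates directly into over- or under-representation in the filtered dataset, which is the mechanism the audit above examines.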
Related papers
- Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ. We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z)
- AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling [11.996211235559866]
Image Content Appeal Assessment (ICAA) is a novel metric that quantifies the level of positive interest an image's content generates for viewers.
ICAA is different from traditional Image-Aesthetics Assessment (IAA), which judges an image's artistic quality.
arXiv Detail & Related papers (2024-07-08T01:40:32Z)
- Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms [91.19304518033144]
We aim to align vision models with human aesthetic standards in a retrieval system.
We propose a preference-based reinforcement learning method that fine-tunes the vision models to better align the vision models with human aesthetics.
arXiv Detail & Related papers (2024-06-13T17:59:20Z)
- Unveiling The Factors of Aesthetic Preferences with Explainable AI [0.0]
In this study, we pioneer a novel perspective by utilizing several different machine learning (ML) models.
Our models process these attributes as inputs to predict the aesthetic scores of images.
Our aim is to shed light on the complex nature of aesthetic preferences in images through ML and to provide a deeper understanding of the attributes that influence aesthetic judgements.
arXiv Detail & Related papers (2023-11-24T11:06:22Z)
- Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method [64.40494830113286]
We first introduce a large-scale AIAA dataset: Boldbrush Artistic Image dataset (BAID), which consists of 60,337 artistic images covering various art forms.
We then propose a new method, SAAN, which can effectively extract and utilize style-specific and generic aesthetic information to evaluate artistic images.
Experiments demonstrate that our proposed approach outperforms existing IAA methods on the proposed BAID dataset.
arXiv Detail & Related papers (2023-03-27T12:59:15Z)
- VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments, and exploring vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
arXiv Detail & Related papers (2023-03-24T23:57:28Z)
- Aesthetically Relevant Image Captioning [17.081262827258943]
We study image AQA and IAC together and present a new IAC method termed Aesthetically Relevant Image Captioning (ARIC). ARIC includes an ARS-weighted IAC loss function and an ARS-based diverse aesthetic caption selector (DACS).
We show that texts with higher ARS's can predict the aesthetic ratings more accurately and that the new ARIC model can generate more accurate, aesthetically more relevant and more diverse image captions.
arXiv Detail & Related papers (2022-11-25T14:28:10Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
- Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment [6.201485014848172]
We propose the Reddit Photo Critique Dataset (RPCD), which contains 74K images and 220K comments.
We exploit the polarity of the sentiment of criticism as an indicator of aesthetic judgment.
arXiv Detail & Related papers (2022-06-17T08:16:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.