The Iconicity of the Generated Image
- URL: http://arxiv.org/abs/2509.16473v1
- Date: Fri, 19 Sep 2025 23:59:43 GMT
- Title: The Iconicity of the Generated Image
- Authors: Nanne van Noord, Noa Garcia
- Abstract summary: How humans interpret and produce images is influenced by the images we have been exposed to. Visual generative AI models are exposed to many training images and learn to generate new images based on this.
- Score: 22.154465616964256
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: How humans interpret and produce images is influenced by the images we have been exposed to. Similarly, visual generative AI models are exposed to many training images and learn to generate new images based on this. Given the importance of iconic images in human visual communication, as they are widely seen, reproduced, and used as inspiration, we might expect them to have a proportionally large influence within the generative AI process. In this work we explore this question through a three-part analysis, involving data attribution, semantic similarity analysis, and a user study. Our findings indicate that iconic images do not have an obvious influence on the generative process, and that for many icons it is challenging to reproduce an image that closely resembles them. This highlights an important difference in how humans and visual generative AI models draw on and learn from prior visual communication.
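The abstract mentions semantic similarity analysis as one of its three parts but gives no implementation details. A minimal sketch of how similarity between an iconic image and a generated image could be scored, assuming both have already been mapped to embedding vectors by some image encoder (e.g., a CLIP-style model; the vectors below are hypothetical placeholders, not the paper's data):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings standing in for encoder outputs.
icon_emb = np.array([0.8, 0.1, 0.6])  # embedding of an iconic image
gen_emb = np.array([0.7, 0.2, 0.5])   # embedding of a generated image

# A high score suggests the generated image is semantically close to the icon.
score = cosine_similarity(icon_emb, gen_emb)
```

In practice the paper compares many generated images against many icons, so the same scalar measure would be applied pairwise over the whole embedding sets.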
Related papers
- From Pixels to Feelings: Aligning MLLMs with Human Cognitive Perception of Images [36.44183173680125]
Multimodal Large Language Models (MLLMs) are adept at answering what is in an image (identifying objects) but often lack the ability to understand how an image feels to a human observer. This gap is most evident for subjective cognitive properties, such as what makes an image memorable, funny, aesthetically pleasing, or emotionally evocative. We introduce CogIP-Bench, a comprehensive benchmark for evaluating MLLMs on such image cognitive properties.
arXiv Detail & Related papers (2025-11-27T23:30:24Z)
- Simulated Cortical Magnification Supports Self-Supervised Object Learning [8.07351541700131]
Recent self-supervised learning models simulate the development of semantic object representations by training on visual experience similar to that of toddlers. Here, we investigate the role of the varying resolution produced by simulated cortical magnification in the development of object representations.
arXiv Detail & Related papers (2025-09-19T08:28:06Z)
- MiCo: Multi-image Contrast for Reinforcement Visual Reasoning [72.81576836419373]
Chain-of-Thought (CoT) reasoning can be used to link visual cues across multiple images. We adapt rule-based reinforcement learning for Vision-Language Models (VLMs). Our method achieves significant improvements on multi-image reasoning benchmarks and shows strong performance on general vision tasks.
arXiv Detail & Related papers (2025-06-27T17:59:27Z) - Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images [2.4861619769660637]
Image memorability refers to the phenomenon where certain images are more likely to be remembered than others. Despite advances in understanding human visual perception and memory, it is unclear what features contribute to an image's memorability. We employ an autoencoder-based approach built on VGG16 convolutional neural networks (CNNs) to learn latent representations of images.
arXiv Detail & Related papers (2024-10-19T22:58:33Z) - When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z) - Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images [34.02058539403381]
We leverage human semantic knowledge to investigate whether it can be incorporated into frameworks for fake image detection.
A preliminary statistical analysis is conducted to explore the distinctive patterns in how humans perceive genuine and altered images.
arXiv Detail & Related papers (2024-03-13T19:56:30Z) - Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images [67.18010640829682]
We show that AI-generated images introduce an invisible relevance bias to text-image retrieval models.
The inclusion of AI-generated images in the training data of the retrieval models exacerbates the invisible relevance bias.
We propose an effective training method aimed at alleviating the invisible relevance bias.
arXiv Detail & Related papers (2023-11-23T16:22:58Z) - Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
Through fine-tuning and few-shot adaptation, this dataset significantly improves their ability to model impressions and aesthetic evaluations of images.
arXiv Detail & Related papers (2023-10-27T04:30:18Z) - A domain adaptive deep learning solution for scanpath prediction of
paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which underlies several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z) - Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z) - AffectGAN: Affect-Based Generative Art Driven by Semantics [2.323282558557423]
This paper introduces a novel method for generating artistic images that express particular affective states.
Our AffectGAN model is able to generate images based on specific or broad semantic prompts and intended affective outcomes.
A small dataset of 32 images generated by AffectGAN is annotated by 50 participants in terms of the particular emotion they elicit, as well as their quality and novelty.
arXiv Detail & Related papers (2021-09-30T04:53:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.