Synthetic History: Evaluating Visual Representations of the Past in Diffusion Models
- URL: http://arxiv.org/abs/2505.17064v1
- Date: Sun, 18 May 2025 13:35:23 GMT
- Title: Synthetic History: Evaluating Visual Representations of the Past in Diffusion Models
- Authors: Maria-Teresa De Rosa Palmini, Eva Cetinic
- Abstract summary: We introduce the HistVis dataset, a curated collection of 30,000 synthetic images generated by three state-of-the-art diffusion models. We evaluate generated imagery across three key aspects: Implicit Stylistic Associations, Historical Consistency, and Demographic Representation. Our findings reveal systematic inaccuracies in historically themed generated imagery, as TTI models frequently stereotype past eras by incorporating unstated stylistic cues.
- Score: 0.6445605125467574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As Text-to-Image (TTI) diffusion models become increasingly influential in content creation, growing attention is being directed toward their societal and cultural implications. While prior research has primarily examined demographic and cultural biases, the ability of these models to accurately represent historical contexts remains largely underexplored. In this work, we present a systematic and reproducible methodology for evaluating how TTI systems depict different historical periods. For this purpose, we introduce the HistVis dataset, a curated collection of 30,000 synthetic images generated by three state-of-the-art diffusion models using carefully designed prompts depicting universal human activities across different historical periods. We evaluate generated imagery across three key aspects: (1) Implicit Stylistic Associations: examining default visual styles associated with specific eras; (2) Historical Consistency: identifying anachronisms such as modern artifacts in pre-modern contexts; and (3) Demographic Representation: comparing generated racial and gender distributions against historically plausible baselines. Our findings reveal systematic inaccuracies in historically themed generated imagery, as TTI models frequently stereotype past eras by incorporating unstated stylistic cues, introduce anachronisms, and fail to reflect plausible demographic patterns. By offering a scalable methodology and benchmark for assessing historical representation in generated imagery, this work provides an initial step toward building more historically accurate and culturally aligned TTI models.
Related papers
- When Cars Have Stereotypes: Auditing Demographic Bias in Objects from Text-to-Image Models [4.240144901142787]
We introduce SODA (Stereotyped Object Diagnostic Audit), a novel framework for measuring such biases. Our approach compares visual attributes of objects generated with demographic cues to those from neutral prompts. We uncover strong associations between specific demographic groups and visual attributes, such as recurring color patterns prompted by gender or ethnicity cues.
arXiv Detail & Related papers (2025-08-05T14:15:53Z)
- Holmes: Towards Effective and Harmless Model Ownership Verification to Personalized Large Vision Models via Decoupling Common Features [54.63343151319368]
This paper proposes a harmless model ownership verification method for personalized models by decoupling similar common features. In the first stage, we create shadow models that retain common features of the victim model while disrupting dataset-specific features. After that, a meta-classifier is trained to identify stolen models by determining whether suspicious models contain the dataset-specific features of the victim.
arXiv Detail & Related papers (2025-06-24T15:40:11Z)
- Text-to-Image Models and Their Representation of People from Different Nationalities Engaging in Activities [2.7195102129095003]
In one scenario, the majority of images, and in the other, a substantial portion, depict individuals wearing traditional attire. A statistically significant relationship was observed between this representation pattern and the regions associated with the specified countries. This indicates that the issue disproportionately affects certain areas, particularly the Middle East & North Africa and Sub-Saharan Africa.
arXiv Detail & Related papers (2025-04-08T05:37:06Z)
- Exploring Bias in over 100 Text-to-Image Generative Models [49.60774626839712]
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. We assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased.
arXiv Detail & Related papers (2025-03-11T03:40:44Z)
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DIFfusionHOI, a new HOI detector shedding light on text-to-image diffusion models.
We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space.
These learned relation embeddings then serve as textual prompts to steer diffusion models to generate images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z)
- The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention [61.80236015147771]
We quantify the trade-off between using diversity interventions and preserving demographic factuality in T2I models.
Experiments on DoFaiR reveal that diversity-oriented instructions increase the number of different gender and racial groups.
We propose Fact-Augmented Intervention (FAI) to reflect on verbalized or retrieved factual information about gender and racial compositions of generation subjects in history.
arXiv Detail & Related papers (2024-06-29T09:09:42Z)
- T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality [52.5529784801908]
We focus on addressing the generation of problematic associations between demographic groups and semantic concepts.
We propose a new methodology with twice-human-in-the-loop (T-HITL) that promises improvements in both reducing problematic associations and maintaining visual quality.
arXiv Detail & Related papers (2024-02-27T00:29:33Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Semi-supervised Human Pose Estimation in Art-historical Images [9.633949256082763]
We propose a novel approach to estimate human poses in art-historical images.
Our approach achieves significantly better results than methods that use pre-trained models or style transfer.
arXiv Detail & Related papers (2022-07-06T21:20:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.