Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2512.03749v1
- Date: Wed, 03 Dec 2025 12:46:42 GMT
- Title: Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models
- Authors: Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas,
- Abstract summary: Text-to-image (T2I) diffusion models have achieved widespread success due to their ability to generate high-resolution, photorealistic images.<n>We introduce SelfDebias, a fully unsupervised test-time debiasing method applicable to any diffusion model that uses a UNet as its noise predictor.
- Score: 7.9240590529889205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image (T2I) diffusion models have achieved widespread success due to their ability to generate high-resolution, photorealistic images. These models are trained on large-scale datasets, like LAION-5B, often scraped from the internet. However, since this data contains numerous biases, the models inherently learn and reproduce them, resulting in stereotypical outputs. We introduce SelfDebias, a fully unsupervised test-time debiasing method applicable to any diffusion model that uses a UNet as its noise predictor. SelfDebias identifies semantic clusters in an image encoder's embedding space and uses these clusters to guide the diffusion process during inference, minimizing the KL divergence between the output distribution and the uniform distribution. Unlike supervised approaches, SelfDebias does not require human-annotated datasets or external classifiers trained for each generated concept. Instead, it is designed to automatically identify semantic modes. Extensive experiments show that SelfDebias generalizes across prompts and diffusion model architectures, including both conditional and unconditional models. It not only effectively debiases images along key demographic dimensions while maintaining the visual fidelity of the generated images, but also more abstract concepts for which identifying biases is also challenging.
Related papers
- Consistent World Models via Foresight Diffusion [56.45012929930605]
We argue that a key bottleneck in learning consistent diffusion-based world models lies in the suboptimal predictive ability.<n>We propose Foresight Diffusion (ForeDiff), a diffusion-based world modeling framework that enhances consistency by decoupling condition understanding from target denoising.
arXiv Detail & Related papers (2025-05-22T10:01:59Z) - InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models [28.51460282167433]
diffusion models are highly data-driven and prone to inheriting imbalances and biases present in real-world data.<n>We propose a framework, InvDiff, which aims to learn invariant semantic information for diffusion guidance.<n>InvDiff effectively reduces biases while maintaining the quality of image generation.
arXiv Detail & Related papers (2024-12-11T15:47:11Z) - Debiasing Classifiers by Amplifying Bias with Latent Diffusion and Large Language Models [9.801159950963306]
We introduce DiffuBias, a novel pipeline for text-to-image generation that enhances classifier robustness by generating bias-conflict samples.
DrouBias is the first approach leveraging a stable diffusion model to generate bias-conflict samples in debiasing tasks.
Our comprehensive experimental evaluations demonstrate that DiffuBias achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2024-11-25T04:11:16Z) - Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models [0.0]
Text-to-Image (TTI) systems have come under increased scrutiny for social biases.
We investigate embedding spaces as a source of bias for TTI models.
We find that biased multimodal embeddings like CLIP can result in lower alignment scores for representationally balanced TTI models.
arXiv Detail & Related papers (2024-09-15T01:09:55Z) - VersusDebias: Universal Zero-Shot Debiasing for Text-to-Image Models via SLM-Based Prompt Engineering and Generative Adversary [8.24274551090375]
We introduce VersusDebias, a novel and universal debiasing framework for biases in arbitrary Text-to-Image (T2I) models.
The self-adaptive module generates specialized attribute arrays to post-process hallucinations and debias multiple attributes simultaneously.
In both zero-shot and few-shot scenarios, VersusDebias outperforms existing methods, showcasing its exceptional utility.
arXiv Detail & Related papers (2024-07-28T16:24:07Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI)
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion)
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - Unbiased Image Synthesis via Manifold Guidance in Diffusion Models [9.531220208352252]
Diffusion Models often inadvertently favor certain data attributes, undermining the diversity of generated images.
We propose a plug-and-play method named Manifold Sampling Guidance, which is also the first unsupervised method to mitigate bias issue in DDPMs.
arXiv Detail & Related papers (2023-07-17T02:03:17Z) - Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.