Related papers: BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation

BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation

URL: http://arxiv.org/abs/2509.13496v1
Date: Tue, 16 Sep 2025 19:52:12 GMT
Title: BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation
Authors: Rajatsubhra Chakraborty, Xujun Che, Depeng Xu, Cori Faklaris, Xi Niu, Shuhan Yuan,
Abstract summary: BiasMap is a model-agnostic framework for uncovering latent concept-level representational biases.<n>Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling.
Score: 14.110668963732273
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Bias discovery is critical for black-box generative models, especiall text-to-image (TTI) models. Existing works predominantly focus on output-level demographic distributions, which do not necessarily guarantee concept representations to be disentangled post-mitigation. We propose BiasMap, a model-agnostic framework for uncovering latent concept-level representational biases in stable diffusion models. BiasMap leverages cross-attention attribution maps to reveal structural entanglements between demographics (e.g., gender, race) and semantics (e.g., professions), going deeper into representational bias during the image generation. Using attribution maps of these concepts, we quantify the spatial demographics-semantics concept entanglement via Intersection over Union (IoU), offering a lens into bias that remains hidden in existing fairness discovery approaches. In addition, we further utilize BiasMap for bias mitigation through energy-guided diffusion sampling that directly modifies latent noise space and minimizes the expected SoftIoU during the denoising process. Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling, whereas our mitigation method can mitigate concept entanglement in image generation while complementing distributional bias mitigation.

Related papers

Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models [7.9240590529889205]
Text-to-image (T2I) diffusion models have achieved widespread success due to their ability to generate high-resolution, photorealistic images.<n>We introduce SelfDebias, a fully unsupervised test-time debiasing method applicable to any diffusion model that uses a UNet as its noise predictor.
arXiv Detail & Related papers (2025-12-03T12:46:42Z)
SCALEX: Scalable Concept and Latent Exploration for Diffusion Models [59.86284983662119]
Image generation models frequently encode social biases, including stereotypes tied to gender, race, and profession.<n>We introduce SCALEX, a framework for scalable and automated exploration of diffusion model latent spaces.<n>It extracts semantically meaningful directions from H-space using only natural language prompts, enabling zero-shot interpretation without retraining or labelling.
arXiv Detail & Related papers (2025-11-13T22:02:44Z)
FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models [10.857020427374506]
We introduce FairImagen, a post-hoc debiasing framework that operates on prompt embeddings to mitigate societal biases.<n>Our framework outperforms existing post-hoc methods and offers a simple, scalable, and model-agnostic solution for equitable text-to-image generation.
arXiv Detail & Related papers (2025-10-24T11:47:15Z)
G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models [38.44872934965588]
This paper considers the problem of utilizing a large-scale text-to-image model to tackle the Inexact diffusion (IS) task.<n>We exploit the pattern discrepancies between original images and mask-conditional generated images to facilitate a coarse-to-fine segmentation refinement.
arXiv Detail & Related papers (2025-06-02T11:05:28Z)
TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models [0.0]
Concept erasure aims to remove or suppress specific concept information in a generative model.<n>Trajectory-Constrained Attentional Concept Erasure (TRACE) is a novel method to erase targeted concepts from diffusion models.<n>TRACE achieves state-of-the-art performance, outperforming recent methods such as ANT, EraseAnything, and MACE in terms of removal efficacy and output quality.
arXiv Detail & Related papers (2025-05-29T10:15:22Z)
A Meaningful Perturbation Metric for Evaluating Explainability Methods [55.09730499143998]
We introduce a novel approach, which harnesses image generation models to perform targeted perturbation.<n> Specifically, we focus on inpainting only the high-relevance pixels of an input image to modify the model's predictions while preserving image fidelity.<n>This is in contrast to existing approaches, which often produce out-of-distribution modifications, leading to unreliable results.
arXiv Detail & Related papers (2025-04-09T11:46:41Z)
DPBridge: Latent Diffusion Bridge for Dense Prediction [49.1574468325115]
We introduce DPBridge, the first latent diffusion bridge framework for dense prediction tasks.<n>Our method consistently achieves superior performance, demonstrating its effectiveness and capability generalization under different scenarios.
arXiv Detail & Related papers (2024-12-29T15:50:34Z)
Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models [0.0]
Text-to-Image (TTI) systems have come under increased scrutiny for social biases. We investigate embedding spaces as a source of bias for TTI models. We find that biased multimodal embeddings like CLIP can result in lower alignment scores for representationally balanced TTI models.
arXiv Detail & Related papers (2024-09-15T01:09:55Z)
GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added. From the global perspective, the difficulty of reconstructing images with different anomalies is uneven. We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z)
MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models [3.3454373538792552]
We introduce a method that addresses intersectional bias in diffusion-based text-to-image models by modifying cross-attention maps in a disentangled manner. Our approach utilizes a pre-trained Stable Diffusion model, eliminates the need for an additional set of reference images, and preserves the original quality for unaltered concepts.
arXiv Detail & Related papers (2024-03-28T17:54:38Z)
Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data. We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias. This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
Exploring Social Bias in Downstream Applications of Text-to-Image Foundation Models [72.06006736916821]
We use synthetic images to probe two applications of text-to-image models, image editing and classification, for social bias. Using our methodology, we uncover meaningful and significant inter-sectional social biases in textitStable Diffusion, a state-of-the-art open-source text-to-image model. Our findings caution against the uninformed adoption of text-to-image foundation models for downstream tasks and services.
arXiv Detail & Related papers (2023-12-05T14:36:49Z)
Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models [72.07462371883501]
We propose emphProjection Regret (PR), an efficient novelty detection method that mitigates the bias of non-semantic information. PR computes the perceptual distance between the test image and its diffusion-based projection to detect abnormality. Extensive experiments demonstrate that PR outperforms the prior art of generative-model-based novelty detection methods by a significant margin.
arXiv Detail & Related papers (2023-12-05T09:44:47Z)
Unmasking Bias in Diffusion Model Training [40.90066994983719]
Denoising diffusion models have emerged as a dominant approach for image generation. They still suffer from slow convergence in training and color shift issues in sampling. In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm.
arXiv Detail & Related papers (2023-10-12T16:04:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.