Mitigating Bias in Visual Transformers via Targeted Alignment
- URL: http://arxiv.org/abs/2302.04358v1
- Date: Wed, 8 Feb 2023 22:11:14 GMT
- Title: Mitigating Bias in Visual Transformers via Targeted Alignment
- Authors: Sruthi Sudhakar, Viraj Prabhu, Arvindkumar Krishnakumar, Judy Hoffman
- Abstract summary: We study the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work.
We propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features.
- Score: 8.674650784377196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As transformer architectures become increasingly prevalent in computer
vision, it is critical to understand their fairness implications. We perform
the first study of the fairness of transformers applied to computer vision and
benchmark several bias mitigation approaches from prior work. We visualize the
feature space of the transformer self-attention modules and discover that a
significant portion of the bias is encoded in the query matrix. With this
knowledge, we propose TADeT, a targeted alignment strategy for debiasing
transformers that aims to discover and remove bias primarily from query matrix
features. We measure performance using Balanced Accuracy and Standard Accuracy,
and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT
consistently leads to improved fairness over prior work on multiple attribute
prediction tasks on the CelebA dataset, without compromising performance.
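The abstract reports performance with Standard Accuracy and Balanced Accuracy and fairness with Equalized Odds and Balanced Accuracy Difference. The snippet below is a minimal sketch of how such metrics are commonly computed for a binary attribute prediction task with a binary protected attribute (as on CelebA); the exact definitions used in the paper may differ in detail, and all function and variable names are illustrative.
```python
import numpy as np

def rates(y_true, y_pred):
    """True-positive and false-positive rates for binary 0/1 labels."""
    tpr = float(np.mean(y_pred[y_true == 1] == 1)) if np.any(y_true == 1) else 0.0
    fpr = float(np.mean(y_pred[y_true == 0] == 1)) if np.any(y_true == 0) else 0.0
    return tpr, fpr

def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive and true-negative rates."""
    tpr, fpr = rates(y_true, y_pred)
    return 0.5 * (tpr + (1.0 - fpr))

def fairness_metrics(y_true, y_pred, group):
    """Accuracy and fairness metrics for binary predictions and a binary protected attribute."""
    g0, g1 = group == 0, group == 1
    tpr0, fpr0 = rates(y_true[g0], y_pred[g0])
    tpr1, fpr1 = rates(y_true[g1], y_pred[g1])
    return {
        "standard_accuracy": float(np.mean(y_true == y_pred)),
        "balanced_accuracy": balanced_accuracy(y_true, y_pred),
        # Equalized Odds gap: worst-case disparity in TPR/FPR across groups (lower is fairer).
        "equalized_odds_gap": max(abs(tpr0 - tpr1), abs(fpr0 - fpr1)),
        # Balanced Accuracy Difference across the two protected groups (lower is fairer).
        "balanced_accuracy_diff": abs(balanced_accuracy(y_true[g0], y_pred[g0])
                                      - balanced_accuracy(y_true[g1], y_pred[g1])),
    }

# Example call with random data, just to show the interface.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)
print(fairness_metrics(y_true, y_pred, group))
```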
Related papers
- ResiDual Transformer Alignment with Spectral Decomposition [31.14332778586179]
We analyze the phenomenon in vision transformers, focusing on the spectral geometry of residuals.
We show that they encode specialized roles across a wide variety of input data distributions.
We introduce ResiDual, a technique for spectral alignment of the residual stream.
arXiv Detail & Related papers (2024-10-31T22:51:45Z)
- Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization [88.5582111768376]
We study the optimization of a Transformer composed of a self-attention layer with softmax followed by a fully connected layer under gradient descent on a certain data distribution model.
Our results establish a sharp condition that can distinguish between the small test error phase and the large test error regime, based on the signal-to-noise ratio in the data model.
arXiv Detail & Related papers (2024-09-28T13:24:11Z)
- Simplicity Bias of Transformers to Learn Low Sensitivity Functions [19.898451497341714]
Transformers achieve state-of-the-art accuracy and robustness across many tasks.
An understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
arXiv Detail & Related papers (2024-03-11T17:12:09Z)
- Multi-Dimensional Hyena for Spatial Inductive Bias [69.3021852589771]
We present a data-efficient vision transformer that does not rely on self-attention.
Instead, it employs a novel generalization of the recent Hyena layer to multiple axes.
We show that a hybrid approach that is based on Hyena N-D for the first layers in ViT, followed by layers that incorporate conventional attention, consistently boosts the performance of various vision transformer architectures.
arXiv Detail & Related papers (2023-09-24T10:22:35Z)
- Reviving Shift Equivariance in Vision Transformers [12.720600348466498]
We propose an adaptive polyphase anchoring algorithm that can be seamlessly integrated into vision transformer models.
Our algorithms enable ViT and its variants, such as Twins, to achieve 100% consistency with respect to input shift; a minimal sketch of the polyphase idea appears below.
arXiv Detail & Related papers (2023-06-13T00:13:11Z)
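The consistency claim above rests on anchoring the patch grid to the input content rather than to absolute pixel coordinates, so that shifting the image does not change which pixels land in which patch. The sketch below illustrates the general polyphase idea under stated assumptions: the anchor offset is chosen by a simple max-L2-norm criterion and applied with a circular roll before the standard stride-p patch embedding. The paper's exact anchoring rule and its integration into ViT and Twins may differ, and the function name is illustrative.
```python
import torch

def polyphase_anchor(x: torch.Tensor, patch: int) -> torch.Tensor:
    """Roll each image so its patch grid is anchored by content, not absolute position.

    x: (B, C, H, W) batch; patch: ViT patch size. For every image we pick the
    (dy, dx) offset in [0, patch)^2 whose polyphase component has the largest
    L2 norm, then roll the image so that offset becomes (0, 0). A circularly
    shifted copy of the image is therefore mapped to the same anchored view.
    """
    B = x.shape[0]
    best_norm = torch.full((B,), -1.0, device=x.device)
    best_dy = torch.zeros(B, dtype=torch.long, device=x.device)
    best_dx = torch.zeros(B, dtype=torch.long, device=x.device)
    for dy in range(patch):
        for dx in range(patch):
            comp = x[:, :, dy::patch, dx::patch]      # one polyphase component
            n = comp.flatten(1).norm(dim=1)           # per-image L2 norm
            better = n > best_norm
            best_norm = torch.where(better, n, best_norm)
            best_dy = torch.where(better, torch.full_like(best_dy, dy), best_dy)
            best_dx = torch.where(better, torch.full_like(best_dx, dx), best_dx)
    anchored = torch.stack([
        torch.roll(img, shifts=(-int(dy), -int(dx)), dims=(1, 2))
        for img, dy, dx in zip(x, best_dy, best_dx)
    ])
    return anchored  # feed into the usual stride-`patch` patch embedding
```
Because the chosen offset moves together with the image content, a shifted input is rolled back to essentially the same anchored view before patching, which is what makes the downstream prediction shift-consistent.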
- 2-D SSM: A General Spatial Layer for Visual Transformers [79.4957965474334]
A central objective in computer vision is to design models with appropriate 2-D inductive bias.
We leverage an expressive variation of the multidimensional State Space Model.
Our approach introduces efficient parameterization, accelerated computation, and a suitable normalization scheme.
arXiv Detail & Related papers (2023-06-11T09:41:37Z)
- Remote Sensing Change Detection With Transformers Trained from Scratch [62.96911491252686]
Transformer-based change detection (CD) approaches either employ a model pre-trained on the large-scale ImageNet image classification dataset or rely on first pre-training on another CD dataset and then fine-tuning on the target benchmark.
We develop an end-to-end CD approach with transformers that is trained from scratch and yet achieves state-of-the-art performance on four public benchmarks.
arXiv Detail & Related papers (2023-04-13T17:57:54Z)
- XAI for Transformers: Better Explanations through Conservative Propagation [60.67748036747221]
We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction.
Our proposal can be seen as a proper extension of the well-established LRP method to Transformers; a minimal sketch of the underlying idea appears below.
arXiv Detail & Related papers (2022-02-15T10:47:11Z)
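The finding above is that a plain gradient pass sends attribution through the attention weights and the LayerNorm scaling, which breaks relevance conservation; the conservative fix amounts to holding those multiplicative factors constant in the backward pass so that relevance flows only through the value path. The toy snippet below sketches that detach idea with gradient-times-input on a single attention head; it is an illustration under simplifying assumptions, not the authors' LRP implementation, and all names are made up for the example.
```python
import torch
import torch.nn.functional as F

def attention_forward(x, Wq, Wk, Wv, conservative=True):
    """Single-head self-attention; optionally detach the attention weights.

    x: (T, d) token embeddings. With `conservative=True`, the softmax attention
    matrix is treated as a constant in the backward pass, so gradients (and hence
    gradient-times-input relevances) flow only through the value path.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
    if conservative:
        attn = attn.detach()          # hold the attention weights constant
    return attn @ v

def relevance(x, Wq, Wk, Wv, conservative=True):
    """Gradient-times-input attribution of the pooled output w.r.t. each token."""
    x = x.clone().requires_grad_(True)
    out = attention_forward(x, Wq, Wk, Wv, conservative).sum()
    (grad,) = torch.autograd.grad(out, x)
    return (grad * x).sum(dim=-1)     # one relevance score per token

# Tiny usage example with random weights (illustrative only).
torch.manual_seed(0)
T, d = 5, 8
x = torch.randn(T, d)
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
print(relevance(x, Wq, Wk, Wv, conservative=True))
print(relevance(x, Wq, Wk, Wv, conservative=False))
```
Switching `conservative` off reproduces the plain gradient-times-input behaviour the entry criticizes, which makes the two attributions easy to compare on the same toy input.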
- CETransformer: Casual Effect Estimation via Transformer Based Representation Learning [17.622007687796756]
Data-driven causal effect estimation faces two main challenges: selection bias and missing counterfactuals.
To address these two issues, most existing approaches reduce selection bias by learning a balanced representation.
We propose CETransformer, a model for causal effect estimation via transformer-based representation learning.
arXiv Detail & Related papers (2021-07-19T09:39:57Z)
- Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)
- Toward Transformer-Based Object Detection [12.704056181392415]
Vision Transformers can serve as a backbone for a common detection task head and produce competitive COCO results.
ViT-FRCNN demonstrates several known properties associated with transformers, including large pretraining capacity and fast fine-tuning performance.
We view ViT-FRCNN as an important stepping stone toward a pure-transformer solution of complex vision tasks such as object detection.
arXiv Detail & Related papers (2020-12-17T22:33:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.