Mitigating Bias in Visual Transformers via Targeted Alignment
- URL: http://arxiv.org/abs/2302.04358v1
- Date: Wed, 8 Feb 2023 22:11:14 GMT
- Title: Mitigating Bias in Visual Transformers via Targeted Alignment
- Authors: Sruthi Sudhakar, Viraj Prabhu, Arvindkumar Krishnakumar, Judy Hoffman
- Abstract summary: We study the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work.
We propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features.
- Score: 8.674650784377196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As transformer architectures become increasingly prevalent in computer
vision, it is critical to understand their fairness implications. We perform
the first study of the fairness of transformers applied to computer vision and
benchmark several bias mitigation approaches from prior work. We visualize the
feature space of the transformer self-attention modules and discover that a
significant portion of the bias is encoded in the query matrix. With this
knowledge, we propose TADeT, a targeted alignment strategy for debiasing
transformers that aims to discover and remove bias primarily from query matrix
features. We measure performance using Balanced Accuracy and Standard Accuracy,
and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT
consistently leads to improved fairness over prior work on multiple attribute
prediction tasks on the CelebA dataset, without compromising performance.
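The abstract reports performance with Standard Accuracy and Balanced Accuracy and fairness with Equalized Odds and Balanced Accuracy Difference. The snippet below is a minimal sketch of how such metrics are commonly computed for a binary attribute prediction task with a binary protected attribute (as on CelebA); the exact definitions used in the paper may differ in detail, and all function and variable names are illustrative.
```python
import numpy as np

def rates(y_true, y_pred):
    """True-positive and false-positive rates for binary 0/1 labels."""
    tpr = float(np.mean(y_pred[y_true == 1] == 1)) if np.any(y_true == 1) else 0.0
    fpr = float(np.mean(y_pred[y_true == 0] == 1)) if np.any(y_true == 0) else 0.0
    return tpr, fpr

def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive and true-negative rates."""
    tpr, fpr = rates(y_true, y_pred)
    return 0.5 * (tpr + (1.0 - fpr))

def fairness_metrics(y_true, y_pred, group):
    """Accuracy and fairness metrics for binary predictions and a binary protected attribute."""
    g0, g1 = group == 0, group == 1
    tpr0, fpr0 = rates(y_true[g0], y_pred[g0])
    tpr1, fpr1 = rates(y_true[g1], y_pred[g1])
    return {
        "standard_accuracy": float(np.mean(y_true == y_pred)),
        "balanced_accuracy": balanced_accuracy(y_true, y_pred),
        # Equalized Odds gap: worst-case disparity in TPR/FPR across groups (lower is fairer).
        "equalized_odds_gap": max(abs(tpr0 - tpr1), abs(fpr0 - fpr1)),
        # Balanced Accuracy Difference across the two protected groups (lower is fairer).
        "balanced_accuracy_diff": abs(balanced_accuracy(y_true[g0], y_pred[g0])
                                      - balanced_accuracy(y_true[g1], y_pred[g1])),
    }

# Example call with random data, just to show the interface.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)
print(fairness_metrics(y_true, y_pred, group))
```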
Related papers
- ResiDual Transformer Alignment with Spectral Decomposition [31.14332778586179]
We analyze the phenomenon in vision transformers, focusing on the spectral geometry of residuals.
We show that they encode specialized roles across a wide variety of input data distributions.
We introduce ResiDual, a technique for spectral alignment of the residual stream.
arXiv Detail & Related papers (2024-10-31T22:51:45Z)
- Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization [88.5582111768376]
We study the optimization of a Transformer composed of a self-attention layer with softmax followed by a fully connected layer under gradient descent on a certain data distribution model.
Our results establish a sharp condition that can distinguish between the small test error phase and the large test error regime, based on the signal-to-noise ratio in the data model.
arXiv Detail & Related papers (2024-09-28T13:24:11Z)
- Simplicity Bias of Transformers to Learn Low Sensitivity Functions [19.898451497341714]
Transformers achieve state-of-the-art accuracy and robustness across many tasks.
An understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
arXiv Detail & Related papers (2024-03-11T17:12:09Z)
- Multi-Dimensional Hyena for Spatial Inductive Bias [69.3021852589771]
We present a data-efficient vision transformer that does not rely on self-attention.
Instead, it employs a novel generalization of the recent Hyena layer to multiple axes.
We show that a hybrid approach that is based on Hyena N-D for the first layers in ViT, followed by layers that incorporate conventional attention, consistently boosts the performance of various vision transformer architectures.
arXiv Detail & Related papers (2023-09-24T10:22:35Z)
- Reviving Shift Equivariance in Vision Transformers [12.720600348466498]
We propose an adaptive polyphase anchoring algorithm that can be seamlessly integrated into vision transformer models.
Our algorithms enable ViT and its variants, such as Twins, to achieve 100% consistency with respect to input shift; a minimal sketch of the polyphase idea appears below.
arXiv Detail & Related papers (2023-06-13T00:13:11Z)
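The consistency claim above rests on anchoring the patch grid to the input content rather than to absolute pixel coordinates, so that shifting the image does not change which pixels land in which patch. The sketch below illustrates the general polyphase idea under stated assumptions: the anchor offset is chosen by a simple max-L2-norm criterion and applied with a circular roll before the standard stride-p patch embedding. The paper's exact anchoring rule and its integration into ViT and Twins may differ, and the function name is illustrative.
```python
import torch

def polyphase_anchor(x: torch.Tensor, patch: int) -> torch.Tensor:
    """Roll each image so its patch grid is anchored by content, not absolute position.

    x: (B, C, H, W) batch; patch: ViT patch size. For every image we pick the
    (dy, dx) offset in [0, patch)^2 whose polyphase component has the largest
    L2 norm, then roll the image so that offset becomes (0, 0). A circularly
    shifted copy of the image is therefore mapped to the same anchored view.
    """
    B = x.shape[0]
    best_norm = torch.full((B,), -1.0, device=x.device)
    best_dy = torch.zeros(B, dtype=torch.long, device=x.device)
    best_dx = torch.zeros(B, dtype=torch.long, device=x.device)
    for dy in range(patch):
        for dx in range(patch):
            comp = x[:, :, dy::patch, dx::patch]      # one polyphase component
            n = comp.flatten(1).norm(dim=1)           # per-image L2 norm
            better = n > best_norm
            best_norm = torch.where(better, n, best_norm)
            best_dy = torch.where(better, torch.full_like(best_dy, dy), best_dy)
            best_dx = torch.where(better, torch.full_like(best_dx, dx), best_dx)
    anchored = torch.stack([
        torch.roll(img, shifts=(-int(dy), -int(dx)), dims=(1, 2))
        for img, dy, dx in zip(x, best_dy, best_dx)
    ])
    return anchored  # feed into the usual stride-`patch` patch embedding
```
Because the chosen offset moves together with the image content, a shifted input is rolled back to essentially the same anchored view before patching, which is what makes the downstream prediction shift-consistent.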
- 2-D SSM: A General Spatial Layer for Visual Transformers [79.4957965474334]
A central objective in computer vision is to design models with appropriate 2-D inductive bias.
We leverage an expressive variation of the multidimensional State Space Model.
Our approach introduces efficient parameterization, accelerated computation, and a suitable normalization scheme.
arXiv Detail & Related papers (2023-06-11T09:41:37Z)
- Remote Sensing Change Detection With Transformers Trained from Scratch [62.96911491252686]
Transformer-based change detection (CD) approaches either employ a model pre-trained on the large-scale ImageNet image classification dataset or rely on first pre-training on another CD dataset and then fine-tuning on the target benchmark.
We develop an end-to-end CD approach with transformers that is trained from scratch and yet achieves state-of-the-art performance on four public benchmarks.
arXiv Detail & Related papers (2023-04-13T17:57:54Z)
- XAI for Transformers: Better Explanations through Conservative Propagation [60.67748036747221]
We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction.
Our proposal can be seen as a proper extension of the well-established LRP method to Transformers; a minimal sketch of the underlying idea appears below.
arXiv Detail & Related papers (2022-02-15T10:47:11Z)
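The finding above is that a plain gradient pass sends attribution through the attention weights and the LayerNorm scaling, which breaks relevance conservation; the conservative fix amounts to holding those multiplicative factors constant in the backward pass so that relevance flows only through the value path. The toy snippet below sketches that detach idea with gradient-times-input on a single attention head; it is an illustration under simplifying assumptions, not the authors' LRP implementation, and all names are made up for the example.
```python
import torch
import torch.nn.functional as F

def attention_forward(x, Wq, Wk, Wv, conservative=True):
    """Single-head self-attention; optionally detach the attention weights.

    x: (T, d) token embeddings. With `conservative=True`, the softmax attention
    matrix is treated as a constant in the backward pass, so gradients (and hence
    gradient-times-input relevances) flow only through the value path.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
    if conservative:
        attn = attn.detach()          # hold the attention weights constant
    return attn @ v

def relevance(x, Wq, Wk, Wv, conservative=True):
    """Gradient-times-input attribution of the pooled output w.r.t. each token."""
    x = x.clone().requires_grad_(True)
    out = attention_forward(x, Wq, Wk, Wv, conservative).sum()
    (grad,) = torch.autograd.grad(out, x)
    return (grad * x).sum(dim=-1)     # one relevance score per token

# Tiny usage example with random weights (illustrative only).
torch.manual_seed(0)
T, d = 5, 8
x = torch.randn(T, d)
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
print(relevance(x, Wq, Wk, Wv, conservative=True))
print(relevance(x, Wq, Wk, Wv, conservative=False))
```
Switching `conservative` off reproduces the plain gradient-times-input behaviour the entry criticizes, which makes the two attributions easy to compare on the same toy input.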
- CETransformer: Casual Effect Estimation via Transformer Based Representation Learning [17.622007687796756]
Data-driven causal effect estimation faces two main challenges: selection bias and missing counterfactuals.
To address these two issues, most existing approaches reduce selection bias by learning a balanced representation.
We propose CETransformer, a model for causal effect estimation via transformer-based representation learning.
arXiv Detail & Related papers (2021-07-19T09:39:57Z)
- Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)
- Toward Transformer-Based Object Detection [12.704056181392415]
Vision Transformers can serve as a backbone for a common detection task head and produce competitive COCO results.
ViT-FRCNN demonstrates several known properties associated with transformers, including large pretraining capacity and fast fine-tuning performance.
We view ViT-FRCNN as an important stepping stone toward a pure-transformer solution of complex vision tasks such as object detection.
arXiv Detail & Related papers (2020-12-17T22:33:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.