Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers
- URL: http://arxiv.org/abs/2504.20902v1
- Date: Tue, 29 Apr 2025 16:19:38 GMT
- Title: Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers
- Authors: Quentin Guimard, Moreno D'Incà, Massimiliano Mancini, Elisa Ricci
- Abstract summary: Existing approaches for bias identification rely on datasets containing labels for the task of interest. We present Classifier-to-Bias (C2B), the first bias discovery framework that works without access to any labeled data. C2B is training-free, does not require any annotations, has no constraints on the list of biases, and can be applied to any pre-trained model on any classification task.
- Score: 25.909153114646692
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A person downloading a pre-trained model from the web should be aware of its biases. Existing approaches for bias identification rely on datasets containing labels for the task of interest, something that a non-expert may not have access to, or may not have the necessary resources to collect: this greatly limits the number of tasks where model biases can be identified. In this work, we present Classifier-to-Bias (C2B), the first bias discovery framework that works without access to any labeled data: it only relies on a textual description of the classification task to identify biases in the target classification model. This description is fed to a large language model to generate bias proposals and corresponding captions depicting biases together with task-specific target labels. A retrieval model collects images for those captions, which are then used to assess the accuracy of the model w.r.t. the given biases. C2B is training-free, does not require any annotations, has no constraints on the list of biases, and can be applied to any pre-trained model on any classification task. Experiments on two publicly available datasets show that C2B discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations, being a promising first step toward addressing task-agnostic unsupervised bias detection.
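The pipeline described in the abstract (captions tagged with a bias and a target label, image retrieval per caption, then accuracy per bias) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Caption`, `per_bias_accuracy`, and `accuracy_gap` are hypothetical, the LLM step is replaced by a hand-written caption list, the retrieval model and classifier are toy stand-ins that operate on strings, and the max-minus-min accuracy gap is an assumed scoring choice rather than the metric used in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Caption:
    """One generated caption (hypothetical structure, not from the paper)."""
    text: str        # caption used as the retrieval query
    bias: str        # bias proposal, e.g. "background"
    bias_class: str  # value of that bias, e.g. "indoor"
    label: str       # task-specific target label depicted in the caption


def per_bias_accuracy(
    captions: Iterable[Caption],
    retrieve: Callable[[str], List[str]],
    classify: Callable[[str], str],
) -> Dict[Tuple[str, str], float]:
    """Accuracy of `classify` on retrieved images, grouped by (bias, bias_class)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [correct, total]
    for cap in captions:
        for image in retrieve(cap.text):
            key = (cap.bias, cap.bias_class)
            counts[key][0] += int(classify(image) == cap.label)
            counts[key][1] += 1
    return {key: c / n for key, (c, n) in counts.items() if n}


def accuracy_gap(acc: Dict[Tuple[str, str], float], bias: str) -> float:
    """Assumed score: spread between best and worst class of one bias."""
    values = [a for (b, _), a in acc.items() if b == bias]
    return max(values) - min(values) if values else 0.0


# Hand-written captions standing in for the LLM output.
captions = [
    Caption("a dog indoors", "background", "indoor", "dog"),
    Caption("a dog outdoors", "background", "outdoor", "dog"),
    Caption("a cat indoors", "background", "indoor", "cat"),
    Caption("a cat outdoors", "background", "outdoor", "cat"),
]


def retrieve(query: str) -> List[str]:
    # Toy retrieval: two string "images" per caption.
    return [f"{query} #1", f"{query} #2"]


def classify(image: str) -> str:
    # Toy biased classifier: recognizes cats only when they are indoors.
    return "cat" if "cat" in image and "indoors" in image else "dog"


acc = per_bias_accuracy(captions, retrieve, classify)
gap = accuracy_gap(acc, "background")
```

With these toy stand-ins, the classifier is correct on every indoor image but misses outdoor cats, so the "background" bias shows a nonzero gap. In a real setting, `retrieve` would query an image retrieval model and `classify` would wrap the pre-trained classifier under inspection.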
Related papers
- Unlabeled Debiasing in Downstream Tasks via Class-wise Low Variance Regularization [13.773597081543185]
We introduce a novel debiasing regularization technique based on the class-wise variance of embeddings.
Our method does not require attribute labels and targets any attribute, thus addressing the shortcomings of existing debiasing methods.
arXiv Detail & Related papers (2024-09-29T03:56:50Z)
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z)
- Language-guided Detection and Mitigation of Unknown Dataset Bias [23.299264313976213]
We propose a framework to identify potential biases as keywords without prior knowledge based on the partial occurrence in the captions.
Our framework not only outperforms existing methods without prior knowledge, but also is even comparable with a method that assumes prior knowledge.
arXiv Detail & Related papers (2024-06-05T03:11:33Z)
- Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair [36.221761997349795]
Deep neural networks rely on bias attributes that are spuriously correlated with a target class in the presence of dataset bias.
This paper proposes a method that provides the model with explicit spatial guidance that indicates the region of intrinsic features.
Experiments demonstrate that our method achieves state-of-the-art performance on synthetic and real-world datasets with various levels of bias severity.
arXiv Detail & Related papers (2024-04-30T04:13:14Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Discovering and Mitigating Visual Biases through Keyword Explanation [66.71792624377069]
We propose the Bias-to-Text (B2T) framework, which interprets visual biases as keywords.
B2T can identify known biases, such as gender bias in CelebA, background bias in Waterbirds, and distribution shifts in ImageNet-R/C.
B2T uncovers novel biases in larger datasets, such as Dollar Street and ImageNet.
arXiv Detail & Related papers (2023-01-26T13:58:46Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- On Cross-Dataset Generalization in Automatic Detection of Online Abuse [7.163723138100273]
We show that the benign examples in the Wikipedia Detox dataset are biased towards platform-specific topics.
We identify these examples using unsupervised topic modeling and manual inspection of topics' keywords.
For a robust dataset design, we suggest applying inexpensive unsupervised methods to inspect the collected data and downsize the non-generalizable content.
arXiv Detail & Related papers (2020-10-14T21:47:03Z)
- Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.