REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
- URL: http://arxiv.org/abs/2004.07999v4
- Date: Fri, 23 Jul 2021 18:41:42 GMT
- Title: REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
- Authors: Angelina Wang and Alexander Liu and Ryan Zhang and Anat Kleiman and
Leslie Kim and Dora Zhao and Iroha Shirai and Arvind Narayanan and Olga
Russakovsky
- Abstract summary: REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
- Score: 64.76453161039973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are known to perpetuate and even amplify the biases
present in the data. However, these data biases frequently do not become
apparent until after the models are deployed. Our work tackles this issue and
enables the preemptive analysis of large-scale datasets. REVISE (REvealing
VIsual biaSEs) is a tool that assists in the investigation of a visual dataset,
surfacing potential biases along three dimensions: (1) object-based, (2)
person-based, and (3) geography-based. Object-based biases relate to the size,
context, or diversity of the depicted objects. Person-based metrics focus on
analyzing the portrayal of people within the dataset. Geography-based analyses
consider the representation of different geographic locations. These three
dimensions are deeply intertwined in how they interact to bias a dataset, and
REVISE sheds light on this; the responsibility then lies with the user to
consider the cultural and historical context, and to determine which of the
revealed biases may be problematic. The tool further assists the user by
suggesting actionable steps that may be taken to mitigate the revealed biases.
Overall, the key aim of our work is to tackle the machine learning bias problem
early in the pipeline. REVISE is available at
https://github.com/princetonvisualai/revise-tool
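To make the object-based dimension concrete, below is a minimal sketch of one such metric: the fraction of the image that each object category typically occupies, which surfaces categories that appear only as small or distant objects. This is an illustration only, not the revise-tool's actual API; the COCO-style annotation layout and field names are assumptions.

```python
# Toy object-based bias metric in the spirit of REVISE: per-category
# mean object size as a fraction of image area. Consistently tiny (or
# huge) depictions of a category can bias downstream models.
from collections import defaultdict
from statistics import mean

def object_size_profile(annotations, images):
    """annotations: [{"image_id", "category", "bbox": [x, y, w, h]}, ...]
       images: {image_id: (width, height)} -- assumed COCO-like format."""
    fractions = defaultdict(list)
    for ann in annotations:
        img_w, img_h = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]
        fractions[ann["category"]].append((w * h) / (img_w * img_h))
    # Mean fraction of image area each category occupies.
    return {cat: mean(vals) for cat, vals in fractions.items()}

anns = [
    {"image_id": 0, "category": "airplane", "bbox": [10, 10, 600, 300]},
    {"image_id": 1, "category": "bottle", "bbox": [5, 5, 20, 60]},
]
imgs = {0: (640, 480), 1: (640, 480)}
for cat, frac in sorted(object_size_profile(anns, imgs).items()):
    print(f"{cat}: {frac:.3f} of image area on average")
```

Analogous statistics over co-occurring scene labels or image geolocations would cover the context and geography dimensions the abstract describes.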
Related papers
- DSAP: Analyzing Bias Through Demographic Comparison of Datasets [4.8741052091630985]
We propose DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of two datasets.
DSAP can be deployed in three key applications: to detect and characterize demographic blind spots and bias issues across datasets, to measure dataset demographic bias in single datasets, and to measure dataset demographic shift in deployment scenarios.
An essential feature of DSAP is its ability to robustly analyze datasets without explicit demographic labels, offering simplicity and interpretability for a wide range of situations.
arXiv Detail & Related papers (2023-12-22T11:51:20Z)
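As a toy illustration of DSAP's two-step idea, the sketch below first builds a demographic profile (a normalized distribution over groups) for each dataset and then scores their similarity. The profiles here are given directly, whereas DSAP infers them with auxiliary models, and the 1 - total-variation similarity is an illustrative choice, not necessarily the paper's measure.

```python
# Step 1: turn demographic labels into a distribution per dataset.
# Step 2: compare the two distributions with a similarity score.
from collections import Counter

def demographic_profile(labels):
    """Normalize a list of demographic labels into a distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: c / total for group, c in counts.items()}

def profile_similarity(profile_a, profile_b):
    """1 - total variation distance between two distributions."""
    groups = set(profile_a) | set(profile_b)
    tvd = 0.5 * sum(abs(profile_a.get(g, 0) - profile_b.get(g, 0))
                    for g in groups)
    return 1 - tvd

dataset_a = ["20-29", "20-29", "30-39", "60+"]       # fabricated labels
dataset_b = ["20-29", "30-39", "30-39", "40-49"]
sim = profile_similarity(demographic_profile(dataset_a),
                         demographic_profile(dataset_b))
print(f"demographic similarity: {sim:.2f}")  # 1.00 means identical makeup
```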
- Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond [93.96982273042296]
Vision-language (VL) understanding tasks evaluate models' comprehension of complex visual scenes through multiple-choice questions.
We have identified two dataset biases that models can exploit as shortcuts to resolve various VL tasks correctly without proper understanding.
We propose Adversarial Data Synthesis (ADS) to generate synthetic training and debiased evaluation data.
We then introduce Intra-sample Counterfactual Training (ICT) to assist models in utilizing the synthesized training data, particularly the counterfactual data, by focusing on intra-sample differentiation.
arXiv Detail & Related papers (2023-10-23T08:09:42Z)
- Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, such as weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition [1.5340540198612824]
Three metrics are proposed: two focus on the representational and stereotypical bias of the dataset, and the third on the residual bias of the trained model.
We demonstrate the usefulness of the metrics by applying them to a facial expression recognition (FER) problem based on the popular AffectNet dataset.
arXiv Detail & Related papers (2022-05-20T09:40:42Z)
- Representation Bias in Data: A Survey on Identification and Resolution Techniques [26.142021257838564]
Data-driven algorithms are only as good as the data they work with, yet datasets, especially social data, often fail to represent minorities adequately.
Representation Bias in data can happen due to various reasons ranging from historical discrimination to selection and sampling biases in the data acquisition and preparation methods.
This paper reviews the literature on identifying and resolving representation bias as a feature of a data set, independent of how it is consumed later.
arXiv Detail & Related papers (2022-03-22T16:30:22Z)
- Certifying Robustness to Programmable Data Bias in Decision Trees [12.060443368097102]
We certify that models produced by a learning algorithm are pointwise-robust to potential dataset biases.
Our approach allows specifying bias models across a variety of dimensions.
We evaluate our approach on datasets commonly used in the fairness literature.
arXiv Detail & Related papers (2021-10-08T20:15:17Z)
- A Survey on Bias in Visual Datasets [17.79365832663837]
Computer Vision (CV) has achieved remarkable results, outperforming humans in several tasks.
CV systems highly depend on the data they are fed with and can learn and amplify biases within such data.
Yet, to date there is no comprehensive survey on bias in visual datasets.
arXiv Detail & Related papers (2021-07-16T14:16:52Z)
- Towards Measuring Bias in Image Classification [61.802949761385]
Convolutional Neural Networks (CNNs) have become state-of-the-art for the main computer vision tasks.
However, due to their complex structure, their decisions are hard to understand, which limits their use in some industrial contexts.
We present a systematic approach to uncover data bias by means of attribution maps.
arXiv Detail & Related papers (2021-07-01T10:50:39Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
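The "simple feature correlations" in the entry above can be made concrete with a small sketch: compare p(label | token) against the prior p(label) and flag tokens whose presence shifts the label distribution sharply, a sign of a dataset artifact. The data, threshold, and whitespace tokenization below are illustrative, not the paper's exact procedure.

```python
# Flag single tokens whose conditional label distribution deviates
# from the prior -- a simple artifact detector for labeled text data.
from collections import Counter, defaultdict

def token_label_artifacts(examples, min_count=2, gap=0.3):
    """examples: [(text, label), ...]; flags tokens where
       p(label | token) exceeds p(label) by more than `gap`."""
    label_counts = Counter(label for _, label in examples)
    total = sum(label_counts.values())
    prior = {lab: c / total for lab, c in label_counts.items()}
    per_token = defaultdict(Counter)
    for text, label in examples:
        for tok in set(text.lower().split()):  # dedupe within an example
            per_token[tok][label] += 1
    flagged = []
    for tok, counts in per_token.items():
        n = sum(counts.values())
        if n < min_count:
            continue  # too rare to trust the estimate
        for lab, c in counts.items():
            if c / n - prior[lab] > gap:
                flagged.append((tok, lab, c / n))
    return flagged

data = [("no good at all", 0), ("no fun", 0),
        ("really good", 1), ("good and fun", 1)]
print(token_label_artifacts(data))  # flags ("no", 0, 1.0)
```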
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences.