Fair Active Learning
- URL: http://arxiv.org/abs/2001.01796v5
- Date: Wed, 31 Mar 2021 14:39:26 GMT
- Title: Fair Active Learning
- Authors: Hadis Anahideh and Abolfazl Asudeh and Saravanan Thirumuruganathan
- Abstract summary: It is critical that machine learning models do not propagate discrimination.
Active learning is a promising approach to build an accurate classifier by interactively querying an oracle within a labeling budget.
We design algorithms for fair active learning that carefully selects data points to be labeled so as to balance model accuracy and fairness.
- Score: 15.313223110223941
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) is increasingly being used in high-stakes applications
impacting society. Therefore, it is of critical importance that ML models do
not propagate discrimination. Collecting accurate labeled data in societal
applications is challenging and costly. Active learning is a promising approach
to build an accurate classifier by interactively querying an oracle within a
labeling budget. We design algorithms for fair active learning that carefully
selects data points to be labeled so as to balance model accuracy and fairness.
We demonstrate the effectiveness and efficiency of our proposed algorithms over
widely used benchmark datasets using demographic parity and equalized odds
notions of fairness.
Related papers
- BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data.
Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z) - Fair Active Learning in Low-Data Regimes [22.349886628823125]
In machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities.
In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments.
We introduce an innovative active learning framework that combines an exploration procedure inspired by posterior sampling with a fair classification subroutine.
We demonstrate that this framework performs effectively in very data-scarce regimes, maximizing accuracy while satisfying fairness constraints with high probability.
arXiv Detail & Related papers (2023-12-13T23:14:55Z) - Learning to Rank for Active Learning via Multi-Task Bilevel Optimization [29.207101107965563]
We propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition.
A key challenge in this approach is developing an acquisition function that generalizes well, as the history of data, which forms part of the utility function's input, grows over time.
arXiv Detail & Related papers (2023-10-25T22:50:09Z) - Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
This has led to the inability of sampled data to be transferred to new models as well as issues with sampling bias.
We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based ALs are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for
Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN)
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z) - Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce
Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z) - Fair Active Learning [15.313223110223941]
Active learning is a promising approach to build an accurate classifier by interactively querying an oracle within a labeling budget.
We design algorithms for fair active learning that carefully selects data points to be labeled so as to balance model accuracy and fairness.
arXiv Detail & Related papers (2020-06-20T17:00:02Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - Mining Implicit Entity Preference from User-Item Interaction Data for
Knowledge Graph Completion via Adversarial Learning [82.46332224556257]
We propose a novel adversarial learning approach by leveraging user interaction data for the Knowledge Graph Completion task.
Our generator is isolated from user interaction data, and serves to improve the performance of the discriminator.
To discover implicit entity preference of users, we design an elaborate collaborative learning algorithms based on graph neural networks.
arXiv Detail & Related papers (2020-03-28T05:47:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.