Fair Active Learning in Low-Data Regimes
- URL: http://arxiv.org/abs/2312.08559v1
- Date: Wed, 13 Dec 2023 23:14:55 GMT
- Title: Fair Active Learning in Low-Data Regimes
- Authors: Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain,
Kevin Jamieson
- Abstract summary: In machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities.
In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments.
We introduce an innovative active learning framework that combines an exploration procedure inspired by posterior sampling with a fair classification subroutine.
We demonstrate that this framework performs effectively in very data-scarce regimes, maximizing accuracy while satisfying fairness constraints with high probability.
- Score: 22.349886628823125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In critical machine learning applications, ensuring fairness is essential to
avoid perpetuating social inequities. In this work, we address the challenges
of reducing bias and improving accuracy in data-scarce environments, where the
cost of collecting labeled data prohibits the use of large, labeled datasets.
In such settings, active learning promises to maximize the marginal accuracy gains
from small amounts of labeled data. However, existing applications of active
learning for fairness fail to deliver on this, typically requiring large
labeled datasets or failing to ensure that the desired fairness tolerance is met
on the population distribution.
To address such limitations, we introduce an innovative active learning
framework that combines an exploration procedure inspired by posterior sampling
with a fair classification subroutine. We demonstrate that this framework
performs effectively in very data-scarce regimes, maximizing accuracy while
satisfying fairness constraints with high probability. We evaluate our proposed
approach using well-established real-world benchmark datasets and compare it
against state-of-the-art methods, demonstrating its effectiveness in producing
fair models and its improvement over existing methods.
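The abstract describes the framework only at a high level. As a rough illustration (not the authors' algorithm), the sketch below runs an active learning loop in which a bootstrap ensemble of scikit-learn logistic regressions stands in for posterior sampling, and per-group thresholding toward demographic parity stands in for the fair classification subroutine; all helper names and the toy data are hypothetical.

```python
# Illustrative sketch only: posterior-sampling-style exploration is approximated
# by a bootstrap ensemble, and the fair classification step by per-group
# thresholds that push positive rates toward demographic parity.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def posterior_samples(X, y, n_models=10):
    """Bootstrap-resample the labeled pool and fit one model per resample."""
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))
        if len(np.unique(y[idx])) < 2:      # need both classes to fit
            idx = np.arange(len(y))
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def query_by_disagreement(models, X_pool):
    """Pick the unlabeled point whose sampled models disagree most."""
    preds = np.stack([m.predict(X_pool) for m in models])   # (n_models, n_pool)
    return int(np.argmax(preds.var(axis=0)))

def fair_thresholds(scores, groups, base=0.5):
    """Shift each group's threshold so its positive rate matches the overall
    rate (a simple demographic-parity heuristic, not the paper's subroutine)."""
    target = np.mean(scores >= base)
    return {g: np.quantile(scores[groups == g], 1.0 - target)
            for g in np.unique(groups)}

# Toy data: two features, binary label, binary sensitive attribute.
n = 400
X = rng.normal(size=(n, 2))
group = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * group + 0.3 * rng.normal(size=n) > 0).astype(int)

# Seed the labeled set with a few points from each class.
labeled = list(rng.choice(np.where(y == 0)[0], 5, replace=False)) + \
          list(rng.choice(np.where(y == 1)[0], 5, replace=False))
pool = [i for i in range(n) if i not in labeled]

for _ in range(30):                              # labeling budget
    models = posterior_samples(X[labeled], y[labeled])
    q = query_by_disagreement(models, X[pool])
    labeled.append(pool.pop(q))                  # "query the oracle"

final = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
scores = final.predict_proba(X)[:, 1]
thresholds = fair_thresholds(scores, group)
y_hat = np.array([scores[i] >= thresholds[group[i]] for i in range(n)])
for g in (0, 1):
    print(f"group {g}: positive rate = {y_hat[group == g].mean():.2f}")
```

The query rule labels the pool point on which the sampled models disagree most, which is one simple way to operationalize posterior-sampling-style exploration within a fixed labeling budget.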
Related papers
- Targeted Learning for Data Fairness [52.59573714151884]
We expand fairness inference by evaluating fairness in the data generating process itself.
We derive estimators for demographic parity, equal opportunity, and conditional mutual information; a toy computation of the first two metrics is sketched after this list.
To validate our approach, we perform several simulations and apply our estimators to real data.
arXiv Detail & Related papers (2025-02-06T18:51:28Z)
- Navigating Towards Fairness with Data Selection [27.731128352096555]
We introduce a data selection method designed to efficiently and flexibly mitigate label bias.
Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set.
Our modality-agnostic method has proven efficient and effective in handling label bias and improving fairness across diverse datasets in experimental evaluations.
arXiv Detail & Related papers (2024-12-15T06:11:05Z)
- Learn to be Fair without Labels: a Distribution-based Learning Framework for Fair Ranking [1.8577028544235155]
We propose a distribution-based fair learning framework (DLF) that does not require labels by replacing the unavailable fairness labels with target fairness exposure distributions.
Our proposed framework achieves better fairness performance while maintaining better control over the fairness-relevance trade-off.
arXiv Detail & Related papers (2024-05-28T03:49:04Z)
- Fair Few-shot Learning with Auxiliary Sets [53.30014767684218]
In many machine learning (ML) tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance.
In this paper, we define the fairness-aware learning task with limited training samples as the fair few-shot learning problem.
We devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks.
arXiv Detail & Related papers (2023-08-28T06:31:37Z)
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning, that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework for fair semi-supervised learning in the pre-processing phase, including pseudo-labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Group Fairness by Probabilistic Modeling with Latent Fair Decisions [36.20281545470954]
This paper studies learning fair probability distributions from biased data by explicitly modeling a latent variable that represents a hidden, unbiased label.
We aim to achieve demographic parity by enforcing certain independencies in the learned model.
We also show that group fairness guarantees are meaningful only if the distribution used to provide those guarantees indeed captures the real-world data.
arXiv Detail & Related papers (2020-09-18T19:13:23Z)
- Fair Active Learning [15.313223110223941]
Active learning is a promising approach to build an accurate classifier by interactively querying an oracle within a labeling budget.
We design algorithms for fair active learning that carefully select data points to be labeled so as to balance model accuracy and fairness.
arXiv Detail & Related papers (2020-06-20T17:00:02Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Fair Active Learning [15.313223110223941]
It is critical that machine learning models do not propagate discrimination.
Active learning is a promising approach to build an accurate classifier by interactively querying an oracle within a labeling budget.
We design algorithms for fair active learning that carefully select data points to be labeled so as to balance model accuracy and fairness.
arXiv Detail & Related papers (2020-01-06T22:20:02Z)
- Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
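Several entries above reason about demographic parity and equal opportunity (e.g. the Targeted Learning and Group Fairness papers). For reference, here is a toy computation of those two gaps from predictions, labels, and a binary sensitive attribute; it illustrates the standard definitions only and is not any paper's estimator.

```python
# Demographic parity compares positive prediction rates across groups;
# equal opportunity compares true positive rates across groups.
import numpy as np

def demographic_parity_gap(y_hat, group):
    """|P(y_hat=1 | group=0) - P(y_hat=1 | group=1)| for a binary group."""
    rates = [y_hat[group == g].mean() for g in (0, 1)]
    return abs(rates[0] - rates[1])

def equal_opportunity_gap(y_hat, y_true, group):
    """|TPR(group=0) - TPR(group=1)|: gap in true positive rates."""
    tprs = [y_hat[(group == g) & (y_true == 1)].mean() for g in (0, 1)]
    return abs(tprs[0] - tprs[1])

# Example with synthetic predictions and labels.
rng = np.random.default_rng(1)
group = rng.integers(0, 2, size=1000)
y_true = rng.integers(0, 2, size=1000)
y_hat = (rng.random(1000) < 0.5 + 0.1 * group).astype(int)  # group 1 favored

print("demographic parity gap:", demographic_parity_gap(y_hat, group))
print("equal opportunity gap:", equal_opportunity_gap(y_hat, y_true, group))
```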
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.