Falcon: Fair Active Learning using Multi-armed Bandits
- URL: http://arxiv.org/abs/2401.12722v2
- Date: Wed, 24 Jan 2024 04:43:05 GMT
- Title: Falcon: Fair Active Learning using Multi-armed Bandits
- Authors: Ki Hyun Tae, Hantian Zhang, Jaeyoung Park, Kexin Rong, Steven Euijong Whang
- Abstract summary: We propose a data-centric approach that improves machine learning model fairness via strategic sample selection.
Experiments show that Falcon significantly outperforms existing fair active learning approaches in terms of fairness and accuracy.
In particular, only Falcon supports a proper trade-off between accuracy and fairness where its maximum fairness score is 1.8-4.5x higher than the second-best results.
- Score: 9.895979687746376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biased data can lead to unfair machine learning models, highlighting the
importance of embedding fairness at the beginning of data analysis,
particularly during dataset curation and labeling. In response, we propose
Falcon, a scalable fair active learning framework. Falcon adopts a data-centric
approach that improves machine learning model fairness via strategic sample
selection. Given a user-specified group fairness measure, Falcon identifies
samples from "target groups" (e.g., (attribute=female, label=positive)) that
are the most informative for improving fairness. However, a challenge arises
since these target groups are defined using ground truth labels that are not
available during sample selection. To handle this, we propose a novel
trial-and-error method, where we postpone using a sample if the predicted label
is different from the expected one and falls outside the target group. We also
observe a trade-off: selecting more informative samples results in a higher
likelihood of postponement due to undesired label predictions, and the optimal
balance varies per dataset. We capture the trade-off between informativeness
and postpone rate as policies and propose to automatically select the best
policy using adversarial multi-armed bandit methods, given their computational
efficiency and theoretical guarantees. Experiments show that Falcon
significantly outperforms existing fair active learning approaches in terms of
fairness and accuracy and is more efficient. In particular, only Falcon
supports a proper trade-off between accuracy and fairness where its maximum
fairness score is 1.8-4.5x higher than the second-best results.
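
The abstract outlines two mechanisms: a trial-and-error loop that postpones a labeled sample when its acquired label falls outside the target group, and an adversarial multi-armed bandit that chooses among candidate selection policies. The sketch below shows how those pieces could fit together, using EXP3 as a representative adversarial bandit algorithm; the policy set, the 0/1 reward, and the labeling oracle are illustrative assumptions, not Falcon's actual design.

```python
import numpy as np

# Hypothetical stand-ins for the pieces the abstract describes; Falcon's
# actual policies, reward signal, and labeling interface are not specified here.

def exp3_policy_selection(policies, label_oracle, in_target_group,
                          n_rounds=200, gamma=0.1, seed=0):
    """Select a sample-selection policy per labeling round with EXP3,
    a standard adversarial multi-armed bandit algorithm.

    policies        : list of callables; each proposes one unlabeled sample,
                      trading informativeness against the risk of postponement.
    label_oracle    : callable(sample) -> acquired ground-truth label.
    in_target_group : callable(sample, label) -> True if the labeled sample
                      lands in the fairness target group
                      (e.g., attribute=female, label=positive).
    """
    rng = np.random.default_rng(seed)
    k = len(policies)
    weights = np.ones(k)
    labeled, postponed = [], []

    for _ in range(n_rounds):
        # EXP3: mix the weight-based distribution with uniform exploration.
        probs = (1.0 - gamma) * weights / weights.sum() + gamma / k
        arm = rng.choice(k, p=probs)

        sample = policies[arm]()        # the chosen policy picks a sample
        label = label_oracle(sample)    # pay the labeling cost

        if in_target_group(sample, label):
            labeled.append((sample, label))    # usable for improving fairness now
            reward = 1.0                       # placeholder reward: target-group hit
        else:
            postponed.append((sample, label))  # undesired label: postpone the sample
            reward = 0.0

        # Importance-weighted reward estimate and multiplicative weight update.
        weights[arm] *= np.exp(gamma * (reward / probs[arm]) / k)

    return labeled, postponed, weights
```

In this reading, each policy corresponds to a different balance between informativeness and expected postpone rate, and the bandit shifts labeling budget toward whichever policy pays off on the dataset at hand.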
Related papers
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Fair-CDA: Continuous and Directional Augmentation for Group Fairness [48.84385689186208]
We propose a fine-grained data augmentation strategy for imposing fairness constraints.
We show that group fairness can be achieved by regularizing the models on transition paths of sensitive features between groups.
Our proposed method does not assume any data generative model and ensures good generalization for both accuracy and fairness.
arXiv Detail & Related papers (2023-04-01T11:23:00Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria - group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
- Learning Fair Classifiers with Partially Annotated Group Labels [22.838927494573436]
We consider a more practical scenario, dubbed Algorithmic Fairness with Partially annotated Group labels (Fair-PG).
We propose a simple Confidence-based Group Label assignment (CGL) strategy that is readily applicable to any fairness-aware learning strategy.
We show that our method outperforms the vanilla pseudo-labeling strategy in terms of fairness criteria.
arXiv Detail & Related papers (2021-11-29T15:11:18Z)
- Bias-Tolerant Fair Classification [20.973916494320246]
Label bias and selection bias are two sources of bias in data that hinder the fairness of machine-learning outcomes.
We propose a Bias-Tolerant FAir Regularized Loss (B-FARL) that tries to regain the benefits using data affected by label bias and selection bias.
B-FARL takes the biased data as input, learns a model that approximates the one trained with fair but latent data, and thus prevents discrimination without requiring explicit fairness constraints.
arXiv Detail & Related papers (2021-07-07T13:31:38Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Fairness Constraints in Semi-supervised Learning [56.48626493765908]
We develop a framework for fair semi-supervised learning, which is formulated as an optimization problem.
We theoretically analyze the source of discrimination in semi-supervised learning via bias, variance and noise decomposition.
Our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.
arXiv Detail & Related papers (2020-09-14T04:25:59Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)