Online Fairness-Aware Learning with Imbalanced Data Streams
- URL: http://arxiv.org/abs/2108.06231v1
- Date: Fri, 13 Aug 2021 13:31:42 GMT
- Title: Online Fairness-Aware Learning with Imbalanced Data Streams
- Authors: Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi
- Abstract summary: We propose \ours, an online fairness-aware approach that maintains a valid and fair classifier over the stream.
\ours is an online boosting approach that changes the training distribution in an online fashion by monitoring the stream's class imbalance.
Experiments on eight real-world and one synthetic dataset demonstrate the superiority of our method over state-of-the-art fairness-aware stream approaches.
- Score: 9.481178205985396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven learning algorithms are employed in many online applications, in
which data become available over time, like network monitoring, stock price
prediction, job applications, etc. The underlying data distribution might
evolve over time calling for model adaptation as new instances arrive and old
instances become obsolete. In such dynamic environments, the so-called data
streams, fairness-aware learning cannot be considered as a one-off requirement,
but rather it should comprise a continual requirement over the stream. Recent
fairness-aware stream classifiers ignore the problem of class imbalance, which
manifests in many real-life applications, and mitigate discrimination mainly
because they "reject" minority instances at large due to their inability to
effectively learn all classes.
In this work, we propose \ours, an online fairness-aware approach that
maintains a valid and fair classifier over the stream. \ours is an online
boosting approach that changes the training distribution in an online fashion
by monitoring the stream's class imbalance, and tweaks its decision boundary to
mitigate discriminatory outcomes over the stream. Experiments on eight
real-world and one synthetic dataset from different domains with varying class
imbalance demonstrate the superiority of our method over state-of-the-art
fairness-aware stream approaches, with relative increases of 11.2%-14.2% in
balanced accuracy, 22.6%-31.8% in g-mean, 42.5%-49.6% in recall, 14.3%-25.7%
in kappa, and 89.4%-96.6% in statistical parity (fairness).
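The two core ideas in the abstract can be illustrated with a minimal sketch: a monitor that tracks time-decayed class proportions over the stream (so minority-class instances can be up-weighted during online training), and a statistical parity measure computed over streamed decisions. The `ImbalanceMonitor` class, its `decay` parameter, and the inverse-frequency weighting are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of the abstract's two ingredients, NOT the paper's method:
# (1) monitoring a stream's class imbalance with time-decayed counts,
# (2) measuring statistical parity over streamed (prediction, group) pairs.

class ImbalanceMonitor:
    """Tracks time-decayed class proportions over a binary-label stream."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.counts = {0: 0.0, 1: 0.0}

    def update(self, y):
        for c in self.counts:
            self.counts[c] *= self.decay  # fade the influence of old instances
        self.counts[y] += 1.0

    def minority_weight(self, y):
        """Inverse-frequency weight: boosts the under-represented class."""
        total = sum(self.counts.values())
        if total == 0.0 or self.counts[y] == 0.0:
            return 1.0
        return total / (2.0 * self.counts[y])


def statistical_parity(decisions):
    """P(pred=1 | group=1) - P(pred=1 | group=0) over (pred, group) pairs."""
    pos = {0: 0, 1: 0}
    n = {0: 0, 1: 0}
    for pred, group in decisions:
        n[group] += 1
        pos[group] += pred
    rates = {g: pos[g] / n[g] if n[g] else 0.0 for g in (0, 1)}
    return rates[1] - rates[0]
```

With a 9:1 imbalanced stream, the monitor assigns the minority class a training weight several times larger than the majority class, which is the kind of distribution change an online booster could apply per instance; a statistical parity of 0 indicates equal positive-decision rates across the protected and non-protected groups.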
Related papers
- Gradient Reweighting: Towards Imbalanced Class-Incremental Learning [8.438092346233054]
Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data.
A major challenge of CIL arises when applying to real-world data characterized by non-uniform distribution.
We show that this dual imbalance issue causes skewed gradient updates with biased weights in FC layers, thus inducing over/under-fitting and catastrophic forgetting in CIL.
arXiv Detail & Related papers (2024-02-28T18:08:03Z) - Exploring Vacant Classes in Label-Skewed Federated Learning [113.65301899666645]
Label skews, characterized by disparities in local label distribution across clients, pose a significant challenge in federated learning.
This paper introduces FedVLS, a novel approach to label-skewed federated learning that integrates vacant-class distillation and logit suppression simultaneously.
arXiv Detail & Related papers (2024-01-04T16:06:31Z) - Preventing Discriminatory Decision-making in Evolving Data Streams [8.952662914331901]
Bias in machine learning has rightly received significant attention over the last decade.
Most fair machine learning (fair-ML) work to address bias in decision-making systems has focused solely on the offline setting.
Despite the wide prevalence of online systems in the real world, work on identifying and correcting bias in the online setting is severely lacking.
arXiv Detail & Related papers (2023-02-16T01:20:08Z) - An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
arXiv Detail & Related papers (2022-11-20T21:18:41Z) - Discrimination and Class Imbalance Aware Online Naive Bayes [5.065947993017157]
Stream learning algorithms are used to replace humans at critical decision-making points.
Recent discrimination-aware learning methods are optimized based on overall accuracy.
We propose a novel adaptation of Naïve Bayes to mitigate discrimination embedded in the stream.
arXiv Detail & Related papers (2022-11-09T11:20:19Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - Fairness Constraints in Semi-supervised Learning [56.48626493765908]
We develop a framework for fair semi-supervised learning, which is formulated as an optimization problem.
We theoretically analyze the source of discrimination in semi-supervised learning via bias, variance and noise decomposition.
Our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.
arXiv Detail & Related papers (2020-09-14T04:25:59Z) - Ensuring Fairness Beyond the Training Data [22.284777913437182]
We develop classifiers that are fair with respect to the training distribution and for a class of perturbations.
Based on an online learning algorithm, we develop an iterative algorithm that converges to a fair and robust solution.
Our experiments show that there is an inherent trade-off between fairness and accuracy of such classifiers.
arXiv Detail & Related papers (2020-07-12T16:20:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences.