Online Decision Trees with Fairness
- URL: http://arxiv.org/abs/2010.08146v1
- Date: Thu, 15 Oct 2020 02:50:13 GMT
- Title: Online Decision Trees with Fairness
- Authors: Wenbin Zhang and Liang Zhao
- Abstract summary: We propose a novel framework of online decision trees with fairness for data streams with possible distribution drift.
Specifically, first, we propose two novel fairness splitting criteria that encode the data as well as possible.
Second, we propose two fairness decision tree online growth algorithms that fulfill different online fair decision-making requirements.
- Score: 8.949941684885323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While artificial intelligence (AI)-based decision-making systems are
increasingly popular, significant concerns about potential discrimination
in the AI decision-making process have been raised. For example, the
distribution of predictions is often biased and depends on sensitive
attributes (e.g., gender and ethnicity). Numerous approaches have therefore
been proposed to develop decision-making systems that are
discrimination-conscious by-design, which are typically batch-based and require
the simultaneous availability of all the training data for model learning.
However, in the real world, data streams usually arrive on the fly, which
requires the model to process each input once "on arrival", without
storage and reprocessing. In addition, the data streams might also
evolve over time, which further requires the model to be able to simultaneously
adapt to non-stationary data distributions and time-evolving bias patterns,
with an effective and robust trade-off between accuracy and fairness. In this
paper, we propose a novel framework of online decision trees with fairness
for data streams with possible distribution drift. Specifically, first, we
propose two novel fairness splitting criteria that encode the data as well as
possible, while simultaneously removing dependence on the sensitive attributes,
and further adapt to non-stationary distributions with fine-grained control
when needed. Second, we propose two fairness decision tree online growth
algorithms that fulfill different online fair decision-making requirements.
Our experiments show that our algorithms are able to deal with discrimination
in massive and non-stationary streaming environments, with a better trade-off
between fairness and predictive performance.
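The abstract does not spell out the two splitting criteria, but the general idea of a fairness-aware split can be sketched as follows: combine the usual information gain with a penalty on the statistical parity gap the split would introduce across a sensitive attribute. This is a minimal illustrative sketch, not the authors' exact formulation; the function names and the trade-off weight `alpha` are assumptions.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a binary 0/1 label array."""
    if len(labels) == 0:
        return 0.0
    p = labels.mean()
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def statistical_parity_diff(y_pred, sensitive):
    """|P(y_hat=1 | s=1) - P(y_hat=1 | s=0)| over one node's instances."""
    protected = sensitive == 1
    if protected.all() or (~protected).all():
        return 0.0  # only one group present: no parity gap measurable
    return abs(y_pred[protected].mean() - y_pred[~protected].mean())

def fair_split_gain(y, sensitive, go_left, alpha=0.5):
    """Illustrative fairness-aware gain for a candidate binary split:
    information gain minus alpha times the discrimination the split
    would introduce (each child predicts its majority label)."""
    n = len(y)
    left, right = go_left, ~go_left
    info_gain = (entropy(y)
                 - left.sum() / n * entropy(y[left])
                 - right.sum() / n * entropy(y[right]))
    # Simulate the split's predictions, then measure the parity gap.
    y_pred = np.empty(n)
    y_pred[left] = 1.0 if left.any() and y[left].mean() >= 0.5 else 0.0
    y_pred[right] = 1.0 if right.any() and y[right].mean() >= 0.5 else 0.0
    disc = statistical_parity_diff(y_pred, sensitive)
    return info_gain - alpha * disc
```

An online tree would evaluate such a gain on sufficient statistics maintained per leaf (as in Hoeffding trees) rather than on stored instances; tuning `alpha` is one way to realize the accuracy-fairness trade-off the paper describes.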
Related papers
- Counterfactual Fairness through Transforming Data Orthogonal to Bias [7.109458605736819]
We propose a novel data pre-processing algorithm, Orthogonal to Bias (OB).
OB is designed to eliminate the influence of a group of continuous sensitive variables, thus promoting counterfactual fairness in machine learning applications.
OB is model-agnostic, making it applicable to a wide range of machine learning models and tasks.
arXiv Detail & Related papers (2024-03-26T16:40:08Z)
- Demographic Parity: Mitigating Biases in Real-World Data [0.0]
We propose a robust methodology that guarantees the removal of unwanted biases while preserving classification utility.
Our approach always achieves this in a model-independent way, derived from real-world data.
arXiv Detail & Related papers (2023-09-27T11:47:05Z)
- Preventing Discriminatory Decision-making in Evolving Data Streams [8.952662914331901]
Bias in machine learning has rightly received significant attention over the last decade.
Most fair machine learning (fair-ML) work to address bias in decision-making systems has focused solely on the offline setting.
Despite the wide prevalence of online systems in the real world, work on identifying and correcting bias in the online setting is severely lacking.
arXiv Detail & Related papers (2023-02-16T01:20:08Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data [1.76179873429447]
We propose a data preprocessing technique that can detect instances exhibiting a specific kind of bias that should be removed from the dataset before training.
In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias gets induced in the dataset.
arXiv Detail & Related papers (2022-10-24T13:04:07Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases.
Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
arXiv Detail & Related papers (2020-04-15T15:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.