Detecting discriminatory risk through data annotation based on Bayesian
inferences
- URL: http://arxiv.org/abs/2101.11358v1
- Date: Wed, 27 Jan 2021 12:43:42 GMT
- Authors: Elena Beretta, Antonio Vetrò, Bruno Lepri, Juan Carlos De Martin
- Abstract summary: We propose a method of data annotation that aims to warn about the risk of discriminatory outcomes from a given data set.
We empirically test our system on three datasets commonly accessed by the machine learning community.
- Score: 5.017973966200985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thanks to the growth of computational power and data availability,
research in machine learning has advanced rapidly. Nowadays, the majority of
automatic decision-making systems are data-driven. However, it is well known
that machine learning systems can produce problematic results if they are
built on partial or incomplete data. In fact, in recent years several studies
have traced ethical and transparency issues in these systems back to how data
are collected and recorded. Although rigorous data collection and analysis are
fundamental to model design, this step is still largely overlooked by the
machine learning community. For this reason, we propose a method of data
annotation based on Bayesian statistical inference that aims to warn about the
risk of discriminatory outcomes from a given data set. In particular, our
method aims to deepen knowledge and promote awareness about the sampling
practices employed to create the training set, highlighting that the
probability of success or failure conditioned on membership in a minority
group is determined by the structure of the available data. We empirically
test our system on three datasets commonly used by the machine learning
community and investigate the risk of racial discrimination.
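The conditional probabilities the abstract refers to can be illustrated with a minimal Beta-Binomial sketch. The group counts and the uniform Beta(1, 1) prior below are illustrative assumptions, not the paper's actual annotation procedure or data:

```python
# Hypothetical outcome counts per group in a training set
# (illustrative numbers, not taken from the paper's datasets).
counts = {
    "majority": {"success": 700, "failure": 300},
    "minority": {"success": 40, "failure": 60},
}

def posterior_mean_success(success, failure, alpha=1.0, beta=1.0):
    """Posterior mean of P(success | group) under a Beta(alpha, beta) prior."""
    return (success + alpha) / (success + failure + alpha + beta)

for group, c in counts.items():
    p = posterior_mean_success(c["success"], c["failure"])
    print(f"P(success | {group}) = {p:.3f}")
```

A large gap between the two posterior means is the kind of structural warning signal such a data annotation could surface before model training.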
Related papers
- Non-IID data in Federated Learning: A Systematic Review with Taxonomy, Metrics, Methods, Frameworks and Future Directions [2.9434966603161072]
This systematic review aims to fill a gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics.
We describe popular solutions to address non-IID data and standardized frameworks employed in Federated Learning with heterogeneous data.
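One partition protocol that commonly appears in such taxonomies is Dirichlet label skew. A minimal stdlib-only sketch (function name and defaults are illustrative, not from the review):

```python
import random

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with label skew drawn from a
    Dirichlet(alpha) distribution: smaller alpha -> more non-IID shards."""
    rng = random.Random(seed)
    clients = [[] for _ in range(n_clients)]
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for idx in by_class.values():
        rng.shuffle(idx)
        # Dirichlet sample via normalized Gamma draws.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(weights)
        start = 0
        for client, w in zip(clients, weights):
            stop = start + round(w / total * len(idx))
            client.extend(idx[start:stop])
            start = stop
        clients[-1].extend(idx[start:])  # rounding remainder to last client
    return clients
```

Every index lands on exactly one client, while the per-class proportions vary across clients according to the Dirichlet draw.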
arXiv Detail & Related papers (2024-11-19T09:53:28Z)
- Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection [0.0]
Open set recognition (OSR) aims to bring classification tasks closer to real-world conditions, where unknown classes may appear at test time.
This study provides an algorithm exploring a new representation of feature space to improve classification in OSR tasks.
arXiv Detail & Related papers (2024-05-09T15:15:34Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Resilient Neural Forecasting Systems [10.709321760368137]
Industrial machine learning systems face data challenges that are often under-explored in the academic literature.
In this paper, we discuss data challenges and solutions in the context of a Neural Forecasting application on labor planning.
We address changes in data distribution with a periodic retraining scheme and discuss the critical importance of model stability in this setting.
arXiv Detail & Related papers (2022-03-16T09:37:49Z)
- Non-IID data and Continual Learning processes in Federated Learning: A long road ahead [58.720142291102135]
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most remarkable learning strategies that are able to face it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could be easily adapted to the Federated Learning settings.
arXiv Detail & Related papers (2021-11-26T09:57:11Z)
- FairCVtest Demo: Understanding Bias in Multimodal Learning with a Testbed in Fair Automatic Recruitment [79.23531577235887]
This demo shows the capacity of the Artificial Intelligence (AI) behind a recruitment tool to extract sensitive information from unstructured data.
Additionally, the demo includes a new algorithm for discrimination-aware learning which eliminates sensitive information in our multimodal AI framework.
arXiv Detail & Related papers (2020-09-12T17:45:09Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases.
Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
arXiv Detail & Related papers (2020-04-15T15:58:05Z)
- FAE: A Fairness-Aware Ensemble Framework [18.993049769711114]
FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both pre- and postprocessing steps of the data analysis process.
In the preprocessing step, we tackle the problems of under-representation of the protected group and of class-imbalance.
In the post-processing step, we tackle the problem of class overlapping by shifting the decision boundary in the direction of fairness.
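The decision-boundary shift in that post-processing step can be sketched as a threshold adjustment for the protected group. The stopping rule here is a simplification for illustration, not FAE's exact criterion:

```python
def shift_boundary(scores, groups, protected, step=0.01):
    """Post-processing sketch: lower the decision threshold for the
    protected group until its positive rate catches up with the other
    group's. Assumes exactly two groups and scores in [0, 1]."""
    thresholds = {g: 0.5 for g in set(groups)}

    def pos_rate(g):
        member_scores = [s for s, gr in zip(scores, groups) if gr == g]
        return sum(s >= thresholds[g] for s in member_scores) / len(member_scores)

    other = next(g for g in thresholds if g != protected)
    while thresholds[protected] > step and pos_rate(protected) < pos_rate(other):
        thresholds[protected] -= step
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]
```

Only the threshold moves; the model's scores are untouched, which is what makes this a post-processing intervention.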
arXiv Detail & Related papers (2020-02-03T13:05:18Z)
- Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
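Exploiting unlabeled data in such semi-supervised setups is often implemented via pseudo-labeling; a generic sketch of that idea (SSFair's actual mechanism may differ):

```python
def pseudo_label(predict_proba, unlabeled, confidence=0.9):
    """Assign pseudo-labels to unlabeled samples the current model is
    confident about, so they can be added to the training set.
    `predict_proba` returns the probability of the positive class."""
    labeled = []
    for x in unlabeled:
        p = predict_proba(x)
        if p >= confidence:
            labeled.append((x, 1))
        elif p <= 1 - confidence:
            labeled.append((x, 0))
        # Samples in the uncertain band stay unlabeled.
    return labeled
```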
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.