A survey on datasets for fairness-aware machine learning
- URL: http://arxiv.org/abs/2110.00530v1
- Date: Fri, 1 Oct 2021 16:54:04 GMT
- Title: A survey on datasets for fairness-aware machine learning
- Authors: Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi
- Abstract summary: A large variety of fairness-aware machine learning solutions have been proposed.
In this paper, we overview real-world datasets used for fairness-aware machine learning.
For a deeper understanding of bias and fairness in the datasets, we investigate these relationships further using exploratory analysis.
- Score: 6.962333053044713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As decision-making increasingly relies on machine learning and (big) data,
the issue of fairness in data-driven AI systems is receiving increasing
attention from both research and industry. A large variety of fairness-aware
machine learning solutions have been proposed, introducing fairness-related
interventions in the data, the learning algorithms, and/or the model outputs. However, a
vital part of proposing new approaches is evaluating them empirically on
benchmark datasets that represent realistic and diverse settings. Therefore, in
this paper, we overview real-world datasets used for fairness-aware machine
learning. We focus on tabular data as the most common data representation for
fairness-aware machine learning. We start our analysis by identifying
relationships among the different attributes, particularly w.r.t. protected
attributes and class attributes, using a Bayesian network. For a deeper
understanding of bias and fairness in the datasets, we investigate these
relationships further using exploratory analysis.
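As a minimal illustration of the kind of exploratory bias check the survey describes (not the authors' actual pipeline, and with a hypothetical toy dataset standing in for a real benchmark such as Adult), the sketch below computes the positive-class rate per protected group and the resulting statistical parity difference:

```python
from collections import defaultdict

def positive_rates(records, protected, label):
    """Rate of the positive class for each value of one protected attribute."""
    pos = defaultdict(int)
    tot = defaultdict(int)
    for r in records:
        g = r[protected]
        tot[g] += 1
        pos[g] += 1 if r[label] == 1 else 0
    return {g: pos[g] / tot[g] for g in tot}

def statistical_parity_difference(rates, privileged, unprivileged):
    """Gap in positive rates between the privileged and unprivileged groups."""
    return rates[privileged] - rates[unprivileged]

# Hypothetical toy records mimicking an Adult-style tabular fairness dataset.
data = [
    {"sex": "male", "income": 1},
    {"sex": "male", "income": 1},
    {"sex": "male", "income": 0},
    {"sex": "female", "income": 1},
    {"sex": "female", "income": 0},
    {"sex": "female", "income": 0},
]

rates = positive_rates(data, "sex", "income")
spd = statistical_parity_difference(rates, "male", "female")
```

A nonzero statistical parity difference flags exactly the kind of protected-attribute/class-attribute dependency the Bayesian-network analysis in the paper surfaces at dataset level.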
Related papers
- Fair Mixed Effects Support Vector Machine [0.0]
Fairness in machine learning aims to mitigate biases present in the training data and model imperfections.
This is achieved by preventing the model from making decisions based on sensitive characteristics like ethnicity or sexual orientation.
We present a fair mixed effects support vector machine algorithm that can handle both problems simultaneously.
arXiv Detail & Related papers (2024-05-10T12:25:06Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Achieving Transparency in Distributed Machine Learning with Explainable Data Collaboration [5.994347858883343]
A parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data.
This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm.
arXiv Detail & Related papers (2022-12-06T23:53:41Z)
- FAIR-FATE: Fair Federated Learning with Momentum [0.41998444721319217]
We propose a novel FAIR FederATEd Learning algorithm that aims to achieve group fairness while maintaining high utility.
To the best of our knowledge, this is the first approach in machine learning that aims to achieve fairness using a fair Momentum estimate.
Experimental results on real-world datasets demonstrate that FAIR-FATE outperforms state-of-the-art fair Federated Learning algorithms.
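FAIR-FATE's fairness-aware momentum estimate is not reproduced in this summary; as a generic, hedged sketch of the server-side momentum aggregation it builds on (all names and values below are illustrative), the update can look like:

```python
def fedavg(updates, weights):
    """Weighted average of client model updates (one flat parameter list each)."""
    total = sum(weights)
    return [sum(w * u[i] for u, w in zip(updates, weights)) / total
            for i in range(len(updates[0]))]

def momentum_step(momentum, avg_update, beta=0.9):
    """Server-side momentum: an exponential moving average of aggregated updates."""
    return [beta * m + (1.0 - beta) * u for m, u in zip(momentum, avg_update)]

# Two hypothetical clients, two parameters each, equal weighting.
updates = [[1.0, 0.0], [0.0, 1.0]]
avg = fedavg(updates, [1.0, 1.0])          # plain FedAvg aggregate
momentum = momentum_step([0.0, 0.0], avg)  # smoothed update direction
new_params = [p + m for p, m in zip([0.0, 0.0], momentum)]
```

In FAIR-FATE the momentum term is additionally steered toward group fairness; this sketch only shows the standard momentum scaffolding that such an estimate plugs into.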
arXiv Detail & Related papers (2022-09-27T20:33:38Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- MultiFair: Multi-Group Fairness in Machine Learning [52.24956510371455]
We study multi-group fairness in machine learning (MultiFair).
We propose a generic end-to-end algorithmic framework to solve it.
Our proposed framework is generalizable to many different settings.
arXiv Detail & Related papers (2021-05-24T02:30:22Z)
- Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z)
- A Note on Data Biases in Generative Models [16.86600007830682]
We investigate the impact of dataset quality on the performance of generative models.
We show how societal biases of datasets are replicated by generative models.
We present creative applications through unpaired transfer between diverse datasets such as photographs, oil portraits, and anime.
arXiv Detail & Related papers (2020-12-04T10:46:37Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
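The pseudo-labeling step mentioned above can be sketched generically; this is not the paper's exact pre-processing framework, and the toy scorer below is a hypothetical stand-in for a trained classifier:

```python
import math

def pseudo_label(predict_proba, unlabeled, threshold=0.9):
    """Keep only unlabeled points the model labels with high confidence."""
    out = []
    for x in unlabeled:
        p = predict_proba(x)  # estimated probability of the positive class
        if p >= threshold:
            out.append((x, 1))          # confident positive
        elif p <= 1.0 - threshold:
            out.append((x, 0))          # confident negative
    return out

# Hypothetical 1-D sigmoid scorer standing in for a trained classifier.
toy_model = lambda x: 1.0 / (1.0 + math.exp(-x))

labeled = pseudo_label(toy_model, [-5.0, 0.0, 5.0])
```

The uncertain middle point is dropped; in a fairness-aware variant, the confidence threshold or the retained set would additionally be adjusted per protected group to avoid amplifying discrimination.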
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Bringing the People Back In: Contesting Benchmark Machine Learning Datasets [11.00769651520502]
We outline a research program - a genealogy of machine learning data - for investigating how and why these datasets have been created.
We describe the ways in which benchmark datasets in machine learning operate as infrastructure and pose four research questions for these datasets.
arXiv Detail & Related papers (2020-07-14T23:22:13Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated list (including any of the information it contains) and is not responsible for any consequences of its use.