Information Theoretic Measures for Fairness-aware Feature Selection
- URL: http://arxiv.org/abs/2106.00772v1
- Date: Tue, 1 Jun 2021 20:11:54 GMT
- Title: Information Theoretic Measures for Fairness-aware Feature Selection
- Authors: Sajad Khodadadian, Mohamed Nafea, AmirEmad Ghassami, Negar Kiyavash
- Abstract summary: We develop a framework for fairness-aware feature selection, based on information theoretic measures for the accuracy and discriminatory impacts of features.
Specifically, our goal is to design a fairness utility score for each feature which quantifies how this feature influences accurate as well as nondiscriminatory decisions.
- Score: 27.06618125828978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning algorithms are increasingly used for consequential decision
making regarding individuals based on their relevant features. Features that
are relevant for accurate decisions may however lead to either explicit or
implicit forms of discrimination against unprivileged groups, such as those of
certain race or gender. This happens due to existing biases in the training
data, which are often replicated or even exacerbated by the learning algorithm.
Identifying and measuring these biases at the data level is a challenging
problem due to the interdependence among the features and the decision
outcome. In this work, we develop a framework for fairness-aware feature
selection, based on information theoretic measures for the accuracy and
discriminatory impacts of features. Specifically, our goal is to design a
fairness utility score for each feature which quantifies how this feature
influences accurate as well as nondiscriminatory decisions. We first propose
information theoretic measures for the impact of different subsets of features
on the accuracy and discrimination of the model. Subsequently, we deduce the
marginal impact of each feature using the Shapley value function. Our framework
depends on the joint statistics of the data rather than a particular classifier
design. We examine our proposed framework on real and synthetic data to
evaluate its performance.
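The abstract outlines the method's two steps: define a value function over feature subsets, then attribute a marginal score to each feature via the Shapley value. The paper's specific accuracy and discrimination measures are not given in the abstract, so the sketch below is a minimal, hypothetical illustration that uses plain empirical mutual information I(X_S; Y) as a stand-in value function; the exact Shapley computation over all subsets is the part that mirrors the paper's attribution step.

```python
from itertools import combinations
from math import comb, log2
from collections import Counter

def mutual_information(rows, Y):
    """Empirical mutual information I(X_S; Y) for discrete data.
    rows: list of hashable feature tuples, Y: list of outcome labels."""
    n = len(Y)
    pxy = Counter(zip(rows, Y))
    px = Counter(rows)
    py = Counter(Y)
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def shapley_scores(X, Y):
    """Exact Shapley value of each feature under v(S) = I(X_S; Y).
    X: list of feature columns; Y: outcome labels.
    Exponential in the number of features -- illustration only."""
    d = len(X)

    def v(S):
        if not S:
            return 0.0
        rows = [tuple(X[j][i] for j in S) for i in range(len(Y))]
        return mutual_information(rows, Y)

    scores = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        total = 0.0
        for k in range(d):  # subset sizes 0 .. d-1
            for S in combinations(others, k):
                weight = 1 / (d * comb(d - 1, k))
                total += weight * (v(S + (i,)) - v(S))
        scores.append(total)
    return scores
```

With a feature that fully determines the outcome and an independent noise feature, the first receives the full information score and the second scores zero, illustrating how the attribution isolates each feature's marginal contribution. A fairness-aware variant, as in the paper, would replace v(S) with a combination of accuracy and discrimination measures.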
Related papers
- Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences.
We show that selection structure is identifiable without any parametric assumptions or interventional experiments.
We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
- Detection and Evaluation of bias-inducing Features in Machine learning [14.045499740240823]
In the context of machine learning (ML), one can use cause-to-effect analysis to understand the reason for the biased behavior of the system.
We propose an approach for systematically identifying all bias-inducing features of a model to help support the decision-making of domain experts.
arXiv Detail & Related papers (2023-10-19T15:01:16Z)
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
- A statistical approach to detect sensitive features in a group fairness setting [10.087372021356751]
We propose a preprocessing step to address the task of automatically recognizing sensitive features that does not require a trained model to verify unfair results.
Our empirical results attest our hypothesis and show that several features considered as sensitive in the literature do not necessarily entail disparate (unfair) results.
arXiv Detail & Related papers (2023-05-11T17:30:12Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
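The balancing strategy described above can be sketched with the simplest oversampling variant: duplicate randomly chosen minority-class examples until every class matches the size of the largest one. This is a minimal, dependency-free illustration (the paper compares many more elaborate resampling strategies); the function name and structure here are my own.

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class examples until all
    classes have as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_max = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(n_max - len(rows))]
        for xi in rows + extra:
            X_out.append(xi)
            y_out.append(label)
    return X_out, y_out
```

Undersampling is the mirror image: randomly drop majority-class examples down to the minority-class count, trading information loss for a smaller balanced dataset.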
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
arXiv Detail & Related papers (2021-09-17T13:45:46Z)
- Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Nonparametric Feature Impact and Importance [0.6123324869194193]
We give mathematical definitions of feature impact and importance, derived from partial dependence curves, that operate directly on the data.
To assess quality, we show that features ranked by these definitions are competitive with existing feature selection techniques.
arXiv Detail & Related papers (2020-06-08T17:07:35Z)
- Fairness-Aware Learning with Prejudice Free Representations [2.398608007786179]
We propose a novel algorithm that can effectively identify and treat latent discriminating features.
The approach helps to collect discrimination-free features that would improve the model performance.
arXiv Detail & Related papers (2020-02-26T10:06:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.