Towards Fairness and Privacy: A Novel Data Pre-processing Optimization Framework for Non-binary Protected Attributes
- URL: http://arxiv.org/abs/2410.00836v1
- Date: Tue, 1 Oct 2024 16:17:43 GMT
- Title: Towards Fairness and Privacy: A Novel Data Pre-processing Optimization Framework for Non-binary Protected Attributes
- Authors: Manh Khoi Duong, Stefan Conrad
- Abstract summary: This work presents a framework for addressing fairness by debiasing datasets containing a (non-binary) protected attribute.
It does so by finding a data subset that minimizes a chosen discrimination measure.
In contrast to prior work, the framework is highly flexible: it is metric- and task-agnostic.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unfair AI outcomes are often rooted in biased datasets. This work therefore presents a framework for addressing fairness by debiasing datasets containing a (non-)binary protected attribute. The framework poses debiasing as a combinatorial optimization problem: find a data subset that minimizes a given discrimination measure, where heuristics such as genetic algorithms can be used to solve for the stated fairness objectives. Depending on a user-defined setting, the framework enables different use cases, such as data removal, the addition of synthetic data, or the exclusive use of synthetic data. The exclusive use of synthetic data in particular enhances the framework's ability to preserve privacy while optimizing for fairness. In a comprehensive evaluation, we demonstrate that under our framework, genetic algorithms can effectively yield fairer datasets compared to the original data. In contrast to prior work, the framework exhibits a high degree of flexibility: it is metric- and task-agnostic, can be applied to both binary and non-binary protected attributes, and demonstrates efficient runtime.
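The optimization problem above lends itself to a compact illustration. The following is a minimal sketch, not the authors' implementation: it assumes binary labels `y` and a (possibly non-binary) group vector `groups`, uses the statistical parity difference as the discrimination measure, and evolves boolean selection masks with a toy genetic algorithm (the data-removal use case).

```python
import numpy as np

rng = np.random.default_rng(0)

def statistical_disparity(y, groups, mask):
    """Largest gap in positive-outcome rates across (non-binary) groups,
    computed on the subset selected by the boolean mask."""
    rates = []
    for g in np.unique(groups):
        sel = mask & (groups == g)
        if not sel.any():               # a group vanished: subset is invalid
            return 1.0
        rates.append(y[sel].mean())
    return max(rates) - min(rates)

def genetic_debias(y, groups, pop_size=50, generations=200, p_mut=0.01):
    """Toy genetic algorithm evolving boolean masks (candidate subsets)
    that minimize the discrimination measure above."""
    n = len(y)
    pop = rng.random((pop_size, n)) < 0.9                 # start near the full dataset
    for _ in range(generations):
        fitness = [statistical_disparity(y, groups, m) for m in pop]
        parents = pop[np.argsort(fitness)[: pop_size // 2]]  # truncation selection
        cuts = rng.integers(1, n, size=pop_size - len(parents))
        children = np.array([                             # one-point crossover
            np.concatenate((parents[i % len(parents)][:c],
                            parents[(i + 1) % len(parents)][c:]))
            for i, c in enumerate(cuts)
        ])
        children ^= rng.random(children.shape) < p_mut    # bit-flip mutation
        pop = np.vstack((parents, children))
    return min(pop, key=lambda m: statistical_disparity(y, groups, m))
```

Applying the returned mask to the dataset yields the debiased subset; in the framework itself, the candidate pool may also contain synthetic points, which enables the privacy-preserving use cases mentioned above.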
Related papers
- Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis [3.8809673918404246]
We present a dataset watermarking framework designed to detect unauthorized usage and trace data leaks.
arXiv Detail & Related papers (2024-09-27T16:34:48Z)
- Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled, and only a small portion is annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z)
- Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient [52.2669490431145]
PropEn is inspired by 'matching', which enables implicit guidance without training a discriminator.
We show that training with a matched dataset approximates the gradient of the property of interest while remaining within the data distribution.
arXiv Detail & Related papers (2024-05-28T11:30:19Z)
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for protection mechanisms that protect privacy by distorting model parameters.
It can achieve a personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
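The entry above gives only the outline of the mechanism, so the following is a loose sketch under our own assumptions, not the paper's algorithm: parameters are distorted with Gaussian noise whose scale is chosen per parameter, which a client could re-tune at every communication round.

```python
import numpy as np

def distort_parameters(params: np.ndarray, noise_scale: np.ndarray,
                       rng: np.random.Generator) -> np.ndarray:
    """Distort model parameters before upload: one noise scale per parameter
    lets each client pick its own utility-privacy trade-off per round."""
    return params + rng.normal(0.0, noise_scale)
```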
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
- Fair mapping [0.0]
We propose a novel pre-processing method based on transforming the distribution of protected groups onto a chosen target distribution.
We leverage recent work on the Wasserstein GAN and AttGAN frameworks to achieve optimal transport of data points.
Our proposed approach preserves the interpretability of the data and can be used without exactly defining the sensitive groups.
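No GAN is needed to see the core idea. As a loose illustration (our simplification, not the paper's WGAN/AttGAN method), the one-dimensional optimal transport map from a group's feature values onto a target distribution is plain rank-based quantile matching:

```python
import numpy as np

def quantile_match(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Map 1-D source samples onto the target distribution via rank-based
    quantile matching, i.e. the monotone (optimal transport) rearrangement."""
    ranks = np.argsort(np.argsort(source))   # rank of each source point
    q = (ranks + 0.5) / len(source)          # mid-rank quantiles in (0, 1)
    return np.quantile(target, q)            # matching target values
```

Applied column by column, such a map moves one protected group's distribution onto the chosen target while keeping the data interpretable, which is the property the entry highlights.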
arXiv Detail & Related papers (2022-09-01T17:31:27Z)
- Fair Classification with Adversarial Perturbations [35.030329189029246]
We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes.
Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness.
We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy and any algorithm with better fairness must have lower accuracy.
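To make the setting concrete, here is one plausible formalization in our own notation (the paper's exact formulation may differ): the learner minimizes empirical error subject to a fairness constraint that is evaluated on protected attributes the adversary may have corrupted in up to an $\eta$-fraction of the samples.

```latex
\min_{h \in \mathcal{H}} \; \widehat{\mathrm{err}}(h)
\quad \text{s.t.} \quad \Omega_{\mathrm{fair}}(h; \tilde{z}) \le \tau,
\qquad \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\left[\tilde{z}_i \neq z_i\right] \le \eta .
```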
arXiv Detail & Related papers (2021-06-10T17:56:59Z)
- Beyond Individual and Group Fairness [90.4666341812857]
We present a new data-driven model of fairness that is guided by the unfairness complaints received by the system.
Our model supports multiple fairness criteria and takes into account their potential incompatibilities.
arXiv Detail & Related papers (2020-08-21T14:14:44Z)
- New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art High-Dimensional Matrix Mechanism [McKennaMHM18], our algorithms provide better accuracy on large workloads.
arXiv Detail & Related papers (2020-07-10T15:46:05Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Fair Classification with Noisy Protected Attributes: A Framework with Provable Guarantees [43.326827444321935]
We present an optimization framework for learning a fair classifier in the presence of noisy perturbations in the protected attributes.
Our framework can be employed with a very general class of linear and linear-fractional fairness constraints.
We show that our framework can be used to attain either statistical rate or false positive rate fairness guarantees with a minimal loss in accuracy, even when the noise is large.
arXiv Detail & Related papers (2020-06-08T17:52:48Z)