AdapFair: Ensuring Continuous Fairness for Machine Learning Operations
- URL: http://arxiv.org/abs/2409.15088v1
- Date: Mon, 23 Sep 2024 15:01:47 GMT
- Title: AdapFair: Ensuring Continuous Fairness for Machine Learning Operations
- Authors: Yinghui Huang, Zihao Tang, Xiangyu Chang
- Abstract summary: We present a debiasing framework designed to find an optimal fair transformation of input data.
We leverage normalizing flows to enable efficient, information-preserving data transformation.
We introduce an efficient optimization algorithm with closed-form gradient computations.
- Score: 7.909259406397651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The biases and discrimination of machine learning algorithms have attracted significant attention, leading to the development of various algorithms tailored to specific contexts. However, these solutions often fall short of addressing fairness issues inherent in machine learning operations. In this paper, we present a debiasing framework designed to find an optimal fair transformation of input data that maximally preserves data predictability. A distinctive feature of our approach is its flexibility and efficiency. It can be integrated with any downstream black-box classifier, providing continuous fairness guarantees with minimal retraining effort, even in the face of frequent data drifts, evolving fairness requirements, and batches of similar tasks. To achieve this, we leverage normalizing flows to enable efficient, information-preserving data transformation, ensuring that no critical information is lost during the debiasing process. Additionally, we incorporate the Wasserstein distance as the unfairness measure to guide the optimization of data transformations. Finally, we introduce an efficient optimization algorithm with closed-form gradient computations, making our framework scalable and suitable for dynamic, real-world environments.
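As a rough illustration of the unfairness measure named in the abstract, the sketch below scores a black-box classifier's group disparity with the 1-Wasserstein distance between its score distributions on two protected groups. This is a minimal sketch assuming NumPy/SciPy and synthetic data; the function and variable names are illustrative and not taken from the paper, which combines this measure with normalizing-flow data transformations and closed-form gradients.

```python
# Minimal sketch (not the paper's code): quantify group unfairness of a black-box
# classifier via the 1-Wasserstein distance between per-group score distributions.
import numpy as np
from scipy.stats import wasserstein_distance

def group_unfairness(scores: np.ndarray, groups: np.ndarray) -> float:
    """1-Wasserstein distance between the score distributions of groups 0 and 1."""
    return wasserstein_distance(scores[groups == 0], scores[groups == 1])

# Toy data: a binary protected attribute and classifier scores biased by group.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=1000)
scores = rng.beta(2 + 3 * groups, 5)   # group 1 receives systematically higher scores
print(f"unfairness (W1): {group_unfairness(scores, groups):.3f}")
```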
Related papers
- Re-evaluating Group Robustness via Adaptive Class-Specific Scaling [47.41034887474166]
Group distributionally robust optimization is a prominent algorithm used to mitigate spurious correlations and address dataset bias.
Existing approaches have reported improvements in robust accuracies but come at the cost of average accuracy due to inherent trade-offs.
We propose a class-specific scaling strategy, directly applicable to existing debiasing algorithms with no additional training.
We develop an instance-wise adaptive scaling technique to alleviate this trade-off, even leading to improvements in both robust and average accuracies.
arXiv Detail & Related papers (2024-12-19T16:01:51Z)
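The scaling entry above applies per-class score scaling to an already-trained (and possibly debiased) classifier without additional training. Below is a hedged, generic sketch of that idea with hand-picked weights; the paper's contribution is choosing such scalings adaptively (including instance-wise), which this toy example does not implement.

```python
# Hedged sketch: post-hoc class-specific scaling of classifier probabilities.
# The weights are hand-picked for illustration; the paper derives them adaptively.
import numpy as np

def scaled_argmax(probs: np.ndarray, class_weights: np.ndarray) -> np.ndarray:
    """Rescale each class's probability by a class-specific weight, then predict."""
    return (probs * class_weights[None, :]).argmax(axis=1)

probs = np.array([[0.60, 0.40],
                  [0.55, 0.45],
                  [0.30, 0.70]])        # toy black-box classifier outputs
weights = np.array([0.7, 1.3])          # up-weight the second (e.g. minority) class
print(probs.argmax(axis=1))             # unscaled predictions: [0 0 1]
print(scaled_argmax(probs, weights))    # scaled predictions:   [1 1 1]
```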
- Navigating Towards Fairness with Data Selection [27.731128352096555]
We introduce a data selection method designed to efficiently and flexibly mitigate label bias.
Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set.
Our modality-agnostic method has proven efficient and effective in handling label bias and improving fairness across diverse datasets in experimental evaluations.
arXiv Detail & Related papers (2024-12-15T06:11:05Z)
- A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation [0.562479170374811]
In many real-world applications, continuous machine learning (ML) systems are crucial but prone to data drift.
Traditional drift adaptation methods typically update models using ensemble techniques, often discarding drifted historical data.
We contend that explicitly incorporating drifted data into the model training process significantly enhances model accuracy and robustness.
arXiv Detail & Related papers (2024-11-23T17:35:23Z)
- Towards Explainable Automated Data Quality Enhancement without Domain Knowledge [0.0]
We propose a comprehensive framework designed to automatically assess and rectify data quality issues in any given dataset.
Our primary objective is to address three fundamental types of defects: absence, redundancy, and incoherence.
We adopt a hybrid approach that integrates statistical methods with machine learning algorithms.
arXiv Detail & Related papers (2024-09-16T10:08:05Z)
- Evolutionary Multi-Objective Optimisation for Fairness-Aware Self Adjusting Memory Classifiers in Data Streams [2.8366371519410887]
We introduce a novel approach to enhance fairness in machine learning algorithms applied to data stream classification.
The proposed approach integrates the strengths of the self-adjusting memory K-Nearest-Neighbour algorithm with evolutionary multi-objective optimisation.
We show that the proposed approach maintains competitive accuracy and significantly reduces discrimination.
arXiv Detail & Related papers (2024-04-18T10:59:04Z)
- OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport [51.6416022358349]
OTClean is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data.
arXiv Detail & Related papers (2024-03-04T18:23:55Z)
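The OTClean entry above builds on an iterative algorithm inspired by Sinkhorn's matrix scaling. Purely as background, here is a generic Sinkhorn iteration for entropy-regularized optimal transport between two discrete distributions; it is not the paper's repair procedure, and the marginals, cost matrix, and parameter values are invented for illustration.

```python
# Generic Sinkhorn matrix-scaling sketch for entropy-regularized optimal transport.
# Background illustration only; this is not OTClean's data-repair algorithm.
import numpy as np

def sinkhorn(mu: np.ndarray, nu: np.ndarray, cost: np.ndarray,
             eps: float = 0.1, n_iter: int = 200) -> np.ndarray:
    """Return a transport plan whose marginals approximately match mu (rows) and nu (cols)."""
    K = np.exp(-cost / eps)                 # Gibbs kernel
    u, v = np.ones_like(mu), np.ones_like(nu)
    for _ in range(n_iter):                 # alternate row/column rescaling
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]      # diag(u) @ K @ diag(v)

mu = np.array([0.5, 0.5])                   # source marginal
nu = np.array([0.25, 0.75])                 # target marginal
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])               # toy cost matrix
plan = sinkhorn(mu, nu, cost)
print(plan.round(3), plan.sum(axis=0).round(3))  # columns sum to nu
```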
- Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
Offline data-driven decision-making involves synthesizing optimized decisions with no active interaction.
A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good.
In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Bilevel Optimization for Differentially Private Optimization in Energy Systems [53.806512366696275]
This paper studies how to apply differential privacy to constrained optimization problems whose inputs are sensitive.
The paper shows that, under a natural assumption, a bilevel model can be solved efficiently for large-scale nonlinear optimization problems.
arXiv Detail & Related papers (2020-01-26T20:15:28Z)