Model-agnostic bias mitigation methods with regressor distribution
control for Wasserstein-based fairness metrics
- URL: http://arxiv.org/abs/2111.11259v1
- Date: Fri, 19 Nov 2021 17:31:22 GMT
- Title: Model-agnostic bias mitigation methods with regressor distribution
control for Wasserstein-based fairness metrics
- Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Ryan Franks, Arjun Ravi
Kannan
- Abstract summary: We propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions.
Our novel methodology performs optimization in low-dimensional spaces and avoids expensive model retraining.
- Score: 0.6509758931804478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article is a companion paper to our earlier work Miroshnikov et al.
(2021) on fairness interpretability, which introduces bias explanations. In the
current work, we propose a bias mitigation methodology based upon the
construction of post-processed models with fairer regressor distributions for
Wasserstein-based fairness metrics. By identifying the list of predictors
contributing the most to the bias, we reduce the dimensionality of the problem
by mitigating the bias originating from those predictors. The post-processing
methodology involves reshaping the predictor distributions by balancing the
positive and negative bias explanations and allows for the regressor bias to
decrease. We design an algorithm that uses Bayesian optimization to construct
the bias-performance efficient frontier over the family of post-processed
models, from which an optimal model is selected. Our novel methodology performs
optimization in low-dimensional spaces and avoids expensive model retraining.
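The two ingredients named in the abstract, a Wasserstein-based bias metric and a bias-performance efficient frontier, can be illustrated with a minimal sketch. It assumes bias is measured as the 1-Wasserstein distance between the regressor score distributions of two subpopulations, and that a set of candidate post-processed models has already been evaluated as (bias, loss) pairs. The function names `wasserstein_1` and `efficient_frontier` are hypothetical, and the sketch filters precomputed candidates rather than running the Bayesian optimization the paper describes.

```python
from bisect import bisect_right


def wasserstein_1(a, b):
    """1-Wasserstein distance between two empirical 1-D samples.

    Computed as the integral of |F_a - F_b| over the real line, where
    F_a and F_b are the empirical CDFs of the two samples.
    """
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both empirical CDFs are constant on [left, right).
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def efficient_frontier(candidates):
    """Return the (bias, loss) pairs not dominated by any other pair.

    Lower is better on both axes. Assumption: each candidate is a
    post-processed model already scored on bias and on loss.
    """
    frontier = []
    best_loss = float("inf")
    # Scan in order of increasing bias; keep a point only if it
    # strictly improves on the best loss seen so far.
    for bias, loss in sorted(candidates):
        if loss < best_loss:
            frontier.append((bias, loss))
            best_loss = loss
    return frontier


# Illustrative (made-up) regressor scores for two subpopulations.
scores_group_0 = [0.2, 0.4, 0.6]
scores_group_1 = [0.4, 0.6, 0.8]
model_bias = wasserstein_1(scores_group_0, scores_group_1)

# Illustrative (made-up) bias/loss evaluations of candidate models.
candidates = [(0.20, 0.30), (0.10, 0.35), (0.15, 0.40), (0.05, 0.50)]
frontier = efficient_frontier(candidates)
```

A final model would then be picked from `frontier` according to how much performance loss is acceptable for a given reduction in bias.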
Related papers
- Inference-Time Selective Debiasing [27.578390085427156]
We propose selective debiasing -- an inference-time safety mechanism that aims to increase the overall quality of models.
We identify the potentially biased model predictions and, instead of discarding them, we debias them using LEACE -- a post-processing debiasing method.
Experiments with text classification datasets demonstrate that selective debiasing helps to close the performance gap between post-processing methods and at-training and pre-processing debiasing techniques.
arXiv Detail & Related papers (2024-07-27T21:56:23Z)
- Robust Preference Optimization through Reward Model Distillation [68.65844394615702]
Language model (LM) post-training involves maximizing a reward function that is derived from preference annotations.
DPO is a popular offline alignment method that trains a policy directly on preference data without the need to train a reward model or apply reinforcement learning.
We analyze this phenomenon and propose distillation to get a better proxy for the true preference distribution over generation pairs.
arXiv Detail & Related papers (2024-05-29T17:39:48Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases [76.9127853906115]
Bridging the gap between diffusion models and human preferences is crucial for their integration into practical generative.
We propose Temporal Diffusion Policy Optimization with critic active neuron Reset (TDPO-R), a policy gradient algorithm that exploits the temporal inductive bias of diffusion models.
Empirical results demonstrate the superior efficacy of our methods in mitigating reward overoptimization.
arXiv Detail & Related papers (2024-02-13T15:55:41Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations [4.960902915238239]
We propose a theoretically guaranteed model-agnostic balancing approach that can be applied to any existing debiasing method.
The proposed approach makes full use of unbiased data by alternately correcting model parameters learned with biased data and adaptively learning balance coefficients of biased samples for further debiasing.
arXiv Detail & Related papers (2023-04-17T08:56:55Z)
- Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities [17.082695183953486]
A popular workaround is to train a robust model by re-weighting training examples based on a secondary biased model.
Here, the underlying assumption is that the biased model resorts to shortcut features.
We introduce a fine-tuning strategy that incorporates the similarity between the main and biased model attribution scores in a Product of Experts loss function.
arXiv Detail & Related papers (2023-02-06T15:21:41Z)
- Debiased Fine-Tuning for Vision-language Models by Prompt Regularization [50.41984119504716]
We present a new paradigm for fine-tuning large-scale vision pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg).
ProReg uses the prediction by prompting the pretrained model to regularize the fine-tuning.
We show the consistently strong performance of ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning, and other state-of-the-art methods.
arXiv Detail & Related papers (2023-01-29T11:53:55Z)
- Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks [10.748627178113418]
We propose a novel modular bias mitigation approach, consisting of stand-alone, highly sparse debiasing subnetworks.
We conduct experiments on three classification tasks with gender, race, and age as protected attributes.
arXiv Detail & Related papers (2022-05-30T15:21:25Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.