Understanding and Mitigating Accuracy Disparity in Regression
- URL: http://arxiv.org/abs/2102.12013v1
- Date: Wed, 24 Feb 2021 01:24:50 GMT
- Title: Understanding and Mitigating Accuracy Disparity in Regression
- Authors: Jianfeng Chi, Yuan Tian, Geoffrey J. Gordon, Han Zhao
- Abstract summary: We study the accuracy disparity problem in regression.
We propose an error decomposition theorem, which decomposes the accuracy disparity into the distance between marginal label distributions and the distance between conditional representations.
We then propose an algorithm to reduce this disparity and analyze the game-theoretic optima of the proposed objective functions.
- Score: 34.63275666745179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the widespread deployment of large-scale prediction systems in
high-stakes domains such as face recognition and criminal justice, disparity
in prediction accuracy between demographic subgroups calls for a
fundamental understanding of the source of such disparity and for algorithmic
intervention to mitigate it. In this paper, we study the accuracy disparity
problem in regression. We first propose an error decomposition
theorem, which decomposes the accuracy disparity into the distance between
marginal label distributions and the distance between conditional
representations, to help explain why such accuracy disparity appears in
practice. Motivated by this error decomposition and the general idea of
distribution alignment with statistical distances, we then propose an algorithm
to reduce this disparity and analyze the game-theoretic optima of the proposed
objective functions. To corroborate our theoretical findings, we also conduct
experiments on five benchmark datasets. The experimental results suggest that
our proposed algorithms can effectively mitigate accuracy disparity while
maintaining the predictive power of the regression models.
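As a rough illustration of the quantity the abstract discusses, the sketch below measures accuracy disparity as the absolute gap in mean squared error between two subgroups. The abstract does not spell out the exact metric, so this definition, the 0/1 group encoding, and the function names `group_mse` and `accuracy_disparity` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def group_mse(y_true, y_pred, group, g):
    """Mean squared error restricted to the samples whose group label equals g."""
    mask = group == g
    return float(np.mean((y_true[mask] - y_pred[mask]) ** 2))


def accuracy_disparity(y_true, y_pred, group):
    """Absolute gap in MSE between group 0 and group 1 (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    group = np.asarray(group)
    return abs(group_mse(y_true, y_pred, group, 0)
               - group_mse(y_true, y_pred, group, 1))


# Toy data: predictions are exact for group 0 and off by 2 on one sample of group 1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 5.0, 4.0]
group = [0, 0, 1, 1]
print(accuracy_disparity(y_true, y_pred, group))  # 2.0
```

A value of zero would mean the regressor is equally accurate on both subgroups under this metric; it says nothing about whether either group's error is small.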
Related papers
- Zero-Shot Uncertainty Quantification using Diffusion Probabilistic Models [7.136205674624813]
We conduct a study to evaluate the effectiveness of ensemble methods on solving different regression problems using diffusion models.
We demonstrate that ensemble methods consistently improve model prediction accuracy across various regression tasks.
Our study provides a comprehensive view of the utility of diffusion ensembles, serving as a useful reference for practitioners employing diffusion models in regression problem-solving.
arXiv Detail & Related papers (2024-08-08T18:34:52Z)
- On the Maximal Local Disparity of Fairness-Aware Classifiers [35.98015221840018]
We propose a novel fairness metric called Maximal Cumulative ratio Disparity along varying Predictions' neighborhood (MCDP).
To calculate the MCDP accurately and efficiently, we develop a provably exact algorithm and an approximate algorithm that greatly reduces the computational complexity with low estimation error.
arXiv Detail & Related papers (2024-06-05T13:35:48Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Distributed Variational Inference for Online Supervised Learning [15.038649101409804]
This paper develops a scalable distributed probabilistic inference algorithm.
It applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks.
arXiv Detail & Related papers (2023-09-05T22:33:02Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting message discrepancies that we address empirically.
We fine-tune over 31k networks from nine different architectures in the many- and few-shot settings.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- Costs and Benefits of Wasserstein Fair Regression [11.134279147254361]
In this paper, we characterize the inherent tradeoff between statistical parity and accuracy in the regression setting.
Our lower bound is sharp, algorithm-independent, and admits a simple interpretation.
We develop a practical algorithm for fair regression through the lens of representation learning.
arXiv Detail & Related papers (2021-06-16T14:24:44Z)
- Learning Prediction Intervals for Regression: Generalization and Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification.
We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes.
We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
arXiv Detail & Related papers (2021-02-26T17:55:30Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z) - Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.