A Unified View of Label Shift Estimation
- URL: http://arxiv.org/abs/2003.07554v3
- Date: Fri, 16 Oct 2020 19:23:32 GMT
- Title: A Unified View of Label Shift Estimation
- Authors: Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton
- Abstract summary: There are two dominant approaches for estimating the label marginal.
We present a unified view of the two methods and the first theoretical characterization of MLLS.
Our analysis attributes BBSE's statistical inefficiency to a loss of information due to coarse calibration.
- Score: 45.472049320861856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Under label shift, the label distribution p(y) might change but the
class-conditional distributions p(x|y) do not. There are two dominant
approaches for estimating the label marginal. BBSE, a moment-matching approach
based on confusion matrices, is provably consistent and provides interpretable
error bounds. However, a maximum likelihood estimation approach, which we call
MLLS, dominates empirically. In this paper, we present a unified view of the
two methods and the first theoretical characterization of MLLS. Our
contributions include (i) consistency conditions for MLLS, which include
calibration of the classifier and a confusion matrix invertibility condition
that BBSE also requires; (ii) a unified framework, casting BBSE as roughly
equivalent to MLLS for a particular choice of calibration method; and (iii) a
decomposition of MLLS's finite-sample error into terms reflecting
miscalibration and estimation error. Our analysis attributes BBSE's statistical
inefficiency to a loss of information due to coarse calibration. Experiments on
synthetic data, MNIST, and CIFAR10 support our findings.
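The two estimators contrasted in the abstract admit compact sketches. Below is an illustrative NumPy version (not the authors' code; function names, the hard-label variant of BBSE, and the fixed iteration count are our own choices): `bbse_weights` performs BBSE's moment matching by solving the confusion-matrix linear system for the importance weights w(y) = p_t(y)/p_s(y), and `mlls_em` runs the standard EM iteration (in the style of Saerens et al.) that maximizes the MLLS likelihood given calibrated predicted probabilities on target data.

```python
import numpy as np

def bbse_weights(y_src, yhat_src, yhat_tgt, k):
    """BBSE: estimate w(y) = p_t(y) / p_s(y) from hard predictions."""
    # Confusion matrix C[i, j] = P_src(yhat = i, y = j), estimated by counting.
    C = np.zeros((k, k))
    for pred, true in zip(yhat_src, y_src):
        C[pred, true] += 1
    C /= len(y_src)
    # Predicted-label marginal on the target: mu[i] = P_tgt(yhat = i).
    mu_tgt = np.bincount(yhat_tgt, minlength=k) / len(yhat_tgt)
    # Moment matching: solve C w = mu_tgt (requires C to be invertible).
    w = np.linalg.solve(C, mu_tgt)
    return np.clip(w, 0.0, None)  # weights must be non-negative

def mlls_em(probs_tgt, p_src, n_iter=100):
    """MLLS via EM: probs_tgt is an (n, k) array of calibrated
    source-posterior predictions f(x) on target inputs; p_src is p_s(y)."""
    p_tgt = p_src.copy()
    for _ in range(n_iter):
        # E-step: reweight source posteriors by the current ratio p_t(y)/p_s(y).
        post = probs_tgt * (p_tgt / p_src)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: the new label marginal is the average target posterior.
        p_tgt = post.mean(axis=0)
    return p_tgt
```

With a perfect classifier the confusion matrix is diagonal, so BBSE recovers the ratio of label marginals exactly; with perfectly confident calibrated predictions, MLLS's EM fixed point is the empirical target label distribution.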
Related papers
- SoftCVI: Contrastive variational inference with self-generated soft labels [2.5398014196797614]
Variational inference and Markov chain Monte Carlo methods are the predominant tools for approximating intractable posterior distributions.
We introduce Soft Contrastive Variational Inference (SoftCVI), which allows a family of variational objectives to be derived through a contrastive estimation framework.
We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches.
arXiv Detail & Related papers (2024-07-22T14:54:12Z)
- Analysis of Diagnostics (Part I): Prevalence, Uncertainty Quantification, and Machine Learning [0.0]
This manuscript is the first in a two-part series that studies deeper connections between classification theory and prevalence.
We propose a numerical, homotopy algorithm that estimates $B^*(q)$ by minimizing a prevalence-weighted empirical error.
We validate our methods in the context of synthetic data and a research-use-only SARS-CoV-2 enzyme-linked immunosorbent (ELISA) assay.
arXiv Detail & Related papers (2023-08-30T13:26:49Z)
- How Does Pseudo-Labeling Affect the Generalization Error of the Semi-Supervised Gibbs Algorithm? [73.80001705134147]
We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm.
The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset.
arXiv Detail & Related papers (2022-10-15T04:11:56Z)
- Algorithmic Fairness Verification with Graphical Models [24.8005399877574]
We propose an efficient fairness verifier, called FVGM, that encodes correlations among features as a Bayesian network.
We show that FVGM leads to an accurate and scalable assessment for more diverse families of fairness-enhancing algorithms.
arXiv Detail & Related papers (2021-09-20T12:05:14Z)
- Distribution-free uncertainty quantification for classification under label shift [105.27463615756733]
We focus on uncertainty quantification (UQ) for classification problems via two avenues.
We first argue that label shift hurts UQ, by showing degradation in coverage and calibration.
We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
arXiv Detail & Related papers (2021-03-04T20:51:03Z)
- Label-Imbalanced and Group-Sensitive Classification under Overparameterization [32.923780772605596]
Label-imbalanced and group-sensitive classification seeks to appropriately modify standard training algorithms to optimize relevant metrics.
We show that a logit-adjusted loss modification to standard empirical risk minimization might be ineffective in general.
We show that our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalances (label/group) in a unifying way.
arXiv Detail & Related papers (2021-03-02T08:09:43Z)
- A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize.
From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence that benefits classification.
We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
arXiv Detail & Related papers (2021-01-25T09:46:14Z)
- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning [126.31716228319902]
We develop Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)
- Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning [157.2709657207203]
Existing methods for estimating the transition matrix rely heavily on estimating the noisy class posterior.
We introduce an intermediate class to avoid directly estimating the noisy class posterior.
By this intermediate class, the original transition matrix can then be factorized into the product of two easy-to-estimate transition matrices.
arXiv Detail & Related papers (2020-06-14T05:48:20Z)
- Estimation of Classification Rules from Partially Classified Data [0.9137554315375919]
We consider the situation where the observed sample contains some observations whose class of origin is known, and where the remaining observations in the sample are unclassified.
For class-conditional distributions taken to be known up to a vector of unknown parameters, the aim is to estimate the Bayes' rule of allocation for the allocation of subsequent unclassified observations.
arXiv Detail & Related papers (2020-04-13T23:35:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.