Hypothesis Testing for Class-Conditional Noise Using Local Maximum
Likelihood
- URL: http://arxiv.org/abs/2312.10238v1
- Date: Fri, 15 Dec 2023 22:14:58 GMT
- Title: Hypothesis Testing for Class-Conditional Noise Using Local Maximum
Likelihood
- Authors: Weisong Yang, Rafael Poyiadzi, Niall Twomey, Raul Santos-Rodriguez
- Abstract summary: In supervised learning, automatically assessing the quality of the labels before any learning takes place remains an open research question.
In this paper we show how similar procedures can be followed when the underlying model is a product of Local Maximum Likelihood Estimation.
This different view allows for wider applicability of the tests by offering users access to a richer model class.
- Score: 1.8798171797988192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In supervised learning, automatically assessing the quality of the labels
before any learning takes place remains an open research question. In certain
particular cases, hypothesis testing procedures have been proposed to assess
whether a given instance-label dataset is contaminated with class-conditional
label noise, as opposed to uniform label noise. The existing theory builds on
the asymptotic properties of the Maximum Likelihood Estimate for parametric
logistic regression. However, the parametric assumptions on top of which these
approaches are constructed are often too strong and unrealistic in practice. To
alleviate this problem, in this paper we propose an alternative path by showing
how similar procedures can be followed when the underlying model is a product
of Local Maximum Likelihood Estimation that leads to more flexible
nonparametric logistic regression models, which in turn are less susceptible to
model misspecification. This different view allows for wider applicability of
the tests by offering users access to a richer model class. Similarly to
existing works, we assume we have access to anchor points which are provided by
the users. We introduce the necessary ingredients for the adaptation of the
hypothesis tests to the case of nonparametric logistic regression and
empirically compare against the parametric approach, presenting both synthetic
and real-world case studies and discussing the advantages and limitations of
the proposed approach.
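To make the abstract's two ingredients concrete, the sketch below implements a locally weighted (kernel) logistic regression — one instance of Local Maximum Likelihood Estimation — and uses it at two user-supplied anchor points to recover class-conditional flip rates from noisy labels. This is an illustrative sketch only: the Gaussian kernel, bandwidth, anchor locations, and synthetic noise rates are assumptions made here, not the paper's settings or implementation.

```python
import numpy as np

def local_logistic_posterior(X, y, x0, bandwidth=0.5, n_iter=30):
    """Estimate P(y=1 | x0) via locally weighted logistic regression.

    Each observation is weighted by a Gaussian kernel of its distance
    to the query point x0, and the local coefficients are obtained by
    Newton-Raphson on the kernel-weighted log-likelihood.
    """
    n, d = X.shape
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * bandwidth ** 2))
    Xb = np.hstack([np.ones((n, 1)), X])       # local linear model with intercept
    beta = np.zeros(d + 1)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (w * (y - p))            # kernel-weighted score
        hess = (Xb * (w * p * (1 - p))[:, None]).T @ Xb
        beta += np.linalg.solve(hess + 1e-8 * np.eye(d + 1), grad)
    x0b = np.concatenate([[1.0], x0])
    return 1.0 / (1.0 + np.exp(-x0b @ beta))

# Synthetic check: inject class-conditional noise, then recover the flip
# rates from the estimated noisy posterior at two anchor points.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y_clean = (X[:, 0] > 0).astype(float)
rho_plus, rho_minus = 0.3, 0.1                 # asymmetric flip rates (assumed)
flip = rng.uniform(size=2000)
y_noisy = np.where(y_clean == 1,
                   np.where(flip < rho_plus, 0.0, 1.0),
                   np.where(flip < rho_minus, 1.0, 0.0))

# At an anchor deep inside the positive class, P(y_noisy=1|x) = 1 - rho_plus;
# at a negative-class anchor it equals rho_minus.
rho_plus_hat = 1.0 - local_logistic_posterior(X, y_noisy, np.array([2.0]))
rho_minus_hat = local_logistic_posterior(X, y_noisy, np.array([-2.0]))
```

A class-conditional noise test would then compare `rho_plus_hat` and `rho_minus_hat`, which coincide under uniform noise; the paper's contribution is precisely the adaptation of such hypothesis tests, including the required asymptotic theory, to this nonparametric setting.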
Related papers
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling (2024-03-02, Score: 53.7117640028211)
  We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues. We incorporate a suite of critical geometric properties to impose proper constraints on the layout of the constructed embedding space. A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale (2024-02-02, Score: 53.152460508207184)
  Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis. To achieve optimal performance, a three-step adaptation process is proposed: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
- Selective Nonparametric Regression via Testing (2023-09-28, Score: 54.20569354303575)
  We develop an abstention procedure by testing the hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
- A Tale of Sampling and Estimation in Discounted Reinforcement Learning (2023-04-11, Score: 50.43256303670011)
  We present a minimax lower bound on the discounted mean estimation problem. We show that estimating the mean by directly sampling from the discounted kernel of the Markov process brings compelling statistical properties.
- Doubly Robust Counterfactual Classification (2023-01-15, Score: 1.8907108368038217)
  We study counterfactual classification as a new tool for decision-making under hypothetical (contrary to fact) scenarios. We propose a doubly robust nonparametric estimator for a general counterfactual classifier.
- fAux: Testing Individual Fairness via Gradient Alignment (2022-10-10, Score: 2.5329739965085785)
  We describe a new approach for testing individual fairness that does not have either requirement. We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets.
- Self-Certifying Classification by Linearized Deep Assignment (2022-01-26, Score: 65.0100925582087)
  We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm. Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories (2021-06-03, Score: 61.3299263929289)
  Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio. We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
- Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set (2021-05-27, Score: 3.862247454265944)
  We develop a framework for constructing calibration schemes that satisfy rigorous frequentist statistical guarantees. We demonstrate our methodology on several numerical examples, including an application to the calibration of a limit order book market simulator.
- Statistical Hypothesis Testing for Class-Conditional Label Noise (2021-03-03, Score: 3.6895394817068357)
  This work aims to provide machine learning practitioners with tools to answer the question: is there class-conditional flipping noise in my labels? In particular, we present hypothesis tests to check whether a given dataset of instance-label pairs has been corrupted with class-conditional label noise.
- Achieving Equalized Odds by Resampling Sensitive Attributes (2020-06-08, Score: 13.114114427206678)
  We present a flexible framework for learning predictive models that approximately satisfy the equalized odds notion of fairness. A differentiable functional is used as a penalty driving the model parameters towards equalized odds. We develop a formal hypothesis test to detect whether a prediction rule violates this property, the first such test in the literature.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.