Integrative conformal p-values for powerful out-of-distribution testing
with labeled outliers
- URL: http://arxiv.org/abs/2208.11111v1
- Date: Tue, 23 Aug 2022 17:52:20 GMT
- Title: Integrative conformal p-values for powerful out-of-distribution testing
with labeled outliers
- Authors: Ziyi Liang, Matteo Sesia, Wenguang Sun
- Abstract summary: This paper develops novel conformal methods to test whether a new observation was sampled from the same distribution as a reference set.
The described methods can re-weight standard conformal p-values based on dependent side information from known out-of-distribution data.
The solution can be implemented either through sample splitting or via a novel transductive cross-validation+ scheme.
- Score: 1.6371837018687636
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper develops novel conformal methods to test whether a new observation
was sampled from the same distribution as a reference set. Blending inductive
and transductive conformal inference, the described methods re-weight standard
conformal p-values in a principled way, using dependent side information from
known out-of-distribution data, and can
automatically take advantage of the most powerful model from any collection of
one-class and binary classifiers. The solution can be implemented either
through sample splitting or via a novel transductive cross-validation+ scheme
which may also be useful in other applications of conformal inference, due to
tighter guarantees compared to existing cross-validation approaches. After
studying false discovery rate control and power within a multiple testing
framework with several possible outliers, the proposed solution is shown to
outperform standard conformal p-values through simulations as well as
applications to image recognition and tabular data.
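For context, the sketch below illustrates the standard split-conformal p-values (followed by Benjamini-Hochberg filtering) that serve as the baseline the integrative method re-weights; the synthetic data, the IsolationForest scorer, and all variable names are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch, assuming a generic one-class scorer: standard split-conformal
# p-values for outlier testing, plus Benjamini-Hochberg to control the FDR over
# several possible outliers. Not the paper's integrative method.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Reference (inlier) data, split into a training part and a calibration part.
inliers = rng.normal(0.0, 1.0, size=(2000, 5))
train, calib = inliers[:1000], inliers[1000:]

# Test points: mostly inliers plus a few shifted outliers.
test = np.vstack([rng.normal(0.0, 1.0, size=(90, 5)),
                  rng.normal(3.0, 1.0, size=(10, 5))])

# Fit any one-class model on the training split; negate score_samples so that
# larger values mean "more outlying" (a nonconformity score).
model = IsolationForest(random_state=0).fit(train)
calib_scores = -model.score_samples(calib)
test_scores = -model.score_samples(test)

# Split-conformal p-value for each test point j:
# p_j = (1 + #{calibration scores >= score_j}) / (n_calib + 1)
n_calib = len(calib_scores)
pvals = (1.0 + (calib_scores[None, :] >= test_scores[:, None]).sum(axis=1)) / (n_calib + 1.0)

# Benjamini-Hochberg at level q flags the likely out-of-distribution points.
q = 0.1
m = len(pvals)
order = np.argsort(pvals)
passed = pvals[order] <= q * np.arange(1, m + 1) / m
k = passed.nonzero()[0].max() + 1 if passed.any() else 0
rejected = order[:k]
print(f"Flagged {len(rejected)} of {m} test points as outliers at FDR level {q}")
```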
Related papers
- Conditional Testing based on Localized Conformal p-values [5.6779147365057305]
We define localized conformal p-values by inverting prediction intervals and prove their theoretical properties.
These p-values are then applied to several conditional testing problems to illustrate their practicality.
arXiv Detail & Related papers (2024-09-25T11:30:14Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Transductive conformal inference with adaptive scores [3.591224588041813]
We consider the transductive setting, where decisions are made on a test sample of $m$ new points.
We show that the joint distribution of the conformal p-values follows a Pólya urn model, and establish a concentration inequality for their empirical distribution function.
We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks.
arXiv Detail & Related papers (2023-10-27T12:48:30Z)
- Derandomized Novelty Detection with FDR Control via Conformal E-values [20.864605211132663]
We propose to make conformal inferences more stable by leveraging suitable conformal e-values instead of p-values.
We show that the proposed method can reduce randomness without much loss of power compared to standard conformal inference.
arXiv Detail & Related papers (2023-02-14T19:21:44Z)
- Adaptive novelty detection with false discovery rate guarantee [1.8249324194382757]
We propose AdaDetect, a flexible method to control the false discovery rate (FDR) on detected novelties in finite samples.
Inspired by the multiple testing literature, we also propose variants of AdaDetect that are adaptive to the proportion of nulls.
The methods are illustrated on synthetic datasets and real-world datasets, including an application in astrophysics.
arXiv Detail & Related papers (2022-08-13T17:14:55Z)
- Predicting Out-of-Domain Generalization with Neighborhood Invariance [59.05399533508682]
We propose a measure of a classifier's output invariance in a local transformation neighborhood.
Our measure is simple to calculate, does not depend on the test point's true label, and can be applied even in out-of-domain (OOD) settings.
In experiments on benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our measure and actual OOD generalization.
arXiv Detail & Related papers (2022-07-05T14:55:16Z)
- Toward Learning Robust and Invariant Representations with Alignment Regularization and Data Augmentation [76.85274970052762]
This paper is motivated by the proliferation of alignment regularization options.
We evaluate the performance of several popular design choices along the dimensions of robustness and invariance.
We also formally analyze the behavior of alignment regularization to complement our empirical study under assumptions we consider realistic.
arXiv Detail & Related papers (2022-06-04T04:29:19Z)
- Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z)
- Testing for Outliers with Conformal p-values [14.158078752410182]
The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers.
We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points.
We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense.
arXiv Detail & Related papers (2021-04-16T17:59:21Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated importance weights in a single optimization (a sketch of the classical two-step baseline appears after this entry, for contrast).
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
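For the covariate shift entry above, a minimal sketch of the classical two-step baseline (estimate importance weights with a domain classifier, then fit a weighted model) that a one-step approach replaces with a single joint optimization; the logistic-regression models, the synthetic data, and all names are assumptions for illustration, not the cited paper's method.

```python
# Hypothetical sketch of two-step covariate shift adaptation, shown only as the
# baseline that one-step (joint) methods aim to improve on.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Labeled training inputs and unlabeled test inputs from shifted distributions.
X_train = rng.normal(0.0, 1.0, size=(1000, 3))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
X_test = rng.normal(0.7, 1.0, size=(500, 3))

# Step 1: estimate density ratios w(x) = p_test(x) / p_train(x) with a domain
# classifier that separates test inputs (label 1) from training inputs (label 0).
X_dom = np.vstack([X_train, X_test])
y_dom = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
domain_clf = LogisticRegression(max_iter=1000).fit(X_dom, y_dom)
p_test = domain_clf.predict_proba(X_train)[:, 1]
weights = (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))

# Step 2: fit the predictive model with importance-weighted training points.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train, sample_weight=weights)
print("Importance-weighted model coefficients:", clf.coef_)
```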