Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
- URL: http://arxiv.org/abs/2410.01660v1
- Date: Wed, 2 Oct 2024 15:26:52 GMT
- Title: Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
- Authors: Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
- Abstract summary: Generative models lack rigorous statistical guarantees for their outputs.
We propose a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee.
This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example.
- Score: 55.15192437680943
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models lack rigorous statistical guarantees for their outputs and are therefore unreliable in safety-critical applications. In this work, we propose Sequential Conformal Prediction for Generative Models (SCOPE-Gen), a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee called conformal admissibility control. This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example. To this end, our method first samples an initial set of i.i.d. examples from a black box generative model. Then, this set is iteratively pruned via so-called greedy filters. As a consequence of the iterative generation procedure, admissibility of the final prediction set factorizes as a Markov chain. This factorization is crucial, because it allows each factor to be controlled separately, using conformal prediction. In comparison to prior work, our method demonstrates a large reduction in the number of admissibility evaluations during calibration. This reduction is important in safety-critical applications, where these evaluations must be conducted manually by domain experts and are therefore costly and time-consuming. We highlight the advantages of our method in terms of admissibility evaluations and cardinality of the prediction sets through experiments in natural language generation and molecular graph extension tasks.
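As a rough illustration of the sample-then-prune pipeline sketched in the abstract, the snippet below draws i.i.d. candidates from a black-box generator and prunes them with filters whose thresholds are calibrated on held-out admissible examples. The names (sample_then_prune, calibrate_threshold, the toy generator and filter) are hypothetical, and the per-stage error-budget split and quantile calibration are simplified stand-ins for SCOPE-Gen's exact procedure rather than a faithful implementation.

```python
import numpy as np

def calibrate_threshold(calib_scores, alpha):
    # Lower empirical quantile with a finite-sample correction: an exchangeable
    # admissible example scores >= tau with probability at least 1 - alpha.
    n = len(calib_scores)
    k = int(np.floor(alpha * (n + 1)))
    return -np.inf if k < 1 else np.sort(calib_scores)[k - 1]

def sample_then_prune(generate, filters, calib_scores_per_filter, alpha, m=20):
    """Sketch: sample m i.i.d. candidates, then greedily prune them stage by stage.

    generate(m): m i.i.d. candidates from a black-box generative model.
    filters:     scoring functions (higher score = more likely admissible).
    calib_scores_per_filter: per-filter scores of held-out admissible examples.
    """
    stage_alpha = alpha / len(filters)      # crude split of the error budget
    candidates = list(generate(m))          # initial i.i.d. prediction set
    for score_fn, calib_scores in zip(filters, calib_scores_per_filter):
        tau = calibrate_threshold(calib_scores, stage_alpha)
        candidates = [c for c in candidates if score_fn(c) >= tau]  # greedy filter
    return candidates

# Toy usage with a dummy generator and a single quality filter.
rng = np.random.default_rng(0)
toy_generate = lambda m: rng.normal(size=m)        # stand-in "generator"
toy_score = lambda x: float(x)                     # stand-in quality score
calib = rng.normal(loc=0.5, size=100)              # scores of admissible examples
print(sample_then_prune(toy_generate, [toy_score], [calib], alpha=0.1))
```

The point of the Markov factorization in the abstract is that each pruning stage can be calibrated on its own; the uniform split of the miscoverage budget above is just one conservative way to combine the stages.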
Related papers
- Online scalable Gaussian processes with conformal prediction for guaranteed coverage [32.21093722162573]
The consistency of the resulting uncertainty values hinges on the premise that the learning function conforms to the properties specified by the GP model.
We propose to wed the GP with the prevailing conformal prediction (CP), a distribution-free post-processing framework that produces prediction sets with provably valid coverage.
arXiv Detail & Related papers (2024-10-07T19:22:15Z)
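As a generic illustration of wrapping a Gaussian process with conformal prediction, the sketch below applies plain split conformal prediction to the absolute residuals of a scikit-learn GaussianProcessRegressor on toy data. It is an assumed vanilla construction for intuition, not the online, scalable variant proposed in the paper above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy data: y = sin(x) + noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# Proper training set for the GP and a held-out calibration set.
X_tr, y_tr, X_cal, y_cal = X[:120], y[:120], X[120:], y[120:]
gp = GaussianProcessRegressor(alpha=1e-2).fit(X_tr, y_tr)  # alpha here is noise jitter

# Split conformal: calibrate the absolute-residual quantile on held-out data.
miscoverage = 0.1
residuals = np.abs(y_cal - gp.predict(X_cal))
n = len(residuals)
q = np.quantile(residuals, np.ceil((1 - miscoverage) * (n + 1)) / n, method="higher")

# Symmetric prediction interval for a new input, marginally valid at level 1 - miscoverage.
x_new = np.array([[0.5]])
mu = gp.predict(x_new)[0]
print(f"interval: [{mu - q:.3f}, {mu + q:.3f}]")
```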
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- Non-Exchangeable Conformal Language Generation with Nearest Neighbors [12.790082627386482]
Non-exchangeable conformal nucleus sampling is a novel extension of the conformal prediction framework to generation based on nearest neighbors.
Our method can be used post-hoc for an arbitrary model without extra training and supplies token-level, calibrated prediction sets equipped with statistical guarantees.
arXiv Detail & Related papers (2024-02-01T16:04:04Z)
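To make token-level calibrated prediction sets concrete, here is a minimal split-conformal sketch of nucleus-style token sets on toy next-token distributions. The nearest-neighbor weighting that makes the actual method non-exchangeable is omitted, so this is an illustrative simplification of the idea in the paper above, not its procedure.

```python
import numpy as np

def nonconformity(probs, gold):
    # Probability mass needed to cover the gold token when tokens are added
    # in order of decreasing probability.
    order = np.argsort(-probs)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(probs))
    return float(np.cumsum(probs[order])[ranks[gold]])

def token_prediction_set(probs, q_hat):
    # Smallest "nucleus" of top tokens whose cumulative probability reaches
    # the calibrated threshold q_hat.
    order = np.argsort(-probs)
    k = int(np.searchsorted(np.cumsum(probs[order]), q_hat)) + 1
    return order[:k]

# Toy calibration on dummy next-token distributions.
rng = np.random.default_rng(0)
vocab, alpha = 10, 0.2
scores = [
    nonconformity(rng.dirichlet(np.ones(vocab)), gold=int(rng.integers(vocab)))
    for _ in range(200)
]
n = len(scores)
q_hat = np.quantile(scores, min(1.0, np.ceil((1 - alpha) * (n + 1)) / n), method="higher")

p_test = rng.dirichlet(np.ones(vocab))
print(token_prediction_set(p_test, q_hat))  # calibrated token-level prediction set
```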
- Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees [47.22930583160043]
We propose a method for building adaptive cross-conformal prediction intervals.
The resulting conformal prediction intervals exhibit a level of adaptivity akin to Bayesian credibility sets.
The potential applicability of the method is demonstrated in the context of surrogate modeling of an expensive-to-evaluate simulator of the clogging phenomenon in steam generators of nuclear reactors.
arXiv Detail & Related papers (2024-01-15T14:45:18Z)
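To give a feel for the cross-conformal construction behind the paper above (as opposed to a single calibration split), the sketch below pools out-of-fold GP residuals across K folds and uses their quantile as a constant interval half-width. This is a plain, non-adaptive cross-conformal baseline under toy assumptions, not the adaptive intervals proposed there.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=150)

# Cross-conformal: pool out-of-fold residuals over K folds instead of using a
# single calibration split, so every point serves both fitting and calibration.
residuals = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    gp = GaussianProcessRegressor(alpha=1e-2).fit(X[train_idx], y[train_idx])
    residuals.extend(np.abs(y[test_idx] - gp.predict(X[test_idx])))

miscoverage = 0.1
q = np.quantile(residuals, 1 - miscoverage, method="higher")

# Constant-width interval around a surrogate refit on all data (non-adaptive baseline).
gp_full = GaussianProcessRegressor(alpha=1e-2).fit(X, y)
x_new = np.array([[1.0]])
mu = gp_full.predict(x_new)[0]
print(f"interval: [{mu - q:.3f}, {mu + q:.3f}]")
```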
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
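The confidence-based deferral rule described in the paper above fits in a few lines; the sketch below (with made-up models and a made-up threshold) stops at the first classifier whose top-class probability clears the threshold and otherwise defers to the next, larger model.

```python
import numpy as np

def cascade_predict(x, models, thresholds):
    """Confidence-based deferral: run cheaper models first and defer to the
    next model only when the current top-class confidence is too low."""
    for model, tau in zip(models[:-1], thresholds):
        probs = model(x)
        if np.max(probs) >= tau:           # confident enough: stop early
            return int(np.argmax(probs))
    return int(np.argmax(models[-1](x)))   # the last model always answers

# Toy usage with two dummy "classifiers" over three classes.
small = lambda x: np.array([0.5, 0.3, 0.2])    # cheap, less confident model
large = lambda x: np.array([0.1, 0.85, 0.05])  # expensive, more confident model
print(cascade_predict(x=None, models=[small, large], thresholds=[0.7]))  # -> 1
```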
- Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations [67.40641255908443]
We identify limitations of model-randomization-based sanity checks for the purpose of evaluating explanations.
Top-down model randomization preserves scales of forward pass activations with high probability.
arXiv Detail & Related papers (2022-11-22T18:52:38Z)
- Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy.
We present a framework that treats these two desiderata jointly.
We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z)
- On Model Identification and Out-of-Sample Prediction of Principal Component Regression: Applications to Synthetic Controls [20.96904429337912]
We analyze principal component regression (PCR) in a high-dimensional error-in-variables setting with fixed design.
We establish non-asymptotic out-of-sample prediction guarantees that improve upon the best known rates.
arXiv Detail & Related papers (2020-10-27T17:07:36Z)
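Since the paper above analyzes principal component regression, here is a compact reminder of what PCR does, in a toy error-in-variables setup where low-rank covariates are observed with measurement noise. The estimator and data are purely illustrative and do not reproduce the paper's fixed-design analysis or its guarantees.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, k):
    """Principal component regression: project onto the top-k right singular
    vectors of the (noisy) design, then run least squares in that subspace."""
    Xc = X_train - X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                                   # top-k principal directions
    Z = Xc @ V_k
    beta, *_ = np.linalg.lstsq(Z, y_train - y_train.mean(), rcond=None)
    Z_test = (X_test - X_train.mean(axis=0)) @ V_k
    return Z_test @ beta + y_train.mean()

# Toy error-in-variables setup: a low-rank design observed with noise.
rng = np.random.default_rng(0)
n, p, r = 200, 50, 3
X_true = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))   # low-rank covariates
y = X_true @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
X_obs = X_true + 0.5 * rng.normal(size=(n, p))               # measurement noise
print(pcr_fit_predict(X_obs[:150], y[:150], X_obs[150:], k=r)[:5])
```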
- Efficient Conformal Prediction via Cascaded Inference with Expanded Admission [43.596058175459746]
We present a novel approach for conformal prediction (CP).
We aim to identify a set of promising prediction candidates in place of a single prediction.
This set is guaranteed to contain a correct answer with high probability.
arXiv Detail & Related papers (2020-07-06T23:13:07Z)
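To illustrate the candidate-set idea in the paper above, the sketch below calibrates an admission threshold from the best model score among each calibration example's admissible candidates and then keeps every test candidate above that threshold. The data structures and helper names are hypothetical, and the construction is a simplified reading of the expanded-admission guarantee rather than the paper's cascaded procedure.

```python
import numpy as np

def calibrate_admission_threshold(calib_examples, score_fn, alpha):
    # For each calibration example, record the best score among its admissible
    # candidates; a new example's best admissible candidate then clears a low
    # quantile of these scores with probability at least 1 - alpha.
    best_admissible = [
        max(score_fn(c) for c in ex["candidates"] if ex["is_admissible"](c))
        for ex in calib_examples
    ]
    n = len(best_admissible)
    k = int(np.floor(alpha * (n + 1)))
    return -np.inf if k < 1 else np.sort(best_admissible)[k - 1]

def prediction_set(candidates, score_fn, tau):
    # Keep every candidate scoring at least the calibrated threshold, so the
    # returned set contains an admissible answer with high probability.
    return [c for c in candidates if score_fn(c) >= tau]

# Toy usage: candidates are numbers, "admissible" means positive, and each
# calibration example is built to contain at least one admissible candidate.
rng = np.random.default_rng(0)
is_pos = lambda c: c > 0
calib = [{"candidates": list(rng.normal(size=5)) + [abs(rng.normal()) + 0.1],
          "is_admissible": is_pos} for _ in range(100)]
tau = calibrate_admission_threshold(calib, score_fn=lambda c: c, alpha=0.1)
print(prediction_set(list(rng.normal(size=10)), score_fn=lambda c: c, tau=tau))
```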