Non-Exchangeable Conformal Language Generation with Nearest Neighbors
- URL: http://arxiv.org/abs/2402.00707v1
- Date: Thu, 1 Feb 2024 16:04:04 GMT
- Title: Non-Exchangeable Conformal Language Generation with Nearest Neighbors
- Authors: Dennis Ulmer, Chrysoula Zerva, André F. T. Martins
- Abstract summary: Non-exchangeable conformal nucleus sampling is a novel extension of the conformal prediction framework to generation based on nearest neighbors.
Our method can be used post-hoc for an arbitrary model without extra training and supplies token-level, calibrated prediction sets equipped with statistical guarantees.
- Score: 12.790082627386482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantifying uncertainty in automatically generated text is important for
letting humans check potential hallucinations and making systems more reliable.
Conformal prediction is an attractive framework to provide predictions imbued
with statistical guarantees; however, its application to text generation is
challenging since any i.i.d. assumptions are not realistic. In this paper, we
bridge this gap by leveraging recent results on non-exchangeable conformal
prediction, which still ensures bounds on coverage. The result,
non-exchangeable conformal nucleus sampling, is a novel extension of the
conformal prediction framework to generation based on nearest neighbors. Our
method can be used post-hoc for an arbitrary model without extra training and
supplies token-level, calibrated prediction sets equipped with statistical
guarantees. Experiments in machine translation and language modeling show
encouraging results in generation quality. By also producing tighter prediction
sets with good coverage, we thus give a more theoretically principled way to
perform sampling with conformal guarantees.
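
To make the idea in the abstract more concrete, here is a minimal Python sketch of how nearest-neighbor-weighted conformal calibration could feed into nucleus-style sampling. It is not the authors' implementation. The abstract only states that the method is post-hoc, token-level, and based on nearest neighbors. Everything else here is an assumption made for illustration: the score (cumulative probability mass up to the true token), the squared Euclidean distance, the exponential weighting with a `temperature` parameter, and all function and parameter names.

```python
import numpy as np


def knn_weights(query, keys, k, temperature):
    """Return indices of the k nearest calibration points and their weights.

    Assumption: squared Euclidean distance and weights exp(-d^2 / temperature).
    """
    d2 = np.sum((keys - query) ** 2, axis=1)
    nn = np.argsort(d2)[:k]
    return nn, np.exp(-d2[nn] / temperature)


def weighted_quantile(scores, weights, alpha):
    """Smallest score q whose normalized weight mass reaches 1 - alpha.

    The test point gets weight 1 with its mass placed at +inf, so the
    result is +inf when the neighbors alone cannot reach 1 - alpha.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(scores)
    cum = np.cumsum(weights[order]) / (weights.sum() + 1.0)
    idx = np.searchsorted(cum, 1.0 - alpha, side="left")
    if idx >= len(scores):
        return np.inf
    return scores[order][idx]


def prediction_set(probs, q):
    """Smallest prefix of tokens (by descending probability) with mass >= q.

    If q exceeds the total mass (for example q = +inf), all tokens are kept.
    """
    order = np.argsort(-probs)
    cum = np.cumsum(probs[order])
    size = np.searchsorted(cum, min(q, cum[-1]), side="left") + 1
    return order[:size]


def conformal_nucleus_sample(probs, query, keys, cal_scores,
                             alpha=0.1, k=3, temperature=1.0, rng=None):
    """Sample one token from the calibrated prediction set."""
    rng = np.random.default_rng() if rng is None else rng
    nn, w = knn_weights(query, keys, k, temperature)
    q = weighted_quantile(cal_scores[nn], w, alpha)
    tokens = prediction_set(probs, q)
    p = probs[tokens] / probs[tokens].sum()  # renormalize inside the set
    return rng.choice(tokens, p=p), tokens, q
```

In this sketch, `keys` would hold representations of calibration decoding steps and `cal_scores` their non-conformity scores. At each generation step, the current representation is passed as `query`, so the quantile, and therefore the size of the sampling set, adapts to how the model behaved on similar calibration points.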
Related papers
- Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.
Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.
We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z)
- Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering [55.15192437680943]
Generative models lack rigorous statistical guarantees for their outputs.
We propose a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee.
This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example.
arXiv Detail & Related papers (2024-10-02T15:26:52Z)
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- Robust Conformal Prediction Using Privileged Information [17.886554223172517]
We develop a method to generate prediction sets with a guaranteed coverage rate that is robust to corruptions in the training data.
Our approach builds on conformal prediction, a powerful framework to construct prediction sets that are valid under the i.i.d. assumption.
arXiv Detail & Related papers (2024-06-08T08:56:47Z)
- Conformal Prediction for Deep Classifier via Label Ranking [29.784336674173616]
Conformal prediction is a statistical framework that generates prediction sets with a desired coverage guarantee.
We propose a novel algorithm named Sorted Adaptive Prediction Sets (SAPS).
SAPS discards all the probability values except for the maximum softmax probability.
arXiv Detail & Related papers (2023-10-10T08:54:14Z)
- Conformal Language Modeling [61.94417935386489]
We propose a novel approach to conformal prediction for generative language models (LMs).
Standard conformal prediction produces prediction sets with rigorous, statistical guarantees.
We demonstrate the promise of our approach on multiple tasks in open-domain question answering, text summarization, and radiology report generation.
arXiv Detail & Related papers (2023-06-16T21:55:08Z)
- Conformalizing Machine Translation Evaluation [9.89901717499058]
Several uncertainty estimation methods have been recently proposed for machine translation evaluation.
We show that the majority of them tend to underestimate model uncertainty, and as a result they often produce misleading confidence intervals that do not cover the ground truth.
We propose as an alternative the use of conformal prediction, a distribution-free method to obtain confidence intervals with a theoretically established guarantee on coverage.
arXiv Detail & Related papers (2023-06-09T19:36:18Z)
- Federated Conformal Predictors for Distributed Uncertainty Quantification [83.50609351513886]
Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning.
In this paper, we extend conformal prediction to the federated learning setting.
We propose a weaker notion of partial exchangeability, better suited to the FL setting, and use it to develop the Federated Conformal Prediction framework.
arXiv Detail & Related papers (2023-05-27T19:57:27Z)
- Distribution-Free Finite-Sample Guarantees and Split Conformal Prediction [0.0]
Split conformal prediction represents a promising avenue to obtain finite-sample guarantees under minimal distribution-free assumptions.
We highlight the connection between split conformal prediction and classical tolerance predictors developed in the 1940s.
arXiv Detail & Related papers (2022-10-26T14:12:24Z)
- Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy.
We present a framework that treats these two desiderata jointly.
We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z)