Out-of-Distribution Detection and Selective Generation for Conditional
Language Models
- URL: http://arxiv.org/abs/2209.15558v1
- Date: Fri, 30 Sep 2022 16:17:11 GMT
- Title: Out-of-Distribution Detection and Selective Generation for Conditional
Language Models
- Authors: Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji
Lakshminarayanan, Peter J. Liu
- Abstract summary: Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence.
We present a highly accurate and lightweight OOD detection method for CLMs.
We show how our method can be used under the common and realistic setting of distribution shift for selective generation of high-quality outputs.
- Score: 40.15896981028647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning algorithms typically assume independent and identically
distributed samples in training and at test time. Much work has shown that
high-performing ML classifiers can degrade significantly and provide
overly confident, wrong classification predictions, particularly for
out-of-distribution (OOD) inputs. Conditional language models (CLMs) are
predominantly trained to classify the next token in an output sequence, and may
suffer even worse degradation on OOD inputs as the prediction is done
auto-regressively over many steps. Furthermore, the space of potential
low-quality outputs is larger, as arbitrary text can be generated, and it is
important to know when to trust the generated output. We present a highly
accurate and lightweight OOD detection method for CLMs, and demonstrate its
effectiveness on abstractive summarization and translation. We also show how
our method can be used under the common and realistic setting of distribution
shift for selective generation (analogous to selective prediction for
classification) of high-quality outputs, while automatically abstaining from
low-quality ones, enabling safer deployment of generative language models.
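The abstract frames selective generation as scoring each input with the OOD detector and abstaining whenever the score suggests the generated output cannot be trusted. Below is a minimal sketch of that decision rule only, not the paper's implementation: the callables `generate` and `ood_score` are hypothetical placeholders for a conditional language model and a scalar OOD scorer, and the abstention budget is set here by a simple quantile of the scores.

```python
import numpy as np

def selective_generate(inputs, generate, ood_score, abstain_fraction=0.2):
    """Generate for every input, but abstain on the fraction of inputs whose
    OOD score is highest (i.e., the inputs that look least in-distribution).

    `generate` and `ood_score` are hypothetical callables standing in for a
    conditional language model and a scalar OOD detector, respectively.
    """
    scores = np.array([ood_score(x) for x in inputs])      # higher = more OOD
    cutoff = np.quantile(scores, 1.0 - abstain_fraction)   # keep the lowest-scoring inputs
    outputs = []
    for x, s in zip(inputs, scores):
        if s >= cutoff:
            outputs.append(None)           # abstain: output not trusted
        else:
            outputs.append(generate(x))    # in-distribution enough to generate
    return outputs, scores
```

Abstaining on the highest-scoring (most OOD-looking) inputs trades coverage for output quality, which is the analogy to selective prediction for classification drawn in the abstract.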
Related papers
- Learnable Linguistic Watermarks for Tracing Model Extraction Attacks on Large Language Models [20.44680783275184]
Current watermarking techniques against model extraction attacks rely on signal insertion in model logits or post-processing of generated text.
We propose a novel method for embedding learnable linguistic watermarks in Large Language Models (LLMs).
Our approach subtly modifies the LLM's output distribution by introducing controlled noise into token frequency distributions, embedding a statistically identifiable watermark.
arXiv Detail & Related papers (2024-04-28T14:45:53Z) - Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks.
We instruct an LLM to self-evaluate its answers.
We benchmark a range of scoring methods based on self-evaluation.
arXiv Detail & Related papers (2023-12-14T19:09:22Z) - Calibrating Sequence likelihood Improves Conditional Language Generation [39.35161650538767]
Conditional language models are predominantly trained with maximum likelihood estimation (MLE).
While MLE trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality.
We introduce sequence likelihood calibration (SLiC), where the likelihood of model-generated sequences is calibrated to better align with reference sequences in the model's latent space.
arXiv Detail & Related papers (2022-09-30T19:16:16Z) - Evaluating Distributional Distortion in Neural Language Modeling [81.83408583979745]
A heavy tail of rare events accounts for a significant amount of the total probability mass of distributions in language.
Standard language modeling metrics such as perplexity quantify the performance of language models (LM) in aggregate.
We develop a controlled evaluation scheme which uses generative models trained on natural data as artificial languages.
arXiv Detail & Related papers (2022-03-24T01:09:46Z) - Energy-bounded Learning for Robust Models of Code [16.592638312365164]
In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on.
We propose the use of an energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models.
arXiv Detail & Related papers (2021-12-20T06:28:56Z) - Distributionally Robust Recurrent Decoders with Random Network
Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z) - Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z) - Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z)