Out-of-Distribution Detection and Selective Generation for Conditional
Language Models
- URL: http://arxiv.org/abs/2209.15558v1
- Date: Fri, 30 Sep 2022 16:17:11 GMT
- Title: Out-of-Distribution Detection and Selective Generation for Conditional
Language Models
- Authors: Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji
Lakshminarayanan, Peter J. Liu
- Abstract summary: Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence.
We present a highly accurate and lightweight OOD detection method for CLMs.
We show how our method can be used under the common and realistic setting of distribution shift for selective generation of high-quality outputs.
- Score: 40.15896981028647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning algorithms typically assume independent and identically
distributed samples in training and at test time. Much work has shown that
high-performing ML classifiers can degrade significantly and provide
overly confident, wrong classification predictions, particularly for
out-of-distribution (OOD) inputs. Conditional language models (CLMs) are
predominantly trained to classify the next token in an output sequence, and may
suffer even worse degradation on OOD inputs as the prediction is done
auto-regressively over many steps. Furthermore, the space of potential
low-quality outputs is larger, as arbitrary text can be generated, and it is
important to know when to trust the generated output. We present a highly
accurate and lightweight OOD detection method for CLMs, and demonstrate its
effectiveness on abstractive summarization and translation. We also show how
our method can be used under the common and realistic setting of distribution
shift for selective generation (analogous to selective prediction for
classification) of high-quality outputs, while automatically abstaining from
low-quality ones, enabling safer deployment of generative language models.
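The abstract frames selective generation as scoring each input with the OOD detector and abstaining whenever the score suggests the generated output cannot be trusted. Below is a minimal sketch of that decision rule only, not the paper's implementation: the callables `generate` and `ood_score` are hypothetical placeholders for a conditional language model and a scalar OOD scorer, and the abstention budget is set here by a simple quantile of the scores.

```python
import numpy as np

def selective_generate(inputs, generate, ood_score, abstain_fraction=0.2):
    """Generate for every input, but abstain on the fraction of inputs whose
    OOD score is highest (i.e., the inputs that look least in-distribution).

    `generate` and `ood_score` are hypothetical callables standing in for a
    conditional language model and a scalar OOD detector, respectively.
    """
    scores = np.array([ood_score(x) for x in inputs])      # higher = more OOD
    cutoff = np.quantile(scores, 1.0 - abstain_fraction)   # keep the lowest-scoring inputs
    outputs = []
    for x, s in zip(inputs, scores):
        if s >= cutoff:
            outputs.append(None)           # abstain: output not trusted
        else:
            outputs.append(generate(x))    # in-distribution enough to generate
    return outputs, scores
```

Abstaining on the highest-scoring (most OOD-looking) inputs trades coverage for output quality, which is the analogy to selective prediction for classification drawn in the abstract.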
Related papers
- Learnable Linguistic Watermarks for Tracing Model Extraction Attacks on Large Language Models [20.44680783275184]
Current watermarking techniques against model extraction attacks rely on signal insertion in model logits or post-processing of generated text.
We propose a novel method for embedding learnable linguistic watermarks in Large Language Models (LLMs).
Our approach subtly modifies the LLM's output distribution by introducing controlled noise into token frequency distributions, embedding a statistically identifiable watermark.
arXiv Detail & Related papers (2024-04-28T14:45:53Z) - Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks.
We instruct an LLM to self-evaluate its answers.
We benchmark a range of scoring methods based on self-evaluation.
arXiv Detail & Related papers (2023-12-14T19:09:22Z) - Calibrating Sequence likelihood Improves Conditional Language Generation [39.35161650538767]
Conditional language models are predominantly trained with maximum likelihood estimation (MLE).
While MLE trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality.
We introduce sequence likelihood calibration (SLiC), where the likelihood of model-generated sequences is calibrated to better align with reference sequences in the model's latent space.
arXiv Detail & Related papers (2022-09-30T19:16:16Z) - Evaluating Distributional Distortion in Neural Language Modeling [81.83408583979745]
A heavy tail of rare events accounts for a significant amount of the total probability mass of distributions in language.
Standard language modeling metrics such as perplexity quantify the performance of language models (LM) in aggregate.
We develop a controlled evaluation scheme which uses generative models trained on natural data as artificial languages.
arXiv Detail & Related papers (2022-03-24T01:09:46Z) - Energy-bounded Learning for Robust Models of Code [16.592638312365164]
In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on.
We propose the use of an energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models.
arXiv Detail & Related papers (2021-12-20T06:28:56Z) - Distributionally Robust Recurrent Decoders with Random Network
Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z) - Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z) - Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z)