Holistic Sentence Embeddings for Better Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2210.07485v1
- Date: Fri, 14 Oct 2022 03:22:58 GMT
- Title: Holistic Sentence Embeddings for Better Out-of-Distribution Detection
- Authors: Sishuo Chen, Xiaohan Bi, Rundong Gao, Xu Sun
- Abstract summary: We propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding.
Our analysis demonstrates that it indeed helps preserve general linguistic knowledge in fine-tuned PLMs and substantially benefits detecting background shifts.
- Score: 12.640837452980332
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting out-of-distribution (OOD) instances is significant for the safe
deployment of NLP models. Among recent textual OOD detection works based on
pretrained language models (PLMs), distance-based methods have shown superior
performance. However, they estimate sample distance scores in the last-layer
CLS embedding space and thus do not make full use of the linguistic information
encoded in PLMs. To address this issue, we propose to boost OOD detection by
deriving more holistic sentence embeddings. On the basis of the observations
that token averaging and layer combination contribute to improving OOD
detection, we propose a simple embedding approach named Avg-Avg, which averages
all token representations from each intermediate layer as the sentence
embedding and significantly surpasses the state-of-the-art on a comprehensive
suite of benchmarks by a 9.33% FAR95 margin. Furthermore, our analysis
demonstrates that it indeed helps preserve general linguistic knowledge in
fine-tuned PLMs and substantially benefits detecting background shifts. The
simple yet effective embedding method can be applied to fine-tuned PLMs with
negligible extra costs, providing a free gain in OOD detection. Our code is
available at https://github.com/lancopku/Avg-Avg.
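To make the method concrete, here is a minimal sketch of the Avg-Avg embedding using the Hugging Face transformers API. The roberta-base checkpoint, the attention-mask handling, and the choice to skip the embedding layer are illustrative assumptions, not necessarily the authors' exact pipeline:

```python
# Minimal sketch of Avg-Avg: average all token representations within
# each intermediate layer, then average the per-layer vectors.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")
model.eval()

def avg_avg_embedding(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # hidden_states: tuple of (1, seq_len, hidden) tensors, one for the
    # embedding layer plus one per transformer layer.
    mask = inputs["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
    layer_means = []
    for layer in outputs.hidden_states[1:]:  # skip the embedding layer (assumed)
        summed = (layer * mask).sum(dim=1)          # mask out padding tokens
        layer_means.append(summed / mask.sum(dim=1))  # average over tokens
    return torch.stack(layer_means).mean(dim=0).squeeze(0)  # average over layers

embedding = avg_avg_embedding("OOD detection needs holistic embeddings.")
print(embedding.shape)  # torch.Size([768])
```

In a distance-based detector, this vector would replace the last-layer CLS embedding when fitting a score (e.g., a Mahalanobis distance) on in-distribution features.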
Related papers
- WeiPer: OOD Detection using Weight Perturbations of Class Projections [11.130659240045544]
We introduce perturbations of the class projections in the final fully connected layer, which create a richer representation of the input.
We achieve state-of-the-art OOD detection results across multiple benchmarks of the OpenOOD framework.
arXiv Detail & Related papers (2024-05-27T13:38:28Z)
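The summary above is terse, so the following is only one plausible reading of it, not WeiPer's actual design: project features through several randomly perturbed copies of the final layer's class projections and concatenate the results into a richer representation. The scoring rule, noise scale, and view count are all assumptions:

```python
# Hypothetical sketch: random perturbations of the final fully connected
# layer's class projections yield several logit views of one feature.
import torch

def perturbed_representation(features: torch.Tensor,
                             fc_weight: torch.Tensor,
                             num_views: int = 8,
                             sigma: float = 0.1) -> torch.Tensor:
    """features: (batch, d); fc_weight: (num_classes, d)."""
    views = []
    for _ in range(num_views):
        noisy_w = fc_weight + sigma * torch.randn_like(fc_weight)
        views.append(features @ noisy_w.T)  # (batch, num_classes)
    return torch.cat(views, dim=-1)  # (batch, num_views * num_classes)

# Example scoring rule (assumed): negative max logit over the views.
feats = torch.randn(4, 512)
w = torch.randn(10, 512)
rich = perturbed_representation(feats, w)
ood_score = -rich.max(dim=-1).values  # higher = more likely OOD
```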
- Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector [17.305076703258813]
We revisit the likelihood ratio between a pretrained large language model (LLM) and its finetuned variant as a criterion for out-of-distribution (OOD) detection.
We show, for the first time, that the likelihood ratio can serve as an effective OOD detector.
arXiv Detail & Related papers (2024-04-07T10:32:49Z)
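A minimal sketch of such a likelihood-ratio score, assuming causal LMs and using gpt2 as a stand-in for both the base and finetuned checkpoints; the paper's exact length normalization may differ:

```python
# Likelihood-ratio OOD score between a pretrained LM and its finetuned
# variant: in-distribution text should be better explained by the
# finetuned model, so a low ratio suggests OOD.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2").eval()
finetuned = AutoModelForCausalLM.from_pretrained("gpt2").eval()  # stand-in for a finetuned checkpoint

@torch.no_grad()
def log_likelihood(model, text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    # labels=input_ids gives the mean token-level negative log-likelihood
    loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # total sequence log-likelihood

def likelihood_ratio(text: str) -> float:
    return log_likelihood(finetuned, text) - log_likelihood(base, text)

print(likelihood_ratio("A sentence from the finetuning domain."))
```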
- Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations [111.88727295707454]
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP.
We propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts.
We conduct experiments on pre-trained language models for analysis and evaluation of OOD robustness.
arXiv Detail & Related papers (2023-06-07T17:47:03Z)
- Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning [17.409939628100517]
We propose a unified framework termed OOD Semantic Pruning (OSP), which aims at pruning OOD semantics out from in-distribution (ID) features.
OSP surpasses the previous state-of-the-art by 13.7% in accuracy for ID classification and 5.9% in AUROC for OOD detection on the TinyImageNet dataset.
arXiv Detail & Related papers (2023-05-29T15:37:07Z)
- Unsupervised Layer-wise Score Aggregation for Textual OOD Detection [35.47177259803885]
We observe that OOD detection performance varies greatly depending on the task and layer output.
We propose a data-driven, unsupervised method to combine layer-wise anomaly scores.
We extend classical OOD benchmarks by including classification tasks with a greater number of classes.
arXiv Detail & Related papers (2023-02-20T09:26:11Z)
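A hedged sketch of one simple instance of this idea: standardize a per-layer Mahalanobis score on held-out in-distribution features, then average across layers. The paper's data-driven aggregation is likely more sophisticated; shapes and the 1e-8 stabilizer are illustrative:

```python
# Combine layer-wise anomaly scores without OOD labels by putting each
# layer's Mahalanobis score on a common (ID-standardized) scale.
import numpy as np

def mahalanobis_scores(feats, mean, inv_cov):
    diff = feats - mean  # feats: (n, d)
    return np.einsum("nd,de,ne->n", diff, inv_cov, diff)

def aggregate(test_feats_per_layer, id_feats_per_layer):
    """Both arguments: lists over layers of (n, d) arrays."""
    combined = 0.0
    for test_f, id_f in zip(test_feats_per_layer, id_feats_per_layer):
        mean = id_f.mean(axis=0)
        inv_cov = np.linalg.pinv(np.cov(id_f, rowvar=False))
        id_scores = mahalanobis_scores(id_f, mean, inv_cov)
        test_scores = mahalanobis_scores(test_f, mean, inv_cov)
        # standardize each layer's score scale on ID data before mixing
        combined = combined + (test_scores - id_scores.mean()) / (id_scores.std() + 1e-8)
    return combined / len(test_feats_per_layer)  # higher = more anomalous
```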
- Out-of-distribution Detection with Deep Nearest Neighbors [33.71627349163909]
Out-of-distribution (OOD) detection is a critical task for deploying machine learning models in the open world.
In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection.
We demonstrate the effectiveness of nearest-neighbor-based OOD detection on several benchmarks and establish superior performance.
arXiv Detail & Related papers (2022-04-13T16:45:21Z)
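The core of this non-parametric approach reduces to a few lines: score each test point by its distance to the k-th nearest in-distribution neighbor. The value of k and the L2 normalization are typical, assumed choices rather than the paper's exact settings:

```python
# k-NN distance as an OOD score: far from all ID features = likely OOD.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_ood_scores(id_feats, test_feats, k=10):
    # L2-normalize features, as is common for kNN-based detectors
    id_feats = id_feats / np.linalg.norm(id_feats, axis=1, keepdims=True)
    test_feats = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    nn = NearestNeighbors(n_neighbors=k).fit(id_feats)
    dists, _ = nn.kneighbors(test_feats)
    return dists[:, -1]  # distance to the k-th neighbor; higher = more OOD
```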
- No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets [69.725266027309]
Out-of-distribution detection is an important component of reliable ML systems.
In this work, we show that none of the compared methods is inherently better at OOD detection than the others on a standardized set of 16 (ID, OOD) pairs.
We also show that a method outperforming another on a certain (ID, OOD) pair may not do so in a low-data regime.
arXiv Detail & Related papers (2021-09-12T16:35:00Z)
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)
- Triggering Failures: Out-Of-Distribution detection by learning from local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show that it obtains top performance in both speed and accuracy when compared to ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z)
- Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection [72.35532598131176]
We propose an unsupervised method to detect OOD samples using a k-NN density estimate.
We leverage a recent insight about label smoothing, which we call the Label Smoothed Embedding Hypothesis.
We show that our proposal outperforms many OOD baselines and also provide new finite-sample high-probability statistical results.
arXiv Detail & Related papers (2021-02-09T21:04:44Z)
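The training-time ingredient behind this hypothesis is plain label smoothing; a minimal sketch using PyTorch's built-in support, with an illustrative epsilon value:

```python
# Label-smoothed cross-entropy: soften one-hot targets so that the
# learned features form tighter class clusters, which a k-NN density
# estimate can then exploit for OOD detection.
import torch
import torch.nn.functional as F

def label_smoothed_ce(logits: torch.Tensor, targets: torch.Tensor,
                      epsilon: float = 0.1) -> torch.Tensor:
    # PyTorch >= 1.10 supports this natively via label_smoothing=
    return F.cross_entropy(logits, targets, label_smoothing=epsilon)

loss = label_smoothed_ce(torch.randn(8, 5), torch.randint(0, 5, (8,)))
```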
- ATOM: Robustifying Out-of-distribution Detection Using Outlier Mining [51.19164318924997]
Adversarial Training with informative Outlier Mining (ATOM) improves the robustness of OOD detection.
ATOM achieves state-of-the-art performance under a broad family of classic and adversarial OOD evaluation tasks.
arXiv Detail & Related papers (2020-06-26T20:58:05Z)
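A hypothetical sketch in the spirit of informative outlier mining: rank a large pool of auxiliary outliers by the current model's OOD score and keep a slice that is neither trivially easy nor pure noise for the next training round. The quantile cutoff and selection size are assumptions, not ATOM's exact settings:

```python
# Select "informative" auxiliary outliers: skip the easiest half of the
# pool (already scored as clearly OOD-unlike... i.e., looks ID is at the
# low end) and take the next n_select hardest examples.
import numpy as np

def mine_informative_outliers(outlier_scores: np.ndarray,
                              pool_indices: np.ndarray,
                              q_low: float = 0.5,
                              n_select: int = 1024) -> np.ndarray:
    order = np.argsort(outlier_scores)  # ascending: low score = looks ID
    start = int(q_low * len(order))
    return pool_indices[order[start:start + n_select]]
```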