PnPOOD: Out-Of-Distribution Detection for Text Classification via Plug and Play Data Augmentation
- URL: http://arxiv.org/abs/2111.00506v1
- Date: Sun, 31 Oct 2021 14:02:26 GMT
- Title: PnPOOD: Out-Of-Distribution Detection for Text Classification via Plug and Play Data Augmentation
- Authors: Mrinal Rawat, Ramya Hebbalaguppe, Lovekesh Vig
- Abstract summary: We present PnPOOD, a data augmentation technique to perform OOD detection via out-of-domain sample generation.
Our method generates high quality discriminative samples close to the class boundaries, resulting in accurate OOD detection at test time.
We highlight an important data leakage issue with datasets used in prior attempts at OOD detection and share results on a new dataset for OOD detection that does not suffer from the same problem.
- Score: 25.276900899887192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Out-of-distribution (OOD) detection has been well explored in computer
vision, there have been relatively few prior attempts in OOD detection for NLP
classification. In this paper we argue that these prior attempts do not fully
address the OOD problem and may suffer from data leakage and poor calibration
of the resulting models. We present PnPOOD, a data augmentation technique to
perform OOD detection via out-of-domain sample generation using the recently
proposed Plug and Play Language Model (Dathathri et al., 2020). Our method
generates high quality discriminative samples close to the class boundaries,
resulting in accurate OOD detection at test time. We demonstrate that our model
outperforms prior models on OOD sample detection, and exhibits lower
calibration error on the 20 Newsgroups and Stanford Sentiment Treebank
datasets (Lang, 1995; Socher et al., 2013). We further highlight an important
data leakage issue with datasets used in prior attempts at OOD detection, and
share results on a new dataset for OOD detection that does not suffer from the
same problem.
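The abstract describes the overall recipe rather than an implementation; below is a minimal sketch of that recipe, assuming a PPLM-style controlled generator (Dathathri et al., 2020). The stub `generate_near_boundary_text` and helper `augment_with_ood` are hypothetical names for illustration, not the authors' code.

```python
import numpy as np

# Hypothetical stand-in for PPLM-style controlled generation: in the real
# method, an attribute model steers a language model to produce text mixing
# attributes of two in-domain classes, yielding near-boundary samples.
def generate_near_boundary_text(class_a, class_b, n):
    return [f"synthetic text blending '{class_a}' and '{class_b}' #{i}"
            for i in range(n)]

def augment_with_ood(train_texts, train_labels, classes, per_pair=50):
    """Add generated near-boundary samples under a new OOD label."""
    ood_label = len(classes)          # extra "reject"/OOD class index
    texts, labels = list(train_texts), list(train_labels)
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            texts += generate_near_boundary_text(a, b, per_pair)
            labels += [ood_label] * per_pair
    return texts, np.array(labels), ood_label

# At test time, predicting the extra class (or assigning it high
# probability) flags the input as out-of-distribution.
```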
Related papers
- EAT: Towards Long-Tailed Out-of-Distribution Detection [55.380390767978554]
This paper addresses the challenging task of long-tailed OOD detection.
The main difficulty lies in distinguishing OOD data from samples belonging to the tail classes.
We propose two simple ideas: (1) Expanding the in-distribution class space by introducing multiple abstention classes, and (2) Augmenting the context-limited tail classes by overlaying images onto the context-rich OOD data.
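The second idea is a CutMix-style overlay; a minimal sketch, assuming float image arrays of shape (H, W, C), with the function name `overlay_on_ood` and all scaling and placement details chosen for illustration:

```python
import numpy as np

def overlay_on_ood(tail_img, ood_img, scale=0.5, rng=None):
    """Paste a (resized) tail-class image onto a context-rich OOD image.

    The pasted crop keeps the tail-class label, giving the tail class
    extra, varied context. Resizing/blending details are illustrative.
    """
    rng = rng or np.random.default_rng()
    h, w, _ = ood_img.shape
    ph, pw = int(h * scale), int(w * scale)
    # naive nearest-neighbour resize of the foreground patch
    ys = np.linspace(0, tail_img.shape[0] - 1, ph).astype(int)
    xs = np.linspace(0, tail_img.shape[1] - 1, pw).astype(int)
    patch = tail_img[ys][:, xs]
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    out = ood_img.copy()
    out[top:top + ph, left:left + pw] = patch
    return out  # label: the tail class of `tail_img`
```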
arXiv Detail & Related papers (2023-12-14T13:47:13Z) - Model-free Test Time Adaptation for Out-Of-Distribution Detection [62.49795078366206]
We propose a Non-Parametric Test Time Adaptation framework for Out-Of-Distribution Detection.
The framework utilizes online test samples for model adaptation during testing, enhancing adaptability to changing data distributions.
We demonstrate its effectiveness through comprehensive experiments on multiple OOD detection benchmarks.
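The summary does not spell out the adaptation rule; one plausible non-parametric reading is a k-nearest-neighbour score over a feature memory bank that grows with confidently in-distribution test samples. The class `KnnOodScorer` and its thresholding heuristic are assumptions for illustration:

```python
import numpy as np

class KnnOodScorer:
    """Non-parametric OOD scoring with online test-time adaptation.

    Sketch: score = mean cosine distance to the k nearest stored
    features; confidently-ID test samples are appended to the memory
    bank so the detector adapts to changing data distributions.
    """
    def __init__(self, id_features, k=10, add_threshold=0.2):
        self.bank = id_features / np.linalg.norm(id_features, axis=1,
                                                 keepdims=True)
        self.k, self.add_threshold = k, add_threshold

    def score(self, feat):
        f = feat / np.linalg.norm(feat)
        dists = 1.0 - self.bank @ f               # cosine distances
        ood_score = np.sort(dists)[: self.k].mean()
        if ood_score < self.add_threshold:        # looks ID: adapt online
            self.bank = np.vstack([self.bank, f])
        return ood_score                          # higher = more OOD
```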
arXiv Detail & Related papers (2023-11-28T02:00:47Z) - Can Pre-trained Networks Detect Familiar Out-of-Distribution Data? [37.36999826208225]
We study the effect of OOD data encountered during pre-training (PT-OOD) on the OOD detection performance of pre-trained networks.
We find that the low linear separability of PT-OOD in the feature space heavily degrades the PT-OOD detection performance.
We propose a solution unique to large-scale pre-trained models: leveraging the powerful instance-by-instance discriminative representations that such models provide.
arXiv Detail & Related papers (2023-10-02T02:01:00Z) - In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation [43.865923770543205]
Out-of-distribution (OOD) detection is the problem of identifying inputs unrelated to the in-distribution task.
Most of the currently used test OOD datasets, including datasets from the open set recognition (OSR) literature, have severe issues.
We introduce NINCO, a novel test OOD dataset in which each sample has been checked to be free of ID objects, allowing for a detailed analysis of an OOD detector's strengths and failure modes.
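Such dataset fixes matter because standard OOD metrics are computed directly against the test OOD set; for reference, a minimal implementation of the common FPR@95 metric (the convention that higher scores mean more ID is an assumption here):

```python
import numpy as np

def fpr_at_95_tpr(scores_id, scores_ood):
    """False-positive rate on OOD samples at the threshold where 95% of
    ID samples are accepted (assuming higher score = more ID). Leaked ID
    objects in the OOD set inflate this metric spuriously."""
    thresh = np.percentile(scores_id, 5)      # accept 95% of ID samples
    return float(np.mean(scores_ood >= thresh))
```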
arXiv Detail & Related papers (2023-06-01T15:48:10Z) - Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for
Out-of-Domain Detection [28.810524375810736]
Out-of-distribution (OOD) detection is a critical task for reliable predictions over text.
Fine-tuning with pre-trained language models has been a de facto procedure to derive OOD detectors.
We show that using distance-based detection methods, pre-trained language models are near-perfect OOD detectors when the distribution shift involves a domain change.
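A representative distance-based detector of this kind is the Mahalanobis score over frozen pre-trained embeddings; a minimal sketch, assuming ID sentence embeddings (e.g., pooled hidden states from an off-the-shelf LM) have already been extracted:

```python
import numpy as np

def fit_mahalanobis(features, labels):
    """Fit class means and a shared covariance on ID embeddings
    from a frozen (not fine-tuned) pre-trained language model."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    prec = np.linalg.pinv(cov)                 # shared precision matrix
    return means, prec

def mahalanobis_ood_score(feat, means, prec):
    """Minimum squared distance to any class mean; higher = more OOD."""
    ds = [float((feat - m) @ prec @ (feat - m)) for m in means.values()]
    return min(ds)
```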
arXiv Detail & Related papers (2023-05-22T17:42:44Z) - Out-of-distribution Detection with Implicit Outlier Transformation [72.73711947366377]
Outlier exposure (OE) is powerful in out-of-distribution (OOD) detection.
We propose a novel OE-based approach that makes the model perform well for unseen OOD situations.
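The summary does not detail the implicit outlier transformation itself; as background, the baseline outlier-exposure objective that such methods build on (Hendrycks et al., 2019) pairs ID cross-entropy with a uniform-distribution term on auxiliary outliers:

```python
import torch
import torch.nn.functional as F

def outlier_exposure_loss(logits_id, labels_id, logits_ood, lam=0.5):
    """Baseline outlier-exposure objective: cross-entropy on ID data
    plus a term pushing auxiliary-outlier predictions toward the
    uniform distribution over classes."""
    ce = F.cross_entropy(logits_id, labels_id)
    log_probs = F.log_softmax(logits_ood, dim=1)
    # cross-entropy to the uniform distribution, up to a constant:
    uniform_term = -log_probs.mean()
    return ce + lam * uniform_term
```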
arXiv Detail & Related papers (2023-03-09T04:36:38Z) - Unsupervised Evaluation of Out-of-distribution Detection: A Data-centric
Perspective [55.45202687256175]
Out-of-distribution (OOD) detection methods assume that they have test ground truths, i.e., whether individual test samples are in-distribution (IND) or OOD.
In this paper, we are the first to introduce the unsupervised evaluation problem in OOD detection.
We propose three methods to compute Gscore as an unsupervised indicator of OOD detection performance.
arXiv Detail & Related papers (2023-02-16T13:34:35Z) - Augmenting Softmax Information for Selective Classification with
Out-of-Distribution Data [7.221206118679026]
We show that, under selective classification in the presence of OOD data (SCOD), existing post-hoc methods perform quite differently than when evaluated on OOD detection alone.
We propose a novel method for SCOD, Softmax Information Retaining Combination (SIRC), that augments softmax-based confidence scores with feature-agnostic information.
Experiments on a wide variety of ImageNet-scale datasets and convolutional neural network architectures show that SIRC is able to consistently match or outperform the baseline for SCOD.
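As best reconstructed from the paper, SIRC scales the softmax confidence deficit by a logistic function of a secondary, feature-agnostic score (e.g., the L1 norm of the pre-logit features); treat the exact form and the parameters `a` and `b` below as an approximation rather than the authors' definitive formula:

```python
import numpy as np

def sirc_like_score(probs, feat, a=0.0, b=1.0):
    """SIRC-style combination (approximate reconstruction): start from
    max softmax probability s1 and scale its deficit by a logistic
    function of a feature-norm score s2, so low s2 can only lower
    confidence. Higher return value = more confident / more likely ID."""
    s1 = probs.max()                 # softmax-based confidence
    s2 = np.abs(feat).sum()          # feature-agnostic side information
    return -(1.0 - s1) * (1.0 + np.exp(-b * (s2 - a)))
```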
arXiv Detail & Related papers (2022-07-15T14:39:57Z) - Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper we propose a novel method where from first principles we combine a certifiable OOD detector with a standard classifier into an OOD aware classifier.
In this way we achieve the best of both worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy and with close to state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z) - Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
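The training-time regularization is the paper's contribution and is not reproduced here; at detection time, though, any such ensemble can be scored by how much its members contradict each other on a test batch. A minimal sketch of one plausible disagreement score:

```python
import numpy as np

def disagreement_score(member_probs):
    """Flag samples on which ensemble members contradict each other.

    member_probs: array (n_members, n_samples, n_classes) of per-member
    softmax outputs on a test batch. Disagreement here is the fraction
    of member pairs whose argmax predictions differ.
    """
    preds = member_probs.argmax(axis=2)        # (n_members, n_samples)
    m = preds.shape[0]
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    dis = np.mean([preds[i] != preds[j] for i, j in pairs], axis=0)
    return dis                                 # higher = more likely OOD
```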
arXiv Detail & Related papers (2020-12-10T16:55:13Z)