Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for
Out-of-Domain Detection
- URL: http://arxiv.org/abs/2305.13282v1
- Date: Mon, 22 May 2023 17:42:44 GMT
- Title: Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for
Out-of-Domain Detection
- Authors: Rheeya Uppaal, Junjie Hu, Yixuan Li
- Abstract summary: Out-of-distribution (OOD) detection is a critical task for reliable predictions over text.
Fine-tuning with pre-trained language models has been a de facto procedure to derive OOD detectors.
We show that using distance-based detection methods, pre-trained language models are near-perfect OOD detectors when the distribution shift involves a domain change.
- Score: 28.810524375810736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Out-of-distribution (OOD) detection is a critical task for reliable
predictions over text. Fine-tuning with pre-trained language models has been a
de facto procedure to derive OOD detectors with respect to in-distribution (ID)
data. Despite its common use, the understanding of the role of fine-tuning and
its necessity for OOD detection is largely unexplored. In this paper, we raise
the question: is fine-tuning necessary for OOD detection? We present a study
investigating the efficacy of directly leveraging pre-trained language models
for OOD detection, without any model fine-tuning on the ID data. We compare the
approach with several competitive fine-tuning objectives, and offer new
insights under various types of distributional shifts. Extensive evaluations on
8 diverse ID-OOD dataset pairs demonstrate near-perfect OOD detection
performance (0% FPR95 in many cases), with the pre-trained models strongly outperforming their
fine-tuned counterparts. We show that using distance-based detection methods,
pre-trained language models are near-perfect OOD detectors when the
distribution shift involves a domain change. Furthermore, we study the effect
of fine-tuning on OOD detection and identify how to balance ID accuracy with
OOD detection performance. Our code is publicly available at
https://github.com/Uppaal/lm-ood.
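To make the distance-based recipe above concrete, the sketch below scores texts by their Mahalanobis distance to in-distribution features from a frozen pre-trained language model and computes the FPR95 metric quoted in the abstract. It is a minimal illustration, not the authors' released code (see the repository above): the roberta-base checkpoint, mean pooling, the shared-covariance Mahalanobis score, and all helper names are assumptions made for this example.

```python
# Minimal sketch: distance-based OOD scoring with a frozen pre-trained LM.
# Assumptions (not from the paper's code): checkpoint name, mean pooling of
# the last hidden state, and a single shared-covariance Mahalanobis score.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"  # illustrative pre-trained LM, no fine-tuning
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def embed(texts, batch_size=16):
    """Mean-pooled last-hidden-state features from the frozen LM."""
    feats = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True,
                        truncation=True, return_tensors="pt")
        hidden = model(**enc).last_hidden_state          # (B, T, H)
        mask = enc["attention_mask"].unsqueeze(-1)       # (B, T, 1)
        feats.append((hidden * mask).sum(1) / mask.sum(1))  # mean over tokens
    return torch.cat(feats).numpy()

def fit_mahalanobis(id_feats):
    """Estimate the ID mean and shared covariance; return a scoring function."""
    mu = id_feats.mean(axis=0)
    cov = np.cov(id_feats, rowvar=False) + 1e-6 * np.eye(id_feats.shape[1])
    prec = np.linalg.inv(cov)
    def score(x):
        d = x - mu
        return np.einsum("ij,jk,ik->i", d, prec, d)      # larger => more OOD
    return score

def fpr_at_95_tpr(id_scores, ood_scores):
    """FPR95: fraction of OOD accepted when 95% of ID samples are accepted."""
    thresh = np.percentile(id_scores, 95)                # ID scores are low
    return float(np.mean(ood_scores <= thresh))

# Example usage with hypothetical text lists:
# score = fit_mahalanobis(embed(id_train_texts))
# print(fpr_at_95_tpr(score(embed(id_test_texts)), score(embed(ood_texts))))
```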
Related papers
- Semantic or Covariate? A Study on the Intractable Case of Out-of-Distribution Detection [70.57120710151105]
We provide a more precise definition of the Semantic Space for the ID distribution.
We also define the "Tractable OOD" setting which ensures the distinguishability of OOD and ID distributions.
arXiv Detail & Related papers (2024-11-18T03:09:39Z)
- Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection [24.557227100200215]
Out-of-distribution (OOD) detection is crucial for deploying reliable machine learning models in open-world applications.
Recent advances in CLIP-based OOD detection have shown promising results via regularizing prompt tuning with OOD features extracted from ID data.
We propose a novel framework, namely, Self-Calibrated Tuning (SCT), to mitigate this problem for effective OOD detection with only the given few-shot ID data.
arXiv Detail & Related papers (2024-11-05T02:29:16Z)
- Model-free Test Time Adaptation for Out-Of-Distribution Detection [62.49795078366206]
We propose a Non-Parametric Test Time Adaptation framework for Out-of-Distribution Detection.
The framework utilizes online test samples for model adaptation during testing, enhancing adaptability to changing data distributions.
We demonstrate its effectiveness through comprehensive experiments on multiple OOD detection benchmarks (see the sketch after this list).
arXiv Detail & Related papers (2023-11-28T02:00:47Z)
- Can Pre-trained Networks Detect Familiar Out-of-Distribution Data? [37.36999826208225]
We study the effect of PT-OOD on the OOD detection performance of pre-trained networks.
We find that the low linear separability of PT-OOD in the feature space heavily degrades the PT-OOD detection performance.
We propose a unique solution to large-scale pre-trained models: Leveraging powerful instance-by-instance discriminative representations of pre-trained models.
arXiv Detail & Related papers (2023-10-02T02:01:00Z)
- Using Semantic Information for Defining and Detecting OOD Inputs [3.9577682622066264]
Out-of-distribution (OOD) detection has received some attention recently.
We demonstrate that the current detectors inherit the biases in the training dataset.
This can render the current OOD detectors impermeable to inputs lying outside the training distribution but with the same semantic information.
We perform OOD detection on semantic information extracted from the training data of MNIST and COCO datasets.
arXiv Detail & Related papers (2023-02-21T21:31:20Z)
- Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need [52.88953913542445]
We find surprisingly that simply using reconstruction-based methods could boost the performance of OOD detection significantly.
We take Masked Image Modeling as a pretext task for our OOD detection framework (MOOD).
arXiv Detail & Related papers (2023-02-06T08:24:41Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
- How Useful are Gradients for OOD Detection Really? [5.459639971144757]
Out-of-distribution (OOD) detection is a critical challenge in deploying highly performant machine learning models in real-life applications.
We provide an in-depth analysis and comparison of gradient based methods for OOD detection.
We propose a general, non-gradient based method of OOD detection which improves over previous baselines in both performance and computational efficiency.
arXiv Detail & Related papers (2022-05-20T21:10:05Z)
- PnPOOD: Out-Of-Distribution Detection for Text Classification via Plug and Play Data Augmentation [25.276900899887192]
We present PnPOOD, a data augmentation technique to perform OOD detection via out-of-domain sample generation.
Our method generates high quality discriminative samples close to the class boundaries, resulting in accurate OOD detection at test time.
We highlight an important data leakage issue with datasets used in prior attempts at OOD detection and share results on a new dataset for OOD detection that does not suffer from the same problem.
arXiv Detail & Related papers (2021-10-31T14:02:26Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluating on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
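As a companion to the "Model-free Test Time Adaptation" entry above, here is a hedged sketch of a generic non-parametric (k-nearest-neighbour) OOD score over a memory bank of ID features, with the simplest possible test-time adaptation step of absorbing confidently in-distribution test samples. It is one common reading of that idea, not the cited paper's algorithm; the class name, k, and the adaptation threshold are illustrative.

```python
# Generic non-parametric OOD scoring with a naive test-time adaptation step.
# This is an illustrative sketch, not the cited paper's method.
import numpy as np

class KnnOodDetector:
    """OOD scoring via distance to the k-th nearest ID feature in a bank."""

    def __init__(self, id_feats, k=10, adapt_threshold=None):
        self.bank = self._normalize(id_feats)
        self.k = k
        self.adapt_threshold = adapt_threshold  # None disables adaptation

    @staticmethod
    def _normalize(x):
        # L2-normalise so inner products behave like cosine similarities.
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    def score(self, feats):
        """Cosine distance to the k-th nearest bank entry (larger => more OOD)."""
        feats = self._normalize(feats)
        dists = 1.0 - feats @ self.bank.T
        return np.sort(dists, axis=1)[:, self.k - 1]

    def score_and_adapt(self, feats):
        """Score a test batch, then absorb confidently-ID samples into the bank."""
        scores = self.score(feats)
        if self.adapt_threshold is not None:
            keep = self._normalize(feats)[scores < self.adapt_threshold]
            if len(keep):
                self.bank = np.vstack([self.bank, keep])
        return scores
```

Feature extraction could reuse the embed helper from the earlier sketch; actual test-time adaptation methods are considerably more careful about when and what to absorb into the bank.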
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.