Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness
- URL: http://arxiv.org/abs/2310.02832v2
- Date: Mon, 11 Mar 2024 18:18:41 GMT
- Title: Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness
- Authors: Fran Jelenić, Josip Jukić, Martin Tutek, Mate Puljiz, Jan Šnajder
- Abstract summary: We present a novel method for detecting OOD data in Transformers based on transformation smoothness between intermediate layers of a network.
We evaluate BLOOD on several text classification tasks with Transformer networks and demonstrate that it outperforms methods with comparable resource requirements.
Our analysis also suggests that when learning simpler tasks, OOD data transformations maintain their original sharpness, whereas sharpness increases with more complex tasks.
- Score: 4.724825031148413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective out-of-distribution (OOD) detection is crucial for reliable machine
learning models, yet most current methods are limited in practical use due to
requirements like access to training data or intervention in training. We
present a novel method for detecting OOD data in Transformers based on
transformation smoothness between intermediate layers of a network (BLOOD),
which is applicable to pre-trained models without access to training data.
BLOOD utilizes the tendency of between-layer representation transformations of
in-distribution (ID) data to be smoother than the corresponding transformations
of OOD data, a property that we also demonstrate empirically. We evaluate BLOOD
on several text classification tasks with Transformer networks and demonstrate
that it outperforms methods with comparable resource requirements. Our analysis
also suggests that when learning simpler tasks, OOD data transformations
maintain their original sharpness, whereas sharpness increases with more
complex tasks.
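The abstract implies a concrete recipe: for each pair of adjacent layers, measure how sharply the layer-to-layer mapping bends around the current representation, then aggregate across layers. Below is a minimal, hedged sketch of that idea in PyTorch: it estimates the squared Frobenius norm of each between-layer Jacobian with random Jacobian-vector products (a Hutchinson-style estimator) and averages the estimates over layers. The layer interface, the Gaussian probes, the sample count, and the mean-over-layers aggregation are illustrative assumptions, not the authors' reference implementation.

```python
# A minimal sketch of a BLOOD-style smoothness score, assuming a PyTorch
# stack of layer modules. For each between-layer transformation we estimate
# the squared Frobenius norm of its Jacobian with random Jacobian-vector
# products (E[||J v||^2] = ||J||_F^2 for v ~ N(0, I)), then average over
# layers. Higher score = sharper transformation = more OOD-like.
import torch
import torch.nn as nn
from torch.func import jvp  # forward-mode Jacobian-vector products


def blood_score(layers, h0, n_samples=8):
    """Per-example smoothness score averaged over layer transitions.

    layers: iterable of modules, each mapping one representation to the next
    h0:     input representation, shape (batch, dim)
    """
    h = h0
    per_layer = []
    for layer in layers:
        est = 0.0
        for _ in range(n_samples):
            v = torch.randn_like(h)         # Gaussian probe, E[v v^T] = I
            _, Jv = jvp(layer, (h,), (v,))  # J @ v without materializing J
            est = est + Jv.pow(2).flatten(1).sum(dim=1)
        per_layer.append(est / n_samples)   # unbiased ||J||_F^2 estimate
        h = layer(h).detach()               # advance to the next layer
    return torch.stack(per_layer).mean(dim=0)


# Toy usage: MLP blocks stand in for a Transformer's encoder layers.
torch.manual_seed(0)
dim = 32
stack = nn.ModuleList(
    nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(4)
)
x = torch.randn(16, dim)      # stand-in batch of representations
print(blood_score(stack, x))  # one score per example
```

In an actual deployment, `stack` would be the pre-trained Transformer's encoder blocks and `h0` the embedded input; under the paper's hypothesis, the score comes out systematically higher for OOD inputs than for ID ones. Whether to average over all transitions or read only selected layers is an aggregation choice this sketch leaves open.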
Related papers
- Enhancing OOD Detection Using Latent Diffusion [3.4899193297791054]
Out-of-distribution (OOD) detection is crucial for the reliable deployment of machine learning models in real-world scenarios.
Recent efforts have explored using generative models, such as Stable Diffusion, to synthesize outlier data in the pixel space.
We propose Outlier-Aware Learning (OAL), a novel framework that generates synthetic OOD training data within the latent space.
arXiv Detail & Related papers (2024-06-24T11:01:43Z) - How Out-of-Distribution Detection Learning Theory Enhances Transformer: Learnability and Reliability [10.056026416603006]
This paper introduces an OOD detection Probably Approximately Correct (PAC) theory for Transformers.
It shows that outliers can be accurately represented and distinguished with sufficient data under suitable conditions.
This approach yields a novel algorithm that ensures learnability and refines the decision boundaries between inliers and outliers.
arXiv Detail & Related papers (2024-06-13T17:54:09Z) - Gradient-Regularized Out-of-Distribution Detection [28.542499196417214]
One of the challenges for neural networks in real-life applications is the overconfident errors these models make when the data is not from the original training distribution.
We propose the idea of leveraging the information embedded in the gradient of the loss function during training to enable the network to learn a desired OOD score for each sample.
We also develop a novel energy-based sampling method to allow the network to be exposed to more informative OOD samples during the training phase.
arXiv Detail & Related papers (2024-04-18T17:50:23Z) - EAT: Towards Long-Tailed Out-of-Distribution Detection [55.380390767978554]
This paper addresses the challenging task of long-tailed OOD detection.
The main difficulty lies in distinguishing OOD data from samples belonging to the tail classes.
We propose two simple ideas: (1) Expanding the in-distribution class space by introducing multiple abstention classes, and (2) Augmenting the context-limited tail classes by overlaying images onto the context-rich OOD data.
arXiv Detail & Related papers (2023-12-14T13:47:13Z) - Distilling the Unknown to Unveil Certainty [66.29929319664167]
Out-of-distribution (OOD) detection is essential in identifying test samples that deviate from the in-distribution (ID) data upon which a standard network is trained.
This paper introduces OOD knowledge distillation, a pioneering learning framework applicable whether or not training ID data is available.
arXiv Detail & Related papers (2023-11-14T08:05:02Z) - Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources [73.28967478098107]
Out-of-distribution (OOD) detection discerns OOD data, on which the predictor cannot make valid predictions, from in-distribution (ID) data.
It is typically hard to collect real OOD data for training a predictor capable of discerning OOD patterns.
We propose a data-generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can mitigate mistaken OOD generation.
arXiv Detail & Related papers (2023-11-06T16:26:52Z) - Classifier-head Informed Feature Masking and Prototype-based Logit Smoothing for Out-of-Distribution Detection [27.062465089674763]
Out-of-distribution (OOD) detection is essential when deploying neural networks in the real world.
One main challenge is that neural networks often make overconfident predictions on OOD data.
We propose an effective post-hoc OOD detection method based on a new feature masking strategy and a novel logit smoothing strategy.
arXiv Detail & Related papers (2023-10-27T12:42:17Z) - Out-of-distribution Detection with Implicit Outlier Transformation [72.73711947366377]
Outlier exposure (OE) is powerful in out-of-distribution (OOD) detection.
We propose a novel OE-based approach that makes the model perform well for unseen OOD situations.
arXiv Detail & Related papers (2023-03-09T04:36:38Z) - Using Semantic Information for Defining and Detecting OOD Inputs [3.9577682622066264]
Out-of-distribution (OOD) detection has received some attention recently.
We demonstrate that the current detectors inherit the biases in the training dataset.
This can render the current OOD detectors impermeable to inputs lying outside the training distribution but with the same semantic information.
We perform OOD detection on semantic information extracted from the training data of MNIST and COCO datasets.
arXiv Detail & Related papers (2023-02-21T21:31:20Z) - Igeood: An Information Geometry Approach to Out-of-Distribution Detection [35.04325145919005]
We introduce Igeood, an effective method for detecting out-of-distribution (OOD) samples.
Igeood applies to any pre-trained neural network and works under various degrees of access to the machine learning model.
We show that Igeood outperforms competing state-of-the-art methods on a variety of network architectures and datasets.
arXiv Detail & Related papers (2022-03-15T11:26:35Z) - Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method that uses Random Network Distillation for OOD detection, allowing an autoregressive language model to disregard OOD context during inference (a minimal sketch of the RND scoring idea follows this list).
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z) - OODformer: Out-Of-Distribution Detection Transformer [15.17006322500865]
In real-world safety-critical applications, it is important to be aware if a new data point is OOD.
This paper proposes a first-of-its-kind OOD detection architecture named OODformer.
arXiv Detail & Related papers (2021-07-19T15:46:38Z) - Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
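For the Random Network Distillation entry above, the scoring mechanism is standard enough to sketch: a trainable predictor is distilled on ID data to match a frozen, randomly initialized target network, and at inference the predictor's error serves as the novelty score. The PyTorch sketch below illustrates that idea; the architectures, data, and training settings are assumptions for illustration, not the paper's GRU setup.

```python
# A minimal sketch of Random Network Distillation (RND) as an OOD score,
# the component the recurrent-decoder entry above builds on. Architectures,
# data, and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, feat = 32, 16

# Frozen, randomly initialized target network.
target = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, feat))
for p in target.parameters():
    p.requires_grad_(False)

# Trainable predictor, distilled to match the target on ID data only.
predictor = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, feat))
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

id_data = torch.randn(512, dim)  # stand-in "in-distribution" inputs
for _ in range(200):
    batch = id_data[torch.randint(0, id_data.size(0), (64,))]
    loss = (predictor(batch) - target(batch)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()


def rnd_score(x):
    """Prediction error as novelty score: low on ID-like inputs, higher on OOD."""
    with torch.no_grad():
        return (predictor(x) - target(x)).pow(2).mean(dim=1)


print(rnd_score(id_data[:4]))                  # low: matches distilled data
print(rnd_score(torch.randn(4, dim) * 5 + 3))  # higher: shifted inputs
```

Inputs resembling the distillation data yield low prediction error, while shifted inputs yield higher error; that gap is what lets the method above flag and disregard OOD context.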