Bridging In- and Out-of-distribution Samples for Their Better
Discriminability
- URL: http://arxiv.org/abs/2101.02500v1
- Date: Thu, 7 Jan 2021 11:34:18 GMT
- Authors: Engkarat Techapanurak, Anh-Chuong Dang, Takayuki Okatani
- Abstract summary: We consider samples lying between in-distribution (ID) and out-of-distribution (OOD) samples and use them to train a network.
We generate such samples using multiple image transformations that corrupt inputs in various ways and with different severity levels.
We estimate where the samples generated by a single image transformation lie between ID and OOD using a network trained on clean ID samples.
- Score: 18.84265231678354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a method for out-of-distribution (OOD)
detection. Questioning the premise of previous studies that in-distribution
(ID) and OOD samples are separated distinctly, we consider samples lying
between the two and use them to train a network. We generate such samples
using multiple image transformations that corrupt inputs in various ways and
with different severity levels. We estimate where the samples generated by a
single image transformation lie between ID and OOD using a network trained on
clean ID samples. To be specific, we make the
network classify the generated samples and calculate their mean classification
accuracy, using which we create a soft target label for them. We train the same
network from scratch using the original ID samples and the generated samples
with the soft labels created for them. We detect OOD samples by thresholding
the entropy of the predicted softmax probability. The experimental results show
that our method outperforms the previous state-of-the-art in the standard
benchmark tests. We also analyze the effect of the number and particular
combinations of image corrupting transformations on the performance.
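The pipeline described in the abstract (mean accuracy on corrupted samples, a soft target label derived from it, and entropy thresholding at test time) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the soft-label formula, so the mixing rule in `soft_target` (interpolating between a one-hot label and the uniform distribution, weighted by accuracy rescaled between chance and clean accuracy) is an assumption, as are all function names and the threshold value.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_accuracy(predicted, labels):
    """Fraction of corrupted samples the clean-trained network still classifies correctly."""
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return correct / len(labels)


def soft_target(label, num_classes, acc, clean_acc):
    """Build a soft label for a sample produced by one transformation.

    ASSUMPTION (not from the source): the target mixes the one-hot label
    with the uniform distribution. The weight alpha is the accuracy on the
    corrupted samples rescaled so that chance level maps to 0 and the
    clean accuracy maps to 1.
    """
    chance = 1.0 / num_classes
    if clean_acc <= chance:
        raise ValueError("clean accuracy must exceed chance level")
    alpha = (acc - chance) / (clean_acc - chance)
    alpha = min(1.0, max(0.0, alpha))
    return [
        alpha * (1.0 if k == label else 0.0) + (1.0 - alpha) * chance
        for k in range(num_classes)
    ]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def is_ood(logits, threshold):
    """Flag a sample as OOD when the entropy of its softmax output exceeds the threshold."""
    return entropy(softmax(logits)) > threshold


if __name__ == "__main__":
    # Hypothetical predictions of the clean-trained network on four corrupted samples.
    acc = mean_accuracy(predicted=[0, 1, 2, 2], labels=[0, 1, 2, 0])
    print(soft_target(label=0, num_classes=4, acc=acc, clean_acc=1.0))
    # A near-uniform output has high entropy and is flagged as OOD.
    print(is_ood([0.1, 0.0, 0.1, 0.0], threshold=1.0))
```

Under this assumed rule, a transformation that leaves accuracy untouched keeps the hard one-hot label, while one that drives accuracy down to chance yields a uniform target, which matches the idea of placing corrupted samples somewhere between ID and OOD. In practice the threshold would be chosen on held-out data.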
Related papers
- Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox [70.57120710151105]
Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data.
Some marginal OOD samples actually have close semantic contents to the in-distribution (ID) sample, which makes determining the OOD sample a Sorites Paradox.
We construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue.
(2024-06-14T09:27:56Z)
- Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers [3.8839179829686126]
A rejection network can be trained with ID and diverse outlier samples to detect test OOD samples.
We propose a method called Pseudo Outlier Exposure (POE) that constructs a surrogate OOD dataset by sequentially masking tokens related to ID classes.
Our method does not require any external OOD data and can be easily implemented within off-the-shelf Transformers.
(2023-07-18T17:29:23Z)
- ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation [57.38418881020046]
Recent DA techniques always meet the need for diversity in augmented training samples.
An augmentation strategy that has a high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that firstly detects OOD samples in augmented samples and then leverages them.
(2022-05-25T09:29:27Z)
- Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification [80.81532239566992]
We compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
(2022-04-09T12:11:59Z)
- Energy-bounded Learning for Robust Models of Code [16.592638312365164]
In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on.
We propose the use of an energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models.
(2021-12-20T06:28:56Z)
- WOOD: Wasserstein-based Out-of-Distribution Detection [6.163329453024915]
Training data for deep-neural-network-based classifiers are usually assumed to be sampled from the same distribution.
When part of the test samples are drawn from a distribution that is far away from that of the training samples, the trained neural network has a tendency to make high confidence predictions for these OOD samples.
We propose a Wasserstein-based out-of-distribution detection (WOOD) method to overcome these challenges.
(2021-12-13T02:35:15Z)
- iNNformant: Boundary Samples as Telltale Watermarks [68.8204255655161]
We show that it is possible to generate sets of boundary samples which can identify any of four tested microarchitectures.
These sets can be built to not contain any sample with a worse peak signal-to-noise ratio than 70dB.
(2021-06-14T11:18:32Z)
- Transform consistency for learning with noisy labels [9.029861710944704]
We propose a method to identify clean samples only using one single network.
Clean samples prefer to reach consistent predictions for the original images and the transformed images.
In order to mitigate the negative influence of noisy labels, we design a classification loss by using the off-line hard labels and on-line soft labels.
(2021-03-25T14:33:13Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
(2020-12-10T16:55:13Z)
- Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning [54.85397562961903]
Semi-supervised learning (SSL) has been proposed to leverage unlabeled data for training powerful models when only limited labeled data is available.
We address a more complex novel scenario named open-set SSL, where out-of-distribution (OOD) samples are contained in unlabeled data.
Our method achieves state-of-the-art results by successfully eliminating the effect of OOD samples.
(2020-07-22T10:33:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.