Towards Textual Out-of-Domain Detection without In-Domain Labels
- URL: http://arxiv.org/abs/2203.11396v1
- Date: Tue, 22 Mar 2022 00:11:46 GMT
- Title: Towards Textual Out-of-Domain Detection without In-Domain Labels
- Authors: Di Jin, Shuyang Gao, Seokhwan Kim, Yang Liu, and Dilek Hakkani-Tur
- Abstract summary: This work focuses on a challenging case of OOD detection, where no labels for in-domain data are accessible.
We first evaluate different language model based approaches that predict the likelihood of a sequence of tokens.
We propose a novel representation learning based method by combining unsupervised clustering and contrastive learning.
- Score: 41.23096594140221
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In many real-world settings, machine learning models need to identify user
inputs that are out-of-domain (OOD) so as to avoid performing wrong actions.
This work focuses on a challenging case of OOD detection, where no labels for
in-domain data are accessible (e.g., no intent labels for the intent
classification task). To this end, we first evaluate different language
model based approaches that predict the likelihood of a sequence of tokens. Furthermore,
we propose a novel representation learning based method by combining
unsupervised clustering and contrastive learning so that better data
representations for OOD detection can be learned. Through extensive
experiments, we demonstrate that this method can significantly outperform
likelihood-based methods and is even competitive with state-of-the-art
supervised approaches that use label information.
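The proposed method scores inputs by how far their learned representations fall from in-domain clusters. The sketch below illustrates that distance-based scoring idea only: hand-made 2-D vectors stand in for the contrastively learned sentence embeddings, plain k-means stands in for the paper's unsupervised clustering, and the OOD score is the distance to the nearest in-domain centroid. All vectors and parameters here are illustrative assumptions, not the paper's implementation.

```python
# Toy sketch: distance-to-nearest-cluster OOD scoring (assumed setup,
# with 2-D points standing in for learned sentence embeddings).
import math
import random

def mean(pts):
    """Component-wise mean of a list of points."""
    return tuple(sum(x) / len(pts) for x in zip(*pts))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means over the in-domain points; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = mean(members)
    return centroids

def ood_score(x, centroids):
    """Distance to the nearest in-domain centroid; larger = more OOD."""
    return min(math.dist(x, c) for c in centroids)

# Two hypothetical in-domain "intent" clusters in embedding space.
in_domain = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
             (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
centroids = kmeans(in_domain, k=2)

# An input near a cluster scores lower than one far from all clusters;
# a threshold tuned on held-out in-domain data turns this into a detector.
print(ood_score((0.05, 0.05), centroids) < ood_score((10.0, -3.0), centroids))
```

No intent labels are used anywhere above, which is the point of the unsupervised setting: clustering supplies the structure that labels would otherwise provide.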
Related papers
- Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Out-of-distribution (OOD) samples are crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLMs) to envision potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z) - Finding Dino: A plug-and-play framework for unsupervised detection of out-of-distribution objects using prototypes [12.82756672393553]
We present PRototype-based zero-shot OOD detection Without Labels (PROWL).
It is an inference-based method that does not require training on the domain dataset.
We also demonstrate its suitability for other domains such as rail and maritime scenes.
arXiv Detail & Related papers (2024-04-11T11:55:42Z) - Out-of-Distribution Detection Using Peer-Class Generated by Large Language Model [0.0]
Out-of-distribution (OOD) detection is a critical task to ensure the reliability and security of machine learning models.
In this paper, a novel method called ODPC is proposed, in which specific prompts to generate OOD peer classes of ID semantics are designed by a large language model.
Experiments on five benchmark datasets show that the method we propose can yield state-of-the-art results.
arXiv Detail & Related papers (2024-03-20T06:04:05Z) - Semi-Supervised Object Detection in the Open World [16.274397329511192]
We introduce an ensemble based OOD detector consisting of lightweight auto-encoder networks trained only on ID data.
Our method performs competitively against state-of-the-art OOD detection algorithms and also significantly boosts the semi-supervised learning performance in open-world scenarios.
arXiv Detail & Related papers (2023-07-28T17:59:03Z) - Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally has higher pixel-labeling quality without the effort of manual annotations.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z) - Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z) - Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z) - Adversarial Knowledge Transfer from Unlabeled Data [62.97253639100014]
We present a novel Adversarial Knowledge Transfer framework for transferring knowledge from internet-scale unlabeled data to improve the performance of a classifier.
An important novel aspect of our method is that the unlabeled source data can be of different classes from those of the labeled target data, and there is no need to define a separate pretext task.
arXiv Detail & Related papers (2020-08-13T08:04:27Z) - Likelihood Ratios and Generative Classifiers for Unsupervised Out-of-Domain Detection In Task Oriented Dialog [24.653367921046442]
We focus on OOD detection for natural language sentence inputs to task-based dialog systems.
We release a dataset of 4K OOD examples for the publicly available dataset from Schuster et al.
arXiv Detail & Related papers (2019-12-30T03:31:17Z)
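The likelihood-ratio idea from the entry above can be sketched with toy models. The score log p_ID(x) - log p_BG(x) compares an in-domain model against a background model, discounting generically frequent tokens that a raw likelihood conflates with domain membership. In the sketch, add-one-smoothed unigram models stand in for the generative models and all corpora are made-up assumptions, not the paper's data.

```python
# Toy sketch: likelihood-ratio OOD scoring with unigram stand-ins
# for the in-domain and background language models (assumed setup).
import math
from collections import Counter

def unigram(corpus):
    """Add-one-smoothed unigram probability function over a tiny corpus."""
    counts = Counter(tok for sent in corpus for tok in sent.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 slot for unseen tokens
    return lambda tok: (counts.get(tok, 0) + 1) / (total + vocab)

# Hypothetical corpora: task-oriented in-domain text vs. generic background.
in_domain = ["book a flight to boston", "cancel my flight"]
background = ["play some jazz music", "what is the weather", "book a table"]

p_id, p_bg = unigram(in_domain), unigram(background)

def llr(sentence):
    """log p_ID(x) - log p_BG(x); higher means more in-domain."""
    return sum(math.log(p_id(t)) - math.log(p_bg(t)) for t in sentence.split())

# In-domain inputs score higher than out-of-domain ones on this toy data;
# thresholding the ratio yields an unsupervised detector.
print(llr("cancel my flight") > llr("play some jazz"))
```

Because both models see the same generic tokens, the ratio cancels their contribution, which is what lets this detector work without any in-domain labels.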
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.