The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
- URL: http://arxiv.org/abs/2210.07071v3
- Date: Wed, 24 Apr 2024 02:37:55 GMT
- Title: The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
- Authors: Yunhua Zhou, Pengyu Wang, Peiju Liu, Yuxin Wang, Xipeng Qiu
- Abstract summary: We shed light on the fundamental cause of model overconfidence on OOD.
We also extend the Lottery Ticket Hypothesis to open-world scenarios.
- Score: 68.93357975024773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing methods of Out-of-Domain (OOD) intent classification rely on extensive auxiliary OOD corpora or specific training paradigms. However, they are underdeveloped in the underlying principle that the models should have differentiated confidence in In- and Out-of-domain intent. In this work, we shed light on the fundamental cause of model overconfidence on OOD and demonstrate that calibrated subnetworks can be uncovered by pruning the overparameterized model. Calibrated confidence provided by the subnetwork can better distinguish In- and Out-of-domain, which can be a benefit for almost all post hoc methods. In addition to bringing fundamental insights, we also extend the Lottery Ticket Hypothesis to open-world scenarios. We conduct extensive experiments on four real-world datasets to demonstrate our approach can establish consistent improvements compared with a suite of competitive baselines.
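The abstract's two ingredients, pruning an overparameterized model and scoring inputs with a post hoc confidence measure, can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not say how the subnetwork is found, so one-shot magnitude pruning is swapped in here as an assumption, and maximum softmax probability is used as one common post hoc score. The function names `magnitude_prune`, `msp_score`, and `flag_ood`, the threshold value, and the random weights are all hypothetical.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the entries.

    Assumption: one-shot global magnitude pruning, used only for illustration.
    Returns the pruned weights and the boolean keep-mask.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_drop = int(round(sparsity * weights.size))
    keep = np.ones(weights.size, dtype=bool)
    # Indices of the n_drop entries with the smallest absolute value.
    order = np.argsort(np.abs(weights).ravel(), kind="stable")
    keep[order[:n_drop]] = False
    keep = keep.reshape(weights.shape)
    return weights * keep, keep


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def msp_score(logits):
    """Maximum softmax probability, a common post hoc confidence score."""
    return softmax(logits).max(axis=-1)


def flag_ood(logits, threshold):
    """Flag an input as out-of-domain when its confidence is below threshold."""
    return msp_score(logits) < threshold


# Toy usage: a random linear "classifier" with 8 features and 4 intents.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
W_pruned, mask = magnitude_prune(W, sparsity=0.5)
features = rng.normal(size=(3, 8))
print(flag_ood(features @ W_pruned, threshold=0.7))
```

In practice the threshold would be tuned on held-out in-domain data, and the same scoring function can be applied to the dense and the pruned model to compare how well each separates in-domain from out-of-domain inputs.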
Related papers
- WeiPer: OOD Detection using Weight Perturbations of Class Projections [11.130659240045544]
We introduce perturbations of the class projections in the final fully connected layer which creates a richer representation of the input.
We achieve state-of-the-art OOD detection results across multiple benchmarks of the OpenOOD framework.
arXiv Detail & Related papers (2024-05-27T13:38:28Z)
- Towards Context-Aware Domain Generalization: Understanding the Benefits and Limits of Marginal Transfer Learning [1.5320861212113897]
We formalize the notion of context as a permutation-invariant representation of a set of data points.
Empirical analysis shows that our criteria are effective in discerning both favorable and unfavorable scenarios.
arXiv Detail & Related papers (2023-12-15T05:18:07Z)
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z)
- DATa: Domain Adaptation-Aided Deep Table Detection Using Visual-Lexical Representations [2.542864854772221]
We present a novel Domain Adaptation-aided deep Table detection method called DATa.
It guarantees satisfactory performance in a specific target domain where few trusted labels are available.
Experiments show that DATa substantially outperforms competing methods that only utilize visual representations in the target domain.
arXiv Detail & Related papers (2022-11-12T12:14:16Z)
- Generalizability of Adversarial Robustness Under Distribution Shifts [57.767152566761304]
We take a first step towards investigating the interplay between empirical and certified adversarial robustness on one hand and domain generalization on another.
We train robust models on multiple domains and evaluate their accuracy and robustness on an unseen domain.
We extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
arXiv Detail & Related papers (2022-09-29T18:25:48Z)
- WeShort: Out-of-distribution Detection With Weak Shortcut structure [0.0]
We propose a simple and effective post-hoc technique, WeShort, to reduce the overconfidence of neural networks on OOD data.
Our method is compatible with different OOD detection scores and can generalize well to different architectures of networks.
arXiv Detail & Related papers (2022-06-23T07:59:10Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- Confidence Estimation via Auxiliary Models [47.08749569008467]
We introduce a novel target criterion for model confidence, namely the true class probability (TCP).
We show that TCP offers better properties for confidence estimation than the standard maximum class probability (MCP).
arXiv Detail & Related papers (2020-12-11T17:21:12Z)
- A Unified Taylor Framework for Revisiting Attribution Methods [49.03783992773811]
We propose a Taylor attribution framework and reformulate seven mainstream attribution methods into the framework.
We establish three principles for a good attribution in the Taylor attribution framework.
arXiv Detail & Related papers (2020-08-21T22:07:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.