Automatic Discovery of Novel Intents & Domains from Text Utterances
- URL: http://arxiv.org/abs/2006.01208v1
- Date: Fri, 22 May 2020 00:47:10 GMT
- Title: Automatic Discovery of Novel Intents & Domains from Text Utterances
- Authors: Nikhita Vedula, Rahul Gupta, Aman Alok, Mukund Sridhar
- Abstract summary: We propose a novel framework, ADVIN, to automatically discover novel domains and intents from large volumes of unlabeled data.
ADVIN significantly outperforms baselines on three benchmark datasets, and real user utterances from a commercial voice-powered agent.
- Score: 18.39942131996558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the primary tasks in Natural Language Understanding (NLU) is to
recognize the intents as well as domains of users' spoken and written language
utterances. Most existing research formulates this as a supervised
classification problem with a closed-world assumption, i.e. the domains or
intents to be identified are pre-defined or known beforehand. Real-world
applications however increasingly encounter dynamic, rapidly evolving
environments with newly emerging intents and domains, about which no
information is known during model training. We propose a novel framework,
ADVIN, to automatically discover novel domains and intents from large volumes
of unlabeled data. We first employ an open classification model to identify all
utterances potentially consisting of a novel intent. Next, we build a knowledge
transfer component with a pairwise margin loss function. It learns
discriminative deep features to group together utterances and discover multiple
latent intent categories within them in an unsupervised manner. We finally
hierarchically link mutually related intents into domains, forming an
intent-domain taxonomy. ADVIN significantly outperforms baselines on three
benchmark datasets, and real user utterances from a commercial voice-powered
agent.
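
The abstract names two core mechanisms before the taxonomy step: an open classification model that flags utterances which may carry a novel intent, and a knowledge-transfer component trained with a pairwise margin loss that learns discriminative deep features for unsupervised grouping. The snippet below is a minimal sketch of those two ideas, assuming a confidence-threshold style of open classification and a standard contrastive-style pairwise margin loss; the encoder outputs, threshold, and margin values are hypothetical, and ADVIN's exact formulations may differ.

```python
# Minimal sketch (not the authors' code) of the two mechanisms named in the
# abstract: (1) open classification that flags utterances whose predicted
# confidence over the known intents is low, i.e. candidate novel intents, and
# (2) a contrastive-style pairwise margin loss that pulls same-intent
# utterance embeddings together and pushes different-intent pairs at least
# `margin` apart. Threshold and margin values are illustrative assumptions.
import torch
import torch.nn.functional as F


def flag_potentially_novel(logits: torch.Tensor, threshold: float = 0.7) -> torch.Tensor:
    """Boolean mask over utterances whose max softmax probability across the
    known intents falls below `threshold` (potential novel-intent utterances)."""
    max_prob, _ = F.softmax(logits, dim=-1).max(dim=-1)
    return max_prob < threshold


def pairwise_margin_loss(emb_a: torch.Tensor,
                         emb_b: torch.Tensor,
                         same_intent: torch.Tensor,
                         margin: float = 1.0) -> torch.Tensor:
    """Pairwise margin loss on utterance embedding pairs.

    `same_intent` is 1.0 where a pair shares an intent (pseudo-)label and 0.0
    otherwise; same-intent pairs are pulled together, different-intent pairs
    are pushed apart until their distance exceeds `margin`.
    """
    dist = F.pairwise_distance(emb_a, emb_b)
    pull = same_intent * dist.pow(2)
    push = (1.0 - same_intent) * F.relu(margin - dist).pow(2)
    return (pull + push).mean()


if __name__ == "__main__":
    # Toy usage: 4 utterance pairs encoded into 16-dim embeddings.
    a, b = torch.randn(4, 16), torch.randn(4, 16)
    labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
    print(pairwise_margin_loss(a, b, labels))
    print(flag_potentially_novel(torch.randn(4, 8)))
```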
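The final step described in the abstract hierarchically links mutually related intents into domains, forming an intent-domain taxonomy. As a rough illustration only, the sketch below groups discovered intents into domains by agglomerative clustering over intent centroid embeddings; the linkage method, distance metric, and threshold are assumptions rather than details from the paper.

```python
# Minimal sketch (not the authors' code) of hierarchically linking discovered
# intents into domains. Each intent is represented by the centroid of its
# utterance embeddings; intents are merged bottom-up with standard
# agglomerative clustering. The metric and threshold are illustrative.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def link_intents_into_domains(intent_centroids: np.ndarray,
                              distance_threshold: float = 0.3) -> np.ndarray:
    """Group intent centroids into domains via average-linkage clustering.

    Returns one domain id per intent, so intents whose centroids lie close
    together in embedding space end up in the same domain.
    """
    tree = linkage(intent_centroids, method="average", metric="cosine")
    return fcluster(tree, t=distance_threshold, criterion="distance")


if __name__ == "__main__":
    # Toy usage: 6 discovered intents with 16-dim centroid embeddings.
    centroids = np.random.rand(6, 16)
    print(link_intents_into_domains(centroids))
```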
Related papers
- Towards Open-Domain Topic Classification [69.21234350688098]
We introduce an open-domain topic classification system that accepts user-defined taxonomy in real time.
Users will be able to classify a text snippet with respect to any candidate labels they want and get an instant response from our web interface.
arXiv Detail & Related papers (2023-06-29T20:25:28Z)
- Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets [5.3361357265365035]
We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.
We evaluate our model on two multi-turn datasets for which we are the first to conduct joint slot-filling and intent detection.
arXiv Detail & Related papers (2023-05-28T13:59:58Z)
- Few-Shot Object Detection in Unseen Domains [4.36080478413575]
Few-shot object detection (FSOD) has thrived in recent years to learn novel object classes with limited data.
We propose various data augmentation techniques on the few shots of novel classes to account for all possible domain-specific information.
Our experiments on the T-LESS dataset show that the proposed approach succeeds in alleviating the domain gap considerably.
arXiv Detail & Related papers (2022-04-11T13:16:41Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- Discover, Hallucinate, and Adapt: Open Compound Domain Adaptation for Semantic Segmentation [91.30558794056056]
Unsupervised domain adaptation (UDA) for semantic segmentation has been attracting attention recently.
We present a novel framework based on three main design principles: discover, hallucinate, and adapt.
We evaluate our solution on the standard GTA to C-driving benchmark and achieve new state-of-the-art results.
arXiv Detail & Related papers (2021-10-08T13:20:09Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains into a shared latent space.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve out-of-domain (OOD) detection performance.
This paper proposes to train a model with only in-domain (IND) data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Learning to Select Context in a Hierarchical and Global Perspective for Open-domain Dialogue Generation [15.01710843286394]
We propose a novel model with a hierarchical self-attention mechanism and distant supervision to detect relevant words and utterances at both short and long distances.
Our model significantly outperforms other baselines in terms of fluency, coherence, and informativeness.
arXiv Detail & Related papers (2021-02-18T11:56:42Z)
- Linguistically-Enriched and Context-Aware Zero-shot Slot Filling [6.06746295810681]
Slot filling is one of the most important challenges in modern task-oriented dialog systems.
New domains (i.e., unseen in training) may emerge after deployment.
It is imperative that models seamlessly adapt and fill slots from both seen and unseen domains.
arXiv Detail & Related papers (2021-01-16T20:18:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.