Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
- URL: http://arxiv.org/abs/2101.06514v1
- Date: Sat, 16 Jan 2021 20:18:16 GMT
- Title: Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
- Authors: A.B. Siddique, Fuad Jamour, Vagelis Hristidis
- Abstract summary: Slot filling is one of the most important challenges in modern task-oriented dialog systems.
New domains (i.e., unseen in training) may emerge after deployment.
It is imperative that models seamlessly adapt and fill slots from both seen and unseen domains.
- Score: 6.06746295810681
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Slot filling is identifying contiguous spans of words in an utterance that
correspond to certain parameters (i.e., slots) of a user request/query. Slot
filling is one of the most important challenges in modern task-oriented dialog
systems. Supervised learning approaches have proven effective at tackling this
challenge, but they need a significant amount of labeled training data in a
given domain. However, new domains (i.e., unseen in training) may emerge after
deployment. Thus, it is imperative that these models seamlessly adapt and fill
slots from both seen and unseen domains -- unseen domains contain unseen slot
types with no training data, and even seen slots in unseen domains are
typically presented in different contexts. This setting is commonly referred to
as zero-shot slot filling. Little work has focused on this setting, with
limited experimental evaluation. Existing models that mainly rely on
context-independent embedding-based similarity measures fail to detect slot
values in unseen domains or do so only partially. We propose a new zero-shot
slot filling neural model, LEONA, which works in three steps. Step one acquires
domain-oblivious, context-aware representations of the utterance word by
exploiting (a) linguistic features; (b) named entity recognition cues; (c)
contextual embeddings from pre-trained language models. Step two fine-tunes
these rich representations and produces slot-independent tags for each word.
Step three exploits generalizable context-aware utterance-slot similarity
features at the word level, uses slot-independent tags, and contextualizes them
to produce slot-specific predictions for each word. Our thorough evaluation on
four diverse public datasets demonstrates that our approach consistently
outperforms the SOTA models by 17.52%, 22.15%, 17.42%, and 17.95% on average
for unseen domains on SNIPS, ATIS, MultiWOZ, and SGD datasets, respectively.
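To make the three-step pipeline above concrete, here is a minimal, illustrative sketch in Python. Every component is a toy stand-in chosen for this sketch: the function names, the rule-based features, and the word-overlap similarity are assumptions for illustration, not the paper's actual implementation, which uses POS/NER taggers, pre-trained contextual embeddings, and trained networks at each step.

```python
# Minimal sketch of a LEONA-style three-step pipeline. All components are toy
# stand-ins: the real model uses POS/NER taggers, pre-trained contextual
# embeddings, and trained networks at each step.
from typing import Dict, List, Tuple


def step1_encode(utterance: List[str]) -> List[dict]:
    """Step 1: domain-oblivious, context-aware word representations from
    (a) linguistic features, (b) NER cues, (c) contextual embeddings."""
    return [
        {
            "word": w,
            "pos": "PROPN" if w[:1].isupper() else "X",  # (a) toy POS feature
            "ner": "ENT" if w[:1].isupper() else "O",    # (b) toy NER cue
            "emb": (len(w), i),                          # (c) toy "embedding"
        }
        for i, w in enumerate(utterance)
    ]


def step2_tag(features: List[dict]) -> List[str]:
    """Step 2: slot-independent IOB tags -- mark words that belong to *some*
    slot value, without committing to any slot type."""
    tags, inside = [], False
    for f in features:
        if f["ner"] != "O":
            tags.append("I" if inside else "B")
            inside = True
        else:
            tags.append("O")
            inside = False
    return tags


def step3_predict(features: List[dict], tags: List[str],
                  slot_descriptions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 3: score each slot-independent span against every slot description
    and emit slot-specific predictions, even for unseen slot types."""
    spans, current = [], []
    for f, t in zip(features, tags):          # collect contiguous B/I spans
        if t == "O":
            if current:
                spans.append(" ".join(current))
                current = []
        else:
            current.append(f["word"])
    if current:
        spans.append(" ".join(current))

    def similarity(span: str, desc: str) -> float:
        # Toy word-overlap (Jaccard) similarity; the model instead learns
        # context-aware utterance-slot similarity from embeddings.
        a, b = set(span.lower().split()), set(desc.lower().split())
        return len(a & b) / len(a | b)

    return [(max(slot_descriptions,
                 key=lambda s: similarity(span, slot_descriptions[s])), span)
            for span in spans]


if __name__ == "__main__":
    utterance = "book a table at Nobu in Boston".split()
    feats = step1_encode(utterance)
    tags = step2_tag(feats)
    # Unseen slots, described only in natural language:
    slots = {"restaurant_name": "name of a restaurant such as nobu",
             "city": "a city such as boston or new york"}
    print(tags)                               # ['O', 'O', 'O', 'O', 'B', 'O', 'B']
    print(step3_predict(feats, tags, slots))  # [('restaurant_name', 'Nobu'), ('city', 'Boston')]
```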
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- HierarchicalContrast: A Coarse-to-Fine Contrastive Learning Framework for Cross-Domain Zero-Shot Slot Filling [4.1940152307593515]
Cross-domain zero-shot slot filling leverages source-domain knowledge to learn a model for target domains with no labeled data.
Existing state-of-the-art zero-shot slot filling methods have limited generalization ability in the target domain.
We present a novel Hierarchical Contrastive Learning Framework (HiCL) for zero-shot slot filling.
arXiv Detail & Related papers (2023-10-13T14:23:33Z)
- SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding [103.34092301324425]
Large language models (LLMs) have shown impressive ability for open-domain NLP tasks.
We present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding.
arXiv Detail & Related papers (2023-08-21T07:31:19Z)
- Vocabulary-informed Zero-shot and Open-set Learning [128.83517181045815]
We propose vocabulary-informed learning to address problems of supervised, zero-shot, generalized zero-shot and open set recognition.
Specifically, we propose a weighted maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms.
We show that the resulting model improves supervised, zero-shot, generalized zero-shot, and large open-set recognition, with up to a 310K class vocabulary on the Animals with Attributes and ImageNet datasets.
arXiv Detail & Related papers (2023-01-03T08:19:22Z)
- Zero-Shot Learning for Joint Intent and Slot Labeling [11.82805641934772]
We show that one can profitably perform joint zero-shot intent classification and slot labeling.
We describe NN architectures that translate between word and sentence embedding spaces.
arXiv Detail & Related papers (2022-11-29T01:58:25Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains into a common latent space.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Automatic Discovery of Novel Intents & Domains from Text Utterances [18.39942131996558]
We propose a novel framework, ADVIN, to automatically discover novel domains and intents from large volumes of unlabeled data.
ADVIN significantly outperforms baselines on three benchmark datasets and on real user utterances from a commercial voice-powered agent.
arXiv Detail & Related papers (2020-05-22T00:47:10Z)
- Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling [65.09621991654745]
Cross-domain slot filling is an essential task in task-oriented dialog systems.
We propose a Coarse-to-fine approach (Coach) for cross-domain slot filling.
Experimental results show that our model significantly outperforms state-of-the-art approaches in slot filling.
arXiv Detail & Related papers (2020-04-24T13:07:12Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
- MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition using Deep Bidirectional Transformers [1.7403133838762446]
We consider the training of a slot tagger using multiple data sets covering different slot types as a multi-task learning problem.
Experimental results in the biomedical domain show that the proposed approach outperforms previous state-of-the-art systems for slot tagging.
arXiv Detail & Related papers (2020-01-24T07:16:32Z)
- Elastic CRFs for Open-ontology Slot Filling [32.17803768259441]
Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots.
We propose a new model called elastic conditional random field (eCRF) where each slot is represented by the embedding of its natural language description.
New slot values can be detected by eCRF whenever a language description is available for the slot; a minimal sketch of this idea follows the list.
arXiv Detail & Related papers (2018-11-04T07:38:17Z)
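As a companion to the eCRF entry above, here is a minimal sketch of that idea under stated assumptions: slots are represented by embeddings of their natural-language descriptions, so an unseen slot can be handled at inference time just by supplying a description. The bag-of-words embeddings, per-word scoring, and function names are toy stand-ins of my own, and the real model decodes IOB sequences with a CRF rather than labeling words independently.

```python
# Toy sketch of the elastic-CRF idea: the slot label set is not fixed; each
# slot is represented by the embedding of its natural-language description,
# so new slots can be added at inference time. A real eCRF uses learned
# embeddings and CRF decoding; this stand-in uses bag-of-words vectors and
# labels each word independently.
from typing import Dict, List


def bow_embed(text: str, vocab: List[str]) -> List[int]:
    """Toy deterministic embedding: bag-of-words counts over a shared vocab."""
    words = text.lower().split()
    return [words.count(v) for v in vocab]


def label_words(utterance: List[str],
                slot_descriptions: Dict[str, str]) -> List[str]:
    """Label each word with the best-matching slot (by dot product with the
    slot's description embedding), or 'O' if no slot matches."""
    vocab = sorted({w for d in slot_descriptions.values() for w in d.lower().split()}
                   | {w.lower() for w in utterance})
    slot_embs = {s: bow_embed(d, vocab) for s, d in slot_descriptions.items()}
    labels = []
    for word in utterance:
        w_emb = bow_embed(word, vocab)
        scores = {s: sum(a * b for a, b in zip(w_emb, e))
                  for s, e in slot_embs.items()}
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] > 0 else "O")
    return labels


if __name__ == "__main__":
    utterance = "book an italian restaurant in boston".split()
    slots = {"cuisine": "cuisine type such as italian or chinese"}
    print(label_words(utterance, slots))
    # ['O', 'O', 'cuisine', 'O', 'O', 'O']

    # A new slot works immediately -- just supply its description:
    slots["city"] = "a city such as boston or paris"
    print(label_words(utterance, slots))
    # ['O', 'O', 'cuisine', 'O', 'O', 'city']
```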