Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
- URL: http://arxiv.org/abs/2101.06514v1
- Date: Sat, 16 Jan 2021 20:18:16 GMT
- Title: Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
- Authors: A.B. Siddique, Fuad Jamour, Vagelis Hristidis
- Abstract summary: Slot filling is one of the most important challenges in modern task-oriented dialog systems.
New domains (i.e., unseen in training) may emerge after deployment.
It is imperative that models seamlessly adapt and fill slots from both seen and unseen domains.
- Score: 6.06746295810681
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Slot filling is identifying contiguous spans of words in an utterance that
correspond to certain parameters (i.e., slots) of a user request/query. Slot
filling is one of the most important challenges in modern task-oriented dialog
systems. Supervised learning approaches have proven effective at tackling this
challenge, but they need a significant amount of labeled training data in a
given domain. However, new domains (i.e., unseen in training) may emerge after
deployment. Thus, it is imperative that these models seamlessly adapt and fill
slots from both seen and unseen domains -- unseen domains contain unseen slot
types with no training data, and even seen slots in unseen domains are
typically presented in different contexts. This setting is commonly referred to
as zero-shot slot filling. Little work has focused on this setting, with
limited experimental evaluation. Existing models that mainly rely on
context-independent embedding-based similarity measures fail to detect slot
values in unseen domains or do so only partially. We propose a new zero-shot
slot filling neural model, LEONA, which works in three steps. Step one acquires
domain-oblivious, context-aware representations of the utterance word by
exploiting (a) linguistic features; (b) named entity recognition cues; (c)
contextual embeddings from pre-trained language models. Step two fine-tunes
these rich representations and produces slot-independent tags for each word.
Step three exploits generalizable context-aware utterance-slot similarity
features at the word level, uses slot-independent tags, and contextualizes them
to produce slot-specific predictions for each word. Our thorough evaluation on
four diverse public datasets demonstrates that our approach consistently
outperforms the SOTA models by 17.52%, 22.15%, 17.42%, and 17.95% on average
for unseen domains on SNIPS, ATIS, MultiWOZ, and SGD datasets, respectively.
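To make the three-step pipeline above concrete, here is a minimal, illustrative sketch in Python. Every component is a toy stand-in chosen for this sketch: the function names, the rule-based features, and the word-overlap similarity are assumptions for illustration, not the paper's actual implementation, which uses POS/NER taggers, pre-trained contextual embeddings, and trained networks at each step.

```python
# Minimal sketch of a LEONA-style three-step pipeline. All components are toy
# stand-ins: the real model uses POS/NER taggers, pre-trained contextual
# embeddings, and trained networks at each step.
from typing import Dict, List, Tuple


def step1_encode(utterance: List[str]) -> List[dict]:
    """Step 1: domain-oblivious, context-aware word representations from
    (a) linguistic features, (b) NER cues, (c) contextual embeddings."""
    return [
        {
            "word": w,
            "pos": "PROPN" if w[:1].isupper() else "X",  # (a) toy POS feature
            "ner": "ENT" if w[:1].isupper() else "O",    # (b) toy NER cue
            "emb": (len(w), i),                          # (c) toy "embedding"
        }
        for i, w in enumerate(utterance)
    ]


def step2_tag(features: List[dict]) -> List[str]:
    """Step 2: slot-independent IOB tags -- mark words that belong to *some*
    slot value, without committing to any slot type."""
    tags, inside = [], False
    for f in features:
        if f["ner"] != "O":
            tags.append("I" if inside else "B")
            inside = True
        else:
            tags.append("O")
            inside = False
    return tags


def step3_predict(features: List[dict], tags: List[str],
                  slot_descriptions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 3: score each slot-independent span against every slot description
    and emit slot-specific predictions, even for unseen slot types."""
    spans, current = [], []
    for f, t in zip(features, tags):          # collect contiguous B/I spans
        if t == "O":
            if current:
                spans.append(" ".join(current))
                current = []
        else:
            current.append(f["word"])
    if current:
        spans.append(" ".join(current))

    def similarity(span: str, desc: str) -> float:
        # Toy word-overlap (Jaccard) similarity; the model instead learns
        # context-aware utterance-slot similarity from embeddings.
        a, b = set(span.lower().split()), set(desc.lower().split())
        return len(a & b) / len(a | b)

    return [(max(slot_descriptions,
                 key=lambda s: similarity(span, slot_descriptions[s])), span)
            for span in spans]


if __name__ == "__main__":
    utterance = "book a table at Nobu in Boston".split()
    feats = step1_encode(utterance)
    tags = step2_tag(feats)
    # Unseen slots, described only in natural language:
    slots = {"restaurant_name": "name of a restaurant such as nobu",
             "city": "a city such as boston or new york"}
    print(tags)                               # ['O', 'O', 'O', 'O', 'B', 'O', 'B']
    print(step3_predict(feats, tags, slots))  # [('restaurant_name', 'Nobu'), ('city', 'Boston')]
```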
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- HierarchicalContrast: A Coarse-to-Fine Contrastive Learning Framework for Cross-Domain Zero-Shot Slot Filling [4.1940152307593515]
Cross-domain zero-shot slot filling leverages source-domain knowledge to learn a model for target domains with no labeled data.
Existing state-of-the-art zero-shot slot filling methods have limited generalization ability in the target domain.
We present a novel Hierarchical Contrastive Learning Framework (HiCL) for zero-shot slot filling.
arXiv Detail & Related papers (2023-10-13T14:23:33Z)
- SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding [103.34092301324425]
Large language models (LLMs) have shown impressive ability for open-domain NLP tasks.
We present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding.
arXiv Detail & Related papers (2023-08-21T07:31:19Z)
- Vocabulary-informed Zero-shot and Open-set Learning [128.83517181045815]
We propose vocabulary-informed learning to address problems of supervised, zero-shot, generalized zero-shot and open set recognition.
Specifically, we propose a weighted maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms.
We show that the resulting model improves supervised, zero-shot, generalized zero-shot, and large open-set recognition, with up to a 310K class vocabulary on the Animals with Attributes and ImageNet datasets.
arXiv Detail & Related papers (2023-01-03T08:19:22Z)
- Zero-Shot Learning for Joint Intent and Slot Labeling [11.82805641934772]
We show that one can profitably perform joint zero-shot intent classification and slot labeling.
We describe NN architectures that translate between word and sentence embedding spaces.
arXiv Detail & Related papers (2022-11-29T01:58:25Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains into a common latent space.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Automatic Discovery of Novel Intents & Domains from Text Utterances [18.39942131996558]
We propose a novel framework, ADVIN, to automatically discover novel domains and intents from large volumes of unlabeled data.
ADVIN significantly outperforms baselines on three benchmark datasets and on real user utterances from a commercial voice-powered agent.
arXiv Detail & Related papers (2020-05-22T00:47:10Z)
- Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling [65.09621991654745]
Cross-domain slot filling is an essential task in task-oriented dialog systems.
We propose a Coarse-to-fine approach (Coach) for cross-domain slot filling.
Experimental results show that our model significantly outperforms state-of-the-art approaches in slot filling.
arXiv Detail & Related papers (2020-04-24T13:07:12Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
- MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition using Deep Bidirectional Transformers [1.7403133838762446]
We consider the training of a slot tagger using multiple data sets covering different slot types as a multi-task learning problem.
Experimental results in the biomedical domain show that the proposed approach outperforms previous state-of-the-art systems for slot tagging.
arXiv Detail & Related papers (2020-01-24T07:16:32Z)
- Elastic CRFs for Open-ontology Slot Filling [32.17803768259441]
Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots.
We propose a new model called elastic conditional random field (eCRF) where each slot is represented by the embedding of its natural language description.
New slot values can be detected by eCRF whenever a language description is available for the slot; a minimal sketch of this idea follows the list.
arXiv Detail & Related papers (2018-11-04T07:38:17Z)
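As a companion to the eCRF entry above, here is a minimal sketch of that idea under stated assumptions: slots are represented by embeddings of their natural-language descriptions, so an unseen slot can be handled at inference time just by supplying a description. The bag-of-words embeddings, per-word scoring, and function names are toy stand-ins of my own, and the real model decodes IOB sequences with a CRF rather than labeling words independently.

```python
# Toy sketch of the elastic-CRF idea: the slot label set is not fixed; each
# slot is represented by the embedding of its natural-language description,
# so new slots can be added at inference time. A real eCRF uses learned
# embeddings and CRF decoding; this stand-in uses bag-of-words vectors and
# labels each word independently.
from typing import Dict, List


def bow_embed(text: str, vocab: List[str]) -> List[int]:
    """Toy deterministic embedding: bag-of-words counts over a shared vocab."""
    words = text.lower().split()
    return [words.count(v) for v in vocab]


def label_words(utterance: List[str],
                slot_descriptions: Dict[str, str]) -> List[str]:
    """Label each word with the best-matching slot (by dot product with the
    slot's description embedding), or 'O' if no slot matches."""
    vocab = sorted({w for d in slot_descriptions.values() for w in d.lower().split()}
                   | {w.lower() for w in utterance})
    slot_embs = {s: bow_embed(d, vocab) for s, d in slot_descriptions.items()}
    labels = []
    for word in utterance:
        w_emb = bow_embed(word, vocab)
        scores = {s: sum(a * b for a, b in zip(w_emb, e))
                  for s, e in slot_embs.items()}
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] > 0 else "O")
    return labels


if __name__ == "__main__":
    utterance = "book an italian restaurant in boston".split()
    slots = {"cuisine": "cuisine type such as italian or chinese"}
    print(label_words(utterance, slots))
    # ['O', 'O', 'cuisine', 'O', 'O', 'O']

    # A new slot works immediately -- just supply its description:
    slots["city"] = "a city such as boston or paris"
    print(label_words(utterance, slots))
    # ['O', 'O', 'cuisine', 'O', 'O', 'city']
```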