Low-Resource Compositional Semantic Parsing with Concept Pretraining
- URL: http://arxiv.org/abs/2301.09809v1
- Date: Tue, 24 Jan 2023 04:27:27 GMT
- Title: Low-Resource Compositional Semantic Parsing with Concept Pretraining
- Authors: Subendhu Rongali, Mukund Sridhar Harakere, Haidar Khan, Konstantine
Arkoudas, Wael Hamza, and Andrew McCallum
- Abstract summary: We present an architecture to perform such domain adaptation automatically.
We use a base seq2seq (sequence-to-sequence) architecture and augment it with a concept encoder that encodes intent and slot tags from the new domain.
We report few-shot and zero-shot results for compositional semantic parsing on the TOPv2 dataset.
- Score: 35.35201295013346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic parsing plays a key role in digital voice assistants such as Alexa,
Siri, and Google Assistant by mapping natural language to structured meaning
representations. When we want to improve the capabilities of a voice assistant
by adding a new domain, the underlying semantic parsing model needs to be
retrained using thousands of annotated examples from the new domain, which is
time-consuming and expensive. In this work, we present an architecture to
perform such domain adaptation automatically, with only a small amount of
metadata about the new domain and without any new training data (zero-shot) or
with very few examples (few-shot). We use a base seq2seq (sequence-to-sequence)
architecture and augment it with a concept encoder that encodes intent and slot
tags from the new domain. We also introduce a novel decoder-focused approach to
pretrain seq2seq models to be concept aware using Wikidata and use it to help
our model learn important concepts and perform well in low-resource settings.
We report few-shot and zero-shot results for compositional semantic parsing on
the TOPv2 dataset and show that our model outperforms prior approaches in
few-shot settings for the TOPv2 and SNIPS datasets.
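As a rough illustration of the architecture the abstract describes (not the authors' released code), the hypothetical sketch below wires a seq2seq decoder to an auxiliary concept encoder: each intent/slot tag name from a new domain is encoded into a vector, and the decoder scores those vectors alongside its ordinary output vocabulary, so tags unseen during training can still be emitted. All class names, dimensions, and the scoring mechanism are invented for illustration.

```python
# Minimal sketch, assuming a GRU-based seq2seq; the paper's exact model,
# attention scheme, and Wikidata pretraining objective are not reproduced here.
import torch
import torch.nn as nn

class ConceptAugmentedParser(nn.Module):
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        # Concept encoder: turns each intent/slot tag name (given as a short
        # token sequence, e.g. the subwords of "IN:GET_WEATHER") into one vector.
        self.concept_encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRUCell(hidden, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def encode_concepts(self, concept_token_ids):
        # concept_token_ids: (num_concepts, concept_len) -> (num_concepts, hidden)
        _, h = self.concept_encoder(self.embed(concept_token_ids))
        return h.squeeze(0)

    def forward(self, src_ids, concept_token_ids, max_len=20):
        _, h = self.encoder(self.embed(src_ids))            # encode the utterance
        concepts = self.encode_concepts(concept_token_ids)  # encode new-domain tags
        h = h.squeeze(0)
        inp = torch.zeros(src_ids.size(0), self.embed.embedding_dim)
        logits = []
        for _ in range(max_len):
            h = self.decoder(inp, h)
            vocab_scores = self.out(h)                # scores over base vocabulary
            concept_scores = h @ concepts.t()         # scores over unseen domain tags
            logits.append(torch.cat([vocab_scores, concept_scores], dim=-1))
            inp = h
        return torch.stack(logits, dim=1)             # (batch, max_len, vocab + concepts)

# Example: one tokenized utterance and two tag names of three subwords each.
model = ConceptAugmentedParser(vocab_size=1000)
utterance = torch.randint(0, 1000, (1, 8))
concepts = torch.randint(0, 1000, (2, 3))
scores = model(utterance, concepts)                   # (1, 20, 1000 + 2)
```

The paper additionally pretrains the model to be concept aware using Wikidata before any in-domain fine-tuning; the sketch above only shows how a concept encoder can extend the decoder's output space at inference time.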
Related papers
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves retrieval performance competitive with state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z) - Towards Zero-Shot Frame Semantic Parsing with Task Agnostic Ontologies
and Simple Labels [0.9236074230806577]
OpenFSP is a framework for easy creation of new domains from simple labels.
Our approach relies on creating a small, but expressive, set of domain-agnostic slot types.
Our model outperforms strong baselines in this simple labels setting.
arXiv Detail & Related papers (2023-05-05T18:47:18Z) - Training Naturalized Semantic Parsers with Very Little Data [10.709587018625275]
State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text.
Recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves natural language sentences.
We show that this method delivers new SOTA few-shot performance on the Overnight dataset.
arXiv Detail & Related papers (2022-04-29T17:14:54Z) - RETRONLU: Retrieval Augmented Task-Oriented Semantic Parsing [11.157958012672202]
We apply retrieval-based modeling ideas to the problem of multi-domain task-oriented semantic parsing.
Our approach, RetroNLU, extends a sequence-to-sequence model architecture with a retrieval component.
We analyze the quality of the nearest-neighbor retrieval component and the model's sensitivity to it, and break down performance for semantic parses of varying utterance complexity.
arXiv Detail & Related papers (2021-09-21T19:30:30Z) - X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented
Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP.
We propose to predict flattened intent and slot representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z) - Low-Resource Task-Oriented Semantic Parsing via Intrinsic Modeling [65.51280121472146]
We exploit what we intrinsically know about ontology labels to build efficient semantic parsing models.
Our model is highly efficient when evaluated on a low-resource benchmark derived from TOPv2.
arXiv Detail & Related papers (2021-04-15T04:01:02Z) - Generating Synthetic Data for Task-Oriented Semantic Parsing with
Hierarchical Representations [0.8203855808943658]
In this work, we explore the possibility of generating synthetic data for neural semantic parsing.
Specifically, we first extract masked templates from the existing labeled utterances, and then fine-tune BART to generate synthetic utterances conditioned on those templates.
We show the potential of our approach when evaluating on the navigation domain of the Facebook TOP dataset.
arXiv Detail & Related papers (2020-11-03T22:55:40Z) - Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic
Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants.
Recent advances in deep learning have enabled several approaches to successfully parse more complex queries.
We propose a novel method that outperforms a supervised neural model at a 10-fold data reduction.
arXiv Detail & Related papers (2020-10-07T17:47:53Z) - Selecting Relevant Features from a Multi-domain Representation for
Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)