Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers
- URL: http://arxiv.org/abs/2210.11984v1
- Date: Fri, 21 Oct 2022 14:19:47 GMT
- Title: Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers
- Authors: Daniel Fernández-González
- Abstract summary: Task-oriented dialog systems, such as Apple Siri and Amazon Alexa, require a semantic parsing module in order to process user utterances and understand the action to be performed.
We advance the research on shift-reduce semantic parsing for task-oriented dialog.
In particular, we implement novel shift-reduce parsers that rely on Stack-Transformers.
- Score: 0.40611352512781856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent voice assistants, such as Apple Siri and Amazon Alexa, are widely
used nowadays. These task-oriented dialog systems require a semantic parsing
module in order to process user utterances and understand the action to be
performed. This semantic parsing component was initially implemented by
rule-based or statistical slot-filling approaches for processing simple
queries; however, the appearance of more complex utterances demanded the
application of shift-reduce parsers or sequence-to-sequence models. While
shift-reduce approaches initially proved to be the best option, recent
advances in sequence-to-sequence systems have made the latter the
highest-performing method for this task. In this article, we advance the
research on shift-reduce semantic parsing for task-oriented dialog. In
particular, we implement novel shift-reduce parsers that rely on
Stack-Transformers. These allow transition systems to be adequately modeled
on the cutting-edge Transformer architecture, notably boosting shift-reduce parsing
performance. Additionally, we adapt alternative transition systems from
constituency parsing to task-oriented parsing, and empirically prove that the
in-order algorithm substantially outperforms the commonly-used top-down
strategy. Finally, we extensively test our approach on multiple domains from
the Facebook TOP benchmark, improving over existing shift-reduce parsers and
state-of-the-art sequence-to-sequence models in both high-resource and
low-resource settings.
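As a rough illustration of the kind of transition system discussed in the abstract, the following is a minimal sketch, not the author's implementation; the action names and the TOP-style intent/slot labels in the example are assumptions. A top-down shift-reduce parser builds the tree with three actions: NT(label) opens an intent or slot non-terminal, SHIFT consumes the next input token, and REDUCE closes the most recently opened non-terminal.

```python
# Minimal sketch (assumption, not the paper's code): a top-down shift-reduce
# transition system that builds a TOP-style tree from a token sequence.

def parse_top_down(tokens, actions):
    """Apply a top-down action sequence and return a bracketed tree string."""
    stack = []                # holds open non-terminals and finished subtrees
    buffer = list(tokens)     # remaining input tokens
    for action in actions:
        if action.startswith("NT("):                 # e.g. "NT(IN:GET_WEATHER)"
            stack.append(("OPEN", action[3:-1]))
        elif action == "SHIFT":
            stack.append(("TOKEN", buffer.pop(0)))
        elif action == "REDUCE":
            children = []
            while stack and stack[-1][0] != "OPEN":
                children.append(stack.pop())
            label = stack.pop()[1]
            subtree = "[" + label + " " + " ".join(
                c[1] for c in reversed(children)) + " ]"
            stack.append(("TREE", subtree))
        else:
            raise ValueError(f"unknown action: {action}")
    assert len(stack) == 1 and not buffer, "action sequence is not well formed"
    return stack[0][1]

# Hypothetical example in the style of the Facebook TOP annotation:
tokens = "weather in Boston".split()
actions = ["NT(IN:GET_WEATHER)", "SHIFT", "SHIFT",
           "NT(SL:LOCATION)", "SHIFT", "REDUCE", "REDUCE"]
print(parse_top_down(tokens, actions))
# -> [IN:GET_WEATHER weather in [SL:LOCATION Boston ] ]
```

In the in-order variant that the article finds to work better, a non-terminal is predicted only after its first child has been completed; for the same example the action sequence would look like SHIFT, NT(IN:GET_WEATHER), SHIFT, SHIFT, NT(SL:LOCATION), REDUCE, REDUCE. The sketch above handles only the top-down order.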
Related papers
- MASSFormer: Mobility-Aware Spectrum Sensing using Transformer-Driven Tiered Structure [3.6194127685460553]
We develop a cooperative sensing method based on a mobility-aware transformer-driven tiered structure (MASSFormer).
Our method considers a dynamic scenario involving mobile primary users (PUs) and secondary users (SUs).
The proposed method is tested under imperfect reporting channel scenarios to show robustness.
arXiv Detail & Related papers (2024-09-26T05:25:25Z)
- Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Hierarchical Decision Transformer [0.0]
This paper presents a hierarchical algorithm for learning a sequence model from demonstrations.
The high-level mechanism guides the low-level controller through the task by selecting sub-goals for the latter to reach.
We validate our method in multiple tasks of the OpenAI Gym, D4RL and RoboMimic benchmarks.
arXiv Detail & Related papers (2022-09-21T15:48:40Z)
- Efficient Long Sequence Encoding via Synchronization [29.075962393432857]
We propose a synchronization mechanism for hierarchical encoding.
Our approach first identifies anchor tokens across segments and groups them by their roles in the original input sequence.
Our approach is able to improve the global information exchange among segments while maintaining efficiency.
arXiv Detail & Related papers (2022-03-15T04:37:02Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Dependency Parsing with Bottom-up Hierarchical Pointer Networks [0.7412445894287709]
Left-to-right and top-down transition-based algorithms are among the most accurate approaches for performing dependency parsing.
We propose two novel transition-based alternatives: an approach that parses a sentence in right-to-left order and a variant that does it from the outside in.
We empirically test the proposed neural architecture with the different algorithms on a wide variety of languages, outperforming the original approach in practically all of them.
arXiv Detail & Related papers (2021-05-20T09:10:42Z)
- Few-shot Sequence Learning with Transformers [79.87875859408955]
Few-shot algorithms aim at learning new tasks provided only a handful of training examples.
In this work we investigate few-shot learning in the setting where the data points are sequences of tokens.
We propose an efficient learning algorithm based on Transformers.
arXiv Detail & Related papers (2020-12-17T12:30:38Z)
- Transition-based Parsing with Stack-Transformers [32.029528327212795]
Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state.
We show that modifications of the cross attention mechanism of the Transformer considerably strengthen performance on both dependency and Abstract Meaning Representation (AMR) parsing (see the sketch after this list).
arXiv Detail & Related papers (2020-10-20T23:20:31Z)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, and top-down and bottom-up feedback in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)
- Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching [87.97265483696613]
We propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels.
Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks.
arXiv Detail & Related papers (2020-01-20T20:02:02Z)
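The Stack-Transformers entry above describes modifying the Transformer's cross attention so that the parser stack and buffer are explicitly modeled. Below is a minimal, hypothetical sketch of that masking idea; the function name, head split, and NumPy representation are assumptions, not the paper's code. At each decoding step one group of heads is restricted to source positions currently on the stack, another to positions still in the buffer, and the remaining heads attend freely.

```python
import numpy as np

def stack_buffer_attention_masks(num_tokens, stack_positions,
                                 n_heads=8, n_stack_heads=1, n_buffer_heads=1):
    """Boolean masks of shape (n_heads, num_tokens); True = position may be attended to.

    Recomputed at every decoding step, since each SHIFT/REDUCE action moves
    tokens between the buffer and the stack.
    """
    on_stack = np.zeros(num_tokens, dtype=bool)
    on_stack[list(stack_positions)] = True
    in_buffer = ~on_stack                      # simplification: not on the stack => still in the buffer

    masks = np.ones((n_heads, num_tokens), dtype=bool)               # unconstrained heads
    masks[:n_stack_heads] = on_stack                                  # stack-only heads
    masks[n_stack_heads:n_stack_heads + n_buffer_heads] = in_buffer   # buffer-only heads
    return masks

# Example: 5 source tokens, tokens 0 and 1 already shifted onto the stack.
masks = stack_buffer_attention_masks(5, stack_positions=[0, 1], n_heads=4)
print(masks.astype(int))
```

In an actual model these boolean masks would be applied to the cross-attention logits (e.g. as additive -inf terms before the softmax); here they are only illustrated as arrays.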
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.