Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers
- URL: http://arxiv.org/abs/2210.11984v1
- Date: Fri, 21 Oct 2022 14:19:47 GMT
- Title: Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers
- Authors: Daniel Fernández-González
- Abstract summary: Task-oriented dialog systems, such as Apple Siri and Amazon Alexa, require a semantic parsing module in order to process user utterances and understand the action to be performed.
We advance the research on shift-reduce semantic parsing for task-oriented dialog.
In particular, we implement novel shift-reduce parsers that rely on Stack-Transformers.
- Score: 0.40611352512781856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent voice assistants, such as Apple Siri and Amazon Alexa, are widely
used nowadays. These task-oriented dialog systems require a semantic parsing
module in order to process user utterances and understand the action to be
performed. This semantic parsing component was initially implemented by
rule-based or statistical slot-filling approaches for processing simple
queries; however, the appearance of more complex utterances demanded the
application of shift-reduce parsers or sequence-to-sequence models. While
shift-reduce approaches initially proved to be the best option, recent
advances in sequence-to-sequence systems have made the latter the
highest-performing method for this task. In this article, we advance the
research on shift-reduce semantic parsing for task-oriented dialog. In
particular, we implement novel shift-reduce parsers that rely on
Stack-Transformers. These allow transition systems to be adequately modeled
on the cutting-edge Transformer architecture, notably boosting shift-reduce parsing
performance. Additionally, we adapt alternative transition systems from
constituency parsing to task-oriented parsing, and empirically prove that the
in-order algorithm substantially outperforms the commonly-used top-down
strategy. Finally, we extensively test our approach on multiple domains from
the Facebook TOP benchmark, improving over existing shift-reduce parsers and
state-of-the-art sequence-to-sequence models in both high-resource and
low-resource settings.
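As a rough illustration of the kind of transition system discussed in the abstract, the following is a minimal sketch, not the author's implementation; the action names and the TOP-style intent/slot labels in the example are assumptions. A top-down shift-reduce parser builds the tree with three actions: NT(label) opens an intent or slot non-terminal, SHIFT consumes the next input token, and REDUCE closes the most recently opened non-terminal.

```python
# Minimal sketch (assumption, not the paper's code): a top-down shift-reduce
# transition system that builds a TOP-style tree from a token sequence.

def parse_top_down(tokens, actions):
    """Apply a top-down action sequence and return a bracketed tree string."""
    stack = []                # holds open non-terminals and finished subtrees
    buffer = list(tokens)     # remaining input tokens
    for action in actions:
        if action.startswith("NT("):                 # e.g. "NT(IN:GET_WEATHER)"
            stack.append(("OPEN", action[3:-1]))
        elif action == "SHIFT":
            stack.append(("TOKEN", buffer.pop(0)))
        elif action == "REDUCE":
            children = []
            while stack and stack[-1][0] != "OPEN":
                children.append(stack.pop())
            label = stack.pop()[1]
            subtree = "[" + label + " " + " ".join(
                c[1] for c in reversed(children)) + " ]"
            stack.append(("TREE", subtree))
        else:
            raise ValueError(f"unknown action: {action}")
    assert len(stack) == 1 and not buffer, "action sequence is not well formed"
    return stack[0][1]

# Hypothetical example in the style of the Facebook TOP annotation:
tokens = "weather in Boston".split()
actions = ["NT(IN:GET_WEATHER)", "SHIFT", "SHIFT",
           "NT(SL:LOCATION)", "SHIFT", "REDUCE", "REDUCE"]
print(parse_top_down(tokens, actions))
# -> [IN:GET_WEATHER weather in [SL:LOCATION Boston ] ]
```

In the in-order variant that the article finds to work better, a non-terminal is predicted only after its first child has been completed; for the same example the action sequence would look like SHIFT, NT(IN:GET_WEATHER), SHIFT, SHIFT, NT(SL:LOCATION), REDUCE, REDUCE. The sketch above handles only the top-down order.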
Related papers
- MASSFormer: Mobility-Aware Spectrum Sensing using Transformer-Driven Tiered Structure [3.6194127685460553]
We develop a cooperative sensing method based on a mobility-aware transformer-driven tiered structure (MASSFormer).
Our method considers a dynamic scenario involving mobile primary users (PUs) and secondary users (SUs).
The proposed method is tested under imperfect reporting channel scenarios to show robustness.
arXiv Detail & Related papers (2024-09-26T05:25:25Z)
- Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Hierarchical Decision Transformer [0.0]
This paper presents a hierarchical algorithm for learning a sequence model from demonstrations.
The high-level mechanism guides the low-level controller through the task by selecting sub-goals for the latter to reach.
We validate our method in multiple tasks of the OpenAI Gym, D4RL and RoboMimic benchmarks.
arXiv Detail & Related papers (2022-09-21T15:48:40Z)
- Efficient Long Sequence Encoding via Synchronization [29.075962393432857]
We propose a synchronization mechanism for hierarchical encoding.
Our approach first identifies anchor tokens across segments and groups them by their roles in the original input sequence.
Our approach is able to improve the global information exchange among segments while maintaining efficiency.
arXiv Detail & Related papers (2022-03-15T04:37:02Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Dependency Parsing with Bottom-up Hierarchical Pointer Networks [0.7412445894287709]
Left-to-right and top-down transition-based algorithms are among the most accurate approaches for performing dependency parsing.
We propose two novel transition-based alternatives: an approach that parses a sentence in right-to-left order and a variant that does it from the outside in.
We empirically test the proposed neural architecture with the different algorithms on a wide variety of languages, outperforming the original approach in practically all of them.
arXiv Detail & Related papers (2021-05-20T09:10:42Z)
- Few-shot Sequence Learning with Transformers [79.87875859408955]
Few-shot algorithms aim at learning new tasks provided only a handful of training examples.
In this work we investigate few-shot learning in the setting where the data points are sequences of tokens.
We propose an efficient learning algorithm based on Transformers.
arXiv Detail & Related papers (2020-12-17T12:30:38Z)
- Transition-based Parsing with Stack-Transformers [32.029528327212795]
Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state.
We show that modifications of the cross attention mechanism of the Transformer considerably strengthen performance on both dependency and Abstract Meaning Representation (AMR) parsing (see the sketch after this list).
arXiv Detail & Related papers (2020-10-20T23:20:31Z)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, and top-down and bottom-up feedback in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)
- Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching [87.97265483696613]
We propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels.
Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks.
arXiv Detail & Related papers (2020-01-20T20:02:02Z)
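The Stack-Transformers entry above describes modifying the Transformer's cross attention so that the parser stack and buffer are explicitly modeled. Below is a minimal, hypothetical sketch of that masking idea; the function name, head split, and NumPy representation are assumptions, not the paper's code. At each decoding step one group of heads is restricted to source positions currently on the stack, another to positions still in the buffer, and the remaining heads attend freely.

```python
import numpy as np

def stack_buffer_attention_masks(num_tokens, stack_positions,
                                 n_heads=8, n_stack_heads=1, n_buffer_heads=1):
    """Boolean masks of shape (n_heads, num_tokens); True = position may be attended to.

    Recomputed at every decoding step, since each SHIFT/REDUCE action moves
    tokens between the buffer and the stack.
    """
    on_stack = np.zeros(num_tokens, dtype=bool)
    on_stack[list(stack_positions)] = True
    in_buffer = ~on_stack                      # simplification: not on the stack => still in the buffer

    masks = np.ones((n_heads, num_tokens), dtype=bool)               # unconstrained heads
    masks[:n_stack_heads] = on_stack                                  # stack-only heads
    masks[n_stack_heads:n_stack_heads + n_buffer_heads] = in_buffer   # buffer-only heads
    return masks

# Example: 5 source tokens, tokens 0 and 1 already shifted onto the stack.
masks = stack_buffer_attention_masks(5, stack_positions=[0, 1], n_heads=4)
print(masks.astype(int))
```

In an actual model these boolean masks would be applied to the cross-attention logits (e.g. as additive -inf terms before the softmax); here they are only illustrated as arrays.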
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.