Span-Selective Linear Attention Transformers for Effective and Robust
Schema-Guided Dialogue State Tracking
- URL: http://arxiv.org/abs/2306.09340v1
- Date: Thu, 15 Jun 2023 17:59:31 GMT
- Title: Span-Selective Linear Attention Transformers for Effective and Robust
Schema-Guided Dialogue State Tracking
- Authors: Björn Bebensee, Haejun Lee
- Abstract summary: We introduce SPLAT, a novel architecture which achieves better generalization and efficiency than prior approaches.
We demonstrate the effectiveness of our model on the Schema-Guided Dialogue (SGD) and MultiWOZ datasets.
- Score: 7.176787451868171
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In schema-guided dialogue state tracking, models estimate the current state of
a conversation using natural language descriptions of the service schema for
generalization to unseen services. Prior generative approaches which decode
slot values sequentially do not generalize well to variations in schema, while
discriminative approaches separately encode history and schema and fail to
account for inter-slot and intent-slot dependencies. We introduce SPLAT, a
novel architecture which achieves better generalization and efficiency than
prior approaches by constraining outputs to a limited prediction space. At the
same time, our model allows for rich attention among descriptions and history
while keeping computation costs constrained by incorporating linear-time
attention. We demonstrate the effectiveness of our model on the Schema-Guided
Dialogue (SGD) and MultiWOZ datasets. Our approach significantly improves upon
existing models achieving 85.3 JGA on the SGD dataset. Further, we show
increased robustness on the SGD-X benchmark: our model outperforms the more
than 30$\times$ larger D3ST-XXL model by 5.0 points.
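The abstract's key efficiency claim rests on linear-time attention: rather than forming the full n-by-n softmax attention matrix, a kernel feature map lets the key-value product be computed first, reducing cost from quadratic to linear in sequence length. SPLAT's exact formulation is not given here; the sketch below uses the common elu+1 feature map as an illustrative assumption, not the paper's specific design.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Linear-time attention via the kernel trick (illustrative sketch)."""
    # Feature map phi(x) = elu(x) + 1 keeps features positive; this is a
    # common choice in linear attention work, assumed here for illustration.
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    Qp, Kp = phi(Q), phi(K)
    # Associativity trick: compute phi(K)^T V first, an O(n * d * d_v) cost,
    # instead of the O(n^2 * d) cost of materializing softmax(Q K^T).
    KV = Kp.T @ V                 # (d, d_v)
    Z = Qp @ Kp.sum(axis=0)       # (n,) per-row normalizer
    return (Qp @ KV) / (Z[:, None] + eps)

n, d = 128, 16
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (128, 16)
```

Because the n-by-n attention matrix is never materialized, memory and compute stay linear in n, which is what allows rich attention between schema descriptions and dialogue history at constrained cost.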
Related papers
- Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment [104.18002641195442]
We introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing paired data.
Building on the self-play concept, which autonomously generates negative responses, we further incorporate an off-policy learning pipeline to enhance data exploration and exploitation.
arXiv Detail & Related papers (2024-05-31T14:21:04Z) - TOD-Flow: Modeling the Structure of Task-Oriented Dialogues [77.15457469745364]
We propose a novel approach focusing on inferring the TOD-Flow graph from dialogue data annotated with dialog acts.
The inferred TOD-Flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability.
arXiv Detail & Related papers (2023-12-07T20:06:23Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structural generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - More Robust Schema-Guided Dialogue State Tracking via Tree-Based
Paraphrase Ranking [0.0]
Fine-tuned language models excel at schema-guided dialogue state tracking (DST).
We propose a framework for generating synthetic schemas which uses tree-based ranking to jointly optimise diversity and semantic faithfulness.
arXiv Detail & Related papers (2023-03-17T11:43:08Z) - A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking [78.2700757742992]
Task-oriented dialogue systems often employ a Dialogue State Tracker (DST) to successfully complete conversations.
Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robustness.
We propose a single multi-task BERT-based model that jointly solves the three DST tasks of intent prediction, requested slot prediction and slot filling.
arXiv Detail & Related papers (2022-07-02T13:27:59Z) - Long Range Language Modeling via Gated State Spaces [67.64091993846269]
We focus on autoregressive sequence modeling over English books, Github source code and ArXiv mathematics articles.
We propose a new layer named Gated State Space (GSS) and show that it trains significantly faster than the diagonal version of S4.
arXiv Detail & Related papers (2022-06-27T01:50:18Z) - SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue
Systems [26.14268488547028]
We release SGD-X, a benchmark for measuring robustness of dialogue systems to linguistic variations in schemas.
We evaluate two dialogue state tracking models on SGD-X and observe that neither generalizes well across schema variations.
We present a simple model-agnostic data augmentation method to improve schema robustness and zero-shot generalization to unseen services.
arXiv Detail & Related papers (2021-10-13T15:38:29Z) - SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services [15.21976869687864]
We propose SGD-QA, a model for schema-guided dialogue state tracking based on a question answering approach.
The proposed multi-pass model shares a single encoder between the domain information and dialogue utterance.
The model improves performance on unseen services by at least 1.6x compared to single-pass baseline models.
arXiv Detail & Related papers (2021-05-17T17:54:32Z) - Autoregressive Dynamics Models for Offline Policy Evaluation and
Optimization [60.73540999409032]
We show that expressive autoregressive dynamics models generate different dimensions of the next state and reward sequentially conditioned on previous dimensions.
We also show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer.
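The dynamics model described above generates the next state one dimension at a time, with each dimension conditioned on the current state, the action, and the dimensions already produced. A minimal sketch of that generation loop, with a hypothetical stand-in predictor in place of the learned per-dimension model:

```python
import numpy as np

def autoregressive_next_state(state, action, predict_dim, state_dim):
    """Generate the next state dimension by dimension (illustrative sketch).

    predict_dim(i, context) stands in for a learned conditional model of
    dimension i given (state, action, next_state[:i]); it is a hypothetical
    interface, not the paper's actual architecture.
    """
    generated = []
    for i in range(state_dim):
        # Condition on everything produced so far, plus state and action.
        context = np.concatenate([state, action, np.array(generated)])
        generated.append(predict_dim(i, context))
    return np.array(generated)

# Toy stand-in predictor: returns the mean of its context (assumption).
pred = lambda i, ctx: float(ctx.mean())
s = np.array([0.1, -0.2, 0.3])
a = np.array([1.0])
ns = autoregressive_next_state(s, a, pred, state_dim=3)
print(ns.shape)  # (3,)
```

Sampling dimensions sequentially this way lets the model capture dependencies between state dimensions that a factorized (independent-dimension) model would miss.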
arXiv Detail & Related papers (2021-04-28T16:48:44Z) - A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided
Dialogue Dataset [8.990035371365408]
We introduce FastSGT, a fast and robust BERT-based model for state tracking in goal-oriented dialogue systems.
The proposed model is designed for the Schema-Guided Dialogue dataset, which contains natural language descriptions.
Our model keeps the efficiency in terms of computational and memory consumption while improving the accuracy significantly.
arXiv Detail & Related papers (2020-08-27T18:51:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.