A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking
- URL: http://arxiv.org/abs/2207.00828v1
- Date: Sat, 2 Jul 2022 13:27:59 GMT
- Title: A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking
- Authors: Eleftherios Kapelonis, Efthymios Georgiou, Alexandros Potamianos
- Abstract summary: Task-oriented dialogue systems often employ a Dialogue State Tracker (DST) to successfully complete conversations.
Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robustness.
We propose a single multi-task BERT-based model that jointly solves the three DST tasks of intent prediction, requested slot prediction and slot filling.
- Score: 78.2700757742992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Task-oriented dialogue systems often employ a Dialogue State Tracker (DST) to
successfully complete conversations. Recent state-of-the-art DST
implementations rely on schemata of diverse services to improve model
robustness and handle zero-shot generalization to new domains [1]; however, such
methods [2, 3] typically require multiple large-scale transformer models and
long input sequences to perform well. We propose a single multi-task BERT-based
model that jointly solves the three DST tasks of intent prediction, requested
slot prediction and slot filling. Moreover, we propose an efficient and
parsimonious encoding of the dialogue history and service schemata that is
shown to further improve performance. Evaluation on the SGD dataset shows that
our approach outperforms the baseline SGP-DST by a large margin and performs
well compared to the state-of-the-art, while being significantly more
computationally efficient. Extensive ablation studies are performed to examine
the contributing factors to the success of our model.
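To make the three-task setup concrete, below is a minimal sketch of how a single shared BERT encoder could feed three task-specific heads. This is an illustrative reconstruction, not the authors' released code: the class name MultiTaskDST, the extractive span formulation of slot filling, and the use of the [CLS] vector for the two sequence-level tasks are all assumptions.

```python
# Hedged sketch of a multi-task BERT DST model: one shared encoder, three heads
# for intent prediction, requested-slot prediction, and slot filling.
# Head names and the span-based slot-filling formulation are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel

class MultiTaskDST(nn.Module):
    def __init__(self, num_intents: int, num_slots: int,
                 model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Intent prediction: multi-class choice over the service's intents.
        self.intent_head = nn.Linear(hidden, num_intents)
        # Requested slots: an independent yes/no decision per schema slot.
        self.requested_head = nn.Linear(hidden, num_slots)
        # Slot filling, framed extractively: start/end logits per token.
        self.span_head = nn.Linear(hidden, 2)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state          # [batch, seq_len, hidden]
        cls = tokens[:, 0]                      # [CLS] summary vector
        intent_logits = self.intent_head(cls)          # [batch, num_intents]
        requested_logits = self.requested_head(cls)    # [batch, num_slots]
        start_logits, end_logits = self.span_head(tokens).unbind(dim=-1)
        return intent_logits, requested_logits, start_logits, end_logits
```

In a setup like this, the per-task losses (cross-entropy for intents, binary cross-entropy for requested slots, start/end cross-entropy for spans) would simply be summed, so one forward pass per input serves all three tasks; the paper's parsimonious encoding of dialogue history and service schema would govern how input_ids is constructed.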
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z) - A Large-Scale Evaluation of Speech Foundation Models [110.95827399522204]
We establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the foundation model paradigm for speech.
We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads.
arXiv Detail & Related papers (2024-04-15T00:03:16Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structured generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - Span-Selective Linear Attention Transformers for Effective and Robust
Schema-Guided Dialogue State Tracking [7.176787451868171]
We introduce SPLAT, a novel architecture which achieves better generalization and efficiency than prior approaches.
We demonstrate the effectiveness of our model on the Schema-Guided Dialogue (SGD) and MultiWOZ datasets.
arXiv Detail & Related papers (2023-06-15T17:59:31Z) - DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context
Tuning [7.5700317050237365]
We propose DiSTRICT, a generalizable in-context tuning approach for Dialogue State Tracking (DST).
DiSTRICT retrieves highly relevant training examples for a given dialogue to fine-tune the model without any hand-crafted templates.
Experiments with the MultiWOZ benchmark datasets show that DiSTRICT outperforms existing approaches in various zero-shot and few-shot settings.
arXiv Detail & Related papers (2022-12-06T09:40:15Z) - Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System [26.837972034630003]
PPTOD is a unified plug-and-play model for task-oriented dialogue.
We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification.
arXiv Detail & Related papers (2021-09-29T22:02:18Z) - SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services [15.21976869687864]
We propose SGD-QA, a model for schema-guided dialogue state tracking based on a question answering approach.
The proposed multi-pass model shares a single encoder between the domain information and dialogue utterance.
The model improves performance on unseen services by at least 1.6x compared to single-pass baseline models.
arXiv Detail & Related papers (2021-05-17T17:54:32Z) - RADDLE: An Evaluation Benchmark and Analysis Platform for Robust
Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z) - A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided
Dialogue Dataset [8.990035371365408]
We introduce FastSGT, a fast and robust BERT-based model for state tracking in goal-oriented dialogue systems.
The proposed model is designed for the Schema-Guided Dialogue dataset, which contains natural language descriptions.
Our model keeps the efficiency in terms of computational and memory consumption while improving the accuracy significantly.
arXiv Detail & Related papers (2020-08-27T18:51:18Z) - Non-Autoregressive Dialog State Tracking [122.2328875457225]
We propose a novel framework of Non-Autoregressive Dialog State Tracking (NADST).
NADST can factor in potential dependencies among domains and slots to optimize the model towards better prediction of dialogue states as a complete set rather than as separate slots (see the sketch after this list).
Our results show that our model achieves the state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus.
arXiv Detail & Related papers (2020-02-19T06:39:26Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.