A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding
- URL: http://arxiv.org/abs/2409.15861v1
- Date: Tue, 24 Sep 2024 08:33:41 GMT
- Title: A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding
- Authors: Abdulfattah Safa, Gözde Gül Şahin
- Abstract summary: We propose a zero-shot, open-vocabulary system that integrates domain classification and Dialogue State Tracking (DST) in a single pipeline.
Our approach includes reformulating DST as a question-answering task for less capable models and employing self-refining prompts for more adaptable ones.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Dialogue State Tracking (DST) is crucial for understanding user needs and executing appropriate system actions in task-oriented dialogues. The majority of existing DST methods are designed to work within predefined ontologies and assume the availability of gold domain labels, so they struggle to adapt to new slot values. While Large Language Model (LLM)-based systems show promising zero-shot DST performance, they either require extensive computational resources or underperform existing fully-trained systems, which limits their practicality. To address these limitations, we propose a zero-shot, open-vocabulary system that integrates domain classification and DST in a single pipeline. Our approach includes reformulating DST as a question-answering task for less capable models and employing self-refining prompts for more adaptable ones. Our system does not rely on fixed slot values defined in the ontology, allowing it to adapt dynamically. We compare our approach with existing state-of-the-art (SOTA) methods and show that it provides up to 20% better Joint Goal Accuracy (JGA) over previous methods on datasets like MultiWOZ 2.1, with up to 90% fewer requests to the LLM API.
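The abstract reports results in Joint Goal Accuracy (JGA), which counts a dialogue turn as correct only when every predicted slot-value pair matches the gold state for that turn. The following is a minimal sketch of that metric, not the authors' evaluation script: the slot names, the lowercase/whitespace normalization, and the function names are assumptions made for illustration.

```python
from typing import Dict, List

# A dialogue state maps slot names to values, e.g. {"hotel-area": "north"}.
# The slot names used below are hypothetical examples.
DialogueState = Dict[str, str]


def normalize(state: DialogueState) -> DialogueState:
    """Lowercase and strip slots and values; drop empty values (an assumed convention)."""
    return {
        slot.strip().lower(): value.strip().lower()
        for slot, value in state.items()
        if value.strip()
    }


def joint_goal_accuracy(
    predictions: List[DialogueState], references: List[DialogueState]
) -> float:
    """Fraction of turns whose full predicted state equals the gold state."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(gold)
        for pred, gold in zip(predictions, references)
    )
    return correct / len(references)


# Three turns of one dialogue: the second turn has a wrong value for one slot.
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
pred = [
    {"hotel-area": "North"},
    {"hotel-area": "north", "hotel-stars": "3"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]

# Two of the three turns match exactly, so the score is 2/3.
print(joint_goal_accuracy(pred, gold))
```

Because a single wrong slot makes the whole turn incorrect, JGA is a strict metric, and published evaluations may apply different value normalization than the one assumed here.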
Related papers
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z)
- UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking [54.51316566989655]
Previous zero-shot dialogue state tracking (DST) methods only apply transfer learning, ignoring unlabelled data in the target domain.
We transform zero-shot DST into few-shot DST by utilising such unlabelled data via joint and self-training methods.
We demonstrate this method's effectiveness on general language models in zero-shot scenarios, improving average joint goal accuracy by 8% across all domains in MultiWOZ.
arXiv Detail & Related papers (2023-10-16T15:16:16Z)
- Task-Optimized Adapters for an End-to-End Task-Oriented Dialogue System [0.0]
We propose an end-to-end TOD system with Task-Optimized Adapters, which learn independently per task and add only a small number of parameters after the fixed layers of a pre-trained network.
Our method is model-agnostic and does not require prompt-tuning, since it takes only the input data without a prompt.
arXiv Detail & Related papers (2023-05-04T00:17:49Z)
- Choice Fusion as Knowledge for Zero-Shot Dialogue State Tracking [5.691339955497443]
Zero-shot dialogue state tracking (DST) tracks the user's requirements in task-oriented dialogues without training on the desired domains.
We propose CoFunDST, which is trained on domain-agnostic QA datasets and directly uses candidate choices of slot-values as knowledge for zero-shot dialogue-state generation.
Our proposed model achieves higher joint goal accuracy than existing zero-shot DST approaches in most domains on MultiWOZ 2.1.
arXiv Detail & Related papers (2023-02-25T07:32:04Z)
- Prompt Learning for Few-Shot Dialogue State Tracking [75.50701890035154]
This paper focuses on how to learn a dialogue state tracking (DST) model efficiently with limited labeled data.
We design a prompt learning framework for few-shot DST, which consists of two main components: value-based prompt and inverse prompt mechanism.
Experiments show that our model can generate unseen slots and outperforms existing state-of-the-art few-shot methods.
arXiv Detail & Related papers (2022-01-15T07:37:33Z)
- Zero-Shot Dialogue State Tracking via Cross-Task Transfer [69.70718906395182]
We propose to transfer the cross-task knowledge from general question answering (QA) corpora for the zero-shot dialogue state tracking task.
Specifically, we propose TransferQA, a transferable generative QA model that seamlessly combines extractive QA and multi-choice QA.
In addition, we introduce two effective ways to construct unanswerable questions, namely, negative question sampling and context truncation.
arXiv Detail & Related papers (2021-09-10T03:57:56Z)
- Zero-shot Generalization in Dialog State Tracking through Generative Question Answering [10.81203437307028]
We introduce a novel framework that supports natural language queries for unseen constraints and slots in task-oriented dialogs.
Our approach is based on generative question-answering using a conditional language model pre-trained on substantive English sentences.
arXiv Detail & Related papers (2021-01-20T21:47:20Z)
- Few Shot Dialogue State Tracking using Meta-learning [3.6292310166028403]
Dialogue State Tracking (DST) forms a core component of automated systems designed for specific goals like hotel, taxi reservation, tourist information, etc.
With the increasing need to deploy such systems in new domains, solving the problem of zero/few-shot DST has become necessary.
Our proposed meta-learner is agnostic of the underlying model and hence any existing state-of-the-art DST system can improve its performance on unknown domains using our training strategy.
arXiv Detail & Related papers (2021-01-17T20:47:06Z)
- Few-shot Natural Language Generation for Task-Oriented Dialog [113.07438787659859]
We present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems.
We develop the SC-GPT model, which is pre-trained on a large annotated NLG corpus to acquire the controllable generation ability.
Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods.
arXiv Detail & Related papers (2020-02-27T18:48:33Z)
- Non-Autoregressive Dialog State Tracking [122.2328875457225]
We propose a novel framework of Non-Autoregressive Dialog State Tracking (NADST).
NADST can factor in potential dependencies among domains and slots to optimize the models towards better prediction of dialogue states as a complete set rather than separate slots.
Our results show that our model achieves the state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus.
arXiv Detail & Related papers (2020-02-19T06:39:26Z)
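
The main paper and several entries in this list recast DST as question answering: each slot becomes a natural-language question asked against the dialogue so far. The sketch below illustrates that general idea only. The question templates, the "none" convention for unanswered slots, and the function names are hypothetical and are not taken from any of the listed papers, and no model call is included.

```python
from typing import Dict, List

# Hypothetical slot-to-question templates; real systems define their own.
SLOT_QUESTIONS: Dict[str, str] = {
    "hotel-area": "In which area does the user want the hotel?",
    "hotel-stars": "How many stars should the hotel have?",
}

# Assumed sentinel a model returns when the dialogue does not answer the question.
NO_ANSWER = "none"


def build_qa_prompt(history: List[str], slot: str) -> str:
    """Turn one slot into a QA-style prompt over the dialogue history."""
    context = "\n".join(history)
    return (
        f"Dialogue:\n{context}\n\n"
        f"Question: {SLOT_QUESTIONS[slot]}\n"
        f"Answer with a short phrase from the dialogue, "
        f"or '{NO_ANSWER}' if the user has not said.\n"
        "Answer:"
    )


def update_state(state: Dict[str, str], slot: str, answer: str) -> Dict[str, str]:
    """Return a copy of the state with the slot filled, unless the answer is empty or 'none'."""
    cleaned = answer.strip().lower()
    new_state = dict(state)
    if cleaned and cleaned != NO_ANSWER:
        new_state[slot] = cleaned
    return new_state


history = ["User: I need a hotel in the north of town."]
prompt = build_qa_prompt(history, "hotel-area")

# The answers below stand in for model outputs.
state = update_state({}, "hotel-area", " North ")
state = update_state(state, "hotel-stars", "none")
print(state)
```

Because the value is taken from the answer text rather than picked from a fixed list, this formulation does not depend on an ontology of allowed slot values.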