Semantic Parsing by Large Language Models for Intricate Updating
Strategies of Zero-Shot Dialogue State Tracking
- URL: http://arxiv.org/abs/2310.10520v3
- Date: Sat, 25 Nov 2023 02:09:35 GMT
- Title: Semantic Parsing by Large Language Models for Intricate Updating
Strategies of Zero-Shot Dialogue State Tracking
- Authors: Yuxiang Wu, Guanting Dong, Weiran Xu
- Abstract summary: Zero-shot Dialogue State Tracking (DST) addresses the challenge of acquiring and annotating task-oriented dialogues.
We propose ParsingDST, a new In-Context Learning (ICL) method, to introduce additional intricate updating strategies in zero-shot DST.
Our approach reformulates the DST task by leveraging powerful Large Language Models (LLMs) and translating the original dialogue text to JSON through semantic parsing as an intermediate state.
- Score: 25.286077416235784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot Dialogue State Tracking (DST) addresses the challenge of acquiring
and annotating task-oriented dialogues, which can be time-consuming and costly.
However, DST extends beyond simple slot-filling and requires effective updating
strategies for tracking dialogue state as conversations progress. In this
paper, we propose ParsingDST, a new In-Context Learning (ICL) method, to
introduce additional intricate updating strategies in zero-shot DST. Our
approach reformulates the DST task by leveraging powerful Large Language Models
(LLMs) and translating the original dialogue text to JSON through semantic
parsing as an intermediate state. We also design a novel framework that
includes more modules to ensure the effectiveness of updating strategies in the
text-to-JSON process. Experimental results demonstrate that our approach
outperforms existing zero-shot DST methods on MultiWOZ, exhibiting significant
improvements in Joint Goal Accuracy (JGA) and slot accuracy compared to
existing ICL methods. Our code has been released.
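The text-to-JSON intermediate representation described in the abstract invites a small illustration. Below is a minimal Python sketch of how such a pipeline could look, assuming the LLM is prompted to emit turn-level state changes as JSON and a separate merge step applies them to the accumulated dialogue state; the prompt wording, the update/delete operation schema, and all function names are illustrative assumptions rather than the paper's actual framework or modules.

```python
# Minimal sketch of the text-to-JSON idea, NOT the authors' implementation:
# the prompt format, the "update"/"delete" schema, and the helper names
# below are illustrative assumptions.
import json


def build_parsing_prompt(turn: str, exemplars: list[str]) -> str:
    """Assemble an in-context prompt asking an LLM to emit the turn's
    state changes as a JSON object rather than free-form text."""
    instructions = (
        "Translate the user turn into a JSON object of state changes.\n"
        'Use the form {"update": {"domain-slot": "value"}, "delete": ["domain-slot"]}.\n'
    )
    return instructions + "\n".join(exemplars) + f"\nTurn: {turn}\nJSON:"


def apply_update(state: dict, turn_json: str) -> dict:
    """Merge the parsed turn-level JSON into the accumulated dialogue state,
    applying explicit updating strategies instead of naive overwriting."""
    changes = json.loads(turn_json)
    new_state = dict(state)
    for slot, value in changes.get("update", {}).items():
        new_state[slot] = value      # add a new slot value or revise an old one
    for slot in changes.get("delete", []):
        new_state.pop(slot, None)    # the user retracted this constraint
    return new_state


state = {"hotel-area": "centre", "hotel-parking": "yes"}
prompt = build_parsing_prompt(
    "Actually somewhere in the north, and I don't care about parking.",
    exemplars=[],  # a real run would include a few annotated text-to-JSON exemplars
)
# In practice `prompt` would be sent to an LLM; here its JSON reply is hard-coded.
llm_reply = '{"update": {"hotel-area": "north"}, "delete": ["hotel-parking"]}'
print(apply_update(state, llm_reply))  # {'hotel-area': 'north'}
```

Treating the LLM output as structured operations rather than a fully restated state is what makes explicit updating strategies, such as revising or retracting earlier slot values, straightforward to enforce in the text-to-JSON process.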
Related papers
- Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot [3.2288892242158984]
We propose a novel approach to enhance Dialogue State Tracking (DST) performance.
Our method enables a Large Language Model (LLM) to infer dialogue states through carefully designed prompts.
Our approach achieved state-of-the-art performance with a JGA of 42.57%, and performed well in open-domain real-world conversations.
arXiv Detail & Related papers (2024-10-30T07:36:23Z) - Injecting linguistic knowledge into BERT for Dialogue State Tracking [60.42231674887294]
This paper proposes a method that extracts linguistic knowledge via an unsupervised framework.
We then utilize this knowledge to augment BERT's performance and interpretability in Dialogue State Tracking (DST) tasks.
We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.
arXiv Detail & Related papers (2023-11-27T08:38:42Z) - Rethinking and Improving Multi-task Learning for End-to-end Speech
Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z) - TaDSE: Template-aware Dialogue Sentence Embeddings [27.076663644996966]
General sentence embedding methods are usually sentence-level self-supervised frameworks and cannot utilize token-level extra knowledge.
TaDSE augments each sentence with its corresponding template and then conducts pairwise contrastive learning over both sentence and template.
Experimental results show that TaDSE achieves significant improvements over previous SOTA methods, along with consistent gains on the Intent Classification task.
arXiv Detail & Related papers (2023-05-23T17:40:41Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot
Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - KILDST: Effective Knowledge-Integrated Learning for Dialogue State
Tracking using Gazetteer and Speaker Information [3.342637296393915]
Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention.
It is necessary to define a new problem as a step toward conversational AI that extracts and recommends information from dialogues between users.
We introduce a new task: DST from dialogue between users about scheduling an event (DST-S).
The DST-S task is much more challenging since it requires the model to understand and track the dialogue state in conversations between users, and to identify who suggested the schedule and who agreed to the proposed schedule.
arXiv Detail & Related papers (2023-01-18T07:11:56Z) - In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z) - Prompt Learning for Few-Shot Dialogue State Tracking [75.50701890035154]
This paper focuses on how to learn a dialogue state tracking (DST) model efficiently with limited labeled data.
We design a prompt learning framework for few-shot DST, which consists of two main components: value-based prompt and inverse prompt mechanism.
Experiments show that our model can generate unseen slots and outperforms existing state-of-the-art few-shot methods.
arXiv Detail & Related papers (2022-01-15T07:37:33Z) - Modeling Long Context for Task-Oriented Dialogue State Generation [51.044300192906995]
We propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model.
Our approach addresses the problem that the baseline's performance drops significantly when the input dialogue context sequence is long.
In our experiments, our proposed model achieves a 7.03% relative improvement over the baseline, establishing a new state-of-the-art joint goal accuracy of 52.04% on the MultiWOZ 2.0 dataset.
arXiv Detail & Related papers (2020-04-29T11:02:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.