ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an
Opportunity?
- URL: http://arxiv.org/abs/2306.01386v1
- Date: Fri, 2 Jun 2023 09:15:01 GMT
- Title: ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an
Opportunity?
- Authors: Michael Heck, Nurul Lubis, Benjamin Ruppik, Renato Vukovic, Shutong Feng, Christian Geishauser, Hsien-Chin Lin, Carel van Niekerk, Milica Gašić
- Abstract summary: We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST.
We theorize that the in-context learning capabilities of such models will likely become powerful tools to support the development of dedicated and dynamic dialogue state trackers.
- Score: 2.3555053092246125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research on dialogue state tracking (DST) focuses on methods that allow few- and zero-shot transfer to new domains or schemas. However, performance gains heavily depend on aggressive data augmentation and fine-tuning of ever-larger language-model-based architectures. In contrast, general-purpose language models, trained on large amounts of diverse data, hold the promise of solving any kind of task without task-specific training. We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST. Despite our findings, we argue that properties inherent to general-purpose models limit their ability to replace specialized systems. We further theorize that the in-context learning capabilities of such models will likely become powerful tools to support the development of dedicated and dynamic dialogue state trackers.
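To make the zero-shot setup concrete, here is a minimal sketch of how a general-purpose chat model can be prompted for DST without any task-specific training. The slot names, descriptions, and prompt wording are illustrative assumptions, not the paper's actual prompt; `complete` stands in for any text-in/text-out call to a ChatGPT-style model.

```python
import json

# Hypothetical slot schema for a single MultiWOZ-style domain.
SLOTS = {
    "hotel-area": "area or part of town of the hotel",
    "hotel-pricerange": "price range of the hotel (cheap/moderate/expensive)",
    "hotel-stars": "star rating of the hotel",
}

def build_zero_shot_dst_prompt(dialogue_turns):
    """Assemble a single prompt asking a general-purpose LLM to return
    the dialogue state as JSON, with no in-domain training examples."""
    schema = "\n".join(f"- {name}: {desc}" for name, desc in SLOTS.items())
    history = "\n".join(dialogue_turns)
    return (
        "Consider the following list of slots:\n"
        f"{schema}\n\n"
        "Read the dialogue and output a JSON object mapping each slot the "
        "user has specified to its value; omit unmentioned slots.\n\n"
        f"Dialogue:\n{history}\n\nJSON state:"
    )

def track_state(dialogue_turns, complete):
    """`complete` is any text-in/text-out LLM call (e.g. a ChatGPT client)."""
    raw = complete(build_zero_shot_dst_prompt(dialogue_turns))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # General-purpose models do not always emit valid JSON.
        return {}
```

Casting the output as JSON makes the predicted state straightforward to compare against gold slot-value pairs, which is how joint goal accuracy is typically computed in DST evaluation.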
Related papers
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks (a toy sketch of the retrieval step follows this entry).
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
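The following toy sketch illustrates only the retrieval idea referenced in the entry above: keep one embedding and one set of lightweight adapter weights per seen task, then route each query to its most similar task. The class name and routing scheme are assumptions for illustration; the paper's actual JARe/DTKR machinery is more involved.

```python
import numpy as np

class DynamicTaskRouter:
    """Toy illustration only: one embedding and one set of lightweight
    adapter weights per seen task; each query is routed to the most
    similar task. Hypothetical names, not the paper's implementation."""

    def __init__(self):
        self.task_vectors = []   # one unit-norm embedding per task
        self.task_adapters = []  # lightweight parameters per task

    def add_task(self, task_vector, adapter_params):
        v = np.asarray(task_vector, dtype=float)
        self.task_vectors.append(v / np.linalg.norm(v))
        self.task_adapters.append(adapter_params)

    def retrieve(self, query_vector):
        q = np.asarray(query_vector, dtype=float)
        q = q / np.linalg.norm(q)
        scores = [float(q @ t) for t in self.task_vectors]  # cosine similarity
        return self.task_adapters[int(np.argmax(scores))]
```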
- Large Language Models as Zero-shot Dialogue State Tracker through Function Calling [42.00097476584174]
We propose a novel approach for solving dialogue state tracking with large language models (LLMs) through function calling.
This method improves zero-shot DST, allowing adaptation to diverse domains without extensive data collection or model tuning.
We show that our approach achieves exceptional performance with both modestly sized open-source and proprietary LLMs (a minimal sketch of the function-calling pattern follows this entry).
arXiv Detail & Related papers (2024-02-16T06:13:18Z)
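A hedged sketch of the function-calling pattern from the entry above: the tracker exposes a tool whose parameters are the slot schema, and the running dialogue state is the accumulation of the model's call arguments across turns. The schema below follows the common JSON-Schema style of function-calling chat APIs; the slot names and merge helper are hypothetical.

```python
# Hypothetical tool schema in the JSON-Schema style used by
# function-calling chat APIs: each slot becomes a parameter.
UPDATE_STATE_TOOL = {
    "name": "update_dialogue_state",
    "description": "Record the values the user has specified so far.",
    "parameters": {
        "type": "object",
        "properties": {
            "restaurant_area": {"type": "string", "description": "part of town"},
            "restaurant_food": {"type": "string", "description": "cuisine type"},
            "restaurant_pricerange": {
                "type": "string",
                "enum": ["cheap", "moderate", "expensive"],
            },
        },
    },
}

def merge_call_into_state(state, call_args):
    """Fold one turn's function-call arguments into the running state."""
    return {**state, **{k: v for k, v in call_args.items() if v}}
```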
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query (a minimal sketch of this compression step follows this entry).
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
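A minimal sketch of the history-compression idea from the entry above: score each dialogue turn with some saliency function, keep the highest-scoring turns within a length budget, and spend the saved prompt space on extra in-context exemplars. The whitespace token count and greedy selection are simplifying assumptions, not the paper's exact method.

```python
def compress_dialogue(turns, saliency, budget):
    """Keep the most salient turns (in their original order) under a
    length budget, freeing prompt space for extra in-context exemplars.
    `saliency` is any turn-scoring function."""
    ranked = sorted(range(len(turns)), key=lambda i: saliency(turns[i]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(turns[i].split())  # crude token count for illustration
        if used + cost <= budget:
            kept.add(i)
            used += cost
    return [turns[i] for i in sorted(kept)]
```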
- Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z)
- Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking [25.28975864365579]
We propose a novel NLP task for self-tracking that extracts close- and open-ended information from a retrospective activity log.
The framework augments the prompt with synthetic samples, transforming the task into 10-shot learning to address the cold-start problem of bootstrapping a new tracking topic (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-05-31T01:58:04Z)
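A sketch of the prompt-augmentation idea from the entry above, under the assumption that a `synthesize` callable can produce plausible (log entry, extraction) pairs for a new tracking topic; the exemplar format is invented for illustration.

```python
def build_ten_shot_prompt(real_examples, synthesize, query, k=10):
    """Pad a handful of real (log entry, extraction) pairs with synthetic
    ones until k exemplars are available. `synthesize` stands for any
    generator of plausible pairs (hypothetical here)."""
    examples = list(real_examples)
    while len(examples) < k:
        examples.append(synthesize())
    shots = "\n\n".join(f"Entry: {e}\nExtracted: {x}" for e, x in examples[:k])
    return f"{shots}\n\nEntry: {query}\nExtracted:"
```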
- A Study on Prompt-based Few-Shot Learning Methods for Belief State Tracking in Task-oriented Dialog Systems [10.024834304960846]
We tackle the Dialogue Belief State Tracking problem of task-oriented conversational systems.
Recent approaches to this problem leveraging Transformer-based models have yielded great results.
We explore prompt-based few-shot learning for Dialogue Belief State Tracking.
arXiv Detail & Related papers (2022-04-18T05:29:54Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- Few-shot Natural Language Generation for Task-Oriented Dialog [113.07438787659859]
We present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems.
We develop the SC-GPT model, pre-trained on a large set of annotated NLG corpora to acquire controllable generation ability (a minimal sketch of the act-to-text input format follows this entry).
Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods.
arXiv Detail & Related papers (2020-02-27T18:48:33Z)
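To illustrate the controllable-generation setup from the entry above, here is a sketch of linearizing a dialog act into the text prefix an SC-GPT-style model continues with a natural-language utterance. The exact delimiters SC-GPT uses may differ; the separator below is an assumption.

```python
def linearize_dialog_act(intent, slots):
    """Flatten a dialog act into the text prefix an SC-GPT-style model
    is trained to continue with a natural-language utterance."""
    pairs = " ; ".join(f"{k} = {v}" for k, v in slots.items())
    # The act/utterance separator is an assumption; SC-GPT's exact
    # delimiters may differ.
    return f"{intent} ( {pairs} ) &"

# linearize_dialog_act("inform", {"name": "Pho Bistro", "area": "centre"})
# -> "inform ( name = Pho Bistro ; area = centre ) &"
# A trained model would then generate something like:
# "Pho Bistro is a nice place in the centre of town."
```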