ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an
Opportunity?
- URL: http://arxiv.org/abs/2306.01386v1
- Date: Fri, 2 Jun 2023 09:15:01 GMT
- Title: ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an
Opportunity?
- Authors: Michael Heck, Nurul Lubis, Benjamin Ruppik, Renato Vukovic, Shutong Feng, Christian Geishauser, Hsien-Chin Lin, Carel van Niekerk, Milica Gašić
- Abstract summary: We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST.
We theorize that the in-context learning capabilities of such models will likely become powerful tools to support the development of dedicated and dynamic dialogue state trackers.
- Score: 2.3555053092246125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research on dialogue state tracking (DST) focuses on methods that allow few- and zero-shot transfer to new domains or schemas. However, performance gains heavily depend on aggressive data augmentation and fine-tuning of ever-larger language-model-based architectures. In contrast, general-purpose language models, trained on large amounts of diverse data, hold the promise of solving any kind of task without task-specific training. We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST. Despite our findings, we argue that properties inherent to general-purpose models limit their ability to replace specialized systems. We further theorize that the in-context learning capabilities of such models will likely become powerful tools to support the development of dedicated and dynamic dialogue state trackers.
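To make the zero-shot setup concrete, here is a minimal sketch of how a general-purpose chat model can be prompted for DST without any task-specific training. The slot names, descriptions, and prompt wording are illustrative assumptions, not the paper's actual prompt; `complete` stands in for any text-in/text-out call to a ChatGPT-style model.

```python
import json

# Hypothetical slot schema for a single MultiWOZ-style domain.
SLOTS = {
    "hotel-area": "area or part of town of the hotel",
    "hotel-pricerange": "price range of the hotel (cheap/moderate/expensive)",
    "hotel-stars": "star rating of the hotel",
}

def build_zero_shot_dst_prompt(dialogue_turns):
    """Assemble a single prompt asking a general-purpose LLM to return
    the dialogue state as JSON, with no in-domain training examples."""
    schema = "\n".join(f"- {name}: {desc}" for name, desc in SLOTS.items())
    history = "\n".join(dialogue_turns)
    return (
        "Consider the following list of slots:\n"
        f"{schema}\n\n"
        "Read the dialogue and output a JSON object mapping each slot the "
        "user has specified to its value; omit unmentioned slots.\n\n"
        f"Dialogue:\n{history}\n\nJSON state:"
    )

def track_state(dialogue_turns, complete):
    """`complete` is any text-in/text-out LLM call (e.g. a ChatGPT client)."""
    raw = complete(build_zero_shot_dst_prompt(dialogue_turns))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # General-purpose models do not always emit valid JSON.
        return {}
```

Casting the output as JSON makes the predicted state straightforward to compare against gold slot-value pairs, which is how joint goal accuracy is typically computed in DST evaluation.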
Related papers
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks (a toy sketch of the retrieval step follows this entry).
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
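The following toy sketch illustrates only the retrieval idea referenced in the entry above: keep one embedding and one set of lightweight adapter weights per seen task, then route each query to its most similar task. The class name and routing scheme are assumptions for illustration; the paper's actual JARe/DTKR machinery is more involved.

```python
import numpy as np

class DynamicTaskRouter:
    """Toy illustration only: one embedding and one set of lightweight
    adapter weights per seen task; each query is routed to the most
    similar task. Hypothetical names, not the paper's implementation."""

    def __init__(self):
        self.task_vectors = []   # one unit-norm embedding per task
        self.task_adapters = []  # lightweight parameters per task

    def add_task(self, task_vector, adapter_params):
        v = np.asarray(task_vector, dtype=float)
        self.task_vectors.append(v / np.linalg.norm(v))
        self.task_adapters.append(adapter_params)

    def retrieve(self, query_vector):
        q = np.asarray(query_vector, dtype=float)
        q = q / np.linalg.norm(q)
        scores = [float(q @ t) for t in self.task_vectors]  # cosine similarity
        return self.task_adapters[int(np.argmax(scores))]
```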
- Large Language Models as Zero-shot Dialogue State Tracker through Function Calling [42.00097476584174]
We propose a novel approach for solving dialogue state tracking with large language models (LLMs) through function calling.
This method improves zero-shot DST, allowing adaptation to diverse domains without extensive data collection or model tuning.
We show that our approach achieves exceptional performance with both modestly sized open-source and proprietary LLMs (a minimal sketch of the function-calling pattern follows this entry).
arXiv Detail & Related papers (2024-02-16T06:13:18Z)
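A hedged sketch of the function-calling pattern from the entry above: the tracker exposes a tool whose parameters are the slot schema, and the running dialogue state is the accumulation of the model's call arguments across turns. The schema below follows the common JSON-Schema style of function-calling chat APIs; the slot names and merge helper are hypothetical.

```python
# Hypothetical tool schema in the JSON-Schema style used by
# function-calling chat APIs: each slot becomes a parameter.
UPDATE_STATE_TOOL = {
    "name": "update_dialogue_state",
    "description": "Record the values the user has specified so far.",
    "parameters": {
        "type": "object",
        "properties": {
            "restaurant_area": {"type": "string", "description": "part of town"},
            "restaurant_food": {"type": "string", "description": "cuisine type"},
            "restaurant_pricerange": {
                "type": "string",
                "enum": ["cheap", "moderate", "expensive"],
            },
        },
    },
}

def merge_call_into_state(state, call_args):
    """Fold one turn's function-call arguments into the running state."""
    return {**state, **{k: v for k, v in call_args.items() if v}}
```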
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query (a minimal sketch of this compression step follows this entry).
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
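A minimal sketch of the history-compression idea from the entry above: score each dialogue turn with some saliency function, keep the highest-scoring turns within a length budget, and spend the saved prompt space on extra in-context exemplars. The whitespace token count and greedy selection are simplifying assumptions, not the paper's exact method.

```python
def compress_dialogue(turns, saliency, budget):
    """Keep the most salient turns (in their original order) under a
    length budget, freeing prompt space for extra in-context exemplars.
    `saliency` is any turn-scoring function."""
    ranked = sorted(range(len(turns)), key=lambda i: saliency(turns[i]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(turns[i].split())  # crude token count for illustration
        if used + cost <= budget:
            kept.add(i)
            used += cost
    return [turns[i] for i in sorted(kept)]
```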
- Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z)
- Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking [25.28975864365579]
We propose a novel NLP task for self-tracking that extracts close- and open-ended information from a retrospective activity log.
The framework augments the prompt with synthetic samples, transforming the task into 10-shot learning to address the cold-start problem of bootstrapping a new tracking topic (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-05-31T01:58:04Z)
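A sketch of the prompt-augmentation idea from the entry above, under the assumption that a `synthesize` callable can produce plausible (log entry, extraction) pairs for a new tracking topic; the exemplar format is invented for illustration.

```python
def build_ten_shot_prompt(real_examples, synthesize, query, k=10):
    """Pad a handful of real (log entry, extraction) pairs with synthetic
    ones until k exemplars are available. `synthesize` stands for any
    generator of plausible pairs (hypothetical here)."""
    examples = list(real_examples)
    while len(examples) < k:
        examples.append(synthesize())
    shots = "\n\n".join(f"Entry: {e}\nExtracted: {x}" for e, x in examples[:k])
    return f"{shots}\n\nEntry: {query}\nExtracted:"
```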
- A Study on Prompt-based Few-Shot Learning Methods for Belief State Tracking in Task-oriented Dialog Systems [10.024834304960846]
We tackle the Dialogue Belief State Tracking problem of task-oriented conversational systems.
Recent approaches to this problem leveraging Transformer-based models have yielded great results.
We explore prompt-based few-shot learning for Dialogue Belief State Tracking.
arXiv Detail & Related papers (2022-04-18T05:29:54Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- Few-shot Natural Language Generation for Task-Oriented Dialog [113.07438787659859]
We present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems.
We develop the SC-GPT model, pre-trained on a large set of annotated NLG corpora to acquire controllable generation ability (a minimal sketch of the act-to-text input format follows this entry).
Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods.
arXiv Detail & Related papers (2020-02-27T18:48:33Z)
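To illustrate the controllable-generation setup from the entry above, here is a sketch of linearizing a dialog act into the text prefix an SC-GPT-style model continues with a natural-language utterance. The exact delimiters SC-GPT uses may differ; the separator below is an assumption.

```python
def linearize_dialog_act(intent, slots):
    """Flatten a dialog act into the text prefix an SC-GPT-style model
    is trained to continue with a natural-language utterance."""
    pairs = " ; ".join(f"{k} = {v}" for k, v in slots.items())
    # The act/utterance separator is an assumption; SC-GPT's exact
    # delimiters may differ.
    return f"{intent} ( {pairs} ) &"

# linearize_dialog_act("inform", {"name": "Pho Bistro", "area": "centre"})
# -> "inform ( name = Pho Bistro ; area = centre ) &"
# A trained model would then generate something like:
# "Pho Bistro is a nice place in the centre of town."
```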