Few-shot Natural Language Generation for Task-Oriented Dialog
- URL: http://arxiv.org/abs/2002.12328v1
- Date: Thu, 27 Feb 2020 18:48:33 GMT
- Title: Few-shot Natural Language Generation for Task-Oriented Dialog
- Authors: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li,
Michael Zeng, and Jianfeng Gao
- Abstract summary: We present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems.
We develop the SC-GPT model, which is pre-trained on a large set of annotated NLG corpora to acquire the controllable generation ability.
Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods.
- Score: 113.07438787659859
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a crucial component in task-oriented dialog systems, the Natural Language
Generation (NLG) module converts a dialog act represented in a semantic form
into a response in natural language. The success of traditional template-based
or statistical models typically relies on heavily annotated data, which is
infeasible for new domains. Therefore, it is pivotal for an NLG system to
generalize well with limited labelled data in real applications. To this end,
we present FewShotWoz, the first NLG benchmark to simulate the few-shot
learning setting in task-oriented dialog systems. Further, we develop the
SC-GPT model. It is pre-trained on a large set of annotated NLG corpora to
acquire the controllable generation ability, and fine-tuned with only a few
domain-specific labels to adapt to new domains. Experiments on FewShotWoz and
the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly
outperforms existing methods, measured by various automatic metrics and human
evaluations.
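To make the setup concrete, the sketch below shows how an SC-GPT-style model realizes a dialog act: the act is serialized as plain text and a GPT-2 language model, fine-tuned on dialog-act/response pairs, decodes the response that follows it. The checkpoint name, the serialization format, and the decoding settings are illustrative assumptions, not the authors' released configuration.

```python
# Minimal sketch of dialog-act-conditioned generation in the spirit of SC-GPT.
# Assumption: the dialog act is linearized as plain text and a causal LM
# continues it with the natural-language response. The base "gpt2" checkpoint
# is a stand-in; SC-GPT fine-tunes such a model on dialog-act/response pairs.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# A dialog act in semantic form, serialized as a string; an SC-GPT-style model
# conditions on this prefix and decodes the surface realization after it.
dialog_act = "inform ( name = Blue Spice ; food = Chinese ; area = city centre ) &"
inputs = tokenizer(dialog_act, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)

# Keep only the continuation (the generated response, not the echoed act).
# With the un-tuned base checkpoint this is only a rough illustration; a
# fine-tuned checkpoint would produce a fluent in-domain response.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```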
Related papers
- Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests [6.33281463741573]
Indirect User Requests (IURs) are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener.
While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints.
arXiv Detail & Related papers (2024-06-12T01:18:04Z) - DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context
Tuning [7.5700317050237365]
We propose DiSTRICT, a generalizable in-context tuning approach for Dialogue State Tracking (DST)
DiSTRICT retrieves highly relevant training examples for a given dialogue to fine-tune the model without any hand-crafted templates.
Experiments with the MultiWOZ benchmark datasets show that DiSTRICT outperforms existing approaches in various zero-shot and few-shot settings.
arXiv Detail & Related papers (2022-12-06T09:40:15Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Self-augmented Data Selection for Few-shot Dialogue Generation [18.794770678708637]
We adopt the self-training framework to deal with the few-shot MR-to-Text generation problem.
We propose a novel data selection strategy that selects the examples our generation model is most uncertain about; a rough sketch of this idea appears after the list below.
arXiv Detail & Related papers (2022-05-19T16:25:50Z) - Compression, Transduction, and Creation: A Unified Framework for
Evaluating Natural Language Generation [85.32991360774447]
Natural language generation (NLG) spans a broad range of tasks, each of which serves specific objectives.
We propose a unifying perspective based on the nature of information change in NLG tasks.
We develop a family of interpretable metrics that are suitable for evaluating key aspects of different NLG tasks.
arXiv Detail & Related papers (2021-09-14T01:00:42Z) - AUGNLG: Few-shot Natural Language Generation using Self-trained Data
Augmentation [26.016540126949103]
This paper proposes AUGNLG, a novel data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model.
The proposed system mostly outperforms the state-of-the-art methods on the FewShotWOZ data in both BLEU and Slot Error Rate.
arXiv Detail & Related papers (2021-06-10T08:45:28Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z) - RADDLE: An Evaluation Benchmark and Analysis Platform for Robust
Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z) - Schema-Guided Natural Language Generation [13.11874946084068]
We present the novel task of Schema-Guided Natural Language Generation (SG-NLG).
In SG-NLG, the goal is still to generate a natural language prompt, but the input MRs are paired with rich schemata providing contextual information.
We train different state-of-the-art models for neural natural language generation on this dataset and show that in many cases, including rich schema information allows our models to produce higher quality outputs.
arXiv Detail & Related papers (2020-05-11T23:01:22Z) - Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
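As referenced in the self-augmented data selection entry above, the following is a rough, hypothetical sketch of uncertainty-based selection for self-training: synthetic MR-to-text pairs are scored by the generator's own average negative log-likelihood, and the examples it is least confident about are kept for the next round. The model choice, serialization, and selection ratio are assumptions for illustration, not that paper's exact recipe.

```python
# Hedged sketch of uncertainty-based data selection for self-training.
# Assumption: the generator's per-token negative log-likelihood serves as the
# uncertainty score; the most uncertain synthetic pairs are kept for the next
# self-training round.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def nll(text: str) -> float:
    """Average per-token negative log-likelihood under the generator."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # cross-entropy over the sequence
    return loss.item()

# Illustrative synthetic MR-to-text candidates (not real dataset entries).
candidates = [
    "inform ( name = Blue Spice ) & Blue Spice is in the city centre .",
    "request ( area ) & Which part of town are you looking for ?",
]

# Rank from most to least uncertain and keep the top half for further training.
ranked = sorted(candidates, key=nll, reverse=True)
selected = ranked[: max(1, len(ranked) // 2)]
print(selected)
```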