Task-Oriented Dialogue System as Natural Language Generation
- URL: http://arxiv.org/abs/2108.13679v2
- Date: Wed, 1 Sep 2021 07:33:04 GMT
- Title: Task-Oriented Dialogue System as Natural Language Generation
- Authors: Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen and
Weihua Luo
- Abstract summary: We propose to formulate the task-oriented dialogue system as a purely natural language generation task.
Directly applying this method suffers from dialogue entity inconsistency caused by the removal of delexicalized tokens.
We design a novel GPT-Adapter-CopyNet network, which incorporates lightweight adapter and CopyNet modules into GPT-2 to achieve better performance.
- Score: 29.870260635814436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose to formulate the task-oriented dialogue system as
a purely natural language generation task, so as to fully leverage
large-scale pre-trained models like GPT-2 and simplify complicated
delexicalization preprocessing. However, directly applying this method
suffers from the dialogue entity inconsistency caused by the removal of
delexicalized tokens, as well as the catastrophic forgetting problem of the
pre-trained model during fine-tuning, leading to unsatisfactory performance. To
alleviate these problems, we design a novel GPT-Adapter-CopyNet network, which
incorporates the lightweight adapter and CopyNet modules into GPT-2 to achieve
better performance on transfer learning and dialogue entity generation.
Experimental results on the DSTC8 Track 1 benchmark and the MultiWOZ
dataset demonstrate that our proposed approach significantly outperforms
baseline models, with remarkable performance in both automatic and human
evaluations.
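A minimal PyTorch sketch of the two added components named in the abstract, a bottleneck adapter and a copy-style output head over GPT-2 hidden states; module names, dimensions, and the gating scheme are illustrative assumptions, not the paper's released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, plus residual."""
    def __init__(self, d_model: int = 768, d_bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen GPT-2 representations intact.
        return h + self.up(F.relu(self.down(h)))

class CopyHead(nn.Module):
    """Mixes the vocabulary distribution with a copy distribution over context tokens."""
    def __init__(self, d_model: int = 768, vocab_size: int = 50257):
        super().__init__()
        self.gate = nn.Linear(d_model, 1)                        # p(copy) gate
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, h_t, ctx_states, ctx_ids):
        # h_t: (batch, d) current decoder state; ctx_states: (batch, src, d).
        p_vocab = F.softmax(self.lm_head(h_t), dim=-1)
        attn = F.softmax(torch.einsum("bd,bsd->bs", h_t, ctx_states), dim=-1)
        # Scatter attention mass onto the vocabulary ids of the context tokens,
        # so entity tokens can be copied verbatim from the dialogue context.
        p_copy = torch.zeros_like(p_vocab).scatter_add(1, ctx_ids, attn)
        g = torch.sigmoid(self.gate(h_t))
        return (1 - g) * p_vocab + g * p_copy

# Toy usage with random tensors standing in for GPT-2 hidden states.
adapter, copy_head = Adapter(), CopyHead()
h = adapter(torch.randn(2, 10, 768))
probs = copy_head(h[:, -1], h, torch.randint(0, 50257, (2, 10)))
```

Copying directly from the dialogue context is what addresses the entity-inconsistency problem described above, since entity names no longer depend on delexicalized placeholders.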
Related papers
- Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards [4.334100270812517]
Large language models (LLMs) struggle with technical standards in telecommunications.
We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 small language model (SLM).
Our experiments demonstrate substantial improvements over existing question-answering approaches in the telecom domain.
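A toy retrieve-then-read loop illustrating the general RAG pattern this entry describes; the lexical scorer and prompt format are assumptions, not the paper's pipeline:

```python
def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    # Toy lexical scorer (word overlap); a real system would use dense
    # embeddings over chunked 3GPP documents.
    q = set(query.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # The resulting prompt is then passed to the fine-tuned SLM (Phi-2 in the paper).
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```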
arXiv Detail & Related papers (2024-08-21T17:00:05Z)
- LEEETs-Dial: Linguistic Entrainment in End-to-End Task-Oriented Dialogue Systems [0.0]
We introduce methods for achieving dialogue entrainment in a GPT-2-based end-to-end task-oriented dialogue system.
We experiment with training instance weighting, entrainment-specific loss, and additional conditioning to generate responses that align with the user.
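A hedged sketch of training-instance weighting, one of the three methods mentioned; the entrainment score is assumed to be precomputed per training instance:

```python
import torch
import torch.nn.functional as F

def weighted_nll(logits, targets, entrainment_scores):
    # Responses that lexically align with the user's wording get larger
    # weights, nudging the model toward entrainment. `entrainment_scores`
    # is a precomputed (batch,) tensor in [0, 1].
    per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
    per_instance = per_token.mean(dim=1)
    return (entrainment_scores * per_instance).mean()
```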
arXiv Detail & Related papers (2023-11-15T21:35:25Z)
- Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System [89.40590076430297]
This work gives TOD systems more flexibility through a simple cache.
We train end-to-end TOD models that can refer to and ground on both dialogue history and retrieved information during TOD generation.
Experiments demonstrate the superior performance of our framework, with a notable improvement in non-empty joint goal accuracy by 6.7% compared to strong baselines.
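A minimal sketch of the cache idea: generation is grounded on the dialogue history plus whatever a simple key-value cache returns (the schema and serialization here are assumptions):

```python
class Cache:
    """Key-value store mapping a dialogue state to retrieved information."""
    def __init__(self):
        self.store: dict[str, str] = {}

    def add(self, state: str, info: str) -> None:
        self.store[state] = info

    def lookup(self, state: str) -> str:
        return self.store.get(state, "")   # empty when nothing is cached

def build_model_input(history: str, cache: Cache, state: str) -> str:
    # The generator conditions on both the dialogue history and the cache hit.
    return f"{history} <cache> {cache.lookup(state)}"
```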
arXiv Detail & Related papers (2023-08-16T06:52:10Z)
- BatGPT: A Bidirectional Autoregressive Talker from Generative Pre-trained Transformer [77.28871523946418]
BatGPT is a large-scale language model designed and trained jointly by Wuhan University and Shanghai Jiao Tong University.
It is capable of generating highly natural and fluent text in response to various types of input, including text prompts, images, and audio.
arXiv Detail & Related papers (2023-07-01T15:10:01Z)
- SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation [54.66399120084227]
Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue.
For the persona-based dialogue generation task, consistency and coherence are great challenges for language models.
A two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation.
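The two-stage strategy reduces to a short control loop; `generate` and `score` are assumed stand-ins for the paper's sampler and post-evaluator:

```python
def simoap_respond(prompt, generate, score, n_samples=50):
    candidates = [generate(prompt) for _ in range(n_samples)]   # stage 1: over-sampling
    return max(candidates, key=lambda r: score(prompt, r))      # stage 2: post-evaluation
```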
arXiv Detail & Related papers (2023-05-18T17:23:00Z)
- Task-Optimized Adapters for an End-to-End Task-Oriented Dialogue System [0.0]
We propose an end-to-end TOD system with Task-Optimized Adapters that learn independently per task, adding only a small number of parameters after the fixed layers of a pre-trained network.
Our method is model-agnostic and does not require prompt tuning, using only the input data without prompts.
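A sketch of per-task adapters over a frozen backbone, assuming a bottleneck design and a DST/policy/NLG task split for illustration:

```python
import torch.nn as nn

class TaskAdapters(nn.Module):
    """One small adapter per sub-task over a shared frozen backbone."""
    def __init__(self, d_model=768, d_bottleneck=64, tasks=("dst", "policy", "nlg")):
        super().__init__()
        self.adapters = nn.ModuleDict({
            t: nn.Sequential(nn.Linear(d_model, d_bottleneck), nn.ReLU(),
                             nn.Linear(d_bottleneck, d_model))
            for t in tasks
        })

    def forward(self, h, task: str):
        # Residual add; only the selected adapter's parameters are trained.
        return h + self.adapters[task](h)
```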
arXiv Detail & Related papers (2023-05-04T00:17:49Z)
- Context Matters in Semantically Controlled Language Generation for Task-oriented Dialogue Systems [6.1478669848771546]
This work combines information about the dialogue history encoded by a pre-trained model with a meaning representation of the current system utterance to realize contextual language generation in task-oriented dialogues.
We utilize the pre-trained multi-context ConveRT model for context representation in a model trained from scratch, and leverage the immediately preceding user utterance for context generation in a model adapted from the pre-trained GPT-2.
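A rough sketch of fusing a context vector (e.g., from ConveRT) with a meaning-representation encoding to initialize a decoder; the fusion scheme and dimensions are assumptions for illustration:

```python
import torch
import torch.nn as nn

class ContextualNLG(nn.Module):
    def __init__(self, d_ctx=512, d_model=768, vocab_size=50257):
        super().__init__()
        self.fuse = nn.Linear(d_ctx + d_model, d_model)   # context + MR fusion
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, ctx_vec, mr_emb, tgt_emb):
        # ctx_vec: (batch, d_ctx), e.g. a ConveRT context encoding;
        # mr_emb: (batch, d_model) encoding of the meaning representation.
        h0 = torch.tanh(self.fuse(torch.cat([ctx_vec, mr_emb], dim=-1))).unsqueeze(0)
        hidden, _ = self.decoder(tgt_emb, h0)
        return self.out(hidden)
```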
arXiv Detail & Related papers (2021-11-28T11:48:02Z)
- Robust Dialogue Utterance Rewriting as Sequence Tagging [62.12912805378693]
The task of dialogue rewriting aims to reconstruct the latest dialogue utterance by copying the missing content from the dialogue context.
Existing models for this task suffer from a robustness issue: performance drops dramatically when tested on a different domain.
We propose a novel sequence-tagging-based fluency model so that the search space is significantly reduced.
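A toy reconstruction step for the tagging view of rewriting, assuming a KEEP/DELETE tag set plus positional insertions (the paper's exact tag inventory may differ):

```python
def apply_tags(context_tokens, tags, insertions):
    """Rebuild the rewritten utterance from per-token KEEP/DELETE tags,
    with `insertions` mapping a position to tokens inserted before it."""
    out = []
    for i, tok in enumerate(context_tokens):
        out.extend(insertions.get(i, []))
        if tags[i] == "KEEP":
            out.append(tok)
    out.extend(insertions.get(len(context_tokens), []))
    return out

# apply_tags(["book", "it"], ["KEEP", "DELETE"], {1: ["the", "hotel"]})
# -> ["book", "the", "hotel"]
```

Because the model only chooses among tags over context tokens rather than generating freely, the search space shrinks, which is the source of the claimed robustness.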
arXiv Detail & Related papers (2020-12-29T00:05:35Z)
- Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation [77.62366712130196]
We present the winning entry at the fast domain adaptation task of DSTC8, a hybrid generative-retrieval model based on GPT-2 fine-tuned to the multi-domain MetaLWOz dataset.
Our model uses retrieval logic as a fallback, being SoTA on MetaLWOz in human evaluation (>4% improvement over the 2nd place system) and attaining competitive generalization performance in adaptation to the unseen MultiWOZ dataset.
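The generate-first, retrieve-as-fallback control flow, sketched with an assumed confidence test standing in for the paper's retrieval logic:

```python
def respond(context, generator, retriever, min_confidence=0.5):
    response, confidence = generator(context)   # fine-tuned GPT-2
    if confidence >= min_confidence:
        return response
    return retriever(context)                   # nearest response from the support set
```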
arXiv Detail & Related papers (2020-03-03T18:07:42Z)
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and downstream LU systems can be reduced by 14% relative with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)
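A minimal joint objective sketch for such multi-task training: a token-level correction head and an utterance-level LU head share an encoder, with an assumed interpolation weight:

```python
import torch.nn.functional as F

def joint_loss(corr_logits, corr_tgt, intent_logits, intent_tgt, alpha=0.5):
    # Token-level ASR-correction head and utterance-level LU head share an
    # encoder; `alpha` interpolates the two objectives (value is an assumption).
    l_corr = F.cross_entropy(corr_logits.transpose(1, 2), corr_tgt)
    l_lu = F.cross_entropy(intent_logits, intent_tgt)
    return alpha * l_corr + (1 - alpha) * l_lu
```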
This list is automatically generated from the titles and abstracts of the papers in this site.