Dialogue-oriented Pre-training
- URL: http://arxiv.org/abs/2106.00420v1
- Date: Tue, 1 Jun 2021 12:02:46 GMT
- Title: Dialogue-oriented Pre-training
- Authors: Yi Xu, Hai Zhao
- Abstract summary: We propose three strategies to simulate conversation features on general plain text.
Dialog-PrLM is fine-tuned on three public multi-turn dialogue datasets.
- Score: 70.03028879331339
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (PrLMs) have been shown to be powerful
in enhancing a broad range of downstream tasks, including various
dialogue-related ones. However, PrLMs are usually trained on general plain text
with common language model (LM) training objectives, which cannot sufficiently
capture dialogue-exclusive features due to the limitations of such a training
setting, so there is an immediate need to fill the gap between a specific
dialogue task and the LM task. As it is impractical to collect huge amounts of
dialogue data for dialogue-oriented pre-training, in this paper we propose
three strategies to simulate conversation features on general plain text. Our
proposed method differs from existing post-training methods in that it yields a
general-purpose PrLM that is not specialized to any particular task, while
still learning dialogue-related features including speaker awareness,
continuity, and consistency. The resulting Dialog-PrLM is fine-tuned on three
public multi-turn dialogue datasets and achieves significant and consistent
improvements over plain PrLMs.
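The abstract does not spell out the three simulation strategies, but the general idea of imposing dialogue-like structure on plain text can be sketched as follows: segment a paragraph into sentence-level pseudo-turns, tag them with alternating speaker tokens, and sample positive/negative pairs for a continuity-style objective. The speaker tokens, segmentation scheme, and pairing logic below are illustrative assumptions, not the paper's actual procedure.

```python
import random
import re

# Hypothetical speaker tokens; the paper's actual special tokens are not given here.
SPEAKERS = ["[SPK1]", "[SPK2]"]

def to_pseudo_dialogue(paragraph, max_turns=6):
    """Split plain text into sentence-level pseudo-turns with alternating speaker tags."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
    return [f"{SPEAKERS[i % 2]} {s}" for i, s in enumerate(sentences[:max_turns])]

def continuity_example(turns, corpus, negative_prob=0.5):
    """Build a (context, candidate, label) tuple: is the candidate the true next turn?"""
    context, true_next = turns[:-1], turns[-1]
    if random.random() < negative_prob:
        # Negative candidate: a turn sampled from some other pseudo-dialogue
        # (a real pipeline would make sure it is not the true next turn).
        distractor = random.choice(random.choice(corpus))
        return context, distractor, 0
    return context, true_next, 1

if __name__ == "__main__":
    docs = [
        "The cafe opens at eight. It serves espresso and tea. Pastries arrive by nine.",
        "The train was delayed. Passengers waited on the platform. Service resumed at noon.",
    ]
    corpus = [to_pseudo_dialogue(d) for d in docs]
    for turns in corpus:
        context, candidate, label = continuity_example(turns, corpus)
        print(context, "->", candidate, "| continuous:", label)
```

Analogous pairs could be built for the speaker-awareness and consistency objectives, for example by masking or shuffling speaker tags or by swapping turns across documents; again, this is only a rough approximation of what such pre-training data might look like.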
Related papers
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
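As a rough illustration of the self-explanation idea summarized in the entry above (the abstract does not give the actual prompt wording, so the template and helper function below are hypothetical), the model is first asked to explain each utterance and only then to answer the downstream question:

```python
def build_self_explanation_prompt(dialogue, task_question):
    """Assemble a prompt that asks the model to explain each utterance before answering.

    The wording is illustrative only; the prompt used in the paper may differ.
    """
    lines = ["Dialogue:"]
    lines += [f"{i + 1}. {speaker}: {utterance}"
              for i, (speaker, utterance) in enumerate(dialogue)]
    lines += [
        "",
        "First, explain the intent of each utterance in one sentence.",
        "Then, using those explanations, answer the following question.",
        f"Question: {task_question}",
    ]
    return "\n".join(lines)

if __name__ == "__main__":
    dialogue = [("A", "Can we move the meeting to Friday?"),
                ("B", "Friday works, but only after 3pm.")]
    # The resulting string would be sent to an LLM of choice (API call omitted here).
    print(build_self_explanation_prompt(dialogue, "When is the meeting likely to happen?"))
```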
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines on four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
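DialogPrompt's exact architecture is not described in the entry above, but the general mechanism of continuous prompt embeddings can be sketched as a small module that prepends learnable vectors to the encoded dialogue context; the module name, sizes, and initialization below are illustrative assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class ContinuousPrompt(nn.Module):
    """Prepend n_prompt learnable embedding vectors to a batch of token embeddings.

    A generic prompt-tuning sketch, not DialogPrompt's actual architecture.
    """
    def __init__(self, n_prompt: int, hidden_size: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_prompt, hidden_size) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, hidden) -> (batch, n_prompt + seq_len, hidden)
        batch_size = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch_size, -1, -1)
        return torch.cat([prompt, token_embeds], dim=1)

if __name__ == "__main__":
    layer = ContinuousPrompt(n_prompt=10, hidden_size=768)
    fake_context = torch.randn(2, 32, 768)  # stand-in for an encoded dialogue context
    print(layer(fake_context).shape)        # torch.Size([2, 42, 768])
```

In a typical prompt-tuning setup, these prompt parameters would be the main trainable weights while the backbone generation model stays largely frozen, which matches the entry's emphasis on avoiding fine-tuning on limited dialogue data.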
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of diverse tasks such as question answering, act classification, dialogue summarization, etc.
While dialogue corpora are abundantly available, labeled data for specific learning tasks can be highly scarce and expensive.
In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks.
arXiv Detail & Related papers (2020-02-27T04:36:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.