Structural Pre-training for Dialogue Comprehension
- URL: http://arxiv.org/abs/2105.10956v1
- Date: Sun, 23 May 2021 15:16:54 GMT
- Title: Structural Pre-training for Dialogue Comprehension
- Authors: Zhuosheng Zhang, Hai Zhao
- Abstract summary: We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue-exclusive features.
To simulate dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
- Score: 51.215629336320305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (PrLMs) have demonstrated superior performance
due to their strong ability to learn universal language representations from
self-supervised pre-training. However, even with the help of the powerful
PrLMs, it is still challenging to effectively capture task-related knowledge
from dialogue texts which are enriched by correlations among speaker-aware
utterances. In this work, we present SPIDER, Structural Pre-traIned DialoguE
Reader, to capture dialogue-exclusive features. To simulate dialogue-like
features, we propose two training objectives in addition to the original LM
objectives: 1) utterance order restoration, which predicts the order of the
permuted utterances in dialogue context; 2) sentence backbone regularization,
which regularizes the model to improve the factual correctness of summarized
subject-verb-object triplets. Experimental results on widely used dialogue
benchmarks verify the effectiveness of the newly introduced self-supervised
tasks.
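As a rough illustration of the first objective, the sketch below builds a hypothetical utterance-order-restoration training example by permuting the utterances of a dialogue and keeping their original positions as labels. The function name, the permutation window, and the label encoding are illustrative assumptions rather than SPIDER's exact recipe, and the sentence backbone regularization objective (SVO triplets) is not shown.

```python
import random

def make_order_restoration_example(utterances, max_window=4):
    """Hypothetical sketch of utterance order restoration: permute (up to)
    the first `max_window` utterances of a dialogue and keep their original
    positions as labels. Names and windowing are assumptions, not SPIDER's
    exact recipe."""
    window = min(len(utterances), max_window)
    order = list(range(window))
    random.shuffle(order)                        # permuted reading order
    permuted = [utterances[i] for i in order]    # shuffled dialogue context
    labels = order                               # labels[j] = original index of permuted[j]
    return permuted + utterances[window:], labels

# Toy usage: the model would be trained to recover `labels` from the permuted context.
dialogue = [
    "A: Hi, do you still have the blue jacket in stock?",
    "B: Yes, in medium and large.",
    "A: Great, I'll take a medium.",
    "B: Done, it's added to your order.",
]
permuted_dialogue, labels = make_order_restoration_example(dialogue)
```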
Related papers
- LEEETs-Dial: Linguistic Entrainment in End-to-End Task-oriented Dialogue systems [0.0]
We introduce methods for achieving dialogue entrainment in a GPT-2-based end-to-end task-oriented dialogue system.
We experiment with training instance weighting, entrainment-specific loss, and additional conditioning to generate responses that align with the user.
arXiv Detail & Related papers (2023-11-15T21:35:25Z)
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension [42.57581945778631]
Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing.
We propose a novel type of dialogue summarization task - STRUctured DiaLoguE Summarization.
We show that our STRUDEL dialogue comprehension model can significantly improve the dialogue comprehension performance of transformer encoder language models.
arXiv Detail & Related papers (2022-12-24T04:39:54Z)
- Self-supervised Dialogue Learning for Spoken Conversational Question Answering [29.545937716796082]
In spoken conversational question answering (SCQA), the answer to the corresponding question is generated by retrieving and then analyzing a fixed spoken document, including multi-part conversations.
We introduce a self-supervised learning approach, including incoherence discrimination, insertion detection, and question prediction, to explicitly capture the coreference resolution and dialogue coherence.
Our proposed method provides more coherent, meaningful, and appropriate responses, yielding superior performance gains compared to the original pre-trained language models.
arXiv Detail & Related papers (2021-06-04T00:09:38Z)
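A minimal sketch of one of the self-supervised tasks named above, insertion detection: splice an utterance from a different dialogue into the current one and train the model to find it. The helper name, sampling, and position labeling are assumptions for illustration, not the paper's specification.

```python
import random

def make_insertion_detection_example(dialogue, other_dialogues):
    """Hypothetical insertion-detection example: insert one utterance drawn
    from another dialogue and label its position. Sampling and labeling
    choices here are illustrative assumptions."""
    foreign = random.choice(random.choice(other_dialogues))  # an out-of-dialogue utterance
    position = random.randrange(len(dialogue) + 1)           # where it gets spliced in
    corrupted = dialogue[:position] + [foreign] + dialogue[position:]
    return corrupted, position  # the model is trained to predict `position`
```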
- Dialogue-oriented Pre-training [70.03028879331339]
We propose three strategies to simulate conversation features on general plain text.
Dialog-PrLM is fine-tuned on three public multi-turn dialogue datasets.
arXiv Detail & Related papers (2021-06-01T12:02:46Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In the existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) as encoder represent the dialogues coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
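The sketch below shows one way the multi-task setup described above could look: a shared PrLM encoder with a classification head per task and a summed loss. It assumes a HuggingFace-style encoder exposing `pooler_output`; the binary heads, equal loss weights, and omission of the (generative) utterance restoration task are simplifying assumptions, not the paper's configuration.

```python
import torch.nn as nn

class MultiTaskResponseMatcher(nn.Module):
    """Sketch: one shared PrLM encoder with a head per task. Head names
    mirror the auxiliary tasks listed above; binary heads and the
    equal-weight loss sum are simplifying assumptions."""

    TASKS = ("response_selection", "next_session_prediction",
             "incoherence_detection", "consistency_discrimination")

    def __init__(self, encoder, hidden_size):
        super().__init__()
        self.encoder = encoder
        self.heads = nn.ModuleDict({t: nn.Linear(hidden_size, 2) for t in self.TASKS})
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, task_batches):
        # task_batches: {task_name: (input_ids, attention_mask, labels)}
        total_loss = 0.0
        for task, (input_ids, attention_mask, labels) in task_batches.items():
            pooled = self.encoder(input_ids, attention_mask=attention_mask).pooler_output
            logits = self.heads[task](pooled)
            total_loss = total_loss + self.criterion(logits, labels)
        return total_loss  # summed multi-task loss, backpropagated jointly
```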
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
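As a rough illustration of the speaker-token idea above, this sketch flattens a task-oriented dialogue by prefixing each turn with a user or system marker before standard masked-LM tokenization and masking. The helper and the exact token strings are assumptions for illustration, not TOD-BERT's precise vocabulary or preprocessing.

```python
def flatten_with_speaker_tokens(turns, user_token="[USR]", system_token="[SYS]"):
    """Hypothetical preprocessing sketch: prefix every turn with a speaker
    token so the masked LM sees who is talking. Token strings are assumed."""
    pieces = []
    for speaker, text in turns:
        marker = user_token if speaker == "user" else system_token
        pieces.append(f"{marker} {text}")
    return " ".join(pieces)

# Toy usage: the flattened string would then be tokenized and masked as usual.
dialogue = [
    ("user", "I need a cheap hotel in the centre."),
    ("system", "There are two matches. Do you need parking?"),
    ("user", "Yes, please book it for two nights."),
]
print(flatten_with_speaker_tokens(dialogue))
```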
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.