Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension
- URL: http://arxiv.org/abs/2301.03953v2
- Date: Wed, 11 Jan 2023 02:21:04 GMT
- Title: Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension
- Authors: Zhuosheng Zhang, Hai Zhao, Longxiang Liu
- Abstract summary: We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines on four public benchmark datasets.
- Score: 81.47133615169203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training machines to understand natural language and interact with humans is
one of the major goals of artificial intelligence. Recent years have witnessed
an evolution from matching networks to pre-trained language models (PrLMs). In
contrast to the plain-text modeling as the focus of the PrLMs, dialogue texts
involve multiple speakers and reflect special characteristics such as topic
transitions and structure dependencies between distant utterances. However, the
related PrLM models commonly represent dialogues sequentially by processing the
pairwise dialogue history as a whole. Thus, the hierarchical information on
utterance interrelations and speaker roles coupled in such representations
is not well addressed. In this work, we propose compositional learning for
holistic interaction across the utterances beyond the sequential
contextualization from PrLMs, in order to capture the utterance-aware and
speaker-aware representations entailed in a dialogue history. We decouple the
contextualized word representations by masking mechanisms in a Transformer-based
PrLM, making each word focus only on the words in the current utterance, in other
utterances, and in the two speaker roles (i.e., utterances of the sender and utterances of
the receiver), respectively. In addition, we employ domain-adaptive training
strategies to help the model adapt to the dialogue domains. Experimental
results show that our method substantially boosts the strong PrLM baselines on
four public benchmark datasets, achieving new state-of-the-art performance over
previous methods.
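To make the decoupling idea concrete, here is a minimal Python/PyTorch sketch of channel-aware attention masking. It is illustrative only and not the authors' implementation: the names `build_channel_masks`, `utt_ids`, and `spk_ids`, the toy dialogue, and the unprojected attention are all assumptions; the sketch only shows how self-attention could be restricted to the four channels described in the abstract (current utterance, other utterances, sender's utterances, receiver's utterances).

```python
# Illustrative sketch of channel-aware decoupling masks (not the paper's code).
# Given per-token utterance ids and speaker ids, build boolean attention masks so
# that each token attends only to one "channel" of the dialogue at a time.
import torch

def build_channel_masks(utt_ids: torch.Tensor, spk_ids: torch.Tensor):
    """utt_ids, spk_ids: [batch, seq_len] integer tensors (assumed inputs).
    Returns [batch, seq_len, seq_len] boolean masks; True = attention allowed."""
    same_utt = utt_ids.unsqueeze(2) == utt_ids.unsqueeze(1)  # tokens i, j in the same utterance
    same_spk = spk_ids.unsqueeze(2) == spk_ids.unsqueeze(1)  # tokens i, j from the same speaker role
    return {
        "current_utterance": same_utt,   # focus on words in the current utterance
        "other_utterances": ~same_utt,   # focus on words in other utterances
        "same_speaker": same_spk,        # e.g., utterances of the sender
        "other_speaker": ~same_spk,      # e.g., utterances of the receiver
    }

def masked_attention(q, k, v, allow_mask):
    """Scaled dot-product attention restricted by a boolean allow_mask."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    scores = scores.masked_fill(~allow_mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

if __name__ == "__main__":
    # Toy dialogue: 3 utterances over 8 tokens, two alternating speaker roles.
    utt_ids = torch.tensor([[0, 0, 0, 1, 1, 2, 2, 2]])
    spk_ids = torch.tensor([[0, 0, 0, 1, 1, 0, 0, 0]])
    masks = build_channel_masks(utt_ids, spk_ids)
    h = torch.randn(1, 8, 16)  # stand-in for PrLM token representations
    decoupled = {name: masked_attention(h, h, h, m) for name, m in masks.items()}
    print({name: t.shape for name, t in decoupled.items()})
```

In a full model, each channel's restricted attention output would be combined with the PrLM's sequential representation; the sketch above only isolates the mask construction step.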
Related papers
- SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization [48.284512017469524]
Multi-turn dialogues are characterized by their extended length and turn-taking structure.
Traditional language models often overlook the distinct features of these dialogues by treating them as regular text.
We propose a speaker-enhanced pre-training method for long dialogue summarization.
arXiv Detail & Related papers (2024-01-31T04:50:00Z)
- Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework that integrates the rich structural features to better model the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z)
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
- Advances in Multi-turn Dialogue Comprehension: A Survey [51.215629336320305]
Training machines to understand natural language and interact with humans is an elusive and essential task of artificial intelligence.
This paper reviews the previous methods from the technical perspective of dialogue modeling for the dialogue comprehension task.
In addition, we categorize dialogue-related pre-training techniques which are employed to enhance PrLMs in dialogue scenarios.
arXiv Detail & Related papers (2021-10-11T03:52:37Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue-exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) used as encoders represent the dialogues only coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)