Contrastive Speaker-Aware Learning for Multi-party Dialogue Generation with LLMs
- URL: http://arxiv.org/abs/2503.08842v1
- Date: Tue, 11 Mar 2025 19:28:12 GMT
- Title: Contrastive Speaker-Aware Learning for Multi-party Dialogue Generation with LLMs
- Authors: Tianyu Sun, Kun Qian, Wenhong Wang
- Abstract summary: Multi-party dialogue generation presents significant challenges due to the complex interplay of multiple speakers and interwoven conversational threads. This paper introduces Speaker-Attentive LLM (SA-LLM), a novel generative model that leverages pre-trained Large Language Models (LLMs) and a speaker-aware contrastive learning strategy to address these challenges. SA-LLM incorporates a speaker-attributed input encoding and a contrastive learning objective to implicitly learn contextual coherence and speaker roles without explicit relation annotations.
- Score: 4.691083532629246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-party dialogue generation presents significant challenges due to the complex interplay of multiple speakers and interwoven conversational threads. Traditional approaches often fall short in capturing these complexities, particularly when relying on manually annotated dialogue relations. This paper introduces Speaker-Attentive LLM (SA-LLM), a novel generative model that leverages pre-trained Large Language Models (LLMs) and a speaker-aware contrastive learning strategy to address these challenges. SA-LLM incorporates a speaker-attributed input encoding and a contrastive learning objective to implicitly learn contextual coherence and speaker roles without explicit relation annotations. Extensive experiments on the Ubuntu IRC and Movie Dialogues datasets demonstrate that SA-LLM significantly outperforms state-of-the-art baselines in automatic and human evaluations, achieving superior performance in fluency, coherence, informativeness, and response diversity. Ablation studies and detailed error analyses further validate the effectiveness of the proposed speaker-attentive training approach, highlighting its robustness across different speaker roles and context lengths. The results underscore the potential of SA-LLM as a powerful and annotation-free solution for high-quality multi-party dialogue generation.
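The abstract names two ingredients, a speaker-attributed input encoding and a contrastive learning objective, but does not spell out either. As a rough illustration only, the sketch below assumes that speaker tags are prepended to each utterance as plain text and that the contrastive objective is a generic InfoNCE loss over embedding vectors. Neither assumption is confirmed by the source, and the names `encode_with_speakers`, `cosine`, `info_nce`, the `<speaker>` tag format, and the temperature value are all hypothetical.

```python
import math


def encode_with_speakers(turns):
    """Prefix each utterance with a speaker tag and join the turns.

    The "<speaker> utterance" layout is an assumed format, not the paper's.
    """
    return "\n".join(f"<{speaker}> {utterance}" for speaker, utterance in turns)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: negative log-softmax of the positive pair.

    The loss is low when the anchor is closer to the positive than to
    any negative. This stands in for the paper's objective, which the
    abstract does not specify.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


turns = [
    ("alice", "my wifi driver fails after the update"),
    ("bob", "which kernel are you on?"),
]
prompt = encode_with_speakers(turns)

# Toy 2-d embeddings: a matching pair scores a lower loss than a mismatched one.
loss_matched = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_mismatched = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a real system the vectors would come from the LLM's hidden states, and how positives and negatives are chosen (for example, by speaker or by conversational thread) is exactly the design question the paper addresses; the toy vectors here only show the direction in which the loss moves.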
Related papers
- Advancing Multi-Party Dialogue Systems with Speaker-ware Contrastive Learning [10.678477576849579]
We propose CMR, a Contrastive learning-based Multi-party dialogue Response generation model. CMR uses self-supervised contrastive learning to better distinguish "who says what". CMR significantly outperforms state-of-the-art models in multi-party dialogue response tasks.
arXiv Detail & Related papers (2025-01-20T06:28:22Z)
- Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization [25.213694510527436]
Most existing speaker diarization systems rely exclusively on unimodal acoustic information.
We propose a novel multimodal approach that jointly utilizes audio, visual, and semantic cues to enhance speaker diarization.
Our approach consistently outperforms state-of-the-art speaker diarization methods.
arXiv Detail & Related papers (2024-08-22T03:34:03Z)
- SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization [48.284512017469524]
Multi-turn dialogues are characterized by their extended length and the presence of turn-taking conversations.
Traditional language models often overlook the distinct features of these dialogues by treating them as regular text.
We propose a speaker-enhanced pre-training method for long dialogue summarization.
arXiv Detail & Related papers (2024-01-31T04:50:00Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Enhanced Speaker-aware Multi-party Multi-turn Dialogue Comprehension [43.352833140317486]
Multi-party multi-turn dialogue comprehension brings unprecedented challenges.
Most existing methods deal with dialogue contexts as plain texts.
We propose an enhanced speaker-aware model with masking attention and heterogeneous graph networks.
arXiv Detail & Related papers (2021-09-09T07:12:22Z)
- Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension [46.69961067676279]
Multi-party dialogue machine reading comprehension (MRC) poses a tremendous challenge since it involves multiple speakers in one dialogue.
Previous models focus on how to incorporate speaker information flows using complex graph-based modules.
In this paper, we design two labour-free self- and pseudo-self-supervised prediction tasks on speaker and key-utterance to implicitly model the speaker information flows.
arXiv Detail & Related papers (2021-09-08T16:51:41Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) used as encoders represent dialogues only coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.