Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based
Chatbots
- URL: http://arxiv.org/abs/2004.03588v2
- Date: Thu, 30 Jul 2020 01:27:11 GMT
- Title: Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based
Chatbots
- Authors: Jia-Chen Gu, Tianda Li, Quan Liu, Zhen-Hua Ling, Zhiming Su, Si Wei,
Xiaodan Zhu
- Abstract summary: A new model, named Speaker-Aware BERT (SA-BERT), is proposed to make the model aware of speaker change information.
A speaker-aware disentanglement strategy is proposed to handle entangled dialogues.
- Score: 47.40380290055558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of employing pre-trained language models
for multi-turn response selection in retrieval-based chatbots. A new model,
named Speaker-Aware BERT (SA-BERT), is proposed to make the model aware of
speaker change information, an important and intrinsic property of multi-turn
dialogues. Furthermore, a speaker-aware disentanglement strategy is proposed
to handle entangled dialogues. This strategy selects a small number of the
most important utterances as the filtered context, according to the speaker
information they contain. Finally, domain adaptation is performed to
incorporate in-domain knowledge into the pre-trained language models.
Experiments on five public datasets show that the proposed model outperforms
existing models on all metrics by large margins and achieves new
state-of-the-art performance for multi-turn response selection.
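The disentanglement strategy above can be pictured as a filtering step over the dialogue history. The sketch below is a minimal, illustrative heuristic, not the paper's exact selection criterion: it assumes the "most important" utterances are those spoken by the same speaker as the latest turn, and the `max_utterances` cap is likewise an assumption.

```python
# Hedged sketch of speaker-aware context filtering: keep only the turns
# produced by the speaker of the most recent utterance, capped at a small
# budget. The real SA-BERT strategy is more involved; this only shows the
# general shape of selecting a filtered context from speaker information.

def filter_context(dialogue, max_utterances=3):
    """dialogue: list of (speaker_id, utterance) pairs, oldest first."""
    target_speaker = dialogue[-1][0]  # speaker of the most recent turn
    # Keep the turns contributed by the target speaker, oldest first.
    kept = [turn for turn in dialogue if turn[0] == target_speaker]
    return kept[-max_utterances:]

dialogue = [
    ("A", "Anyone seen the build break?"),
    ("B", "Yes, tests fail on main."),
    ("C", "Unrelated: lunch at noon?"),
    ("A", "Which test exactly?"),
]
print(filter_context(dialogue))
# [('A', 'Anyone seen the build break?'), ('A', 'Which test exactly?')]
```

In an entangled multi-party log, a rule of this kind shrinks the context fed to the encoder to the turns most relevant to the speaker awaiting a response.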
Related papers
- Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling [13.628984890958314]
We introduce a data-centric customization approach for efficiently enhancing multimodal understanding in conversational speech modeling.
Our approach achieves state-of-the-art performance on the Spoken-SQuAD benchmark, using only 10% of the training data with open-weight models.
We also introduce ASK-QA, the first dataset for multi-turn spoken dialogue with ambiguous user requests and dynamic evaluation inputs.
arXiv Detail & Related papers (2024-12-20T15:43:09Z)
- MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models [59.80042864360884]
Speaker-attributed automatic speech recognition (SA-ASR) aims to transcribe speech while assigning transcripts to the corresponding speakers accurately.
This paper introduces a novel approach, leveraging a frozen multilingual ASR model to incorporate speaker attribution into the transcriptions.
arXiv Detail & Related papers (2024-11-27T09:01:08Z)
- SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization [48.284512017469524]
Multi-turn dialogues are characterized by their extended length and frequent turn-taking between speakers.
Traditional language models often overlook the distinct features of these dialogues by treating them as regular text.
We propose a speaker-enhanced pre-training method for long dialogue summarization.
arXiv Detail & Related papers (2024-01-31T04:50:00Z)
- Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native automated speech scoring (ASS), called speaker-conditioned hierarchical modeling.
Our technique takes advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed it as additional speaker-specific context to our network to score a particular response.
arXiv Detail & Related papers (2021-08-30T07:00:28Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) used as encoders represent dialogues only coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks: next session prediction, utterance restoration, incoherence detection, and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection [11.465266718370536]
We study the task of selecting the optimal response given a user and system utterance history in retrieval-based dialog systems.
We propose utterance manipulation strategies (UMS) to address this problem.
UMS comprises several strategies (i.e., insertion, deletion, and search) that help the response selection model maintain dialog coherence.
arXiv Detail & Related papers (2020-09-10T07:39:05Z)
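As a rough illustration of what such manipulation strategies can look like, the sketch below implements toy insertion and deletion operations over a list of utterances. These stand-ins are assumptions for illustration only; the paper's actual procedures, including its search strategy, are more involved.

```python
# Hedged sketch of utterance-manipulation-style augmentations: "insertion"
# places a foreign utterance at a random position, "deletion" drops a turn.
# A response selection model trained on such corrupted dialogues learns to
# detect the disruption, encouraging sensitivity to dialog coherence.
import random

def insert_utterance(dialogue, utterance, rng):
    """Insertion: splice a foreign utterance into a random position."""
    d = list(dialogue)
    d.insert(rng.randrange(len(d) + 1), utterance)
    return d

def delete_utterance(dialogue, rng):
    """Deletion: drop one randomly chosen utterance."""
    d = list(dialogue)
    d.pop(rng.randrange(len(d)))
    return d

rng = random.Random(0)  # seeded for reproducibility
dialogue = ["hi", "hello", "how are you?"]
print(insert_utterance(dialogue, "totally off-topic", rng))
print(delete_utterance(dialogue, rng))
```

Either corrupted dialogue can then be paired with a binary label (manipulated or not) as an auxiliary training signal.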
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.