Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair
Coherence Scoring
- URL: http://arxiv.org/abs/2106.06719v1
- Date: Sat, 12 Jun 2021 08:49:20 GMT
- Title: Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair
Coherence Scoring
- Authors: Linzi Xing, Giuseppe Carenini
- Abstract summary: We present a strategy to generate a training corpus for utterance-pair coherence scoring.
Then, we train a BERT-based neural utterance-pair coherence model with the obtained training corpus.
Finally, such model is used to measure the topical relevance between utterances, acting as the basis of the segmentation inference.
- Score: 8.31009800792799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialogue topic segmentation is critical in several dialogue modeling
problems. However, popular unsupervised approaches only exploit surface
features in assessing topical coherence among utterances. In this work, we
address this limitation by leveraging supervisory signals from the
utterance-pair coherence scoring task. First, we present a simple yet effective
strategy to generate a training corpus for utterance-pair coherence scoring.
Then, we train a BERT-based neural utterance-pair coherence model with the
obtained training corpus. Finally, such model is used to measure the topical
relevance between utterances, acting as the basis of the segmentation
inference. Experiments on three public datasets in English and Chinese
demonstrate that our proposal outperforms the state-of-the-art baselines.
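The abstract describes the last step only at a high level: pairwise coherence scores between utterances drive the segmentation inference. The Python sketch below shows one way such scores could be turned into topic boundaries. It is a sketch under stated assumptions, not the authors' implementation. The TextTiling-style depth scoring, the mean-plus-half-standard-deviation cutoff, and all function names (`jaccard_coherence`, `depth_scores`, `segment`) are assumptions that do not come from the abstract. The word-overlap scorer is a toy stand-in for the trained BERT-based coherence model.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def jaccard_coherence(a: str, b: str) -> float:
    """Toy stand-in scorer (assumption): word overlap of two utterances, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def depth_scores(scores: Sequence[float]) -> List[float]:
    """TextTiling-style depth of each gap score relative to its neighbouring peaks."""
    depths = []
    for i, s in enumerate(scores):
        # Climb to the left while the scores do not decrease.
        left = s
        for j in range(i - 1, -1, -1):
            if scores[j] < left:
                break
            left = scores[j]
        # Climb to the right while the scores do not decrease.
        right = s
        for j in range(i + 1, len(scores)):
            if scores[j] < right:
                break
            right = scores[j]
        depths.append((left - s) + (right - s))
    return depths


def segment(
    utterances: Sequence[str],
    scorer: Callable[[str, str], float] = jaccard_coherence,
) -> List[int]:
    """Return indices i such that a topic boundary is placed after utterance i."""
    gaps = [scorer(utterances[i], utterances[i + 1])
            for i in range(len(utterances) - 1)]
    if not gaps:
        return []
    depths = depth_scores(gaps)
    # Assumed cutoff: keep only valleys clearly deeper than average.
    threshold = mean(depths) + pstdev(depths) / 2
    return [i for i, d in enumerate(depths) if d > threshold]
```

To use a learned model, pass any callable that maps two utterance strings to a coherence score as `scorer`; the rest of the sketch is unchanged.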
Related papers
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language
Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance
Representation [51.22712675266523]
Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks.
We propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data.
arXiv Detail & Related papers (2023-05-04T11:35:23Z)
- Improving Topic Segmentation by Injecting Discourse Dependencies [29.353285741379334]
We present a discourse-aware neural topic segmentation model with the injection of above-sentence discourse dependency structures.
Our empirical study on English evaluation datasets shows that injecting above-sentence discourse structures to a neural topic segmenter can substantially improve its performances.
arXiv Detail & Related papers (2022-09-18T18:22:25Z)
- Opponent Modeling in Negotiation Dialogues by Related Data Adaptation [20.505272677769355]
We propose a ranker for identifying priorities from negotiation dialogues.
The model takes in a partial dialogue as input and predicts the priority order of the opponent.
We show the utility of our proposed approach through extensive experiments based on two dialogue datasets.
arXiv Detail & Related papers (2022-04-30T21:11:41Z)
- FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment
Act Flows [63.116280145770006]
We propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it.
To utilize segment act flows, sequences of segment acts, for evaluation, we develop the first consensus-based dialogue evaluation framework, FlowEval.
arXiv Detail & Related papers (2022-02-14T11:37:20Z)
- DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings [33.89889949577356]
We propose DialogueCSE, a dialogue-based contrastive learning approach to tackle this issue.
We evaluate our model on three multi-turn dialogue datasets: the Microsoft Dialogue Corpus, the Jing Dong Dialogue Corpus, and the E-commerce Dialogue Corpus.
arXiv Detail & Related papers (2021-09-26T13:25:41Z)
- DialogBERT: Discourse-Aware Response Generation via Learning to Recover
and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models.
To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression.
Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z)
- Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way.
arXiv Detail & Related papers (2020-10-26T21:34:39Z)
- Modeling Topical Relevance for Multi-Turn Dialogue Generation [61.87165077442267]
We propose a new model, named STAR-BTM, to tackle the problem of topic drift in multi-turn dialogue.
The Biterm Topic Model is pre-trained on the whole training dataset. Then, the topic level attention weights are computed based on the topic representation of each context.
Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2020-09-27T03:33:22Z)