Related papers: Topic Detection from Conversational Dialogue Corpus with Parallel Dirichlet Allocation Model and Elbow Method

URL: http://arxiv.org/abs/2006.03353v1
Date: Fri, 5 Jun 2020 10:24:43 GMT
Title: Topic Detection from Conversational Dialogue Corpus with Parallel Dirichlet Allocation Model and Elbow Method
Authors: Haider Khalid, Vincent Wade
Abstract summary: We propose a topic detection approach with the Parallel Latent Dirichlet Allocation (PLDA) model. We use K-means clustering with the Elbow Method for interpretation and validation of within-cluster consistency. The experimental results show that combining PLDA with the Elbow Method selects the optimal number of clusters and refines the topics for the conversation.
Score: 1.599072005190786
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A conversational system needs to know how to switch between topics to sustain a conversation over an extended period. Topic detection from a dialogue corpus has therefore become an important task, and accurate prediction of conversation topics is important for creating coherent and engaging dialogue systems. In this paper, we propose a topic detection approach with the Parallel Latent Dirichlet Allocation (PLDA) model, clustering a vocabulary of known similar words based on TF-IDF scores and the Bag of Words (BOW) technique. In the experiment, we use K-means clustering with the Elbow Method for interpretation and validation of within-cluster consistency to select the optimal number of clusters. We evaluate our approach by comparing it with traditional LDA and clustering techniques. The experimental results show that combining PLDA with the Elbow Method selects the optimal number of clusters and refines the topics for the conversation.
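The elbow step mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' PLDA pipeline: it only builds TF-IDF vectors for a few made-up utterances, runs a plain K-means for several values of k, and picks the k where the within-cluster sum of squares (WCSS) curve bends most sharply. The function names (`tfidf_vectors`, `kmeans_wcss`, `elbow_k`), the toy utterances, and the second-difference rule for locating the elbow are all assumptions made for illustration; the abstract does not say how the elbow is located.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn whitespace-tokenised documents into dense TF-IDF vectors."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        length = max(len(toks), 1)  # guard against empty documents
        vectors.append([
            (counts[t] / length) * (math.log(n_docs / doc_freq[t]) + 1.0)
            for t in vocab
        ])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points, k, n_iter=50, seed=0):
    """Run a basic K-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_k(wcss):
    """Pick k from WCSS values for k = 1, 2, ... (assumed heuristic).

    The elbow is taken to be the k with the largest second difference,
    i.e. where the curve flattens most abruptly.
    """
    if len(wcss) < 3:
        raise ValueError("need WCSS for at least three values of k")
    bends = {
        i + 1: wcss[i - 1] - 2 * wcss[i] + wcss[i + 1]
        for i in range(1, len(wcss) - 1)
    }
    return max(bends, key=bends.get)


if __name__ == "__main__":
    # Hypothetical utterances, invented for this sketch.
    utterances = [
        "the match was great and the team played well",
        "our team won the football match last night",
        "the striker scored two goals in the match",
        "i cooked pasta with tomato sauce for dinner",
        "the recipe needs fresh tomato and basil",
        "dinner was a simple pasta recipe",
        "the flight to rome was delayed by two hours",
        "we booked a hotel near the airport in rome",
    ]
    vectors = tfidf_vectors(utterances)
    wcss = [kmeans_wcss(vectors, k) for k in range(1, 6)]
    print("WCSS for k=1..5:", [round(w, 3) for w in wcss])
    print("elbow at k =", elbow_k(wcss))
```

On a corpus this small the curve is noisy, so the selected k should be read as a demonstration of the mechanics rather than a meaningful topic count.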

Related papers

A Multi-view Discourse Framework for Integrating Semantic and Syntactic Features in Dialog Agents [0.0]
Multiturn dialogue models aim to generate human-like responses by leveraging conversational context. Existing methods often neglect the interactions between these utterances or treat all of them as equally significant. This paper introduces a discourse-aware framework for response selection in retrieval-based dialogue systems.
arXiv Detail & Related papers (2025-04-12T04:22:18Z)
Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective. We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way. We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose Self-Supervised Post-training (SSP), a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model. To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained by SSP to the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z)
Multi-Granularity Prompts for Topic Shift Detection in Dialogue [13.739991183173494]
The goal of dialogue topic shift detection is to identify whether the current topic in a conversation has changed or needs to change. Previous work focused on detecting topic shifts using pre-trained models to encode the utterance. We take a prompt-based approach to fully extract topic information from dialogues at multiple-granularity, i.e., label, turn, and topic.
arXiv Detail & Related papers (2023-05-23T12:35:49Z)
Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance Representation [51.22712675266523]
Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks. We propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data.
arXiv Detail & Related papers (2023-05-04T11:35:23Z)
Improve Retrieval-based Dialogue System via Syntax-Informed Attention [46.79601705850277]
We propose SIA, Syntax-Informed Attention, considering both intra- and inter-sentence syntax information. We evaluate our method on three widely used benchmarks and experimental results demonstrate the general superiority of our method on dialogue response selection.
arXiv Detail & Related papers (2023-03-12T08:14:16Z)
CluCDD: Contrastive Dialogue Disentanglement via Clustering [18.06976502939079]
A huge number of multi-participant dialogues happen online every day. Dialogue disentanglement aims at separating an entangled dialogue into detached sessions. We propose a model named CluCDD, which aggregates utterances by contrastive learning.
arXiv Detail & Related papers (2023-02-16T08:47:51Z)
Findings on Conversation Disentanglement [28.874162427052905]
We build a learning model that learns utterance-to-utterance and utterance-to-thread classification. Experiments on the Ubuntu IRC dataset show that this approach has the potential to outperform the conventional greedy approach.
arXiv Detail & Related papers (2021-12-10T05:54:48Z)
Response Selection for Multi-Party Conversations with Dynamic Topic Tracking [63.15158355071206]
We frame response selection as a dynamic topic tracking task to match the topic between the response and relevant conversation context. We propose a novel multi-task learning framework that supports efficient encoding through large pretrained models. Experimental results on the DSTC-8 Ubuntu IRC dataset show state-of-the-art results in response selection and topic disentanglement tasks.
arXiv Detail & Related papers (2020-10-15T14:21:38Z)
Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP. This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations. Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)
Modeling Topical Relevance for Multi-Turn Dialogue Generation [61.87165077442267]
We propose a new model, named STAR-BTM, to tackle the problem of topic drift in multi-turn dialogue. The Biterm Topic Model is pre-trained on the whole training dataset. Then, the topic level attention weights are computed based on the topic representation of each context. Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2020-09-27T03:33:22Z)
Topic-Aware Multi-turn Dialogue Modeling [91.52820664879432]
This paper presents a novel solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way. Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and Topic-Aware Dual-attention Matching (TADAM) Network.
arXiv Detail & Related papers (2020-09-26T08:43:06Z)
A Hybrid Framework for Topic Structure using Laughter Occurrences [0.3680403821470856]
In this work we combine both paralinguistic and linguistic knowledge into a hybrid framework through a multi-level hierarchy. The laughter occurrences are used as paralinguistic information from the multiparty meeting transcripts of ICSI database. This training-free topic structuring approach can be applicable to online understanding of spoken dialogs.
arXiv Detail & Related papers (2019-12-31T23:31:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.