Topic Detection from Conversational Dialogue Corpus with Parallel
  Dirichlet Allocation Model and Elbow Method
        - URL: http://arxiv.org/abs/2006.03353v1
- Date: Fri, 5 Jun 2020 10:24:43 GMT
- Title: Topic Detection from Conversational Dialogue Corpus with Parallel
  Dirichlet Allocation Model and Elbow Method
- Authors: Haider Khalid, Vincent Wade
- Abstract summary: We propose a topic detection approach with a Parallel Latent Dirichlet Allocation (PLDA) model.
We use K-means clustering with the Elbow Method to interpret and validate the consistency of the within-cluster analysis.
The experimental results show that combining PLDA with the Elbow Method selects the optimal number of clusters and refines the topics for the conversation.
- Score: 1.599072005190786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   A conversational system needs to know how to switch between topics to
sustain a conversation over an extended period. Topic detection from a dialogue
corpus has therefore become an important task, and accurate prediction of
conversation topics is essential for creating coherent and engaging dialogue
systems. In this paper, we propose a topic detection approach with a Parallel
Latent Dirichlet Allocation (PLDA) model that clusters a vocabulary of known
similar words based on TF-IDF scores and the Bag-of-Words (BoW) technique. In
the experiments, we use K-means clustering with the Elbow Method to interpret
and validate the consistency of the within-cluster analysis and to select the
optimal number of clusters. We evaluate our approach by comparing it with
traditional LDA and clustering techniques. The experimental results show that
combining PLDA with the Elbow Method selects the optimal number of clusters and
refines the topics for the conversation.
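
The paper does not release code, so the block below is only a minimal sketch of the pipeline described in the abstract, written in Python with scikit-learn as an assumed implementation library. The toy dialogue corpus, the topic count, the multi-core LDA (standing in for PLDA via n_jobs), and the candidate cluster range are illustrative placeholders, not values from the paper.

```python
# Hedged sketch of the PLDA + K-means + Elbow Method pipeline described above.
# scikit-learn's multi-core online LDA stands in for the PLDA model; the corpus,
# topic count, and candidate cluster range are illustrative placeholders.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

# Toy dialogue corpus: one string per conversation turn (placeholder data).
dialogues = [
    "what movies are playing this weekend",
    "i loved the soundtrack of that film",
    "can you recommend a good restaurant nearby",
    "the pasta place downtown has great reviews",
    "is it going to rain tomorrow afternoon",
    "bring an umbrella the forecast looks wet",
]

# Bag-of-Words counts feed the LDA step; TF-IDF scores (also used by the paper
# for the vocabulary) are computed alongside for reference.
bow = CountVectorizer(stop_words="english")
X_bow = bow.fit_transform(dialogues)
X_tfidf = TfidfVectorizer(stop_words="english").fit_transform(dialogues)

# "Parallel" LDA: run the E-step on all available cores via n_jobs=-1.
lda = LatentDirichletAllocation(n_components=3, n_jobs=-1, random_state=0)
doc_topics = lda.fit_transform(X_bow)  # per-utterance topic distributions

# Elbow Method: fit K-means for a range of k and record the within-cluster
# sum of squares (inertia); the bend ("elbow") in this curve suggests k.
inertias = {}
for k in range(2, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(doc_topics)
    inertias[k] = km.inertia_

for k, wcss in inertias.items():
    print(f"k={k}: within-cluster sum of squares = {wcss:.4f}")
```

In practice the elbow is read off a plot of inertia against k: the point where the decrease flattens is taken as the optimal number of clusters, which is then used for the final K-means run over the topic distributions.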
 
      
        Related papers
- A Multi-view Discourse Framework for Integrating Semantic and Syntactic Features in Dialog Agents [0.0]
 Multi-turn dialogue models aim to generate human-like responses by leveraging conversational context.
Existing methods often neglect the interactions between context utterances or treat all of them as equally significant.
This paper introduces a discourse-aware framework for response selection in retrieval-based dialogue systems.
 arXiv  Detail & Related papers  (2025-04-12T04:22:18Z)
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
 This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
 arXiv  Detail & Related papers  (2023-09-18T11:03:55Z)
- SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
 We propose SSP, a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model.
To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained with SSP to the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
 arXiv  Detail & Related papers  (2023-07-02T13:36:36Z)
- Multi-Granularity Prompts for Topic Shift Detection in Dialogue [13.739991183173494]
 The goal of dialogue topic shift detection is to identify whether the current topic in a conversation has changed or needs to change.
Previous work focused on detecting topic shifts using pre-trained models to encode the utterance.
We take a prompt-based approach to fully extract topic information from dialogues at multiple granularities, i.e., label, turn, and topic.
 arXiv  Detail & Related papers  (2023-05-23T12:35:49Z)
- Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance
  Representation [51.22712675266523]
 Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks.
We propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data.
 arXiv  Detail & Related papers  (2023-05-04T11:35:23Z)
- Improve Retrieval-based Dialogue System via Syntax-Informed Attention [46.79601705850277]
 We propose SIA, Syntax-Informed Attention, considering both intra- and inter-sentence syntax information.
We evaluate our method on three widely used benchmarks and experimental results demonstrate the general superiority of our method on dialogue response selection.
 arXiv  Detail & Related papers  (2023-03-12T08:14:16Z)
- CluCDD:Contrastive Dialogue Disentanglement via Clustering [18.06976502939079]
 A huge number of multi-participant dialogues happen online every day.
 Dialogue disentanglement aims at separating an entangled dialogue into detached sessions.
We propose a model named CluCDD, which aggregates utterances by contrastive learning.
 arXiv  Detail & Related papers  (2023-02-16T08:47:51Z)
- Findings on Conversation Disentanglement [28.874162427052905]
 We build a learning model that learns utterance-to-utterance and utterance-to-thread classification.
Experiments on the Ubuntu IRC dataset show that this approach has the potential to outperform the conventional greedy approach.
 arXiv  Detail & Related papers  (2021-12-10T05:54:48Z)
- Response Selection for Multi-Party Conversations with Dynamic Topic
  Tracking [63.15158355071206]
 We frame response selection as a dynamic topic tracking task to match the topic between the response and relevant conversation context.
We propose a novel multi-task learning framework that supports efficient encoding through large pretrained models.
 Experimental results on the DSTC-8 Ubuntu IRC dataset show state-of-the-art results in response selection and topic disentanglement tasks.
 arXiv  Detail & Related papers  (2020-10-15T14:21:38Z)
- Multi-View Sequence-to-Sequence Models with Conversational Structure for
  Abstractive Dialogue Summarization [72.54873655114844]
 Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations.
 Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
 arXiv  Detail & Related papers  (2020-10-04T20:12:44Z)
- Modeling Topical Relevance for Multi-Turn Dialogue Generation [61.87165077442267]
 We propose a new model, named STAR-BTM, to tackle the problem of topic drift in multi-turn dialogue.
The Biterm Topic Model is pre-trained on the whole training dataset. Then, the topic-level attention weights are computed based on the topic representation of each context.
 Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods.
 arXiv  Detail & Related papers  (2020-09-27T03:33:22Z)
- Topic-Aware Multi-turn Dialogue Modeling [91.52820664879432]
 This paper presents a novel solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way.
Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and Topic-Aware Dual-attention Matching (TADAM) Network.
 arXiv  Detail & Related papers  (2020-09-26T08:43:06Z)
- A Hybrid Framework for Topic Structure using Laughter Occurrences [0.3680403821470856]
 In this work we combine both paralinguistic and linguistic knowledge into a hybrid framework through a multi-level hierarchy.
Laughter occurrences from the multiparty meeting transcripts of the ICSI database are used as the paralinguistic information.
This training-free topic structuring approach is applicable to the online understanding of spoken dialogs.
 arXiv  Detail & Related papers  (2019-12-31T23:31:42Z)