Coordinated Topic Modeling
- URL: http://arxiv.org/abs/2210.08559v1
- Date: Sun, 16 Oct 2022 15:10:54 GMT
- Title: Coordinated Topic Modeling
- Authors: Pritom Saha Akash and Jie Huang and Kevin Chen-Chuan Chang
- Abstract summary: We propose a new problem called coordinated topic modeling that imitates human behavior while describing a text corpus.
We design ECTM, an embedding-based coordinated topic model that effectively uses the reference representation to capture the target corpus-specific aspects.
In ECTM, we introduce topic- and document-level supervision with a self-training mechanism to solve the problem.
- Score: 10.710176350043998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new problem called coordinated topic modeling that imitates
human behavior while describing a text corpus. It considers a set of
well-defined topics like the axes of a semantic space with a reference
representation. It then uses the axes to model a corpus for easily
understandable representation. This new task helps represent a corpus more
interpretably by reusing existing knowledge and benefits the corpora comparison
task. We design ECTM, an embedding-based coordinated topic model that
effectively uses the reference representation to capture the target
corpus-specific aspects while maintaining each topic's global semantics. In
ECTM, we introduce topic- and document-level supervision with a
self-training mechanism to solve the problem. Finally, extensive experiments on
multiple domains show the superiority of our model over other baselines.
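The idea of treating well-defined topics as axes of a shared semantic space can be illustrated with a toy sketch. This is not ECTM and does not reproduce the paper's supervision or self-training; it only assumes that each topic has a reference embedding and that a document is placed on the axes by a softmax over cosine similarities. The vectors, topic names, the `temperature` parameter, and the helper names `topic_coordinates` and `corpus_profile` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def topic_coordinates(doc_vec, topic_axes, temperature=0.1):
    """Place one document on the topic axes: softmax over cosine similarities."""
    names = list(topic_axes)
    scores = [cosine(doc_vec, topic_axes[n]) / temperature for n in names]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}

def corpus_profile(docs, topic_axes):
    """Average the per-document coordinates to describe a whole corpus."""
    coords = [topic_coordinates(d, topic_axes) for d in docs]
    return {n: sum(c[n] for c in coords) / len(coords) for n in topic_axes}

# Hypothetical reference embeddings for three predefined topics (toy 3-d vectors).
topic_axes = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "science": [0.0, 0.0, 1.0],
}

# Two hypothetical corpora, each given as a list of document embeddings.
corpus_a = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
corpus_b = [[0.1, 0.9, 0.1], [0.0, 0.7, 0.3]]

profile_a = corpus_profile(corpus_a, topic_axes)
profile_b = corpus_profile(corpus_b, topic_axes)
print(profile_a)
print(profile_b)
```

Because both corpora are described on the same axes, their profiles can be compared topic by topic, which is the kind of corpora comparison the abstract refers to.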
Related papers
- Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, as an approach for label name supervised topic modeling.
EdTM models topic modeling as an assignment problem while leveraging LM/LLM based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z)
- TopicAdapt- An Inter-Corpora Topics Adaptation Approach [27.450275637652418]
This paper proposes a neural topic model, TopicAdapt, that can adapt relevant topics from a related source corpus and also discover new topics in a target corpus that are absent in the source corpus.
Experiments over multiple datasets from diverse domains show the superiority of the proposed model against the state-of-the-art topic models.
arXiv Detail & Related papers (2023-10-08T02:56:44Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- Dialogue Meaning Representation for Task-Oriented Dialogue Systems [51.91615150842267]
We propose Dialogue Meaning Representation (DMR), a flexible and easily extendable representation for task-oriented dialogue.
Our representation contains a set of nodes and edges with inheritance hierarchy to represent rich semantics for compositional semantics and task-specific concepts.
We propose two evaluation tasks to evaluate different machine learning based dialogue models, and further propose a novel coreference resolution model GNNCoref for the graph-based coreference resolution task.
arXiv Detail & Related papers (2022-04-23T04:17:55Z)
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z)
- TopicNet: Semantic Graph-Guided Topic Discovery [51.71374479354178]
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner.
We introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence learning.
arXiv Detail & Related papers (2021-10-27T09:07:14Z)
- Semiparametric Latent Topic Modeling on Consumer-Generated Corpora [0.0]
This paper proposes a semiparametric topic model, a two-step approach that combines nonnegative matrix factorization and semiparametric regression for topic modeling.
The model enables the reconstruction of sparse topic structures in the corpus and provides a generative model for predicting topics in new documents entering the corpus.
In an actual consumer feedback corpus, the model also demonstrably provides interpretable and useful topic definitions comparable with those produced by other methods.
arXiv Detail & Related papers (2021-07-13T00:22:02Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- Query-Driven Topic Model [23.07260625816975]
One desirable property of topic models is to allow users to find topics describing a specific aspect of the corpus.
We propose a novel query-driven topic model that allows users to specify a simple query in words or phrases and return query-related topics.
arXiv Detail & Related papers (2021-05-28T22:49:42Z)
- Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations.
Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)