HTMOT : Hierarchical Topic Modelling Over Time
- URL: http://arxiv.org/abs/2112.03104v1
- Date: Mon, 22 Nov 2021 11:02:35 GMT
- Title: HTMOT : Hierarchical Topic Modelling Over Time
- Authors: Judicael Poumay, Ashwin Ittoo
- Abstract summary: We propose a novel method, HTMOT, to perform Hierarchical Topic Modelling Over Time.
We show that only applying time modelling to deep sub-topics provides a way to extract specific stories or events.
Our results show that our training procedure is fast and can extract accurate high-level topics and temporally precise sub-topics.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Over the years, topic models have provided an efficient way of extracting
insights from text. However, while many models have been proposed, none are
able to model topic temporality and hierarchy jointly. Modelling time provides
more precise topics by separating lexically close but temporally distinct
topics while modelling hierarchy provides a more detailed view of the content
of a document corpus. In this study, we therefore propose a novel method,
HTMOT, to perform Hierarchical Topic Modelling Over Time. We train HTMOT using
a new implementation of Gibbs sampling, which is more efficient. Specifically,
we show that only applying time modelling to deep sub-topics provides a way to
extract specific stories or events, while high-level topics extract larger
themes in the corpus. Our results show that our training procedure is fast and
can extract accurate high-level topics and temporally precise sub-topics. We
measured our model's performance using the Word Intrusion task and outlined
some limitations of this evaluation method, especially for hierarchical models.
As a case study, we focused on the various developments in the space industry
in 2020.
Related papers
- Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, an approach for label-name-supervised topic modeling.
EdTM models topic modeling as an assignment problem while leveraging LM/LLM based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z)
- Let the Pretrained Language Models "Imagine" for Short Texts Topic Modeling [29.87929724277381]
In short texts, co-occurrence information is minimal, which results in feature sparsity in document representation.
Existing topic models (probabilistic or neural) mostly fail to mine patterns from them to generate coherent topics.
We extend short text into longer sequences using existing pre-trained language models (PLMs).
arXiv Detail & Related papers (2023-10-24T00:23:30Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z)
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z)
- TopNet: Learning from Neural Topic Model to Generate Long Stories [43.5564336855688]
Long story generation (LSG) is one of the coveted goals in natural language processing.
We propose TopNet to obtain high-quality skeleton words to complement the short input.
Our proposed framework is highly effective in skeleton word selection and significantly outperforms state-of-the-art models in both automatic evaluation and human evaluation.
arXiv Detail & Related papers (2021-12-14T09:47:53Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
- Neural Topic Model via Optimal Transport [24.15046280736009]
We present a new neural topic model via the theory of optimal transport (OT).
Specifically, we propose to learn the topic distribution of a document by directly minimising its OT distance to the document's word distributions.
Our proposed model can be trained efficiently with a differentiable loss.
arXiv Detail & Related papers (2020-08-12T06:37:09Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and the prototype encoding structure benefit each other in the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)
- A Comprehensive Study on Temporal Modeling for Online Action Detection [50.558313106389335]
Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.
This paper aims to provide a comprehensive study on temporal modeling for OAD including four meta types of temporal modeling methods.
We present several hybrid temporal modeling methods, which outperform the recent state-of-the-art methods with sizable margins on THUMOS-14 and TVSeries.
arXiv Detail & Related papers (2020-01-21T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.