HTMOT : Hierarchical Topic Modelling Over Time
- URL: http://arxiv.org/abs/2112.03104v1
- Date: Mon, 22 Nov 2021 11:02:35 GMT
- Title: HTMOT : Hierarchical Topic Modelling Over Time
- Authors: Judicael Poumay, Ashwin Ittoo
- Abstract summary: We propose a novel method, HTMOT, to perform Hierarchical Topic Modelling Over Time.
We show that only applying time modelling to deep sub-topics provides a way to extract specific stories or events.
Our results show that our training procedure is fast and can extract accurate high-level topics and temporally precise sub-topics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Over the years, topic models have provided an efficient way of extracting
insights from text. However, while many models have been proposed, none are
able to model topic temporality and hierarchy jointly. Modelling time provides
more precise topics by separating lexically close but temporally distinct
topics while modelling hierarchy provides a more detailed view of the content
of a document corpus. In this study, we therefore propose a novel method,
HTMOT, to perform Hierarchical Topic Modelling Over Time. We train HTMOT using
a new implementation of Gibbs sampling, which is more efficient. Specifically,
we show that only applying time modelling to deep sub-topics provides a way to
extract specific stories or events, while high-level topics extract larger
themes in the corpus. Our results show that our training procedure is fast and
can extract accurate high-level topics and temporally precise sub-topics. We
measured our model's performance using the Word Intrusion task and outlined
some limitations of this evaluation method, especially for hierarchical models.
As a case study, we focused on the various developments in the space industry
in 2020.
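The Word Intrusion task mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the exact protocol used for HTMOT, so the code below assumes the common formulation: an annotator sees the top words of one topic plus one "intruder" word drawn from another topic, and the score is the fraction of questions where the intruder is identified. The function names, the `n_top` parameter, and the example word lists are hypothetical and are not taken from the paper.

```python
import random


def build_intrusion_question(topic_words, other_topic_words, n_top=5, rng=None):
    """Build one word-intrusion question.

    Returns (question_words, intruder): the top `n_top` words of the topic
    plus one intruder word, in shuffled order, and the intruder itself.
    """
    rng = rng or random.Random(0)
    top = list(topic_words[:n_top])
    # The intruder is the highest-ranked word of the other topic
    # that does not appear anywhere in the current topic.
    candidates = [w for w in other_topic_words if w not in topic_words]
    if not candidates:
        raise ValueError("no valid intruder: the other topic shares all its words")
    intruder = candidates[0]
    question = top + [intruder]
    rng.shuffle(question)
    return question, intruder


def intrusion_precision(answers, intruders):
    """Fraction of questions where the annotator picked the true intruder."""
    if not intruders or len(answers) != len(intruders):
        raise ValueError("answers and intruders must be non-empty and equal in length")
    hits = sum(a == b for a, b in zip(answers, intruders))
    return hits / len(intruders)


# Hypothetical topics, loosely themed on the space-industry case study.
launch_topic = ["rocket", "launch", "falcon", "booster", "orbit", "satellite"]
mars_topic = ["mars", "rover", "perseverance", "launch", "nasa"]

question, intruder = build_intrusion_question(launch_topic, mars_topic)
print(question, "intruder:", intruder)
```

In this sketch a topic is judged coherent when annotators reliably spot the intruder. How a parent topic and its sub-topics should be paired when choosing intruders is left open here, since the abstract only notes that the evaluation has limitations for hierarchical models.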
Related papers
- Enhancing Short-Text Topic Modeling with LLM-Driven Context Expansion and Prefix-Tuned VAEs [25.915607750636333]
We propose a novel approach that leverages large language models (LLMs) to extend short texts into more detailed sequences before applying topic modeling.
Our method significantly improves short-text topic modeling performance, as demonstrated by extensive experiments on real-world datasets with extreme data sparsity.
(2024-10-04T01:28:56Z)
- Embedded Topic Models Enhanced by Wikification [3.082729239227955]
We incorporate the Wikipedia knowledge into a neural topic model to make it aware of named entities.
Our experiments show that our method improves the performance of neural topic models in generalizability.
(2024-10-03T12:39:14Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
(2024-09-02T22:35:03Z)
- Iterative Improvement of an Additively Regularized Topic Model [0.0]
We present a method for iterative training of a topic model.
Experiments conducted on several collections of natural language texts show that the proposed ITAR model performs better than other popular topic models.
(2024-08-11T18:22:12Z)
- Let the Pretrained Language Models "Imagine" for Short Texts Topic Modeling [29.87929724277381]
In short texts, co-occurrence information is minimal, which results in feature sparsity in document representation.
Existing topic models (probabilistic or neural) mostly fail to mine patterns from them to generate coherent topics.
We extend short text into longer sequences using existing pre-trained language models (PLMs).
(2023-10-24T00:23:30Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
(2022-09-20T09:16:05Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
(2022-04-25T19:06:48Z)
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
(2022-02-09T17:26:08Z)
- Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
(2020-10-05T22:49:16Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding structure mutually bring benefit to the few-shot model.
(2020-08-11T03:55:11Z)
- A Comprehensive Study on Temporal Modeling for Online Action Detection [50.558313106389335]
Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.
This paper aims to provide a comprehensive study on temporal modeling for OAD including four meta types of temporal modeling methods.
We present several hybrid temporal modeling methods, which outperform the recent state-of-the-art methods with sizable margins on THUMOS-14 and TVSeries.
(2020-01-21T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.