ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics
- URL: http://arxiv.org/abs/2302.01501v2
- Date: Sun, 4 Jun 2023 16:23:00 GMT
- Title: ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics
- Authors: Hamed Rahimi, Hubert Naacke, Camelia Constantin, Bernd Amann
- Abstract summary: This paper presents an algorithmic family of dynamic topic models called Aligned Neural Topic Models (ANTM).
ANTM combines novel data mining algorithms to provide a modular framework for discovering evolving topics.
A Python package is developed for researchers and scientists who wish to study the trends and evolving patterns of topics in large-scale textual data.
- Score: 1.854328133293073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an algorithmic family of dynamic topic models called
Aligned Neural Topic Models (ANTM), which combine novel data mining algorithms
to provide a modular framework for discovering evolving topics. ANTM maintains
the temporal continuity of evolving topics by extracting time-aware features
from documents using advanced pre-trained Large Language Models (LLMs) and
employing an overlapping sliding window algorithm for sequential document
clustering. This overlapping sliding window algorithm identifies a different
number of topics within each time frame and aligns semantically similar
document clusters across time periods. This process captures emerging and
fading trends across different periods and allows for a more interpretable
representation of evolving topics. Experiments on four distinct datasets show
that ANTM outperforms probabilistic dynamic topic models in terms of topic
coherence and diversity metrics. Moreover, it improves the scalability and
flexibility of dynamic topic models by being accessible and adaptable to
different types of algorithms. Additionally, a Python package is developed for
researchers and scientists who wish to study the trends and evolving patterns
of topics in large-scale textual data.
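Because consecutive windows overlap, clusters in adjacent windows share documents and can be aligned by how much they overlap. Below is a minimal sketch of that windowing-and-alignment idea, not the ANTM package API: k-means stands in for the paper's clustering step, and `embeddings` is assumed to be a precomputed document-embedding matrix.

```python
# Minimal sketch of overlapping sliding windows + cluster alignment.
# Illustrative only: ANTM's actual pipeline and names differ.
import numpy as np
from sklearn.cluster import KMeans

def sliding_windows(years, size=3, overlap=1):
    """Yield [start, start + size) year ranges overlapping by `overlap` years."""
    start, hi = int(years.min()), int(years.max())
    while start <= hi:
        yield start, start + size
        start += size - overlap

def cluster_per_window(embeddings, years, n_clusters=5):
    """Cluster documents independently inside each overlapping window."""
    windows = []  # one dict per window: cluster id -> set of document indices
    for lo, hi in sliding_windows(years):
        idx = np.where((years >= lo) & (years < hi))[0]
        if len(idx) < n_clusters:
            continue
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings[idx])
        windows.append({k: set(idx[labels == k]) for k in range(n_clusters)})
    return windows

def align(windows, min_jaccard=0.1):
    """Link clusters in consecutive windows that share enough documents."""
    links = []
    for t in range(len(windows) - 1):
        for a, da in windows[t].items():
            for b, db in windows[t + 1].items():
                jac = len(da & db) / len(da | db)
                if jac >= min_jaccard:
                    links.append((t, a, t + 1, b, jac))
    return links
```

Chains of linked clusters then trace a topic across time, while clusters with no link on either side mark trends that emerge or fade.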
Related papers
- Semantic-Driven Topic Modeling Using Transformer-Based Embeddings and Clustering Algorithms [6.349503549199403]
This study introduces an end-to-end, semantic-driven topic modeling technique for topic extraction.
Our model generates document embeddings using pre-trained transformer-based language models.
Compared to ChatGPT and traditional topic modeling algorithms, our model provides more coherent and meaningful topics.
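As a rough illustration of such an embed-then-cluster pipeline (the embedding model and the use of k-means are assumptions for the sketch, not the paper's exact setup):

```python
# Hypothetical embed-then-cluster pipeline: encode documents with a pretrained
# transformer, then group the embeddings into topics.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

docs = ["neural topic models align clusters over time",
        "transformers embed documents into a semantic space",
        "stock prices fell sharply on Tuesday"]
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)  # (3, 384)
topic_ids = KMeans(n_clusters=2, n_init=10).fit_predict(embeddings)
```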
arXiv Detail & Related papers (2024-09-30T18:15:31Z)
- PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations [49.80959046861793]
We present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers.
Our experimentation across seven diverse temporal real-world LMTF datasets reveals that PDETime adapts effectively to the intrinsic nature of the data.
arXiv Detail & Related papers (2024-02-25T17:39:44Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Neural Dynamic Focused Topic Model [2.9005223064604078]
We leverage recent advances in neural variational inference and present an alternative neural approach to the dynamic Focused Topic Model.
We develop a neural model for topic evolution which exploits sequences of Bernoulli random variables in order to track the appearances of topics.
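A toy illustration (not the paper's actual inference procedure) of this Bernoulli bookkeeping: each topic gets an on/off indicator per time step, with its appearance probability drifting as a random walk.

```python
# Toy sketch: appearances[t, k] = 1 means topic k is "active" at time t; the
# appearance probability evolves as a random walk in logit space.
import numpy as np

rng = np.random.default_rng(0)
T, K = 10, 4                                   # time steps, topics
logits = np.zeros(K)
appearances = np.zeros((T, K), dtype=int)
for t in range(T):
    logits += rng.normal(0.0, 0.5, size=K)     # drift the per-topic logits
    probs = 1.0 / (1.0 + np.exp(-logits))      # Bernoulli parameters
    appearances[t] = rng.binomial(1, probs)    # sample on/off indicators
```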
arXiv Detail & Related papers (2023-01-26T08:37:34Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- BERTopic: Neural topic modeling with a class-based TF-IDF procedure [0.0]
We present BERTopic, a topic model that extends the feasibility of approaching topic modeling as a clustering task.
BERTopic generates coherent topics and remains competitive across a variety of benchmarks involving both classical models and those that follow the more recent clustering approach to topic modeling.
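In the spirit of that class-based TF-IDF, a cluster's documents can be joined into one "class document" and each term weighted by its class frequency times a rarity factor over all classes. A small sketch (helper names are illustrative, not BERTopic's API):

```python
# Sketch of a class-based TF-IDF: tf[c, t] counts term t inside class c, and
# each term is weighted by log(1 + A / f_t), with A the average number of
# words per class and f_t the term's total frequency across classes.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def class_tf_idf(docs_per_class):
    class_docs = [" ".join(docs) for docs in docs_per_class]  # one per class
    vectorizer = CountVectorizer().fit(class_docs)
    tf = vectorizer.transform(class_docs).toarray().astype(float)
    A = tf.sum() / len(class_docs)            # average words per class
    weights = np.log(1 + A / tf.sum(axis=0))  # rarer terms score higher
    return tf * weights, vectorizer.get_feature_names_out()
```

The highest-weighted terms per class then serve as that cluster's topic representation.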
arXiv Detail & Related papers (2022-03-11T08:35:15Z)
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z) - Recurrent Coupled Topic Modeling over Sequential Documents [33.35324412209806]
We show that a current topic evolves from all prior topics with corresponding coupling weights, forming the multi-topic-thread evolution.
A new solution with a set of novel data augmentation techniques is proposed, which successfully decomposes the multi-couplings between evolving topics.
A novel Gibbs sampler with a backward-forward filter algorithm efficiently learns latent time-evolving parameters in closed form.
arXiv Detail & Related papers (2021-06-23T08:58:13Z) - Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding structure mutually bring benefit to the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z) - Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [55.35176938713946]
We develop a deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
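A toy sketch of the Weibull reparameterization behind such encoders: a Weibull sample can be written as a deterministic transform of uniform noise, which keeps sampling differentiable with respect to the encoder's outputs (all values here are illustrative):

```python
# Weibull(k, lam) samples via the inverse-CDF reparameterization:
# x = lam * (-log(1 - u)) ** (1 / k) with u ~ Uniform(0, 1), so gradients can
# flow through the shape k and scale lam produced by an encoder network.
import numpy as np

rng = np.random.default_rng(0)
k, lam = 2.0, 1.5                       # shape and scale (e.g. encoder outputs)
u = rng.uniform(size=1000)
x = lam * (-np.log(1 - u)) ** (1 / k)   # reparameterized Weibull samples
```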
arXiv Detail & Related papers (2020-06-15T22:22:56Z)
- Variational Hyper RNN for Sequence Modeling [69.0659591456772]
We propose a novel probabilistic sequence model that excels at capturing high variability in time series data.
Our method uses temporal latent variables to capture information about the underlying data pattern.
The efficacy of the proposed method is demonstrated on a range of synthetic and real-world sequential data.
arXiv Detail & Related papers (2020-02-24T19:30:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.