Dynamic Topic Analysis in Academic Journals using Convex Non-negative Matrix Factorization Method
- URL: http://arxiv.org/abs/2504.08743v1
- Date: Sun, 23 Mar 2025 14:31:47 GMT
- Title: Dynamic Topic Analysis in Academic Journals using Convex Non-negative Matrix Factorization Method
- Authors: Yang Yang, Tong Zhang, Jian Wu, Lijie Su
- Abstract summary: This paper presents a two-stage dynamic topic analysis framework. It incorporates convex optimization to improve topic consistency, sparsity, and interpretability. Applying the proposed method to IEEE journal abstracts from 2004 to 2022 effectively identifies and quantifies emerging research topics.
- Score: 13.479775419940283
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid advancement of large language models, academic topic identification and topic evolution analysis are crucial for enhancing AI's understanding capabilities. Dynamic topic analysis provides a powerful approach to capturing and understanding the temporal evolution of topics in large-scale datasets. This paper presents a two-stage dynamic topic analysis framework that incorporates convex optimization to improve topic consistency, sparsity, and interpretability. In Stage 1, a two-layer non-negative matrix factorization (NMF) model is employed to extract annual topics and identify key terms. In Stage 2, a convex optimization algorithm refines the dynamic topic structure using the convex NMF (cNMF) model, further enhancing topic integration and stability. Applying the proposed method to IEEE journal abstracts from 2004 to 2022 effectively identifies and quantifies emerging research topics, such as COVID-19 and digital twins. By optimizing sparsity differences in the clustering feature space between traditional and emerging research topics, the framework provides deeper insights into topic evolution and ranking analysis. Moreover, the NMF-cNMF model demonstrates superior stability in topic consistency. At sparsity levels of 0.4, 0.6, and 0.9, the proposed approach improves topic ranking stability by 24.51%, 56.60%, and 36.93%, respectively. The source code (to be released after publication) is available at https://github.com/meetyangyang/CDNMF.
Related papers
- Exploring Topic Trends in COVID-19 Research Literature using Non-Negative Matrix Factorization [2.8777530051393314]
We apply topic modeling using Non-Negative Matrix Factorization (NMF) on the COVID-19 Open Research dataset. NMF factorizes the document-term matrix into two non-negative matrices, effectively representing the topics and their distribution across the documents. Our findings contribute to the understanding of the knowledge structure of the COVID-19 research landscape.
arXiv Detail & Related papers (2025-03-23T19:37:52Z) - Multivariate Gaussian Topic Modelling: A novel approach to discover topics with greater semantic coherence [3.360457684855856]
We propose a novel Multivariate Gaussian Topic modelling (MGD) approach. The approach is first applied to a synthetic dataset to demonstrate the interpretability benefits vis-a-vis LDA. This model achieves a higher mean topic coherence of 0.436 vis-a-vis 0.294 for LDA.
arXiv Detail & Related papers (2025-03-19T09:25:54Z) - Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives [84.03001845263]
Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management. Traditional narrative analysis often focuses on local indicators in microstructure, such as word usage and syntax. We propose to investigate specific cognitive and linguistic challenges by analyzing topical shifts, temporal dynamics, and the coherence of narratives over time.
arXiv Detail & Related papers (2025-01-07T12:16:26Z) - Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems [2.66854711376491]
The proposed dynamics are based on the proximal augmented Lagrangian.
We leverage various structural properties to establish global (exponential) convergence guarantees.
Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics.
arXiv Detail & Related papers (2024-08-28T17:43:18Z) - Interactive Topic Models with Optimal Transport [75.26555710661908]
We present EdTM, an approach for label-name-supervised topic modeling.
EdTM casts topic modeling as an assignment problem while leveraging LM/LLM-based document-topic affinities.
arXiv Detail & Related papers (2024-06-28T13:57:27Z) - FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z) - Data-Centric Long-Tailed Image Recognition [49.90107582624604]
Long-tail models exhibit a strong demand for high-quality data.
Data-centric approaches aim to enhance both the quantity and quality of data to improve model performance.
There is currently a lack of research into the underlying mechanisms explaining the effectiveness of information augmentation.
arXiv Detail & Related papers (2023-11-03T06:34:37Z) - ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics [1.854328133293073]
This paper presents an algorithmic family of dynamic topic models called Aligned Neural Topic Models (ANTM).
ANTM combines novel data mining algorithms to provide a modular framework for discovering evolving topics.
A Python package is developed for researchers and scientists who wish to study the trends and evolving patterns of topics in large-scale textual data.
arXiv Detail & Related papers (2023-02-03T02:31:12Z) - Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z) - Non-negative matrix factorization algorithms greatly improve topic model fits [7.7276871905342315]
NMF avoids the "sum-to-one" constraints on the topic model parameters.
We show that first solving the NMF problem and then recovering the topic model fit from it can produce remarkably better fits.
arXiv Detail & Related papers (2021-05-27T20:34:46Z) - Improving Neural Topic Models using Knowledge Distillation [84.66983329587073]
We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers.
Our modular method can be straightforwardly applied with any neural topic model to improve topic quality.
arXiv Detail & Related papers (2020-10-05T22:49:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.