Topic Analysis with Side Information: A Neural-Augmented LDA Approach
- URL: http://arxiv.org/abs/2510.24918v2
- Date: Sat, 01 Nov 2025 21:06:32 GMT
- Title: Topic Analysis with Side Information: A Neural-Augmented LDA Approach
- Authors: Biyi Fang, Truong Vo, Kripa Rajshekhar, Diego Klabjan
- Abstract summary: We propose a neural-augmented probabilistic topic model that incorporates side information through a neural prior mechanism. nnLDA consistently outperforms LDA and Dirichlet-Multinomial Regression in topic coherence, perplexity, and downstream classification.
- Score: 16.477230727313017
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document labels. These limitations restrict their expressiveness, personalization, and interpretability. To address this, we propose nnLDA, a neural-augmented probabilistic topic model that dynamically incorporates side information through a neural prior mechanism. nnLDA models each document as a mixture of latent topics, where the prior over topic proportions is generated by a neural network conditioned on auxiliary features. This design allows the model to capture complex nonlinear interactions between side information and topic distributions that static Dirichlet priors cannot represent. We develop a stochastic variational Expectation-Maximization algorithm to jointly optimize the neural and probabilistic components. Across multiple benchmark datasets, nnLDA consistently outperforms LDA and Dirichlet-Multinomial Regression in topic coherence, perplexity, and downstream classification. These results highlight the benefits of combining neural representation learning with probabilistic topic modeling in settings where side information is available.
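The generative mechanism the abstract describes can be sketched in a few lines of NumPy. Everything below is an illustrative assumption rather than the paper's implementation: the dimensions, the one-hidden-layer prior network and its random weights, and the use of an exponential output to keep the Dirichlet concentrations positive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: K topics, V vocabulary words,
# S side-information features, H hidden units.
K, V, S, H = 4, 50, 3, 16

# Neural prior: a one-hidden-layer network maps a document's side
# features s to a positive Dirichlet concentration vector alpha(s),
# replacing LDA's static Dirichlet prior.
W1 = rng.normal(scale=0.5, size=(H, S))
W2 = rng.normal(scale=0.5, size=(K, H))

def neural_prior(s):
    h = np.tanh(W1 @ s)   # hidden representation of side information
    return np.exp(W2 @ h) # exp keeps every concentration alpha_k > 0

# Per-topic word distributions beta_k (each row sums to 1), as in LDA.
beta = rng.dirichlet(np.ones(V), size=K)

def generate_document(s, n_words=20):
    """Ancestral sampling from the assumed nnLDA generative process."""
    alpha = neural_prior(s)                   # side info -> document prior
    theta = rng.dirichlet(alpha)              # topic proportions
    z = rng.choice(K, size=n_words, p=theta)  # topic assignment per word
    words = [int(rng.choice(V, p=beta[k])) for k in z]
    return words, theta

words, theta = generate_document(rng.normal(size=S))
```

In training, the network weights and the topic-word distributions would be fit jointly, e.g. by the stochastic variational EM procedure the abstract mentions; the sketch only shows the forward generative direction.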
Related papers
- Information-theoretic Quantification of High-order Feature Effects in Classification Problems [0.19791587637442676]
We present an information-theoretic extension of the High-order interactions for Feature importance (Hi-Fi) method. Our framework decomposes feature contributions into unique, synergistic, and redundant components. Results indicate that the proposed estimator accurately recovers theoretical and expected findings.
arXiv Detail & Related papers (2025-07-06T11:50:30Z) - AI-Aided Kalman Filters [65.35350122917914]
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. Recent developments illustrate the possibility of fusing deep neural networks (DNNs) with classic Kalman-type filtering. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms.
arXiv Detail & Related papers (2024-10-16T06:47:53Z) - TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation [65.65530016765615]
We propose a hierarchical predictive coding framework that captures multi-scale dependencies through three complementary learning objectives. TokenUnify integrates random token prediction, next-token prediction, and next-all token prediction to create a comprehensive representational space. We also introduce a large-scale EM dataset with 1.2 billion annotated voxels, offering ideal long-sequence visual data with spatial continuity.
arXiv Detail & Related papers (2024-05-27T05:45:51Z) - Improving Neural Additive Models with Bayesian Principles [54.29602161803093]
Neural additive models (NAMs) enhance the transparency of deep neural networks by handling calibrated input features in separate additive sub-networks.
We develop Laplace-approximated NAMs (LA-NAMs) which show improved empirical performance on datasets and challenging real-world medical tasks.
arXiv Detail & Related papers (2023-05-26T13:19:15Z) - Neural Dynamic Focused Topic Model [2.9005223064604078]
We leverage recent advances in neural variational inference and present an alternative neural approach to the dynamic Focused Topic Model.
We develop a neural model for topic evolution which exploits sequences of Bernoulli random variables in order to track the appearances of topics.
arXiv Detail & Related papers (2023-01-26T08:37:34Z) - Neural Topic Modeling with Deep Mutual Information Estimation [23.474848535821994]
We propose a neural topic model which incorporates deep mutual information estimation.
NTM-DMIE is a neural network method for topic learning.
We evaluate NTM-DMIE on several metrics, including text-clustering accuracy, topic representation, topic uniqueness, and topic coherence.
arXiv Detail & Related papers (2022-03-12T01:08:10Z) - Topic Analysis for Text with Side Data [18.939336393665553]
We introduce a hybrid generative probabilistic model that combines a neural network with a latent topic model.
In the model, each document is modeled as a finite mixture over an underlying set of topics.
Each topic is modeled as an infinite mixture over an underlying set of topic probabilities.
arXiv Detail & Related papers (2022-03-01T22:06:30Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality, and efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Semi-Structured Deep Piecewise Exponential Models [2.7728956081909346]
We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning.
A proof of concept is provided by using the framework to predict Alzheimer's disease progression.
arXiv Detail & Related papers (2020-11-11T14:41:19Z) - Bayesian Sparse Factor Analysis with Kernelized Observations [67.60224656603823]
Multi-view problems can be addressed with latent variable models.
High-dimensionality and non-linear issues are traditionally handled by kernel methods.
We propose merging both approaches into a single model.
arXiv Detail & Related papers (2020-06-01T14:25:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.