Non-negative matrix factorization algorithms greatly improve topic model fits
- URL: http://arxiv.org/abs/2105.13440v1
- Date: Thu, 27 May 2021 20:34:46 GMT
- Title: Non-negative matrix factorization algorithms greatly improve topic model fits
- Authors: Peter Carbonetto, Abhishek Sarkar, Zihao Wang and Matthew Stephens
- Abstract summary: NMF avoids the "sum-to-one" constraints on the topic model parameters.
We show that first solving the NMF problem then recovering the topic model fit can produce remarkably better fits.
- Score: 7.7276871905342315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We report on the potential for using algorithms for non-negative matrix
factorization (NMF) to improve parameter estimation in topic models. While
several papers have studied connections between NMF and topic models, none have
suggested leveraging these connections to develop new algorithms for fitting
topic models. Importantly, NMF avoids the "sum-to-one" constraints on the topic
model parameters, resulting in an optimization problem with simpler structure
and more efficient computations. Building on recent advances in optimization
algorithms for NMF, we show that first solving the NMF problem then recovering
the topic model fit can produce remarkably better fits, and in less time, than
standard algorithms for topic models. While we focus primarily on maximum
likelihood estimation, we show that this approach also has the potential to
improve variational inference for topic models. Our methods are implemented in
the R package fastTopics.
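The "recover the topic model fit" step mentioned in the abstract can be illustrated with a small sketch. It assumes the standard rescaling that relates a Poisson NMF fit (counts approximated by L F^T, with L and F non-negative) to a multinomial topic model: normalize each column of F to sum to one, move the column scales into L, then normalize each row of L to sum to one. The function name `poisson_nmf_to_topic_model` and the toy matrices are hypothetical and are not taken from the paper or from the fastTopics package; plain Python lists are used only to keep the sketch dependency-free.

```python
def poisson_nmf_to_topic_model(L, F):
    """Rescale a non-negative factorization into topic-model form.

    L: n x K list of lists (document loadings), all entries >= 0.
    F: m x K list of lists (word factors), all entries >= 0.
    Returns (L_star, F_star, s): rows of L_star sum to 1 (topic
    proportions), columns of F_star sum to 1 (word probabilities per
    topic), and s holds the per-document size factors.
    Assumption: no column of F and no row of the rescaled L is all zero.
    """
    K = len(F[0])
    # Column sums of F: the scale of each topic.
    u = [sum(row[k] for row in F) for k in range(K)]
    # Each topic becomes a probability distribution over words.
    F_star = [[row[k] / u[k] for k in range(K)] for row in F]
    # Absorb the topic scales into the loadings.
    L_scaled = [[row[k] * u[k] for k in range(K)] for row in L]
    # Row sums act as document size factors.
    s = [sum(row) for row in L_scaled]
    # Each document's loadings become topic proportions.
    L_star = [[v / s_i for v in row] for row, s_i in zip(L_scaled, s)]
    return L_star, F_star, s


# Toy example (made-up numbers): 2 documents, 3 words, 2 topics.
L = [[1.0, 2.0], [0.5, 0.0]]
F = [[2.0, 1.0], [1.0, 1.0], [1.0, 2.0]]
L_star, F_star, s = poisson_nmf_to_topic_model(L, F)
```

Under this rescaling the fitted Poisson rates are unchanged: for every document i and word j, the original rate sum_k L[i][k] * F[j][k] equals s[i] times the topic-model probability sum_k L_star[i][k] * F_star[j][k]. The sum-to-one constraints are therefore imposed only after the unconstrained NMF problem has been solved.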
Related papers
- Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We study optimization methods that leverage the linear minimization oracle (LMO) over a norm-ball.
We propose a new family of algorithms that uses the LMO to adapt to the geometry of the problem and, perhaps surprisingly, show that they can be applied to unconstrained problems.
arXiv Detail & Related papers (2025-02-11T13:10:34Z)
- Towards a Fairer Non-negative Matrix Factorization [6.069820038869034]
We investigate how Non-negative Matrix Factorization (NMF) can introduce bias in the representation of data groups.
We present an approach, called Fairer-NMF, that seeks to minimize the maximum reconstruction loss for different groups.
arXiv Detail & Related papers (2024-11-14T23:34:38Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Supervised Class-pairwise NMF for Data Representation and Classification [2.7320863258816512]
Non-negative Matrix factorization (NMF) based methods add new terms to the cost function to adapt the model to specific tasks.
The NMF method adopts unsupervised approaches to estimate the factorizing matrices.
arXiv Detail & Related papers (2022-09-28T04:33:03Z)
- Non-Negative Matrix Factorization with Scale Data Structure Preservation [23.31865419578237]
The model described in this paper belongs to the family of non-negative matrix factorization methods designed for data representation and dimension reduction.
The idea is to add, to the NMF cost function, a penalty term to impose a scale relationship between the pairwise similarity matrices of the original and transformed data points.
The proposed clustering algorithm is compared to some existing NMF-based algorithms and to some manifold learning-based algorithms when applied to some real-life datasets.
arXiv Detail & Related papers (2022-09-22T09:32:18Z)
- Neural Improvement Heuristics for Graph Combinatorial Optimization Problems [49.85111302670361]
We introduce a novel Neural Improvement (NI) model capable of handling graph-based problems where information is encoded in the nodes, edges, or both.
The presented model serves as a fundamental component for hill-climbing-based algorithms that guide the selection of neighborhood operations for each.
arXiv Detail & Related papers (2022-06-01T10:35:29Z)
- Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
arXiv Detail & Related papers (2022-02-28T12:26:47Z)
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- Positive Semidefinite Matrix Factorization: A Connection with Phase Retrieval and Affine Rank Minimization [71.57324258813674]
We show that PSDMF algorithms can be designed based on phase retrieval (PR) and affine rank minimization (ARM) algorithms.
Motivated by this idea, we introduce a new family of PSDMF algorithms based on iterative hard thresholding (IHT).
arXiv Detail & Related papers (2020-07-24T06:10:19Z)
- MATE: A Model-based Algorithm Tuning Engine [2.4693304175649304]
We introduce a Model-based Algorithm Tuning Engine, namely MATE, where the parameters of an algorithm are represented as expressions of the features of a target optimisation problem.
We formulate the problem of finding the relationships between the parameters and the problem features as a symbolic regression problem and we use genetic programming to extract these expressions.
For the evaluation, we apply our approach to configuration of the (1+1) EA and RLS algorithms for the OneMax, LeadingOnes, BinValue and Jump optimisation problems.
arXiv Detail & Related papers (2020-04-27T12:50:48Z)
- Uncertainty Modelling in Risk-averse Supply Chain Systems Using Multi-objective Pareto Optimization [0.0]
One of the arduous tasks in supply chain modelling is to build robust models against irregular variations.
We have introduced a novel methodology, namely Pareto Optimization, to handle uncertainties and bound the entropy of such uncertainties by explicitly modelling them under some a priori assumptions.
arXiv Detail & Related papers (2020-04-24T21:04:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.