Large-scale gradient-based training of Mixtures of Factor Analyzers
- URL: http://arxiv.org/abs/2308.13778v1
- Date: Sat, 26 Aug 2023 06:12:33 GMT
- Title: Large-scale gradient-based training of Mixtures of Factor Analyzers
- Authors: Alexander Gepperth
- Abstract summary: This article contributes both a theoretical analysis and a new method for efficient high-dimensional MFA training by gradient descent.
We prove that MFA training and inference/sampling can be performed based on precision matrices, which does not require matrix inversions after training is completed.
Besides the theoretical analysis and proofs, we apply MFA to typical image datasets such as SVHN and MNIST, and demonstrate the ability to perform sample generation and outlier detection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gaussian Mixture Models (GMMs) are a standard tool in data analysis. However,
they face problems when applied to high-dimensional data (e.g., images) due to
the size of the required full covariance matrices (CMs), whereas the use of
diagonal or spherical CMs often imposes restrictions that are too severe. The
Mixture of Factor Analyzers (MFA) model is an important extension of GMMs,
which allows one to interpolate smoothly between diagonal and full CMs based on
the number of \textit{factor loadings} $l$. MFA has been successfully applied to
modeling high-dimensional image data. This article contributes both a
theoretical analysis and a new method for efficient high-dimensional MFA
training by stochastic gradient descent, starting from random centroid
initializations. This greatly simplifies the training and initialization
process, and avoids problems of batch-type algorithms such as
Expectation-Maximization (EM) when training with huge amounts of data. In
addition, by exploiting the properties of the matrix determinant lemma, we
prove that MFA training and inference/sampling can be performed based on
precision matrices, which does not require matrix inversions after training is
completed. At training time, the method requires the inversion of $l\times l$
matrices only. Besides the theoretical analysis and proofs, we apply MFA to
typical image datasets such as SVHN and MNIST, and demonstrate the ability to
perform sample generation and outlier detection.
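To make the precision-matrix result concrete, here is a minimal NumPy sketch of the identities the abstract refers to: the matrix determinant lemma for the log-determinant and the Woodbury identity for the quadratic form, so that evaluating one MFA component's log-density inverts only an $l\times l$ matrix. Function and variable names are illustrative, not the paper's implementation.

```python
import numpy as np

def mfa_component_log_density(x, mu, Lambda, d):
    """Log-density of one MFA component N(mu, Lambda Lambda^T + diag(d)).

    The matrix determinant lemma gives the log-determinant and the Woodbury
    identity gives the quadratic form, so only an l x l matrix is ever
    inverted (l = number of factor loadings).
    """
    D, l = Lambda.shape
    diff = x - mu                                  # (D,)
    d_inv = 1.0 / d                                # diagonal inverse, O(D)
    # l x l "capacity" matrix: M = I + Lambda^T diag(d)^-1 Lambda
    M = np.eye(l) + Lambda.T @ (d_inv[:, None] * Lambda)
    # Determinant lemma: log det(Sigma) = log det(M) + sum(log d)
    _, logdet_M = np.linalg.slogdet(M)
    logdet = logdet_M + np.sum(np.log(d))
    # Woodbury: diff^T Sigma^-1 diff without forming the D x D inverse
    u = d_inv * diff                               # diag(d)^-1 (x - mu)
    w = Lambda.T @ u                               # (l,)
    quad = diff @ u - w @ np.linalg.solve(M, w)    # only an l x l solve
    return -0.5 * (D * np.log(2.0 * np.pi) + logdet + quad)
```

Evaluating component responsibilities or outlier scores this way never requires a $D\times D$ inversion, consistent with the abstract's claim.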
Related papers
- Support matrix machine: A review
Support matrix machine (SMM) represents one of the emerging methodologies tailored for handling matrix input data.
This article provides the first in-depth analysis of the development of the SMM model.
We discuss numerous SMM variants, such as robust, sparse, class imbalance, and multi-class classification models.
arXiv Detail & Related papers (2023-10-30T16:46:23Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
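The inverse-free idea can be illustrated generically: replace explicit inversions with a conjugate-gradient solve that touches the matrix only through matrix-vector products, combined with Monte Carlo probe vectors. This is a hedged sketch of that general pattern, not the paper's actual estimator; all names are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def inverse_free_solve(matvec, b):
    """Solve Sigma x = b by conjugate gradients; Sigma is accessed only
    through matrix-vector products, so no explicit inverse is formed."""
    n = b.shape[0]
    x, info = cg(LinearOperator((n, n), matvec=matvec), b)
    assert info == 0, "CG did not converge"
    return x

def mc_trace_of_inverse(matvec, n, n_probes=32, seed=0):
    """Hutchinson-style Monte Carlo estimate of tr(Sigma^-1), a typical
    quantity in Gaussian likelihood gradients, built on the solver above."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)        # Rademacher probe
        total += z @ inverse_free_solve(matvec, z)
    return total / n_probes
```

Unrolling, in the paper's sense, would additionally backpropagate through a fixed number of solver iterations rather than treating the solve as a black box.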
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Sufficient dimension reduction for feature matrices
We propose a method called principal support matrix machine (PSMM) for matrix sufficient dimension reduction.
Our numerical analysis demonstrates that the PSMM outperforms existing methods and has strong interpretability in real data applications.
arXiv Detail & Related papers (2023-03-07T23:16:46Z)
- Quadratic Matrix Factorization with Applications to Manifold Learning
We propose a quadratic matrix factorization (QMF) framework to learn the curved manifold on which the dataset lies.
Algorithmically, we propose an alternating minimization algorithm to optimize QMF and establish its theoretical convergence properties.
Experiments on a synthetic manifold learning dataset and two real datasets, including the MNIST handwritten digit dataset and a cryogenic electron microscopy dataset, demonstrate the superiority of the proposed method over its competitors.
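As a rough illustration of the alternating-minimization idea, under loose assumptions about the model (all names are hypothetical and this is not the paper's exact QMF formulation), one can model each data point as a quadratic function of a latent code and alternate between a closed-form ridge update of the quadratic map and gradient steps on the codes:

```python
import numpy as np

def quad_features(T):
    """phi(t) = [1, t, vec(t t^T)] for each column t of T (shape d x N)."""
    d, N = T.shape
    quad = np.einsum('in,jn->ijn', T, T).reshape(d * d, N)
    return np.vstack([np.ones((1, N)), T, quad])

def qmf_sketch(X, d, n_iters=100, lr=1e-3, lam=1e-6, seed=0):
    """Alternating minimization for || X - W phi(T) ||_F^2:
    the map W is solved in closed form (ridge least squares),
    the latent codes T are updated by gradient descent."""
    rng = np.random.default_rng(seed)
    D, N = X.shape
    T = 0.1 * rng.standard_normal((d, N))
    for _ in range(n_iters):
        Phi = quad_features(T)                            # (1 + d + d^2, N)
        F = Phi.shape[0]
        W = np.linalg.solve(Phi @ Phi.T + lam * np.eye(F), Phi @ X.T).T
        R = W @ Phi - X                                   # residual (D, N)
        W1 = W[:, 1:1 + d]                                # linear part
        W2 = W[:, 1 + d:].reshape(D, d, d)                # quadratic part
        # gradient of the squared loss with respect to the codes T
        grad = 2 * (W1.T @ R
                    + np.einsum('dn,dkj,jn->kn', R, W2, T)
                    + np.einsum('dn,djk,jn->kn', R, W2, T))
        T -= lr * grad
    return W, T
```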
arXiv Detail & Related papers (2023-01-30T15:09:00Z)
- Learning Graphical Factor Models with Riemannian Optimization
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- Unitary Approximate Message Passing for Matrix Factorization
We consider matrix factorization (MF) with certain constraints, which finds wide applications in various areas.
We develop a Bayesian approach to MF with an efficient message passing implementation, called UAMPMF.
We show that UAMPMF significantly outperforms state-of-the-art algorithms in terms of recovery accuracy, robustness and computational complexity.
arXiv Detail & Related papers (2022-07-31T12:09:32Z)
- Test Set Sizing Via Random Matrix Theory
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test set sizes of any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Weighted Low Rank Matrix Approximation and Acceleration
Low-rank matrix approximation (LRMA) is one of the central concepts in machine learning.
Low-rank matrix completion (LRMC) solves the LRMA problem when some observations are missing.
We propose an algorithm for solving the weighted problem, as well as two acceleration techniques.
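A minimal sketch of the weighted objective may help (plain gradient descent only, without the paper's acceleration techniques; names are illustrative): minimize $\sum_{ij} W_{ij}(A_{ij} - (UV^\top)_{ij})^2$, which reduces to LRMC when $W$ is a 0/1 observation mask.

```python
import numpy as np

def weighted_lrma(A, W, rank, n_iters=1000, lr=1e-3, seed=0):
    """Gradient-descent sketch of weighted low-rank approximation:
    minimize sum_ij W_ij * (A_ij - (U V^T)_ij)^2 with weights W_ij >= 0.
    With a 0/1 mask W this is exactly the matrix-completion setting."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    for _ in range(n_iters):
        R = W * (U @ V.T - A)                       # weighted residual
        # simultaneous gradient step on both factors
        U, V = U - 2 * lr * (R @ V), V - 2 * lr * (R.T @ U)
    return U, V
```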
arXiv Detail & Related papers (2021-09-22T22:03:48Z)
- Self-supervised Symmetric Nonnegative Matrix Factorization
Symmetric nonnegative matrix factorization (SNMF) has been demonstrated to be a powerful method for data clustering.
Inspired by ensemble clustering, which aims to seek better clustering results, we propose self-supervised SNMF (S$^3$NMF).
We take advantage of SNMF's sensitivity to initialization, without relying on any additional information.
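For context, here is a sketch of the plain SNMF building block that S$^3$NMF starts from; the self-supervision itself is not shown, and the $\beta = 1/2$ multiplicative update follows the standard SNMF literature rather than this paper specifically.

```python
import numpy as np

def snmf(A, k, n_iters=200, eps=1e-10, seed=0):
    """Plain symmetric NMF: approximate a symmetric nonnegative affinity
    matrix A by H H^T with H >= 0, using the beta = 1/2 multiplicative
    update; clusters are read off as the argmax over columns of H."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))
    for _ in range(n_iters):
        AH = A @ H
        HHtH = H @ (H.T @ H)
        H *= 0.5 * (1.0 + AH / (HHtH + eps))   # multiplicative: H stays >= 0
    return H.argmax(axis=1)                    # hard cluster assignments
```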
arXiv Detail & Related papers (2021-03-02T12:47:40Z)
- Information-Theoretic Limits for the Matrix Tensor Product
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.