Geometry of EM and related iterative algorithms
- URL: http://arxiv.org/abs/2209.01301v1
- Date: Sat, 3 Sep 2022 00:23:23 GMT
- Title: Geometry of EM and related iterative algorithms
- Authors: Hideitsu Hino and Shotaro Akaho and Noboru Murata
- Abstract summary: The Expectation--Maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a methodology for statistical inference.
In this paper, we introduce the $em$ algorithm, an information geometric formulation of the EM algorithm, and its extensions and applications to various problems.
- Score: 8.228889210180268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Expectation--Maximization (EM) algorithm is a simple meta-algorithm that
has been used for many years as a methodology for statistical inference when
there are missing measurements in the observed data or when the data is
composed of observables and unobservables. Its general properties are well
studied, and there are countless ways to apply it to individual problems.
In this paper, we introduce the $em$ algorithm, an information geometric
formulation of the EM algorithm, and its extensions and applications to various
problems. Specifically, we show that the geometric perspective makes it
possible to formulate an outlier-robust inference algorithm, an algorithm for
computing channel capacity, parameter estimation methods on the probability
simplex, particular multivariate analysis methods such as principal component
analysis in a space of probability models, modal regression, matrix
factorization, and the learning of generative models, which have recently
attracted attention in deep learning.
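To make the alternating structure concrete, below is a minimal NumPy sketch of the classical EM iteration for a two-component univariate Gaussian mixture; the information geometric $em$ algorithm reinterprets these two steps as alternating $e$- and $m$-projections between a data manifold and a model manifold. The model and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def em_gmm(x, n_iter=100, seed=0):
    """EM for a two-component univariate Gaussian mixture."""
    rng = np.random.default_rng(seed)
    w = np.array([0.5, 0.5])                   # mixture weights
    mu = rng.choice(x, size=2, replace=False)  # initial means
    var = np.array([x.var(), x.var()])         # initial variances
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i).
        dens = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from weighted statistics.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Example: recover a mixture of N(0, 1) and N(4, 0.25).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 0.5, 200)])
print(em_gmm(x))
```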
Related papers
- Elliptical Wishart distributions: information geometry, maximum likelihood estimator, performance analysis and statistical learning [7.499241611226476]
Two algorithms for computing the maximum likelihood estimator (MLE) are proposed.
For the $t$-Wishart distribution, the performance of the MLE and statistical learning algorithms is evaluated.
arXiv Detail & Related papers (2024-11-05T01:52:27Z)
- An evaluation framework for dimensionality reduction through sectional curvature [59.40521061783166]
In this work, we aim to introduce the first fully unsupervised performance metric for dimensionality reduction.
To test its feasibility, the metric has been used to evaluate the performance of the most commonly used dimensionality reduction algorithms.
A new parameterized problem instance generator has been constructed in the form of a function generator.
arXiv Detail & Related papers (2023-03-17T11:59:33Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the unidentifiability region of the model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections [0.0]
Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP).
These algorithms can be used to understand several classes of mathematical functions, including their impact on datasets.
We illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework.
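As a hedged illustration of this workflow (not code from the paper), one can sweep a t-SNE hyperparameter such as the perplexity and compare the resulting embeddings visually; the dataset and parameter grid below are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Sweep a key t-SNE parameter and inspect the embeddings side by side,
# the kind of visual, inspection-driven tuning described above.
X, y = load_digits(return_X_y=True)
perplexities = [5, 30, 50]
fig, axes = plt.subplots(1, len(perplexities), figsize=(12, 4))
for ax, perp in zip(axes, perplexities):
    emb = TSNE(n_components=2, perplexity=perp, random_state=0).fit_transform(X)
    ax.scatter(emb[:, 0], emb[:, 1], c=y, s=5, cmap="tab10")
    ax.set_title(f"perplexity={perp}")
plt.show()
```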
arXiv Detail & Related papers (2022-07-12T20:30:33Z)
- Group invariant machine learning by fundamental domain projections [0.5156484100374058]
We approach the well-studied problem of supervised group invariant and equivariant machine learning from the point of view of geometric topology.
We propose a novel approach using a pre-processing step, which involves projecting the input data into a geometric space.
This new data can then be the input for an arbitrary machine learning model.
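For intuition, here is a minimal sketch under the simplest assumption of this setting: the symmetric group acting on input coordinates by permutation. Sorting each input projects it onto a fundamental domain, so any downstream model becomes permutation invariant by construction; the paper's general construction covers other group actions.

```python
import numpy as np

def project_to_fundamental_domain(X):
    """Project each row onto a fundamental domain of the symmetric
    group S_n acting by permuting coordinates: sort the coordinates.
    Any two inputs in the same orbit map to the same point."""
    return np.sort(X, axis=1)

# Any model trained on the projected data is permutation invariant.
X = np.array([[3.0, 1.0, 2.0],
              [1.0, 2.0, 3.0]])   # same S_3 orbit
print(project_to_fundamental_domain(X))  # both rows become [1. 2. 3.]
```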
arXiv Detail & Related papers (2022-02-04T14:45:57Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Quantum Algorithms for Data Representation and Analysis [68.754953879193]
We provide quantum procedures that speed-up the solution of eigenproblems for data representation in machine learning.
The power and practical use of these subroutines is shown through new quantum algorithms, sublinear in the input matrix's size, for principal component analysis, correspondence analysis, and latent semantic analysis.
Results show that the run-time parameters that do not depend on the input's size are reasonable and that the error on the computed model is small, allowing for competitive classification performances.
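For orientation, the eigenproblem behind principal component analysis that such quantum subroutines aim to accelerate is, classically, the eigendecomposition of the sample covariance matrix; a minimal classical baseline (not the paper's quantum algorithm) looks as follows.

```python
import numpy as np

# Classical baseline for the PCA eigenproblem: quantum subroutines
# of the kind described above target exactly this computation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)                  # center the data
cov = Xc.T @ Xc / (len(X) - 1)           # sample covariance
eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (ascending)
components = eigvecs[:, ::-1][:, :3]     # top-3 principal directions
scores = Xc @ components                 # project data onto them
print(eigvals[::-1][:3])
```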
arXiv Detail & Related papers (2021-04-19T00:41:43Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable Models (GLLVMs) generalize Gaussian factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields guarantees of statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site makes no guarantee as to the quality of its content (including all information) and is not responsible for any consequences of its use.