Geometry of EM and related iterative algorithms
- URL: http://arxiv.org/abs/2209.01301v1
- Date: Sat, 3 Sep 2022 00:23:23 GMT
- Title: Geometry of EM and related iterative algorithms
- Authors: Hideitsu Hino and Shotaro Akaho and Noboru Murata
- Abstract summary: The Expectation--Maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a methodology for statistical inference.
In this paper, we introduce the $em$ algorithm, an information geometric formulation of the EM algorithm, and its extensions and applications to various problems.
- Score: 8.228889210180268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Expectation--Maximization (EM) algorithm is a simple meta-algorithm that
has been used for many years as a methodology for statistical inference when
there are missing measurements in the observed data or when the data is
composed of observables and unobservables. Its general properties are well
studied, and there are countless ways to apply it to individual problems.
In this paper, we introduce the $em$ algorithm, an information geometric
formulation of the EM algorithm, and its extensions and applications to various
problems. Specifically, we show that the geometric perspective makes it
possible to formulate an outlier-robust inference algorithm, an algorithm for
computing channel capacity, parameter estimation methods on the probability
simplex, particular multivariate analysis methods such as principal component
analysis in a space of probability models, modal regression, matrix
factorization, and the learning of generative models, which have recently
attracted attention in deep learning.
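To make the alternating structure concrete, below is a minimal NumPy sketch of the classical EM iteration for a two-component univariate Gaussian mixture; the information geometric $em$ algorithm reinterprets these two steps as alternating $e$- and $m$-projections between a data manifold and a model manifold. The model and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def em_gmm(x, n_iter=100, seed=0):
    """EM for a two-component univariate Gaussian mixture."""
    rng = np.random.default_rng(seed)
    w = np.array([0.5, 0.5])                   # mixture weights
    mu = rng.choice(x, size=2, replace=False)  # initial means
    var = np.array([x.var(), x.var()])         # initial variances
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i).
        dens = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from weighted statistics.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Example: recover a mixture of N(0, 1) and N(4, 0.25).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 0.5, 200)])
print(em_gmm(x))
```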
Related papers
- Elliptical Wishart distributions: information geometry, maximum likelihood estimator, performance analysis and statistical learning [7.499241611226476]
Two algorithms for computing the maximum likelihood estimator (MLE) are proposed.
For the $t$-Wishart distribution, the performance of the MLE and statistical learning algorithms is evaluated.
arXiv Detail & Related papers (2024-11-05T01:52:27Z)
- An evaluation framework for dimensionality reduction through sectional curvature [59.40521061783166]
In this work, we aim to introduce the first fully unsupervised performance metric for dimensionality reduction.
To test its feasibility, the metric has been used to evaluate the performance of the most commonly used dimensionality reduction algorithms.
A new parameterized problem instance generator has been constructed in the form of a function generator.
arXiv Detail & Related papers (2023-03-17T11:59:33Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the unidentifiability region of the model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections [0.0]
Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP).
These algorithms can be used to understand several classes of mathematical functions, including their impact on datasets.
We illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework.
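As a hedged illustration of this workflow (not code from the paper), one can sweep a t-SNE hyperparameter such as the perplexity and compare the resulting embeddings visually; the dataset and parameter grid below are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Sweep a key t-SNE parameter and inspect the embeddings side by side,
# the kind of visual, inspection-driven tuning described above.
X, y = load_digits(return_X_y=True)
perplexities = [5, 30, 50]
fig, axes = plt.subplots(1, len(perplexities), figsize=(12, 4))
for ax, perp in zip(axes, perplexities):
    emb = TSNE(n_components=2, perplexity=perp, random_state=0).fit_transform(X)
    ax.scatter(emb[:, 0], emb[:, 1], c=y, s=5, cmap="tab10")
    ax.set_title(f"perplexity={perp}")
plt.show()
```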
arXiv Detail & Related papers (2022-07-12T20:30:33Z)
- Group invariant machine learning by fundamental domain projections [0.5156484100374058]
We approach the well-studied problem of supervised group invariant and equivariant machine learning from the point of view of geometric topology.
We propose a novel approach using a pre-processing step, which involves projecting the input data into a geometric space.
This new data can then be the input for an arbitrary machine learning model.
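For intuition, here is a minimal sketch under the simplest assumption of this setting: the symmetric group acting on input coordinates by permutation. Sorting each input projects it onto a fundamental domain, so any downstream model becomes permutation invariant by construction; the paper's general construction covers other group actions.

```python
import numpy as np

def project_to_fundamental_domain(X):
    """Project each row onto a fundamental domain of the symmetric
    group S_n acting by permuting coordinates: sort the coordinates.
    Any two inputs in the same orbit map to the same point."""
    return np.sort(X, axis=1)

# Any model trained on the projected data is permutation invariant.
X = np.array([[3.0, 1.0, 2.0],
              [1.0, 2.0, 3.0]])   # same S_3 orbit
print(project_to_fundamental_domain(X))  # both rows become [1. 2. 3.]
```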
arXiv Detail & Related papers (2022-02-04T14:45:57Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Quantum Algorithms for Data Representation and Analysis [68.754953879193]
We provide quantum procedures that speed-up the solution of eigenproblems for data representation in machine learning.
The power and practical use of these subroutines is shown through new quantum algorithms, sublinear in the input matrix's size, for principal component analysis, correspondence analysis, and latent semantic analysis.
Results show that the run-time parameters that do not depend on the input's size are reasonable and that the error on the computed model is small, allowing for competitive classification performances.
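For orientation, the eigenproblem behind principal component analysis that such quantum subroutines aim to accelerate is, classically, the eigendecomposition of the sample covariance matrix; a minimal classical baseline (not the paper's quantum algorithm) looks as follows.

```python
import numpy as np

# Classical baseline for the PCA eigenproblem: quantum subroutines
# of the kind described above target exactly this computation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)                  # center the data
cov = Xc.T @ Xc / (len(X) - 1)           # sample covariance
eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (ascending)
components = eigvecs[:, ::-1][:, :3]     # top-3 principal directions
scores = Xc @ components                 # project data onto them
print(eigvals[::-1][:3])
```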
arXiv Detail & Related papers (2021-04-19T00:41:43Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable Models (GLLVMs) generalize Gaussian factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields guarantees of statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site makes no guarantee as to the quality of its content (including all information) and is not responsible for any consequences of its use.