EM's Convergence in Gaussian Latent Tree Models
- URL: http://arxiv.org/abs/2211.11904v1
- Date: Mon, 21 Nov 2022 23:12:58 GMT
- Title: EM's Convergence in Gaussian Latent Tree Models
- Authors: Yuval Dagan, Constantinos Daskalakis, Anthimos Vardis Kandiros
- Abstract summary: We show that the unique non-trivial stationary point of the population log-likelihood is its global maximum.
We establish that the expectation-maximization algorithm is guaranteed to converge to it in the single latent variable case.
- Score: 22.987933817370305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the optimization landscape of the log-likelihood function and the
convergence of the Expectation-Maximization (EM) algorithm in latent Gaussian
tree models, i.e., tree-structured Gaussian graphical models whose leaf nodes
are observable and non-leaf nodes are unobservable. We show that the unique
non-trivial stationary point of the population log-likelihood is its global
maximum, and establish that the expectation-maximization algorithm is
guaranteed to converge to it in the single latent variable case. Our results
for the landscape of the log-likelihood function in general latent tree models
provide support for the extensive practical use of maximum-likelihood-based
methods in this setting. Our results for the EM algorithm extend an
emerging line of work on obtaining global convergence guarantees for this
celebrated algorithm. We show our results for the non-trivial stationary points
of the log-likelihood by arguing that a certain system of polynomial equations
obtained from the EM updates has a unique non-trivial solution. The global
convergence of the EM algorithm follows by arguing that all trivial fixed
points are higher-order saddle points.
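In the single latent variable case the tree is a star: one unobserved root connected to observable leaves, i.e., a one-factor Gaussian model x = lam * z + eps. As a concrete illustration of the EM updates whose fixed points are analyzed, here is a minimal finite-sample sketch in numpy (the star model, iteration count, and toy data are illustrative assumptions, not the paper's population-EM setting):

```python
import numpy as np

def em_star_tree(X, n_iters=200, seed=0):
    """EM for a star-shaped Gaussian latent tree: one latent root z ~ N(0, 1),
    observed leaves x = lam * z + eps with eps ~ N(0, diag(psi))."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    lam = rng.standard_normal(d)            # root-to-leaf edge weights
    psi = np.ones(d)                        # leaf noise variances
    for _ in range(n_iters):
        # E-step: z | x is Gaussian with shared variance v and per-sample mean m.
        v = 1.0 / (1.0 + np.sum(lam**2 / psi))
        m = v * (X @ (lam / psi))           # posterior means, shape (n,)
        ez2 = v + m**2                      # E[z^2 | x] per sample
        # M-step: closed-form maximizers of the expected complete log-likelihood.
        lam = (X.T @ m) / ez2.sum()
        psi = np.maximum(np.mean(X**2, axis=0) - lam * (X.T @ m) / n, 1e-8)
    return lam, psi

# Toy data from a true star model; EM recovers lam up to a global sign flip.
rng = np.random.default_rng(1)
true_lam = np.array([1.0, -0.5, 2.0, 0.8])
z = rng.standard_normal(5000)
X = np.outer(z, true_lam) + 0.3 * rng.standard_normal((5000, 4))
print(em_star_tree(X)[0])
```

The global sign flip is the kind of trivial symmetry behind the abstract's "non-trivial" qualifier: lam and -lam induce the same leaf distribution, so stationary points come in symmetric pairs.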
Related papers
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
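The inverse-free ingredient can be mimicked in a few lines: instead of forming a matrix inverse, run a fixed number of conjugate-gradient steps and, in an autodiff framework, backpropagate through the unrolled iterations. A minimal numpy sketch of the solver-unrolling idea (a generic sketch under placeholder inputs, not the paper's estimator):

```python
import numpy as np

def unrolled_cg(A, b, n_steps=20):
    """Fixed-depth conjugate gradient: approximates A^{-1} b without forming
    an inverse; the fixed unroll depth is what makes it easy to differentiate
    through in autodiff frameworks."""
    x = np.zeros_like(b)
    r = b - A @ x                  # residual
    p = r.copy()                   # search direction
    rs = r @ r
    for _ in range(n_steps):
        Ap = A @ p
        alpha = rs / (p @ Ap)      # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # conjugate direction update
        rs = rs_new
    return x

# Toy check against an explicit solve.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)      # symmetric positive definite
b = rng.standard_normal(50)
print(np.linalg.norm(unrolled_cg(A, b) - np.linalg.solve(A, b)))  # ~0
```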
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- CITS: Coherent Ising Tree Search Algorithm Towards Solving Combinatorial Optimization Problems [0.0]
This paper proposes a search algorithm that expands the search space from a Markov chain to a depth-limited tree, building on simulated annealing (SA).
At each iteration, the algorithm selects the best near-optimal solution within the feasible search space by exploring along the tree in the sense of 'look ahead'.
Our results show that, on top of the primal heuristics SA and CIM, our high-level tree search strategy is able to provide solutions within fewer epochs for Ising-formulated NP optimization problems.
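A stripped-down caricature of the look-ahead step (hypothetical code, not the authors' CITS implementation, which builds on SA/CIM dynamics and prunes the tree): expand single-spin flips to a fixed depth and commit the first move of the best branch by Ising energy.

```python
import numpy as np

def ising_energy(J, s):
    """Energy of spin vector s in {-1, +1}^n under symmetric coupling J."""
    return -0.5 * s @ J @ s

def lookahead(J, s, depth):
    """Best energy reachable within `depth` flips, plus the first flip index."""
    if depth == 0:
        return ising_energy(J, s), None
    best_e, best_i = np.inf, None
    for i in range(len(s)):
        s[i] *= -1                          # tentatively flip spin i
        e, _ = lookahead(J, s, depth - 1)
        s[i] *= -1                          # undo
        if e < best_e:
            best_e, best_i = e, i
    return best_e, best_i

def tree_search(J, n_epochs=30, depth=2, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=J.shape[0])
    for _ in range(n_epochs):
        _, i = lookahead(J, s, depth)
        s[i] *= -1                          # commit the first move of the best branch
    return s, ising_energy(J, s)

rng = np.random.default_rng(1)
n = 12
J = rng.standard_normal((n, n)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
print(tree_search(J)[1])
```

The exhaustive expansion here costs O(n^depth) per move; a practical implementation would prune branches and inherit SA's acceptance rule rather than commit greedily.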
arXiv Detail & Related papers (2022-03-09T10:07:26Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
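For intuition, a toy scaled-gradient-style iteration for the top generalized eigenvector of a matrix pair (A, B), the problem underlying correlation analysis (a simplified sketch, not the paper's exact SSD recursion; the noisy matrix samples stand in for a stochastic oracle):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Ma = rng.standard_normal((d, d)); A = Ma @ Ma.T                  # target matrix
Mb = rng.standard_normal((d, d)); B = Mb @ Mb.T + d * np.eye(d)  # metric matrix

v = rng.standard_normal(d)
eta = 1e-3
for t in range(20000):
    # Noisy symmetric samples of A and B, mimicking streaming estimates.
    Na = rng.standard_normal((d, d)); At = A + 0.5 * (Na + Na.T)
    Nb = rng.standard_normal((d, d)); Bt = B + 0.5 * (Nb + Nb.T)
    rho = (v @ At @ v) / (v @ Bt @ v)     # Rayleigh quotient estimate
    v += eta * (At @ v - rho * Bt @ v)    # ascent on v'Av / v'Bv
    v /= np.sqrt(v @ B @ v)               # B-normalize (population B, for simplicity)

# Exact top generalized eigenvector for comparison, via symmetric reduction.
Bw, BU = np.linalg.eigh(B)
B_inv_half = BU @ np.diag(1.0 / np.sqrt(Bw)) @ BU.T
_, U = np.linalg.eigh(B_inv_half @ A @ B_inv_half)
v_star = B_inv_half @ U[:, -1]            # B-normalized top eigenvector
print(abs(v @ B @ v_star))                # should approach 1
```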
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z)
- Regularization of Mixture Models for Robust Principal Graph Learning [0.0]
A regularized version of Mixture Models is proposed to learn a principal graph from a distribution of $D$-dimensional data points.
Parameters of the model are iteratively estimated through an Expectation-Maximization procedure.
arXiv Detail & Related papers (2021-06-16T18:00:02Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
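A compact sketch of that pipeline (standard spectral embedding with the random walk Laplacian; the degree-corrected two-block graph below is a made-up example, not the paper's experiments):

```python
import numpy as np

def random_walk_embedding(A, k):
    """Spectral embedding from the random walk Laplacian L_rw = I - D^{-1} A.
    Computed via the symmetric Laplacian: if L_sym u = lam u, then
    v = D^{-1/2} u is an eigenvector of L_rw with the same eigenvalue."""
    deg = np.maximum(A.sum(axis=1), 1e-12)       # guard against isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)                 # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * U[:, :k]        # k smoothest eigenvectors of L_rw

# Degree-corrected block model with K = 2 planted communities.
rng = np.random.default_rng(0)
n = 200
z = np.repeat([0, 1], n // 2)
P = np.where(z[:, None] == z[None, :], 0.15, 0.02)
theta = rng.uniform(0.3, 1.0, n)                 # per-node degree weights
A = (rng.random((n, n)) < theta[:, None] * theta[None, :] * P).astype(float)
A = np.triu(A, 1); A = A + A.T                   # symmetric, no self-loops
X = random_walk_embedding(A, k=2)
# Rows of X concentrate near two distinct points, one per community.
```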
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
- From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering [33.000371053304676]
We present the first continuous relaxation of Dasgupta's discrete optimization problem with provable quality guarantees.
We show that even approximate solutions found with gradient descent have superior quality to agglomerative clusterings.
We also highlight the flexibility of HypHC using end-to-end training in a downstream classification task.
arXiv Detail & Related papers (2020-10-01T13:43:19Z)
- ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel algorithm, referred to as Recursive One-Over-T SGD (ROOT-SGD), based on an easily implementable averaging of past gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an asymptotic sense.
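The recursion itself fits in a few lines: evaluate each fresh sample's gradient at both the current and the previous iterate, and blend the corrected estimates with one-over-t weights. A toy streaming least-squares sketch (the update below is the commonly cited form of the ROOT-SGD recursion; the stepsize and problem are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x_star = rng.standard_normal(d)                 # ground-truth parameter

def grad(x, a, y):
    """Gradient of the per-sample loss 0.5 * (a @ x - y)^2."""
    return (a @ x - y) * a

x = np.zeros(d)
x_prev = x.copy()
v = np.zeros(d)
eta = 0.05
for t in range(1, 20001):
    a = rng.standard_normal(d)                  # fresh streaming sample
    y = a @ x_star + 0.1 * rng.standard_normal()
    g_cur, g_prev = grad(x, a, y), grad(x_prev, a, y)  # same sample, two iterates
    v = g_cur + (1.0 - 1.0 / t) * (v - g_prev)  # recursive one-over-t averaging
    x_prev = x.copy()
    x -= eta * v

print(np.linalg.norm(x - x_star))               # small after 20k samples
```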
arXiv Detail & Related papers (2020-08-28T14:46:56Z)
- Inference with Aggregate Data: An Optimal Transport Approach [16.274397329511196]
We consider inference (filtering) problems over probabilistic graphical models with aggregate data generated by a large population of individuals.
We propose a new efficient belief propagation algorithm over tree-structured graphs with tractable computational complexity as well as a global convergence guarantee.
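For reference, plain sum-product belief propagation on a tree (a generic textbook sketch, not the paper's aggregate-data, optimal-transport variant): messages flow leaves-to-root and back, giving exact marginals in time linear in the number of edges.

```python
import numpy as np

def tree_bp(n, edges, node_pot, edge_pot):
    """Exact sum-product BP on a tree of n discrete nodes. node_pot[i] is a
    (k,) unary potential; edge_pot[(i, j)] is a (k, k) pairwise potential
    indexed [state_i, state_j] for each undirected edge (i, j)."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    msgs = {}                                    # msgs[(i, j)] = message i -> j

    def pot(i, j):
        # Pairwise potential oriented as [state_i, state_j].
        return edge_pot[(i, j)] if (i, j) in edge_pot else edge_pot[(j, i)].T

    def send(i, j):
        # Message i -> j: collect i's other incoming messages, marginalize i.
        if (i, j) in msgs:
            return
        m = node_pot[i].copy()
        for k in nbrs[i]:
            if k != j:
                send(k, i)
                m = m * msgs[(k, i)]
        msg = pot(i, j).T @ m                    # sum over states of i
        msgs[(i, j)] = msg / msg.sum()           # normalize for stability

    marginals = []
    for i in range(n):
        b = node_pot[i].copy()
        for k in nbrs[i]:
            send(k, i)
            b = b * msgs[(k, i)]
        marginals.append(b / b.sum())
    return marginals

# Three-node chain 0 - 1 - 2 with binary states and attractive couplings.
node_pot = [np.array([1.0, 2.0]), np.array([1.0, 1.0]), np.array([3.0, 1.0])]
edge_pot = {(0, 1): np.array([[2.0, 1.0], [1.0, 2.0]]),
            (1, 2): np.array([[2.0, 1.0], [1.0, 2.0]])}
print(tree_bp(3, [(0, 1), (1, 2)], node_pot, edge_pot))
```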
arXiv Detail & Related papers (2020-03-31T03:12:20Z)
- Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximation framework for such non-trivial ERGs that results in dyadic-independence (i.e., edge-independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
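Dyadic independence is what makes sampling cheap: once block-approximated edge probabilities are in hand, every dyad is an independent Bernoulli draw. A tiny sketch (the block structure below is a stand-in, not the authors' fitted model; at millions of nodes one would sample per-block edge counts instead of materializing a dense probability matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical block-approximated model: each node has a block label, and
# P_blocks[a, b] is the edge probability between blocks a and b.
labels = rng.integers(0, 3, size=1000)
P_blocks = np.array([[0.05, 0.01, 0.01],
                     [0.01, 0.04, 0.02],
                     [0.01, 0.02, 0.06]])
P = P_blocks[labels[:, None], labels[None, :]]    # n x n edge probabilities

# Edge-independent sampling: each dyad is its own Bernoulli coin flip.
upper = np.triu(rng.random(P.shape) < P, k=1)     # sample the upper triangle
A = (upper | upper.T).astype(np.int8)             # symmetric adjacency, no loops
print(A.sum() // 2, "edges sampled")
```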
arXiv Detail & Related papers (2020-02-14T11:42:16Z)