Multivariate Gaussian Variational Inference by Natural Gradient Descent
- URL: http://arxiv.org/abs/2001.10025v2
- Date: Mon, 19 Oct 2020 13:54:16 GMT
- Title: Multivariate Gaussian Variational Inference by Natural Gradient Descent
- Authors: Timothy D. Barfoot
- Abstract summary: We show that there are some advantages to choosing a parameterization comprising the mean and inverse covariance matrix.
We provide a simple NGD update that accounts for the symmetric (and sparse) nature of the inverse covariance matrix.
- Score: 14.670851095242451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This short note reviews so-called Natural Gradient Descent (NGD) for
multivariate Gaussians. The Fisher Information Matrix (FIM) is derived for
several different parameterizations of Gaussians. Careful attention is paid to
the symmetric nature of the covariance matrix when calculating derivatives. We
show that there are some advantages to choosing a parameterization comprising
the mean and inverse covariance matrix and provide a simple NGD update that
accounts for the symmetric (and sparse) nature of the inverse covariance
matrix.
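As a rough numerical illustration of the setting (not the note's exact derivation or update rule), the sketch below runs natural-gradient descent on a Gaussian variational factor parameterized by the mean and inverse covariance, using a Gaussian target so that the gradients of the KL divergence are available in closed form. The mean step is preconditioned by the covariance, consistent with the standard fact that the FIM block for the mean of a Gaussian is the inverse covariance, and the precision is symmetrized explicitly after each step.
```python
import numpy as np

# Minimal sketch of natural-gradient descent (NGD) for a Gaussian variational
# factor q(x) = N(mu, Sigma), parameterized by the mean mu and the inverse
# covariance Lam = inv(Sigma).  The target p(x) = N(m, S) is an assumption made
# so the gradients of KL(q || p) are closed-form; this is a common
# natural-gradient VI recursion, not necessarily the note's exact scheme.

def kl_gradients(mu, Sigma, m, S_inv):
    """Gradients of KL(N(mu, Sigma) || N(m, S)) w.r.t. mu and Sigma."""
    g_mu = S_inv @ (mu - m)                           # dKL/dmu
    g_Sigma = 0.5 * (S_inv - np.linalg.inv(Sigma))    # dKL/dSigma
    return g_mu, g_Sigma

def ngd_step(mu, Lam, m, S_inv, beta=0.5):
    """One natural-gradient step on (mu, Lam); beta is the step size."""
    g_mu, g_Sigma = kl_gradients(mu, np.linalg.inv(Lam), m, S_inv)
    Lam_new = Lam + 2.0 * beta * g_Sigma              # precision update
    Lam_new = 0.5 * (Lam_new + Lam_new.T)             # keep the precision symmetric
    mu_new = mu - beta * np.linalg.solve(Lam_new, g_mu)  # covariance-preconditioned mean step
    return mu_new, Lam_new

# Toy 2-D target; with beta = 1.0 the update recovers (m, S) in a single step.
m = np.array([1.0, -2.0])
S = np.array([[2.0, 0.3], [0.3, 0.5]])
S_inv = np.linalg.inv(S)

mu, Lam = np.zeros(2), np.eye(2)
for _ in range(20):
    mu, Lam = ngd_step(mu, Lam, m, S_inv, beta=0.5)
print(mu, np.linalg.inv(Lam))   # approaches m and S
```
With a unit step size the recursion matches the Gaussian target exactly after one iteration, which is the behavior one expects from an exact natural-gradient step on this toy problem.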
Related papers
- Adaptive posterior distributions for uncertainty analysis of covariance matrices in Bayesian inversion problems for multioutput signals [0.0]
We address the problem of performing Bayesian inference for the parameters of a nonlinear multi-output model.
The variables of interest are split into two blocks, and the inference takes advantage of known analytical optimization formulas.
arXiv Detail & Related papers (2025-01-02T09:01:09Z)
- Entrywise application of non-linear functions on orthogonally invariant matrices [44.99833362998488]
We investigate how the entrywise application of a non-linear function to symmetric invariant random matrix ensembles alters the spectral distribution.
We find that in all those cases a Gaussian equivalence principle holds, that is, the effect of the non-linear function is the same as taking a linear combination of the involved matrices and an additional independent GOE (Gaussian Orthogonal Ensemble) matrix.
arXiv Detail & Related papers (2024-12-09T19:41:09Z)
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry [63.694184882697435]
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
This paper provides a comprehensive and unified understanding of the matrix logarithm and power from a Riemannian geometry perspective.
arXiv Detail & Related papers (2024-07-15T07:11:44Z) - Convex Parameter Estimation of Perturbed Multivariate Generalized
Gaussian Distributions [18.95928707619676]
We propose a convex formulation with well-established properties for multivariate generalized Gaussian distribution (MGGD) parameters.
The proposed framework is flexible as it combines a variety of regularizations for the precision matrix, the mean and perturbations.
Experiments show a more accurate precision and covariance matrix estimation with similar performance for the mean vector parameter.
arXiv Detail & Related papers (2023-12-12T18:08:04Z) - Intrinsic Bayesian Cramér-Rao Bound with an Application to Covariance Matrix Estimation [49.67011673289242]
This paper presents a new performance bound for estimation problems where the parameter to estimate lies in a smooth manifold.
It induces a geometry for the parameter manifold, as well as an intrinsic notion of the estimation error measure.
arXiv Detail & Related papers (2023-11-08T15:17:13Z) - Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, the proposed Manifold Gaussian Variational Bayes on the Precision matrix (MGVBP) algorithm stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares (a toy comparison is sketched after this list).
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Approximation Based Variance Reduction for Reparameterization Gradients [38.73307745906571]
Flexible variational distributions improve variational inference but are harder to optimize.
We present a control variate that is applicable to any reparameterizable distribution with known mean and covariance matrix.
It leads to large improvements in gradient variance and optimization convergence for inference with non-factorized variational distributions.
arXiv Detail & Related papers (2020-07-29T06:55:11Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
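As the toy comparison referenced in the constant-stepsize SGD entry above, the sketch below contrasts tail-averaged constant-stepsize SGD with the ordinary least-squares solution on synthetic linear-regression data; the data-generating model, step size, and averaging window are illustrative assumptions, not values taken from that paper.
```python
import numpy as np

# Illustrative sketch (synthetic data and hyperparameters are assumptions):
# compare tail-averaged constant-stepsize SGD with ordinary least squares (OLS)
# on a simple linear regression problem.
rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# OLS via the normal equations.
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Constant-stepsize SGD with tail averaging over the second half of the pass.
step = 0.01
w = np.zeros(d)
tail_sum, tail_count = np.zeros(d), 0
for t in range(n):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]      # gradient of 0.5 * (x_i @ w - y_i)**2
    w -= step * grad
    if t >= n // 2:
        tail_sum += w
        tail_count += 1
w_sgd = tail_sum / tail_count

print("||w_sgd - w_ols|| =", np.linalg.norm(w_sgd - w_ols))
```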
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.