Natural Gradient Hybrid Variational Inference with Application to Deep Mixed Models
- URL: http://arxiv.org/abs/2302.13536v2
- Date: Wed, 24 Jul 2024 23:23:48 GMT
- Title: Natural Gradient Hybrid Variational Inference with Application to Deep Mixed Models
- Authors: Weiben Zhang, Michael Stanley Smith, Worapree Maneesoonthorn, Ruben Loaiza-Maya
- Abstract summary: We propose a fast and accurate variational inference (VI) method for deep mixed models.
It employs a well-defined natural gradient variational optimization that targets the joint posterior of the global parameters and latent variables.
We show that the approach is faster and more accurate than two cutting-edge natural gradient VI methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic models with global parameters and latent variables are common, and variational inference (VI) is popular for them. However, existing methods are often either slow or inaccurate in high dimensions. We suggest a fast and accurate VI method for this case that employs a well-defined natural gradient variational optimization that targets the joint posterior of the global parameters and latent variables. It is a hybrid method, where at each step the global parameters are updated using the natural gradient and the latent variables are generated from their conditional posterior. A fast-to-compute expression for the Tikhonov damped Fisher information matrix is used, along with the re-parameterization trick, to provide a stable natural gradient. We apply the approach to deep mixed models, which are an emerging class of Bayesian neural networks with random output layer coefficients to allow for heterogeneity. A range of simulations show that using the natural gradient is substantially more efficient than using the ordinary gradient, and that the approach is faster and more accurate than two cutting-edge natural gradient VI methods. In a financial application we show that accounting for industry level heterogeneity using the deep mixed model improves the accuracy of asset pricing models. MATLAB code to implement the method can be found at: https://github.com/WeibenZhang07/NG-HVI.
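As a rough, hedged illustration of the hybrid step described in the abstract, the sketch below (NumPy) runs Tikhonov-damped natural-gradient updates on a toy model. It is not the authors' MATLAB implementation (see the repository linked above): the toy model, the diagonal-Gaussian approximation q_lambda(theta) = N(mu, diag(sigma^2)), and the names ng_hvi_step and draw_latent are illustrative assumptions. The structure follows the abstract: at each step the latent variables are drawn from their conditional posterior, the gradient of the variational objective is estimated with the re-parameterization trick, and the global variational parameters are updated by lambda <- lambda + rho * (F + tau*I)^{-1} * g_hat, where F is the Fisher information of q_lambda and tau is the damping.

```python
# Hedged sketch of a natural-gradient hybrid VI (NG-HVI style) loop on a toy
# model: y_i = theta_i + z_i + noise_i, with theta ~ N(0, I), z ~ N(0, I),
# noise ~ N(0, I). The variational approximation is q_lambda(theta) =
# N(mu, diag(sigma^2)) with lambda = (mu, log_sigma). Model and names are
# illustrative assumptions only, not the paper's exact algorithm or API.
import numpy as np

rng = np.random.default_rng(0)

def draw_latent(theta, y):
    """Hybrid step: draw z from its conditional posterior z | theta, y.
    For the toy model, z | theta, y ~ N((y - theta) / 2, 1/2)."""
    return rng.normal(0.5 * (y - theta), np.sqrt(0.5))

def ng_hvi_step(lam, y, step=0.05, damping=1e-2, n_samples=20):
    """One Tikhonov-damped natural-gradient update of lam = (mu, log_sigma)."""
    d = y.size
    mu, log_sigma = lam[:d], lam[d:]
    sigma = np.exp(log_sigma)
    grads = np.zeros((n_samples, 2 * d))
    for s in range(n_samples):
        eps = rng.standard_normal(d)
        theta = mu + sigma * eps                # re-parameterization trick
        z = draw_latent(theta, y)               # latent vars from conditional posterior
        # gradient of log p(y, z, theta) w.r.t. theta for the toy model
        g_theta = (y - theta - z) - theta
        g_mu = g_theta                          # chain rule: d theta / d mu = 1
        g_logsig = g_theta * sigma * eps + 1.0  # +1 from d entropy(q) / d log_sigma
        grads[s] = np.concatenate([g_mu, g_logsig])
    g_hat = grads.mean(axis=0)
    # Closed-form Fisher of the diagonal Gaussian q in (mu, log_sigma)
    # coordinates, with Tikhonov damping for a stable natural gradient.
    fisher_diag = np.concatenate([1.0 / sigma**2, 2.0 * np.ones(d)])
    return lam + step * g_hat / (fisher_diag + damping)

# Usage: a few hundred steps on toy data; the variational mean should approach
# y / 3 and the variance 2/3, the exact marginal posterior for this toy model.
y = rng.normal(size=3)
lam = np.zeros(2 * y.size)                      # mu = 0, log_sigma = 0
for _ in range(300):
    lam = ng_hvi_step(lam, y)
print("variational mean:", lam[:y.size], " exact:", y / 3)
print("variational var :", np.exp(2 * lam[y.size:]))
```

For a diagonal Gaussian in (mu, log_sigma) coordinates the Fisher matrix is diagonal (1/sigma^2 for the mean block, 2 for the log-scale block), so in this simplified setting the damped natural-gradient step costs no more than an ordinary gradient step; the paper's fast expression for the damped Fisher matrix plays the analogous role for its richer Gaussian approximation.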
Related papers
- GP-FL: Model-Based Hessian Estimation for Second-Order Over-the-Air Federated Learning [52.295563400314094]
Second-order methods are widely adopted to improve the convergence rate of learning algorithms.
This paper introduces a novel second-order FL framework tailored for wireless channels.
arXiv Detail & Related papers (2024-12-05T04:27:41Z) - Gradient-based optimization for variational empirical Bayes multiple regression [2.6763498831034043]
We propose alternative optimization approaches based on gradient-based (quasi-Newton) methods.
We show that GradVI produces similar predictive performance and converges in fewer iterations when predictors are highly correlated.
Our methods are implemented in an open-source Python software, GradVI.
arXiv Detail & Related papers (2024-11-21T20:35:44Z) - Neural Gradient Learning and Optimization for Oriented Point Normal
Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability in local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z) - Natural Gradient Methods: Perspectives, Efficient-Scalable
Approximations, and Analysis [0.0]
Natural Gradient Descent is a second-order optimization method motivated by information geometry.
It makes use of the Fisher Information Matrix instead of the Hessian, which is typically used.
Being a second-order method, it is infeasible to apply directly in problems with a huge number of parameters and data points.
arXiv Detail & Related papers (2023-03-06T04:03:56Z) - VI-DGP: A variational inference method with deep generative prior for
solving high-dimensional inverse problems [0.7734726150561089]
We propose a novel approximation method for estimating the high-dimensional posterior distribution.
This approach leverages a deep generative model to learn a prior model capable of generating spatially-varying parameters.
The proposed method can be fully implemented in an automatic differentiation manner.
arXiv Detail & Related papers (2023-02-22T06:48:10Z) - Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z) - Quasi Black-Box Variational Inference with Natural Gradients for
Bayesian Learning [84.90242084523565]
We develop an optimization algorithm suitable for Bayesian learning in complex models.
Our approach relies on natural gradient updates within a general black-box framework for efficient training with limited model-specific derivations.
arXiv Detail & Related papers (2022-05-23T18:54:27Z) - Exploiting Adam-like Optimization Algorithms to Improve the Performance
of Convolutional Neural Networks [82.61182037130405]
Stochastic gradient descent (SGD) is the main approach for training deep networks.
In this work, we compare Adam-based variants that exploit the difference between the present and past gradients.
We test ensembles of networks and their fusion with a ResNet50 trained with stochastic gradient descent.
arXiv Detail & Related papers (2021-03-26T18:55:08Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC
and its applications [5.660384137948734]
We show that the proposed algorithm converges to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated list (including all information) and is not responsible for any consequences of its use.