Projective Integral Updates for High-Dimensional Variational Inference
- URL: http://arxiv.org/abs/2301.08374v2
- Date: Fri, 8 Sep 2023 21:41:31 GMT
- Title: Projective Integral Updates for High-Dimensional Variational Inference
- Authors: Jed A. Duersch
- Abstract summary: Variational inference seeks to improve uncertainty in predictions by optimizing a simplified distribution over parameters to stand in for the full posterior.
This work introduces a fixed-point optimization for variational inference that is applicable when every feasible log density can be expressed as a linear combination of functions from a given basis.
A PyTorch implementation of QNVB allows for better control over model uncertainty during training than competing methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational inference is an approximation framework for Bayesian inference
that seeks to improve quantified uncertainty in predictions by optimizing a
simplified distribution over parameters to stand in for the full posterior.
Capturing model variations that remain consistent with training data enables
more robust predictions by reducing parameter sensitivity. This work introduces
a fixed-point optimization for variational inference that is applicable when
every feasible log density can be expressed as a linear combination of
functions from a given basis. In such cases, the optimizer becomes a
fixed-point of projective integral updates. When the basis spans univariate
quadratics in each parameter, feasible densities are Gaussian and the
projective integral updates yield quasi-Newton variational Bayes (QNVB). Other
bases and updates are also possible. As these updates require high-dimensional
integration, this work first proposes an efficient quasirandom quadrature
sequence for mean-field distributions. Each iterate of the sequence contains
two evaluation points that combine to correctly integrate all univariate
quadratics and, if the mean-field factors are symmetric, all univariate cubics.
More importantly, averaging results over short subsequences achieves periodic
exactness on a much larger space of multivariate quadratics. The corresponding
variational updates require 4 loss evaluations with standard (not second-order)
backpropagation to eliminate error terms from over half of all multivariate
quadratic basis functions. This integration technique is motivated by first
proposing stochastic blocked mean-field quadratures, which may be useful in
other contexts. A PyTorch implementation of QNVB allows for better control over
model uncertainty during training than competing methods. Experiments
demonstrate superior generalizability for multiple learning problems and
architectures.
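To make the Gaussian case concrete, here is a minimal one-dimensional sketch of a projective integral update. It is not the paper's QNVB implementation: it simply projects the log posterior onto the quadratic basis {1, theta, theta^2} under the current variational factor and reads the new Gaussian mean and precision off the linear and quadratic coefficients. The function name, the least-squares projection, and the 20-node Gauss-Hermite rule are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' Hermite nodes

def projective_update_1d(logp, m, s, n_nodes=20):
    """Project logp onto span{1, theta, theta^2} under q = N(m, s^2), then read
    off the Gaussian whose log density matches that quadratic projection."""
    z, w = hermegauss(n_nodes)            # nodes/weights for the N(0, 1) weight
    w = w / w.sum()                       # normalize to probability weights
    theta = m + s * z                     # quadrature points under the current q
    X = np.stack([np.ones_like(theta), theta, theta**2], axis=1)
    W = np.diag(w)
    # weighted least-squares projection of the log density onto the quadratic basis
    c = np.linalg.solve(X.T @ W @ X, X.T @ W @ logp(theta))
    prec = -2.0 * c[2]                    # quadratic coefficient -> new precision
    return c[1] / prec, 1.0 / np.sqrt(prec)   # new mean, new standard deviation

# Sanity check: when the log posterior is exactly quadratic (Gaussian), a single
# update recovers the true mean and standard deviation, i.e. it is a fixed point.
true_m, true_s = 1.5, 0.3
logp = lambda t: -0.5 * ((t - true_m) / true_s) ** 2
print(projective_update_1d(logp, m=0.0, s=1.0))   # approximately (1.5, 0.3)
```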
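The two-point quadrature can likewise be illustrated with a small NumPy sketch. This is a simplified stand-in for the quasirandom sequence described in the abstract, under the assumption that each iterate evaluates the antithetic pair mu +/- sigma * s for a sign vector s; such a pair integrates every univariate quadratic (and, by symmetry, cubic) exactly, and averaging over Hadamard-style sign rows cancels a portion of the multivariate cross terms. The helper names and the specific sign rows are illustrative, not the paper's construction.

```python
import numpy as np

def antithetic_pair(mu, sigma, signs):
    """One quadrature iterate: the two evaluation points mu +/- sigma * signs."""
    return mu + sigma * signs, mu - sigma * signs

def pair_average(loss, mu, sigma, sign_rows):
    """Average the loss over antithetic pairs built from the given sign vectors."""
    vals = []
    for s in sign_rows:
        x_plus, x_minus = antithetic_pair(mu, sigma, s)
        vals.append(0.5 * (loss(x_plus) + loss(x_minus)))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
d = 4
mu = rng.normal(size=d)
sigma = rng.uniform(0.5, 1.5, size=d)

# A diagonal quadratic loss: every antithetic pair integrates it exactly.
a = rng.uniform(0.2, 1.0, size=d)
loss = lambda x: float(np.sum(a * x**2))

# Two Hadamard-style sign rows (four loss evaluations). Cross terms x_i * x_j
# whose sign products alternate across the rows are cancelled on average, which
# mirrors how averaging short subsequences removes many multivariate quadratic
# error terms.
sign_rows = np.array([[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]])

approx = pair_average(loss, mu, sigma, sign_rows)
exact = float(np.sum(a * (mu**2 + sigma**2)))
print(approx, exact)   # the two values agree
```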
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important for forecasting nonstationary processes or data drawn from a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypothesis target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Robust scalable initialization for Bayesian variational inference with multi-modal Laplace approximations [0.0]
Variational mixtures with full-covariance structures suffer from quadratic growth in the number of variational parameters as the number of model parameters increases.
We propose a method for constructing an initial Gaussian model approximation that can be used to warm-start variational inference.
arXiv Detail & Related papers (2023-07-12T19:30:04Z)
- Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z)
- Distributed Estimation and Inference for Semi-parametric Binary Response Models [8.309294338998539]
This paper studies the maximum score estimator of a semi-parametric binary choice model under a distributed computing environment.
An intuitive divide-and-conquer estimator is computationally expensive and restricted by a non-regular constraint on the number of machines.
arXiv Detail & Related papers (2022-10-15T23:06:46Z)
- Communication-Efficient Distributed Quantile Regression with Optimal Statistical Guarantees [2.064612766965483]
We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions.
The difficulties are resolved through a double-smoothing approach that is applied to the local (at each data source) and global objective functions.
Despite the reliance on a delicate combination of local and global smoothing parameters, the quantile regression model is fully parametric.
arXiv Detail & Related papers (2021-10-25T17:09:59Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Last iterate convergence of SGD for Least-Squares in the Interpolation regime [19.05750582096579]
We study the noiseless model in the fundamental least-squares setup.
We assume that an optimal predictor perfectly fits the inputs and outputs, $\langle \theta_*, \phi(X) \rangle = Y$, where $\phi(X)$ stands for a possibly infinite-dimensional non-linear feature map.
arXiv Detail & Related papers (2021-02-05T14:02:20Z)
- Reducing the Amortization Gap in Variational Autoencoders: A Bayesian Random Function Approach [38.45568741734893]
Inference in our GP model is done by a single feed forward pass through the network, significantly faster than semi-amortized methods.
We show that our approach attains higher test-data likelihood than the state of the art on several benchmark datasets.
arXiv Detail & Related papers (2021-02-05T13:01:12Z)
- Optimal Change-Point Detection with Training Sequences in the Large and Moderate Deviations Regimes [72.68201611113673]
This paper investigates a novel offline change-point detection problem from an information-theoretic perspective.
We assume that the underlying pre- and post-change distributions are not known and can only be learned from the available training sequences.
arXiv Detail & Related papers (2020-03-13T23:39:40Z)