Bayesian ODE Solvers: The Maximum A Posteriori Estimate
- URL: http://arxiv.org/abs/2004.00623v2
- Date: Tue, 12 Jan 2021 16:12:21 GMT
- Title: Bayesian ODE Solvers: The Maximum A Posteriori Estimate
- Authors: Filip Tronarp, Simo Särkkä, Philipp Hennig
- Abstract summary: It has been established that the numerical solution of ordinary differential equations can be posed as a nonlinear Bayesian inference problem.
The maximum a posteriori estimate corresponds to an optimal interpolant in the Hilbert space associated with the prior.
The methodology developed provides a novel and more natural approach to study the convergence of these estimators.
- Score: 30.767328732475956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has recently been established that the numerical solution of ordinary
differential equations can be posed as a nonlinear Bayesian inference problem,
which can be approximately solved via Gaussian filtering and smoothing,
whenever a Gauss--Markov prior is used. In this paper the class of $\nu$ times
differentiable linear time invariant Gauss--Markov priors is considered. A
taxonomy of Gaussian estimators is established, with the maximum a posteriori
estimate at the top of the hierarchy, which can be computed with the iterated
extended Kalman smoother. The remaining three classes are termed explicit,
semi-implicit, and implicit, in analogy with the classical notions; they
correspond to conditions on the vector field under which the filter update
produces a local maximum a posteriori estimate. The maximum a posteriori
estimate corresponds to an optimal interpolant in the reproducing kernel
Hilbert space associated with the prior, which in the present case is
equivalent to a Sobolev space of smoothness $\nu+1$. Consequently, using
methods from scattered data
approximation and nonlinear analysis in Sobolev spaces, it is shown that the
maximum a posteriori estimate converges to the true solution at a polynomial
rate in the fill-distance (maximum step size) subject to mild conditions on the
vector field. The methodology developed provides a novel and more natural
approach to study the convergence of these estimators than classical methods of
convergence analysis. The methods and theoretical results are demonstrated in
numerical examples.
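
As context for the filtering formulation summarized above, the sketch below shows a minimal extended Kalman ODE filter with a once-integrated Wiener process prior, i.e. the $\nu = 1$ member of the prior class considered in the paper. It is an illustrative sketch rather than the authors' implementation: the MAP estimate discussed in the abstract additionally requires iterated extended Kalman smoothing passes, and the logistic test problem, step size, and function names here are assumptions.

```python
import numpy as np

def ekf_ode_filter(f, df, y0, dy0, t_span, h, sigma=1.0, R=0.0):
    """Extended Kalman ODE filter with a once-integrated Wiener process prior.

    State x = (y, y'); the zero "observation" 0 = x[1] - f(x[0]) is
    linearized at the predicted mean (an EKF1-style update).
    """
    t0, t1 = t_span
    n = int(np.round((t1 - t0) / h))

    # Transition and process-noise matrices of the IWP(1) prior for step h.
    A = np.array([[1.0, h],
                  [0.0, 1.0]])
    Q = sigma**2 * np.array([[h**3 / 3, h**2 / 2],
                             [h**2 / 2, h       ]])

    m = np.array([y0, dy0], dtype=float)   # filter mean
    P = np.zeros((2, 2))                   # filter covariance (exact initial values)
    ts, means = [t0], [m.copy()]

    for k in range(n):
        # Prediction under the Gauss--Markov prior.
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q

        # Pseudo-observation: residual of the ODE at the predicted mean.
        z = m_pred[1] - f(m_pred[0])
        H = np.array([-df(m_pred[0]), 1.0])   # Jacobian of the residual

        # Kalman update conditioning on the residual being zero.
        S = H @ P_pred @ H + R
        K = P_pred @ H / S
        m = m_pred - K * z
        P = P_pred - np.outer(K, H @ P_pred)

        ts.append(t0 + (k + 1) * h)
        means.append(m.copy())

    return np.array(ts), np.array(means)

# Example (assumed): logistic ODE y' = y (1 - y) with y(0) = 0.1.
f  = lambda y: y * (1.0 - y)
df = lambda y: 1.0 - 2.0 * y
ts, ms = ekf_ode_filter(f, df, y0=0.1, dy0=f(0.1), t_span=(0.0, 10.0), h=0.1)
print(ms[-1, 0])   # filter mean of y(10), close to the true value near 1
```

A backward (smoothing) pass over this filtering output, iterated to convergence, is what yields the MAP estimate studied in the paper.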
Related papers
- Riemannian Laplace Approximation with the Fisher Metric [5.982697037000189]
Laplace's method approximates a target density with a Gaussian distribution at its mode.
For complex targets and finite-data posteriors it is often too crude an approximation.
We develop two alternative variants that are exact in the limit of infinite data.
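
For reference, the baseline this entry builds on is the classical (Euclidean) Laplace approximation: fit a Gaussian at the mode of the target, with covariance equal to the inverse Hessian of the negative log-density there. The sketch below illustrates only that baseline, not the Riemannian, Fisher-metric variants proposed in the paper; the toy target, optimizer, and finite-difference step are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def laplace_approximation(neg_log_density, x0, eps=1e-4):
    """Gaussian approximation N(mode, H^{-1}), where H is the Hessian of the
    negative log-density at the mode (found numerically here)."""
    mode = minimize(neg_log_density, x0, method="BFGS").x
    d = mode.size
    H = np.zeros((d, d))
    for i in range(d):                      # forward-difference Hessian
        for j in range(d):
            e_i, e_j = np.eye(d)[i] * eps, np.eye(d)[j] * eps
            H[i, j] = (neg_log_density(mode + e_i + e_j)
                       - neg_log_density(mode + e_i)
                       - neg_log_density(mode + e_j)
                       + neg_log_density(mode)) / eps**2
    return mode, np.linalg.inv(H)

# Example (assumed): a mildly non-Gaussian 1-D target.
neg_log_p = lambda x: 0.5 * x[0]**2 + 0.1 * np.sin(x[0])
mode, cov = laplace_approximation(neg_log_p, x0=np.array([1.0]))
print(mode, cov)
```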
arXiv Detail & Related papers (2023-11-05T20:51:03Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
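
The summary above describes replacing explicit matrix inversions with Monte Carlo probes and iterative linear solvers. The sketch below shows that general pattern for a quantity of the kind that appears in maximum-likelihood gradients for latent Gaussian models, trace(K^{-1} dK), estimated with Hutchinson probes and conjugate gradients; it is a generic illustration under assumed matrices, not the authors' probabilistic unrolling algorithm.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def hutchinson_trace_inv(K_mv, dK_mv, dim, n_probes=32, rng=None):
    """Estimate trace(K^{-1} dK) without forming K^{-1}: average z^T K^{-1} dK z
    over random Rademacher probes z, applying K^{-1} via conjugate gradients."""
    rng = np.random.default_rng() if rng is None else rng
    K_op = LinearOperator((dim, dim), matvec=K_mv)
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=dim)     # Rademacher probe
        w, _ = cg(K_op, dK_mv(z))                 # w = K^{-1} (dK z), no explicit inverse
        total += z @ w
    return total / n_probes

# Example (assumed): K = A A^T + I and dK = I, so the target is trace(K^{-1}).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) / np.sqrt(50)
K = A @ A.T + np.eye(50)
est = hutchinson_trace_inv(lambda v: K @ v, lambda v: v, dim=50, rng=rng)
print(est, np.trace(np.linalg.inv(K)))            # stochastic estimate vs. exact value
```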
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
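
For context, the sampler whose dimension dependence is analyzed in this entry is Langevin Monte Carlo. Below is a minimal sketch of the unadjusted Langevin algorithm it refers to; the Gaussian test target and step size are assumptions, and none of the paper's analysis is reproduced.

```python
import numpy as np

def unadjusted_langevin(grad_potential, x0, step, n_steps, rng=None):
    """Unadjusted Langevin algorithm (ULA):
    x_{k+1} = x_k - step * grad U(x_k) + sqrt(2 * step) * xi_k,  xi_k ~ N(0, I),
    which approximately targets the density proportional to exp(-U(x))."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x - step * grad_potential(x) + np.sqrt(2 * step) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Example (assumed): standard 2-D Gaussian target, U(x) = ||x||^2 / 2.
samples = unadjusted_langevin(lambda x: x, x0=np.zeros(2), step=0.05, n_steps=5000)
print(samples.mean(axis=0), samples.var(axis=0))   # near 0; variance carries a small discretization bias
```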
arXiv Detail & Related papers (2021-09-08T18:00:05Z)
- Connecting Hamilton--Jacobi partial differential equations with maximum a posteriori and posterior mean estimators for some non-convex priors [0.0]
In this chapter, we consider a certain class of non-log-concave regularizations and show that similar representation formulas for the minimizer can also be obtained.
We also present similar results for certain Bayesian posterior mean estimators with Gaussian data fidelity and certain non-log-concave priors using an analogue of min-plus algebra techniques.
arXiv Detail & Related papers (2021-04-22T19:00:37Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
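
For context on the application mentioned above, temporal difference learning with linear function approximation is a stochastic scheme for solving exactly such a projected fixed-point (Bellman) equation. The sketch below is a generic TD(0) loop; the environment interface, feature map, and random-walk example are assumptions for illustration, not the paper's setting.

```python
import numpy as np

def td0_linear(env_step, features, theta0, alpha, gamma, n_steps, s0):
    """TD(0) with linear value approximation V(s) = features(s) @ theta;
    stochastically approximates the projected Bellman fixed point."""
    theta, s = np.array(theta0, dtype=float), s0
    for _ in range(n_steps):
        s_next, reward, done = env_step(s)          # one transition under a fixed policy
        phi = features(s)
        target = reward + (0.0 if done else gamma * features(s_next) @ theta)
        theta += alpha * (target - phi @ theta) * phi   # semi-gradient TD update
        s = s0 if done else s_next
    return theta

# Example (assumed): 5-state random walk with reward 1 at the right terminal state.
rng = np.random.default_rng(0)
def env_step(s):                                    # non-terminal states 0..4
    s_next = s + (1 if rng.random() < 0.5 else -1)
    if s_next == 5:
        return s_next, 1.0, True
    return s_next, 0.0, s_next == -1
features = lambda s: np.eye(5)[s] if 0 <= s <= 4 else np.zeros(5)
theta = td0_linear(env_step, features, np.zeros(5), alpha=0.05, gamma=1.0,
                   n_steps=20000, s0=2)
print(theta)   # roughly [1/6, 2/6, 3/6, 4/6, 5/6], the true state values
```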
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- Pathwise Conditioning of Gaussian Processes [72.61885354624604]
Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations.
This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector.
We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors.
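
The pathwise interpretation referenced here rests on Matheron's rule: a posterior sample equals a prior sample plus a data-dependent correction. Below is a minimal sketch of that rule using an exactly sampled prior, an assumed RBF kernel, and toy data; the paper's contribution concerns efficient approximations of the prior term, which this sketch does not implement.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def pathwise_posterior_sample(x_train, y_train, x_test, noise=1e-2, rng=None):
    """Matheron's rule: (f | y)(x*) = f(x*) + k(x*, X) (K + noise I)^{-1} (y - f(X) - eps),
    where f is a joint prior draw and eps is a draw of the observation noise."""
    rng = np.random.default_rng() if rng is None else rng
    x_all = np.concatenate([x_train, x_test])
    K_all = rbf(x_all, x_all) + 1e-8 * np.eye(x_all.size)       # jitter for stability
    f_all = np.linalg.cholesky(K_all) @ rng.standard_normal(x_all.size)
    f_train, f_test = f_all[:x_train.size], f_all[x_train.size:]

    eps = np.sqrt(noise) * rng.standard_normal(x_train.size)
    K = rbf(x_train, x_train) + noise * np.eye(x_train.size)
    update = rbf(x_test, x_train) @ np.linalg.solve(K, y_train - (f_train + eps))
    return f_test + update                                       # one posterior sample at x_test

# Example (assumed): three noisy observations of sin(x).
x_train, x_test = np.array([-2.2, 0.4, 1.7]), np.linspace(-3.0, 3.0, 7)
print(pathwise_posterior_sample(x_train, np.sin(x_train), x_test))
```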
arXiv Detail & Related papers (2020-11-08T17:09:37Z) - Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic
Bounds and Applications [0.6445605125467573]
Gradient estimation is of crucial importance in statistics and learning theory.
We consider here the classic regression setup, where a real-valued square-integrable random variable $Y$ is to be predicted.
We prove nonasymptotic bounds improving upon those obtained for alternative estimation methods.
arXiv Detail & Related papers (2020-06-26T15:19:43Z) - Mean-Field Approximation to Gaussian-Softmax Integral with Application
to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
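
For context, the entry above concerns approximating an analytically intractable Gaussian integral of a softmax. The sketch below shows the flavor of such mean-field corrections using the classical probit-style approximation for the binary (sigmoid) case, E[sigmoid(z)] ≈ sigmoid(mu / sqrt(1 + pi * var / 8)); it is not the multi-class formula derived in the paper, and the test values are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_sigmoid_gaussian(mu, var):
    """Closed-form approximation to E[sigmoid(z)] for z ~ N(mu, var),
    obtained by matching the sigmoid with a probit and integrating exactly."""
    return sigmoid(mu / np.sqrt(1.0 + np.pi * var / 8.0))

# Compare with a Monte Carlo estimate of the same integral.
rng = np.random.default_rng(0)
mu, var = 1.2, 2.0
mc = sigmoid(mu + np.sqrt(var) * rng.standard_normal(100_000)).mean()
print(mean_field_sigmoid_gaussian(mu, var), mc)   # the two should agree closely
```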
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.