Variational excess risk bound for general state space models
- URL: http://arxiv.org/abs/2312.09607v1
- Date: Fri, 15 Dec 2023 08:41:07 GMT
- Title: Variational excess risk bound for general state space models
- Authors: Élisabeth Gassiat (LM-Orsay), Sylvain Le Corff (SU, LPSM (UMR_8001))
- Abstract summary: We consider variational autoencoders (VAE) for general state space models.
We consider a backward factorization of the variational distributions to analyze the excess risk associated with VAE.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider variational autoencoders (VAE) for general state
space models. We consider a backward factorization of the variational
distributions to analyze the excess risk associated with VAE. Such backward
factorizations were recently proposed to perform online variational learning
and to obtain upper bounds on the variational estimation error. When
independent trajectories of sequences are observed and under strong mixing
assumptions on the state space model and on the variational distribution, we
provide an oracle inequality explicit in the number of samples and in the
length of the observation sequences. We then derive consequences of this
theoretical result. In particular, when the data distribution is given by a
state space model, we provide an upper bound for the Kullback-Leibler
divergence between the data distribution and its estimator and between the
variational posterior and the estimated state space posterior
distributions. Under classical assumptions, we prove that our results can be
applied to Gaussian backward kernels built with dense and recurrent neural
networks.
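As a rough illustration of the backward factorization the abstract refers to, the variational distribution over a latent trajectory $x_{0:T}$ can be written as a terminal marginal times a product of backward kernels. The notation below ($q_{\phi}$, $\mu_\phi$, $\Sigma_\phi$) is assumed for this sketch and is not taken from the paper; the dependence on the observations is left implicit, and the Gaussian form is one possible instance, with mean and covariance produced by a neural network.

```latex
% Backward factorization of the variational distribution (assumed notation)
q_{\phi,0:T}(x_{0:T}) = q_{\phi,T}(x_T) \prod_{t=0}^{T-1} q_{\phi,t|t+1}(x_{t+1}, x_t),
\qquad
% One possible Gaussian backward kernel, parameterized by a network with weights \phi
q_{\phi,t|t+1}(x_{t+1}, \cdot) = \mathcal{N}\big(\mu_\phi(x_{t+1}),\, \Sigma_\phi(x_{t+1})\big)
```

Sampling under this form would proceed backward in time: draw $x_T$ from the terminal marginal, then draw each $x_t$ given $x_{t+1}$ for $t = T-1, \dots, 0$.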
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Generalized Laplace Approximation [23.185126261153236]
We introduce a unified theoretical framework to attribute Bayesian inconsistency to model misspecification and inadequate priors.
We propose the generalized Laplace approximation, which involves a simple adjustment to the Hessian matrix of the regularized loss function.
We assess the performance and properties of the generalized Laplace approximation on state-of-the-art neural networks and real-world datasets.
arXiv Detail & Related papers (2024-05-22T11:11:42Z) - Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z) - Asymptotics of Bayesian Uncertainty Estimation in Random Features
Regression [1.170951597793276]
We focus on the variance of the posterior predictive distribution (Bayesian model average) and compare it to that of the risk of the MAP estimator.
They also agree with each other when the number of samples grows faster than any constant multiple of model dimensions.
arXiv Detail & Related papers (2023-06-06T15:36:15Z) - Reliable amortized variational inference with physics-based latent
distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z) - Excess risk analysis for epistemic uncertainty with application to
variational inference [110.4676591819618]
We present a novel EU analysis in the frequentist setting, where data is generated from an unknown distribution.
We show a relation between the generalization ability and the widely used EU measurements, such as the variance and entropy of the predictive distribution.
We propose new variational inference that directly controls the prediction and EU evaluation performances based on the PAC-Bayesian theory.
arXiv Detail & Related papers (2022-06-02T12:12:24Z) - Amortized backward variational inference in nonlinear state-space models [0.0]
We consider the problem of state estimation in general state-space models using variational inference.
We establish for the first time that, under mixing assumptions, the variational approximation of expectations of additive state functionals induces an error which grows at most linearly in the number of observations.
arXiv Detail & Related papers (2022-06-01T08:35:54Z) - Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z) - Discrete Variational Attention Models for Language Generation [51.88612022940496]
We propose a discrete variational attention model with a categorical distribution over the attention mechanism, owing to the discrete nature of languages.
Thanks to the property of discreteness, the training of our proposed approach does not suffer from posterior collapse.
arXiv Detail & Related papers (2020-04-21T05:49:04Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.