On the Impact of Sampling on Deep Sequential State Estimation
- URL: http://arxiv.org/abs/2311.17006v1
- Date: Tue, 28 Nov 2023 17:59:49 GMT
- Title: On the Impact of Sampling on Deep Sequential State Estimation
- Authors: Helena Calatrava and Ricardo Augusto Borsoi and Tales Imbiriba and Pau Closas
- Abstract summary: State inference and parameter learning in sequential models can be successfully performed with approximation techniques.
Tighter Monte Carlo objectives have been proposed in the literature to enhance generative modeling performance.
- Score: 17.92198582435315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State inference and parameter learning in sequential models can be
successfully performed with approximation techniques that maximize the evidence
lower bound to the marginal log-likelihood of the data distribution. These
methods may be referred to as Dynamical Variational Autoencoders, and our
specific focus lies on the deep Kalman filter. It has been shown that the ELBO
objective can oversimplify data representations, potentially compromising
estimation quality. Tighter Monte Carlo objectives have been proposed in the
literature to enhance generative modeling performance. For instance, the IWAE
objective uses importance weights to reduce the variance of marginal
log-likelihood estimates. In this paper, importance sampling is applied to the
DKF framework for learning deep Markov models, resulting in the IW-DKF, which
shows an improvement in terms of log-likelihood estimates and KL divergence
between the variational distribution and the transition model. The framework
using the sampled DKF update rule is also adapted to address sequential state
and parameter estimation when working with highly non-linear
physics-based models. An experiment with the 3-space Lorenz attractor shows an
enhanced generative modeling performance and also a decrease in RMSE when
estimating the model parameters and latent states, indicating that tighter MCOs
lead to improved state inference performance.
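The effect of the importance-weighted objective can be illustrated on a toy problem. The sketch below uses a hypothetical one-dimensional linear-Gaussian model (not the paper's DKF) to contrast the standard ELBO (K = 1) with the K-sample IWAE bound, whose log-mean-exp of importance weights tightens toward the true marginal log-likelihood as K grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (illustrative stand-in for one DKF step):
#   prior  z ~ N(0, 1),  likelihood x | z ~ N(z, 1)
# so the exact marginal is x ~ N(0, 2).
x = 1.5

def log_gauss(v, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

def iwae_bound(x, k, n_trials=20000):
    # Proposal q(z | x) = prior (deliberately loose, to leave a visible gap).
    z = rng.standard_normal((n_trials, k))
    # log importance weight: log p(x, z) - log q(z) = log p(x | z) here
    log_w = log_gauss(x, z, 1.0)
    # log-mean-exp over the K importance samples (the IWAE objective),
    # averaged over independent trials
    return np.mean(np.logaddexp.reduce(log_w, axis=1) - np.log(k))

true_ll = log_gauss(x, 0.0, 2.0)
elbo = iwae_bound(x, k=1)   # K = 1 recovers the standard ELBO
iwae = iwae_bound(x, k=50)  # larger K gives a tighter bound
```

With this loose proposal, `elbo < iwae < true_ll`: the K = 50 bound nearly closes the gap to the exact marginal log-likelihood, mirroring the paper's motivation for tighter Monte Carlo objectives.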
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach [11.878350833222711]
We propose a method called GradSamp for sampling gradient updates from a Gaussian distribution.
GradSamp not only streamlines gradient computation but also enables skipping entire epochs, thereby enhancing overall efficiency.
We rigorously validate our hypothesis across a diverse set of standard and non-standard CNN and transformer-based models.
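The mechanism described can be sketched on a toy objective (a minimal illustration, not the authors' implementation): fit a running Gaussian to recent gradients and, on "skipped" steps, draw the update from that Gaussian instead of computing it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective: f(w) = 0.5 * ||w||^2, so the gradient is simply w.
def grad(w):
    return w

w = np.array([4.0, -3.0])
lr = 0.1
mu, var = np.zeros_like(w), np.ones_like(w)  # running Gaussian fit to gradients

for step in range(100):
    if step % 5 == 4:
        # "Skipped" step: draw the update from the fitted Gaussian
        # instead of evaluating the gradient.
        g = rng.normal(mu, np.sqrt(var))
    else:
        g = grad(w)
        # Exponential moving estimates of the gradient mean and variance
        mu = 0.9 * mu + 0.1 * g
        var = 0.9 * var + 0.1 * (g - mu) ** 2
    w = w - lr * g
```

Every fifth update here is sampled rather than computed, yet the iterate still converges toward the minimum, which is the efficiency trade-off the summary describes.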
arXiv Detail & Related papers (2024-06-11T15:01:20Z) - Reliable Trajectory Prediction and Uncertainty Quantification with Conditioned Diffusion Models [11.308331231957588]
This work introduces the conditioned Vehicle Motion Diffusion (cVMD) model, a novel network architecture for highway trajectory prediction using diffusion models.
Central to the architecture of cVMD is its capacity to perform uncertainty quantification, a feature that is crucial in safety-critical applications.
Experiments show that the proposed architecture achieves competitive trajectory prediction accuracy compared to state-of-the-art models.
arXiv Detail & Related papers (2024-05-23T10:01:39Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on-the-fly.
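For context, the sequential Monte Carlo machinery that VSMC builds on can be sketched as a bootstrap particle filter; the linear-Gaussian state-space model and all settings below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy state-space model:
#   z_t = 0.9 z_{t-1} + N(0, 0.1),   x_t = z_t + N(0, 0.1)
T, N = 50, 500
z = np.zeros(T)
x = np.zeros(T)
for t in range(1, T):
    z[t] = 0.9 * z[t - 1] + rng.normal(0, np.sqrt(0.1))
    x[t] = z[t] + rng.normal(0, np.sqrt(0.1))

# Bootstrap particle filter: propose from the transition, weight by likelihood
particles = rng.normal(0, 1, N)
means = np.zeros(T)
for t in range(T):
    particles = 0.9 * particles + rng.normal(0, np.sqrt(0.1), N)
    logw = -0.5 * (x[t] - particles) ** 2 / 0.1
    w = np.exp(logw - logw.max())   # stabilized importance weights
    w /= w.sum()
    means[t] = np.sum(w * particles)
    # Multinomial resampling to avoid weight degeneracy
    particles = particles[rng.choice(N, size=N, p=w)]

rmse = np.sqrt(np.mean((means - z) ** 2))
```

VSMC replaces the fixed transition proposal above with a learned variational proposal and optimizes it through the filter's likelihood estimate; the online variant does so on-the-fly as observations arrive.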
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Perimeter Control Using Deep Reinforcement Learning: A Model-free
Approach towards Homogeneous Flow Rate Optimization [28.851432612392436]
Perimeter control maintains high traffic efficiency within protected regions by controlling transfer flows among regions to ensure that their traffic densities are below critical values.
Existing approaches can be categorized as either model-based or model-free, depending on whether they rely on network transmission models (NTMs) and macroscopic fundamental diagrams (MFDs).
arXiv Detail & Related papers (2023-05-29T21:22:08Z) - Diffusion Causal Models for Counterfactual Estimation [18.438307666925425]
We consider the task of counterfactual estimation from observational imaging data given a known causal structure.
We propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models.
We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data.
arXiv Detail & Related papers (2022-02-21T12:23:01Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential
Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
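A common way to make a Bernoulli dropout rate differentiable, as such schemes require, is a Concrete (relaxed Bernoulli) mask; the sketch below is a generic relaxation, not necessarily the authors' exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relaxed_bernoulli_mask(keep_prob, shape, temperature=0.1):
    # Concrete relaxation of a Bernoulli(keep_prob) mask: the sample is a
    # smooth function of keep_prob, so the dropout rate can be optimized
    # jointly with the other model parameters by gradient descent.
    u = rng.uniform(1e-6, 1 - 1e-6, size=shape)
    logits = np.log(keep_prob / (1 - keep_prob)) + np.log(u / (1 - u))
    return sigmoid(logits / temperature)

mask = relaxed_bernoulli_mask(0.8, shape=(100000,))
```

At a low temperature the mask values concentrate near 0 and 1 and their mean approaches the keep probability, so the relaxation behaves like ordinary dropout while remaining trainable.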
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.