Interpretable Latent Variables in Deep State Space Models
- URL: http://arxiv.org/abs/2203.02057v1
- Date: Thu, 3 Mar 2022 23:10:58 GMT
- Title: Interpretable Latent Variables in Deep State Space Models
- Authors: Haoxuan Wu, David S. Matteson and Martin T. Wells
- Abstract summary: We introduce a new version of deep state-space models (DSSMs) that combines a recurrent neural network with a state-space framework to forecast time series data.
The model estimates the observed series as functions of latent variables that evolve non-linearly through time.
- Score: 4.884336328409872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new version of deep state-space models (DSSMs) that combines a
recurrent neural network with a state-space framework to forecast time series
data. The model estimates the observed series as functions of latent variables
that evolve non-linearly through time. Due to the complexity and non-linearity
inherent in DSSMs, prior work has typically produced latent variables
that are very difficult to interpret. Our paper focuses on producing
interpretable latent parameters with two key modifications. First, we simplify
the predictive decoder by restricting the response variables to be a linear
transformation of the latent variables plus some noise. Second, we utilize
shrinkage priors on the latent variables to reduce redundancy and improve
robustness. These changes make the latent variables much easier to understand
and allow us to interpret the resulting latent variables as random effects in a
linear mixed model. On two public benchmark datasets, we show that the resulting
model improves forecasting performance.
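A minimal sketch of the structure described above (the symbols z_t, y_t, f_\theta, B, \eta_t, \epsilon_t are illustrative placeholders, not notation taken from the paper):

  z_t = f_\theta(z_{t-1}) + \eta_t                          (latent variables evolving non-linearly through time, parameterised by the recurrent neural network)
  y_t = B z_t + \epsilon_t,   \epsilon_t ~ N(0, \sigma^2 I)  (simplified predictive decoder: the response is a linear transformation of the latent variables plus noise)

with a shrinkage prior placed on the latent variables z_t to pull redundant components toward zero. Conditional on z_t, the observation equation has the form of a linear mixed model in which z_t plays the role of random effects, which is the interpretation the authors draw; the abstract does not specify the particular shrinkage prior family.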
Related papers
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - A Non-negative VAE: the Generalized Gamma Belief Network [49.970917207211556]
The gamma belief network (GBN) has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data.
We introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model.
We also propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables.
arXiv Detail & Related papers (2024-08-06T18:18:37Z) - Recurrence Boosts Diversity! Revisiting Recurrent Latent Variable in
Transformer-Based Variational AutoEncoder for Diverse Text Generation [85.5379146125199]
Variational Auto-Encoder (VAE) has been widely adopted in text generation.
We propose TRACE, a Transformer-based recurrent VAE structure.
arXiv Detail & Related papers (2022-10-22T10:25:35Z) - Learning and Inference in Sparse Coding Models with Langevin Dynamics [3.0600309122672726]
We describe a system capable of inference and learning in a probabilistic latent variable model.
We demonstrate this idea for a sparse coding model by deriving a continuous-time equation for inferring its latent variables via Langevin dynamics.
We show that Langevin dynamics lead to an efficient procedure for sampling from the posterior distribution in the 'L0 sparse' regime, where latent variables are encouraged to be set to zero as opposed to having a small L1 norm.
arXiv Detail & Related papers (2022-04-23T23:16:47Z) - Double Control Variates for Gradient Estimation in Discrete Latent
Variable Models [32.33171301923846]
We introduce a variance reduction technique for score function estimators.
We show that our estimator can have lower variance compared to other state-of-the-art estimators.
arXiv Detail & Related papers (2021-11-09T18:02:42Z) - Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
They are troubled by two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z) - Deep Switching State Space Model (DS$^3$M) for Nonlinear Time Series
Forecasting with Regime Switching [3.3970049571884204]
We propose a deep switching state space model (DS$^3$M) for efficient inference and forecasting of nonlinear time series.
The switching among regimes is captured by both discrete and continuous latent variables with recurrent neural networks.
arXiv Detail & Related papers (2021-06-04T08:25:47Z) - Bayesian neural networks and dimensionality reduction [4.039245878626346]
A class of model-based approaches to non-linear dimensionality reduction includes latent variables in an unknown non-linear regression function.
VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable.
We deploy Markov chain Monte Carlo sampling algorithms for Bayesian inference in ANN models with latent variables.
arXiv Detail & Related papers (2020-08-18T17:11:07Z) - Relaxed-Responsibility Hierarchical Discrete VAEs [3.976291254896486]
We introduce Relaxed-Responsibility Vector-Quantisation, a novel way to parameterise discrete latent variables.
We achieve state-of-the-art bits-per-dim results for various standard datasets.
arXiv Detail & Related papers (2020-07-14T19:10:05Z) - Improve Variational Autoencoder for Text Generation with Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs tend to ignore latent variables with a strong auto-regressive decoder.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z) - Variational Hyper RNN for Sequence Modeling [69.0659591456772]
We propose a novel probabilistic sequence model that excels at capturing high variability in time series data.
Our method uses temporal latent variables to capture information about the underlying data pattern.
The efficacy of the proposed method is demonstrated on a range of synthetic and real-world sequential data.
arXiv Detail & Related papers (2020-02-24T19:30:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.