Last layer state space model for representation learning and uncertainty
quantification
- URL: http://arxiv.org/abs/2307.01566v1
- Date: Tue, 4 Jul 2023 08:37:37 GMT
- Title: Last layer state space model for representation learning and uncertainty
quantification
- Authors: Max Cohen (TSP), Maurice Charbit, Sylvain Le Corff (TSP)
- Abstract summary: We propose to decompose a classification or regression task into two steps: a representation learning stage to learn low-dimensional states, and a state space model for uncertainty estimation.
We demonstrate how predictive distributions can be estimated on top of an existing and trained neural network, by adding a state space-based last layer.
Our model accounts for the noisy data structure, due to unknown or unavailable variables, and is able to provide confidence intervals on predictions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As sequential neural architectures become deeper and more complex,
uncertainty estimation becomes increasingly challenging. Efforts to quantify
uncertainty often rely on specific training procedures and bear additional
computational costs due to the dimensionality of such models. In this paper, we
propose to decompose a classification or regression task into two steps: a
representation learning stage to learn low-dimensional states, and a state
space model for uncertainty estimation. This approach separates
representation learning from the design of generative models. We demonstrate how
predictive distributions can be estimated on top of an existing and trained
neural network, by adding a state space-based last layer whose parameters are
estimated with Sequential Monte Carlo methods. We apply our proposed
methodology to the hourly estimation of Electricity Transformer Oil
temperature, a publicly benchmarked dataset. Our model accounts for the noisy
data structure, due to unknown or unavailable variables, and is able to provide
confidence intervals on predictions.
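To make the approach concrete, here is a minimal sketch of a bootstrap particle filter, the basic Sequential Monte Carlo method such a last layer builds on. This is not the authors' implementation: the random-walk dynamics, noise levels, and function names are assumptions for illustration, and the interval is a crude quantile over propagated particles.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_particle_filter(observations, n_particles=500,
                              trans_std=0.1, obs_std=0.2):
    """Bootstrap particle filter for a toy random-walk latent state.

    Illustrative model: x_t = x_{t-1} + N(0, trans_std^2),
                        y_t = x_t + N(0, obs_std^2).
    Returns per-step predictive means and crude 90% intervals.
    """
    particles = rng.normal(0.0, 1.0, n_particles)
    means, lowers, uppers = [], [], []
    for y in observations:
        # Propagate particles through the transition model.
        particles = particles + rng.normal(0.0, trans_std, n_particles)
        # Weight by the Gaussian observation likelihood (log-space for stability).
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Weighted predictive mean; crude interval from the (unweighted)
        # propagated particles.
        means.append(float(np.sum(w * particles)))
        lo, hi = np.quantile(particles, [0.05, 0.95])
        lowers.append(float(lo))
        uppers.append(float(hi))
        # Multinomial resampling to avoid weight degeneracy.
        particles = rng.choice(particles, size=n_particles, p=w)
    return np.array(means), np.array(lowers), np.array(uppers)

y_obs = np.sin(np.linspace(0, 3, 20)) + rng.normal(0, 0.2, 20)
m, lo, hi = bootstrap_particle_filter(y_obs)
```

In the paper's setting, the latent state would instead be driven by the low-dimensional representation produced by the frozen, pretrained network rather than a plain random walk.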
Related papers
- Towards Learning Stochastic Population Models by Gradient Descent [0.0]
We show that simultaneous estimation of parameters and structure poses major challenges for optimization procedures.
We demonstrate accurate estimation of models but find that enforcing the inference of parsimonious, interpretable models drastically increases the difficulty.
arXiv Detail & Related papers (2024-04-10T14:38:58Z)
- Kalman Filter for Online Classification of Non-Stationary Data [101.26838049872651]
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps.
We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights.
In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
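The scalar Kalman filter update at the core of such a state space model over predictor weights can be sketched in a few lines. This is a generic toy model, not the paper's method: the random-walk dynamics and the `q`, `r` noise values are assumptions.

```python
import numpy as np

def kalman_step(mu, var, y, q=0.01, r=0.1):
    """One predict/update step of a scalar Kalman filter.

    Illustrative model: x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r).
    """
    # Predict: the random walk inflates the variance by q.
    mu_pred, var_pred = mu, var + q
    # Update: blend prediction and observation via the Kalman gain.
    k = var_pred / (var_pred + r)
    mu_new = mu_pred + k * (y - mu_pred)
    var_new = (1.0 - k) * var_pred
    return mu_new, var_new

mu, var = 0.0, 1.0
for y in [0.9, 1.1, 1.0, 0.95]:
    mu, var = kalman_step(mu, var, y)
```

The posterior variance `var` shrinks with each observation, which is what gives the online model calibrated confidence in stationary stretches of the stream.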
arXiv Detail & Related papers (2023-06-14T11:41:42Z)
- Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models [2.7391842773173334]
We develop a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters.
Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model.
arXiv Detail & Related papers (2022-11-23T17:42:53Z) - Fast Estimation of Bayesian State Space Models Using Amortized
Simulation-Based Inference [0.0]
This paper presents a fast algorithm for estimating hidden states of Bayesian state space models.
After pretraining, finding the posterior distribution for any dataset takes from hundredths to tenths of a second.
arXiv Detail & Related papers (2022-10-13T16:37:05Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We present a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
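The Nadaraya-Watson estimator such methods build on can be sketched in a few lines. This toy version does kernel regression on scalar inputs; the Gaussian kernel, bandwidth, and function name are assumptions, and the actual method estimates a full conditional label distribution for classification.

```python
import numpy as np

def nadaraya_watson(x_query, X, y, bandwidth=0.5):
    """Nadaraya-Watson estimate of E[y | x] with a Gaussian kernel.

    Each training point contributes in proportion to its kernel
    similarity to the query; no parametric model is fit.
    """
    w = np.exp(-0.5 * ((x_query - X) / bandwidth) ** 2)
    return float(np.sum(w * y) / np.sum(w))

X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 4.0, 9.0])
est = nadaraya_watson(1.5, X, y)
```

Because the weights also reveal how much training data lies near the query, the same quantities can be reused to flag low-density (high-uncertainty) inputs.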
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- Likelihood-Free Inference in State-Space Models with Unknown Dynamics [71.94716503075645]
We introduce a method for inferring and predicting latent states in state-space models where observations can only be simulated, and transition dynamics are unknown.
We propose a way of doing likelihood-free inference (LFI) of states and state prediction with a limited number of simulations.
arXiv Detail & Related papers (2021-11-02T12:33:42Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We study two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when used in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- Can a single neuron learn predictive uncertainty? [0.0]
We introduce a novel non-parametric quantile estimation method for continuous random variables based on the simplest neural network architecture with one degree of freedom: a single neuron.
In real-world applications, the method can be used to quantify predictive uncertainty under the split conformal prediction setting.
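Split conformal prediction, the setting mentioned above, can be sketched generically as follows. This is not the single-neuron method itself: the function name and the absolute-residual score are assumptions for illustration.

```python
import numpy as np

def split_conformal_interval(residuals_cal, y_pred, alpha=0.1):
    """Split conformal prediction interval around point predictions.

    residuals_cal: absolute residuals |y - y_hat| on a held-out
    calibration set. Under exchangeability, the returned intervals
    have marginal coverage of at least 1 - alpha.
    """
    n = len(residuals_cal)
    # Finite-sample corrected quantile level.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(residuals_cal, level)
    return y_pred - q, y_pred + q

res = np.abs(np.random.default_rng(1).normal(0, 0.3, 100))
lo, hi = split_conformal_interval(res, np.array([1.0, 2.0]))
```

The coverage guarantee is distribution-free, which is why conformal wrappers pair naturally with simple quantile estimators such as the single-neuron model above.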
arXiv Detail & Related papers (2021-06-07T15:12:47Z)
- Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of a series' future values based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)
- Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampling [3.077929914199468]
We show that modifying the sampling distributions for dropout layers in neural networks improves the quality of uncertainty estimation.
Our main idea consists of two main steps: computing data-driven correlations between neurons and generating samples, which include maximally diverse neurons.
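For reference, plain Monte Carlo dropout, the baseline this line of work improves on, looks like the NumPy sketch below: dropout stays active at test time and the spread across stochastic forward passes serves as an uncertainty estimate. The network weights and shapes are made up, and the paper's correlation-based diversity sampler is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

def mc_dropout_predict(x, W1, W2, n_samples=100, p_drop=0.5):
    """Monte Carlo dropout for a tiny one-hidden-layer ReLU network.

    Averages stochastic forward passes; the per-output standard
    deviation is a simple predictive-uncertainty estimate.
    """
    preds = []
    for _ in range(n_samples):
        h = np.maximum(0.0, W1 @ x)
        # Standard i.i.d. Bernoulli dropout with inverted scaling.
        mask = rng.random(h.shape) > p_drop
        h = h * mask / (1.0 - p_drop)
        preds.append(W2 @ h)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

W1 = rng.normal(size=(8, 3))
W2 = rng.normal(size=(1, 8))
x = np.array([0.5, -0.2, 1.0])
mean, std = mc_dropout_predict(x, W1, W2)
```

The cited paper replaces the i.i.d. Bernoulli masks with data-driven, maximally diverse neuron samples, which is the part this baseline sketch deliberately omits.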
arXiv Detail & Related papers (2020-03-06T15:20:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.