Exchangeable Sequence Models Can Naturally Quantify Uncertainty Over Latent Concepts
- URL: http://arxiv.org/abs/2408.03307v2
- Date: Mon, 11 Nov 2024 20:23:44 GMT
- Title: Exchangeable Sequence Models Can Naturally Quantify Uncertainty Over Latent Concepts
- Authors: Naimeng Ye, Hanming Yang, Andrew Siah, Hongseok Namkoong
- Abstract summary: We show that pre-trained sequence models are naturally capable of probabilistic reasoning over exchangeable data points.
A sequence model learns the relationship between observations, which differs from typical Bayesian models.
We show the sequence prediction loss controls the quality of uncertainty quantification.
- Score: 5.095571791233068
- License:
- Abstract: Intelligent agents must be able to articulate their own uncertainty. In this work, we show that pre-trained sequence models are naturally capable of probabilistic reasoning over exchangeable data points, forming informed beliefs and sharpening them as they gather more information. A sequence model learns the relationship between observations, which differs from typical Bayesian models that quantify uncertainty over latent parameters through priors and likelihoods (e.g., topic models). Despite the apparent difference, we illustrate how exchangeable sequence modeling provides a valid Bayesian model by going back to De Finetti's classical predictive view of probabilistic reasoning: uncertainty comes from data that has not been observed yet, rather than from latent parameters. From this perspective, pre-training autoregressive models is equivalent to formulating informed beliefs based on prior observations ("empirical Bayes"), and forward generation is equivalent to simulating instantiations of an environment ("posterior inference"). In particular, exchangeable sequence models can explicitly perform statistical inference; epistemic uncertainty over latent environments is captured by variation in predicted future observations. Formally, we show that the sequence prediction loss controls the quality of uncertainty quantification, and we propose several approaches for encoding exchangeability in sequence model architectures: data augmentation, regularization, and causal masking.
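The predictive view described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a Beta-Bernoulli (Polya urn) one-step predictive rule as a stand-in for a trained exchangeable sequence model, and the function names (`predictive_prob`, `simulate_future_mean`, `epistemic_spread`) are hypothetical. Each rollout autoregressively generates future observations, and the spread of the rollout averages serves as a proxy for epistemic uncertainty over the latent success rate.

```python
import random


def predictive_prob(ones, n, a=1.0, b=1.0):
    """One-step-ahead probability that the next observation is 1.

    Assumed stand-in for a sequence model: the Polya urn rule, which
    depends on the history only through counts and is therefore
    exchangeable.
    """
    return (a + ones) / (a + b + n)


def simulate_future_mean(observed, horizon, rng, a=1.0, b=1.0):
    """Generate `horizon` future observations autoregressively and
    return the mean of the generated values only."""
    ones, n = sum(observed), len(observed)
    future_ones = 0
    for _ in range(horizon):
        y = 1 if rng.random() < predictive_prob(ones, n, a, b) else 0
        ones += y  # feed the generated value back in as context
        n += 1
        future_ones += y
    return future_ones / horizon


def epistemic_spread(observed, num_rollouts=500, horizon=500, seed=0):
    """Mean and variance of the long-run average across rollouts.

    The variance across rollouts reflects uncertainty about the latent
    success rate given the observed context.
    """
    rng = random.Random(seed)
    means = [simulate_future_mean(observed, horizon, rng)
             for _ in range(num_rollouts)]
    avg = sum(means) / len(means)
    var = sum((m - avg) ** 2 for m in means) / len(means)
    return avg, var


if __name__ == "__main__":
    few = [1, 0, 1]          # 3 observations
    many = [1, 0, 1] * 100   # 300 observations with the same ratio
    print("few :", epistemic_spread(few))
    print("many:", epistemic_spread(many))
```

With a longer observed context the rollouts agree more closely, so the variance across rollouts shrinks. This mirrors the abstract's statement that beliefs sharpen as more information is gathered. A real sequence model would replace `predictive_prob` with its learned next-observation distribution.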
Related papers
- On the Efficient Marginalization of Probabilistic Sequence Models [3.5897534810405403]
This dissertation focuses on using autoregressive models to answer complex probabilistic queries.
We develop a class of novel and efficient approximation techniques for marginalization in sequential models that are model-agnostic.
arXiv Detail & Related papers (2024-03-06T19:29:08Z)
- Model-agnostic variable importance for predictive uncertainty: an entropy-based approach [1.912429179274357]
We show how existing methods in explainability can be extended to uncertainty-aware models.
We demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance.
arXiv Detail & Related papers (2023-10-19T15:51:23Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Function-Space Regularization for Deep Bayesian Classification [33.63495888167032]
We apply a Dirichlet prior in predictive space and perform approximate function-space variational inference.
By adapting the inference, the same function-space prior can be combined with different models without affecting model architecture or size.
arXiv Detail & Related papers (2023-07-12T10:17:54Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- Transforming Autoregression: Interpretable and Expressive Time Series Forecast [0.0]
We propose Autoregressive Transformation Models (ATMs), a model class inspired by various research directions.
ATMs unite expressive distributional forecasts using a semi-parametric distribution assumption with an interpretable model specification.
We demonstrate the properties of ATMs both theoretically and through empirical evaluation on several simulated and real-world forecasting datasets.
arXiv Detail & Related papers (2021-10-15T17:58:49Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when used in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
- Aleatoric uncertainty for Errors-in-Variables models in deep regression [0.48733623015338234]
We show how the concept of Errors-in-Variables can be used in Bayesian deep regression.
We discuss the approach using various simulated and real examples.
arXiv Detail & Related papers (2021-05-19T12:37:02Z)
- Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of a series' future values based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)