Related papers: Predictive Querying for Autoregressive Neural Sequence Models

URL: http://arxiv.org/abs/2210.06464v2
Date: Thu, 13 Oct 2022 17:35:42 GMT
Title: Predictive Querying for Autoregressive Neural Sequence Models
Authors: Alex Boyd, Sam Showalter, Stephan Mandt, Padhraic Smyth
Abstract summary: We introduce a general typology for predictive queries in neural autoregressive sequence models. We show that such queries can be systematically represented by sets of elementary building blocks. We leverage this typology to develop new query estimation methods.
Score: 23.85426261235507
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In reasoning about sequential events it is natural to pose probabilistic queries such as "when will event A occur next" or "what is the probability of A occurring before B", with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restricted to simple cases such as next-event prediction. This is in part due to the fact that future querying involves marginalization over large path spaces, which is not straightforward to do efficiently in such models. In this paper we introduce a general typology for predictive queries in neural autoregressive sequence models and show that such queries can be systematically represented by sets of elementary building blocks. We leverage this typology to develop new query estimation methods based on beam search, importance sampling, and hybrids. Across four large-scale sequence datasets from different application domains, as well as for the GPT-2 language model, we demonstrate the ability to make query answering tractable for arbitrary queries in exponentially-large predictive path-spaces, and find clear differences in cost-accuracy tradeoffs between search and sampling methods.
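To make the "A before B" query in the abstract concrete, here is a minimal sketch. It is not the paper's method: the paper develops beam search, importance sampling, and hybrid estimators, while this sketch only contrasts exact enumeration over all paths with naive Monte Carlo sampling. The three-symbol vocabulary, the `next_probs` toy model, and every function name below are hypothetical and chosen purely for illustration.

```python
import itertools
import random

VOCAB = ["A", "B", "C"]


def next_probs(history):
    """Hypothetical toy autoregressive model: returns P(next symbol | history).

    For simplicity it only looks at the last symbol; a real neural model
    would condition on the full history.
    """
    last = history[-1] if history else "C"
    table = {
        "A": [0.2, 0.3, 0.5],
        "B": [0.3, 0.2, 0.5],
        "C": [0.1, 0.2, 0.7],
    }
    return table[last]


def a_before_b(path):
    """True if A appears in the path before any B does."""
    for tok in path:
        if tok == "A":
            return True
        if tok == "B":
            return False
    return False


def exact_query(history, horizon):
    """Marginalize over all len(VOCAB)**horizon future paths (exponential cost)."""
    total = 0.0
    for path in itertools.product(VOCAB, repeat=horizon):
        prob = 1.0
        h = list(history)
        for tok in path:
            prob *= next_probs(h)[VOCAB.index(tok)]
            h.append(tok)
        if a_before_b(path):
            total += prob
    return total


def sampled_query(history, horizon, n_samples, seed=0):
    """Naive Monte Carlo estimate: sample futures and count how often A precedes B."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        h = list(history)
        path = []
        for _ in range(horizon):
            tok = rng.choices(VOCAB, weights=next_probs(h))[0]
            h.append(tok)
            path.append(tok)
            if tok in ("A", "B"):
                break  # the query outcome is already decided
        hits += a_before_b(path)
    return hits / n_samples


if __name__ == "__main__":
    print("exact  :", exact_query(["C"], horizon=6))
    print("sampled:", sampled_query(["C"], horizon=6, n_samples=20000))
```

Exact enumeration visits 3^6 = 729 paths here, and that count grows exponentially with the horizon and vocabulary size, which is the tractability problem the abstract describes. Sampling avoids the enumeration at the price of estimator variance.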

Related papers

Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables. We learn the causal Bayesian network and its confounding latent variables directly from the observational data. We show that this model completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
Towards a Path Dependent Account of Category Fluency [2.66269503676104]
We present evidence towards resolving the disagreement between each account of foraging by reformulating models as sequence generators. We find category switch predictors do not necessarily produce human-like sequences; in fact, the additional biases used by the Hills et al. (2012) model are required to improve generation quality.
arXiv Detail & Related papers (2024-05-09T16:36:56Z)
On the Efficient Marginalization of Probabilistic Sequence Models [3.5897534810405403]
This dissertation focuses on using autoregressive models to answer complex probabilistic queries. We develop a class of novel and efficient approximation techniques for marginalization in sequential models that are model-agnostic.
arXiv Detail & Related papers (2024-03-06T19:29:08Z)
Probabilistic Modeling for Sequences of Sets in Continuous-Time [14.423456635520084]
We develop a general framework for modeling set-valued data in continuous-time. We also develop inference methods that can use such models to answer probabilistic queries.
arXiv Detail & Related papers (2023-12-22T20:16:10Z)
Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or processes with a complex mixture of distributions. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems. We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases. We are able to sample conditionally on nonlinear user-defined events at inference time, and match data statistics even when sampling from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
Probabilistic Querying of Continuous-Time Event Sequences [23.85426261235507]
This paper introduces a new typology of query types and a framework for addressing them using importance sampling. Example queries include predicting the $n^\text{th}$ event type in a sequence and the hitting time distribution of one or more event types. We prove theoretically that our estimation method is effectively always better than naive simulation and show empirically based on three real-world datasets that it is on average 1,000 times more efficient than existing approaches.
arXiv Detail & Related papers (2022-11-15T20:58:00Z)
Autoregressive Quantile Flows for Predictive Uncertainty Estimation [7.184701179854522]
We propose Autoregressive Quantile Flows, a flexible class of probabilistic models over high-dimensional variables. These models are instances of autoregressive flows trained using a novel objective based on proper scoring rules.
arXiv Detail & Related papers (2021-12-09T01:11:26Z)
Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events. There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine. We present a formal framework that attempts to address the issue of Complex Event Forecasting.
arXiv Detail & Related papers (2021-09-01T09:52:31Z)
Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees. We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data. We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.