Evaluating High-Order Predictive Distributions in Deep Learning
- URL: http://arxiv.org/abs/2202.13509v1
- Date: Mon, 28 Feb 2022 02:10:00 GMT
- Title: Evaluating High-Order Predictive Distributions in Deep Learning
- Authors: Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla,
Xiuyuan Lu, Benjamin Van Roy
- Abstract summary: Joint predictive distributions are essential for good performance in decision problems.
We introduce \textit{dyadic sampling}, which focuses on predictive distributions associated with random \textit{pairs} of inputs.
We demonstrate that this approach efficiently distinguishes agents in high-dimensional examples involving simple logistic regression as well as complex synthetic and empirical data.
- Score: 27.076321280462057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most work on supervised learning research has focused on marginal
predictions. In decision problems, joint predictive distributions are essential
for good performance. Previous work has developed methods for assessing
low-order predictive distributions with inputs sampled i.i.d. from the testing
distribution. With low-dimensional inputs, these methods distinguish agents
that effectively estimate uncertainty from those that do not. We establish that
the predictive distribution order required for such differentiation increases
greatly with input dimension, rendering these methods impractical. To
accommodate high-dimensional inputs, we introduce \textit{dyadic sampling},
which focuses on predictive distributions associated with random \textit{pairs}
of inputs. We demonstrate that this approach efficiently distinguishes agents
in high-dimensional examples involving simple logistic regression as well as
complex synthetic and empirical data.
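To make the idea concrete, below is a minimal sketch of how a dyadic evaluation loop could look. The agent_sample_probs interface, array shapes, and hyperparameters are illustrative assumptions rather than the paper's reference implementation: the agent is assumed to return one posterior draw of class probabilities per call.

```python
import numpy as np

def dyadic_joint_nll(agent_sample_probs, inputs, labels, tau=10,
                     num_pairs=100, num_posterior_samples=30, seed=0):
    """Estimate the order-tau joint negative log-likelihood of an agent's
    predictions under dyadic sampling (illustrative sketch).

    inputs, labels: numpy arrays of test inputs and integer class labels.
    agent_sample_probs(x_batch): returns one posterior draw of class
    probabilities with shape (tau, num_classes); repeated calls should
    return different draws.
    """
    rng = np.random.default_rng(seed)
    n, total = len(inputs), 0.0
    for _ in range(num_pairs):
        # Anchor on a random *pair* of inputs, then draw tau points from it.
        i, j = rng.choice(n, size=2, replace=False)
        idx = rng.choice([i, j], size=tau)
        x_batch, y_batch = inputs[idx], labels[idx]
        # Monte Carlo joint likelihood over posterior draws:
        #   p(y_1:tau | x_1:tau) ~= mean_m prod_t p_m(y_t | x_t)
        log_liks = []
        for _ in range(num_posterior_samples):
            probs = agent_sample_probs(x_batch)
            log_liks.append(np.log(probs[np.arange(tau), y_batch] + 1e-12).sum())
        log_liks = np.array(log_liks)
        m = log_liks.max()  # log-mean-exp for numerical stability
        total -= m + np.log(np.exp(log_liks - m).mean())
    return total / num_pairs
```

Restricting each batch to two anchor inputs is what keeps the evaluation informative in high dimension: an i.i.d. batch of tau inputs rarely contains near-duplicates, so it mostly probes marginal rather than joint structure in the predictive distribution.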
Related papers
- Effective and Interpretable Outcome Prediction by Training Sparse Mixtures of Linear Experts [4.178382980763478]
We propose to train a sparse Mixture-of-Experts where both the "gate" and "expert" sub-nets are logistic regressors.
This ensemble-like model is trained end-to-end while automatically selecting a subset of input features in each sub-net (see the sketch after this entry).
arXiv Detail & Related papers (2024-07-18T13:59:10Z)
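As a sketch of that idea, a mixture in which the gate and every expert are linear-logistic could look as follows. The class name, the L1 penalty as the feature-selection mechanism, and all hyperparameters are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn

class LogisticMoE(nn.Module):
    """Mixture-of-Experts whose gate and experts are all logistic regressors."""
    def __init__(self, num_features: int, num_experts: int, l1_weight: float = 1e-3):
        super().__init__()
        self.gate = nn.Linear(num_features, num_experts)     # softmax gate
        self.experts = nn.Linear(num_features, num_experts)  # one logit per expert
        self.l1_weight = l1_weight

    def forward(self, x):
        gate_probs = torch.softmax(self.gate(x), dim=-1)   # (batch, experts)
        expert_probs = torch.sigmoid(self.experts(x))      # (batch, experts)
        return (gate_probs * expert_probs).sum(dim=-1)     # mixture probability

    def sparsity_penalty(self):
        # L1 on the input weights pushes each sub-net toward using few
        # features; the paper's exact selection mechanism may differ.
        return self.l1_weight * (self.gate.weight.abs().sum()
                                 + self.experts.weight.abs().sum())
```

Trained end-to-end with, e.g., binary cross-entropy plus `sparsity_penalty()`, every sub-net remains individually interpretable as a logistic regression.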
- Simple and effective data augmentation for compositional generalization [64.00420578048855]
We show that data augmentation methods that sample MRs (meaning representations) and backtranslate them can be effective for compositional generalization.
Remarkably, sampling from a uniform distribution performs almost as well as sampling from the test distribution.
arXiv Detail & Related papers (2024-01-18T09:13:59Z)
- Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty that accounts for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Distribution Shift Inversion for Out-of-Distribution Prediction [57.22301285120695]
We propose a portable Distribution Shift Inversion algorithm for Out-of-Distribution (OoD) prediction.
We show that our method provides a general performance gain when plugged into a wide range of commonly used OoD algorithms.
arXiv Detail & Related papers (2023-06-14T08:00:49Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence on labeled source data and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (sketched below).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
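A minimal sketch of the ATC rule follows; the function name and the choice of confidence score are illustrative assumptions (the paper also considers scores other than max-softmax probability):

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence, sketched.

    source_conf:    per-example confidence scores on labeled source data
    source_correct: 0/1 array marking whether each source prediction was correct
    target_conf:    confidence scores on unlabeled target data
    """
    source_acc = source_correct.mean()
    # Pick the threshold so that the fraction of source examples above it
    # equals the source accuracy, i.e. the (1 - acc) quantile of confidences.
    threshold = np.quantile(source_conf, 1.0 - source_acc)
    # Predicted target accuracy: fraction of target examples above threshold.
    return (target_conf > threshold).mean()
```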
- Evaluating Predictive Distributions: Does Bayesian Deep Learning Work? [45.290773422944866]
Posterior predictive distributions quantify uncertainties ignored by point estimates.
This paper introduces \textit{The Neural Testbed}, which provides tools for the systematic evaluation of agents that generate such predictions.
arXiv Detail & Related papers (2021-10-09T18:54:02Z)
- Meta-Learning Conjugate Priors for Few-Shot Bayesian Optimization [0.0]
We propose a novel approach that uses meta-learning to automate the estimation of informative conjugate prior distributions.
From this process we generate priors that require only a few data points to estimate the shape parameters of the original data distribution.
arXiv Detail & Related papers (2021-01-03T23:58:32Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while retaining enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
arXiv Detail & Related papers (2020-05-01T11:22:55Z)