Minimum Excess Risk in Bayesian Learning
- URL: http://arxiv.org/abs/2012.14868v1
- Date: Tue, 29 Dec 2020 17:41:30 GMT
- Title: Minimum Excess Risk in Bayesian Learning
- Authors: Aolin Xu, Maxim Raginsky
- Abstract summary: We analyze the best achievable performance of Bayesian learning under generative models by defining and upper-bounding the minimum excess risk (MER). The definition of MER provides a principled way to define different notions of uncertainty in Bayesian learning.
- Score: 23.681494934015927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We analyze the best achievable performance of Bayesian learning under
generative models by defining and upper-bounding the minimum excess risk (MER):
the gap between the minimum expected loss attainable by learning from data and
the minimum expected loss that could be achieved if the model realization were
known. The definition of MER provides a principled way to define different
notions of uncertainty in Bayesian learning, including the aleatoric
uncertainty and the minimum epistemic uncertainty. Two methods for deriving
upper bounds for the MER are presented. The first method, generally suitable
for Bayesian learning with a parametric generative model, upper-bounds the MER
by the conditional mutual information between the model parameters and the
quantity being predicted given the observed data. It allows us to quantify the
rate at which the MER decays to zero as more data becomes available. The second
method, particularly suitable for Bayesian learning with a parametric
predictive model, relates the MER to the deviation of the posterior predictive
distribution from the true predictive model, and further to the minimum
estimation error of the model parameters from data. It explicitly shows how the
uncertainty in model parameter estimation translates to the MER and to the
final prediction uncertainty. We also extend the definition and analysis of MER
to the setting with multiple parametric model families and the setting with
nonparametric models. Throughout, we draw comparisons between the MER in
Bayesian learning and the excess risk in frequentist learning.
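In notation assumed here for illustration (the paper's symbols may differ), with model parameter $W$, observed data $Z^n$, and a fresh feature-label pair $(X, Y)$, the MER is the gap

$$ \mathrm{MER} \;\triangleq\; \inf_{\psi}\, \mathbb{E}\big[\ell\big(Y, \psi(Z^n, X)\big)\big] \;-\; \inf_{\phi}\, \mathbb{E}\big[\ell\big(Y, \phi(W, X)\big)\big], $$

where the first infimum ranges over predictors that see only the data and the feature, and the second over predictors that also see the realized model $W$. Under logarithmic loss this gap coincides with the conditional mutual information $I(W; Y \mid Z^n, X)$ that drives the first bounding method.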
Related papers
- Error Bounds of Supervised Classification from Information-Theoretic Perspective [0.0]
We explore bounds on the expected risk when using deep neural networks for supervised classification from an information-theoretic perspective.
We introduce model risk and fitting error, derived by further decomposing the empirical risk; a generic version of this split is sketched below.
arXiv Detail & Related papers (2024-06-07T01:07:35Z)
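As generic context for that decomposition (the split below is a standard one, stated here as an illustration rather than quoted from the paper), write $f^*$ for the Bayes predictor, $f_{\mathcal{F}}$ for the best predictor in the model class, and $\hat{f}$ for the learned network:

$$ \mathbb{E}[R(\hat{f})] - R(f^*) \;=\; \underbrace{R(f_{\mathcal{F}}) - R(f^*)}_{\text{model risk}} \;+\; \underbrace{\mathbb{E}[R(\hat{f})] - R(f_{\mathcal{F}})}_{\text{fitting error}}. $$

The paper obtains its terms by further decomposing the empirical risk from an information-theoretic perspective.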
- Performative Prediction with Neural Networks [24.880495520422]
Performative prediction is a framework for learning models that influence the data they intend to predict.
Standard convergence results for finding a performatively stable classifier with the method of repeated risk minimization assume that the data distribution is Lipschitz continuous with respect to the model's parameters.
In this work, we instead assume that the data distribution is Lipschitz continuous with respect to the model's predictions, a more natural assumption for performative systems (a toy repeated risk minimization loop is sketched below).
arXiv Detail & Related papers (2023-04-14T01:12:48Z)
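A toy sketch of the repeated risk minimization loop referenced above; the quadratic loss, the linear distribution reaction, and all numbers are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_data(theta, n=2000):
    # Performativity: deploying parameter theta changes the data the model
    # will face. The linear reaction below is an assumed toy mechanism.
    x = rng.normal(size=n)
    y = (2.0 - 0.5 * theta) * x + rng.normal(scale=0.1, size=n)
    return x, y

def risk_minimizer(x, y):
    # Exact minimizer of the empirical squared loss sum((y - theta * x)^2).
    return float(x @ y / (x @ x))

theta = 0.0
for _ in range(20):
    x, y = sample_data(theta)        # distribution induced by current model
    theta = risk_minimizer(x, y)     # retrain on the data the model caused

# Iterates approach a performatively stable point (here near 4/3).
print(f"approximately stable theta: {theta:.3f}")
```

Because the toy reaction map is a contraction, the iterates converge to a performatively stable parameter; the paper's contribution concerns which Lipschitz assumption guarantees such convergence.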
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss to quantify the uncertainty of a downstream evaluation.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
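For reference, the classical (explicit) delta method that the implicit variant emulates: if $\sqrt{n}\,(\hat{\theta}_n - \theta) \xrightarrow{d} \mathcal{N}(0, \Sigma)$ and the evaluation $g$ is differentiable, then

$$ \sqrt{n}\,\big(g(\hat{\theta}_n) - g(\theta)\big) \;\xrightarrow{d}\; \mathcal{N}\!\big(0,\; \nabla g(\theta)^{\top} \Sigma\, \nabla g(\theta)\big), $$

so the evaluation's variance follows from an explicit gradient; per the summary above, the implicit method trades that gradient for an infinitesimal regularization of the training loss, approximated in practice by a finite difference.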
- Prediction Errors for Penalized Regressions based on Generalized Approximate Message Passing [0.0]
We derive the forms of estimators for the prediction errors: the $C_p$ criterion, information criteria, and the leave-one-out cross-validation (LOOCV) error.
In the framework of GAMP, we show that the information criteria can be expressed in terms of the variance of the estimates; the classical $C_p$ form is recalled below.
arXiv Detail & Related papers (2022-06-26T09:42:39Z)
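For orientation, the classical prediction-error form of Mallows' $C_p$ for a model with residual sum of squares $\mathrm{RSS}$, noise variance $\sigma^2$, and degrees of freedom $\mathrm{df}$ (the paper's GAMP-based estimators generalize this to penalized regressions, where $\mathrm{df}$-like corrections can reportedly be read off from the variances of the estimates):

$$ C_p \;=\; \frac{\mathrm{RSS}}{n} \;+\; \frac{2\sigma^2}{n}\,\mathrm{df}. $$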
- Uncertainty estimation of pedestrian future trajectory using Bayesian approximation [137.00426219455116]
In dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy.
The authors propose to quantify forecasting uncertainty with Bayesian approximation, capturing uncertainty that deterministic approaches fail to represent.
The effect of dropout weights and of long-term prediction on future-state uncertainty is studied (see the Monte Carlo dropout sketch below).
arXiv Detail & Related papers (2022-05-04T04:23:38Z)
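A minimal sketch of Monte Carlo dropout, the kind of Bayesian approximation this line of work uses; the architecture, input, and sample count are illustrative assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small regression net with dropout; dropout stays active at test time
# so repeated stochastic forward passes approximate a posterior over outputs.
net = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 2),  # e.g. a future (x, y) position
)

x = torch.randn(1, 2)   # one observed state (toy input)

net.train()             # keep dropout ON for Monte Carlo sampling
with torch.no_grad():
    samples = torch.stack([net(x) for _ in range(100)])  # 100 stochastic passes

mean = samples.mean(dim=0)   # point forecast
std = samples.std(dim=0)     # uncertainty proxy per output dimension
print("prediction:", mean.squeeze().tolist())
print("uncertainty (std):", std.squeeze().tolist())
```

Keeping dropout active at test time turns the network into a stochastic predictor, and the spread across passes serves as the forecast uncertainty.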
- Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner's risk converges to zero if the propensity score is known (both learners are sketched below).
arXiv Detail & Related papers (2022-02-10T18:51:52Z)
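A compact sketch contrasting the two learners named above on synthetic, randomly assigned data (the data-generating process and plain least squares are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 3

X = rng.normal(size=(n, d))
e = 0.5 * np.ones(n)                    # known propensity (random assignment)
T = rng.binomial(1, e)                  # treatment indicator
tau = X @ np.array([1.0, -0.5, 0.25])   # true CATE (linear, assumed)
Y = X @ np.array([0.3, 0.3, 0.3]) + T * tau + rng.normal(size=n)

def ols(A, b):
    return np.linalg.lstsq(A, b, rcond=None)[0]

# T-learner: separate outcome regressions on treated and control arms.
beta1 = ols(X[T == 1], Y[T == 1])
beta0 = ols(X[T == 0], Y[T == 0])
cate_t = X @ (beta1 - beta0)

# IPW-learner: regress the inverse-propensity-weighted pseudo-outcome on X;
# E[Z | X] equals the CATE when the propensity score is known.
Z = (T / e - (1 - T) / (1 - e)) * Y
cate_ipw = X @ ols(X, Z)

print("T-learner MSE:  ", np.mean((cate_t - tau) ** 2))
print("IPW-learner MSE:", np.mean((cate_ipw - tau) ** 2))
```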
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We study two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when used in fully-, semi-, and weakly-supervised frameworks (a toy ensemble sketch follows).
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
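A minimal sketch of the ensemble-based flavor of uncertainty estimation (toy data, bootstrap resampling, and polynomial members are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data.
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

# Ensemble: fit several models on bootstrap resamples; disagreement
# between members serves as the uncertainty estimate.
members = []
for _ in range(10):
    idx = rng.integers(0, x.size, size=x.size)          # bootstrap resample
    members.append(np.polyfit(x[idx], y[idx], deg=5))   # one ensemble member

x_test = np.linspace(-4, 4, 9)                          # includes out-of-range points
preds = np.stack([np.polyval(c, x_test) for c in members])

mean = preds.mean(axis=0)   # ensemble prediction
std = preds.std(axis=0)     # grows outside the training range, as desired
for xt, m, s in zip(x_test, mean, std):
    print(f"x={xt:+.1f}  pred={m:+.2f}  std={s:.2f}")
```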
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of the mutual information between model parameters and simulated data (a classifier-based sketch of the ratio estimator follows).
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
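A sketch of the classifier-based form of amortized likelihood-to-evidence ratio estimation; the 1-D Gaussian simulator and logistic regression with quadratic features are assumptions, chosen so the estimate can be checked analytically:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20000

# Assumed toy simulator: theta ~ N(0, 1) prior, x ~ N(theta, 1) likelihood.
theta = rng.normal(size=n)
x = theta + rng.normal(size=n)

# Positives: dependent (theta, x) pairs from the joint distribution.
# Negatives: theta shuffled, breaking dependence (product of marginals).
pairs = np.vstack([np.column_stack([theta, x]),
                   np.column_stack([rng.permutation(theta), x])])
labels = np.concatenate([np.ones(n), np.zeros(n)])

def expand(p):
    # Quadratic features: the Gaussian log-ratio is quadratic in (theta, x).
    t, z = p[:, 0], p[:, 1]
    return np.column_stack([t, z, t * z, t**2, z**2])

clf = LogisticRegression(max_iter=1000).fit(expand(pairs), labels)

# The classifier's log-odds approximate log p(x|theta) - log p(x),
# i.e. the log likelihood-to-evidence ratio.
test = np.array([[0.5, 0.7]])
est = clf.decision_function(expand(test))[0]
exact = -0.5 * (0.7 - 0.5) ** 2 + 0.7 ** 2 / 4 + 0.5 * np.log(2)
print(f"estimated log-ratio: {est:.3f}   analytic: {exact:.3f}")
```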
- Improving Deterministic Uncertainty Estimation in Deep Learning for Classification and Regression [30.112634874443494]
We propose a new model that estimates uncertainty in a single forward pass.
Our approach combines a bi-Lipschitz feature extractor with an inducing-point approximate Gaussian process, offering robust and principled uncertainty estimation (a rough sketch of the recipe follows).
arXiv Detail & Related papers (2021-02-22T23:29:12Z)
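A rough sketch of that recipe; spectral normalization with residual connections stands in for the bi-Lipschitz constraint, and an exact scikit-learn GP stands in for the paper's inducing-point approximation (all names and numbers are illustrative assumptions, not the authors' implementation):

```python
import numpy as np
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

torch.manual_seed(0)

class ResidualBlock(nn.Module):
    # Residual connection (lower Lipschitz bound) plus spectral norm
    # (upper Lipschitz bound) approximate a bi-Lipschitz feature map.
    def __init__(self, dim):
        super().__init__()
        self.lin = spectral_norm(nn.Linear(dim, dim))

    def forward(self, x):
        return x + 0.9 * torch.relu(self.lin(x))

extractor = nn.Sequential(ResidualBlock(4), ResidualBlock(4))

# Toy data; the real method trains extractor and GP jointly, skipped here.
X = torch.randn(64, 4)
y = (X[:, 0] ** 2).numpy()

with torch.no_grad():
    Z = extractor(X).numpy()

gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-2).fit(Z, y)

# A single deterministic forward pass yields prediction and uncertainty.
with torch.no_grad():
    Z_test = extractor(torch.randn(5, 4)).numpy()
mean, std = gp.predict(Z_test, return_std=True)
print(np.c_[mean, std])
```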
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators of the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence-interval coverage (a minimal cross-fit sketch follows).
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
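A minimal sketch of a cross-fit doubly robust (AIPW) estimate of the ACE; the synthetic data, two folds, and plain logistic/linear nuisance models are assumptions, whereas the study above pairs the estimator with flexible machine learning:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))    # true propensity (assumed)
T = rng.binomial(1, e)
Y = X.sum(axis=1) + 1.0 * T + rng.normal(size=n)   # true ACE = 1.0

folds = np.arange(n) % 2          # two folds for cross-fitting
psi = np.empty(n)
for k in (0, 1):
    tr, te = folds != k, folds == k   # fit nuisances off-fold, score on-fold
    ps = LogisticRegression().fit(X[tr], T[tr]).predict_proba(X[te])[:, 1]
    m1 = LinearRegression().fit(X[tr][T[tr] == 1], Y[tr][T[tr] == 1]).predict(X[te])
    m0 = LinearRegression().fit(X[tr][T[tr] == 0], Y[tr][T[tr] == 0]).predict(X[te])
    # AIPW score: outcome-model difference plus IPW residual correction.
    psi[te] = (m1 - m0
               + T[te] * (Y[te] - m1) / ps
               - (1 - T[te]) * (Y[te] - m0) / (1 - ps))

print(f"cross-fit AIPW estimate of the ACE: {psi.mean():.3f} (truth 1.0)")
```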
- Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that predictive uncertainty estimated by current methods does not correlate strongly with prediction error.
We propose a novel method that estimates the target labels and the magnitude of the prediction error in two steps (a generic two-step sketch follows).
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
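A generic sketch of such a two-step scheme (toy data and gradient boosting are assumptions, not the paper's MRI pipeline): fit a predictor, then fit a second model to held-out error magnitudes:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 5))
y = X[:, 0] ** 2 + np.abs(X[:, 1]) * rng.normal(size=3000)  # heteroscedastic noise

X_a, X_b, y_a, y_b = train_test_split(X, y, test_size=0.5, random_state=0)

# Step 1: fit the target predictor on the first split.
predictor = GradientBoostingRegressor().fit(X_a, y_a)

# Step 2: fit an error-magnitude model on held-out residuals, so the
# error targets are not contaminated by the predictor's training fit.
resid = np.abs(y_b - predictor.predict(X_b))
error_model = GradientBoostingRegressor().fit(X_b, resid)

x_new = rng.normal(size=(3, 5))
print("predictions:     ", predictor.predict(x_new))
print("expected |error|:", error_model.predict(x_new))
```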