De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and
Non-smooth Predictors
- URL: http://arxiv.org/abs/2002.09956v3
- Date: Thu, 12 Nov 2020 01:29:31 GMT
- Title: De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and
Non-smooth Predictors
- Authors: Arindam Banerjee, Tiancong Chen and Yingxue Zhou
- Abstract summary: We present a family of de-randomized PAC-Bayes margin bounds for deterministic non-smooth predictors, e.g., ReLU-nets.
We also present empirical results of our bounds over changing training set size and randomness in labels.
- Score: 21.59277717031637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In spite of several notable efforts, explaining the generalization of
deterministic non-smooth deep nets, e.g., ReLU-nets, has remained challenging.
Existing approaches for deterministic non-smooth deep nets typically need to
bound the Lipschitz constant of such nets, but these bounds are quite large and
may even grow with the training set size, yielding vacuous generalization
bounds. In this paper, we present a new family of de-randomized PAC-Bayes
margin bounds for deterministic non-convex and non-smooth predictors, e.g.,
ReLU-nets. Unlike PAC-Bayes, which applies to Bayesian predictors, the
de-randomized bounds apply to deterministic predictors like ReLU-nets. A
specific instantiation of the bound depends on a trade-off between the
(weighted) distance of the trained weights from the initialization and the
effective curvature (`flatness') of the trained predictor.
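Bounds of this flavor typically take the following schematic form (a sketch only; the paper's exact norms, weighting, and constants differ):

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size m,
% for a deterministic predictor f_w with margin parameter \gamma > 0:
\mathbb{P}\left[\, y\, f_w(x) \le 0 \,\right]
  \;\le\; \widehat{L}_\gamma(f_w)
  \;+\; \tilde{O}\!\left( \sqrt{ \frac{ \|w - w_0\|^2/\sigma^2 \;+\; \ln(1/\delta) }{ m } } \right)
```

Here \(\widehat{L}_\gamma\) is the empirical margin loss, \(\|w - w_0\|\) is the distance of the trained weights \(w\) from the initialization \(w_0\), and \(\sigma\) is a posterior-variance scale that, loosely, encodes the effective flatness; the trade-off in the text is between shrinking the numerator's distance term and enlarging the tolerable \(\sigma\).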
To get to these bounds, we first develop a de-randomization argument for
non-convex but smooth predictors, e.g., linear deep networks (LDNs), which
connects the performance of the deterministic predictor with a Bayesian
predictor. We then consider non-smooth predictors which, for any given input,
are realized as smooth predictors; e.g., a ReLU-net becomes some LDN for any
given input, but the realized smooth predictor can differ across inputs. For
such non-smooth predictors, we introduce a new PAC-Bayes analysis
which takes advantage of the smoothness of the realized predictors, e.g., LDN,
for a given input, and avoids dependency on the Lipschitz constant of the
non-smooth predictor. After careful de-randomization, we get a bound for the
deterministic non-smooth predictor. We also establish non-uniform sample
complexity results based on such bounds. Finally, we present extensive
empirical results of our bounds over changing training set size and randomness
in labels.
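The two quantities the bound trades off, the empirical margin loss and the distance of trained weights from initialization, are easy to compute for a concrete network. The sketch below does so for a toy two-layer ReLU net; all names and the synthetic setup are illustrative, not the paper's experimental protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU net: f(x) = W2 @ relu(W1 @ x).
def relu_net(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)

def empirical_margin_loss(X, y, W1, W2, gamma):
    """Fraction of samples whose margin y * f(x) falls below gamma."""
    margins = np.array([y_i * relu_net(x_i, W1, W2)[0] for x_i, y_i in zip(X, y)])
    return float(np.mean(margins < gamma))

def distance_from_init(params, params0):
    """Euclidean distance of the trained weights from initialization."""
    return float(np.sqrt(sum(np.sum((p - p0) ** 2) for p, p0 in zip(params, params0))))

# Synthetic data; "trained" weights are a small perturbation of the init.
d, h, m = 5, 8, 100
X = rng.normal(size=(m, d))
y = np.sign(rng.normal(size=m))
W1_0, W2_0 = rng.normal(size=(h, d)), rng.normal(size=(1, h))
W1 = W1_0 + 0.1 * rng.normal(size=(h, d))
W2 = W2_0 + 0.1 * rng.normal(size=(1, h))

loss = empirical_margin_loss(X, y, W1, W2, gamma=0.5)
dist = distance_from_init([W1, W2], [W1_0, W2_0])
```

In the bound's schematic form, `loss` plays the role of the empirical margin term and `dist` drives the complexity term, so small movement from initialization directly tightens the bound.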
Related papers
- Conformalized-DeepONet: A Distribution-Free Framework for Uncertainty
Quantification in Deep Operator Networks [7.119066725173193]
We use conformal prediction to obtain confidence prediction intervals with coverage guarantees for Deep Operator Network (DeepONet) regression.
We design a novel Quantile-DeepONet that allows for a more natural use of split conformal prediction.
We demonstrate the effectiveness of the proposed methods using various ordinary, partial differential equation numerical examples.
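The generic split conformal recipe behind this line of work can be sketched in a few lines; Conformalized-DeepONet applies the same idea to operator-network outputs, with details beyond this toy regression illustration (the predictor here is just the calibration mean).

```python
import numpy as np

rng = np.random.default_rng(0)

def split_conformal_radius(cal_residuals, alpha):
    """Quantile of calibration residuals giving (1 - alpha) marginal coverage."""
    n = len(cal_residuals)
    # Finite-sample-corrected quantile level, clipped to 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(cal_residuals, level, method="higher"))

# Toy setup: held-out calibration data and a trivial point predictor.
y_cal = rng.normal(loc=2.0, scale=1.0, size=500)
y_hat = float(y_cal.mean())
residuals = np.abs(y_cal - y_hat)

q = split_conformal_radius(residuals, alpha=0.1)
# Prediction interval for a new point: [y_hat - q, y_hat + q].
```

The coverage guarantee is distribution-free: it relies only on exchangeability of the calibration and test residuals, not on the predictor being well specified.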
arXiv Detail & Related papers (2024-02-23T16:07:39Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Improved uncertainty quantification for neural networks with Bayesian
last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z) - Looking at the posterior: accuracy and uncertainty of neural-network
predictions [0.0]
We show that prediction accuracy depends on both epistemic and aleatoric uncertainty.
We introduce a novel acquisition function that outperforms common uncertainty-based methods.
arXiv Detail & Related papers (2022-11-26T16:13:32Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural
Networks [151.03112356092575]
We show a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method on uncertainty estimation tasks over a variety of real-world image datasets.
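The Nadaraya-Watson ingredient is a kernel-weighted average of training labels around the query point. The sketch below estimates conditional class probabilities this way on synthetic 2-D data; it is illustrative only, and NUQ's actual estimator and feature embeddings differ.

```python
import numpy as np

def nw_label_probs(x, X_train, y_train, n_classes, bandwidth=1.0):
    """Nadaraya-Watson estimate of p(y | x): kernel-weighted class frequencies."""
    dists = np.sum((X_train - x) ** 2, axis=1)
    weights = np.exp(-dists / (2.0 * bandwidth ** 2))
    probs = np.array([weights[y_train == c].sum() for c in range(n_classes)])
    return probs / probs.sum()

rng = np.random.default_rng(0)
# Two well-separated Gaussian clusters, one per class.
X_train = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
y_train = np.array([0] * 50 + [1] * 50)

p = nw_label_probs(np.array([-2.0, -2.0]), X_train, y_train, n_classes=2)
# Predictive uncertainty can then be read off p, e.g., as its entropy.
```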
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when used in fully, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z) - Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal
Estimation [25.003116148843525]
Surface normal estimation from a single image is an important task in 3D scene understanding.
In this paper, we address two limitations shared by the existing methods: the inability to estimate the aleatoric uncertainty and lack of detail in the prediction.
We present a novel decoder framework where pixel-wise perceptrons are trained on a subset of pixels sampled based on the estimated uncertainty.
arXiv Detail & Related papers (2021-09-20T23:30:04Z) - Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z) - Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy.
We present a framework that treats these two desiderata jointly.
We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z) - Learnable Uncertainty under Laplace Approximations [65.24701908364383]
We develop a formalism to explicitly "train" the uncertainty in a decoupled way to the prediction itself.
We show that such units can be trained via an uncertainty-aware objective, improving standard Laplace approximations' performance.
arXiv Detail & Related papers (2020-10-06T13:43:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.