Maximum Weight Entropy
- URL: http://arxiv.org/abs/2309.15704v1
- Date: Wed, 27 Sep 2023 14:46:10 GMT
- Title: Maximum Weight Entropy
- Authors: Antoine de Mathelin, François Deheeger, Mathilde Mougeot, Nicolas Vayatis
- Abstract summary: This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods.
Considering stochastic neural networks, a practical optimization is derived to build a maximum-entropy weight distribution, defined as a trade-off between the average empirical risk and the weight distribution entropy.
- Score: 6.821961232645206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper deals with uncertainty quantification and out-of-distribution
detection in deep learning using Bayesian and ensemble methods. It proposes a
practical solution to the lack of prediction diversity observed recently for
standard approaches when used out-of-distribution (Ovadia et al., 2019; Liu et
al., 2021). Considering that this issue is mainly related to a lack of weight
diversity, we claim that standard methods sample in "over-restricted" regions
of the weight space due to the use of "over-regularization" processes, such as
weight decay and zero-mean centered Gaussian priors. We propose to solve the
problem by adopting the maximum entropy principle for the weight distribution,
with the underlying idea of maximizing weight diversity. Under this paradigm,
the epistemic uncertainty is described by the weight distribution of maximal
entropy that produces neural networks "consistent" with the training
observations. Considering stochastic neural networks, a practical optimization
is derived to build such a distribution, defined as a trade-off between the
average empirical risk and the weight distribution entropy. We develop a novel
weight parameterization for the stochastic model, based on the singular value
decomposition of the neural network's hidden representations, which enables a
large increase in weight entropy at the cost of only a small empirical risk penalization.
We provide both theoretical and numerical results to assess the efficiency of
the approach. In particular, the proposed algorithm ranks among the top three
methods in all configurations of an extensive out-of-distribution detection
benchmark including more than thirty competitors.
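Below is a minimal sketch of the risk/entropy trade-off described in the abstract, assuming a PyTorch implementation with a factorized Gaussian weight distribution. The names `StochasticLinear`, `max_entropy_objective`, and the entropy weight `lam` are illustrative choices introduced here, not the authors' code, and the paper's SVD-based weight parameterization is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StochasticLinear(nn.Module):
    """Linear layer with a factorized Gaussian weight distribution (illustrative)."""

    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        # Reparameterized weight sample: w = mu + sigma * eps
        eps = torch.randn_like(self.mu)
        w = self.mu + self.log_sigma.exp() * eps
        return F.linear(x, w)

    def entropy(self):
        # Entropy of a factorized Gaussian equals sum(log sigma) up to an additive constant.
        return self.log_sigma.sum()


def max_entropy_objective(model, x, y, lam=1e-3, n_samples=4):
    """Average empirical risk over weight samples, minus a weighted entropy bonus."""
    risk = torch.stack(
        [F.cross_entropy(model(x), y) for _ in range(n_samples)]
    ).mean()
    ent = sum(m.entropy() for m in model.modules() if isinstance(m, StochasticLinear))
    return risk - lam * ent


# Usage sketch: a small stochastic classifier trained with the risk/entropy trade-off.
model = nn.Sequential(StochasticLinear(20, 64), nn.ReLU(), StochasticLinear(64, 3))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 20), torch.randint(0, 3, (128,))
for _ in range(100):
    opt.zero_grad()
    loss = max_entropy_objective(model, x, y)
    loss.backward()
    opt.step()
```

In this sketch, `lam` controls how far the sampled weights may spread while keeping the average training risk low, which is the trade-off the abstract describes.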
Related papers
- Optimization and Generalization Guarantees for Weight Normalization [19.965963460750206]
We provide the first theoretical characterizations of both optimization and generalization of deep WeightNorm models.
We present experimental results which illustrate how the normalization terms and other quantities of theoretical interest relate to the training of WeightNorm networks.
arXiv Detail & Related papers (2024-09-13T15:55:05Z)
- Constrained Reweighting of Distributions: an Optimal Transport Approach [8.461214317999321]
We introduce nonparametrically imbued distributional constraints on the weights, and develop a general framework leveraging the maximum entropy principle and tools from optimal transport.
The framework is demonstrated in the context of three disparate applications: portfolio allocation, semi-parametric inference for complex surveys, and ensuring algorithmic fairness in machine learning algorithms.
arXiv Detail & Related papers (2023-10-19T03:54:31Z)
- Variational autoencoder with weighted samples for high-dimensional non-parametric adaptive importance sampling [0.0]
We extend the existing framework to the case of weighted samples by introducing a new objective function.
In order to add flexibility to the model and to be able to learn multimodal distributions, we consider a learnable prior distribution.
We exploit the proposed procedure in existing adaptive importance sampling algorithms to draw points from a target distribution and to estimate a rare event probability in high dimension.
arXiv Detail & Related papers (2023-10-13T15:40:55Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Multivariate Deep Evidential Regression [77.34726150561087]
A new approach with uncertainty-aware neural networks shows promise over traditional deterministic methods.
We discuss three issues with a proposed solution to extract aleatoric and epistemic uncertainties from regression-based neural networks.
arXiv Detail & Related papers (2021-04-13T12:20:18Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.