Regularizing Explanations in Bayesian Convolutional Neural Networks
- URL: http://arxiv.org/abs/2105.02653v3
- Date: Wed, 27 Nov 2024 19:06:09 GMT
- Title: Regularizing Explanations in Bayesian Convolutional Neural Networks
- Authors: Yanzhe Bekkemoen, Helge Langseth,
- Abstract summary: We propose a new explanation regularization method compatible with Bayesian inference.
Our method improves predictive performance when models overfit on spurious features or are uncertain of which features to focus on.
- Score: 0.4538232180176148
- License:
- Abstract: Neural networks are powerful function approximators with tremendous potential in learning complex distributions. However, they are prone to overfitting on spurious patterns. Bayesian inference provides a principled way to regularize neural networks and give well-calibrated uncertainty estimates. It allows us to specify prior knowledge on weights. However, specifying domain knowledge via distributions over weights is infeasible. Furthermore, it is unable to correct models when they focus on spurious or irrelevant features. New methods within explainable artificial intelligence allow us to regularize explanations in the form of feature importance to add domain knowledge and correct the models' focus. Nevertheless, they are incompatible with Bayesian neural networks, as they require us to modify the loss function. We propose a new explanation regularization method that is compatible with Bayesian inference. Consequently, we can quantify uncertainty and, at the same time, have correct explanations. We test our method using four different datasets. The results show that our method improves predictive performance when models overfit on spurious features or are uncertain of which features to focus on. Moreover, our method performs better than augmenting training data with samples where spurious features are removed through masking. We provide code, data, trained weights, and hyperparameters.
Related papers
- Sparsifying Bayesian neural networks with latent binary variables and
normalizing flows [10.865434331546126]
We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
arXiv Detail & Related papers (2023-05-05T09:40:28Z) - Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z) - Look beyond labels: Incorporating functional summary information in
Bayesian neural networks [11.874130244353253]
We present a simple approach to incorporate summary information about the predicted probability.
The available summary information is incorporated as augmented data and modeled with a Dirichlet process.
We show how the method can inform the model about task difficulty or class imbalance.
arXiv Detail & Related papers (2022-07-04T07:06:45Z) - Augmenting Neural Networks with Priors on Function Values [22.776982718042962]
Prior knowledge of function values is often available in the natural sciences.
BNNs enable the user to specify prior information only on the neural network weights, not directly on the function values.
We develop an approach to augment BNNs with prior information on the function values themselves.
arXiv Detail & Related papers (2022-02-10T02:24:15Z) - Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z) - Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z) - Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z) - Why have a Unified Predictive Uncertainty? Disentangling it using Deep
Split Ensembles [39.29536042476913]
Understanding and quantifying uncertainty in black box Neural Networks (NNs) is critical when deployed in real-world settings such as healthcare.
We propose a conceptually simple non-Bayesian approach, deep split ensemble, to disentangle the predictive uncertainties.
arXiv Detail & Related papers (2020-09-25T19:15:26Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.