Dropout Prediction Variation Estimation Using Neuron Activation Strength
- URL: http://arxiv.org/abs/2110.06435v1
- Date: Wed, 13 Oct 2021 01:40:33 GMT
- Title: Dropout Prediction Variation Estimation Using Neuron Activation Strength
- Authors: Haichao Yu, Zhe Chen, Dong Lin, Gil Shamir, Jie Han
- Abstract summary: Dropout has been commonly used in various applications to quantify prediction variations.
We show how to estimate dropout prediction variation in a resource-efficient manner.
- Score: 6.625915508197312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well known that DNNs can generate different prediction results
even given the same model configuration and training dataset. As a result, it has
become increasingly important to study prediction variation, i.e., the variation
of the predictions on a given input example, in neural network models. Dropout
has been commonly used in various applications to quantify prediction variation.
However, using dropout in practice can be expensive, as it requires running
dropout inference many times to estimate prediction variation.
In this paper, we study how to estimate dropout prediction variation in a
resource-efficient manner. In particular, we demonstrate that we can use neuron
activation strength to estimate dropout prediction variation under different
dropout settings and on a variety of tasks using three large datasets,
MovieLens, Criteo, and EMNIST. Our approach provides an inference-once
alternative to estimate dropout prediction variation as an auxiliary task when
the main prediction model is served. Moreover, we show that using activation
strength features from a subset of neural network layers can be sufficient to
achieve similar variation estimation performance compared to using activation
features from all layers. This can provide further resource reduction for
variation estimation.
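To make the inference-once idea concrete, below is a minimal PyTorch sketch: an auxiliary head is trained to regress variation labels produced by repeated dropout passes, so that serving needs only a single forward pass. The names, the dropout rate, and the choice of feeding raw hidden activations to the head are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class ModelWithVariationHead(nn.Module):
    """Toy main model plus an auxiliary head that estimates dropout
    prediction variation from the hidden activations in one pass."""
    def __init__(self, d_in=32, d_hidden=64):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.dropout = nn.Dropout(p=0.5)
        self.out = nn.Linear(d_hidden, 1)
        # Auxiliary variation head; consumes activation-strength signals
        # (here the raw hidden activations, as a simplified stand-in).
        self.var_head = nn.Linear(d_hidden, 1)

    def forward(self, x):
        h = self.hidden(x)
        y = self.out(self.dropout(h))
        var_est = self.var_head(h.detach())  # detach: auxiliary task only
        return y, var_est

def mc_dropout_variance_labels(model, x, n_passes=30):
    """Training labels: empirical variance over repeated dropout passes."""
    model.train()  # keep dropout active during inference
    with torch.no_grad():
        preds = torch.stack([model(x)[0] for _ in range(n_passes)])
    return preds.var(dim=0)
```

At serving time, `model(x)` returns both the main prediction and the variation estimate in a single pass; per the layer-subset result above, the auxiliary head could plausibly be fed from only some layers.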
Related papers
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or distributions that mix several components.
A Structured Radial Basis Function Network is presented as an ensemble of multiple-hypothesis predictors for regression problems.
It is proved that this structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypothesis target distribution.
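As a loose illustration of the multiple-hypothesis idea only: the sketch below fits several linear heads over shared Gaussian RBF features with a winner-takes-all update so the heads specialize on different modes. The paper's structured coupling of hypotheses and its tessellation argument are not reproduced, and all names here are hypothetical.

```python
import numpy as np

def rbf_features(X, centers, gamma=1.0):
    """Gaussian RBF activations of inputs X given kernel centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def wta_step(Phi, y, W, lr=0.1):
    """One winner-takes-all update: each example trains only the head
    that currently predicts it best, so heads drift to different modes.
    Phi: [n, m] RBF features, y: [n] targets, W: [m, K] head weights."""
    preds = Phi @ W                       # [n, K], one prediction per head
    best = ((preds - y[:, None]) ** 2).argmin(axis=1)
    for k in range(W.shape[1]):
        mask = best == k
        if mask.any():
            grad = Phi[mask].T @ (preds[mask, k] - y[mask]) / mask.sum()
            W[:, k] -= lr * grad
    return W
```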
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty that accounts for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms for the predictive distribution.
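A hedged sketch of how such sampling-based uncertainty can be summarized non-parametrically: given any routine that draws plausible outputs for an input (the paper's actual sampling strategy is not reproduced here), empirical quantiles give an uncertainty spread without assuming a parametric predictive distribution.

```python
import numpy as np

def nonparametric_uncertainty(samples, q_lo=0.05, q_hi=0.95):
    """Summarize sampled outputs for one input without assuming any
    parametric predictive distribution.

    samples: array of shape [n_samples, ...] of plausible outputs.
    Returns the empirical median and an inter-quantile spread that can
    serve as a per-location uncertainty map."""
    median = np.quantile(samples, 0.5, axis=0)
    spread = np.quantile(samples, q_hi, axis=0) - np.quantile(samples, q_lo, axis=0)
    return median, spread

# Hypothetical usage, where `sample_fn(x)` draws one plausible output
# (e.g. one stochastic forward pass):
# samples = np.stack([sample_fn(x) for _ in range(50)])
# median, spread = nonparametric_uncertainty(samples)
```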
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
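For illustration, a minimal margin-based acquisition step in the spirit of active selective prediction; ASPEST's actual selection criterion may differ, and `select_queries` is a hypothetical helper.

```python
import numpy as np

def select_queries(probs, budget):
    """Pick the unlabeled target-domain examples whose predictions are
    least confident, a simple margin-based stand-in for the query step.

    probs: [n_examples, n_classes] softmax outputs.
    Returns indices of the `budget` examples to send for labeling."""
    sorted_p = np.sort(probs, axis=1)
    margin = sorted_p[:, -1] - sorted_p[:, -2]  # top-1 minus top-2
    return np.argsort(margin)[:budget]          # smallest margins first
```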
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
Common methods often treat feature statistics as deterministic values measured from the learned features.
We argue that these feature statistics can instead be properly manipulated to improve the generalization ability of deep learning models.
We improve network generalization by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
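A rough sketch of treating feature statistics as random rather than deterministic, loosely following this line of work; the batch-level estimation of the statistics' uncertainty and the sampling scheme below are assumptions, not the paper's exact formulation.

```python
import torch

def perturb_feature_stats(x, eps=1e-6):
    """Re-normalize a feature map with sampled statistics to simulate
    domain shift during training. x: [B, C, H, W] feature map."""
    mu = x.mean(dim=[2, 3], keepdim=True)                  # [B, C, 1, 1]
    sig = (x.var(dim=[2, 3], keepdim=True) + eps).sqrt()
    # Uncertainty of the statistics themselves, estimated across the batch
    mu_std = mu.var(dim=0, keepdim=True).sqrt()
    sig_std = sig.var(dim=0, keepdim=True).sqrt()
    # Sample synthesized statistics and re-normalize the features
    new_mu = mu + torch.randn_like(mu) * mu_std
    new_sig = sig + torch.randn_like(sig) * sig_std
    return new_sig * (x - mu) / sig + new_mu
```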
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- A heteroencoder architecture for prediction of failure locations in porous metals using variational inference [1.2722697496405462]
We employ an encoder-decoder convolutional neural network to predict the failure locations of porous metal tension specimens.
The objective of predicting failure locations presents an extreme case of class imbalance, since most of the material in the specimens does not fail.
We demonstrate that the resulting predicted variances are effective in ranking the locations that are most likely to fail in any given specimen.
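The variance-ranking idea can be illustrated with a heteroscedastic head: the network predicts a per-location mean and log-variance, and locations are ranked by predicted variance. The paper obtains its variances via variational inference; the Gaussian negative log-likelihood below is a simplified stand-in.

```python
import torch

def gaussian_nll(mean, log_var, target):
    """Heteroscedastic regression loss: locations the model is unsure
    about are pushed toward large predicted variance."""
    return 0.5 * (log_var + (target - mean) ** 2 / log_var.exp()).mean()

# Ranking sketch: sort locations by predicted variance, descending, to
# rank those most likely to fail within a given specimen.
# ranked = torch.argsort(var_map.flatten(), descending=True)
```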
arXiv Detail & Related papers (2022-01-31T20:26:53Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We study two types of uncertainty estimation solutions, namely ensemble-based and generative-model-based methods, and explain their pros and cons when used in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- Set Prediction without Imposing Structure as Conditional Density Estimation [40.86881969839325]
We propose an alternative to training via set losses by viewing learning as conditional density estimation.
Our framework fits deep energy-based models and approximates the intractable likelihood with gradient-guided sampling.
Our approach is competitive with previous set prediction models on standard benchmarks.
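A minimal sketch of gradient-guided sampling from an energy-based model via Langevin dynamics; `energy_fn`, the step count, and the step size are illustrative assumptions rather than the paper's settings.

```python
import torch

def langevin_sample(energy_fn, y0, steps=60, step_size=1e-2):
    """Draw a sample from an energy-based model E(y | x) by following
    the energy gradient with injected noise; `energy_fn` is assumed to
    close over the conditioning input x."""
    y = y0.clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_fn(y).sum()
        (grad,) = torch.autograd.grad(energy, y)
        noise = torch.randn_like(y) * (step_size ** 0.5)
        y = (y - 0.5 * step_size * grad + noise).detach().requires_grad_(True)
    return y.detach()
```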
arXiv Detail & Related papers (2020-10-08T16:49:16Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models marginally outperform conventional neural networks on certain occasions.
Preliminary investigations indicate that bias stemming from choices of initialisation, architecture, or activation function may play an inherent role.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
- Beyond Point Estimate: Inferring Ensemble Prediction Variation from Neuron Activation Strength in Recommender Systems [21.392694985689083]
Ensemble methods are a state-of-the-art benchmark for prediction uncertainty estimation.
We observe that prediction variations come from various randomness sources.
We propose to infer prediction variation from neuron activation strength and demonstrate the strong predictive power of activation strength features.
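One plausible reading of "activation strength" features, for illustration only (the exact feature set used in the paper may differ): per-layer summary statistics of the post-activation values.

```python
import torch

def activation_strength_features(h):
    """Summary statistics of one layer's activations per example.
    h: [batch, n_units] post-activation values of a layer."""
    return torch.stack([
        h.mean(dim=1),                 # average strength
        h.std(dim=1),                  # spread across units
        (h > 0).float().mean(dim=1),   # fraction of active units
        h.abs().max(dim=1).values,     # peak strength
    ], dim=1)
```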
arXiv Detail & Related papers (2020-08-17T00:08:27Z)
- Estimation with Uncertainty via Conditional Generative Adversarial Networks [3.829070379776576]
We propose a predictive probabilistic neural network model that corresponds to a different way of using the generator in a conditional Generative Adversarial Network (cGAN).
By reversing the input and output of an ordinary cGAN, the model can be used as a predictive model.
In addition, to measure the uncertainty of predictions, we introduce entropy and relative entropy measures for regression and classification problems.
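For the classification case, these uncertainty measures are standard; a minimal sketch follows (the paper's regression analogues are not reproduced here).

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a predictive class distribution; higher values
    indicate more uncertain predictions."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def relative_entropy(p, q, eps=1e-12):
    """KL divergence D(p || q) between two predictive distributions."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)
```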
arXiv Detail & Related papers (2020-07-01T08:54:17Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes inconsistency between the predictive distributions of similar samples.
This amounts to regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve generalization ability.
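A hedged sketch of such a class-wise penalty: match the predictive distribution of one sample to that of another sample of the same class, with the target detached so knowledge flows one way. The temperature and the pairing scheme are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def class_wise_distillation_loss(logits_a, logits_b, temperature=4.0):
    """Penalize divergence between the predictive distributions of two
    same-class samples; logits_b serves as the detached target."""
    t = temperature
    target = F.softmax(logits_b.detach() / t, dim=-1)
    log_pred = F.log_softmax(logits_a / t, dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean") * (t * t)
```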
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.