Beyond Point Estimate: Inferring Ensemble Prediction Variation from
Neuron Activation Strength in Recommender Systems
- URL: http://arxiv.org/abs/2008.07032v1
- Date: Mon, 17 Aug 2020 00:08:27 GMT
- Title: Beyond Point Estimate: Inferring Ensemble Prediction Variation from
Neuron Activation Strength in Recommender Systems
- Authors: Zhe Chen, Yuyan Wang, Dong Lin, Derek Zhiyuan Cheng, Lichan Hong, Ed
H. Chi, Claire Cui
- Abstract summary: The ensemble method is a state-of-the-art benchmark for prediction uncertainty estimation.
We observe that prediction variations come from various randomness sources.
We propose to infer prediction variation from neuron activation strength and demonstrate the strong predictive power of activation strength features.
- Score: 21.392694985689083
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite the impressive prediction performance of deep neural networks (DNNs) in various domains, it is now well known that a set of DNN models trained with the same model specification and the same data can produce very different prediction results. The ensemble method is a state-of-the-art benchmark for prediction uncertainty estimation. However, ensembles are expensive to train and serve for web-scale traffic.
In this paper, we seek to advance the understanding of the prediction variation estimated by the ensemble method. Through empirical experiments on MovieLens and Criteo, two widely used benchmark datasets in recommender systems, we observe that prediction variations come from various randomness sources, including training data shuffling and random parameter initialization. By introducing more randomness into model training, we notice that the ensemble's mean predictions tend to become more accurate while the prediction variations tend to grow higher.
Moreover, we propose to infer prediction variation from neuron activation strength and demonstrate the strong predictive power of activation strength features. Our experimental results show that the average R-squared is as high as 0.56 on MovieLens and 0.81 on Criteo. Our method performs especially well when detecting the lowest and highest variation buckets, with 0.92 and 0.89 AUC, respectively. Our approach provides a simple way to estimate prediction variation, which opens up new opportunities for future work in many interesting areas (e.g., model-based reinforcement learning) without relying on serving expensive ensemble models.
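To make the pipeline concrete, here is a minimal, hypothetical sketch of the idea: train an ensemble whose members differ only in random seed (covering both parameter initialization and data shuffling), take the per-example standard deviation of member predictions as the prediction variation, and fit a cheap regressor from neuron activation strength features to that variation. The synthetic dataset, the small MLP ensemble, the use of a single member's hidden activations, and the gradient-boosted regressor are all illustrative assumptions, not the paper's actual setup.

```python
# Sketch only: infer ensemble prediction variation from activation strength.
# Data, architecture, and feature choices are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 1) Ensemble: same specification and data, different random seeds
#    (the seed controls both weight initialization and minibatch shuffling).
members = [
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=s).fit(X_train, y_train)
    for s in range(5)
]
preds = np.stack([m.predict(X_test) for m in members])  # (n_members, n_test)
variation = preds.std(axis=0)                           # per-example prediction variation

# 2) Activation strength features: ReLU activations of one member's hidden
#    layer, recomputed from its fitted weights.
m = members[0]
activations = np.maximum(X_test @ m.coefs_[0] + m.intercepts_[0], 0.0)

# 3) Infer variation from activation strength with a cheap regressor,
#    avoiding the cost of serving the full ensemble.
a_train, a_test, v_train, v_test = train_test_split(activations, variation, random_state=0)
estimator = GradientBoostingRegressor(random_state=0).fit(a_train, v_train)
print("held-out R^2:", r2_score(v_test, estimator.predict(a_test)))
```

In this toy setting the held-out R-squared plays the role of the 0.56 (MovieLens) and 0.81 (Criteo) figures above; the actual numbers depend entirely on the data and model.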
Related papers
- Deep Limit Model-free Prediction in Regression [0.0]
We provide a model-free approach based on deep neural networks (DNNs) to produce both point predictions and prediction intervals under a general regression setting.
Our method is more stable and accurate than other DNN-based counterparts, especially for optimal point predictions.
arXiv Detail & Related papers (2024-08-18T16:37:53Z)
- Awareness of uncertainty in classification using a multivariate model and multi-views [1.3048920509133808]
The proposed model regularizes uncertain predictions and is trained to produce both the predictions and their uncertainty estimates.
Given the multi-view predictions together with their uncertainties and confidences, we propose several methods to compute final predictions.
The proposed methodology was tested on the CIFAR-10 dataset with clean and noisy labels.
arXiv Detail & Related papers (2024-04-16T06:40:51Z)
- From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks [0.0]
We reinvigorate maximum likelihood estimation (MLE) for macroeconomic density forecasting through a novel neural network architecture with dedicated mean and variance hemispheres.
Our Hemisphere Neural Network (HNN) provides proactive volatility forecasts based on leading indicators when it can, and reactive volatility forecasts based on the magnitude of previous prediction errors when it must.
arXiv Detail & Related papers (2023-11-27T21:37:50Z)
- Human Trajectory Forecasting with Explainable Behavioral Uncertainty [63.62824628085961]
Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars.
Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well.
We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T16:45:21Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation [51.809741427975105]
This work aims to assess how well a model performs under distribution shifts without using labels.
We use the nuclear norm, which has been shown to be effective in characterizing both properties.
We show that the nuclear norm is more accurate and robust than existing methods.
arXiv Detail & Related papers (2023-02-02T13:30:48Z)
- Dropout Prediction Variation Estimation Using Neuron Activation Strength [6.625915508197312]
Dropout has been commonly used in various applications to quantify prediction variations.
We show how to estimate dropout prediction variation in a resource-efficient manner (a toy sketch of the underlying Monte Carlo dropout idea follows this entry).
arXiv Detail & Related papers (2021-10-13T01:40:33Z)
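Since this entry is the closest follow-up to the main paper, a numpy-only toy sketch of Monte Carlo dropout may help: keep dropout active at inference, sample several stochastic forward passes, and read the prediction variation off the spread of the sampled outputs. The network and its random weights below are placeholders, not either paper's model.

```python
# Toy Monte Carlo dropout: prediction variation from stochastic forward passes.
# Weights are random placeholders purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(20, 32)), np.zeros(32)  # one hidden layer, 32 units
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def mc_dropout_predict(x, n_samples=100, keep_prob=0.8):
    """Mean prediction and per-example variation under inference-time dropout."""
    outs = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden activations
        mask = rng.random(h.shape) < keep_prob    # fresh dropout mask each pass
        h = h * mask / keep_prob                  # inverted-dropout scaling
        outs.append((h @ W2 + b2).ravel())
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)

x = rng.normal(size=(5, 20))
mean, variation = mc_dropout_predict(x)
print(variation)  # dropout prediction variation, one value per example
```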
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic predictions and reliable uncertainty estimates.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when using them in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state of the art on real-world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- Stochastic Optimization for Performative Prediction [31.876692592395777]
We study the difference between merely updating model parameters and deploying the new model.
We prove rates of convergence for both greedily deploying models after each update and for taking several updates before redeploying.
These results illustrate how, depending on the strength of performative effects, there exists a regime in which either approach outperforms the other (a toy simulation follows this entry).
arXiv Detail & Related papers (2020-06-12T00:31:16Z)
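As a rough illustration of the greedy-deployment dynamic this entry studies, here is a hypothetical toy simulation in which the data distribution reacts linearly to the currently deployed model; redeploying after every update drives the parameter to a performative fixed point. The linear model, the shift strength eps, and all constants are assumptions for illustration.

```python
# Toy performative prediction: the label distribution depends on the deployed
# model. Greedy deployment redeploys after every gradient update; a lazy
# variant would hold `deployed` fixed for several updates before redeploying.
import numpy as np

rng = np.random.default_rng(0)

def sample_data(deployed, n=500, eps=0.5):
    """Data whose labels shift with the deployed parameter (strength eps)."""
    x = rng.normal(size=n)
    y = (2.0 + eps * deployed) * x + rng.normal(scale=0.1, size=n)
    return x, y

def sgd_update(theta, x, y, lr=0.1):
    grad = np.mean((theta * x - y) * x)  # gradient of mean squared error
    return theta - lr * grad

theta = deployed = 0.0
for _ in range(300):
    x, y = sample_data(deployed)
    theta = sgd_update(theta, x, y)
    deployed = theta                     # greedy: redeploy immediately
# The fixed point solves theta = 2 + eps * theta, i.e. theta = 4 for eps = 0.5.
print("performatively stable parameter:", round(theta, 2))
```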