Simple Regularisation for Uncertainty-Aware Knowledge Distillation
- URL: http://arxiv.org/abs/2205.09526v1
- Date: Thu, 19 May 2022 12:49:37 GMT
- Title: Simple Regularisation for Uncertainty-Aware Knowledge Distillation
- Authors: Martin Ferianc and Miguel Rodrigues
- Abstract summary: In this work, we examine a simple regularisation approach for distribution-free knowledge distillation of an ensemble of machine learning models into a single NN.
The aim of the regularisation is to preserve the diversity, accuracy and uncertainty estimation characteristics of the original ensemble without any intricacies, such as fine-tuning.
- Score: 2.792030485253753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Considering uncertainty estimation of modern neural networks (NNs) is one of
the most important steps towards deploying machine learning systems to
meaningful real-world applications such as in medicine, finance or autonomous
systems. At the moment, ensembles of different NNs constitute the
state-of-the-art in both accuracy and uncertainty estimation in different
tasks. However, ensembles of NNs are impractical under real-world constraints,
since their computation and memory consumption scale linearly with the size of
the ensemble, which increases their latency and deployment cost. In this work,
we examine a simple regularisation approach for distribution-free knowledge
distillation of an ensemble of machine learning models into a single NN. The aim
of the regularisation is to preserve the diversity, accuracy and uncertainty
estimation characteristics of the original ensemble without any intricacies,
such as fine-tuning. We demonstrate the generality of the approach on
combinations of toy data, SVHN/CIFAR-10, simple to complex NN architectures and
different tasks.
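Since the abstract does not spell out the exact regularisation term, the following is only a minimal sketch of the general recipe it describes: a single student network is trained to match the ensemble's mean predictive distribution, with an added penalty that discourages the student from collapsing the ensemble's uncertainty. The entropy-matching term here is an assumed stand-in chosen purely for illustration, not the paper's regulariser.

```python
# Minimal sketch of uncertainty-aware ensemble distillation (PyTorch).
# The entropy-matching regulariser below is an illustrative stand-in, NOT the
# paper's exact regularisation term, which the abstract does not specify.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, ensemble_logits, lam=1.0):
    # student_logits: (batch, classes); ensemble_logits: (members, batch, classes)
    ensemble_probs = F.softmax(ensemble_logits, dim=-1).mean(dim=0)  # predictive mean
    student_logp = F.log_softmax(student_logits, dim=-1)

    # Distillation term: match the ensemble's mean predictive distribution.
    kd = F.kl_div(student_logp, ensemble_probs, reduction="batchmean")

    # Assumed regulariser: keep the student's predictive entropy close to the
    # entropy of the ensemble's mean prediction, so uncertainty is not collapsed.
    student_entropy = -(student_logp.exp() * student_logp).sum(dim=-1)
    ensemble_entropy = -(ensemble_probs * ensemble_probs.clamp_min(1e-12).log()).sum(dim=-1)
    reg = (student_entropy - ensemble_entropy).pow(2).mean()

    return kd + lam * reg
```

In such a setup the weight `lam` would be tuned so that the uncertainty penalty does not dominate the distillation term.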
Related papers
- Empowering Bayesian Neural Networks with Functional Priors through Anchored Ensembling for Mechanics Surrogate Modeling Applications [0.0]
We present a novel BNN training scheme based on anchored ensembling that can integrate a priori information available in the function space.
The anchoring scheme makes use of low-rank correlations between NN parameters, learnt from pre-training to realizations of the functional prior.
We also perform a study to demonstrate how correlations between NN weights, which are often neglected in existing BNN implementations, are critical to appropriately transfer knowledge between the function-space and parameter-space priors.
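For readers unfamiliar with anchored ensembling, the fragment below sketches the generic parameter-space version of the idea: each ensemble member is pulled towards its own fixed draw from the prior while fitting the data. The paper's low-rank, function-space anchoring is more elaborate and is not reproduced here; the names and the isotropic Gaussian prior are assumptions for illustration.

```python
# Generic anchored-ensembling sketch (parameter-space anchoring with an assumed
# isotropic Gaussian prior); not the paper's low-rank, functional-prior scheme.
import torch

def make_anchor(model, prior_std=1.0):
    # One fixed prior draw per ensemble member, sampled once before training.
    return [torch.randn_like(p) * prior_std for p in model.parameters()]

def anchored_penalty(model, anchor, prior_std=1.0):
    # Quadratic pull of each weight towards its anchor (MAP with a shifted prior).
    return sum(((p - a) ** 2).sum() for p, a in zip(model.parameters(), anchor)) / (2 * prior_std ** 2)

# Per-member objective (n = dataset size):
#   loss = task_loss(model(x), y) + anchored_penalty(model, anchor) / n
```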
arXiv Detail & Related papers (2024-09-08T22:27:50Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
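The idea can be pictured with a single linear projection: the pre-trained weight is shared and frozen, and each ensemble member owns only a low-rank update. The class below is a hedged sketch of that pattern, not the authors' implementation; all names and shapes are illustrative.

```python
# Sketch of a LoRA-style ensemble projection: a frozen shared weight plus a
# member-specific low-rank update B_m @ A_m. Illustrative only.
import torch
import torch.nn as nn

class LoRAEnsembleLinear(nn.Module):
    def __init__(self, in_dim, out_dim, num_members, rank=4):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim, bias=False)
        self.shared.weight.requires_grad_(False)      # shared, pre-trained, frozen weight
        self.A = nn.Parameter(torch.randn(num_members, rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_members, out_dim, rank))

    def forward(self, x, member):
        delta = self.B[member] @ self.A[member]       # (out_dim, in_dim) low-rank update
        return self.shared(x) + x @ delta.t()
```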
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Reliable uncertainty with cheaper neural network ensembles: a case study in industrial parts classification [1.104960878651584]
In operations research (OR), predictive models often encounter out-of-distribution (OOD) scenarios.
Deep ensembles, composed of multiple independent NNs, have emerged as a promising approach.
This study is the first to provide a comprehensive comparison of a single NN, a deep ensemble, and the three efficient NN ensembles.
arXiv Detail & Related papers (2024-03-15T10:38:48Z)
- Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning [10.784911682565879]
Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
arXiv Detail & Related papers (2023-08-28T16:58:44Z)
- Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks [142.67349734180445]
Existing algorithms that provide risk-awareness to deep neural networks are complex and ad-hoc.
Here we present capsa, a framework for extending models with risk-awareness.
arXiv Detail & Related papers (2023-08-01T02:07:47Z)
- Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles [0.7499722271664145]
Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution.
Uncertainty quantification (UQ) is a challenge when they are employed to model interatomic potentials in materials systems.
Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials.
arXiv Detail & Related papers (2023-05-02T19:41:17Z)
- On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
- FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
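The core mechanism is easy to state: a single shared backbone is run once per ensemble member, and each member scales and shifts the activations with its own FiLM parameters. A minimal sketch of such a member-indexed FiLM layer follows; names and shapes are assumptions, not the authors' code.

```python
# Minimal member-indexed FiLM layer: shared backbone activations are modulated by
# per-member (gamma, beta) parameters to form an implicit ensemble. Illustrative only.
import torch
import torch.nn as nn

class FiLMEnsembleLayer(nn.Module):
    def __init__(self, num_members, num_features):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_members, num_features))
        self.beta = nn.Parameter(torch.zeros(num_members, num_features))

    def forward(self, h, member):
        # h: (batch, num_features); scale and shift with the chosen member's parameters.
        return self.gamma[member] * h + self.beta[member]
```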
arXiv Detail & Related papers (2022-05-31T18:33:15Z)
- The Unreasonable Effectiveness of Deep Evidential Regression [72.30888739450343]
A new approach with uncertainty-aware regression-based neural networks (NNs) shows promise over traditional deterministic methods and typical Bayesian NNs.
We detail the theoretical shortcomings and analyze the performance on synthetic and real-world data sets, showing that Deep Evidential Regression is a heuristic rather than an exact uncertainty quantification.
arXiv Detail & Related papers (2022-05-20T10:10:32Z)
- Neural Complexity Measures [96.06344259626127]
We propose Neural Complexity (NC), a meta-learning framework for predicting generalization.
Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way.
arXiv Detail & Related papers (2020-08-07T02:12:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.