Simple Regularisation for Uncertainty-Aware Knowledge Distillation
- URL: http://arxiv.org/abs/2205.09526v1
- Date: Thu, 19 May 2022 12:49:37 GMT
- Title: Simple Regularisation for Uncertainty-Aware Knowledge Distillation
- Authors: Martin Ferianc and Miguel Rodrigues
- Abstract summary: In this work, we examine a simple regularisation approach for distribution-free knowledge distillation of an ensemble of machine learning models into a single NN.
The aim of the regularisation is to preserve the diversity, accuracy and uncertainty estimation characteristics of the original ensemble without any intricacies, such as fine-tuning.
- Score: 2.792030485253753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Considering uncertainty estimation of modern neural networks (NNs) is one of
the most important steps towards deploying machine learning systems to
meaningful real-world applications such as in medicine, finance or autonomous
systems. At the moment, ensembles of different NNs constitute the
state-of-the-art in both accuracy and uncertainty estimation in different
tasks. However, ensembles of NNs are impractical under real-world constraints,
since their computation and memory consumption scale linearly with the size of
the ensemble, which increases their latency and deployment cost. In this work,
we examine a simple regularisation approach for distribution-free knowledge
distillation of an ensemble of machine learning models into a single NN. The aim
of the regularisation is to preserve the diversity, accuracy and uncertainty
estimation characteristics of the original ensemble without any intricacies,
such as fine-tuning. We demonstrate the generality of the approach on
combinations of toy data, SVHN/CIFAR-10, simple to complex NN architectures and
different tasks.
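Since the abstract does not spell out the exact regularisation term, the following is only a minimal sketch of the general recipe it describes: a single student network is trained to match the ensemble's mean predictive distribution, with an added penalty that discourages the student from collapsing the ensemble's uncertainty. The entropy-matching term here is an assumed stand-in chosen purely for illustration, not the paper's regulariser.

```python
# Minimal sketch of uncertainty-aware ensemble distillation (PyTorch).
# The entropy-matching regulariser below is an illustrative stand-in, NOT the
# paper's exact regularisation term, which the abstract does not specify.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, ensemble_logits, lam=1.0):
    # student_logits: (batch, classes); ensemble_logits: (members, batch, classes)
    ensemble_probs = F.softmax(ensemble_logits, dim=-1).mean(dim=0)  # predictive mean
    student_logp = F.log_softmax(student_logits, dim=-1)

    # Distillation term: match the ensemble's mean predictive distribution.
    kd = F.kl_div(student_logp, ensemble_probs, reduction="batchmean")

    # Assumed regulariser: keep the student's predictive entropy close to the
    # entropy of the ensemble's mean prediction, so uncertainty is not collapsed.
    student_entropy = -(student_logp.exp() * student_logp).sum(dim=-1)
    ensemble_entropy = -(ensemble_probs * ensemble_probs.clamp_min(1e-12).log()).sum(dim=-1)
    reg = (student_entropy - ensemble_entropy).pow(2).mean()

    return kd + lam * reg
```

In such a setup the weight `lam` would be tuned so that the uncertainty penalty does not dominate the distillation term.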
Related papers
- Empowering Bayesian Neural Networks with Functional Priors through Anchored Ensembling for Mechanics Surrogate Modeling Applications [0.0]
We present a novel BNN training scheme based on anchored ensembling that can integrate a priori information available in the function space.
The anchoring scheme makes use of low-rank correlations between NN parameters, learnt from pre-training to realizations of the functional prior.
We also perform a study to demonstrate how correlations between NN weights, which are often neglected in existing BNN implementations, are critical to appropriately transfer knowledge between the function-space and parameter-space priors.
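For readers unfamiliar with anchored ensembling, the fragment below sketches the generic parameter-space version of the idea: each ensemble member is pulled towards its own fixed draw from the prior while fitting the data. The paper's low-rank, function-space anchoring is more elaborate and is not reproduced here; the names and the isotropic Gaussian prior are assumptions for illustration.

```python
# Generic anchored-ensembling sketch (parameter-space anchoring with an assumed
# isotropic Gaussian prior); not the paper's low-rank, functional-prior scheme.
import torch

def make_anchor(model, prior_std=1.0):
    # One fixed prior draw per ensemble member, sampled once before training.
    return [torch.randn_like(p) * prior_std for p in model.parameters()]

def anchored_penalty(model, anchor, prior_std=1.0):
    # Quadratic pull of each weight towards its anchor (MAP with a shifted prior).
    return sum(((p - a) ** 2).sum() for p, a in zip(model.parameters(), anchor)) / (2 * prior_std ** 2)

# Per-member objective (n = dataset size):
#   loss = task_loss(model(x), y) + anchored_penalty(model, anchor) / n
```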
arXiv Detail & Related papers (2024-09-08T22:27:50Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
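The idea can be pictured with a single linear projection: the pre-trained weight is shared and frozen, and each ensemble member owns only a low-rank update. The class below is a hedged sketch of that pattern, not the authors' implementation; all names and shapes are illustrative.

```python
# Sketch of a LoRA-style ensemble projection: a frozen shared weight plus a
# member-specific low-rank update B_m @ A_m. Illustrative only.
import torch
import torch.nn as nn

class LoRAEnsembleLinear(nn.Module):
    def __init__(self, in_dim, out_dim, num_members, rank=4):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim, bias=False)
        self.shared.weight.requires_grad_(False)      # shared, pre-trained, frozen weight
        self.A = nn.Parameter(torch.randn(num_members, rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_members, out_dim, rank))

    def forward(self, x, member):
        delta = self.B[member] @ self.A[member]       # (out_dim, in_dim) low-rank update
        return self.shared(x) + x @ delta.t()
```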
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Reliable uncertainty with cheaper neural network ensembles: a case study in industrial parts classification [1.104960878651584]
In operations research (OR), predictive models often encounter out-of-distribution (OOD) scenarios.
Deep ensembles, composed of multiple independent NNs, have emerged as a promising approach.
This study is the first to provide a comprehensive comparison of a single NN, a deep ensemble, and the three efficient NN ensembles.
arXiv Detail & Related papers (2024-03-15T10:38:48Z)
- Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning [10.784911682565879]
Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
arXiv Detail & Related papers (2023-08-28T16:58:44Z)
- Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks [142.67349734180445]
Existing algorithms that provide risk-awareness to deep neural networks are complex and ad-hoc.
Here we present capsa, a framework for extending models with risk-awareness.
arXiv Detail & Related papers (2023-08-01T02:07:47Z)
- Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles [0.7499722271664145]
Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution.
Uncertainty quantification (UQ) is a challenge when they are employed to model interatomic potentials in materials systems.
Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials.
arXiv Detail & Related papers (2023-05-02T19:41:17Z)
- On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
- FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
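The core mechanism is easy to state: a single shared backbone is run once per ensemble member, and each member scales and shifts the activations with its own FiLM parameters. A minimal sketch of such a member-indexed FiLM layer follows; names and shapes are assumptions, not the authors' code.

```python
# Minimal member-indexed FiLM layer: shared backbone activations are modulated by
# per-member (gamma, beta) parameters to form an implicit ensemble. Illustrative only.
import torch
import torch.nn as nn

class FiLMEnsembleLayer(nn.Module):
    def __init__(self, num_members, num_features):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_members, num_features))
        self.beta = nn.Parameter(torch.zeros(num_members, num_features))

    def forward(self, h, member):
        # h: (batch, num_features); scale and shift with the chosen member's parameters.
        return self.gamma[member] * h + self.beta[member]
```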
arXiv Detail & Related papers (2022-05-31T18:33:15Z)
- The Unreasonable Effectiveness of Deep Evidential Regression [72.30888739450343]
A new approach with uncertainty-aware regression-based neural networks (NNs) shows promise over traditional deterministic methods and typical Bayesian NNs.
We detail the theoretical shortcomings and analyze the performance on synthetic and real-world data sets, showing that Deep Evidential Regression is a heuristic rather than an exact uncertainty quantification.
arXiv Detail & Related papers (2022-05-20T10:10:32Z)
- Neural Complexity Measures [96.06344259626127]
We propose Neural Complexity (NC), a meta-learning framework for predicting generalization.
Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way.
arXiv Detail & Related papers (2020-08-07T02:12:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.