Generalisation and the Risk--Entropy Curve
- URL: http://arxiv.org/abs/2202.07350v1
- Date: Tue, 15 Feb 2022 12:19:10 GMT
- Title: Generalisation and the Risk--Entropy Curve
- Authors: Dominic Belcher, Antonia Marcu, Adam Prügel-Bennett
- Abstract summary: We show that the expected generalisation performance of a learning machine is determined by the distribution of risks or, equivalently, by its logarithm, the risk entropy.
Results are presented for different deep neural network models using Markov Chain Monte Carlo techniques.
- Score: 0.49723239539321284
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we show that the expected generalisation performance of a
learning machine is determined by the distribution of risks or equivalently its
logarithm -- a quantity we term the risk entropy -- and the fluctuations in a
quantity we call the training ratio. We show that the risk entropy can be
empirically inferred for deep neural network models using Markov Chain Monte
Carlo techniques. Results are presented for different deep neural networks on a
variety of problems. The asymptotic behaviour of the risk entropy acts in an
analogous way to the capacity of the learning machine, but the generalisation
performance experienced in practical situations is determined by the behaviour
of the risk entropy before the asymptotic regime is reached. This performance
is strongly dependent on the distribution of the data (features and targets)
and not just on the capacity of the learning machine.
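The abstract states that the risk entropy can be inferred empirically with Markov Chain Monte Carlo, but no procedure is given here. Below is a minimal sketch of one plausible reading, not the authors' implementation: a random-walk Metropolis sampler over the weights of a toy network, run at several inverse temperatures so that the recorded risks trace out the distribution of risks whose logarithm the paper terms the risk entropy. The toy data, network, and all names are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions throughout, not the authors' code):
# sample network weights w from p(w) proportional to exp(-beta * risk(w)) with
# random-walk Metropolis and record the risks. Histogramming risks across
# several betas approximates the distribution of risks.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data (stand-in for a real training set).
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0.0, 1.0, -1.0)

def risk(w):
    """Empirical risk of a tiny one-neuron model: mean squared error."""
    pred = np.tanh(X @ w[:2] + w[2])
    return np.mean((pred - y) ** 2)

def metropolis_risks(beta, n_steps=5000, step=0.1):
    """Random-walk Metropolis targeting exp(-beta * risk); returns sampled risks."""
    w = rng.normal(size=3)
    r = risk(w)
    risks = np.empty(n_steps)
    for t in range(n_steps):
        w_prop = w + step * rng.normal(size=3)
        r_prop = risk(w_prop)
        # Metropolis acceptance: always accept downhill moves in risk,
        # accept uphill moves with probability exp(-beta * increase).
        if rng.random() < np.exp(-beta * (r_prop - r)):
            w, r = w_prop, r_prop
        risks[t] = r
    return risks

# Higher beta concentrates the sampler on low-risk weight configurations.
for beta in (0.0, 10.0, 100.0):
    rs = metropolis_risks(beta)
    print(f"beta={beta:6.1f}  mean risk={rs.mean():.3f}  min risk={rs.min():.3f}")
```

In the paper's framing one would then study how the log-volume of weights at or below a given risk falls off; this sketch only shows the sampling scaffold.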
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom suggests that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks [142.67349734180445]
Existing algorithms that provide risk-awareness to deep neural networks are complex and ad-hoc.
Here we present capsa, a framework for extending models with risk-awareness.
arXiv Detail & Related papers (2023-08-01T02:07:47Z)
- Computing large deviation prefactors of stochastic dynamical systems based on machine learning [4.474127100870242]
We present a large deviation theory that characterizes the exponential estimate for rare events of dynamical systems in the limit of weak noise.
We design a neural network framework to compute the quasipotential, most probable paths and prefactors based on the decomposition of the vector field.
Numerical experiments demonstrate its power in exploring the internal mechanisms of rare events triggered by weak random fluctuations.
arXiv Detail & Related papers (2023-06-20T09:59:45Z)
- IRL with Partial Observations using the Principle of Uncertain Maximum Entropy [8.296684637620553]
We introduce the principle of uncertain maximum entropy and present an expectation-maximization based solution.
We experimentally demonstrate the improved robustness to noisy data offered by our technique in a maximum causal entropy inverse reinforcement learning domain.
arXiv Detail & Related papers (2022-08-15T03:22:46Z)
- Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension [25.711297863946193]
We develop a theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features.
We provide a complete description of the joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit.
arXiv Detail & Related papers (2022-01-31T17:44:58Z)
- Structure-Preserving Learning Using Gaussian Processes and Variational Integrators [62.31425348954686]
We propose the combination of a variational integrator for the nominal dynamics of a mechanical system and learning residual dynamics with Gaussian process regression.
We extend our approach to systems with known kinematic constraints and provide formal bounds on the prediction uncertainty.
arXiv Detail & Related papers (2021-12-10T11:09:29Z)
- Asymptotic Risk of Overparameterized Likelihood Models: Double Descent Theory for Deep Neural Networks [12.132641563193582]
We investigate the risk of a general class of overparameterized likelihood models, including deep models.
We demonstrate that several explicit models, such as parallel deep neural networks and ensemble learning, are in agreement with our theory.
arXiv Detail & Related papers (2021-02-28T13:02:08Z)
- The Hidden Uncertainty in a Neural Network's Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs; a generic sketch of this latent-distribution idea appears after this list.
arXiv Detail & Related papers (2020-12-05T17:30:35Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why deep neural networks perform poorly under adversarial perturbations.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that multiplicative noise, which commonly arises due to variance in local rates of convergence, produces heavy-tailed stationary behaviour in the parameters.
A detailed analysis is conducted of how key factors, including the step size and the data, shape this behaviour, with similar results holding for state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
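As a side note to "The Hidden Uncertainty in a Neural Network's Activations" above: using the distribution of latent representations for OOD detection is often realised by fitting a Gaussian to in-distribution features and scoring new inputs by Mahalanobis distance. The sketch below illustrates that generic recipe under stated assumptions; it is not that paper's method, and the feature arrays are random stand-ins for penultimate-layer activations.

```python
# Generic latent-distribution OOD scoring (an assumption-laden sketch, not the
# paper's method): fit a Gaussian to in-distribution latent features, then
# score new latents by squared Mahalanobis distance (larger = more likely OOD).
import numpy as np

def fit_latent_gaussian(features):
    """Estimate mean and precision (inverse covariance) of latent features."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])  # small ridge for numerical stability
    return mu, np.linalg.inv(cov)

def ood_score(z, mu, precision):
    """Squared Mahalanobis distance of latent vector z from the fitted Gaussian."""
    d = z - mu
    return float(d @ precision @ d)

rng = np.random.default_rng(0)
train_latents = rng.normal(size=(1000, 16))  # stand-in for penultimate-layer activations
mu, precision = fit_latent_gaussian(train_latents)

in_dist = rng.normal(size=16)        # resembles the training features
shifted = rng.normal(size=16) + 5.0  # far from the training distribution
print(f"in-distribution score: {ood_score(in_dist, mu, precision):.1f}")
print(f"shifted (OOD) score:   {ood_score(shifted, mu, precision):.1f}")
```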
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.