Epistemic Neural Networks
- URL: http://arxiv.org/abs/2107.08924v8
- Date: Wed, 17 May 2023 21:17:29 GMT
- Title: Epistemic Neural Networks
- Authors: Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla,
Morteza Ibrahimi, Xiuyuan Lu, and Benjamin Van Roy
- Abstract summary: In principle, ensemble-based approaches produce effective joint predictions.
But the computational costs of training large ensembles can become prohibitive.
We introduce the epinet: an architecture that can supplement any conventional neural network.
- Score: 25.762699400341972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligence relies on an agent's knowledge of what it does not know. This
capability can be assessed based on the quality of joint predictions of labels
across multiple inputs. In principle, ensemble-based approaches produce
effective joint predictions, but the computational costs of training large
ensembles can become prohibitive. We introduce the epinet: an architecture that
can supplement any conventional neural network, including large pretrained
models, and can be trained with modest incremental computation to estimate
uncertainty. With an epinet, conventional neural networks outperform very large
ensembles, consisting of hundreds or more particles, with orders of magnitude
less computation. The epinet does not fit the traditional framework of Bayesian
neural networks. To accommodate development of approaches beyond BNNs, such as
the epinet, we introduce the epistemic neural network (ENN) as an interface for
models that produce joint predictions.
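The ENN interface described above can be sketched as a forward pass that takes both an input x and an epistemic index z, with a small epinet added on top of a frozen base network. A minimal numpy sketch — all layer sizes, the additive form, and the epinet head are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, for illustration only.
D_IN, D_HID, D_OUT, D_Z = 4, 8, 3, 5

# Base network: stands in for any conventional (possibly pretrained) model.
W1 = rng.normal(size=(D_IN, D_HID)) / np.sqrt(D_IN)
W2 = rng.normal(size=(D_HID, D_OUT)) / np.sqrt(D_HID)

def base_net(x):
    """Return the base prediction and the hidden features it was built from."""
    h = np.tanh(x @ W1)
    return h @ W2, h

# Epinet: a small head that sees the base features together with an
# epistemic index z, so its output varies with z.
E = rng.normal(size=(D_HID + D_Z, D_OUT * D_Z)) * 0.1

def epinet(features, z):
    tilde = np.concatenate([features, z])
    return np.tanh(tilde @ E).reshape(D_OUT, D_Z) @ z

def enn_forward(x, z):
    """ENN interface: the prediction depends on input x AND epistemic index z."""
    mu, h = base_net(x)
    return mu + epinet(h, z)

x = rng.normal(size=D_IN)
# Sampling many indices z gives a set of joint predictions whose spread
# reflects epistemic uncertainty, much like an ensemble's particles.
preds = np.stack([enn_forward(x, rng.normal(size=D_Z)) for _ in range(100)])
print(preds.shape)        # (100, 3): one prediction per sampled index
print(preds.std(axis=0))  # per-output spread across indices
```

In this sketch only the epinet head E would be trained, which is where the "modest incremental computation" comes from: the base network's weights stay fixed.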
Related papers
- Predictive Coding Networks and Inference Learning: Tutorial and Survey [0.7510165488300368]
Predictive coding networks (PCNs) are based on the neuroscientific framework of predictive coding.
Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm.
As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling.
arXiv Detail & Related papers (2024-07-04T18:39:20Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
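The tendency of predictions toward a constant value can be illustrated with the saturation mechanism of bounded activations. A toy sketch using a random, untrained tanh network — not the paper's trained-model experiments:

```python
import numpy as np

rng = np.random.default_rng(2)

# Random single-hidden-layer tanh network (untrained toy model).
W1 = rng.normal(size=(1, 32))
b1 = rng.normal(size=32)
W2 = rng.normal(size=(32, 1))

def net(x):
    return (np.tanh(x @ W1 + b1) @ W2)[:, 0]

# Push inputs far outside any plausible training range.
xs = np.array([[1e4], [1e6], [1e8]])
out = net(xs)
# Each tanh unit saturates to +/-1 for large inputs, so the network's
# output settles toward a constant as the input moves further OOD.
print(out)
```

Once every hidden unit is saturated, moving the input further changes the output negligibly, which is the constant-value behavior the summary describes.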
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later in training.
We discuss the relation of DSB to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z) - Neural Bayesian Network Understudy [13.28673601999793]
We show that a neural network can be trained to output conditional probabilities, providing approximately the same functionality as a Bayesian Network.
We propose two training strategies that allow encoding the independence relations inferred from a given causal structure into the neural network.
arXiv Detail & Related papers (2022-11-15T15:56:51Z) - Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of NTK has been devoted to typical neural network architectures, but remains incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - Knowledge Enhanced Neural Networks for relational domains [83.9217787335878]
We focus on a specific method, KENN, a Neural-Symbolic architecture that injects prior logical knowledge into a neural network.
In this paper, we propose an extension of KENN for relational data.
arXiv Detail & Related papers (2022-05-31T13:00:34Z) - AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification [0.9449650062296824]
We propose AutoDEUQ, an automated approach for generating an ensemble of deep neural networks.
We show that AutoDEUQ outperforms probabilistic backpropagation, Monte Carlo dropout, deep ensemble, distribution-free ensembles, and hyper ensemble methods on a number of regression benchmarks.
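The deep-ensemble baseline that methods like AutoDEUQ build on is simple: train several copies of the same model from independent random initializations and read uncertainty off their disagreement. A minimal numpy sketch of that baseline — the toy data, network size, and training loop are illustrative assumptions, not AutoDEUQ's automated search:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D regression data (hypothetical, for illustration only).
X = rng.uniform(-3, 3, size=(64, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=64)

def fit_member(seed, epochs=500, lr=0.05):
    """Train one small MLP on MSE from an independent random init."""
    r = np.random.default_rng(seed)
    W1 = r.normal(size=(1, 16)); b1 = np.zeros(16)
    W2 = r.normal(size=(16, 1)) / 4; b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        g = 2 * (pred - y[:, None]) / len(X)   # dMSE/dpred
        gW2 = h.T @ g; gb2 = g.sum(0)
        gh = (g @ W2.T) * (1 - h**2)           # backprop through tanh
        gW1 = X.T @ gh; gb1 = gh.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return lambda xq: np.tanh(xq @ W1 + b1) @ W2 + b2

# Deep ensemble: independent initializations, independent training runs.
members = [fit_member(s) for s in range(5)]
xq = np.array([[0.0], [6.0]])  # in-distribution vs. outside the data range
preds = np.stack([m(xq)[:, 0] for m in members])
print(preds.std(axis=0))  # member disagreement, read as uncertainty
```

Each ensemble member here costs a full training run, which is exactly the computational overhead the epinet abstract above aims to avoid.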
arXiv Detail & Related papers (2021-10-26T09:12:23Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Learn Like The Pro: Norms from Theory to Size Neural Computation [3.848947060636351]
We investigate how dynamical systems with nonlinearities can inform the design of neural systems that seek to emulate them.
We propose a Learnability metric and relate its associated features to the near-equilibrium behavior of learning dynamics.
It reveals exact sizing for a class of neural networks with multiplicative nodes that mimic continuous- or discrete-time dynamics.
arXiv Detail & Related papers (2021-06-21T20:58:27Z) - Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z) - Neural Rule Ensembles: Encoding Sparse Feature Interactions into Neural Networks [3.7277730514654555]
We use decision trees to capture relevant features and their interactions and define a mapping to encode extracted relationships into a neural network.
At the same time, through feature selection, it enables learning of compact representations compared to state-of-the-art tree-based approaches.
arXiv Detail & Related papers (2020-02-11T11:22:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.