SAE: Sequential Anchored Ensembles
- URL: http://arxiv.org/abs/2201.00649v1
- Date: Thu, 30 Dec 2021 12:47:27 GMT
- Title: SAE: Sequential Anchored Ensembles
- Authors: Arnaud Delaunoy, Gilles Louppe
- Abstract summary: We present Sequential Anchored Ensembles (SAE), a lightweight alternative to anchored ensembles.
Instead of training each member of the ensemble from scratch, the members are trained sequentially on losses sampled with high auto-correlation.
For a given computational budget, SAE outperforms anchored ensembles on some benchmarks while showing comparable performance on the others.
- Score: 7.888755225607877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computing the Bayesian posterior of a neural network is a challenging task
due to the high-dimensionality of the parameter space. Anchored ensembles
approximate the posterior by training an ensemble of neural networks on
anchored losses designed for the optima to follow the Bayesian posterior.
Training an ensemble, however, becomes computationally expensive as its number
of members grows since the full training procedure is repeated for each member.
In this note, we present Sequential Anchored Ensembles (SAE), a lightweight
alternative to anchored ensembles. Instead of training each member of the
ensemble from scratch, the members are trained sequentially on losses sampled
with high auto-correlation, hence enabling fast convergence of the neural
networks and efficient approximation of the Bayesian posterior. For a given
computational budget, SAE outperforms anchored ensembles on some benchmarks
while showing comparable performance on the others, and achieved 2nd and 3rd
place in the light and extended tracks of the NeurIPS 2021 Approximate
Inference in Bayesian Deep Learning competition.
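To make the mechanism concrete, here is a minimal, illustrative sketch of anchored losses and the sequential variant described in the abstract: each ensemble member minimises a data-fit term plus a quadratic pull of its weights toward a Gaussian "anchor", and successive anchors are drawn from an auto-correlated chain so that each member can be warm-started from the previous one. The toy data, network, hyperparameters, and the AR(1) anchor chain are assumptions made for illustration, not the authors' exact procedure.
```python
# Illustrative sketch of anchored ensembles with a sequential, auto-correlated
# anchor chain (SAE-style). All hyperparameters and the toy data are assumed.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1-D regression data (assumption for illustration).
x = torch.linspace(-3, 3, 128).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

prior_std = 1.0       # std of the Gaussian prior / anchor distribution
noise_var = 0.1 ** 2  # assumed observation-noise variance
rho = 0.9             # auto-correlation between successive anchors

def make_net():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

def anchored_loss(net, anchors, x, y):
    # Data fit plus a quadratic pull of the weights toward this member's anchor,
    # so that the optimum approximates a sample from the Bayesian posterior.
    data_fit = ((net(x) - y) ** 2).sum() / noise_var
    pull = sum(((p - a) ** 2).sum() for p, a in zip(net.parameters(), anchors))
    return data_fit + pull / prior_std ** 2

def train(net, anchors, steps):
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        anchored_loss(net, anchors, x, y).backward()
        opt.step()

net = make_net()
# First anchor is drawn i.i.d. from the prior; the first member trains from scratch.
anchors = [prior_std * torch.randn_like(p) for p in net.parameters()]
train(net, anchors, steps=2000)
ensemble = [copy.deepcopy(net)]

# Later anchors follow an AR(1) chain (high auto-correlation), and each member is
# warm-started from the previous one, so far fewer optimisation steps are needed.
for _ in range(4):
    anchors = [rho * a + (1.0 - rho ** 2) ** 0.5 * prior_std * torch.randn_like(a)
               for a in anchors]
    train(net, anchors, steps=200)
    ensemble.append(copy.deepcopy(net))

# The ensemble's predictive mean approximates the Bayesian posterior predictive mean.
with torch.no_grad():
    mean_pred = torch.stack([m(x) for m in ensemble]).mean(0)
```
Because successive anchored losses differ only slightly, warm-started members converge in far fewer steps than members trained from scratch, which is where the computational savings described above come from.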
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z)
- Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning [24.3370326359959]
We propose to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep neural networks.
We theoretically validate that our approach mitigates overconfidence "far away" from the training data and empirically compare against state-of-the-art baselines on standard uncertainty quantification benchmarks.
arXiv Detail & Related papers (2021-11-05T15:52:48Z)
- Total Recall: a Customized Continual Learning Method for Neural Semantic Parsers [38.035925090154024]
A neural semantic parser learns tasks sequentially without accessing full training data from previous tasks.
We propose TotalRecall, a continual learning method designed for neural semantic parsers from two aspects.
We demonstrate that a neural semantic parser trained with TotalRecall achieves superior performance to one trained directly with the SOTA continual learning algorithms, and achieves a 3-6 times speedup compared to re-training from scratch.
arXiv Detail & Related papers (2021-09-11T04:33:28Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- Greedy Bayesian Posterior Approximation with Deep Ensembles [22.466176036646814]
Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning.
We show that our method is submodular with respect to the mixture of components for any problem in a function space.
arXiv Detail & Related papers (2021-05-29T11:35:27Z)
- Bayesian Deep Ensembles via the Neural Tangent Kernel [49.569912265882124]
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK).
We introduce a simple modification to standard deep ensembles training, through the addition of a computationally tractable, randomised and untrainable function to each ensemble member.
We prove that our Bayesian deep ensembles make more conservative predictions than standard deep ensembles in the infinite width limit.
arXiv Detail & Related papers (2020-07-11T22:10:52Z)
- Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
arXiv Detail & Related papers (2020-02-17T18:57:33Z)
- BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning [46.768185367275564]
BatchEnsemble is an ensemble method whose computational and memory costs are significantly lower than typical ensembles.
We show that BatchEnsemble yields accuracy and uncertainty estimates competitive with those of typical ensembles.
We also apply BatchEnsemble to lifelong learning, where on Split-CIFAR-100, BatchEnsemble yields comparable performance to progressive neural networks.
arXiv Detail & Related papers (2020-02-17T00:00:59Z)
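As a concrete counterpoint to full ensembles, the following is a rough, hypothetical sketch of the rank-1 "fast weight" idea behind a BatchEnsemble-style linear layer; the shapes, initialisation, and the explicit loop over members are illustrative assumptions rather than the paper's implementation.
```python
# Hypothetical sketch of a BatchEnsemble-style linear layer: one shared ("slow")
# weight matrix plus per-member rank-1 ("fast") factors, so extra ensemble
# members add almost no memory compared with a full deep ensemble.
import torch
import torch.nn as nn

class BatchEnsembleLinear(nn.Module):
    def __init__(self, in_features, out_features, n_members):
        super().__init__()
        self.shared = nn.Linear(in_features, out_features)          # shared slow weights
        self.r = nn.Parameter(torch.ones(n_members, in_features))   # per-member input factor
        self.s = nn.Parameter(torch.ones(n_members, out_features))  # per-member output factor
        # Note: real implementations typically use random sign initialisation for
        # r and s so that the members diverge during training.

    def forward(self, x, member):
        # Member i's effective weight is W * (s_i r_i^T), implemented by scaling the
        # input by r_i and the output by s_i instead of materialising the matrix.
        return self.shared(x * self.r[member]) * self.s[member]

layer = BatchEnsembleLinear(in_features=8, out_features=4, n_members=4)
x = torch.randn(16, 8)
# Averaging the members' outputs gives a cheap ensemble prediction.
preds = torch.stack([layer(x, m) for m in range(4)]).mean(0)
```
Since only two small vectors are duplicated per member while the full weight matrix is shared, adding members is almost free in memory, which is what makes this style of ensemble attractive for the lifelong-learning setting mentioned above.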