SAE: Sequential Anchored Ensembles
- URL: http://arxiv.org/abs/2201.00649v1
- Date: Thu, 30 Dec 2021 12:47:27 GMT
- Title: SAE: Sequential Anchored Ensembles
- Authors: Arnaud Delaunoy, Gilles Louppe
- Abstract summary: We present Sequential Anchored Ensembles (SAE), a lightweight alternative to anchored ensembles.
Instead of training each member of the ensemble from scratch, the members are trained sequentially on losses sampled with high auto-correlation.
For a given computational budget, SAE outperforms anchored ensembles on some benchmarks while showing comparable performance on the others.
- Score: 7.888755225607877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computing the Bayesian posterior of a neural network is a challenging task
due to the high-dimensionality of the parameter space. Anchored ensembles
approximate the posterior by training an ensemble of neural networks on
anchored losses designed for the optima to follow the Bayesian posterior.
Training an ensemble, however, becomes computationally expensive as its number
of members grows since the full training procedure is repeated for each member.
In this note, we present Sequential Anchored Ensembles (SAE), a lightweight
alternative to anchored ensembles. Instead of training each member of the
ensemble from scratch, the members are trained sequentially on losses sampled
with high auto-correlation, hence enabling fast convergence of the neural
networks and efficient approximation of the Bayesian posterior. For a given
computational budget, SAE outperforms anchored ensembles on some benchmarks
while showing comparable performance on the others, and achieved 2nd and 3rd
place in the light and extended tracks of the NeurIPS 2021 Approximate
Inference in Bayesian Deep Learning competition.
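To make the mechanism concrete, here is a minimal, illustrative sketch of anchored losses and the sequential variant described in the abstract: each ensemble member minimises a data-fit term plus a quadratic pull of its weights toward a Gaussian "anchor", and successive anchors are drawn from an auto-correlated chain so that each member can be warm-started from the previous one. The toy data, network, hyperparameters, and the AR(1) anchor chain are assumptions made for illustration, not the authors' exact procedure.
```python
# Illustrative sketch of anchored ensembles with a sequential, auto-correlated
# anchor chain (SAE-style). All hyperparameters and the toy data are assumed.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1-D regression data (assumption for illustration).
x = torch.linspace(-3, 3, 128).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

prior_std = 1.0       # std of the Gaussian prior / anchor distribution
noise_var = 0.1 ** 2  # assumed observation-noise variance
rho = 0.9             # auto-correlation between successive anchors

def make_net():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

def anchored_loss(net, anchors, x, y):
    # Data fit plus a quadratic pull of the weights toward this member's anchor,
    # so that the optimum approximates a sample from the Bayesian posterior.
    data_fit = ((net(x) - y) ** 2).sum() / noise_var
    pull = sum(((p - a) ** 2).sum() for p, a in zip(net.parameters(), anchors))
    return data_fit + pull / prior_std ** 2

def train(net, anchors, steps):
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        anchored_loss(net, anchors, x, y).backward()
        opt.step()

net = make_net()
# First anchor is drawn i.i.d. from the prior; the first member trains from scratch.
anchors = [prior_std * torch.randn_like(p) for p in net.parameters()]
train(net, anchors, steps=2000)
ensemble = [copy.deepcopy(net)]

# Later anchors follow an AR(1) chain (high auto-correlation), and each member is
# warm-started from the previous one, so far fewer optimisation steps are needed.
for _ in range(4):
    anchors = [rho * a + (1.0 - rho ** 2) ** 0.5 * prior_std * torch.randn_like(a)
               for a in anchors]
    train(net, anchors, steps=200)
    ensemble.append(copy.deepcopy(net))

# The ensemble's predictive mean approximates the Bayesian posterior predictive mean.
with torch.no_grad():
    mean_pred = torch.stack([m(x) for m in ensemble]).mean(0)
```
Because successive anchored losses differ only slightly, warm-started members converge in far fewer steps than members trained from scratch, which is where the computational savings described above come from.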
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z)
- Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning [24.3370326359959]
We propose to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep neural networks.
We theoretically validate that our approach mitigates overconfidence "far away" from the training data and empirically compare against state-of-the-art baselines on standard uncertainty quantification benchmarks.
arXiv Detail & Related papers (2021-11-05T15:52:48Z)
- Total Recall: a Customized Continual Learning Method for Neural Semantic Parsers [38.035925090154024]
A neural semantic parser learns tasks sequentially without accessing full training data from previous tasks.
We propose TotalRecall, a continual learning method designed for neural semantic parsers from two aspects.
We demonstrate that a neural semantic parser trained with TotalRecall achieves superior performance to one trained directly with the SOTA continual learning algorithms, and achieves a 3-6 times speedup compared to re-training from scratch.
arXiv Detail & Related papers (2021-09-11T04:33:28Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- Greedy Bayesian Posterior Approximation with Deep Ensembles [22.466176036646814]
Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning.
We show that our method is submodular with respect to the mixture of components for any problem in a function space.
arXiv Detail & Related papers (2021-05-29T11:35:27Z)
- Bayesian Deep Ensembles via the Neural Tangent Kernel [49.569912265882124]
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK).
We introduce a simple modification to standard deep ensembles training, through the addition of a computationally tractable, randomised and untrainable function to each ensemble member.
We prove that our Bayesian deep ensembles make more conservative predictions than standard deep ensembles in the infinite width limit.
arXiv Detail & Related papers (2020-07-11T22:10:52Z)
- Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
arXiv Detail & Related papers (2020-02-17T18:57:33Z)
- BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning [46.768185367275564]
BatchEnsemble is an ensemble method whose computational and memory costs are significantly lower than typical ensembles.
We show that BatchEnsemble yields accuracy and uncertainty estimates competitive with those of typical ensembles.
We also apply BatchEnsemble to lifelong learning, where on Split-CIFAR-100, BatchEnsemble yields comparable performance to progressive neural networks.
arXiv Detail & Related papers (2020-02-17T00:00:59Z)
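As a concrete counterpoint to full ensembles, the following is a rough, hypothetical sketch of the rank-1 "fast weight" idea behind a BatchEnsemble-style linear layer; the shapes, initialisation, and the explicit loop over members are illustrative assumptions rather than the paper's implementation.
```python
# Hypothetical sketch of a BatchEnsemble-style linear layer: one shared ("slow")
# weight matrix plus per-member rank-1 ("fast") factors, so extra ensemble
# members add almost no memory compared with a full deep ensemble.
import torch
import torch.nn as nn

class BatchEnsembleLinear(nn.Module):
    def __init__(self, in_features, out_features, n_members):
        super().__init__()
        self.shared = nn.Linear(in_features, out_features)          # shared slow weights
        self.r = nn.Parameter(torch.ones(n_members, in_features))   # per-member input factor
        self.s = nn.Parameter(torch.ones(n_members, out_features))  # per-member output factor
        # Note: real implementations typically use random sign initialisation for
        # r and s so that the members diverge during training.

    def forward(self, x, member):
        # Member i's effective weight is W * (s_i r_i^T), implemented by scaling the
        # input by r_i and the output by s_i instead of materialising the matrix.
        return self.shared(x * self.r[member]) * self.s[member]

layer = BatchEnsembleLinear(in_features=8, out_features=4, n_members=4)
x = torch.randn(16, 8)
# Averaging the members' outputs gives a cheap ensemble prediction.
preds = torch.stack([layer(x, m) for m in range(4)]).mean(0)
```
Since only two small vectors are duplicated per member while the full weight matrix is shared, adding members is almost free in memory, which is what makes this style of ensemble attractive for the lifelong-learning setting mentioned above.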