Traversing Between Modes in Function Space for Fast Ensembling
- URL: http://arxiv.org/abs/2306.11304v1
- Date: Tue, 20 Jun 2023 05:52:26 GMT
- Title: Traversing Between Modes in Function Space for Fast Ensembling
- Authors: EungGu Yun, Hyungi Lee, Giung Nam, Juho Lee
- Abstract summary: "Bridge" is a lightweight network that takes minimal features from the original network and predicts outputs for the low-loss subspace without forward passes through the original network.
We empirically demonstrate that we can indeed train such bridge networks and significantly reduce inference costs with the help of bridge networks.
- Score: 15.145136272169946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep ensemble is a simple yet powerful way to improve the performance of deep
neural networks. Under this motivation, recent works on mode connectivity have
shown that parameters of ensembles are connected by low-loss subspaces, and one
can efficiently collect ensemble parameters in those subspaces. While this
provides a way to efficiently train ensembles, for inference, multiple forward
passes should still be executed using all the ensemble parameters, which often
becomes a serious bottleneck for real-world deployment. In this work, we
propose a novel framework to reduce such costs. Given a low-loss subspace
connecting two modes of a neural network, we build an additional neural network
that predicts the output of the original neural network evaluated at a certain
point in the low-loss subspace. The additional neural network, which we call a
"bridge", is a lightweight network that takes minimal features from the
original network and predicts outputs for the low-loss subspace without forward
passes through the original network. We empirically demonstrate that we can
indeed train such bridge networks and significantly reduce inference costs with
the help of bridge networks.
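To make the idea concrete, the following PyTorch snippet is a minimal sketch of such a bridge (an illustration under our own assumptions, not the authors' implementation): a small MLP takes an intermediate feature vector produced by a single forward pass of the base network, together with a coordinate t identifying a point on the low-loss curve, and regresses the logits that the full network parameterized at that point would produce.
```python
import torch
import torch.nn as nn

class BridgeNet(nn.Module):
    """Minimal sketch of a 'bridge' network (illustrative, not the paper's code).

    Maps an intermediate feature from the base network plus a subspace
    coordinate t in [0, 1] to the logits the base network would produce if its
    parameters were set to the point theta(t) on the low-loss curve.
    """
    def __init__(self, feat_dim: int, num_classes: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden),   # +1 for the subspace coordinate t
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, feats: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # feats: (batch, feat_dim) from ONE forward pass of the base network;
        # t: (batch, 1) coordinate of a point on the connecting low-loss curve.
        return self.mlp(torch.cat([feats, t], dim=-1))

# Training target (conceptually): match the base network evaluated at theta(t),
# e.g. minimize KL(softmax(base(x; theta(t))) || softmax(bridge(feats, t))).
# At test time, averaging bridge outputs over several t values approximates the
# subspace ensemble at roughly the cost of a single base forward pass.
```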
Related papers
- D'OH: Decoder-Only Random Hypernetworks for Implicit Neural Representations [24.57801400001629]
We present a strategy for the optimization of runtime deep implicit functions for single-instance signals through a Decoder-Only randomly projected Hypernetwork (D'OH)
By directly changing the latent code dimension, we provide a natural way to vary the memory footprint of neural representations without the costly need for neural architecture search.
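As a rough sketch of the decoder-only idea (our own toy construction, not the paper's code), a small trainable latent code can be expanded by fixed random projections into the weights of a coordinate MLP, so the stored parameter count is set by the latent dimension alone:
```python
import math
import torch
import torch.nn as nn

class RandomHypernetMLP(nn.Module):
    """Sketch: trainable latent code -> fixed random projections -> weights of a
    tiny coordinate MLP (e.g. 2-D pixel coordinates -> RGB). Only `z` is stored;
    the random projections can be regenerated from the seed."""
    def __init__(self, latent_dim=64, hidden=32, in_dim=2, out_dim=3, seed=0):
        super().__init__()
        self.z = nn.Parameter(torch.zeros(latent_dim))
        self.shapes = [(hidden, in_dim), (hidden,), (out_dim, hidden), (out_dim,)]
        g = torch.Generator().manual_seed(seed)
        # Fixed (non-trainable) projections from the latent code to each
        # weight/bias tensor of the target MLP.
        self.projs = [torch.randn(math.prod(s), latent_dim, generator=g)
                      / math.sqrt(latent_dim) for s in self.shapes]

    def forward(self, coords):                      # coords: (batch, in_dim)
        w1, b1, w2, b2 = [(p @ self.z).reshape(s)
                          for p, s in zip(self.projs, self.shapes)]
        h = torch.relu(coords @ w1.T + b1)
        return h @ w2.T + b2
```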
arXiv Detail & Related papers (2024-03-28T06:18:12Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
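For reference, the standard interval-arithmetic construction of an $\ell_\infty$ box over-approximation for a feedforward ReLU network looks like the sketch below; it illustrates the kind of certificate involved, not the paper's embedded-network formulation.
```python
import torch

def interval_bounds(weights, biases, x, eps):
    """Propagate an l_inf box [x - eps, x + eps] through a ReLU network.

    Standard interval arithmetic: for y = W h + b, a box in center/radius form
    maps to center W c + b and radius |W| r; ReLU is then applied elementwise.
    This is a generic over-approximation of the reachable set.
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        mid, rad = (lo + hi) / 2, (hi - lo) / 2
        mid = mid @ W.T + b
        rad = rad @ W.abs().T
        lo, hi = mid - rad, mid + rad
        if i < len(weights) - 1:          # ReLU on hidden layers only
            lo, hi = lo.clamp(min=0), hi.clamp(min=0)
    return lo, hi                          # elementwise bounds on the logits
```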
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Fast Conditional Network Compression Using Bayesian HyperNetworks [54.06346724244786]
We introduce the conditional compression problem: quickly compressing a pretrained large neural network into smaller networks that are optimal for given target contexts, and we propose a fast framework for tackling it.
Our methods can quickly generate compressed networks with significantly smaller sizes than baseline methods.
arXiv Detail & Related papers (2022-05-13T00:28:35Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
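A rough sketch of the "free tickets" idea (our simplification; the helper train_one_phase is hypothetical and the paper's two ensemble methods differ in detail): one dynamic-sparse-training run is split into phases, and the sparse subnetwork reached at the end of each phase is snapshotted as an ensemble member.
```python
import copy
import torch
import torch.nn.functional as F

def collect_free_tickets(model, train_loader, train_one_phase, num_tickets=3):
    """Snapshot one sparse subnetwork per training phase of a single run."""
    tickets = []
    for _ in range(num_tickets):
        # `train_one_phase` is assumed to train with dynamic sparsity
        # (periodic prune-and-regrow of connections) for one phase.
        train_one_phase(model, train_loader)
        tickets.append(copy.deepcopy(model.state_dict()))
    return tickets

@torch.no_grad()
def ensemble_predict(model, tickets, x):
    """Average the softmax predictions of the collected sparse tickets."""
    probs = 0.0
    for state in tickets:
        model.load_state_dict(state)
        probs = probs + F.softmax(model(x), dim=-1)
    return probs / len(tickets)
```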
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- Binary Neural Network for Speaker Verification [13.472791713805762]
This paper focuses on how to apply binary neural networks to the task of speaker verification.
Experiment results show that, after binarizing the Convolutional Neural Network, the ResNet34-based network achieves an EER of around 5%.
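For context, weight binarization is commonly implemented with a sign function in the forward pass and a straight-through estimator in the backward pass; the snippet below shows that standard construction, not the specific ResNet34-based network evaluated in the paper.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Sign in the forward pass, straight-through (clipped identity) gradient."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # gradient only inside [-1, 1]

class BinaryConv2d(nn.Conv2d):
    """Conv layer whose weights are binarized on the fly."""
    def forward(self, x):
        return F.conv2d(x, BinarizeSTE.apply(self.weight), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)
```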
arXiv Detail & Related papers (2021-04-06T06:04:57Z)
- Artificial Neural Networks generated by Low Discrepancy Sequences [59.51653996175648]
We generate artificial neural networks as random walks on a dense network graph.
Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards.
We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
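As a loose illustration only (the paper constructs networks as random walks on a dense network graph; the toy layer-mask construction below is our own), a low-discrepancy sequence can be used to lay out sparse connectivity directly:
```python
import torch

def sobol_sparse_mask(in_features, out_features, num_connections, seed=0):
    """Toy sketch: a 2-D Sobol (low-discrepancy) sequence chooses which
    (input, output) connections exist in a sparse linear layer, giving a
    sparse-from-scratch pattern without training-then-pruning a dense layer."""
    engine = torch.quasirandom.SobolEngine(dimension=2, scramble=True, seed=seed)
    pts = engine.draw(num_connections)                   # points in [0, 1)^2
    rows = (pts[:, 0] * out_features).long().clamp(max=out_features - 1)
    cols = (pts[:, 1] * in_features).long().clamp(max=in_features - 1)
    mask = torch.zeros(out_features, in_features)
    mask[rows, cols] = 1.0
    return mask           # multiply elementwise with a weight matrix: W * mask

# Example: a 512x256 layer with only ~4k active connections.
mask = sobol_sparse_mask(256, 512, num_connections=4096)
```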
arXiv Detail & Related papers (2021-03-05T08:45:43Z)
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
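The core ingredient can be sketched generically: a VAE over a layer's flattened parameter vectors, trained with a reconstruction term plus a KL regularizer, from which cheap parameter samples can be drawn at test time. This is a minimal stand-in, not the LP-BNN architecture.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerWeightVAE(nn.Module):
    """Generic VAE over a layer's parameter vectors (illustrative stand-in)."""
    def __init__(self, weight_dim, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(weight_dim, 2 * latent_dim)   # outputs mu and logvar
        self.dec = nn.Linear(latent_dim, weight_dim)

    def forward(self, w):
        mu, logvar = self.enc(w).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        w_hat = self.dec(z)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
        recon = F.mse_loss(w_hat, w)
        return w_hat, recon + kl            # reconstruction + KL regularizer
```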
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
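A minimal sketch of iterative mask discovery by weight magnitude, in the spirit of the summary (the helper train_fn is hypothetical and ESPN's actual algorithm differs):
```python
import torch

def iterative_magnitude_mask(model, train_fn, target_sparsity=0.99, rounds=5):
    """Alternate short training phases with magnitude pruning until the target
    sparsity is reached, keeping a persistent 0/1 mask per weight tensor."""
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
             if p.dim() > 1}                      # prune weight matrices only
    for k in range(1, rounds + 1):
        train_fn(model, masks)                    # assumed to apply masks each step
        # gradually ramp sparsity so the final round hits target_sparsity
        sparsity = 1.0 - (1.0 - target_sparsity) ** (k / rounds)
        for name, mask in masks.items():
            w = dict(model.named_parameters())[name].detach().abs() * mask
            keep = int((1.0 - sparsity) * w.numel())
            if keep > 0:
                thresh = w.flatten().topk(keep).values.min()
                masks[name] = (w >= thresh).float()
    return masks
```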
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Compact Neural Representation Using Attentive Network Pruning [1.0152838128195465]
We describe a Top-Down attention mechanism that is added to a Bottom-Up feedforward network to select important connections and subsequently prune redundant ones at all parametric layers.
Our method not only introduces a novel hierarchical selection mechanism as the basis of pruning but also remains competitive with previous baseline methods in the experimental evaluation.
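As a stand-in illustration (a squeeze-and-excitation-style channel gate rather than the paper's Top-Down attention mechanism), attention scores can be accumulated over a dataset and the lowest-scoring channels pruned:
```python
import torch
import torch.nn as nn

class ChannelAttentionGate(nn.Module):
    """Score each channel of a feature map and gate it by its importance."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, feats):                       # feats: (B, C, H, W)
        pooled = feats.mean(dim=(2, 3))             # global average pool -> (B, C)
        attn = self.score(pooled)                   # per-channel importance in (0, 1)
        return feats * attn[:, :, None, None], attn

def prune_mask(attn_history, keep_ratio=0.5):
    """Keep the channels whose average attention score is in the top fraction."""
    mean_score = attn_history.mean(dim=0)           # attn_history: (N, C)
    k = max(1, int(keep_ratio * mean_score.numel()))
    thresh = mean_score.topk(k).values.min()
    return (mean_score >= thresh).float()
```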
arXiv Detail & Related papers (2020-05-10T03:20:01Z)
- Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
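A toy illustration of per-layer mixed precision with bitwidths that decrease with depth (our own schedule and quantizer, not the paper's policy):
```python
import torch

def quantize(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def progressively_decreasing_bitwidths(num_layers, start_bits=8, end_bits=2):
    """Toy schedule: early layers keep more bits, later layers get fewer."""
    span = max(1, num_layers - 1)
    return [round(start_bits - (start_bits - end_bits) * i / span)
            for i in range(num_layers)]

# Example usage: quantize each layer's weights with its assigned bitwidth.
# bits = progressively_decreasing_bitwidths(len(layers))
# for layer, b in zip(layers, bits):
#     layer.weight.data = quantize(layer.weight.data, b)
```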
arXiv Detail & Related papers (2019-12-29T14:11:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.