SuperNet -- An efficient method of neural networks ensembling
- URL: http://arxiv.org/abs/2003.13021v1
- Date: Sun, 29 Mar 2020 13:47:13 GMT
- Title: SuperNet -- An efficient method of neural networks ensembling
- Authors: Ludwik Bukowski, Witold Dzwinel
- Abstract summary: The main flaw of neural network ensembling is that it is exceptionally computationally demanding.
The goal of the master's thesis is to speed up the execution time required for ensemble generation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main flaw of neural network ensembling is that it is exceptionally
demanding computationally, especially if the individual sub-models are large
neural networks that must be trained separately. Bearing in mind that modern
DNNs can be very accurate, that they are already huge ensembles of simple
classifiers, and that for any ensemble one can construct a more thrifty,
compressed neural net of similar performance, the idea of designing expensive
SuperNets can be questionable. The widespread belief that ensembling increases
prediction time makes it unattractive and may be the reason that the mainstream
of ML research is directed towards developing better loss functions and
learning strategies for more advanced and efficient neural networks. On the
other hand, all these factors make the architectures more complex, which may
lead to overfitting and high computational complexity, that is, to the same
flaws for which highly parametrized SuperNet ensembles are blamed. The goal of
the master's thesis is to speed up the execution time required for ensemble
generation. Instead of training K sub-models separately, they can be taken from
different phases of training of a single DNN, i.e., from different local minima
of the loss function [Huang et al., 2017; Garipov et al., 2018]. Thus, the
computational cost of the SuperNet is comparable to the CPU time spent on
training a single sub-model, plus the usually much shorter CPU time required
for training the SuperNet coupling factors.
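The coupling idea can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch illustration, not the thesis's actual code (the names SuperNet, sub_models, and coupling are assumptions for illustration): frozen snapshots of one DNN, saved at different training phases, are combined through learnable coupling factors, and only those factors are trained.

```python
import torch
import torch.nn as nn


class SuperNet(nn.Module):
    """Combine frozen snapshots of one DNN through learnable coupling factors."""

    def __init__(self, sub_models):
        super().__init__()
        self.sub_models = nn.ModuleList(sub_models)
        # Snapshots are treated as fixed sub-models: only the coupling
        # factors below are trained, which keeps the ensemble cheap.
        for m in self.sub_models:
            for p in m.parameters():
                p.requires_grad_(False)
        # One coupling factor per snapshot, softmax-normalised in forward().
        self.coupling = nn.Parameter(torch.zeros(len(self.sub_models)))

    def forward(self, x):
        w = torch.softmax(self.coupling, dim=0)              # (K,)
        outs = torch.stack([m(x) for m in self.sub_models])  # (K, B, C)
        return torch.einsum('k,kbc->bc', w, outs)            # weighted sum of logits


# Example usage: the snapshots would be checkpoints saved during one training run.
snapshots = [nn.Sequential(nn.Linear(20, 10)) for _ in range(3)]
supernet = SuperNet(snapshots)
optimizer = torch.optim.Adam([supernet.coupling], lr=1e-2)   # train coupling only
logits = supernet(torch.randn(4, 20))                        # shape (4, 10)
```

Because the snapshots come from a single training run and only the coupling vector is optimized, the extra cost over training one model stays small, which is the efficiency argument made in the abstract.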
Related papers
- Algebraic Representations for Faster Predictions in Convolutional Neural Networks [0.0]
Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision.
Skip connections may be added to create an easier gradient optimization problem.
We show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model.
arXiv Detail & Related papers (2024-08-14T21:10:05Z)
- Message Passing Variational Autoregressive Network for Solving Intractable Ising Models [6.261096199903392]
Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks.
Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables.
The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures.
arXiv Detail & Related papers (2024-04-09T11:27:07Z)
- A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation (see the sketch after this list).
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Dash: Accelerating Distributed Private Convolutional Neural Network Inference with Arithmetic Garbled Circuits [6.912820984005411]
We present Dash, a fast and distributed private convolutional neural network inference scheme secure against malicious attackers.
Building on arithmetic garbling gadgets [BMR16] and fancy-garbling [BCM+19], Dash is based purely on arithmetic garbled circuits.
arXiv Detail & Related papers (2023-02-13T13:48:08Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
- Rapid training of quantum recurrent neural network [26.087244189340858]
We propose a Quantum Recurrent Neural Network (QRNN) to address these obstacles.
The design of the network is based on the continuous-variable quantum computing paradigm.
Our numerical simulations show that the QRNN converges to optimal weights in fewer epochs than the classical network.
arXiv Detail & Related papers (2022-07-01T12:29:33Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- Simultaneous Training of Partially Masked Neural Networks [67.19481956584465]
We show that it is possible to train neural networks in such a way that a predefined 'core' subnetwork can be split off from the trained full network with remarkably good performance.
We show that training a Transformer with a low-rank core gives a low-rank model with better performance than training the low-rank model alone.
arXiv Detail & Related papers (2021-06-16T15:57:51Z)
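The "Solving Large-scale Spatial Problems with Convolutional Neural Networks" entry above notes that a CNN trained on small windows can be evaluated on arbitrarily large signals; this property follows from a fully convolutional design with no layer tied to a fixed input length. The sketch below only illustrates that property; the layer sizes and architecture are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

# A fully convolutional 1-D model: no flatten or fixed-size linear layer,
# so the same weights apply to input windows of any length.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),  # per-position prediction head
)

small = torch.randn(8, 1, 128)       # short training windows (batch of 8)
large = torch.randn(1, 1, 100_000)   # much longer signal at evaluation time
print(model(small).shape)            # torch.Size([8, 1, 128])
print(model(large).shape)            # torch.Size([1, 1, 100000])
```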
This list is automatically generated from the titles and abstracts of the papers in this site.