Subset Sampling For Progressive Neural Network Learning
- URL: http://arxiv.org/abs/2002.07141v2
- Date: Mon, 25 May 2020 17:47:19 GMT
- Title: Subset Sampling For Progressive Neural Network Learning
- Authors: Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis
- Abstract summary: Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
- Score: 106.12874293597754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Progressive Neural Network Learning is a class of algorithms that
incrementally construct the network's topology and optimize its parameters
based on the training data. While this approach exempts the users from the
manual task of designing and validating multiple network topologies, it often
requires an enormous number of computations. In this paper, we propose to speed
up this process by exploiting subsets of training data at each incremental
training step. Three different sampling strategies for selecting the training
samples according to different criteria are proposed and evaluated. We also
propose to perform online hyperparameter selection during the network
progression, which further reduces the overall training time. Experimental
results in object, scene and face recognition problems demonstrate that the
proposed approach speeds up the optimization procedure considerably while
operating on par with the baseline approach exploiting the entire training set
throughout the training process.
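As a rough illustration of the idea (not the authors' code), the sketch below grows a small PyTorch MLP block by block and trains each progression step on a uniformly sampled subset of the data. The paper's three criterion-based sampling strategies and its online hyperparameter selection are replaced here by plain random sampling and fixed hyperparameters.

```python
# Hedged sketch of progressive network learning with subset sampling.
import torch
import torch.nn as nn

class ProgressiveMLP(nn.Module):
    """Toy progressively grown network: each new block sees the input plus all
    previous block outputs, and the output head is rebuilt after every growth step."""
    def __init__(self, in_dim, num_classes):
        super().__init__()
        self.in_dim, self.num_classes = in_dim, num_classes
        self.blocks = nn.ModuleList()
        self.head = nn.Linear(in_dim, num_classes)

    def add_block(self, width):
        feat_dim = self.in_dim + sum(b.out_features for b in self.blocks)
        self.blocks.append(nn.Linear(feat_dim, width))
        self.head = nn.Linear(feat_dim + width, self.num_classes)

    def forward(self, x):
        feats = x
        for block in self.blocks:
            feats = torch.cat([feats, torch.relu(block(feats))], dim=1)
        return self.head(feats)

def random_subset(X, y, fraction=0.3):
    """Stand-in for the paper's sampling strategies: uniform random selection."""
    k = max(1, int(fraction * X.shape[0]))
    idx = torch.randperm(X.shape[0])[:k]
    return X[idx], y[idx]

def fit(model, X, y, epochs=100, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X), y)
        loss.backward()
        opt.step()

# Toy data and a few progression steps, each trained on a fresh data subset.
X, y = torch.randn(1000, 20), torch.randint(0, 3, (1000,))
model = ProgressiveMLP(in_dim=20, num_classes=3)
for step in range(4):
    model.add_block(width=16)
    Xs, ys = random_subset(X, y, fraction=0.3)  # incremental step uses a subset only
    fit(model, Xs, ys)
```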
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator networks (DeepONet) have demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
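A minimal, hedged sketch of the randomization idea (not the authors' implementation): a tiny DeepONet-style model whose trunk network is evaluated, at each training step, only on a random subset of the query coordinates rather than the full grid. All names and sizes below are illustrative.

```python
# Tiny DeepONet-style model with randomized trunk inputs per training step.
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    def __init__(self, n_sensors, coord_dim=1, width=64, p=32):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(), nn.Linear(width, p))
        self.trunk = nn.Sequential(nn.Linear(coord_dim, width), nn.Tanh(), nn.Linear(width, p))

    def forward(self, u_sensors, coords):
        b = self.branch(u_sensors)   # (batch, p) function embedding
        t = self.trunk(coords)       # (n_points, p) coordinate embedding
        return b @ t.T               # (batch, n_points) predicted outputs

# Toy data: input functions sampled at 50 sensors, targets on a 200-point grid.
batch, n_sensors, n_grid = 16, 50, 200
u = torch.randn(batch, n_sensors)
grid = torch.linspace(0, 1, n_grid).unsqueeze(1)
targets = torch.randn(batch, n_grid)

model = TinyDeepONet(n_sensors)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    # Randomized trunk inputs: train on a random quarter of the grid each step.
    idx = torch.randperm(n_grid)[: n_grid // 4]
    pred = model(u, grid[idx])
    loss = ((pred - targets[:, idx]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```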
arXiv Detail & Related papers (2024-09-20T07:18:31Z) - Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer [9.849498498869258]
We introduce the class of SO-friendly neural networks, which include several models used in practice.
Performing a precise line search to set the step size has the same cost during full-batch training as using a fixed learning rate.
For the same cost, a plane search can be used to set both the learning rate and the momentum rate on each step.
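To make the idea concrete, here is a hedged sketch (not the paper's algorithm): on each full-batch iteration, candidate updates in the plane spanned by the current gradient and the previous step are evaluated over a small grid of learning-rate/momentum pairs, and the pair with the lowest loss is taken. The SO-friendly structure that makes this cheap in the paper is not reproduced here.

```python
# Naive grid-based "plane search" over learning rate and momentum per iteration.
import numpy as np

def loss_and_grad(w, X, y):
    """Full-batch logistic-regression loss and gradient (stand-in model)."""
    z = X @ w
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)

w = np.zeros(10)
prev_step = np.zeros(10)
lrs, moms = [0.01, 0.1, 0.5, 1.0], [0.0, 0.5, 0.9]
for it in range(50):
    _, g = loss_and_grad(w, X, y)
    # Plane search: the update lies in span{gradient, previous step};
    # try every (lr, momentum) pair and keep the best candidate.
    best = None
    for lr in lrs:
        for mom in moms:
            step = -lr * g + mom * prev_step
            cand_loss, _ = loss_and_grad(w + step, X, y)
            if best is None or cand_loss < best[0]:
                best = (cand_loss, step)
    prev_step = best[1]
    w = w + prev_step
```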
arXiv Detail & Related papers (2024-06-25T22:06:40Z) - SpikePipe: Accelerated Training of Spiking Neural Networks via Inter-Layer Pipelining and Multiprocessor Scheduling [5.2831841848274985]
Training Spiking Neural Networks (SNNs) is computationally expensive compared to their conventional counterparts.
This is the first paper to propose inter-layer pipelining to accelerate training in SNNs using systolic array-based processors and multiprocessor scheduling.
arXiv Detail & Related papers (2024-06-11T01:43:45Z) - Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn online the optimal source placement in large-scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
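As a heavily simplified illustration (and not Grab-UCB itself), the sketch below runs a classical UCB1 bandit over a hypothetical set of candidate source nodes; it ignores the adaptive graph-dictionary model and the spectral structure the paper uses to share information across arms.

```python
# Classical UCB1 over hypothetical candidate source placements.
import math
import random

candidate_nodes = list(range(20))   # hypothetical candidate source nodes
counts = [0] * len(candidate_nodes)
means = [0.0] * len(candidate_nodes)

def network_reward(node):
    """Placeholder for the unknown feedback of the network process."""
    return random.gauss(mu=node / 20.0, sigma=0.1)

for t in range(1, 1001):
    if 0 in counts:
        arm = counts.index(0)       # play every arm once first
    else:
        # pick the arm maximizing the UCB index
        arm = max(range(len(candidate_nodes)),
                  key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
    r = network_reward(candidate_nodes[arm])
    counts[arm] += 1
    means[arm] += (r - means[arm]) / counts[arm]
```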
arXiv Detail & Related papers (2023-07-07T15:03:42Z) - Deep Active Learning with Structured Neural Depth Search [18.180995603975422]
Active-iNAS trains several models and selects the model with the best generalization performance for querying the subsequent samples after each active learning cycle.
We propose a novel active learning strategy using a method called structured variational inference (SVI), or structured neural depth search (SNDS).
At the same time, we theoretically demonstrate that the current VI-based methods based on the mean-field assumption could lead to poor performance.
arXiv Detail & Related papers (2023-06-05T12:00:12Z) - Explaining Deep Learning Representations by Tracing the Training Process [10.774699463547439]
We propose a novel explanation method that explains the decisions of a deep neural network.
We investigate how the intermediate representations at each layer of the deep network were refined during the training process.
We show that our method identifies highly representative training instances that can be used as an explanation.
arXiv Detail & Related papers (2021-09-13T11:29:04Z) - Improving the Accuracy of Early Exits in Multi-Exit Architectures via Curriculum Learning [88.17413955380262]
Multi-exit architectures allow deep neural networks to terminate their execution early in order to adhere to tight deadlines at the cost of accuracy.
We introduce a novel method called Multi-Exit Curriculum Learning that utilizes curriculum learning.
Our method consistently improves the accuracy of early exits compared to the standard training approach.
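For orientation, a minimal sketch of a multi-exit network (this is not the paper's training recipe, and the curriculum-learning schedule itself is not shown): two classifier heads at different depths, trained jointly with a summed loss, with confidence-thresholded early exit at inference.

```python
# Two-exit network with confidence-based early exit at inference time.
import torch
import torch.nn as nn

class TwoExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.exit1 = nn.Linear(hidden, num_classes)   # early exit head
        self.stage2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.exit2 = nn.Linear(hidden, num_classes)   # final exit head

    def forward(self, x):
        h1 = self.stage1(x)
        return self.exit1(h1), self.exit2(self.stage2(h1))  # logits from both exits

    @torch.no_grad()
    def predict_anytime(self, x, threshold=0.9):
        """Single-example inference: stop at exit1 if it is confident enough."""
        h1 = self.stage1(x)
        p1 = torch.softmax(self.exit1(h1), dim=-1)
        if p1.max() >= threshold:
            return p1.argmax(dim=-1), "exit1"
        p2 = torch.softmax(self.exit2(self.stage2(h1)), dim=-1)
        return p2.argmax(dim=-1), "exit2"

# Joint training of both exits with a simple summed loss (standard baseline).
net = TwoExitNet()
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
logits1, logits2 = net(x)
loss = nn.functional.cross_entropy(logits1, y) + nn.functional.cross_entropy(logits2, y)
pred, used_exit = net.predict_anytime(torch.randn(1, 32))
```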
arXiv Detail & Related papers (2021-04-21T11:12:35Z) - All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
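A schematic sketch of the block-swapping idea only (quantization itself is omitted, and this is not the authors' code): each student block is replaced, with some probability, by the corresponding frozen teacher block during the forward pass, so the teacher participates in the student's computation graph.

```python
# Random block swapping between a frozen "teacher" and a trainable "student".
import random
import torch
import torch.nn as nn

def make_blocks(width=64, depth=4):
    return nn.ModuleList(
        nn.Sequential(nn.Linear(width, width), nn.ReLU()) for _ in range(depth)
    )

teacher_blocks = make_blocks()   # stands in for the high-precision network
student_blocks = make_blocks()   # stands in for the low-precision network
for p in teacher_blocks.parameters():
    p.requires_grad_(False)      # teacher is frozen; gradients still flow through it

def mixed_forward(x, swap_prob=0.5):
    h = x
    for t_block, s_block in zip(teacher_blocks, student_blocks):
        # with probability swap_prob, use the teacher block in place of the student's
        h = t_block(h) if random.random() < swap_prob else s_block(h)
    return h

x = torch.randn(8, 64)
loss = mixed_forward(x).pow(2).mean()   # placeholder objective
loss.backward()                         # only student blocks accumulate gradients
```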
arXiv Detail & Related papers (2021-03-02T03:09:03Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
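A minimal sketch of the line case with a one-layer model (not the paper's method, which among other things regularizes the endpoints apart): two endpoint weight sets are trained jointly by evaluating the loss at a random point on the connecting line each step, so every interpolation yields a usable model at roughly the cost of one training run.

```python
# Learning a "line" of models: train two weight endpoints via random interpolations.
import torch

in_dim, num_classes, n = 20, 3, 512
X = torch.randn(n, in_dim)
y = torch.randint(0, num_classes, (n,))

# Endpoints of the line in weight space.
w0 = torch.randn(in_dim, num_classes, requires_grad=True)
w1 = torch.randn(in_dim, num_classes, requires_grad=True)
b0 = torch.zeros(num_classes, requires_grad=True)
b1 = torch.zeros(num_classes, requires_grad=True)

opt = torch.optim.SGD([w0, w1, b0, b1], lr=0.1)
for step in range(200):
    t = torch.rand(())                 # random point on the line each step
    w = (1 - t) * w0 + t * w1
    b = (1 - t) * b0 + t * b1
    loss = torch.nn.functional.cross_entropy(X @ w + b, y)
    opt.zero_grad(); loss.backward(); opt.step()

# Any t in [0, 1] now gives a trained model, e.g. the midpoint:
w_mid, b_mid = 0.5 * (w0 + w1), 0.5 * (b0 + b1)
```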
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Local Critic Training for Model-Parallel Learning of Deep Neural Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that networks trained by the proposed method can be used for structural optimization.
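Sketching the decoupling idea very roughly (details differ from the paper): the first layer group is updated through a small local critic that estimates the task loss from its activations, the second group is updated on detached activations, and the critic is regressed toward the true loss of the full network.

```python
# Rough sketch of local-critic style decoupled updates for two layer groups.
import torch
import torch.nn as nn
import torch.nn.functional as F

group1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # first layer group
group2 = nn.Sequential(nn.Linear(64, 10))              # second (output) layer group
critic = nn.Linear(64, 10)                             # local critic attached to group1

opt_g1 = torch.optim.Adam(group1.parameters(), lr=1e-3)
opt_g2 = torch.optim.Adam(group2.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

# 1) group1 updates through the critic's loss estimate, without waiting for group2.
h1 = group1(x)
est_loss = F.cross_entropy(critic(h1), y)
opt_g1.zero_grad(); est_loss.backward(); opt_g1.step()

# 2) group2 updates on detached activations, decoupled from group1's update.
h1_det = group1(x).detach()
true_loss = F.cross_entropy(group2(h1_det), y)
opt_g2.zero_grad(); true_loss.backward(); opt_g2.step()

# 3) the critic is trained so its estimate tracks the true loss of the full network.
gap = (F.cross_entropy(critic(h1_det), y) - true_loss.detach()) ** 2
opt_c.zero_grad(); gap.backward(); opt_c.step()
```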
arXiv Detail & Related papers (2021-02-03T09:30:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.