Event-Based Control for Online Training of Neural Networks
- URL: http://arxiv.org/abs/2003.09503v1
- Date: Fri, 20 Mar 2020 21:29:03 GMT
- Title: Event-Based Control for Online Training of Neural Networks
- Authors: Zilong Zhao, Sophie Cerf, Bogdan Robu, Nicolas Marchand
- Abstract summary: We propose two Event-Based control loops to adjust the learning rate of a classical E (Exponential)/PD (Proportional-Derivative) control algorithm.
The first Event-Based control loop prevents a sudden drop of the learning rate when the model approaches the optimum.
The second Event-Based control loop decides, based on the learning speed, when to switch to the next data batch.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Convolutional Neural Network (CNN) has become the most widely
used method for image classification tasks. During training, the learning rate
and the gradient are two key factors to tune for influencing the convergence
speed of the model. Usual learning rate strategies are time-based, i.e.
monotonic decay over time. Recent state-of-the-art techniques focus on adaptive
gradient algorithms, e.g. Adam and its variants. In this paper we consider an
online learning scenario and we propose two Event-Based control loops to adjust
the learning rate of a classical E (Exponential)/PD (Proportional-Derivative)
control algorithm. The first Event-Based control loop prevents a sudden drop of
the learning rate when the model approaches the optimum. The second Event-Based
control loop decides, based on the learning speed, when to switch to the next
data batch. Experimental evaluation is provided using two state-of-the-art
machine learning image datasets (CIFAR-10 and CIFAR-100). Results show that
Event-Based E/PD outperforms the original algorithm (higher final accuracy,
lower final loss), and that Double-Event-Based E/PD accelerates training,
saving up to 67% of training time compared to state-of-the-art algorithms while
even achieving better performance.
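The two event-based loops described in the abstract can be sketched as follows. This is a minimal illustration only: the E/PD gains, the drop-ratio floor, and the learning-speed threshold below are hypothetical placeholders, not the formulas or values from the paper.

```python
def epd_learning_rate(losses, kp=0.01, kd=0.005):
    """PD update of the learning rate from the loss history of the
    current batch: proportional to the remaining (normalized) loss,
    plus a derivative term rewarding recent loss decrease."""
    e0 = losses[0]                            # initial loss, for normalization
    p_term = kp * losses[-1] / e0
    d_term = kd * (losses[-2] - losses[-1]) / e0 if len(losses) > 1 else 0.0
    return max(p_term + d_term, 0.0)

class EventBasedEPD:
    """E/PD with two event-based loops: exponential growth of the
    learning rate while the loss decreases, then PD control, with
    (1) a floor on relative learning-rate drops near the optimum and
    (2) a batch switch when the learning speed stalls."""
    def __init__(self, lr0=0.01, growth=1.2, drop_ratio=0.5, speed_tol=1e-3):
        self.lr = lr0
        self.growth = growth          # E phase: multiplicative growth factor
        self.drop_ratio = drop_ratio  # event 1: max allowed relative LR drop
        self.speed_tol = speed_tol    # event 2: minimum learning speed
        self.exp_phase = True
        self.losses = []

    def step(self, loss):
        """Feed the latest training loss; returns (new_lr, switch_batch)."""
        self.losses.append(loss)
        if self.exp_phase:
            if len(self.losses) > 1 and loss > self.losses[-2]:
                self.exp_phase = False       # loss rose: leave the E phase
            else:
                self.lr *= self.growth
                return self.lr, False
        new_lr = epd_learning_rate(self.losses)
        # Event 1: forbid sudden learning-rate drops near the optimum.
        new_lr = max(new_lr, self.drop_ratio * self.lr)
        self.lr = new_lr
        # Event 2: learning speed = recent loss decrease; if too slow,
        # signal that training should move on to the next data batch.
        speed = self.losses[-2] - loss if len(self.losses) > 1 else float("inf")
        switch = speed < self.speed_tol
        if switch:
            self.losses = [loss]             # restart history for the new batch
        return self.lr, switch
```

A training loop would call `step(loss)` once per iteration and fetch the next batch whenever `switch_batch` is true.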
Related papers
- Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training [58.20089993899729]
This paper proposes TempBalance, a straightforward yet effective layerwise learning rate method.
We show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization.
We also show that TempBalance outperforms a number of state-of-the-art metrics and schedulers.
arXiv Detail & Related papers (2023-12-01T05:38:17Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers)
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- FastHebb: Scaling Hebbian Training of Deep Neural Networks to ImageNet Level [7.410940271545853]
We present FastHebb, an efficient and scalable solution for Hebbian learning.
FastHebb outperforms previous solutions by up to 50 times in terms of training speed.
For the first time, we are able to bring Hebbian algorithms to ImageNet scale.
arXiv Detail & Related papers (2022-07-07T09:04:55Z)
- Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values.
It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
- Learning in Feedback-driven Recurrent Spiking Neural Networks using full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, where a second network is introduced only during the training.
The proposed training procedure consists of generating targets for both recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure on tasks that model eight dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z)
- AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z)
- Image reconstruction algorithms in radio interferometry: from handcrafted to learned denoisers [7.1439425093981574]
We introduce a new class of iterative image reconstruction algorithms for radio interferometry, inspired by plug-and-play methods.
The approach consists in learning a prior image model by training a deep neural network (DNN) as a denoiser.
We plug the learned denoiser into the forward-backward optimization algorithm, resulting in a simple iterative structure alternating a denoising step with a gradient-descent data-fidelity step.
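A minimal sketch of that alternation, with a stand-in denoiser in place of the learned DNN; the measurement operator `A`, the step size, and the iteration count below are illustrative, not taken from the paper:

```python
import numpy as np

def forward_backward_pnp(y, A, denoise, gamma=None, iters=100):
    """Plug-and-play forward-backward iteration: alternate a gradient
    step on the data-fidelity term f(x) = ||A x - y||^2 / 2 with a
    denoising step. `denoise` stands in for the learned DNN denoiser."""
    if gamma is None:
        # Step size from the Lipschitz constant of grad f, i.e. ||A||_2^2.
        gamma = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)        # gradient-descent data-fidelity step
        x = denoise(x - gamma * grad)   # denoising (backward/proximal) step
    return x

# Toy usage: an identity "denoiser" reduces the scheme to plain gradient
# descent on a small least-squares problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
y = A @ x_true
x_hat = forward_backward_pnp(y, A, denoise=lambda v: v, iters=500)
```

With a real learned denoiser, `denoise` would wrap a DNN forward pass instead of the identity.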
arXiv Detail & Related papers (2022-02-25T20:26:33Z)
- An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems [73.48927855855219]
Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark.
In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms.
arXiv Detail & Related papers (2022-01-27T19:47:51Z)
- Training Aware Sigmoidal Optimizer [2.99368851209995]
Neural network loss landscapes present far more saddle points than local minima.
We propose the Training Aware Sigmoidal Optimizer (TASO), a two-phase automated learning rate schedule.
We compare the proposed approach with commonly used adaptive learning rate schedules such as Adam, RMSProp, and Adagrad.
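A two-phase sigmoidal schedule of this kind can be sketched as below; the functional form, the constants, and the transition point are purely illustrative and do not reproduce the exact TASO formula:

```python
import math

def sigmoidal_schedule(epoch, total_epochs, lr_max=0.1, lr_min=1e-4, beta=10.0):
    """Hypothetical two-phase sigmoidal schedule: the learning rate stays
    near lr_max early on (phase 1, crossing the saddle-dominated region of
    the loss landscape), then decays smoothly to lr_min (phase 2,
    fine-tuning near a minimum). `beta` sets how sharp the transition is."""
    t = epoch / total_epochs                       # training progress in [0, 1]
    sigmoid = 1.0 / (1.0 + math.exp(beta * (t - 0.5)))
    return lr_min + (lr_max - lr_min) * sigmoid
```

Plotting `sigmoidal_schedule(e, 100)` over 100 epochs gives a plateau near 0.1 followed by a smooth S-shaped decay toward 1e-4.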
arXiv Detail & Related papers (2021-02-17T12:00:46Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
arXiv Detail & Related papers (2020-08-21T19:35:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.