An In-depth Study of Stochastic Backpropagation
- URL: http://arxiv.org/abs/2210.00129v1
- Date: Fri, 30 Sep 2022 23:05:06 GMT
- Title: An In-depth Study of Stochastic Backpropagation
- Authors: Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe
- Abstract summary: We study Stochastic Backpropagation (SBP) when training deep neural networks for standard image classification and object detection tasks.
During backward propagation, SBP calculates gradients by only using a subset of feature maps to save the GPU memory and computational cost.
Experiments on image classification and object detection show that SBP can save up to 40% of GPU memory with less than 1% accuracy degradation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we provide an in-depth study of Stochastic Backpropagation
(SBP) when training deep neural networks for standard image classification and
object detection tasks. During backward propagation, SBP calculates the
gradients by only using a subset of feature maps to save the GPU memory and
computational cost. We interpret SBP as an efficient way to implement
stochastic gradient descent by performing backpropagation dropout, which leads
to considerable memory saving and training process speedup, with a minimal
impact on the overall model accuracy. We offer some good practices to apply SBP
in training image recognition models, which can be adopted in learning a wide
range of deep neural networks. Experiments on image classification and object
detection show that SBP can save up to 40% of GPU memory with less than 1%
accuracy degradation.
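As a rough illustration of the backpropagation-dropout idea (a hedged sketch, not the authors' implementation), the weight gradient of a linear layer can be estimated from a random subset of its saved input columns; the shapes and the `keep_ratio` parameter here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sbp_weight_grad(grad_out, x, keep_ratio=0.5):
    """Estimate grad_W = grad_out @ x.T for a linear layer y = W @ x,
    using only a random subset of columns (e.g. spatial positions).

    grad_out: (out_features, n) upstream gradient
    x:        (in_features, n) saved input features
    """
    n = x.shape[1]
    k = max(1, int(round(keep_ratio * n)))
    idx = rng.choice(n, size=k, replace=False)
    # Rescale by n/k so the subsampled gradient is unbiased in expectation;
    # only k columns of x need to be kept for the backward pass.
    return (grad_out[:, idx] @ x[:, idx].T) * (n / k)

grad_out = rng.standard_normal((8, 64))
x = rng.standard_normal((16, 64))
full = grad_out @ x.T                                  # exact gradient
approx = sbp_weight_grad(grad_out, x, keep_ratio=0.5)  # subsampled estimate
```

With `keep_ratio=1.0` the estimate coincides with the exact gradient; smaller ratios trade gradient variance for memory, matching the dropout interpretation above.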
Related papers
- Advancing Training Efficiency of Deep Spiking Neural Networks through Rate-based Backpropagation [8.683798989767771]
Recent insights have revealed that rate-coding is a primary form of information representation captured by surrogate-gradient-based Backpropagation Through Time (BPTT) in training deep Spiking Neural Networks (SNNs).
We propose rate-based backpropagation, a training strategy specifically designed to exploit rate-based representations to reduce the complexity of BPTT.
Our method minimizes reliance on detailed temporal derivatives by focusing on averaged dynamics, streamlining the computational graph to reduce memory and computational demands of SNNs training.
arXiv Detail & Related papers (2024-10-15T10:46:03Z)
- Efficient Backpropagation with Variance-Controlled Adaptive Sampling [32.297478086982466]
Sampling-based algorithms, which eliminate "unimportant" computations during forward and/or back propagation (BP), offer potential solutions to accelerate neural network training.
We introduce a variance-controlled adaptive sampling (VCAS) method designed to accelerate BP.
VCAS can preserve the original training loss trajectory and validation accuracy with up to a 73.87% FLOPs reduction for BP and a 49.58% FLOPs reduction for the whole training process.
arXiv Detail & Related papers (2024-02-27T05:40:36Z)
- Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient models for neuromorphic computing.
We propose the Spatial Learning Through Time (SLTT) method that can achieve high performance while greatly improving training efficiency.
Our method achieves state-of-the-art accuracy on ImageNet, while the memory cost and training time are reduced by more than 70% and 50%, respectively, compared with BPTT.
arXiv Detail & Related papers (2023-02-28T05:01:01Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
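For intuition only, a weight-space forward gradient (the baseline this paper improves on by perturbing activations instead of weights) can be sketched as follows; `f`, `eps`, and the probe distribution are illustrative choices, not the paper's setup:

```python
import numpy as np

def forward_gradient(f, w, rng, eps=1e-6):
    """One-sample forward-gradient estimate g_hat = (grad_f(w) . v) * v.

    The directional derivative grad_f(w) . v (a JVP) is approximated here
    by a finite difference; forward-mode AD would compute it exactly.
    """
    v = rng.standard_normal(w.shape)      # random probe direction
    jvp = (f(w + eps * v) - f(w)) / eps   # ~ grad_f(w) . v
    return jvp * v                        # unbiased: E[g_hat] = grad_f(w)

# Toy check: for f(w) = 0.5*||w||^2 the true gradient is w itself,
# so the estimates should average out to w.
rng = np.random.default_rng(0)
w = np.array([1.0, -2.0, 3.0])
est = np.mean([forward_gradient(lambda u: 0.5 * u @ u, w, rng)
               for _ in range(20000)], axis=0)
```

A single sample of this estimator has high variance, which is exactly the problem the activation-perturbation approach in the paper targets.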
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models [42.31924917984774]
We propose a memory efficient method, named Stochastic Backpropagation (SBP), for training deep neural networks on videos.
Experiments show that SBP can be applied to a wide range of models for video tasks, leading to up to 80.0% GPU memory saving and 10% training speedup with less than 1% accuracy drop on action recognition and temporal action detection.
arXiv Detail & Related papers (2022-03-31T02:24:53Z)
- Deep Q-network using reservoir computing with multi-layered readout [0.0]
Recurrent neural network (RNN) based reinforcement learning (RL) is used for learning context-dependent tasks.
An approach with replay memory introducing reservoir computing has been proposed, which trains an agent without BPTT.
This paper shows that the performance of this method improves by using a multi-layered neural network for the readout layer.
arXiv Detail & Related papers (2022-03-03T00:32:55Z)
- FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose FasterPose, a design paradigm for cost-effective networks that use a low-resolution (LR) representation for efficient pose estimation.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant network for pose estimation, our method reduces FLOPs by 58% while improving accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
- Low-memory stochastic backpropagation with multi-channel randomized trace estimation [6.985273194899884]
We propose to approximate the gradient of convolutional layers in neural networks with a multi-channel randomized trace estimation technique.
Compared to other methods, this approach is simple, amenable to analyses, and leads to a greatly reduced memory footprint.
We discuss the performance of networks trained with this form of backpropagation and how the approximation error can be controlled while maximizing memory savings and minimizing computational overhead.
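The underlying primitive, randomized trace estimation, can be sketched with the classic Hutchinson estimator (a generic stand-in; the paper's multi-channel variant is more specialized):

```python
import numpy as np

def hutchinson_trace(A, num_probes=2000, rng=None):
    """Estimate tr(A) as the average of z^T A z over Rademacher probes z.

    The estimate is unbiased because E[z z^T] = I for Rademacher z.
    """
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))
    # einsum sums the diagonal of Z^T (A Z) without forming that matrix.
    return np.einsum('ij,ij->', Z, A @ Z) / num_probes

rng = np.random.default_rng(1)
M = rng.standard_normal((10, 10))
A = M @ M.T                    # symmetric test matrix with known trace
exact = np.trace(A)
approx = hutchinson_trace(A)   # randomized estimate
```

The memory appeal is that `A` never needs to be materialized: only matrix-vector products `A @ z` are required, which is what makes this attractive for gradients of convolutional layers.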
arXiv Detail & Related papers (2021-06-13T13:54:02Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
- BP-DIP: A Backprojection based Deep Image Prior [49.375539602228415]
We combine two image restoration approaches: (i) Deep Image Prior (DIP), which trains a convolutional neural network (CNN) from scratch at test time on the degraded image; and (ii) a backprojection (BP) fidelity term, an alternative to the standard least-squares loss used in previous DIP works.
We demonstrate the performance of the proposed method, termed BP-DIP, on the deblurring task and show its advantages over the plain DIP, with both higher PSNR values and better inference run-time.
arXiv Detail & Related papers (2020-03-11T17:09:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.