Layer-Parallel Training of Residual Networks with Auxiliary-Variable
Networks
- URL: http://arxiv.org/abs/2112.05387v1
- Date: Fri, 10 Dec 2021 08:45:35 GMT
- Title: Layer-Parallel Training of Residual Networks with Auxiliary-Variable
Networks
- Authors: Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li and Bin Dong
- Abstract summary: auxiliary-variable methods have attracted much interest lately but suffer from significant communication overhead and lack of data augmentation.
We present a novel joint learning framework for training realistic ResNets across multiple compute devices.
We demonstrate the effectiveness of our methods on ResNets and WideResNets across CIFAR-10, CIFAR-100, and ImageNet datasets.
- Score: 28.775355111614484
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient-based methods for the distributed training of residual networks
(ResNets) typically require a forward pass of the input data, followed by
back-propagating the error gradient to update model parameters, which becomes
time-consuming as the network goes deeper. To break the algorithmic locking and
exploit synchronous module parallelism in both the forward and backward modes,
auxiliary-variable methods have attracted much interest lately but suffer from
significant communication overhead and lack of data augmentation. In this work,
a novel joint learning framework for training realistic ResNets across multiple
compute devices is established by trading off the storage and recomputation of
external auxiliary variables. More specifically, the input data of each
independent processor is generated from its low-capacity auxiliary network
(AuxNet), which permits the use of data augmentation and realizes forward
unlocking. The backward passes are then executed in parallel, each with a local
loss function that originates from the penalty or augmented Lagrangian (AL)
methods. Finally, the proposed AuxNet is employed to reproduce the updated
auxiliary variables through an end-to-end training process. We demonstrate the
effectiveness of our methods on ResNets and WideResNets across CIFAR-10,
CIFAR-100, and ImageNet datasets, achieving a speedup over the traditional
layer-serial training method while maintaining comparable test accuracy.
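To make the layer-parallel decomposition concrete, below is a minimal single-process PyTorch sketch of a penalty-type variant of the idea described above: each residual block reads its input from a small auxiliary network applied to the (augmented) raw batch and is trained against a purely local loss, so the per-block backward passes are mutually independent. The names (ResBlock, AuxNet), the sizes, and the plain MSE penalty are illustrative assumptions rather than the paper's implementation; the augmented Lagrangian option and the end-to-end AuxNet refitting step are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes; all names below are placeholders, not the paper's code.
NUM_BLOCKS, DIM, NUM_CLASSES, BATCH, PENALTY = 3, 64, 10, 32, 1.0


class ResBlock(nn.Module):
    """One residual stage, assigned to one compute device/worker."""

    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.f(x)


class AuxNet(nn.Module):
    """Low-capacity network that regenerates a block's input (the auxiliary
    variable) directly from the augmented raw data, removing the forward lock."""

    def __init__(self, dim):
        super().__init__()
        self.g = nn.Linear(dim, dim)

    def forward(self, x):
        return self.g(x)


blocks = [ResBlock(DIM) for _ in range(NUM_BLOCKS)]
# auxnets[k] reproduces the input of block k + 1 (block 0 reads the raw batch).
auxnets = [AuxNet(DIM) for _ in range(NUM_BLOCKS - 1)]
head = nn.Linear(DIM, NUM_CLASSES)  # classifier after the last block

# Worker k owns block k and the AuxNet that generates block k's input.
params = [list(blocks[0].parameters())] + [
    list(blocks[k].parameters()) + list(auxnets[k - 1].parameters())
    for k in range(1, NUM_BLOCKS)
]
opts = [torch.optim.SGD(p, lr=1e-2) for p in params]
opt_head = torch.optim.SGD(head.parameters(), lr=1e-2)

x = torch.randn(BATCH, DIM)                  # stands in for an augmented mini-batch
y = torch.randint(0, NUM_CLASSES, (BATCH,))  # class labels

# Every block minimizes a *local* loss, so the backward passes are mutually
# independent and could run on separate devices in parallel (serial loop here).
for k in range(NUM_BLOCKS):
    opts[k].zero_grad()
    inp = x if k == 0 else auxnets[k - 1](x)      # block input from its AuxNet
    out = blocks[k](inp)
    if k < NUM_BLOCKS - 1:
        # Penalty coupling: push the block output toward the next block's
        # auxiliary input (detached, since it belongs to another worker).
        target = auxnets[k](x).detach()
        loss = PENALTY * F.mse_loss(out, target)
    else:
        opt_head.zero_grad()
        loss = F.cross_entropy(head(out), y)      # ordinary task loss, last block
    loss.backward()
    opts[k].step()
    if k == NUM_BLOCKS - 1:
        opt_head.step()
```

In a distributed run, each loop iteration would execute on its own device, with the auxiliary inputs recomputed locally by the AuxNets rather than communicated between devices, which is the storage-versus-recomputation trade-off mentioned in the abstract.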
Related papers
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - Efficient Asynchronous Federated Learning with Sparsification and
Quantization [55.6801207905772]
Federated Learning (FL) is attracting more and more attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate asynchronously in training by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z) - Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning [19.978542231976636]
This paper proposes a novel method to reduce the number of parameters and FLOPs of deep learning models for computational efficiency.
We introduce accuracy and efficiency coefficients to control the trade-off between the accuracy of the network and its computing efficiency.
arXiv Detail & Related papers (2023-01-26T12:32:01Z) - Receptive Field-based Segmentation for Distributed CNN Inference
Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z) - Learning in Feedback-driven Recurrent Spiking Neural Networks using
full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, in which a second network is introduced only during training.
The proposed procedure consists of generating targets for both the recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure in modeling eight dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning-based solution for order dispatching.
We conduct large-scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - Implicit recurrent networks: A novel approach to stationary input
processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks for deep learning.
We provide an algorithm that implements backpropagation for an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, whereas a feed-forward network with a monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z) - A Practical Layer-Parallel Training Algorithm for Residual Networks [41.267919563145604]
Gradient-based algorithms for training ResNets typically require a forward pass of the input data, followed by back-propagating the objective gradient to update parameters.
We propose a novel serial-parallel hybrid training strategy to enable the use of data augmentation, together with downsampling filters to reduce the communication cost.
arXiv Detail & Related papers (2020-09-03T06:03:30Z) - Coded Computing for Federated Learning at the Edge [3.385874614913973]
Federated Learning (FL) enables training a global model from data generated locally at the client nodes, without moving client data to a centralized server.
Recent work proposes to mitigate stragglers and speed up training for linear regression tasks by assigning redundant computations at the MEC server.
We develop CodedFedL, which addresses the difficult task of extending coded federated learning (CFL) to distributed non-linear regression and classification problems with multi-output labels.
arXiv Detail & Related papers (2020-07-07T08:20:47Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically plausible alternative to backpropagation that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)