Local Learning with Neuron Groups
- URL: http://arxiv.org/abs/2301.07635v1
- Date: Wed, 18 Jan 2023 16:25:10 GMT
- Title: Local Learning with Neuron Groups
- Authors: Adeetya Patel, Michael Eickenberg, Eugene Belilovsky
- Abstract summary: Local learning is an approach to model-parallelism that removes the standard end-to-end learning setup.
We study how local learning can be applied at the level of splitting layers or modules into sub-components.
- Score: 15.578925277062657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional deep network training methods optimize a monolithic objective
function jointly for all the components. This can lead to various
inefficiencies in terms of potential parallelization. Local learning is an
approach to model-parallelism that removes the standard end-to-end learning
setup and utilizes local objective functions to permit parallel learning
amongst model components in a deep network. Recent works have demonstrated that
variants of local learning can lead to efficient training of modern deep
networks. However, in terms of how much computation can be distributed, these
approaches are typically limited by the number of layers in a network. In this
work we propose to study how local learning can be applied at the level of
splitting layers or modules into sub-components, adding a notion of width-wise
modularity to the existing depth-wise modularity associated with local
learning. We investigate local-learning penalties that permit such models to be
trained efficiently. Our experiments on the CIFAR-10, CIFAR-100, and Imagenet32
datasets demonstrate that introducing width-level modularity can lead to
computational advantages over existing methods based on local learning and
opens new opportunities for improved model-parallel distributed training. Code
is available at: https://github.com/adeetyapatel12/GN-DGL.
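To make the idea concrete, below is a minimal PyTorch-style sketch of depth-wise stages that are additionally split width-wise into neuron groups, each trained by its own auxiliary classifier and local loss. This is an illustrative sketch only, not the authors' GN-DGL implementation: the `NeuronGroupStage` module, the auxiliary-head design, the way group inputs are shared, and the single-optimizer training loop are all simplifying assumptions; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuronGroupStage(nn.Module):
    """One depth-wise stage split width-wise into neuron groups.

    Each group is a small conv block trained only by its own auxiliary
    classifier; the stage input is detached, so no gradients flow between
    stages or between groups (illustrative assumption, not the GN-DGL code).
    """
    def __init__(self, in_ch, out_ch, num_groups, num_classes):
        super().__init__()
        assert out_ch % num_groups == 0
        g_ch = out_ch // num_groups
        self.groups = nn.ModuleList(
            nn.Conv2d(in_ch, g_ch, kernel_size=3, padding=1) for _ in range(num_groups)
        )
        self.aux_heads = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(g_ch, num_classes))
            for _ in range(num_groups)
        )

    def forward(self, x, targets=None):
        x = x.detach()  # local learning: block end-to-end gradients into earlier stages
        outputs, local_losses = [], []
        for conv, head in zip(self.groups, self.aux_heads):
            h = F.relu(conv(x))
            outputs.append(h)
            if targets is not None:
                # local-learning penalty: per-group auxiliary classification loss
                local_losses.append(F.cross_entropy(head(h), targets))
        return torch.cat(outputs, dim=1), local_losses


# Toy usage on CIFAR-10-sized inputs.
stages = nn.ModuleList([
    NeuronGroupStage(3, 64, num_groups=4, num_classes=10),
    NeuronGroupStage(64, 128, num_groups=4, num_classes=10),
])
optimizer = torch.optim.SGD(stages.parameters(), lr=0.1)

images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

h, total_loss = images, 0.0
for stage in stages:
    h, losses = stage(h, labels)
    total_loss = total_loss + sum(losses)

optimizer.zero_grad()
total_loss.backward()  # gradients remain local to each neuron group
optimizer.step()
```

Because each local loss touches only its own group's parameters, the groups (and stages) could in principle be placed on separate devices and updated concurrently, which is the width-wise model parallelism the abstract describes.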
Related papers
- HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion [7.9514535887836795]
We propose a novel model that performs hierarchical locally supervised learning and patch-level feature fusion on auxiliary networks.
We conduct experiments on CIFAR-10, STL-10, SVHN, and ImageNet datasets, and the results demonstrate that our proposed HPFF significantly outperforms previous approaches.
arXiv Detail & Related papers (2024-07-08T06:05:19Z)
- Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks [9.718519843862937]
We introduce a block-wise BP-free (BWBPF) neural network that leverages local error signals to optimize sub-neural networks separately.
Our experimental results consistently show that this approach can identify transferable decoupled architectures for VGG and ResNet variations.
arXiv Detail & Related papers (2023-12-20T08:02:33Z) - Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their non-spiking predecessors, artificial neural networks (ANNs).
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z) - Flexible Parallel Learning in Edge Scenarios: Communication,
Computational and Energy Cost [20.508003076947848]
Fog- and IoT-based scenarios often require combining both data and model parallelism.
We present a framework for flexible parallel learning (FPL), achieving both data and model parallelism.
Our experiments, carried out using state-of-the-art deep-network architectures and large-scale datasets, confirm that FPL allows for an excellent trade-off among computational (hence energy) cost, communication overhead, and learning performance.
arXiv Detail & Related papers (2022-01-19T03:47:04Z) - Clustered Federated Learning via Generalized Total Variation
Minimization [83.26141667853057]
We study optimization methods to train local (or personalized) models for local datasets with a decentralized network structure.
Our main conceptual contribution is to formulate federated learning as generalized total variation (GTV) minimization.
Our main algorithmic contribution is a fully decentralized federated learning algorithm.
arXiv Detail & Related papers (2021-05-26T18:07:19Z) - Local Critic Training for Model-Parallel Learning of Deep Neural
Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z) - Parallel Training of Deep Networks with Local Updates [84.30918922367442]
Local parallelism is a framework which parallelizes training of individual layers in deep networks by replacing global backpropagation with truncated layer-wise backpropagation.
We show results in both vision and language domains across a diverse set of architectures, and find that local parallelism is particularly effective in the high-compute regime.
arXiv Detail & Related papers (2020-12-07T16:38:45Z) - Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z) - Neural Function Modules with Sparse Arguments: A Dynamic Approach to
Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z) - Distributed Training of Deep Learning Models: A Taxonomic Perspective [11.924058430461216]
Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster.
We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines.
arXiv Detail & Related papers (2020-07-08T08:56:58Z) - Modularizing Deep Learning via Pairwise Learning With Kernels [12.051746916737343]
We present an alternative view on finitely wide, fully trainable deep neural networks as stacked linear models in feature spaces.
We then propose a provably optimal modular learning framework for classification that does not require between-module backpropagation.
arXiv Detail & Related papers (2020-05-12T04:19:37Z)