A Low Complexity Decentralized Neural Net with Centralized Equivalence
using Layer-wise Learning
- URL: http://arxiv.org/abs/2009.13982v1
- Date: Tue, 29 Sep 2020 13:08:12 GMT
- Title: A Low Complexity Decentralized Neural Net with Centralized Equivalence
using Layer-wise Learning
- Authors: Xinyue Liang, Alireza M. Javid, Mikael Skoglund, Saikat Chatterjee
- Abstract summary: We design a low complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers).
In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns.
We show that it is possible to achieve the same learning performance as if the data were available in a single place.
- Score: 49.15799302636519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We design a low complexity decentralized learning algorithm to train a
recently proposed large neural network in distributed processing nodes
(workers). We assume the communication network between the workers is
synchronized and can be modeled as a doubly-stochastic mixing matrix without
having any master node. In our setup, the training data is distributed among
the workers but is not shared in the training process due to privacy and
security concerns. Using alternating-direction-method-of-multipliers (ADMM)
along with a layerwise convex optimization approach, we propose a decentralized
learning algorithm which enjoys low computational complexity and communication
cost among the workers. We show that it is possible to achieve equivalent
learning performance as if the data is available in a single place. Finally, we
experimentally illustrate the time complexity and convergence behavior of the
algorithm.
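The setup above assumes a synchronized, master-free network whose communication pattern is captured by a doubly-stochastic mixing matrix. As a rough illustrative sketch (not the authors' implementation; the ring topology, the Metropolis-Hastings weights, and all names below are assumptions), the snippet constructs such a matrix for a small worker graph and applies one gossip round to per-worker copies of a layer's weights.

```python
import numpy as np

def metropolis_hastings_weights(adjacency: np.ndarray) -> np.ndarray:
    """Build a doubly-stochastic mixing matrix for an undirected worker graph.

    Metropolis-Hastings weights: W[i, j] = 1 / (1 + max(deg(i), deg(j)))
    for neighbors i != j; the diagonal entry absorbs the remaining mass.
    """
    n = adjacency.shape[0]
    deg = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

# Hypothetical 4-worker ring topology, no master node.
adjacency = np.array([[0, 1, 0, 1],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [1, 0, 1, 0]])
W = metropolis_hastings_weights(adjacency)
assert np.allclose(W.sum(axis=0), 1) and np.allclose(W.sum(axis=1), 1)  # doubly stochastic

# One synchronized gossip round: every worker replaces its copy of a layer's
# weights with a W-weighted average of its neighbors' copies.
layer_weights = np.random.randn(4, 64, 32)          # one copy per worker
mixed = np.einsum('ij,jkl->ikl', W, layer_weights)  # w_i <- sum_j W[i, j] * w_j
```

Because W is doubly stochastic, repeated mixing rounds drive all worker copies toward the network-wide average without any worker ever sharing its raw training data.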
Related papers
- From promise to practice: realizing high-performance decentralized training [8.955918346078935]
Decentralized training of deep neural networks has attracted significant attention for its theoretically superior scalability over synchronous data-parallel methods like All-Reduce.
This paper identifies three key factors that can lead to speedups over All-Reduce training and constructs a runtime model to determine when, how, and to what degree decentralization can yield shorter per-iteration runtimes.
arXiv Detail & Related papers (2024-10-15T19:04:56Z) - Communication-Efficient Decentralized Federated Learning via One-Bit
Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
arXiv Detail & Related papers (2023-08-31T12:22:40Z) - Federated K-Means Clustering via Dual Decomposition-based Distributed
Optimization [0.0]
This paper shows how dual decomposition can be applied for distributed training of $K$-means clustering problems.
The training can be performed in a distributed manner by splitting the data across different nodes and linking these nodes through consensus constraints.
arXiv Detail & Related papers (2023-07-25T05:34:50Z) - Online Distributed Learning with Quantized Finite-Time Coordination [0.4910937238451484]
In our setting a set of agents need to cooperatively train a learning model from streaming data.
We propose a distributed algorithm that relies on a quantized, finite-time coordination protocol.
We analyze the performance of the proposed algorithm in terms of the mean distance from the online solution.
arXiv Detail & Related papers (2023-07-13T08:36:15Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Federated Learning with a Sampling Algorithm under Isoperimetry [9.990687944474738]
Federated learning uses a set of techniques to efficiently distribute the training of a machine learning algorithm across several devices.
We propose a communication-efficient variant of Langevin's sampling a posteriori.
arXiv Detail & Related papers (2022-06-02T08:19:03Z) - Communication-Efficient Distributionally Robust Decentralized Learning [23.612400109629544]
Decentralized learning algorithms empower interconnected edge devices to share data and computational resources.
We propose a single-loop decentralized gradient descent/ascent algorithm (ADGDA) to solve the underlying minimax optimization problem.
arXiv Detail & Related papers (2022-05-31T09:00:37Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed algorithm for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds while keeping a comparable number of computation rounds in theory.
Our experiments on several benchmark datasets show the effectiveness of the proposed method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
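The main paper and several of the entries above rely on ADMM-style consensus optimization of convex (layer-wise) subproblems. Purely as a generic illustration of that template (not a reproduction of any specific algorithm listed here; the ridge-regression subproblem, the penalty rho, and all names are assumptions), the sketch below runs global-consensus ADMM on data split across workers, with the exact averaging step standing in for the network-wide agreement that a fully decentralized run would approximate via gossip over the mixing matrix.

```python
import numpy as np

def consensus_admm_ridge(A_parts, b_parts, lam=0.1, rho=1.0, iters=50):
    """Global-consensus ADMM for min_x sum_i 0.5*||A_i x - b_i||^2 + 0.5*lam*||x||^2.

    Each worker i holds (A_i, b_i) privately and only exchanges its local
    iterate; z plays the role of the commonly agreed-on layer weights.
    """
    n = A_parts[0].shape[1]
    m = len(A_parts)
    x = [np.zeros(n) for _ in range(m)]
    u = [np.zeros(n) for _ in range(m)]  # scaled dual variables
    z = np.zeros(n)
    for _ in range(iters):
        for i, (A, b) in enumerate(zip(A_parts, b_parts)):
            # Local convex subproblem: closed-form solve, no raw data is shared.
            H = A.T @ A + (lam / m + rho) * np.eye(n)
            x[i] = np.linalg.solve(H, A.T @ b + rho * (z - u[i]))
        # Agreement step: exact average here; a decentralized run would
        # approximate it with gossip rounds over the doubly-stochastic matrix.
        z = np.mean([x[i] + u[i] for i in range(m)], axis=0)
        for i in range(m):
            u[i] = u[i] + x[i] - z
    return z

# Hypothetical split of one layer-wise regression problem across 4 workers.
rng = np.random.default_rng(0)
A_parts = [rng.standard_normal((25, 8)) for _ in range(4)]
b_parts = [rng.standard_normal(25) for _ in range(4)]
w = consensus_admm_ridge(A_parts, b_parts)
```

Because the per-layer subproblem is convex with a closed-form local solve, each worker's per-round cost stays low, which mirrors the low-complexity motivation of the main paper.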