Adaptive Serverless Learning
- URL: http://arxiv.org/abs/2008.10422v1
- Date: Mon, 24 Aug 2020 13:23:02 GMT
- Title: Adaptive Serverless Learning
- Authors: Hongchang Gao, Heng Huang
- Abstract summary: We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
- Score: 114.36410688552579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the emergence of distributed data, training machine learning models in
the serverless manner has attracted increasing attention in recent years.
Numerous training approaches have been proposed in this regime, such as
decentralized SGD. However, all existing decentralized algorithms focus only on
standard SGD, which may be unsuitable for applications such as the deep
factorization machine, whose features are highly sparse and categorical and
therefore require an adaptive training algorithm. In this paper, we propose a
novel adaptive decentralized training approach, which can compute the learning
rate from data dynamically. To the best of our knowledge, this is the first
adaptive decentralized training approach. Our theoretical results reveal that
the proposed algorithm can achieve linear speedup with respect to the number of
workers. Moreover, to reduce the communication overhead, we further
propose a communication-efficient adaptive decentralized training approach,
which can also achieve linear speedup with respect to the number of workers.
Finally, extensive experiments on different tasks confirm the effectiveness of
our two proposed approaches.
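To make the abstract's idea concrete, here is a minimal sketch (not the paper's actual algorithm) of decentralized training in which each worker computes an Adam-style adaptive learning rate from its own gradients and then gossip-averages parameters with its neighbors. The mixing matrix, toy quadratic objective, and all hyperparameters below are illustrative assumptions.

```python
import numpy as np

def decentralized_adaptive_sgd(grad_fn, x0, mixing_w, steps=500,
                               base_lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Sketch: per-worker Adam-style adaptive steps plus gossip averaging."""
    n = mixing_w.shape[0]
    x = np.tile(np.asarray(x0, float), (n, 1))  # one parameter copy per worker
    m = np.zeros_like(x)                        # first-moment estimates
    v = np.zeros_like(x)                        # second-moment estimates
    for t in range(1, steps + 1):
        g = np.stack([grad_fn(i, x[i]) for i in range(n)])
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # learning rate adapted from the data via the second-moment estimate
        x = x - base_lr * m_hat / (np.sqrt(v_hat) + eps)
        # decentralized communication: average with neighbors, no server
        x = mixing_w @ x
    return x

# Toy problem: worker i holds the loss (x - c_i)^2; fully connected mixing.
centers = np.array([1.0, 2.0, 3.0, 4.0])
mixing_w = np.full((4, 4), 0.25)
x_final = decentralized_adaptive_sgd(lambda i, xi: 2 * (xi - centers[i]),
                                     np.zeros(1), mixing_w)
```

Each worker's effective step size is shaped by its own gradient statistics, which is the point of adaptivity for sparse, categorical features; the gossip step replaces the parameter server of centralized training.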
Related papers
- MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning [0.0]
We introduce a new active learning method to enhance data-efficiency for on-line surrogate training.
The surrogate is trained to predict a given timestep directly with different initial and boundary conditions parameters.
Preliminary results for 2D heat PDE demonstrate the potential of this method, called Breed, to improve the generalization capabilities of surrogates.
arXiv Detail & Related papers (2024-10-08T09:52:15Z)
- Local Methods with Adaptivity via Scaling [38.99428012275441]
This paper aims to merge the local training technique with the adaptive approach to develop efficient distributed learning methods.
We consider the classical Local SGD method and enhance it with a scaling feature.
In addition to theoretical analysis, we validate the performance of our methods in practice by training a neural network.
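The classical Local SGD pattern the summary refers to can be sketched as follows; the paper's adaptive scaling feature is not reproduced here, and the toy objective and hyperparameters are assumptions.

```python
import numpy as np

def local_sgd(grad_fn, x0, n_workers, rounds=50, local_steps=5, lr=0.05):
    """Classical Local SGD: several local gradient steps per worker,
    then one model-averaging communication round."""
    x = np.tile(np.asarray(x0, float), (n_workers, 1))
    for _ in range(rounds):
        for _ in range(local_steps):        # local computation, no communication
            for i in range(n_workers):
                x[i] -= lr * grad_fn(i, x[i])
        x[:] = x.mean(axis=0)               # single communication: averaging
    return x[0]

# Toy problem: worker i holds the loss (x - c_i)^2; the global optimum is mean(c).
c = np.array([0.0, 2.0, 4.0, 6.0])
sol = local_sgd(lambda i, xi: 2 * (xi - c[i]), np.zeros(1), n_workers=4)
```

Communicating once per `local_steps` gradient computations is what makes the method efficient; the adaptive variant would replace the fixed `lr` with a data-dependent, per-coordinate scaling.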
arXiv Detail & Related papers (2024-06-02T19:50:05Z)
- Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance [15.498559530889839]
In this paper, we put forward a decentralized learning based Alternative Multi-Agent Proximal Policy Optimization (IA-MAPPO) algorithm to execute the pursuit avoidance task in well-formed swarm.
We utilize imitation learning to decentralize the formation controller, so as to reduce the communication overheads and enhance the scalability.
The simulation results validate the effectiveness of IA-MAPPO and extensive ablation experiments further show the performance comparable to a centralized solution with significant decrease in communication overheads.
arXiv Detail & Related papers (2023-11-06T06:58:16Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
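The general shape of such a client-specific adaptive scheme can be sketched as below: each client runs AMSGrad-style local steps whose effective learning rate is tuned by its own gradient statistics, and a server averages the results. This is a hedged illustration, not FedLALR itself; the function name, toy losses, and hyperparameters are assumptions.

```python
import numpy as np

def federated_round(grad_fn, x, n_clients, local_steps=5,
                    base_lr=0.1, beta1=0.9, beta2=0.99, eps=1e-8):
    """One round: each client takes AMSGrad-style local steps with its own
    data-dependent learning rate, then the server averages the models."""
    xs = np.tile(x, (n_clients, 1))
    m = np.zeros_like(xs)
    v = np.zeros_like(xs)
    v_hat = np.zeros_like(xs)
    for _ in range(local_steps):
        g = np.stack([grad_fn(i, xs[i]) for i in range(n_clients)])
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        v_hat = np.maximum(v_hat, v)        # AMSGrad: non-decreasing 2nd moment
        xs -= base_lr * m / (np.sqrt(v_hat) + eps)  # client-specific step size
    return xs.mean(axis=0)                  # server-side aggregation

# Toy run: two clients with heterogeneous losses (x - 1)^2 and (x - 3)^2.
c = np.array([1.0, 3.0])
x = np.zeros(1)
for _ in range(30):
    x = federated_round(lambda i, xi: 2 * (xi - c[i]), x, n_clients=2)
```

Because each client keeps its own `v_hat`, clients with different (non-IID) data automatically end up with different effective step sizes.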
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Online Distributed Learning with Quantized Finite-Time Coordination [0.4910937238451484]
In our setting, a set of agents must cooperatively train a learning model from streaming data.
We propose a distributed algorithm that relies on a quantized, finite-time coordination protocol.
We analyze the performance of the proposed algorithm in terms of the mean distance from the online solution.
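The general pattern behind such a protocol, an unbiased stochastic quantizer combined with repeated averaging, can be sketched as follows. This is not the paper's finite-time protocol; the complete-graph topology, grid spacing, and round count are assumptions made for brevity.

```python
import numpy as np

def stochastic_quantize(x, step=0.1, rng=None):
    """Unbiased quantizer onto a grid of spacing `step`: E[q(x)] = x."""
    if rng is None:
        rng = np.random.default_rng(0)
    low = np.floor(x / step) * step
    p = (x - low) / step                     # probability of rounding up
    return low + step * (rng.random(x.shape) < p)

def quantized_consensus(values, rounds=30, step=0.1, seed=0):
    """Agents repeatedly broadcast quantized states and move toward their
    average (complete graph for brevity); only quantized values are sent."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, float).copy()
    for _ in range(rounds):
        q = stochastic_quantize(x, step, rng)
        x = 0.5 * x + 0.5 * q.mean()         # mix own state with quantized mean
    return x

states = quantized_consensus([0.0, 1.0, 2.0, 3.0])
```

Quantizing before transmission caps the communication cost per round, at the price of a residual error around the true average that shrinks with the grid spacing.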
arXiv Detail & Related papers (2023-07-13T08:36:15Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled architecture and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
Until today, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning [49.15799302636519]
We design a low-complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers).
In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns.
We show that it is possible to achieve equivalent learning performance as if the data is available in a single place.
arXiv Detail & Related papers (2020-09-29T13:08:12Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.