Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays
- URL: http://arxiv.org/abs/2106.12261v1
- Date: Wed, 23 Jun 2021 09:36:36 GMT
- Title: Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays
- Authors: Rotem Zamir Aviv (1), Ido Hakimi (2), Assaf Schuster (2), Kfir Y. Levy
(1 and 3) ((1) Department of Electrical and Computer Engineering, Technion,
(2) Department of Computer Science, Technion, (3) A Viterbi Fellow)
- Abstract summary: We consider convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory.
We propose a robust training method for the constrained setting and derive non-asymptotic convergence guarantees that do not depend on prior knowledge of update delays, objective smoothness, or gradient variance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We consider stochastic convex optimization problems, where several machines
act asynchronously in parallel while sharing a common memory. We propose a
robust training method for the constrained setting and derive non-asymptotic
convergence guarantees that do not depend on prior knowledge of update delays,
objective smoothness, or gradient variance. Conversely, existing methods for
this setting crucially rely on this prior knowledge, which renders them
unsuitable for essentially all shared-resource computational environments,
such as clouds and data centers. Concretely, existing approaches are unable to
accommodate changes in the delays which result from dynamic allocation of the
machines, while our method implicitly adapts to such changes.
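To make the setting concrete, below is a minimal, self-contained sketch of projected SGD under a fixed gradient delay with an AdaGrad-style step size. It is illustrative only and not the paper's exact algorithm: the delay model, the `project` helper (an L2-ball stand-in for the constraint set), and all hyperparameters are assumptions for exposition. The point it demonstrates is that the step size is computed from observed gradients alone, with no prior knowledge of the delay, smoothness, or variance.

```python
# Illustrative sketch (not the paper's algorithm): projected SGD where every
# applied gradient is `delay` steps stale, combined with an AdaGrad-style step
# size. The adaptive denominator grows with observed gradient magnitudes, so
# the effective learning rate shrinks automatically when stale gradients
# inject extra error -- the kind of implicit adaptation the abstract refers to.
import collections
import numpy as np

def project(x, radius=10.0):
    """Euclidean projection onto an L2 ball (stand-in for the constraint set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def delayed_adagrad(grad_fn, x0, n_steps=1000, delay=5, eta=1.0, eps=1e-8):
    """Run projected SGD where each applied gradient is `delay` steps stale."""
    x = np.asarray(x0, dtype=float)
    accum = 0.0                               # sum of squared gradient norms
    pending = collections.deque()             # gradients still "in flight"
    for t in range(n_steps):
        pending.append(grad_fn(x))            # a worker reads the current iterate
        if len(pending) <= delay:             # nothing has arrived yet
            continue
        g = pending.popleft()                 # gradient computed `delay` steps ago
        accum += float(np.dot(g, g))
        step = eta / np.sqrt(eps + accum)     # step size uses no knowledge of the
        x = project(x - step * g)             #   delay, smoothness, or variance
    return x

# Toy usage: noisy gradients of f(x) = ||x - 3||^2 / 2 under a fixed delay of 5.
rng = np.random.default_rng(0)
noisy_grad = lambda x: (x - 3.0) + 0.1 * rng.standard_normal(x.shape)
print(delayed_adagrad(noisy_grad, x0=np.zeros(3)))   # approximately [3, 3, 3]
```

Because the step size depends only on the gradients actually received, the same code behaves sensibly if the `delay` parameter is replaced by a time-varying schedule, which mirrors the dynamic-allocation scenario described above.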
Related papers
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z) - Robust Networked Federated Learning for Localization [7.332862402432447]
This paper addresses the localization problem, which is non-smooth, in a federated setting where the data is distributed across a multitude of devices.
We propose a method that adopts an $L_1$-norm robust formulation within a distributed subgradient framework, explicitly designed to handle these obstacles.
arXiv Detail & Related papers (2023-08-31T13:54:37Z) - Resilient Constrained Learning [94.27081585149836]
This paper presents a constrained learning approach that adapts the requirements while simultaneously solving the learning task.
We call this approach resilient constrained learning after the term used to describe ecological systems that adapt to disruptions by modifying their operation.
arXiv Detail & Related papers (2023-06-04T18:14:18Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit connections between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Object Representations as Fixed Points: Training Iterative Refinement
Algorithms with Implicit Differentiation [88.14365009076907]
Iterative refinement is a useful paradigm for representation learning.
We develop an implicit differentiation approach that improves the stability and tractability of training.
arXiv Detail & Related papers (2022-07-02T10:00:35Z) - Delay-adaptive step-sizes for asynchronous learning [8.272788656521415]
We show that it is possible to use learning rates that depend on the actual time-varying delays in the system.
We demonstrate how delays can be measured on-line, present delay-adaptive step-size policies, and illustrate their theoretical and practical advantages over the state-of-the-art; an illustrative sketch of this delay-adaptive idea appears after this list.
arXiv Detail & Related papers (2022-02-17T09:51:22Z) - Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent
Flows [4.817429789586127]
We introduce a polynomial-based optimization framework for achieving acceleration, based on the notion of fixed-time stability of dynamical systems.
We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms.
arXiv Detail & Related papers (2021-12-02T16:04:40Z) - Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z) - Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z) - Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
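As a companion to the delay-adaptive step-size entry above, the following sketch illustrates the general idea of measuring delays on-line and shrinking the step size accordingly. The specific policy `eta / (1 + measured_delay)`, the simulated delay pattern, and the helper names are assumptions for illustration, not the exact policy from that paper.

```python
# Illustrative delay-adaptive step-size policy (an assumption for exposition):
# each applied gradient carries a timestamp, the delay is measured on-line as
# the gap between the current iteration and that timestamp, and the step size
# shrinks as the measured delay grows.
import numpy as np

def delay_adaptive_sgd(grad_fn, x0, delays, eta=0.5):
    """SGD where iteration t applies a gradient computed on an iterate that is
    delays[t] steps old; the delay is measured from the iterate's timestamp."""
    x = np.asarray(x0, dtype=float)
    history = [x.copy()]                     # past iterates, to simulate stale reads
    for t, tau in enumerate(delays):
        read_at = max(0, t - tau)            # which iterate the worker actually read
        g = grad_fn(history[read_at])        # stale gradient
        measured_delay = t - read_at         # delay measured on-line
        step = eta / (1.0 + measured_delay)  # larger measured delay -> smaller step
        x = x - step * g
        history.append(x.copy())
    return x

# Toy usage on f(x) = ||x||^2 / 2 with delays that change over time
# (e.g. machines being reallocated) and are measured rather than assumed.
delays = [0] * 50 + [4] * 50 + [1] * 50
print(delay_adaptive_sgd(lambda x: x, x0=np.array([5.0]), delays=delays))  # near [0.]
```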
This list is automatically generated from the titles and abstracts of the papers in this site.